PACKAGE | |STAT Data Manipulation and Analysis, by Gary Perlman |
---|---|
NAME | dm - data manipulation with conditional transformations |
SYNOPSIS | dm [Efile] [expressions] |
DESCRIPTION |
dm is a data manipulating program for column extraction from files,
possibly based on conditions, and production of algebraic combinations
of columns. dm reads whitespace separated fields on each line of its
input. dm takes a series of expressions, and for each line of the
input, dm reevaluates and prints the values of those expressions.
Numerical values of fields on a line can be accessed by the letter 'x'
followed by a column number. Character strings can be accessed by the
letter 's' followed by a column number. For example, for the input
line:
12 45.2s1 is the string '12', x2 is the number 45.2 (which is not the same as s2, the string '45.2'). Column Extraction. Columns are extracted with string expressions. To extract the 3rd, 8th, 1st and 2nd columns (in that order) from "file," one would type: dm s3 s8 s1 s2 < fileAlgebraic Transformations. To print, in order, the sum of the columns 1 and 2, the difference of columns 3 and 4, and the square root of the sum of squares of the 1st and 3rd columns, one could type the command: dm "x1+x2" "x3-x4" "(x1*x1+x3*x3)^.5"There are the usual mathematical functions that allow expressions like: dm "exp(x1) + log(log(x2))" "floor (x1/x2)" "sin x1"Testing Conditions. Expressions can be conditionally evaluated by comparing values. To print the ratio of x1 and x2, and check the value of x2 before division and print 'error' if x2 is 0, one could type: dm "if x2 = 0 then 'error' else x1/x2"To extract lines in which two columns are the same string, say the 5th and 2nd, one would type: dm "if s5 = s2 then INPUT else NEXT"Other Features. dm has comparison, algebraic, and logical operators, and special variables to take control in exceptional conditions. These include: INPUT, the current input line in string form; INLINE, the current input line number; N, the field count in INPUT; SUM, the sum of the numbers in INPUT; RAND, a uniform random number different for each line; NIL, an expression that causes no output; NEXT, which terminates evaluation on INPUT and goes to the next line; and EXIT, which terminates all processing. |
LIMITS | Input fields longer than 15 characters are truncated silently. The number of input columns, output expressions, and expression constants are limited to 100. |
SEE ALSO |
DM Tutorial and Manual
series to generate additive series. colex for column extraction. linex for line extraction. reverse to reverse lines or columns. perm to randomize or sort lines. maketrix to form matrix format files. transpose to transpose matrix format files. dsort to sort matrix format files. probdist for probability distribution functions. |
UPDATED | November 26, 1985 |
EXAMPLES | Some example inputs and outputs are available |