PACKAGE | |STAT Data Manipulation and Analysis, by Gary Perlman |
---|---|
NAME | probdist - probability distribution functions, random number generation |
SYNOPSIS | probdist [-qv] [-s seed] [ function distribution [parameters] value ] |
DESCRIPTION |
probdist is a family of probability distribution functions. The functions are prob (the probability of a statistic), crit (the critical value for a probability), and rand (a random sample). For example, a request can ask for the probability of an F ratio:
    probdist prob F 2 12 3.46
or a random sample of normal-z values:
    probdist rand n 100
A single request can be supplied on the command line:
    probdist rand z 20
or several can be supplied to the standard input. Blank input lines are ignored.
    > probdist
    prob binomial 20 3/4 18
    0.091260
    critical t 8 .05
    2.306004
Functions and distributions can be abbreviated with single letters; only the first letter is used, and case does not matter. The normal-z distribution can be requested with z or n. The chi-square distribution can be requested with c or x.
Probabilities are between 0 and 1, usually not including those values. Probabilities supplied to the program can be given in decimal form (e.g., .05) or as a ratio of two integers (e.g., 1/20). The ratio form must be used to specify the probability of a success in the binomial distribution. |
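The abbreviation and probability-format rules above can be sketched in Python. This is an illustration only, not probdist's own code: the tables `FUNCTIONS` and `DISTRIBUTIONS` and the helpers `expand` and `parse_probability` are hypothetical names introduced here.

```python
from fractions import Fraction

# Hypothetical lookup tables (not from probdist): first letter -> full name.
FUNCTIONS = {"p": "prob", "c": "crit", "r": "rand"}
DISTRIBUTIONS = {
    "u": "uniform", "b": "binomial",
    "n": "normal", "z": "normal",          # normal-z answers to n or z
    "t": "t", "f": "F",
    "c": "chi-square", "x": "chi-square",  # chi-square answers to c or x
}

def expand(word, table):
    """Resolve a name from its first letter only, ignoring case."""
    return table[word[0].lower()]

def parse_probability(text):
    """Accept decimal form (.05) or a ratio of two integers (1/20)."""
    return Fraction(text)

# A request such as "critical t 8 .05" would resolve like this:
function = expand("critical", FUNCTIONS)
distribution = expand("t", DISTRIBUTIONS)
alpha = parse_probability(".05")
```

Because only the first letter is examined, any word beginning with t selects Student's t, and .05 and 1/20 parse to the same value.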
OPTIONS |
The following standard help options are supported. The program exits after displaying the help.
|
EXAMPLES |
dm is useful for converting the output from probdist.
Normal sample with mean 20 and standard deviation 10:
    probdist random z 100 | dm "x1*10+20"
Uniform random integers between 20 and 29 (inclusive):
    probdist random u 100 | dm "floor(x1*10+20)"
|
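For readers without dm at hand, the two transformations can be sketched in Python. The sketch uses Python's own random number generator as a stand-in for probdist output, so the values will not match probdist's; the function names are hypothetical.

```python
import math
import random

def scale_normal(z, mean=20.0, sd=10.0):
    """Analogue of dm "x1*10+20": rescale a standard normal value."""
    return z * sd + mean

def uniform_integer(u, low=20, count=10):
    """Analogue of dm "floor(x1*10+20)": map u in (0,1) to low..low+count-1."""
    return math.floor(u * count + low)

rng = random.Random(1989)  # arbitrary seed, used only for repeatability
normal_sample = [scale_normal(rng.gauss(0.0, 1.0)) for _ in range(100)]
integer_sample = [uniform_integer(rng.random()) for _ in range(100)]
```

Multiplying by 10 and taking the floor yields ten equally likely integers, which is why the upper bound is 29 rather than 30.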
DISTRIBUTIONS |
e = epsilon (very small number), oo = infinity, p = a/b, * = 1-tail test
                  params     mean           min    max    prob
    Uniform                  0.5            0+e    1-e    x..1
    Binomial      N p        Np             0      N      x..N*
    Normal Z                 0              -oo    +oo    -oo..x*
    t             df         see F          0      +oo    x..oo
    Chi-Square    df         df             0      +oo    x..oo
    F             df1 df2    df2/(df2-2)    0      +oo    x..oo
The critical value functions invert the probability functions, refining their approximations until the computed distribution value produces a probability within .000001 of the requested value. The random samples are based on uniform random numbers between 0.0 and 1.0 (but not including those extreme values); the uniform random numbers are used as input to the critical statistic calculation.
UNIFORM: prob|crit|rand u p|#
The uniform probability and critical value functions both return 1 minus the value.
    prob u .9          # equals 0.1
    crit uni .7        # equals 0.3
    rand uniform 10    # produces 10 random numbers
BINOMIAL: prob|crit|rand b N P r|p|#
The binomial distribution returns the cumulative probability from a given value r (number of successes) up to N (the number of trials). The value of P, the probability of a success, must be specified as a ratio of integers (e.g., 1/2, not .5). To compute the lower tail of the binomial distribution, that is, the probability of getting r or fewer successes, the following rule can be used:
    prob ( B(N,p) <= r ) = prob ( B(N,1-p) >= N-r )
For a specified significance level, such as the .05 level, there may be no critical statistic with exactly the desired probability. In most cases, the probability of the statistic will be less than that requested. In some cases, there may be no critical statistic with less than the requested probability (e.g., the probability of 5 successes in 5 binomial trials with p=1/2 is 0.03125), so the computed value would be one greater than the maximum possible (e.g., for the B(5,1/2) example: 6).
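The upper-tail probability, the lower-tail rule, and the critical-value behavior described above can be checked with a short Python sketch. It sums exact rational terms rather than using probdist's logarithmic approximation, and it assumes the critical value is the smallest r whose upper-tail probability does not exceed the requested level; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def upper_tail(n, p, r):
    """P(X >= r) for X ~ B(n, p), summed exactly; p is a Fraction."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k)
               for k in range(max(r, 0), n + 1))

def lower_tail(n, p, r):
    """P(X <= r) via the rule P(B(n,p) <= r) = P(B(n,1-p) >= n-r)."""
    return upper_tail(n, 1 - p, n - r)

def critical_value(n, p, alpha):
    """Smallest r with P(X >= r) <= alpha (an assumed definition).

    Returns n + 1 when even r = n is more probable than alpha.
    """
    for r in range(n + 1):
        if upper_tail(n, p, r) <= alpha:
            return r
    return n + 1

p_18_of_20 = float(upper_tail(20, Fraction(3, 4), 18))        # about 0.091260
crit_30 = critical_value(30, Fraction(3, 4), Fraction(1, 20))  # 27
crit_5 = critical_value(5, Fraction(1, 2), Fraction(1, 100))   # 6, one past N
```

With exact fractions the lower-tail rule holds as an identity, so `lower_tail(n, p, r)` equals `1 - upper_tail(n, p, r + 1)` with no rounding error.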
To compute random binomial numbers, N uniform random numbers are generated and the count of those less than p is the random statistic; with this algorithm, under the verbose option, the probability reported with the random statistic is not meaningful. Probability calculations are based on a logarithmic approximation of sums of products of powers of primes, thought to be accurate to over ten decimal places for N up to 1000.
    prob binomial 20 1/2 17    # equals 0.001288
    crit bin 30 3/4 .05        # equals 27 (p = .037)
    rand B 40 1/4 10           # produces 10 random numbers
NORMAL-Z: prob|crit|rand n|z Z|p|#
The normal-z probability function computes values for the one-tailed cumulative probability from -oo up to the given value. The function is accurate to six decimal places for z values with absolute values up to 6. The probability of a normal-z value is computed with CACM Algorithm 209. The quick version of the random number generation adds twelve uniform random numbers and subtracts 6.0.
    prob normal 2.0    # equals 0.977250
    crit n .05         # equals -1.644854 (one-tailed)
    rand Z 10          # produces 10 random numbers
STUDENT'S T: prob|crit|rand t df t|p|#
The probability of Student's t value is computed from the relation
    t(n)*t(n) = F(1,n)
so the probability reported for a t statistic is the two-tailed probability of |t| exceeding the obtained value.
    prob t 20 2.0            # equals 0.059
    crit t-test 10 .05       # equals 2.23 (two-tailed)
    rand tarantula 30 10     # produces 10 random numbers
F: prob|crit|rand f df1 df2 F|p|#
The probability of an F ratio is computed with CACM Algorithm 322.
    prob f 1 20 3.4    # equals 0.08
    crit F 4 10 .05    # equals 3.48
    rand F 5 30 10     # produces 10 random numbers
CHI-SQUARE: prob|crit|rand c|x df x|p|#
The probability of a chi-square value is computed with CACM Algorithm 299.
    prob chi2 5 18           # equals 0.003
    crit chi-square 2 .05    # equals 5.99
    rand X 1 10              # produces 10 random numbers
|
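The two random-number shortcuts described in this section, counting uniforms below p for the binomial and summing twelve uniforms for the quick normal, can be sketched in Python. This is a minimal illustration that assumes Python's `random` module in place of probdist's uniform generator; the function names are hypothetical.

```python
import random

def rand_binomial(n, p, rng):
    """Count how many of n uniform random numbers fall below p."""
    return sum(1 for _ in range(n) if rng.random() < p)

def rand_normal_quick(rng):
    """Add twelve uniform random numbers and subtract 6.0."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(1989)  # arbitrary seed, used only for repeatability
binomial_sample = [rand_binomial(40, 1 / 4, rng) for _ in range(10)]
normal_sample = [rand_normal_quick(rng) for _ in range(10000)]

mean = sum(normal_sample) / len(normal_sample)
variance = sum((z - mean) ** 2 for z in normal_sample) / len(normal_sample)
```

A sum of twelve uniform numbers has mean 6 and variance 12 * (1/12) = 1, so subtracting 6.0 gives an approximately standard normal value. The approximation can never produce a value beyond -6 or +6.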
UPDATED | August 21, 1989 |