Statistics extension for Numbas

This extension provides a collection of statistical functions. Most of them wrap the jStat library; a few extra functions are not in jStat.

The data binning functions were written by Janet Cheung.

Functions

This list of functions contains descriptions copied from the jStat documentation. Click on the function name to see the original documentation.

There are also some extra functions not in jStat.

Descriptive statistics of a list of numbers

sum(array)

Returns the sum of the array vector.

sumsqrd(array)

Returns the sum of the squares of the values in the array vector.

sumsqerr(array)

Returns the sum of squared errors of the array vector, that is, the sum of the squared deviations of each value from the mean.
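The difference between sumsqrd and sumsqerr is easy to miss, so here is a minimal Python sketch of the two quantities. The function names mirror the list above but are plain Python stand-ins, not the extension's code, and reading "errors" as deviations from the mean is an assumption.

```python
def sumsqrd(values):
    """Sum of the squares of the values."""
    return sum(x * x for x in values)


def sumsqerr(values):
    """Sum of squared deviations from the mean (assumed meaning of 'errors')."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


data = [1, 2, 3, 4]
print(sumsqrd(data))   # 30
print(sumsqerr(data))  # 5.0
```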

product(array)

Returns the product of the array vector.

min(array)

Returns the minimum value of the array vector.

max(array)

Returns the maximum value of the array vector.

mean(array)

Returns the mean of the array vector.

meansqerr(array)

Returns the mean squared error of the array vector.

geomean(array)

Returns the geometric mean of the array vector.

median(array)

Returns the median of the array vector.

cumsum(array)

Returns an array of partial sums in the sequence.

diff(array)

Returns an array of the successive differences of the array.
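As an illustration of cumsum and diff, the following Python sketch computes partial sums and successive differences of a list. It is a stand-in written for this example, not the extension's implementation.

```python
from itertools import accumulate


def cumsum(values):
    """Partial sums: element i is the sum of values[0..i]."""
    return list(accumulate(values))


def diff(values):
    """Successive differences: element i is values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


data = [1, 4, 9, 16]
print(cumsum(data))  # [1, 5, 14, 30]
print(diff(data))    # [3, 5, 7]
```

Note that diff returns a list one element shorter than its input.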

range(array)

Returns the range of the array vector.

variance(array)

Returns the variance of the array vector. By default, the population variance is calculated. Passing true as the optional second argument, is_sample, computes the sample variance instead.

population_variance(array)

Returns the population variance of the array vector.

sample_variance(array)

Returns the sample variance of the array vector.

stdev(array)

Returns the standard deviation of the array vector. By default, the population standard deviation is returned. Passing true as the optional second argument, is_sample, returns the sample standard deviation.

population_stdev(array)

Returns the population standard deviation of the array vector.

sample_stdev(array)

Returns the sample standard deviation of the array vector.
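The population and sample versions differ only in the divisor: n for the population, n - 1 for the sample. A minimal Python sketch, with an is_sample argument named after the signatures in this list but otherwise written from scratch for illustration:

```python
import math


def variance(values, is_sample=False):
    """Population variance by default; sample variance if is_sample is true."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if is_sample else n)


def stdev(values, is_sample=False):
    """Square root of the corresponding variance."""
    return math.sqrt(variance(values, is_sample))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))        # 4.0  (32 / 8)
print(stdev(data))           # 2.0
print(variance(data, True))  # 32 / 7, slightly larger than the population value
```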

meandev(array)

Returns the mean absolute deviation of the array vector.

meddev(array)

Returns the median absolute deviation of the array vector.
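To show how the two deviation measures react to an outlier, here is a small Python sketch. It assumes the mean absolute deviation is taken around the mean and the median absolute deviation around the median; the function names are stand-ins for illustration.

```python
from statistics import median


def meandev(values):
    """Mean of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def meddev(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median(abs(x - centre) for x in values)


data = [1, 2, 3, 4, 100]
print(meandev(data))  # 31.2
print(meddev(data))   # 1
```

The single large value dominates the mean absolute deviation but barely affects the median absolute deviation.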

coeffvar(array)

Returns the coefficient of variation of the array vector.

quartiles(array)

Returns the quartiles of the array vector.

Correlation of two samples

covariance(array1, array2)

Returns the covariance of the array1 and array2 vectors.

corrcoeff(array1, array2)

Returns the population correlation coefficient of the array1 and array2 vectors (Pearson's Rho).
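The following Python sketch shows how covariance and Pearson's correlation coefficient relate. Whether the extension divides the covariance by n or n - 1 is not stated here, so the is_sample argument is a hypothetical switch added for illustration; the correlation coefficient does not depend on that choice.

```python
import math


def covariance(xs, ys, is_sample=True):
    """Covariance of two equal-length lists (the divisor is an assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if is_sample else n)


def corrcoeff(xs, ys):
    """Pearson correlation: co-deviation scaled by both spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(covariance(xs, ys))  # 5.0
print(corrcoeff(xs, ys))   # 1.0, since ys is an exact linear function of xs
```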

stdev(array, is_sample)

Returns the standard deviation of the array vector. By default, the population standard deviation is returned. Passing true for is_sample returns the sample standard deviation.

variance(array, is_sample)

Returns the variance of the array vector. By default, the population variance is calculated. Passing true for is_sample computes the sample variance instead.

mode(array)

Returns the mode of the array vector. If there are multiple modes then mode() will return all of them.
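For comparison, Python's standard library has a function with similar "return every mode" behaviour. This is only an analogy to illustrate the idea; the form of the extension's return value when there is a single mode is not specified here.

```python
from statistics import multimode

# Both 2 and 3 occur twice, so both are reported.
print(multimode([1, 2, 2, 3, 3, 4]))  # [2, 3]

# A single most common value still comes back as a list in Python.
print(multimode([1, 1, 2]))  # [1]
```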

Distributions

betapdf(x, alpha, beta)

Returns the value of x in the pdf of the Beta distribution with parameters alpha and beta.

betacdf(x, alpha, beta)

Returns the value of x in the cdf for the Beta distribution with parameters alpha and beta.

betainv(p, alpha, beta)

Returns the value of p in the inverse of the cdf for the Beta distribution with parameters alpha and beta.

betamean(alpha, beta)

Returns the mean of the Beta distribution with parameters alpha and beta.

betamedian(alpha, beta)

Returns the median of the Beta distribution with parameters alpha and beta.

betamode(alpha, beta)

Returns the mode of the Beta distribution with parameters alpha and beta.

betasample(alpha, beta)

Returns a random number whose distribution is the Beta distribution with parameters alpha and beta.

betavariance(alpha, beta)

Returns the variance of the Beta distribution with parameters alpha and beta.

centralFpdf(x, df1, df2)

Given x in the range [0, infinity), returns the probability density of the (central) F distribution at x.

centralFcdf(x, df1, df2)

Given x in the range [0, infinity), returns the cumulative probability of the central F distribution at x. That is, centralFcdf(2.5, 10, 20) returns the probability that a number randomly selected from the central F distribution with df1 = 10 and df2 = 20 is less than 2.5.

centralFinv(p, df1, df2)

Given p in [0, 1), returns the value of x for which the cumulative probability of the central F distribution is p. That is, centralFinv(p, df1, df2) = x if and only if centralFcdf(x, df1, df2) = p.

centralFmean(df1, df2)

Returns the mean of the (Central) F distribution.

centralFmode(df1, df2)

Returns the mode of the (Central) F distribution.

centralFsample(df1, df2)

Returns a random number whose distribution is the (Central) F distribution.

centralFvariance(df1, df2)

Returns the variance of the (Central) F distribution.

cauchypdf(x, local, scale)

Returns the value of x in the pdf of the Cauchy distribution with a location (median) of local and scale factor of scale.

cauchycdf(x, local, scale)

Returns the value of x in the cdf of the Cauchy distribution with a location (median) of local and scale factor of scale.

cauchyinv(p, local, scale)

Returns the value of p in the inverse of the cdf for the Cauchy distribution with a location (median) of local and scale factor of scale.
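The Cauchy pdf, cdf and inverse cdf have simple closed forms, sketched below in Python. The function names are illustrative stand-ins and the formulas are the standard textbook ones, not taken from the extension's source.

```python
import math


def cauchy_pdf(x, local, scale):
    z = (x - local) / scale
    return 1 / (math.pi * scale * (1 + z * z))


def cauchy_cdf(x, local, scale):
    return 0.5 + math.atan((x - local) / scale) / math.pi


def cauchy_inv(p, local, scale):
    """Inverse cdf: the x for which cauchy_cdf(x, local, scale) == p."""
    return local + scale * math.tan(math.pi * (p - 0.5))


# Half of the probability lies below the location parameter.
print(cauchy_cdf(0, 0, 1))  # 0.5
# The inverse undoes the cdf, up to floating-point rounding.
print(cauchy_inv(cauchy_cdf(2.0, 0, 1), 0, 1))
```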

cauchymedian(local, scale)

Returns the value of the median for the Cauchy distribution with a location (median) of local and scale factor of scale.

cauchymode(local, scale)

Returns the value of the mode for the Cauchy distribution with a location (median) of local and scale factor of scale.

cauchysample(local, scale)

Returns a random number whose distribution is the Cauchy distribution with a location (median) of local and scale factor of scale.

chisquarepdf(x, dof)

Returns the value of x in the pdf of the Chi Square distribution with dof degrees of freedom.

chisquarecdf(x, dof)

Returns the value of x in the cdf of the Chi Square distribution with dof degrees of freedom.

chisquareinv(p, dof)

Returns the value of p in the inverse of the cdf for the Chi Square distribution with dof degrees of freedom.

chisquaremean(dof)

Returns the value of the mean for the Chi Square distribution with dof degrees of freedom.

chisquaremedian(dof)

Returns the value of the median for the Chi Square distribution with dof degrees of freedom.

chisquaremode(dof)

Returns the value of the mode for the Chi Square distribution with dof degrees of freedom.

chisquaresample(dof)

Returns a random number whose distribution is the Chi Square distribution with dof degrees of freedom.

chisquarevariance(dof)

Returns the value of the variance for the Chi Square distribution with dof degrees of freedom.

exponentialpdf(x, rate)

Returns the value of x in the pdf of the Exponential distribution with the parameter rate (lambda).

exponentialcdf(x, rate)

Returns the value of x in the cdf of the Exponential distribution with the parameter rate (lambda).

exponentialinv(p, rate)

Returns the value of p in the inverse of the cdf for the Exponential distribution with the parameter rate (lambda).
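As a concrete illustration of how the pdf, cdf and inverse cdf fit together, here is a Python sketch using the standard closed forms for the Exponential distribution. The names are stand-ins chosen for this example.

```python
import math


def exponential_pdf(x, rate):
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    return 1 - math.exp(-rate * x) if x >= 0 else 0.0


def exponential_inv(p, rate):
    """Inverse cdf: the x for which exponential_cdf(x, rate) == p."""
    return -math.log(1 - p) / rate


rate = 2
median = exponential_inv(0.5, rate)  # ln(2) / rate
print(median)
print(exponential_cdf(median, rate))  # approximately 0.5
```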

exponentialmean(rate)

Returns the value of the mean for the Exponential distribution with the parameter rate (lambda).

exponentialmedian(rate)

Returns the value of the median for the Exponential distribution with the parameter rate (lambda).

exponentialmode(rate)

Returns the value of the mode for the Exponential distribution with the parameter rate (lambda).

exponentialsample(rate)

Returns a random number whose distribution is the Exponential distribution with the parameter rate (lambda).

exponentialvariance(rate)

Returns the value of the variance for the Exponential distribution with the parameter rate (lambda).

gammapdf(x, shape, scale)

Returns the value of x in the pdf of the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammacdf(x, shape, scale)

Returns the value of x in the cdf of the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammainv(p, shape, scale)

Returns the value of p in the inverse of the cdf for the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammamean(shape, scale)

Returns the value of the mean for the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammamode(shape, scale)

Returns the value of the mode for the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammasample(shape, scale)

Returns a random number whose distribution is the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.

gammavariance(shape, scale)

Returns the value of the variance for the Gamma distribution with the parameters shape (k) and scale (theta). Notice that if using the alpha beta convention, scale = 1/beta.
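The note about the alpha/beta convention can be made concrete with a short Python sketch. It converts a rate beta to a scale and evaluates the standard formulas for the mean and variance; the helper names are hypothetical and exist only for this example.

```python
def scale_from_rate(beta):
    """Convert the rate parameter beta to the scale parameter theta."""
    return 1 / beta


def gamma_mean(shape, scale):
    return shape * scale


def gamma_variance(shape, scale):
    return shape * scale ** 2


alpha, beta = 3, 2
scale = scale_from_rate(beta)
print(scale)                         # 0.5
print(gamma_mean(alpha, scale))      # 1.5, equal to alpha / beta
print(gamma_variance(alpha, scale))  # 0.75, equal to alpha / beta ** 2
```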

invgammapdf(x, shape, scale)

Returns the value of x in the pdf of the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammacdf(x, shape, scale)

Returns the value of x in the cdf of the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammainv(p, shape, scale)

Returns the value of p in the inverse of the cdf for the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammamean(shape, scale)

Returns the value of the mean for the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammamode(shape, scale)

Returns the value of the mode for the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammasample(shape, scale)

Returns a random number whose distribution is the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

invgammavariance(shape, scale)

Returns the value of the variance for the Inverse-Gamma distribution with parameters shape (alpha) and scale (beta).

kumaraswamypdf(x, alpha, beta)

Returns the value of x in the pdf of the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamycdf(x, alpha, beta)

Returns the value of x in the cdf of the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamyinv(p, alpha, beta)

Returns the value of p in the inverse of the cdf for the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamymean(alpha, beta)

Returns the value of the mean of the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamymedian(alpha, beta)

Returns the value of the median of the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamymode(alpha, beta)

Returns the value of the mode of the Kumaraswamy distribution with parameters alpha and beta.

kumaraswamyvariance(alpha, beta)

Returns the value of the variance of the Kumaraswamy distribution with parameters alpha and beta.

lognormalpdf(x, mu, sigma)

Returns the value of x in the pdf of the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalcdf(x, mu, sigma)

Returns the value of x in the cdf of the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalinv(p, mu, sigma)

Returns the value of p in the inverse of the cdf for the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalmean(mu, sigma)

Returns the value of the mean for the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalmedian(mu, sigma)

Returns the value of the median for the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalmode(mu, sigma)

Returns the value of the mode for the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalsample(mu, sigma)

Returns a random number whose distribution is the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).

lognormalvariance(mu, sigma)

Returns the value of the variance for the Log-normal distribution with parameters mu (mean of the Normal distribution) and sigma (standard deviation of the Normal distribution).
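Because mu and sigma describe the underlying Normal distribution rather than the Log-normal one, the mean of the Log-normal is not mu. The Python sketch below uses the standard closed forms to make that visible; the function names are stand-ins for illustration.

```python
import math
from statistics import NormalDist


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) when log(X) is Normal with mean mu and std sigma."""
    return NormalDist(mu, sigma).cdf(math.log(x))


def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma ** 2 / 2)


def lognormal_median(mu, sigma):
    return math.exp(mu)


mu, sigma = 0, 1
print(lognormal_median(mu, sigma))  # 1.0
print(lognormal_mean(mu, sigma))    # exp(0.5), larger than the median
print(lognormal_cdf(lognormal_median(mu, sigma), mu, sigma))  # 0.5
```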

normalpdf(x, mean, std)

Returns the value of x in the pdf of the Normal distribution with parameters mean and std (standard deviation).

normalcdf(x, mean, std)

Returns the value of x in the cdf of the Normal distribution with parameters mean and std (standard deviation).

normalinv(p, mean, std)

Returns the value of p in the inverse cdf for the Normal distribution with parameters mean and std (standard deviation).
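The relationship between the cdf and its inverse can be checked with Python's standard library, which provides a Normal distribution class. This sketch only mirrors the argument order of the functions above; it is not the extension's code.

```python
from statistics import NormalDist


def normal_cdf(x, mean, std):
    return NormalDist(mean, std).cdf(x)


def normal_inv(p, mean, std):
    """Inverse cdf: the x for which normal_cdf(x, mean, std) == p."""
    return NormalDist(mean, std).inv_cdf(p)


p = normal_cdf(1.96, 0, 1)
print(p)                    # approximately 0.975
print(normal_inv(p, 0, 1))  # approximately 1.96
```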

normalmean(mean, std)

Returns the value of the mean for the Normal distribution with parameters mean and std (standard deviation).

normalmedian(mean, std)

Returns the value of the median for the Normal distribution with parameters mean and std (standard deviation).

normalmode(mean, std)

Returns the value of the mode for the Normal distribution with parameters mean and std (standard deviation).

normalsample(mean, std)

Returns a random number whose distribution is the Normal distribution with parameters mean and std (standard deviation).

normalvariance(mean, std)

Returns the value of the variance for the Normal distribution with parameters mean and std (standard deviation).

paretopdf(x, scale, shape)

Returns the value of x in the pdf of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

paretocdf(x, scale, shape)

Returns the value of x in the cdf of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

paretoinv(p, scale, shape)

Returns the value of p in the inverse of the cdf for the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).
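The Pareto cdf and its inverse have short closed forms, sketched here in Python with the standard textbook formulas. The function names are illustrative only.

```python
def pareto_cdf(x, scale, shape):
    """P(X <= x); the distribution has no mass below scale."""
    if x < scale:
        return 0.0
    return 1 - (scale / x) ** shape


def pareto_inv(p, scale, shape):
    """Inverse cdf: the x for which pareto_cdf(x, scale, shape) == p."""
    return scale * (1 - p) ** (-1 / shape)


print(pareto_cdf(2, 1, 1))    # 0.5
print(pareto_inv(0.5, 1, 1))  # 2.0
```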

paretomean(scale, shape)

Returns the value of the mean of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

paretomedian(scale, shape)

Returns the value of the median of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

paretomode(scale, shape)

Returns the value of the mode of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

paretovariance(scale, shape)

Returns the value of the variance of the Pareto distribution with parameters scale (x<sub>m</sub>) and shape (alpha).

studenttpdf(x, dof)

Returns the value of x in the pdf of the Student's T distribution with dof degrees of freedom.

studenttcdf(x, dof)

Returns the value of x in the cdf of the Student's T distribution with dof degrees of freedom.

studenttinv(p, dof)

Returns the value of p in the inverse of the cdf for the Student's T distribution with dof degrees of freedom.

studenttmean(dof)

Returns the value of the mean of the Student's T distribution with dof degrees of freedom.

studenttmedian(dof)

Returns the value of the median of the Student's T distribution with dof degrees of freedom.

studenttmode(dof)

Returns the value of the mode of the Student's T distribution with dof degrees of freedom.

studenttsample(dof)

Returns a random number whose distribution is the Student's T distribution with dof degrees of freedom.

studenttvariance(dof)

Returns the value of the variance for the Student's T distribution with dof degrees of freedom.

weibullpdf(x, scale, shape)

Returns the value of x in the pdf for the Weibull distribution with parameters scale (lambda) and shape (k).

weibullcdf(x, scale, shape)

Returns the value of x in the cdf for the Weibull distribution with parameters scale (lambda) and shape (k).

weibullinv(p, scale, shape)

Returns the value of x in the inverse of the cdf for the Weibull distribution with parameters scale (lambda) and shape (k).

weibullmean(scale, shape)

Returns the value of the mean of the Weibull distribution with parameters scale (lambda) and shape (k).

weibullmedian(scale, shape)

Returns the value of the median of the Weibull distribution with parameters scale (lambda) and shape (k).

weibullmode(scale, shape)

Returns the mode of the Weibull distribution with parameters scale (lambda) and shape (k).

weibullsample(scale, shape)

Returns a random number whose distribution is the Weibull distribution with parameters scale (lambda) and shape (k).

weibullvariance(scale, shape)

Returns the variance of the Weibull distribution with parameters scale (lambda) and shape (k).

uniformpdf(x, a, b)

Returns the value of x in the pdf of the Uniform distribution from a to b.

uniformcdf(x, a, b)

Returns the value of x in the cdf of the Uniform distribution from a to b.

uniforminv(p, a, b)

Returns the inverse of the uniformcdf function, i.e. the value of x for which uniformcdf(x, a, b) = p.

uniformmean(a, b)

Returns the value of the mean of the Uniform distribution from a to b.

uniformmedian(a, b)

Returns the value of the median of the Uniform distribution from a to b.

uniformmode(a, b)

Returns the value of the mode of the Uniform distribution from a to b.

uniformsample(a, b)

Returns a random number whose distribution is the Uniform distribution from a to b.

uniformvariance(a, b)

Returns the variance of the Uniform distribution from a to b.

binomialpdf(x, n, p)

Returns the value of x in the pdf of the Binomial distribution with parameters n and p.

binomialcdf(x, n, p)

Returns the value of x in the cdf of the Binomial distribution with parameters n and p.
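For a discrete distribution the "pdf" is a probability mass function, and the cdf is a running total of it. A minimal Python sketch for the Binomial case, using the standard formula and stand-in function names:

```python
import math


def binomial_pdf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """Probability of at most x successes in n independent trials."""
    return sum(binomial_pdf(k, n, p) for k in range(0, x + 1))


print(binomial_pdf(2, 4, 0.5))  # 0.375   (6 / 16)
print(binomial_cdf(2, 4, 0.5))  # 0.6875  (11 / 16)
```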

geometricpdf(x, p)

geometriccdf(x, p)

geometricmean(p)

geometricmedian(p)

geometricmode(p)

geometricsample(p)

geometricvariance(p)

negbinpdf(x, r, p)

Returns the value of x in the pdf of the Negative Binomial distribution with parameters r and p.

negbincdf(x, r, p)

Returns the value of x in the cdf of the Negative Binomial distribution with parameters r and p.

hypgeompdf(x, population_size, success_rate, num_draws)

Returns the value of x in the pdf of the Hypergeometric distribution with parameters population_size (N, the population size), success_rate (m, the success rate), and num_draws (n, the number of draws).

hypgeomcdf(x, population_size, success_rate, num_draws)

Returns the value of x in the cdf of the Hypergeometric distribution with parameters population_size (N, the population size), success_rate (m, the success rate), and num_draws (n, the number of draws).

poissonpdf(x, l)

Returns the value of x in the pdf of the Poisson distribution with parameter l (lambda).

poissoncdf(x, l)

Returns the value of x in the cdf of the Poisson distribution with parameter l (lambda).
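The Poisson pdf and cdf can be sketched in a few lines of Python using the standard formula. The names below are stand-ins for illustration, with l playing the role of lambda as in the signatures above.

```python
import math


def poisson_pdf(x, l):
    """Probability of exactly x events when the expected count is l."""
    return math.exp(-l) * l ** x / math.factorial(x)


def poisson_cdf(x, l):
    """Probability of at most x events."""
    return sum(poisson_pdf(k, l) for k in range(0, x + 1))


print(poisson_pdf(0, 2))  # exp(-2)
print(poisson_cdf(1, 2))  # 3 * exp(-2)
```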

poissonmean(l)

poissonsample(l)

Returns a random number whose distribution is the Poisson distribution with rate parameter l (lambda).

poissonvariance(l)

triangularpdf(x, a, b, c)

Returns the value of x in the pdf of the Triangular distribution with the parameters a, b, and c.

triangularcdf(x, a, b, c)

Returns the value of x in the cdf of the Triangular distribution with the parameters a, b, and c.

triangularinv(p, a, b, c)

triangularmean(a, b, c)

Returns the value of the mean of the Triangular distribution with the parameters a, b, and c.

triangularmedian(a, b, c)

Returns the value of the median of the Triangular distribution with the parameters a, b, and c.

triangularmode(a, b, c)

Returns the value of the mode of the Triangular distribution with the parameters a, b, and c.

triangularsample(a, b, c)

Returns a random number whose distribution is the Triangular distribution with the parameters a, b, and c.

triangularvariance(a, b, c)

Returns the value of the variance of the Triangular distribution with the parameters a, b, and c.

Statistical Tests

zScore(value, mean, sd)

Returns the z-score of value for a distribution with mean mean and standard deviation sd.

zScore(value, array)

Returns the z-score of value given the data from array.

zTest(value, mean, sd, sides)

Returns the p-value of value for a distribution with mean mean and standard deviation sd. sides is an integer value 1 or 2 denoting a one- or two-sided z-test. If sides is not specified, the test defaults to a two-sided z-test.

zTest(zscore, sides)

Returns the p-value of the z-score zscore. sides is an integer value 1 or 2 denoting a one- or two-sided z-test. If sides is not specified, the test defaults to a two-sided z-test.
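The link between a z-score and its p-value can be sketched in Python as follows. The tail calculation shown (the area beyond |z| in one tail, doubled for a two-sided test) is the usual definition and is assumed here rather than taken from the extension's source; the function names are stand-ins.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


def z_test(z, sides=2):
    """p-value for a z-score; sides is 1 or 2."""
    tail = NormalDist().cdf(-abs(z))
    return tail if sides == 1 else 2 * tail


z = z_score(110, 100, 5)
print(z)             # 2.0
print(z_test(z))     # two-sided p-value, a little under 0.05
print(z_test(z, 1))  # one-sided p-value, half of the two-sided one
```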

tScore(value, mean, sd, n)

Returns the t-score of value given the mean mean, standard deviation sd and sample size n.

tScore(value, array)

Returns the t-score of value given the data from array.
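A t-score differs from a z-score in that the standard deviation is scaled by the square root of the sample size. The Python sketch below uses the standard one-sample t statistic; that this matches the extension exactly is an assumption, and the function name is a stand-in.

```python
import math


def t_score(value, mean, sd, n):
    """(value - mean) divided by the standard error sd / sqrt(n)."""
    return (value - mean) / (sd / math.sqrt(n))


# With sd = 2 and n = 16 the standard error is 0.5.
print(t_score(5.5, 5, 2, 16))  # 1.0
```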

tTest(value, mean, sd, n, sides)

Returns the p-value of value given the mean mean, standard deviation sd and sample size n. sides is an integer value 1 or 2 denoting a one- or two-sided t-test. If sides is not specified, the test defaults to a two-sided t-test.

tTest(tscore, n, sides)

Returns the p-value of the t-score tscore for a sample of size n. sides is an integer value 1 or 2 denoting a one- or two-sided t-test. If sides is not specified, the test defaults to a two-sided t-test.

tTest(value, array, sides)

Returns the p-value of value given the data in array. sides is an integer value 1 or 2 denoting a one or two sided t-test. If sides is not specified the test defaults to a two sided t-test.

anovaFScore(array1, ..., arrayN)

Returns the f-score of an ANOVA on the arrays.

anovaFTest(array1, ..., arrayN)

Returns the p-value of the f-statistic from the ANOVA test on the arrays.

ftest(fscore, df1, df2)

Returns the p-value for the fscore f-score with a df1 numerator degrees of freedom and a df2 denominator degrees of freedom.

Confidence intervals

normalci(value, alpha, sd, n)

Returns a 1-alpha confidence interval for value, assuming a normal distribution with standard deviation sd and sample size n.

normalci(value, alpha, array)

Returns a 1-alpha confidence interval for value given a normal distribution in the data from array.
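A normal confidence interval is the value plus or minus a critical z value times the standard error. The Python sketch below shows the usual construction; it is an illustration under that assumption, with a stand-in function name, and returns the interval as a (lower, upper) pair.

```python
import math
from statistics import NormalDist


def normal_ci(value, alpha, sd, n):
    """(lower, upper) bounds of a 1 - alpha normal confidence interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * sd / math.sqrt(n)
    return (value - half_width, value + half_width)


lower, upper = normal_ci(10, 0.05, 2, 16)
print(lower, upper)  # roughly 9.02 and 10.98
```

A smaller alpha gives a wider interval, and a larger n gives a narrower one.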

tci(value, alpha, sd, n)

Returns a 1-alpha confidence interval for value, based on the t distribution, given the standard deviation sd and sample size n.

tci(value, alpha, array)

Returns a 1-alpha confidence interval for value given the data from array.

Special functions

betafn(x, y)

Evaluates the Beta function at (x,y).

betaln(x, y)

Evaluates the log Beta function at (x,y).

betacf(x, a, b)

Returns the continued fraction for the incomplete Beta function with parameters a and b modified by Lentz's method evaluated at x.

ibetainv(p, a, b)

Returns the inverse of the incomplete Beta function evaluated at (p,a,b).

ibeta(x, a, b)

Returns the incomplete Beta function evaluated at (x,a,b).

gammaln(x)

Returns the Log-Gamma function evaluated at x.

gammafn(x)

Returns the Gamma function evaluated at x. This is sometimes called the 'complete' gamma function.

gammap(a, x)

Returns the lower incomplete gamma function evaluated at (a,x). This function is usually written with a lower-case Greek gamma character, and is one of the two <a href="http://mathworld.wolfram.com/IncompleteGammaFunction.html">incomplete gamma functions</a>.

factorialln(n)

Returns the natural log factorial of n.

factorial(n)

Returns the factorial of n.

combination(n, m)

Returns the number of combinations of m items chosen from n.

permutation(n, m)

Returns the number of permutations of m items chosen from n.
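The difference between combinations and permutations is whether order matters. Python's standard library has equivalents that make a convenient check; this is an analogy, not the extension's code.

```python
import math

n, m = 5, 2
print(math.factorial(n))  # 120
print(math.comb(n, m))    # 10: unordered choices of 2 items from 5
print(math.perm(n, m))    # 20: ordered choices of 2 items from 5

# Each unordered choice of m items can be arranged in m! orders.
print(math.perm(n, m) == math.comb(n, m) * math.factorial(m))  # True
```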

gammapinv(p, a)

Returns the inverse of the lower regularized incomplete Gamma function evaluated at (p,a). This function is the inverse of lowerRegularizedGamma(x, a).

erf(x)

Returns the error function evaluated at x.

erfc(x)

Returns the complementary error function evaluated at x.
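The error function and its complement sum to 1, and the error function is closely tied to the standard Normal cdf. A short Python check using the standard library (the helper name is made up for this example):

```python
import math
from statistics import NormalDist


def normal_cdf_via_erf(x):
    """Standard Normal cdf written in terms of the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


x = 0.5
print(math.erf(x) + math.erfc(x))  # 1.0, up to floating-point rounding
print(normal_cdf_via_erf(1.0))     # about 0.8413
print(NormalDist().cdf(1.0))       # the same value from the library class
```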

erfcinv(p)

Returns the inverse of the complementary error function evaluated at p.

randn(n, m)

Returns a normal deviate (mean 0 and standard deviation 1).

randg(shape, n, m)

Returns a Gamma deviate by the method of Marsaglia and Tsang.

Data binning

bin(data, num_bins, [range])

Put the list of numbers data into bins of equal size. The number of bins is given by num_bins. If range is given, then the bins span that range; otherwise they span the same range as the given data, with the first bin starting at the minimum value and the last bin ending at the maximum.
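The binning rule described above can be sketched in Python as follows. This is an illustration only: returning a list of counts, ignoring values outside the range, and placing a value that sits on an interior edge into the higher bin are all assumptions, and the names bin_counts and value_range are hypothetical.

```python
def bin_counts(data, num_bins, value_range=None):
    """Count how many values fall into each of num_bins equal-width bins."""
    if value_range is None:
        low, high = min(data), max(data)
    else:
        low, high = value_range
    if high <= low:
        raise ValueError("the range must have a positive width")
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for x in data:
        if x < low or x > high:
            continue  # outside the requested range: ignored in this sketch
        # The maximum value belongs to the last bin, not a bin of its own.
        index = min(int((x - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


data = [1, 2, 2, 3, 5, 8, 9, 10]
# Bins are [1, 4), [4, 7) and [7, 10].
print(bin_counts(data, 3))           # [4, 1, 3]
# With an explicit range the bins are [0, 6) and [6, 12].
print(bin_counts(data, 2, (0, 12)))  # [5, 3]
```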

Examples: