public class ZipfDistribution extends AbstractDiscreteDistribution
The ZipfDistribution class provides a finite
distribution parameterized by a positive integer number of outcomes,
with outcome probability inversely proportional to the rank of
the outcome (ordered by probability). Many natural language
phenomena, such as unigram word probabilities and named-entity
probabilities, roughly follow a Zipf distribution.
The Zipf probability distribution $\mathrm{Zipf}_n$ with $n$
outcomes is defined by assigning a probability to the
rank $r$ outcome, for $1 \le r \le n$, by:

$$\mathrm{Zipf}_n(r) = \frac{1/r}{Z_n}$$

where $Z_n$ is the normalizing factor
for a Zipf distribution with $n$ outcomes:

$$Z_n = \sum_{1 \le j \le n} \frac{1}{j}$$
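The definition above can be sketched in plain Java. This is a minimal stand-alone illustration, not the library's implementation: the class name `ZipfSketch` and its method names are hypothetical, and the zero-based array layout (index `i` holds rank `i + 1`) is an assumption made for the sketch.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch of the Zipf definition; not the library's code. */
final class ZipfSketch {

    /** Normalizing factor Z_n: the sum of 1/j for j = 1..n. */
    static double normalizer(int numOutcomes) {
        if (numOutcomes <= 0) {
            throw new IllegalArgumentException(
                "Number of outcomes must be positive, found " + numOutcomes);
        }
        double z = 0.0;
        for (int j = 1; j <= numOutcomes; ++j) {
            z += 1.0 / j;
        }
        return z;
    }

    /** Zipf_n(r) = (1/r)/Z_n for 1 <= r <= n, and 0.0 for any other rank. */
    static double probability(long rank, int numOutcomes) {
        double z = normalizer(numOutcomes);
        if (rank < 1 || rank > numOutcomes) {
            return 0.0;
        }
        return (1.0 / rank) / z;
    }

    /** Probabilities in rank order; index i holds rank i + 1 (assumed layout). */
    static double[] probabilities(int numOutcomes) {
        double z = normalizer(numOutcomes);
        double[] probs = new double[numOutcomes];
        for (int r = 1; r <= numOutcomes; ++r) {
            probs[r - 1] = (1.0 / r) / z;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Three outcomes: Z_3 = 1 + 1/2 + 1/3 = 11/6.
        System.out.println(Arrays.toString(probabilities(3)));
    }
}
```

For three outcomes the normalizing factor is 11/6, so the probabilities of ranks 1, 2 and 3 are 6/11, 3/11 and 2/11 up to floating-point rounding.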
The Zipf distribution class provides a method for returning the entropy of the Zipf distribution. It also provides a static method for returning a Zipf distribution's probabilities in rank order. This latter method is useful for comparing an observed distribution to the one expected under a Zipf distribution.
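As a rough illustration of both uses, the following plain-Java sketch computes the entropy of a Zipf distribution and compares observed counts to the Zipf expectation. It does not call the library. The class `ZipfCompareSketch`, its method names, the choice of base-2 logarithms (entropy in bits), and the use of maximum absolute deviation as the comparison measure are all assumptions made for the example.

```java
/** Hypothetical stand-alone sketch; not the library's code. */
final class ZipfCompareSketch {

    /** Zipf probabilities in rank order; index i holds rank i + 1 (assumed layout). */
    static double[] zipfProbabilities(int numOutcomes) {
        if (numOutcomes <= 0) {
            throw new IllegalArgumentException("Number of outcomes must be positive.");
        }
        double z = 0.0;
        for (int j = 1; j <= numOutcomes; ++j) {
            z += 1.0 / j;
        }
        double[] probs = new double[numOutcomes];
        for (int r = 1; r <= numOutcomes; ++r) {
            probs[r - 1] = (1.0 / r) / z;
        }
        return probs;
    }

    /** Entropy in bits (base 2 is an assumption): the negated sum of p * log2(p). */
    static double entropyBits(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    /**
     * Largest absolute gap between observed relative frequencies and Zipf
     * probabilities. Counts must already be sorted from most to least frequent.
     */
    static double maxAbsDeviation(long[] countsByRank) {
        long total = 0;
        for (long c : countsByRank) {
            total += c;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("Counts must sum to a positive value.");
        }
        double[] expected = zipfProbabilities(countsByRank.length);
        double max = 0.0;
        for (int i = 0; i < countsByRank.length; ++i) {
            double observed = (double) countsByRank[i] / total;
            max = Math.max(max, Math.abs(observed - expected[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        long[] counts = {60, 30, 20}; // made-up counts, chosen to be exactly Zipfian
        System.out.println("entropy (bits): " + entropyBits(zipfProbabilities(3)));
        System.out.println("max deviation:  " + maxAbsDeviation(counts));
    }
}
```

A deviation near zero suggests the observed ranks track a Zipf distribution closely. Other measures, such as a divergence or a goodness-of-fit statistic, could be substituted for the maximum absolute deviation.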
| Constructor and Description |
|---|
| `ZipfDistribution(int numOutcomes)` Construct a Zipf distribution with the specified number of outcomes. |
| Modifier and Type | Method and Description |
|---|---|
| `long` | `maxOutcome()` Returns the maximum outcome, which is just the number of outcomes. |
| `long` | `minOutcome()` Returns one, the minimum outcome in a Zipf distribution. |
| `int` | `numOutcomes()` Returns the number of non-zero outcomes for this Zipf distribution. |
| `double` | `probability(long rank)` Returns the probability of the outcome at the specified rank. |
| `static double[]` | `zipfDistribution(int numOutcomes)` Returns the array of probabilities indexed by rank for the Zipf distribution with the specified number of outcomes. |
Methods inherited from class AbstractDiscreteDistribution:
cumulativeProbability, cumulativeProbabilityGreater, cumulativeProbabilityLess, entropy, log2Probability, mean, variance

public ZipfDistribution(int numOutcomes)
Parameters: numOutcomes - Number of outcomes for the distribution.
Throws: IllegalArgumentException - If the number of outcomes specified is not positive.

public long minOutcome()
minOutcome in interface DiscreteDistribution
minOutcome in class AbstractDiscreteDistribution

public long maxOutcome()
maxOutcome in interface DiscreteDistribution
maxOutcome in class AbstractDiscreteDistribution

public int numOutcomes()

public double probability(long rank)
Returns 0.0 for non-positive ranks or ranks greater than the number of ranks in this distribution.
probability in interface DiscreteDistribution
probability in class AbstractDiscreteDistribution
Parameters: rank - Rank of outcome.

public static double[] zipfDistribution(int numOutcomes)
Parameters: numOutcomes - Number of outcomes.

Copyright © 2019 Alias-i, Inc. All rights reserved.