public class GaussianKernelDistribution extends AnotherAbstractRealDistribution
A precision parameter sets the width of the intervals into which sample values are grouped: for a precision of 0.1, all sample values between -0.05 and 0.05 are treated as 0.0.
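The grouping described above can be sketched as rounding each value to the nearest multiple of the precision and counting how many values land in each bin. This is a minimal illustration only: the class name `PrecisionBinningSketch` and its helper methods are hypothetical, and the use of `Math.round` is an assumption rather than the documented implementation. The `Map<Long, Double>` shape mirrors the declared type of `kernelPointsAndWeights`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of GaussianKernelDistribution.
class PrecisionBinningSketch {

    // Index of the bin whose center is the multiple of `precision` nearest to `value`.
    // Assumption: ties at the upper edge go to the next bin, because Math.round rounds half up.
    static long binIndex(double value, double precision) {
        return Math.round(value / precision);
    }

    // Center of a bin, i.e. the value that all members of the bin are treated as.
    static double binCenter(long index, double precision) {
        return index * precision;
    }

    // Counts how many values fall into each bin; the count acts as the weight of the shared kernel.
    static Map<Long, Double> binWeights(double[] values, double precision) {
        Map<Long, Double> weights = new HashMap<>();
        for (double v : values) {
            weights.merge(binIndex(v, precision), 1.0, Double::sum);
        }
        return weights;
    }

    public static void main(String[] args) {
        double[] values = {-0.04, 0.0, 0.04, 0.11, 0.52};
        // With precision 0.1, the first three values share the bin centered at 0.0.
        System.out.println(binWeights(values, 0.1));
    }
}
```

Storing one weighted entry per bin instead of one entry per observation keeps the number of kernels bounded by the range of the data divided by the precision.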
The bandwidth of the kernel is adjusted to the data distribution and works best for normally distributed data. It uses the formula proposed in:
Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley.
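As an illustration of a data-driven bandwidth, the sketch below computes a common rule-of-thumb value that combines the sample standard deviation with the inter-quartile range: h = 0.9 · min(σ, IQR / 1.34) · n^(-1/5). The constants 0.9 and 1.34, the quantile interpolation, and the class name `BandwidthSketch` are assumptions made for the example; this page does not state the exact formula the class applies.

```java
import java.util.Arrays;

// Hypothetical helper, not part of GaussianKernelDistribution.
class BandwidthSketch {

    // Sample standard deviation (divides by n - 1).
    static double standardDeviation(double[] x) {
        double mean = Arrays.stream(x).average().orElse(Double.NaN);
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumSq / (x.length - 1));
    }

    // Quantile by linear interpolation between neighbouring order statistics.
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lower = (int) Math.floor(pos);
        int upper = (int) Math.ceil(pos);
        return sorted[lower] + (pos - lower) * (sorted[upper] - sorted[lower]);
    }

    // Rule-of-thumb bandwidth; falls back to the standard deviation when the IQR is zero.
    static double ruleOfThumbBandwidth(double[] x) {
        if (x.length < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        double sd = standardDeviation(sorted);
        double iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);
        double spread = iqr > 0 ? Math.min(sd, iqr / 1.34) : sd;
        return 0.9 * spread * Math.pow(x.length, -0.2);
    }

    public static void main(String[] args) {
        System.out.println(ruleOfThumbBandwidth(new double[] {1, 2, 3, 4, 5}));
    }
}
```

Taking the minimum of the two spread estimates keeps a few outliers from inflating the bandwidth, which is why the quantile-based term is considered the more robust one.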
Modifier and Type | Field and Description |
---|---|
protected double |
h
smoothing parameter
|
protected java.util.Map<java.lang.Long,java.lang.Double> |
kernelPointsAndWeights
This map stores the number of occurrences of values in defined intervals.
|
protected org.apache.commons.math3.distribution.NormalDistribution |
ndist |
static int |
NUMBER_OF_BINS
Grid over the data
|
protected double |
precision
The precision parameter determines the interval size for kernels for improved efficiency.
|
protected java.util.List<java.lang.Double> |
sampleValues
All observed values, kept in a list (easier for sampling)
|
protected static java.math.MathContext |
veryPrecise |
Constructor and Description |
---|
GaussianKernelDistribution() |
GaussianKernelDistribution(double precision)
Creates a kernel distribution that groups kernels whose values fall within one precision interval into a single "bin" with added weight.
For example, a precision of 0.1 creates ten bins per unit, and a precision of 0.5 creates two bins.
|
Modifier and Type | Method and Description |
---|---|
void |
addValue(double val) |
void |
addValues(double[] values) |
double |
cumulativeProbability(double x) |
double |
cumulativeProbability(double arg0,
double arg1) |
double |
density(double x) |
protected double[] |
getDoubleArray(java.util.List<java.lang.Double> values) |
double |
getH()
The smoothing parameter for the density estimation that depends on the number of nodes and the inter-quartile range
|
double |
getNumericalMean()
The expected value:
|
double |
getReasonableLowerBound() |
double |
getReasonableUpperBound() |
double |
getSupportLowerBound() |
double |
getSupportUpperBound() |
java.util.List<java.lang.Double> |
getValues() |
boolean |
isSupportConnected() |
boolean |
isSupportLowerBoundInclusive() |
boolean |
isSupportUpperBoundInclusive() |
double |
probability(double x)
Should use density, as P(X=x) is zero for real-valued distributions
|
double |
sample()
Selects one value from the observations at random and
samples from its Gaussian kernel.
|
protected void |
updateKernels() |
void |
updateSmoothingParameter()
Uses the 'rule of thumb' for the kernel bandwidth combined with the more
robust quantile-based approximation.
|
getNumericalVariance, value
protected double precision
Change: Make this dynamic depending on the range of values (make it
public static final int NUMBER_OF_BINS
protected java.util.Map<java.lang.Long,java.lang.Double> kernelPointsAndWeights
precision argument
protected java.util.List<java.lang.Double> sampleValues
protected static java.math.MathContext veryPrecise
protected double h
protected org.apache.commons.math3.distribution.NormalDistribution ndist
public GaussianKernelDistribution()
public GaussianKernelDistribution(double precision)
precision - the interval size to be captured by one bin. Instead of creating n kernels for n values, we reduce the kernel count by grouping similar values and adjusting the weight of the shared kernel.
public void addValues(double[] values)
public void addValue(double val)
protected void updateKernels()
public void updateSmoothingParameter()
protected double[] getDoubleArray(java.util.List<java.lang.Double> values)
public double cumulativeProbability(double x)
cumulativeProbability
in interface org.apache.commons.math3.distribution.RealDistribution
cumulativeProbability
in class AnotherAbstractRealDistribution
public double cumulativeProbability(double arg0, double arg1) throws org.apache.commons.math3.exception.NumberIsTooLargeException
cumulativeProbability
in interface org.apache.commons.math3.distribution.RealDistribution
cumulativeProbability
in class org.apache.commons.math3.distribution.AbstractRealDistribution
org.apache.commons.math3.exception.NumberIsTooLargeException
public double density(double x)
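For orientation, a Gaussian kernel density estimate over weighted bins can be written as f(x) = (1 / (N·h)) · Σ w_k · φ((x − c_k) / h), where c_k is a bin center, w_k its weight, N the total weight, h the smoothing parameter, and φ the standard normal density. The sketch below evaluates that sum directly. The class `KernelDensitySketch` and its parameters are hypothetical, and how `density(double x)` is actually evaluated in this class is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of GaussianKernelDistribution.
class KernelDensitySketch {

    // Density of the standard normal distribution at z.
    static double standardNormalPdf(double z) {
        return Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI);
    }

    // Weighted Gaussian kernel density estimate at x.
    // binWeights maps a bin index to its weight; the bin center is index * precision (an assumption).
    static double density(double x, Map<Long, Double> binWeights, double precision, double h) {
        double totalWeight = 0.0;
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : binWeights.entrySet()) {
            double center = e.getKey() * precision;
            double weight = e.getValue();
            sum += weight * standardNormalPdf((x - center) / h);
            totalWeight += weight;
        }
        // With no observations there is nothing to estimate; return 0 rather than NaN.
        return totalWeight == 0.0 ? 0.0 : sum / (totalWeight * h);
    }

    public static void main(String[] args) {
        Map<Long, Double> bins = new HashMap<>();
        bins.put(0L, 3.0);  // three observations near 0.0
        bins.put(10L, 1.0); // one observation near 1.0 when precision is 0.1
        System.out.println(density(0.2, bins, 0.1, 0.5));
    }
}
```

Because each bin contributes a single kernel scaled by its weight, the cost of one evaluation grows with the number of occupied bins rather than with the number of observations.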
public double getNumericalMean()
AnotherAbstractRealDistribution
getNumericalMean
in interface org.apache.commons.math3.distribution.RealDistribution
getNumericalMean
in class AnotherAbstractRealDistribution
public double getSupportLowerBound()
public double getSupportUpperBound()
public boolean isSupportConnected()
public boolean isSupportLowerBoundInclusive()
public boolean isSupportUpperBoundInclusive()
public double probability(double x)
probability
in interface org.apache.commons.math3.distribution.RealDistribution
probability
in class org.apache.commons.math3.distribution.AbstractRealDistribution
public double sample()
sample
in interface org.apache.commons.math3.distribution.RealDistribution
sample
in class org.apache.commons.math3.distribution.AbstractRealDistribution
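The sampling approach described for `sample()` (pick one observation at random, then draw from the Gaussian kernel centered on it) can be sketched with the standard library alone. The class `KernelSamplingSketch`, its parameters, and the use of `java.util.Random` are assumptions for illustration; the actual class holds an `ndist` field of type `NormalDistribution`, whose role in sampling is not documented here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of GaussianKernelDistribution.
class KernelSamplingSketch {

    // Draws one value: a uniformly chosen observation plus Gaussian noise with standard deviation h.
    static double sample(List<Double> observations, double h, Random rng) {
        if (observations.isEmpty()) {
            throw new IllegalStateException("no observations to sample from");
        }
        double center = observations.get(rng.nextInt(observations.size()));
        return center + h * rng.nextGaussian();
    }

    public static void main(String[] args) {
        List<Double> observations = Arrays.asList(1.0, 2.0, 3.0);
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(observations, 0.3, rng));
        }
    }
}
```

Choosing an observation uniformly and then perturbing it is equivalent to sampling from the equal-weight mixture of Gaussians that defines the kernel density estimate, so no inverse CDF is needed.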
public java.util.List<java.lang.Double> getValues()
public double getH()
public double getReasonableUpperBound()
public double getReasonableLowerBound()