Package de.jungblut.math
Class MathUtils
- java.lang.Object
-
- de.jungblut.math.MathUtils
-
public final class MathUtils extends java.lang.Object
Math utilities featuring normalizations and other fancy stuff.
- Author:
- thomas.jungblut
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- static class MathUtils.PredictionOutcomePair
-
Field Summary
Fields (Modifier and Type, Field, Description)
- static double EPS
-
Method Summary
All Methods (Modifier and Type, Method, Description). All methods are static and concrete.
- static double computeAUC(java.util.List<MathUtils.PredictionOutcomePair> outcomePredictedPairs): This is taken from Kaggle's C# implementation: www.kaggle.com/c/SemiSupervisedFeatureLearning/forums/t/919/auc-implementation/6136#post6136.
- static de.jungblut.math.dense.DenseDoubleMatrix createPolynomials(de.jungblut.math.dense.DenseDoubleMatrix seed, int num): Creates a new matrix consisting of polynomials of the input matrix. For example, a polynomial of degree 2 over 3 columns yields the columns (SEED: x^1 | y^1 | z^1) | x^2 | y^2 | z^2 in the returned matrix.
- static double guardedLogarithm(double input)
- static de.jungblut.math.DoubleMatrix logMatrix(de.jungblut.math.DoubleMatrix input)
- static de.jungblut.math.DoubleVector logVector(de.jungblut.math.DoubleVector input)
- static de.jungblut.math.tuple.Tuple3<de.jungblut.math.DoubleMatrix,de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(de.jungblut.math.DoubleMatrix x)
- static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(Dataset dataset): Normalizes the given dataset in place by subtracting the mean and dividing by the stddev.
- static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(Dataset dataset, java.util.function.Predicate<FeatureOutcomePair> filterPredicate): Normalizes the given dataset in place by subtracting the mean and dividing by the stddev.
- static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleMatrix,de.jungblut.math.DoubleVector> meanNormalizeRows(de.jungblut.math.DoubleMatrix pMatrix)
- static double minMaxScale(double x, double fromMin, double fromMax, double toMin, double toMax): Scales a single input into the interval given by min and max.
- static de.jungblut.math.DoubleMatrix minMaxScale(de.jungblut.math.DoubleMatrix input, double fromMin, double fromMax, double toMin, double toMax): Scales a matrix into the interval given by min and max.
- static de.jungblut.math.DoubleVector minMaxScale(de.jungblut.math.DoubleVector input, double fromMin, double fromMax, double toMin, double toMax): Scales a vector into the interval given by min and max.
- static de.jungblut.math.DoubleVector numericalGradient(de.jungblut.math.DoubleVector vector, CostFunction f): Calculates the numerical gradient of a cost function using the central difference approximation.
-
-
-
Method Detail
-
meanNormalizeRows
public static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleMatrix,de.jungblut.math.DoubleVector> meanNormalizeRows(de.jungblut.math.DoubleMatrix pMatrix)
- Returns:
- mean normalized matrix (0 mean and stddev of 1) as well as the mean.
-
meanNormalizeColumns
public static de.jungblut.math.tuple.Tuple3<de.jungblut.math.DoubleMatrix,de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(de.jungblut.math.DoubleMatrix x)
- Returns:
- the normalized matrix (0 mean and stddev of 1) as well as the mean and the stddev.
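The column normalization described above can be sketched on plain arrays. This is an illustration only, not the library's implementation: the class name, the `Result` holder (standing in for `Tuple3`), the use of the population standard deviation, and the handling of constant columns are all assumptions.

```java
import java.util.Arrays;

final class ColumnNormalizationSketch {

  /** Hypothetical stand-in for Tuple3: normalized copy, column means, column stddevs. */
  static final class Result {
    final double[][] normalized;
    final double[] mean;
    final double[] stddev;

    Result(double[][] normalized, double[] mean, double[] stddev) {
      this.normalized = normalized;
      this.mean = mean;
      this.stddev = stddev;
    }
  }

  static Result meanNormalizeColumns(double[][] x) {
    if (x.length == 0) {
      throw new IllegalArgumentException("matrix must have at least one row");
    }
    int rows = x.length;
    int cols = x[0].length;
    double[] mean = new double[cols];
    double[] stddev = new double[cols];

    // Column means.
    for (double[] row : x) {
      for (int c = 0; c < cols; c++) {
        mean[c] += row[c];
      }
    }
    for (int c = 0; c < cols; c++) {
      mean[c] /= rows;
    }

    // Population standard deviation per column (assumption: divide by n, not n - 1).
    for (double[] row : x) {
      for (int c = 0; c < cols; c++) {
        double d = row[c] - mean[c];
        stddev[c] += d * d;
      }
    }
    for (int c = 0; c < cols; c++) {
      stddev[c] = Math.sqrt(stddev[c] / rows);
    }

    // Subtract the mean and divide by the stddev; constant columns map to 0 (assumption).
    double[][] out = new double[rows][cols];
    for (int r = 0; r < rows; r++) {
      for (int c = 0; c < cols; c++) {
        out[r][c] = stddev[c] == 0.0 ? 0.0 : (x[r][c] - mean[c]) / stddev[c];
      }
    }
    return new Result(out, mean, stddev);
  }

  public static void main(String[] args) {
    Result r = meanNormalizeColumns(new double[][] {{1, 10}, {2, 20}, {3, 30}});
    System.out.println("mean   = " + Arrays.toString(r.mean));
    System.out.println("stddev = " + Arrays.toString(r.stddev));
    System.out.println("scaled = " + Arrays.deepToString(r.normalized));
  }
}
```

Returning the mean and stddev alongside the normalized data lets the same statistics be reapplied later, for example to unseen rows.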
-
meanNormalizeColumns
public static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(Dataset dataset)
Normalizes the given dataset in place by subtracting the mean and dividing by the stddev. The dataset will then have a mean of 0 and a stddev of 1.
- Returns:
- a tuple of the mean and the stddev.
-
meanNormalizeColumns
public static de.jungblut.math.tuple.Tuple<de.jungblut.math.DoubleVector,de.jungblut.math.DoubleVector> meanNormalizeColumns(Dataset dataset, java.util.function.Predicate<FeatureOutcomePair> filterPredicate)
Normalizes the given dataset in place by subtracting the mean and dividing by the stddev. The dataset will then have a mean of 0 and a stddev of 1. Additionally, you can supply a predicate to apply the normalization only to a specific subset of the dataset.
- Returns:
- a tuple of the mean and the stddev.
-
createPolynomials
public static de.jungblut.math.dense.DenseDoubleMatrix createPolynomials(de.jungblut.math.dense.DenseDoubleMatrix seed, int num)
Creates a new matrix consisting of polynomials of the input matrix. For example, a polynomial of degree 2 over 3 columns yields the following columns in the returned matrix:
(SEED: x^1 | y^1 | z^1) | x^2 | y^2 | z^2
- Parameters:
- seed - the matrix whose columns are raised to higher powers.
- num - the highest power to generate: 2 for quadratic, 3 for cubic, and so forth.
- Returns:
- the new matrix.
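The column layout described above can be illustrated with a minimal sketch on plain arrays. The class name and the `double[][]` representation are assumptions made for the example; the library itself operates on `DenseDoubleMatrix`, and its handling of edge cases such as `num < 1` is not documented here.

```java
import java.util.Arrays;

final class PolynomialColumnsSketch {

  /**
   * Returns a matrix whose first block of columns is the seed itself (power 1),
   * followed by one block of columns per additional power up to num.
   */
  static double[][] createPolynomials(double[][] seed, int num) {
    if (seed.length == 0 || num < 1) {
      throw new IllegalArgumentException("need at least one row and num >= 1");
    }
    int rows = seed.length;
    int cols = seed[0].length;
    double[][] out = new double[rows][cols * num];
    for (int r = 0; r < rows; r++) {
      for (int power = 1; power <= num; power++) {
        for (int c = 0; c < cols; c++) {
          // Block (power - 1) holds every seed column raised to that power.
          out[r][(power - 1) * cols + c] = Math.pow(seed[r][c], power);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    double[][] result = createPolynomials(new double[][] {{2, 3, 4}}, 2);
    // prints [2.0, 3.0, 4.0, 4.0, 9.0, 16.0]
    System.out.println(Arrays.toString(result[0]));
  }
}
```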
-
numericalGradient
public static de.jungblut.math.DoubleVector numericalGradient(de.jungblut.math.DoubleVector vector, CostFunction f)
Calculates the numerical gradient of a cost function using the central difference approximation: f'(x) = (f(x + h) - f(x - h)) / (2h).
- Parameters:
- vector - the parameters at which to differentiate.
- f - the cost function that returns the cost for a given parameter set.
- Returns:
- a numerical gradient.
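A central difference gradient can be sketched as follows. This is a minimal illustration under stated assumptions: `ToDoubleFunction<double[]>` stands in for the library's `CostFunction`, whose interface is not shown on this page, and the step size `h` is passed explicitly because this page does not document how the library chooses it.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

final class NumericalGradientSketch {

  /** Approximates each partial derivative as (f(x + h) - f(x - h)) / (2h). */
  static double[] numericalGradient(double[] x, ToDoubleFunction<double[]> f, double h) {
    double[] gradient = new double[x.length];
    double[] tmp = x.clone();
    for (int i = 0; i < x.length; i++) {
      tmp[i] = x[i] + h;
      double plus = f.applyAsDouble(tmp);
      tmp[i] = x[i] - h;
      double minus = f.applyAsDouble(tmp);
      gradient[i] = (plus - minus) / (2.0 * h);
      // Restore the coordinate so only one dimension is perturbed at a time.
      tmp[i] = x[i];
    }
    return gradient;
  }

  public static void main(String[] args) {
    // f(a, b) = a^2 + 3b has the exact gradient (2a, 3).
    ToDoubleFunction<double[]> cost = p -> p[0] * p[0] + 3.0 * p[1];
    double[] g = numericalGradient(new double[] {2.0, 5.0}, cost, 1e-5);
    // Expect values close to 4 and 3, up to floating point error.
    System.out.println(Arrays.toString(g));
  }
}
```

A numerical gradient like this is typically used to check an analytically derived gradient, since it needs two cost evaluations per parameter.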
-
logMatrix
public static de.jungblut.math.DoubleMatrix logMatrix(de.jungblut.math.DoubleMatrix input)
- Returns:
- a log'd matrix that was guarded against edge cases of the logarithm.
-
logVector
public static de.jungblut.math.DoubleVector logVector(de.jungblut.math.DoubleVector input)
- Returns:
- a log'd vector that was guarded against edge cases of the logarithm.
-
minMaxScale
public static de.jungblut.math.DoubleMatrix minMaxScale(de.jungblut.math.DoubleMatrix input, double fromMin, double fromMax, double toMin, double toMax)
Scales a matrix into the interval given by min and max.
- Parameters:
- input - the input matrix.
- fromMin - the lower bound of the input interval.
- fromMax - the upper bound of the input interval.
- toMin - the lower bound of the target interval.
- toMax - the upper bound of the target interval.
- Returns:
- the new matrix with scaled values.
-
minMaxScale
public static de.jungblut.math.DoubleVector minMaxScale(de.jungblut.math.DoubleVector input, double fromMin, double fromMax, double toMin, double toMax)
Scales a vector into the interval given by min and max.
- Parameters:
- input - the input vector.
- fromMin - the lower bound of the input interval.
- fromMax - the upper bound of the input interval.
- toMin - the lower bound of the target interval.
- toMax - the upper bound of the target interval.
- Returns:
- the new vector with scaled values.
-
minMaxScale
public static double minMaxScale(double x, double fromMin, double fromMax, double toMin, double toMax)
Scales a single input into the interval given by min and max.
- Parameters:
- x - the input value.
- fromMin - the lower bound of the input interval.
- fromMax - the upper bound of the input interval.
- toMin - the lower bound of the target interval.
- toMax - the upper bound of the target interval.
- Returns:
- the bounded value.
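Min-max scaling is commonly written as a linear map from [fromMin, fromMax] to [toMin, toMax]. The sketch below shows that map for a single value and for an array. It is an assumption that the library uses exactly this formula. The sketch does not clamp inputs that fall outside the source interval, and it rejects an empty source interval, neither of which is documented on this page.

```java
import java.util.Arrays;

final class MinMaxScaleSketch {

  /** Linearly maps x from [fromMin, fromMax] to [toMin, toMax]. */
  static double minMaxScale(double x, double fromMin, double fromMax,
      double toMin, double toMax) {
    if (fromMax == fromMin) {
      throw new IllegalArgumentException("source interval must not be empty");
    }
    return (x - fromMin) * (toMax - toMin) / (fromMax - fromMin) + toMin;
  }

  /** Applies the scalar mapping to every element and returns a new array. */
  static double[] minMaxScale(double[] input, double fromMin, double fromMax,
      double toMin, double toMax) {
    double[] out = new double[input.length];
    for (int i = 0; i < input.length; i++) {
      out[i] = minMaxScale(input[i], fromMin, fromMax, toMin, toMax);
    }
    return out;
  }

  public static void main(String[] args) {
    // The midpoint of [0, 10] maps to the midpoint of [-1, 1]; prints 0.0
    System.out.println(minMaxScale(5.0, 0.0, 10.0, -1.0, 1.0));
    System.out.println(Arrays.toString(
        minMaxScale(new double[] {0.0, 2.5, 10.0}, 0.0, 10.0, 0.0, 1.0)));
  }
}
```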
-
guardedLogarithm
public static double guardedLogarithm(double input)
- Returns:
- a log'd value of the input that is guarded.
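This page does not say which edge cases are guarded or what is returned for them. The sketch below shows one plausible guarding policy, purely as an assumption: inputs that are NaN, zero, or negative are replaced by a small positive floor, and positive infinity is replaced by the largest finite double, so the result is always finite. The constant `FLOOR` and the chosen fallbacks are hypothetical and may differ from the library's behavior.

```java
final class GuardedLogSketch {

  /** Hypothetical smallest input that is passed to Math.log unchanged. */
  static final double FLOOR = 1e-12;

  static double guardedLogarithm(double input) {
    // log(x) is undefined for x < 0, and -Infinity for x == 0; NaN propagates.
    if (Double.isNaN(input) || input < FLOOR) {
      return Math.log(FLOOR);
    }
    // log(+Infinity) is +Infinity; cap it at the largest finite input instead.
    if (Double.isInfinite(input)) {
      return Math.log(Double.MAX_VALUE);
    }
    return Math.log(input);
  }

  public static void main(String[] args) {
    System.out.println(guardedLogarithm(Math.E));
    System.out.println(guardedLogarithm(0.0));
    System.out.println(guardedLogarithm(Double.NaN));
  }
}
```

A guard of this kind is useful when logarithms feed into a cost such as log loss, where a single zero probability would otherwise turn the whole sum into an infinite or NaN value.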
-
computeAUC
public static double computeAUC(java.util.List<MathUtils.PredictionOutcomePair> outcomePredictedPairs)
This is taken from Kaggle's C# implementation: www.kaggle.com/c/SemiSupervisedFeatureLearning/forums/t/919/auc-implementation/6136#post6136.
- Parameters:
- outcomePredictedPairs - the list of PredictionOutcomePair: class (0 or 1) -> predicted value.
- Returns:
- the AUC value.
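For orientation, AUC can be computed from ranks (the Mann-Whitney U formulation). The sketch below uses that standard technique and is not a port of the referenced Kaggle code or of this library's method. The `Pair` class is a hypothetical stand-in for `MathUtils.PredictionOutcomePair`, whose constructor and accessors are not shown on this page, and returning NaN when only one class is present is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AucSketch {

  /** Hypothetical stand-in for PredictionOutcomePair: class (0 or 1) and predicted value. */
  static final class Pair {
    final int outcome;
    final double prediction;

    Pair(int outcome, double prediction) {
      this.outcome = outcome;
      this.prediction = prediction;
    }
  }

  static double computeAuc(List<Pair> pairs) {
    List<Pair> sorted = new ArrayList<>(pairs);
    sorted.sort((a, b) -> Double.compare(a.prediction, b.prediction));

    int n = sorted.size();
    double positiveRankSum = 0.0;
    long positives = 0;
    int i = 0;
    while (i < n) {
      // Find the run of tied predictions [i, j] and give each the average 1-based rank.
      int j = i;
      while (j + 1 < n && sorted.get(j + 1).prediction == sorted.get(i).prediction) {
        j++;
      }
      double averageRank = (i + j) / 2.0 + 1.0;
      for (int k = i; k <= j; k++) {
        if (sorted.get(k).outcome == 1) {
          positiveRankSum += averageRank;
          positives++;
        }
      }
      i = j + 1;
    }

    long negatives = n - positives;
    if (positives == 0 || negatives == 0) {
      return Double.NaN; // AUC is undefined with a single class (assumption)
    }
    // U statistic of the positives divided by the number of positive/negative pairs.
    return (positiveRankSum - positives * (positives + 1) / 2.0)
        / ((double) positives * negatives);
  }

  public static void main(String[] args) {
    List<Pair> pairs = Arrays.asList(
        new Pair(0, 0.1), new Pair(0, 0.4), new Pair(1, 0.35), new Pair(1, 0.8));
    // Three of the four positive/negative pairs are ordered correctly; prints 0.75
    System.out.println(computeAuc(pairs));
  }
}
```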
-
-