Class Evaluator


  • public final class Evaluator
    extends java.lang.Object
    Evaluation utility for binary and multi-class classification that handles train/test splitting and evaluates the result with various metrics.
    Author:
    thomas.jungblut
    • Method Detail

      • evaluateClassifier

        public static Evaluator.EvaluationResult evaluateClassifier​(Classifier classifier,
                                                                    de.jungblut.math.DoubleVector[] features,
                                                                    de.jungblut.math.DoubleVector[] outcome,
                                                                    float splitFraction,
                                                                    boolean random)
        Trains and evaluates the given classifier with a test split.
        Parameters:
        classifier - the classifier to train and evaluate.
        features - the features to split.
        outcome - the outcome to split.
        splitFraction - a value between 0f and 1f that sets the size of the training set. With 1,000 items, a splitFraction of 0.9f yields 900 items for training and 100 for evaluation.
        random - true to shuffle the data before splitting.
        Returns:
        a new Evaluator.EvaluationResult.
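        The splitFraction arithmetic described above can be illustrated with a minimal, library-independent sketch. The class SplitSketch, its methods, the seed parameter, and the rounding rule are assumptions made for illustration; they are not part of Evaluator, and the library may round or shuffle differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the documented API.
final class SplitSketch {

  // Number of training items for a split fraction in [0, 1].
  // Rounding to the nearest integer is an assumption of this sketch.
  static int trainSize(int numItems, float splitFraction) {
    if (splitFraction < 0f || splitFraction > 1f) {
      throw new IllegalArgumentException("splitFraction must be between 0f and 1f");
    }
    return Math.round(numItems * splitFraction);
  }

  // Returns two index lists: element 0 is the training part, element 1 the test part.
  // When random is true, the indices are shuffled before the cut is made.
  static List<List<Integer>> split(int numItems, float splitFraction, boolean random, long seed) {
    List<Integer> indices = new ArrayList<>();
    for (int i = 0; i < numItems; i++) {
      indices.add(i);
    }
    if (random) {
      Collections.shuffle(indices, new Random(seed));
    }
    int cut = trainSize(numItems, splitFraction);
    List<List<Integer>> parts = new ArrayList<>();
    parts.add(new ArrayList<>(indices.subList(0, cut)));
    parts.add(new ArrayList<>(indices.subList(cut, numItems)));
    return parts;
  }
}
```

        With 1,000 items and a splitFraction of 0.9f, the first list holds 900 indices and the second holds the remaining 100.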
      • evaluateClassifier

        public static Evaluator.EvaluationResult evaluateClassifier​(Classifier classifier,
                                                                    de.jungblut.math.DoubleVector[] features,
                                                                    de.jungblut.math.DoubleVector[] outcome,
                                                                    float splitFraction,
                                                                    boolean random,
                                                                    java.lang.Double threshold)
        Trains and evaluates the given classifier with a test split.
        Parameters:
        classifier - the classifier to train and evaluate.
        features - the features to split.
        outcome - the outcome to split.
        splitFraction - a value between 0f and 1f that sets the size of the training set. With 1,000 items, a splitFraction of 0.9f yields 900 items for training and 100 for evaluation.
        random - true to shuffle the data before splitting.
        threshold - for binary predictions, the threshold passed to Predictor.predictedClass(DoubleVector, double). May be null, in which case no thresholding is applied.
        Returns:
        a new Evaluator.EvaluationResult.
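        To make the role of the nullable threshold concrete, here is a small sketch of thresholded binary prediction. It assumes the predictor emits a single probability for the positive class and that the class is 1 when that probability is strictly greater than the threshold. The helper below is hypothetical and is not the library's Predictor.predictedClass; the fallback for a null threshold (a cut at 0.5) is likewise an assumption.

```java
// Hypothetical helper illustrating thresholding; not the library implementation.
final class ThresholdSketch {

  // Maps a positive-class probability to class 0 or 1.
  static int predictedClass(double positiveProbability, Double threshold) {
    if (threshold == null) {
      // Assumed fallback when no threshold is given: cut at 0.5.
      return positiveProbability >= 0.5 ? 1 : 0;
    }
    return positiveProbability > threshold ? 1 : 0;
  }
}
```

        A higher threshold trades recall for precision: a probability of 0.7 is class 1 at a threshold of 0.6 but class 0 at a threshold of 0.8.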
      • evaluateSplit

        public static Evaluator.EvaluationResult evaluateSplit​(Classifier classifier,
                                                               EvaluationSplit split)
        Evaluates a given train/test split with the given classifier.
        Parameters:
        classifier - the classifier to train on the train split.
        split - the EvaluationSplit that contains the test and train data.
        Returns:
        a fresh evaluation result filled with the evaluated metrics.
      • evaluateSplit

        public static Evaluator.EvaluationResult evaluateSplit​(Classifier classifier,
                                                               EvaluationSplit split,
                                                               java.lang.Double threshold)
        Evaluates a given train/test split with the given classifier.
        Parameters:
        classifier - the classifier to train on the train split.
        split - the EvaluationSplit that contains the test and train data.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        Returns:
        a fresh evaluation result filled with the evaluated metrics.
      • evaluateSplit

        public static Evaluator.EvaluationResult evaluateSplit​(Classifier classifier,
                                                               de.jungblut.math.DoubleVector[] trainFeatures,
                                                               de.jungblut.math.DoubleVector[] trainOutcome,
                                                               de.jungblut.math.DoubleVector[] testFeatures,
                                                               de.jungblut.math.DoubleVector[] testOutcome,
                                                               java.lang.Double threshold)
        Evaluates a given train/test split with the given classifier.
        Parameters:
        classifier - the classifier to train on the train split.
        trainFeatures - the features to train with.
        trainOutcome - the outcomes to train with.
        testFeatures - the features to test with.
        testOutcome - the outcome to test with.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        Returns:
        a fresh evaluation result filled with the evaluated metrics.
      • testClassifier

        public static Evaluator.EvaluationResult testClassifier​(Predictor classifier,
                                                                de.jungblut.math.DoubleVector[] testFeatures,
                                                                de.jungblut.math.DoubleVector[] testOutcome)
        Tests the given classifier without actually training it.
        Parameters:
        classifier - the classifier to evaluate on the test split.
        testFeatures - the features to test with.
        testOutcome - the outcome to test with.
        Returns:
        a fresh evaluation result filled with the evaluated metrics.
      • testClassifier

        public static Evaluator.EvaluationResult testClassifier​(Predictor classifier,
                                                                de.jungblut.math.DoubleVector[] testFeatures,
                                                                de.jungblut.math.DoubleVector[] testOutcome,
                                                                java.lang.Double threshold)
        Tests the given classifier without actually training it.
        Parameters:
        classifier - the classifier to evaluate on the test split.
        testFeatures - the features to test with.
        testOutcome - the outcome to test with.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        Returns:
        a fresh evaluation result filled with the evaluated metrics.
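        This page does not list which metrics the result contains. As an illustration of what evaluating predictions against test outcomes can involve in the binary case, the sketch below counts a confusion matrix and derives accuracy, precision, recall and F1 from it. The class BinaryMetricsSketch and the choice of metrics are assumptions and do not describe the fields of Evaluator.EvaluationResult.

```java
// Hypothetical binary confusion-matrix counter; not Evaluator.EvaluationResult.
final class BinaryMetricsSketch {
  int truePositive;
  int falsePositive;
  int trueNegative;
  int falseNegative;

  // Records one (actual, predicted) pair of class labels, each 0 or 1.
  void observe(int actual, int predicted) {
    if (predicted == 1) {
      if (actual == 1) truePositive++; else falsePositive++;
    } else {
      if (actual == 0) trueNegative++; else falseNegative++;
    }
  }

  double accuracy() {
    int total = truePositive + falsePositive + trueNegative + falseNegative;
    return total == 0 ? 0d : (truePositive + trueNegative) / (double) total;
  }

  double precision() {
    int predictedPositive = truePositive + falsePositive;
    return predictedPositive == 0 ? 0d : truePositive / (double) predictedPositive;
  }

  double recall() {
    int actualPositive = truePositive + falseNegative;
    return actualPositive == 0 ? 0d : truePositive / (double) actualPositive;
  }

  // Harmonic mean of precision and recall.
  double f1() {
    double p = precision();
    double r = recall();
    return (p + r) == 0d ? 0d : 2d * p * r / (p + r);
  }
}
```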
      • observeBinaryClassificationElement

        public static int observeBinaryClassificationElement​(Predictor predictor,
                                                             java.lang.Double threshold,
                                                             Evaluator.EvaluationResult result,
                                                             de.jungblut.math.DoubleVector outcomeVector,
                                                             de.jungblut.math.DoubleVector predictedVector)
      • crossValidateClassifier

        public static <A extends Classifier> Evaluator.EvaluationResult crossValidateClassifier(ClassifierFactory<A> classifierFactory,
                                                                                                de.jungblut.math.DoubleVector[] features,
                                                                                                de.jungblut.math.DoubleVector[] outcome,
                                                                                                int numLabels,
                                                                                                int folds,
                                                                                                java.lang.Double threshold,
                                                                                                boolean verbose)
        Performs a k-fold cross-validation of the classifiers supplied by the given factory on the given features and outcomes. The folds are computed on a new thread.
        Parameters:
        classifierFactory - the factory that supplies the classifiers to train and test.
        features - the features to train/test with.
        outcome - the outcomes to train/test with.
        numLabels - the total number of possible labels, e.g. 2 in the binary case.
        folds - the number of folds, usually 10.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        verbose - true if partial fold results should be printed.
        Returns:
        an evaluation result averaged over all k folds.
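        For readers unfamiliar with k-fold cross-validation, the sketch below shows one way to partition item indices into k test folds, where every index not in a fold's test set belongs to that fold's training set. The class KFoldSketch and the contiguous, near-equal partitioning are assumptions; this page does not state how Evaluator assigns items to folds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fold partitioner; not the library's fold assignment.
final class KFoldSketch {

  // Returns one int[] of held-out test indices per fold.
  // Fold sizes differ by at most one; earlier folds take the remainder.
  static List<int[]> testFolds(int numItems, int folds) {
    if (folds < 2 || folds > numItems) {
      throw new IllegalArgumentException("folds must be between 2 and numItems");
    }
    List<int[]> result = new ArrayList<>();
    int base = numItems / folds;
    int remainder = numItems % folds;
    int start = 0;
    for (int f = 0; f < folds; f++) {
      int size = base + (f < remainder ? 1 : 0);
      int[] test = new int[size];
      for (int i = 0; i < size; i++) {
        test[i] = start + i;
      }
      result.add(test);
      start += size;
    }
    return result;
  }
}
```

        Each item is used for testing exactly once and for training in the other k - 1 folds, which is why the per-fold results can be averaged into a single estimate.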
      • crossValidateClassifier

        public static <A extends Classifier> Evaluator.EvaluationResult crossValidateClassifier(ClassifierFactory<A> classifierFactory,
                                                                                                de.jungblut.math.DoubleVector[] features,
                                                                                                de.jungblut.math.DoubleVector[] outcome,
                                                                                                int numLabels,
                                                                                                int folds,
                                                                                                java.lang.Double threshold,
                                                                                                int numThreads,
                                                                                                boolean verbose)
        Performs a k-fold cross-validation of the classifiers supplied by the given factory on the given features and outcomes.
        Parameters:
        classifierFactory - the factory that supplies the classifiers to train and test.
        features - the features to train/test with.
        outcome - the outcomes to train/test with.
        numLabels - the total number of possible labels, e.g. 2 in the binary case.
        folds - the number of folds, usually 10.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        numThreads - the number of threads used to evaluate the folds.
        verbose - true if partial fold results should be printed.
        Returns:
        an evaluation result averaged over all k folds.
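        The interplay of folds and numThreads can be sketched with a fixed-size thread pool that evaluates each fold as a separate task and averages the per-fold scores. The class ParallelFoldsSketch, the single double score per fold, and the use of ExecutorService are assumptions for illustration; the page does not describe how Evaluator schedules folds internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntToDoubleFunction;

// Hypothetical scheduler for fold evaluation; not the library implementation.
final class ParallelFoldsSketch {

  // Runs evaluateFold for fold indices 0..folds-1 on numThreads threads
  // and returns the arithmetic mean of the returned scores.
  static double averageOverFolds(int folds, int numThreads, IntToDoubleFunction evaluateFold) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Double>> pending = new ArrayList<>();
      for (int f = 0; f < folds; f++) {
        final int fold = f;
        Callable<Double> task = () -> evaluateFold.applyAsDouble(fold);
        pending.add(pool.submit(task));
      }
      double sum = 0d;
      for (Future<Double> result : pending) {
        sum += result.get(); // blocks until that fold has finished
      }
      return sum / folds;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for folds", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("a fold failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

        Because folds are independent of each other, they parallelize well; a numThreads value larger than the number of folds brings no further benefit in this sketch.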
      • tenFoldCrossValidation

        public static <A extends Classifier> Evaluator.EvaluationResult tenFoldCrossValidation(ClassifierFactory<A> classifierFactory,
                                                                                               de.jungblut.math.DoubleVector[] features,
                                                                                               de.jungblut.math.DoubleVector[] outcome,
                                                                                               int numLabels,
                                                                                               java.lang.Double threshold,
                                                                                               boolean verbose)
        Performs a 10-fold cross-validation.
        Parameters:
        classifierFactory - the factory that supplies the classifiers to train and test.
        features - the features to train/test with.
        outcome - the outcomes to train/test with.
        numLabels - the total number of possible labels, e.g. 2 in the binary case.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        verbose - true if partial fold results should be printed.
        Returns:
        an evaluation result averaged over all 10 folds.
      • tenFoldCrossValidation

        public static <A extends Classifier> Evaluator.EvaluationResult tenFoldCrossValidation(ClassifierFactory<A> classifierFactory,
                                                                                               de.jungblut.math.DoubleVector[] features,
                                                                                               de.jungblut.math.DoubleVector[] outcome,
                                                                                               int numLabels,
                                                                                               java.lang.Double threshold,
                                                                                               int numThreads,
                                                                                               boolean verbose)
        Performs a 10-fold cross-validation.
        Parameters:
        classifierFactory - the factory that supplies the classifiers to train and test.
        features - the features to train/test with.
        outcome - the outcomes to train/test with.
        numLabels - the total number of possible labels, e.g. 2 in the binary case.
        threshold - the probability threshold for predicting a specific class (null if not provided).
        numThreads - the number of threads used to evaluate the folds.
        verbose - true if partial fold results should be printed.
        Returns:
        an evaluation result averaged over all 10 folds.