Class RetrieverEvaluation

  • Direct Known Subclasses:
    MACFACRetrieverEvaluation

    public class RetrieverEvaluation
    extends Object
    Evaluates different retrievers, or differently configured instances of the same retriever, against each other. Several parameters can be specified to control the evaluation.
    Author:
    Maximilian Hoffmann
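
    A typical evaluation run might look like the following sketch. Note that this is a hedged illustration, not part of the documented API: `linearRetriever`, `astarRetriever`, `trainCaseBase`, `testCaseBase`, and `myMetric` are assumed to exist, and the exact point at which the evaluation is executed may differ.

    ```java
    // Hypothetical usage sketch of RetrieverEvaluation; all variables below are assumptions.
    RetrieverEvaluation eval = new RetrieverEvaluation();
    eval.setTrainTestCaseBase(trainCaseBase, testCaseBase); // pools of NESTWorkflowObject
    eval.setK(10);                                          // inspect the 10 most similar cases
    eval.addRetrieverToEvaluate("linear", linearRetriever);
    eval.addRetrieverToEvaluate("astar", astarRetriever);
    eval.addMetricToEvaluate(myMetric);                     // some EvalMetric implementation
    eval.importGroundTruthSimilarities("groundTruth.csv");  // hypothetical path
    eval.printMetricResultsAsASCIITable();                  // results to standard output
    ```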
    • Field Detail

      • logger

        protected final org.slf4j.Logger logger
      • metrics

        protected final List<EvalMetric> metrics
        All metrics that are to be computed during the evaluation.
      • trainingObjectPool

        protected TrainingObjectPool<NESTWorkflowObject> trainingObjectPool
        The training object pool, which contains the training case base, i.e., the case base to retrieve from, and the testing case base, i.e., the case base to extract queries from.
      • groundTruthSimilarities

        protected List<SimpleSimilarityResult> groundTruthSimilarities
        The ground-truth similarities are stored as a list of simple similarity results that come from retrievals.
      • k

        protected Integer k
        Specifies the number of cases to inspect, beginning with the most similar case. Metrics can use this value in their computations.
      • similarityResults

        protected Map<CasePair,​Collection<RetrieverSimilarityPair>> similarityResults
        Maps each case pair to the predictions of all retrievers for that pair. This data can be used to compare the retrievers' predictions per case pair.
      • retrievalTimeResultMap

        protected HashMap<String,​List<Double>> retrievalTimeResultMap
        Maps each retriever name to the list of retrieval times measured for it.
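
        Since the average retrieval time is always derived from this map, its shape can be illustrated with a small self-contained sketch (the class and retriever names below are made up for illustration):

        ```java
        import java.util.*;

        public class RetrievalTimeStats {
            // Averages the recorded times per retriever, mirroring the shape of
            // retrievalTimeResultMap: retriever name -> list of measured times.
            static Map<String, Double> averageTimes(Map<String, List<Double>> timesByRetriever) {
                Map<String, Double> avg = new HashMap<>();
                for (Map.Entry<String, List<Double>> e : timesByRetriever.entrySet()) {
                    avg.put(e.getKey(),
                            e.getValue().stream().mapToDouble(Double::doubleValue).average().orElse(0.0));
                }
                return avg;
            }

            public static void main(String[] args) {
                Map<String, List<Double>> times = new HashMap<>();
                times.put("AStarRetriever", Arrays.asList(12.0, 18.0, 15.0));
                System.out.println(averageTimes(times).get("AStarRetriever")); // 15.0
            }
        }
        ```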
      • decimalFormat

        protected DecimalFormat decimalFormat
        Specifies the decimal format for the CSV output.
      • trackSimilarityResults

        protected boolean trackSimilarityResults
        If enabled, the detailed pairwise similarities computed by every retriever are tracked and printed. Enable this only if you want to analyze this data later on; be cautious, since it can use a large amount of memory.
    • Method Detail

      • setGroundTruthSimilarities

        public void setGroundTruthSimilarities​(List<SimpleSimilarityResult> gtSimilarities)
        Sets ground-truth similarities that were loaded outside of the evaluation.
        Parameters:
        gtSimilarities - list of simple similarity results where each entry represents a single retrieval
      • addRetrieverToEvaluate

        public void addRetrieverToEvaluate​(String uniqueRetrieverName,
                                           Retriever<NESTWorkflowObject,​Query> retriever)
        Adds a retriever to evaluate.
        Parameters:
        uniqueRetrieverName - a unique retriever name to identify it in the results
        retriever - the pre-configured retriever
      • addMetricToEvaluate

        public void addMetricToEvaluate​(EvalMetric metric)
        Adds a metric to compute during evaluation. In addition to these metrics, the average retrieval time is always computed.
        Parameters:
        metric - the metric to compute during evaluation
      • setTrainTestCaseBase

        public void setTrainTestCaseBase​(WriteableObjectPool<NESTWorkflowObject> trainCaseBase,
                                         WriteableObjectPool<NESTWorkflowObject> testCaseBase)
        Stores the training case base and the testing case base in the retriever evaluation, if this has not been done in the constructor. Internally, both are stored in a TrainingObjectPool.
        Parameters:
        trainCaseBase - The training case base, i.e., the case base to retrieve from.
        testCaseBase - The testing case base, i.e., the case base to extract queries from.
      • setK

        public void setK​(Integer k)
        Sets the number of cases to inspect, beginning with the most similar case.
        Parameters:
        k - the number of cases to inspect
      • setDecimalFormat

        public void setDecimalFormat​(DecimalFormat decimalFormat)
        Changes the DecimalFormat used for CSV output.
        Parameters:
        decimalFormat - The desired decimal format.
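
        A runnable sketch of building such a format; fixing the symbols to Locale.US keeps '.' as the decimal separator regardless of the platform locale (the call on an evaluation instance is hypothetical):

        ```java
        import java.text.DecimalFormat;
        import java.text.DecimalFormatSymbols;
        import java.util.Locale;

        public class CsvFormat {
            public static void main(String[] args) {
                // Four decimal places, locale-independent decimal separator.
                DecimalFormat df = new DecimalFormat("0.0000",
                        DecimalFormatSymbols.getInstance(Locale.US));
                System.out.println(df.format(0.73219876)); // 0.7322
                // evaluation.setDecimalFormat(df);  // hypothetical RetrieverEvaluation instance
            }
        }
        ```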
      • addRetrieversToEvaluate

        public void addRetrieversToEvaluate​(Map<String,​Retriever<NESTWorkflowObject,​Query>> retrieverMap)
        Adds retrievers to evaluate.
        Parameters:
        retrieverMap - A map consisting of a unique retriever name to identify it in the results and the pre-configured retrievers.
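
        Building such a map might look like the following sketch (hedged: `linearRetriever` and `astarRetriever` are assumed pre-configured Retriever instances, not part of this API):

        ```java
        // Hypothetical setup; the retriever variables are assumptions.
        Map<String, Retriever<NESTWorkflowObject, Query>> retrievers = new HashMap<>();
        retrievers.put("linear", linearRetriever);
        retrievers.put("astar", astarRetriever);
        evaluation.addRetrieversToEvaluate(retrievers);
        ```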
      • addMetricsToEvaluate

        public void addMetricsToEvaluate​(Collection<EvalMetric> metrics)
        Adds metrics to compute during evaluation. In addition to these metrics, the average retrieval time is always computed.
        Parameters:
        metrics - Collection of the metrics to compute during evaluation
      • importGroundTruthSimilarities

        public void importGroundTruthSimilarities​(String pathGroundTruthSimilarities)
        Loads the ground-truth similarities from the given path and checks them.
        Parameters:
        pathGroundTruthSimilarities - The path to load the ground-truth similarities from.
      • writeSimilarityResultsAsCSV

        public String writeSimilarityResultsAsCSV​(String exportPathExportResults)
                                           throws IOException
        Writes the similarity results as a CSV file to the file system.
        Parameters:
        exportPathExportResults - the path to write the results to
        Returns:
        the evaluation results as a CSV string
        Throws:
        IOException - if something goes wrong during export; can be ignored if you do not want to export any values
      • writeSimilarityResultsAsCSV

        public String writeSimilarityResultsAsCSV​(OutputStream outputStream)
                                           throws IOException
        Writes the similarity results as a CSV file to an output stream. The caller is responsible for closing the stream afterwards.
        Parameters:
        outputStream - the stream to write the results to
        Returns:
        the evaluation results as a CSV string
        Throws:
        IOException - if something goes wrong during export; can be ignored if you do not want to export any values
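
        Because the caller must close the stream, try-with-resources is a natural fit. A hedged sketch (`evaluation` is an assumed, fully configured RetrieverEvaluation instance; the file name is made up):

        ```java
        // Hypothetical usage; the stream is closed automatically by try-with-resources.
        try (OutputStream out = Files.newOutputStream(Path.of("similarities.csv"))) {
            String csv = evaluation.writeSimilarityResultsAsCSV(out);
            // csv also holds the same results as a string, e.g. for logging
        }
        ```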
      • writeMetricResultsAsCSV

        public String writeMetricResultsAsCSV​(String exportPathExportResults)
                                       throws IOException,
                                              RetrieverEvaluationException
        Writes the metric results as a CSV file to the file system.
        Parameters:
        exportPathExportResults - the path to write the results to
        Returns:
        the evaluation results as a CSV string
        Throws:
        IOException - if something goes wrong during export; can be ignored if you do not want to export any values
        RetrieverEvaluationException - if something goes wrong while evaluating
      • writeMetricResultsAsCSV

        public String writeMetricResultsAsCSV​(OutputStream outputStream)
                                       throws IOException,
                                              RetrieverEvaluationException
        Writes the metric results as a CSV file to an output stream. The caller is responsible for closing the stream afterwards.
        Parameters:
        outputStream - the stream to write the results to
        Returns:
        the evaluation results as a CSV string
        Throws:
        IOException - if something goes wrong during export; can be ignored if you do not want to export any values
        RetrieverEvaluationException - if something goes wrong while evaluating
      • printMetricResultsAsASCIITable

        public void printMetricResultsAsASCIITable()
        Prints the metric results as an ASCII table to standard output.
      • trackSimilarityResults

        public void trackSimilarityResults()
        Enables tracking and printing of the detailed pairwise similarities computed by every retriever. Enable this only if you want to analyze this data later on; be cautious, since it can use a large amount of memory.