Class OnePassExclusiveClustering


  • public final class OnePassExclusiveClustering
    extends java.lang.Object
    A one pass exclusive clustering algorithm. As the name suggests, the algorithm iterates once over a constructed kd-tree and finds the nearest neighbours within a distance threshold. The neighbours found are recorded in a bitset and omitted from all subsequent kd-tree searches. Each cluster found is checked against a minimum size and may be discarded if it does not reach the configured threshold. This is considered a very fast algorithm and can be used instead of CanopyClustering.
    Author:
    thomas.jungblut
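    The procedure described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the hypothetical class OnePassSketch works on plain double[] arrays, replaces the kd-tree with a linear scan, and assumes Euclidean distance, that the seed vector counts towards the minimum size, that each center is the mean of its members, and that members of a discarded cluster stay excluded from later searches. Merging of overlapping centers is left out.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Illustrative stand-in (hypothetical), not the library class. */
final class OnePassSketch {

  /** Returns one center per retained cluster. */
  static List<double[]> cluster(List<double[]> values, double t1, int k, int minSize) {
    BitSet assigned = new BitSet(values.size());
    List<double[]> centers = new ArrayList<>();
    for (int i = 0; i < values.size(); i++) {
      if (assigned.get(i)) {
        continue; // already claimed by an earlier cluster
      }
      final double[] seed = values.get(i);
      // Linear scan standing in for the kd-tree query:
      // collect unassigned vectors within t1 of the seed.
      List<Integer> neighbours = new ArrayList<>();
      for (int j = 0; j < values.size(); j++) {
        if (j != i && !assigned.get(j) && distance(seed, values.get(j)) <= t1) {
          neighbours.add(j);
        }
      }
      // Keep at most the k nearest neighbours.
      neighbours.sort(Comparator.comparingDouble(j -> distance(seed, values.get(j))));
      if (neighbours.size() > k) {
        neighbours = neighbours.subList(0, k);
      }
      // Mark the seed and its neighbours so later searches skip them
      // (assumption: this happens even if the cluster is discarded).
      assigned.set(i);
      double[] sum = seed.clone();
      for (int j : neighbours) {
        assigned.set(j);
        double[] v = values.get(j);
        for (int d = 0; d < sum.length; d++) {
          sum[d] += v[d];
        }
      }
      int size = neighbours.size() + 1; // assumption: the seed counts as a member
      if (size >= minSize) {
        for (int d = 0; d < sum.length; d++) {
          sum[d] /= size; // assumption: the center is the mean of the members
        }
        centers.add(sum);
      }
    }
    return centers;
  }

  /** Euclidean distance between two vectors of equal length. */
  static double distance(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return Math.sqrt(s);
  }
}
```

    Calling OnePassSketch.cluster(points, t1, Integer.MAX_VALUE, 2) mirrors the defaults documented for the single-argument constructor (no limit on neighbours, minimum cluster size of 2). The linear scan makes the sketch quadratic in the number of vectors; the kd-tree is what makes the real algorithm fast.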
    • Constructor Summary

      Constructors 
      Constructor Description
      OnePassExclusiveClustering​(double t1)
      Constructs a one pass clustering algorithm.
      OnePassExclusiveClustering​(double t1, int k, int minSize, boolean mergeOverlaps)
      Constructs a one pass clustering algorithm.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.util.List<de.jungblut.math.DoubleVector> cluster​(java.util.List<de.jungblut.math.DoubleVector> values, boolean verbose)
      Cluster the given items.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • OnePassExclusiveClustering

        public OnePassExclusiveClustering​(double t1)
        Constructs a one pass clustering algorithm with no limit on the number of neighbours retrieved and a minimum cluster size of 2.
        Parameters:
        t1 - the maximum distance of neighbourhood.
      • OnePassExclusiveClustering

        public OnePassExclusiveClustering​(double t1,
                                          int k,
                                          int minSize,
                                          boolean mergeOverlaps)
        Constructs a one pass clustering algorithm.
        Parameters:
        t1 - the maximum distance of neighbourhood.
        k - the maximum number of neighbours to retrieve inside the t1 threshold.
        minSize - the minimum size of a cluster.
        mergeOverlaps - if true, found centers that overlap, i.e. lie within t1 distance of each other, will be merged.
    • Method Detail

      • cluster

        public java.util.List<de.jungblut.math.DoubleVector> cluster​(java.util.List<de.jungblut.math.DoubleVector> values,
                                                                     boolean verbose)
        Cluster the given items.
        Parameters:
        values - the vectors to cluster.
        verbose - if true, outputs progress to STDOUT.
        Returns:
        a list of centers that describe the given vectors.