Class KMeansClustering


  • public final class KMeansClustering
    extends java.lang.Object
    Sequential version of k-means clustering.
    Author:
    thomas.jungblut
    • Constructor Summary

      Constructors 
      Constructor Description
      KMeansClustering​(int k, de.jungblut.math.DoubleVector[] vectors, boolean random)
      Initializes a new KMeansClustering.
      KMeansClustering​(int k, java.util.List<de.jungblut.math.DoubleVector> vectors, boolean random)
      Initializes a new KMeansClustering.
      KMeansClustering​(java.util.List<de.jungblut.math.DoubleVector> centers, java.util.List<de.jungblut.math.DoubleVector> vectors)
      Initializes a new KMeansClustering.
    • Method Summary

      Modifier and Type Method Description
      java.util.List<Cluster> cluster​(int iterations, DistanceMeasurer distanceMeasurer, double delta, boolean verbose)
      Starts the clustering process.
      de.jungblut.math.DoubleVector[] getCenters()  
      double getClusteringCost()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • KMeansClustering

        public KMeansClustering​(int k,
                                de.jungblut.math.DoubleVector[] vectors,
                                boolean random)
        Initializes a new KMeansClustering.
        Parameters:
        k - the number of centers to use.
        vectors - the vectors to cluster.
        random - true to initialize the centers randomly; false to use the first k vectors as the initial centers.
      • KMeansClustering

        public KMeansClustering​(int k,
                                java.util.List<de.jungblut.math.DoubleVector> vectors,
                                boolean random)
        Initializes a new KMeansClustering.
        Parameters:
        k - the number of centers to use.
        vectors - the vectors to cluster.
        random - true to initialize the centers randomly; false to use the first k vectors as the initial centers.
      • KMeansClustering

        public KMeansClustering​(java.util.List<de.jungblut.math.DoubleVector> centers,
                                java.util.List<de.jungblut.math.DoubleVector> vectors)
        Initializes a new KMeansClustering.
        Parameters:
        centers - the initial centers, which may be seeded from CanopyClustering.
        vectors - the vectors to cluster.
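
The two initialization modes selected by the random flag can be illustrated with a minimal, self-contained sketch. It works on plain double[] arrays instead of DoubleVector, and the names CenterInitSketch, pickCenters and the seed parameter are hypothetical. How the library actually draws random centers is not documented here, so the shuffle-based selection below is an assumption, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not part of the documented library.
class CenterInitSketch {

    // Returns k initial centers copied from the input vectors.
    // random == false: the first k vectors are used, in order.
    // random == true: k distinct vectors are chosen by shuffling the indices
    // (an assumed strategy; the library may select them differently).
    static double[][] pickCenters(double[][] vectors, int k, boolean random, long seed) {
        if (k < 1 || k > vectors.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of vectors");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < vectors.length; i++) {
            indices.add(i);
        }
        if (random) {
            Collections.shuffle(indices, new Random(seed));
        }
        double[][] centers = new double[k][];
        for (int i = 0; i < k; i++) {
            // clone so that later center updates do not modify the input data
            centers[i] = vectors[indices.get(i)].clone();
        }
        return centers;
    }
}
```

Copying the chosen vectors keeps the input data untouched when the centers are moved later. The third constructor corresponds to skipping this step entirely and passing precomputed centers.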
    • Method Detail

      • cluster

        public java.util.List<Cluster> cluster​(int iterations,
                                               DistanceMeasurer distanceMeasurer,
                                               double delta,
                                               boolean verbose)
        Starts the clustering process.
        Parameters:
        iterations - the maximum number of iterations to run.
        distanceMeasurer - the distance measurement to use.
        delta - the convergence threshold. If the change in the sum of distances between two consecutive iterations is lower than delta, the iteration stops.
        verbose - if true, the cost of each iteration will be printed.
        Returns:
        the clusters, which contain a center and the assigned vectors.
      • getClusteringCost

        public double getClusteringCost()
      • getCenters

        public de.jungblut.math.DoubleVector[] getCenters()
        Returns:
        the current state of the centers.
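
How the iterations and delta parameters of cluster interact can be shown with a minimal, self-contained sketch. It is written against plain double[] arrays, uses Euclidean distance in place of a DistanceMeasurer, and the class name KMeansSketch is hypothetical. It follows the parameter descriptions above and is not the library's code; in particular, leaving an empty cluster's center unchanged is an assumption.

```java
// Hypothetical sketch, not part of the documented library.
class KMeansSketch {

    // Euclidean distance between two vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Runs at most `iterations` rounds and updates `centers` in place.
    // Returns the sum of distances measured in the last round, or
    // positive infinity if no round was run.
    static double cluster(double[][] vectors, double[][] centers, int iterations, double delta) {
        int dim = vectors[0].length;
        double previousCost = Double.POSITIVE_INFINITY;
        double cost = previousCost;
        for (int iter = 0; iter < iterations; iter++) {
            double[][] sums = new double[centers.length][dim];
            int[] counts = new int[centers.length];
            cost = 0.0;
            // assignment step: attach every vector to its nearest center
            for (double[] v : vectors) {
                int best = 0;
                double bestDistance = Double.POSITIVE_INFINITY;
                for (int c = 0; c < centers.length; c++) {
                    double d = distance(v, centers[c]);
                    if (d < bestDistance) {
                        bestDistance = d;
                        best = c;
                    }
                }
                cost += bestDistance;
                counts[best]++;
                for (int i = 0; i < dim; i++) {
                    sums[best][i] += v[i];
                }
            }
            // update step: move each non-empty center to the mean of its vectors
            for (int c = 0; c < centers.length; c++) {
                if (counts[c] == 0) {
                    continue; // assumption: an empty cluster keeps its old center
                }
                for (int i = 0; i < dim; i++) {
                    centers[c][i] = sums[c][i] / counts[c];
                }
            }
            // stop early once the cost changes by less than delta
            if (Math.abs(previousCost - cost) < delta) {
                break;
            }
            previousCost = cost;
        }
        return cost;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};
        double cost = cluster(data, centers, 10, 1e-9);
        System.out.println("cost = " + cost);
        System.out.println("centers = " + java.util.Arrays.deepToString(centers));
    }
}
```

In this sketch the returned value plays the role of getClusteringCost() and the array updated in place plays the role of getCenters(). A larger delta trades accuracy for fewer rounds, while iterations caps the work when the cost keeps changing.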