Package de.jungblut.clustering
Class AgglomerativeClustering
- java.lang.Object
  - de.jungblut.clustering.AgglomerativeClustering
public final class AgglomerativeClustering extends java.lang.Object
"Bottom-up" (agglomerative) clustering using average single linkage. The averaging is combined with the so-called "centroid" method.
In plain terms: to merge a point in space with another, we look for its nearest neighbour (single linkage, as defined by the given distance measurer). Once the nearest neighbour is found, the two are merged by averaging their coordinates (average single linkage with the centroid method). For example, if (5, 1) is the nearest neighbour of (1, 2), averaging yields (3, 1.5) for the next clustering level. If, on that next level, the cluster (10, 14) is the nearest neighbour of (3, 1.5), the two are merged again for the following level: ((10+3)/2, (14+1.5)/2) = (6.5, 7.75). This continues until only a single cluster remains, which forms the root of the resulting binary cluster tree.
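The arithmetic of the centroid merge can be checked with a few lines of Java. This is only an illustrative sketch: the class `CentroidMergeExample` and its `merge` method are hypothetical names that are not part of this package, and plain `double[]` arrays stand in for `DoubleVector`.

```java
import java.util.Arrays;

// Hypothetical helper, not part of de.jungblut.clustering.
final class CentroidMergeExample {

    // Averages two points coordinate-wise; this is the "centroid" merge step.
    static double[] merge(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] centroid = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            centroid[i] = (a[i] + b[i]) / 2.0;
        }
        return centroid;
    }

    public static void main(String[] args) {
        // First level: (1, 2) merged with its nearest neighbour (5, 1).
        double[] level1 = merge(new double[] {1, 2}, new double[] {5, 1});
        System.out.println(Arrays.toString(level1)); // [3.0, 1.5]

        // Next level: (3, 1.5) merged with (10, 14).
        double[] level2 = merge(level1, new double[] {10, 14});
        System.out.println(Arrays.toString(level2)); // [6.5, 7.75]
    }
}
```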
A few more details about the algorithm:
- The nearest neighbour search is greedy, which means that even distant points are merged when no closer neighbour is available anymore. One may therefore want to add a distance threshold and either carry such unrelated clusters over to the next level until they find a good match, or ignore them altogether.
- Nearest neighbours are found using exhaustive search: for every unclustered node in the level, we look through the whole list of clusters to find the nearest to merge.
- If the nearest neighbour search is unsuccessful (no item is left to cluster with), the point/vector is added to the next level directly.
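The points above can be sketched in a self-contained form. The following is a minimal illustration and not the implementation of this class: `GreedyLevelSketch` and its methods are hypothetical names, `double[]` stands in for `DoubleVector`, Euclidean distance is assumed in place of a pluggable `DistanceMeasurer`, no distance threshold is applied, and the search is assumed to run over the not yet merged nodes of the current level.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not part of de.jungblut.clustering.
final class GreedyLevelSketch {

    // Assumption: Euclidean distance replaces the pluggable distance measurer.
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Centroid merge: coordinate-wise average of two points.
    static double[] centroid(double[] a, double[] b) {
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = (a[i] + b[i]) / 2.0;
        }
        return c;
    }

    // Builds the next level from the current one.
    static List<double[]> nextLevel(List<double[]> level) {
        List<double[]> next = new ArrayList<>();
        boolean[] used = new boolean[level.size()];
        for (int i = 0; i < level.size(); i++) {
            if (used[i]) {
                continue;
            }
            used[i] = true;
            // Exhaustive search over all nodes that are not merged yet.
            int nearest = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int j = 0; j < level.size(); j++) {
                if (used[j]) {
                    continue;
                }
                double d = distance(level.get(i), level.get(j));
                if (d < best) {
                    best = d;
                    nearest = j;
                }
            }
            if (nearest < 0) {
                // No partner left: carry the node over unchanged.
                next.add(level.get(i));
            } else {
                // Greedy: merge with the nearest node, however far away it is.
                used[nearest] = true;
                next.add(centroid(level.get(i), level.get(nearest)));
            }
        }
        return next;
    }

    // Repeats the level step until one cluster remains; index 0 is the root level.
    static List<List<double[]>> cluster(List<double[]> points) {
        List<List<double[]>> levels = new ArrayList<>();
        List<double[]> current = new ArrayList<>(points);
        levels.add(current);
        while (current.size() > 1) {
            current = nextLevel(current);
            levels.add(current);
        }
        Collections.reverse(levels);
        return levels;
    }
}
```

With the points (1, 2), (5, 1) and (10, 14), this sketch first merges (1, 2) and (5, 1) into (3, 1.5) and carries (10, 14) over because no partner is left; the next level then merges both into the root (6.5, 7.75). A distance threshold, as suggested above, could be added in `nextLevel` by treating a neighbour whose distance exceeds the threshold as "no partner left".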
- Author:
- thomas.jungblut
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | AgglomerativeClustering.ClusterNode | Tree structure for containing information about linkages and distances.
Constructor Summary
Constructors
Constructor | Description
AgglomerativeClustering() |
Method Summary
Modifier and Type | Method | Description
static java.util.List<java.util.List<AgglomerativeClustering.ClusterNode>> | cluster(java.util.List<de.jungblut.math.DoubleVector> points, DistanceMeasurer distanceMeasurer, boolean verbose) | Starts the clustering process.
Method Detail
cluster
public static java.util.List<java.util.List<AgglomerativeClustering.ClusterNode>> cluster(java.util.List<de.jungblut.math.DoubleVector> points, DistanceMeasurer distanceMeasurer, boolean verbose)
Starts the clustering process.
- Parameters:
- points - the points to cluster.
- distanceMeasurer - the distance measurement to use.
- verbose - if true, costs in each iteration will be printed.
- Returns:
- a list of lists that contains the cluster nodes for each level, where the zeroth index is the top of the tree and thus contains only a single cluster node.