If you are someone who has any interaction with population genetics, the letter K may cause you a distinct feeling of uneasiness. Identifying the number of distinct genetic clusters (often represented as K) in a data set is a primary component in population genetics analyses, but it often becomes a point of contention. The hierarchical nature of genetic variation can make K difficult to determine and sometimes lead to questions about what the true K actually means.
A helpful tool for making objective determinations of K might be CLUMPAK, a new piece of software authored by Kopelman et al. and appearing in the most recent issue of Molecular Ecology Resources. CLUMPAK offers a simplified method for comparing runs of different STRUCTURE-like programs, streamlining many of the calculations that sometimes have to be done by hand and providing some new solutions for summarizing runs.
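To give a feel for the kind of by-hand bookkeeping involved in comparing runs, here is a minimal Python sketch of one such step: matching cluster labels between two runs at the same K, since independent runs can number the same clusters differently. This is not CLUMPAK's code or algorithm. The function names, the toy membership matrices, and the simple similarity score are all assumptions made for illustration, and the brute-force search over permutations is only practical for small K.

```python
from itertools import permutations


def similarity(q_a, q_b):
    """Mean per-individual similarity between two membership matrices.

    Each matrix is a list of rows (one per individual) whose entries are
    membership proportions summing to 1. This score is an illustrative
    assumption, not the statistic used by any particular program.
    """
    total = 0.0
    for row_a, row_b in zip(q_a, q_b):
        total += 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(row_a, row_b))
    return total / len(q_a)


def best_alignment(q_ref, q_run):
    """Try every relabeling of q_run's clusters and keep the best match."""
    k = len(q_ref[0])
    best_perm, best_score = None, -1.0
    for perm in permutations(range(k)):
        # Reorder the columns of q_run according to this permutation.
        permuted = [[row[j] for j in perm] for row in q_run]
        score = similarity(q_ref, permuted)
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Toy data (hypothetical): two runs at K = 2 that agree, but with labels swapped.
run_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
run_b = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]

perm, score = best_alignment(run_a, run_b)
print(perm, score)
```

In this toy case the two runs describe the same clustering once the columns of the second run are swapped, which is exactly the sort of thing that is tedious to check by eye across dozens of runs.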
You can check out their web interface (complete with good example files) or download a standalone Linux version.
So go forth and calculate K, but don’t let the clusters consume you!
Kopelman N.M., Mayzel J., Jakobsson M., Rosenberg N.A. & Mayrose I. (2015). Clumpak: a program for identifying clustering modes and packaging population structure inferences across K. Molecular Ecology Resources. DOI: http://dx.doi.org/10.1111/1755-0998.12387