Friday, January 19, 2007

Clades, Subclades & Clusters

Haplogroups are large groups of haplotypes that can be used to define genetic populations and are often geographically oriented. In genetic genealogy, "clade" is commonly used to mean haplogroup. Together, the haplogroups form the Haplogroup Tree.

From:    "David Wilson"
Date:    Sun, 3 Dec 2006 15:17:26
"Subclades are strictly defined not by sets of STR values, but by single mutations called SNPs (single nucleotide polymorphism) or, more generally, UEPs (unique event polymorphism).    In the R family, for example, the SNP called M343 distinguishes R1b, M269 distinguishes R1b1c, and other downstream SNPs like SRY2627 or S21 distinguish branches within the M269 group.

When you aggregate haplotypes that belong to individuals who are members of a specific haplogroup, you will sometimes find that specific STR values for particular markers are associated with that haplogroup but not with others.    To use a single clear example, R1b1c7 (identified by the SNP M222) has distinctive dominant values 390=25, 392=14 and 385a,b=11-13 among others.    That is such a clear signature that it is pretty much a dead giveaway for R1b1c7 status even in the absence of formal SNP test.

In other cases, the STR values are less clear cut.    On the basis of STR values alone, for example, one cannot confidently predict membership in either R1b1c6 or R1b1c10.

Even without the use of SNPs, you can find clusters through statistical processes.    John McEwan's forty-some clusters are the result of crunching a large number of haplotypes.    It so happens that his STR19 cluster correlates extremely strongly with R1b1c7, and his STR22 cluster correlates very strongly with one variety of R1b1c9 commonly called "Frisian" R1b1c.    But there are other R1b1c9 haplotypes that do not look at all "Frisian."    So the situation is complex.

In short, clusters are statistical constructs based on many STR values for many loci, and subclades are hard distinctions based on observable yes/no distinctions for a particular SNP or UEP.    Clusters may or may not correlate strongly with SNP distinctions, and a cluster by itself does not necessarily constitute a subclade.

There is more about the Y-tree and the SNPs that define its branches at if you are interested.    Several of the regular posters on this list are responsible for the pages addressing individual branches of the tree on that site."
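The R1b1c7 signature described in the message can be checked mechanically. What follows is only a rough sketch: the marker values come from the message, but the dictionary layout, the "DYS" marker labels, and the function names are my own assumptions, and a match is a hint, not a substitute for an SNP test.

```python
# Signature values quoted in the message for R1b1c7 (SNP M222).
# The dictionary layout and marker labels are illustrative assumptions.
R1B1C7_SIGNATURE = {"DYS390": 25, "DYS392": 14, "DYS385a": 11, "DYS385b": 13}


def signature_matches(haplotype, signature=R1B1C7_SIGNATURE):
    """Return (matched, tested): how many signature markers agree,
    out of those actually present in the haplotype."""
    tested = [marker for marker in signature if marker in haplotype]
    matched = [m for m in tested if haplotype[m] == signature[m]]
    return len(matched), len(tested)


def looks_like_r1b1c7(haplotype):
    """True only when every signature marker was tested and all agree."""
    matched, tested = signature_matches(haplotype)
    return tested == len(R1B1C7_SIGNATURE) and matched == tested
```

Reporting the count of matching markers separately from the yes/no answer keeps partial results visible, which matters because many haplotypes will not include all four markers.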

Although a haplogroup identity can only be proven with a Y-chromosome SNP test, it can sometimes be inferred from STR values, as indicated above. Ken Nordtvedt goes further: he posits that a value of 8 at DYS455 indicates haplogroup I1a with 99.9 percent confidence, based on the thousands of haplotypes seen to date, and that a value of 11 at DYS455 means you are not I1a.
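As a small illustration, the DYS455 rule of thumb can be written as a lookup. The function name and the "undetermined" fallback are my own assumptions; the rule as reported covers only the values 8 and 11.

```python
def infer_i1a_from_dys455(value):
    """Apply the DYS455 rule of thumb reported for haplogroup I1a.

    Only the values 8 and 11 are covered by the rule; anything else
    is returned as "undetermined" (an assumption of this sketch).
    """
    if value == 8:
        return "likely I1a"
    if value == 11:
        return "not I1a"
    return "undetermined"
```

An inference like this remains a statistical shortcut; only an SNP test proves the haplogroup.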

Within haplogroups, statistical analysis of STR values suggests different clusters. For those interested in the technical details, Ken Nordtvedt has also kindly provided the following:

Norse I1a (14/23/14-14) [DYS19/390/385a-b] and ultraNorse I1a (14/23/14-15) are dominantly DYS462 = 13;

AngloSaxon I1a (14/22) [DYS19/390] is dominantly DYS462 = 12.

DYS462 mutates extremely slowly, so its values make strong modals for research.

Nordtvedt also observes that if you have 67 markers you have DYS511. Within I1a it serves the same purpose as DYS462 99 percent of the time. DYS511 = 9 is equivalent to DYS462 = 12 and DYS511 = 10 is equivalent to DYS462 = 13. This is only true in I1a.
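For readers who keep their results in a spreadsheet or script, the I1a modal values and the DYS511 correspondence above could be encoded roughly as follows. This is a sketch only: the table layout, the marker labels, and the function names are my own, and exact matching on the modal values is a simplification, not Nordtvedt's statistical method.

```python
# Modal values quoted above for the I1a clusters (DYS19/390/385a-b).
# Exact matching is a simplification chosen for this sketch.
I1A_CLUSTER_MODALS = {
    "Norse": {"DYS19": 14, "DYS390": 23, "DYS385a": 14, "DYS385b": 14},
    "ultraNorse": {"DYS19": 14, "DYS390": 23, "DYS385a": 14, "DYS385b": 15},
    "AngloSaxon": {"DYS19": 14, "DYS390": 22},
}

# Dominant DYS462 value reported for each cluster.
DOMINANT_DYS462 = {"Norse": 13, "ultraNorse": 13, "AngloSaxon": 12}

# Correspondence reported within I1a only (said to hold about 99% of the time).
DYS511_TO_DYS462 = {9: 12, 10: 13}


def expected_dys462(dys511):
    """Return the DYS462 value that usually accompanies a DYS511 value
    in I1a, or None if the correspondence does not cover it."""
    return DYS511_TO_DYS462.get(dys511)


def candidate_clusters(haplotype):
    """List the clusters whose modal values all appear in the haplotype."""
    return [
        name
        for name, modal in I1A_CLUSTER_MODALS.items()
        if all(haplotype.get(marker) == value for marker, value in modal.items())
    ]
```

For example, a haplotype with DYS19 = 14 and DYS390 = 22 yields AngloSaxon as the only candidate, and `DOMINANT_DYS462["AngloSaxon"]` then gives the DYS462 value to expect.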

Reference: Y- Haplogroup & Sub-clade Projects

