Eo'ganacht septs



Anatole Klyosov - summary of methods to confirm family trees through mutation rate analysis of clusters

To calculate mutation rates, you first need to identify a cluster: a branch of a haplogroup tree whose modal haplotype represents the common ancestor.

Copying of DNA from father to son can produce mutations in SNPs (which occur once and then persist) and in STRs (short tandem repeat markers, whose repeat counts are either shortened or lengthened).  A set of STR values is known as a haplotype.  The STR values of the common ancestor are known as the modal haplotype of the haplogroup.

Examples of mutation rates: in the 6-marker subset of the 12-marker panel, one mutation occurs on average every 2840 years.  Across 67 markers, one mutation occurs about every 170 years.  If your 67-marker haplotype is at a genetic distance of 2 from the modal haplotype of a haplogroup, the two mutations probably took about 340 years (2 x 170) to accumulate.





Markers   Marker set                     One mutation   Ave mutation rate   Ave mutation rate   Marker order
                                         every          per haplotype       per marker
6 of 12   19, 388, 390, 391, 392, 393    2840 yrs       .0088               .00147              FTDNA order
12        12 marker panel                1140 yrs       .022                .00183              FTDNA order
25        25 marker panel                540 yrs        .046                .00184              FTDNA order
37        37 marker panel                280 yrs        .090                .00243              FTDNA order
67        67 marker panel                170 yrs        .145                .00216              FTDNA order
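The columns of the table are related by simple arithmetic. The Python sketch below assumes that the per-haplotype rates are expressed per generation and that a generation is 25 years; under that assumption it reproduces the "1 in N yrs" and per-marker figures to within rounding. The function and variable names are illustrative only.

```python
GENERATION_YEARS = 25  # assumption: one generation is 25 years

# Average mutation rate per haplotype (assumed to be per generation),
# keyed by the number of markers, as listed in the table.
RATE_PER_HAPLOTYPE = {6: 0.0088, 12: 0.022, 25: 0.046, 37: 0.090, 67: 0.145}


def years_per_mutation(markers: int) -> float:
    """Average number of years for one mutation to appear in the haplotype."""
    return GENERATION_YEARS / RATE_PER_HAPLOTYPE[markers]


def rate_per_marker(markers: int) -> float:
    """Average mutation rate for a single marker."""
    return RATE_PER_HAPLOTYPE[markers] / markers


for markers in RATE_PER_HAPLOTYPE:
    print(markers,
          round(years_per_mutation(markers)),
          round(rate_per_marker(markers), 5))
```

The table rounds the year figures to the nearest ten, so small differences (for example 172 versus 170 for the 67-marker panel) are expected.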

To find the Time to the Most Recent Common Ancestor (TMRCA), more than one haplotype is needed.  The minimum number of haplotypes depends on the number of markers in each:

  • 40 haplotypes of 6 markers
  • ??? haplotypes of 12 markers
  • 10 haplotypes of 25 markers
  • 4 haplotypes of 67 markers
The fewer the markers or haplotypes included, the less reliable the result.  The average number of STR mutations is used to calculate the time span back to the MRCA.

Assumption: a generation is 25 years, so there are 4 generations in 100 years.

2 Methods:

  • Logarithmic - unmutated
    • ln (B/A) = kt
      • B=total number of haplotypes in the set
      • A=number of unchanged base haplotypes in the set
      • k=average mutation rate per haplotype, per generation
      • t=number of generations (multiply by 25 for years)
      • example
        • B=100, A=80, 12 marker haplotypes, k=.022
        • ln (100/80)/.022 = 10 generations = 250 years

        Note: A (the number of unchanged base haplotypes in the set) cannot be 0, because B/A would then be a division by zero.

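  • Linear - mutation count
    • n/N/μ = t
      • n=total number of mutations in the set
      • N=number of haplotypes in the set
      • μ=average mutation rate per haplotype, per generation
      • t=number of generations (multiply by 25 for years)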
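The two formulas above can be sketched in Python as follows. This is a minimal illustration, not Klyosov's own code. It assumes the mutation rate is per haplotype per generation and that a generation is 25 years, so both functions return generations. The function names, and the count of 22 mutations used in the linear example, are hypothetical and not from the source.

```python
import math

GENERATION_YEARS = 25  # assumption: one generation is 25 years


def tmrca_logarithmic(total: int, unchanged: int, rate: float) -> float:
    """Generations to the common ancestor from ln(B/A) = k*t."""
    if unchanged <= 0 or unchanged > total:
        # A = 0 would divide by zero; A > B is not a valid haplotype set.
        raise ValueError("unchanged must be between 1 and total")
    return math.log(total / unchanged) / rate


def tmrca_linear(mutations: int, total: int, rate: float) -> float:
    """Generations to the common ancestor from n/N/mu = t."""
    if total <= 0:
        raise ValueError("total must be positive")
    return mutations / total / rate


# 100 twelve-marker haplotypes, 80 of them unchanged, rate .022 from the table.
log_generations = tmrca_logarithmic(100, 80, 0.022)
# Hypothetical mutation count: 22 mutations across the same 100 haplotypes.
lin_generations = tmrca_linear(22, 100, 0.022)

print(round(log_generations, 1), round(log_generations * GENERATION_YEARS))
print(round(lin_generations, 1), round(lin_generations * GENERATION_YEARS))
```

With these inputs both methods land at roughly 10 generations, or about 250 years.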

If both methods give approximately the same result, this is a 'clean' result.

If the methods yield different results, the haplotypes in the calculation are probably from a mixed population:

"The main reasons of such a discrepancy are typically as follows:
  1. different mutation rates employed by researchers,

  2. lack of calibration of mutation rates using known genealogies or known historical events, or when a time depth for known genealogies was insufficient to get all principal loci involved,

  3. mixed series of haplotypes, which are often derived from different clades, and in different proportions between those series, which directly affect a number of mutations in the series,

  4. lack of corrections for reverse mutations (ASD-based calculations [see below] do not need such a correction),

  5. lack of corrections for asymmetry of mutations in the given series of haplotypes in some cases."

I used the various programs and features of the PHYLIP phylogenetic package, together with TreeView, to find the clusters.  When a cluster did not give a clean result, I concluded that not enough haplotypes were represented.  Given the population of the world and the number of South Irish members of my FTDNA projects that I include in the case study, the probability of finding family clusters is quite low.  For perspective, the U.S. population is 312,016,766, FTDNA has 343,915 members, and I have 185 South Irish members with a genetic distance of less than 9.



FTDNA members vs U.S. population                              343,915/312,016,766 = 0.0011
South Irish project members vs FTDNA members                  185/343,915         = 0.0005
South Irish FTDNA members vs U.S. population                  185/312,016,766     = 0.00000059
South Irish FTDNA members vs Irish immigrants during famine   185/2,000,000       = 0.0000925
South Irish FTDNA members vs Irish immigrants in U.S.         185/36,278,332      = 0.000005099463
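The ratios above can be reproduced with a few lines of Python. The figures are the ones quoted on this page; the variable names are illustrative only.

```python
US_POPULATION = 312_016_766
FTDNA_MEMBERS = 343_915
SOUTH_IRISH_MEMBERS = 185
FAMINE_IMMIGRANTS = 2_000_000
IRISH_IMMIGRANTS_US = 36_278_332

# Each ratio is a proportion (a fraction of 1), not a percentage.
ratios = {
    "FTDNA members vs U.S. population": FTDNA_MEMBERS / US_POPULATION,
    "South Irish members vs FTDNA members": SOUTH_IRISH_MEMBERS / FTDNA_MEMBERS,
    "South Irish members vs U.S. population": SOUTH_IRISH_MEMBERS / US_POPULATION,
    "South Irish members vs famine immigrants": SOUTH_IRISH_MEMBERS / FAMINE_IMMIGRANTS,
    "South Irish members vs Irish immigrants in U.S.": SOUTH_IRISH_MEMBERS / IRISH_IMMIGRANTS_US,
}

for label, value in ratios.items():
    print(f"{label}: {value:.3g}")
```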

Reviewing the ratios above, you can clearly see that the number of project members is quite low and that many more results need to be added to build out family trees and branches.  As a result, the number of 'clean' mutation rate results will be lower than anticipated.  The majority of the work will be in identifying the clusters using the phylogenetic trees and verifying the relationships using the logarithmic and linear mutation rate calculation methods.

The methodology uses the South Irish research of family trees and branches as an example.  All other FTDNA projects should have similar numbers in comparison to the total population.


Paper - DNA Genealogy, Mutation Rates, and Some Historical Evidence Written in the Y Chromosome I. Basic Principles and the Method - Anatole Klyosov

Paper - DNA Genealogy, Mutation Rates, and Some Historical Evidence Written in the Y Chromosome II. Walking the Map - Anatole Klyosov

Additional Papers from Anatole Klyosov's Series

Interpreting Clusters DNA & Genealogy Colleen Fitzpatrick, Andrew Yeiser; see pages