What We Have Learned

From a Y-DNA Surname Project


By Lee Allyn James

Olympia, Washington



Originally Published In “Olympia Genealogical Society Quarterly,”

Olympia, Washington, Volume XXXVI, No. 3, July 2010, Pages 6-13.



In the last decade the use of DNA analysis to supplement conventional genealogical research has skyrocketed. DNA analysis cannot replace the old-fashioned genealogical methods of research, but it can be quite valuable in verifying what one has already discovered or perhaps suggesting an alternative direction for the conventional research. The two most common DNA analyses are associated with the mitochondrial DNA (mtDNA) passed on from mothers to their children, and the Y-chromosome DNA (Y-DNA) passed on from fathers to their sons. Y-DNA analysis is, at present, the more frequently used of the two because it is usually associated with a stable surname: the Y-chromosome is passed from father to son and, except for occasional mutations, remains unchanged from generation to generation.[1] The other property that makes Y-DNA more useful in genealogy than mtDNA is that the Y-DNA mutation rate is fast enough that changes can be seen within genealogical time, but not so fast that they happen all the time. In contrast, mtDNA mutation rates are so low that even if two individuals match, their common ancestor may have lived many thousands of years ago.


Because of the above properties of the Y-chromosome and the probability of stable surnames, a number of “surname projects” have been organized and numerous databases have been made available on-line wherein one can compare his own Y-DNA profile with that of others with the same surname. This article describes such a project featuring men with the surname James who have had their Y-DNA tested and entered into the database. The overall James Y-DNA Surname Project presently features the Y-DNA profiles of over 115 James men, and the results may be viewed at http://www.familytreedna.com/public/james/default.aspx. The results show that there are scores of unrelated James lines, and the author’s group is that group of James men who are descended from the David James who was born about 1669 in Wales, emigrated to America, and died in Chester County (now Delaware County), Pennsylvania in 1739. Many Y-DNA surname projects have the same objective, but the present article shows how much more can be learned from such a study.


Descendants of David James


Eight men with the surname James have had their Y-DNA tested, and the results show that they are descended from a David James who was born about 1669, possibly in Radnorshire, Wales,[2] emigrated to America, and died 27 June 1739 in Chester County, Pennsylvania. Because this David James had descendants also named David, he is referred to herein as “Ancestral David.” The eight participants are shown in Figure 1. Seven of the eight had their Y-DNA characterized by FamilyTreeDNA® (FTDNA), while the eighth, Robert, employed a different laboratory, Familybuilder®. The numbers below each of the participants shown in Figure 1 are the kit numbers assigned by FTDNA. Each of the participants had researched their ancestors employing conventional genealogical methods. The remainder of this article discusses what has been learned as a result of this project.


The individual Y-DNA results for 37 markers are tabulated in Figure 2. Not all of the participants had the same number of markers tested: Robert, 16 markers[3]; James, 25 markers; Wilbur, Larry and Joe, 37 markers; and Warren, Lee, and Chris, 67 markers. Nevertheless, some very meaningful conclusions can be drawn from the results shown in Figure 2. A few mutations exist between the different members; in Figure 2 these are denoted by a shaded cell. For example, Wilbur and Robert are brothers, and both show a value of 31 at DYS[4] 389-2, whereas the other James participants all show a value of 30. It was originally thought that Wilbur and Robert had differing values at DYS H4[5] because their respective laboratories reported different values for that marker. However, the two laboratories do not report DYS H4 on the same basis[6] and, after the appropriate correction was applied, both brothers had the same value at DYS H4. Finally, Larry, Joe, and Chris show a value of 39 at DYS CDYb, whereas the other participants all show a value of 38. However, as will be shown, these are not unexpected variations. Note that the DYS markers in Figure 2 denoted with an asterisk generally exhibit a faster mutation rate than the average.


Example 1: Determine if the eight participants are related


The results of Figure 2 show that the largest difference in the Y-DNA results occurs between Warren (or Lee) and Larry: a difference of one at each of three markers (i.e., a “genetic distance” of 3). FTDNA statistics predict that for a 34/37 match, there is a 95% probability that their Most Recent Common Ancestor (MRCA) will be no more than about fifteen or sixteen generations away, and in fact their MRCA (Ancestral David) is only nine generations away. Similarly, Warren and Lee match 67/67, for which there is a 95% probability that their MRCA is no more than six generations away; their actual MRCA, Samuel, is five generations away. Therefore, it is clear that all of the eight James participants are indeed related to one another.
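For readers who wish to repeat such comparisons, the genetic-distance count can be sketched in a few lines of Python. This is only a minimal illustration and not the laboratory's own method: it sums the step differences over the markers that both men have tested, following the definition in End Note 7, and a testing company may count some markers differently. The function name is hypothetical. The values for DYS 389-2 and DYS CDYb are those quoted in this article; the partial profile is invented solely to show how unequal marker counts might be handled.

```python
from typing import Dict, Tuple

Haplotype = Dict[str, int]  # marker name -> repeat count


def genetic_distance(a: Haplotype, b: Haplotype) -> Tuple[int, int]:
    """Return (distance, markers compared) using the step-wise rule of End Note 7.

    Only markers present in both profiles are compared, because not every
    participant tested the same number of markers.
    """
    shared = sorted(set(a) & set(b))
    distance = sum(abs(a[m] - b[m]) for m in shared)
    return distance, len(shared)


# Values at these two markers are taken from the article text.
warren = {"DYS 389-2": 30, "DYS CDYb": 38}
wilbur = {"DYS 389-2": 31, "DYS CDYb": 38}
chris = {"DYS 389-2": 30, "DYS CDYb": 39}

# Hypothetical profile with fewer markers tested (illustration only).
partial = {"DYS 389-2": 30}

print(genetic_distance(warren, wilbur))   # (1, 2)
print(genetic_distance(wilbur, chris))    # (2, 2)
print(genetic_distance(warren, partial))  # (0, 1)
```

A result such as (1, 2) reads as "a genetic distance of 1 over 2 markers compared"; with a full 37-marker profile the second number would be 37.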


Although their research is incomplete, it is clear that Joe and Chris fit into this James group. They either descend from Ancestral David or, at the very least, share a common male James ancestor with Ancestral David. More about Joe and Chris later.


Example 2: Determine if Ancestral David had a father and/or a brother in America


Older James genealogies, based upon unproved legends, have suggested that Ancestral David had a brother Howell James and a father James James, both of whom also emigrated to America and settled in Chester County, Pennsylvania. Fortunately, descendants of those two men have participated in the overall James Surname Y-DNA Study: one participant (Kit No. 49832) whose descent from Howell James is yet to be completely proved, and two participants descended from James James (Kits No. 57434 & 67782). Their results showed that our Ancestral David was not related to either Howell or James. Kit No. 49832 exhibited a “genetic distance”[7] of 17 on a 12-marker comparison with Warren, Kit No. 57434 exhibited a genetic distance of 5 on a 12-marker comparison, and Kit No. 67782 exhibited a genetic distance of 13 on a 37-marker comparison, all indicating no relationship. Hence, the Y-DNA analysis disproved the earlier suggestions that Ancestral David had a brother (Howell) and father (James) also living in Chester County.


Example 3: Derive the 37-marker haplotype[8] of Ancestral David


The following analysis is based on 37 markers and employs the results for the six James men proved to be descended from Ancestral David. It is also stipulated that “parallel mutations” are not allowed in this analysis. A parallel mutation is defined as the same mutation at the same DYS in two different descendant lines. Although rare, parallel mutations do occur but are not considered in this analysis.


Warren and Lee are both descended from Samuel, and they match 37/37 (actually the full match is 67/67). In the 37 markers considered, Warren and Wilbur differ at only one marker (DYS 389-2). Their common ancestor is Ancestral David, so Ancestral David’s 37-marker haplotype is the same as Warren’s and Wilbur’s on the 36 markers that they have in common. Ancestral David’s value at DYS 389-2 could be either 30 (Warren, James, and Lee) or 31 (Wilbur and Robert). However, Larry, who like Wilbur and Robert is descended from Thomas, has a value of 30 at DYS 389-2. Because parallel mutations are excluded, a value of 30 that appears both in Thomas’s line (Larry) and in Samuel’s line (Warren and Lee) must have been inherited from Ancestral David. Therefore, Ancestral David and his grandson Samuel have the same 37-marker haplotype and, for that matter, the same 37-marker haplotype as Warren and Lee.

Looking now at Daniel, the son of Thomas: since Wilbur has all the same marker values as Ancestral David except for DYS 389-2, Daniel must have the same values at those 36 markers. Since parallel mutations are not allowed in this analysis, Daniel must have a value of 30 at DYS 389-2 because Larry (who is descended from Daniel) has a value of 30 there. Therefore, Ancestral David, his sons Evan and Thomas, and his two grandsons Samuel and Daniel, all have the same 37-marker haplotype, and this is shown in Figure 2. However, if more markers were to be tested, it is possible that Ancestral David and his grandsons might eventually exhibit different haplotypes.
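The reasoning above can be checked mechanically. The sketch below is one possible formalization, not the procedure actually used in this project: it treats the “no parallel mutations” rule as a search for the assignment of ancestral values that requires the fewest mutations (a small-parsimony count). The tree is a simplified assumption: intermediate generations are collapsed, Samuel is placed under Evan as the text implies, and James is omitted because his line is not described here. The function and variable names are hypothetical.

```python
import math
from typing import Dict, Union

# A leaf is an observed marker value; an internal node maps child names to subtrees.
Tree = Union[int, Dict[str, "Tree"]]

ALLELES = (30, 31)  # the two values observed at DYS 389-2


def subtree_cost(tree: Tree, value: int) -> float:
    """Fewest mutations needed below this node if the node itself carries `value`."""
    if isinstance(tree, int):
        # A tested participant can only carry the value actually observed.
        return 0 if tree == value else math.inf
    return sum(
        min(subtree_cost(child, w) + (w != value) for w in ALLELES)
        for child in tree.values()
    )


def assign(tree: Tree, name: str, value: int, out: Dict[str, int]) -> None:
    """Record `value` for this node, then give each child its cheapest value."""
    out[name] = value
    if isinstance(tree, int):
        return
    for child_name, child in tree.items():
        best = min(ALLELES, key=lambda w: subtree_cost(child, w) + (w != value))
        assign(child, child_name, best, out)


# Simplified descent from Ancestral David (assumed layout, see lead-in).
david_line: Tree = {
    "Evan": {"Samuel": {"Warren": 30, "Lee": 30}},
    "Thomas": {
        "Daniel": {
            "Thomas (son of Daniel)": {"Wilbur": 31, "Robert": 31},
            "Jonathan": {"Larry": 30},
        }
    },
}

root_value = min(ALLELES, key=lambda v: subtree_cost(david_line, v))
inferred: Dict[str, int] = {}
assign(david_line, "Ancestral David", root_value, inferred)

print(inferred["Ancestral David"], inferred["Daniel"])  # 30 30
```

Under these assumptions a value of 30 for Ancestral David requires a single mutation in the whole tree (in the line leading to Wilbur and Robert), whereas a value of 31 would require two, which agrees with the conclusion reached in the text.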


The mutation at DYS 389-2 exhibited by Wilbur and Robert happened somewhere between Daniel’s son Thomas and Wilbur and Robert’s father William. Three mutations also occurred between Daniel’s son Jonathan and Larry, but all of Larry’s mutations occur in the faster-mutating markers (denoted by asterisks in Figure 2). For example, DYS CDYb mutates about fifteen times faster than the “average” marker.


Example 4: Where do Joe and Chris fit in?


Both Joe and Chris match Ancestral David’s 37-marker Y-DNA profile except for a value of 39 at DYS CDYb instead of Ancestral David’s value of 38.


One possible hypothesis might be that Joe and Chris fit into the line from Ancestral David through Thomas, Daniel and Jonathan. If this is true, then Joe and Chris must descend from Jonathan to account for the CDYb mutation (Larry has the same mutation) because Jonathan’s brother Thomas does not have the CDYb mutation. However, Elias (Joe’s ancestor) was born circa 1770-1780 and died circa 1840-1850, while Jonathan was born circa 1785 and died in 1843. Therefore, Elias and Jonathan were contemporaries and Elias could not have descended from Jonathan. Although parallel mutations are generally not allowed in this type of analysis, an exception might be made in the case of Joe, Larry and Chris because CDYb mutates about fifteen times faster than the “average” marker.


Chris is a 37/37 match with Joe, and like Joe’s, his research is incomplete. Chris’s research extends back only to Isaac, and Isaac might be a son of Elias. This link is suggested by several factors: 1) Joe and Chris match 37/37, suggesting a very close relationship.[9] 2) Unlike most of the other James lines, both Joe’s line and Chris’s line trace back to the deep South (Alabama, Georgia, Mississippi). 3) The full name of Isaac’s son is William Elias James, suggesting that the middle name might be honoring a grandfather (Elias).


Besides Daniel, Ancestral David’s son Thomas had another son Jonathan (a full brother to Daniel), a son Enoch who was a half-brother to Daniel, and Elias (a half-brother to Daniel). Daniel’s sons and Elias’s descendants are fully accounted for and Jonathan had only girls, but it is possible that Joe and/or Chris might descend from Enoch. Enoch married Rachel Richards in Pennsylvania. We have lost track of Enoch after his marriage to Rachel, and it is possible that he moved to the deep South where Joe’s and Chris’s lines presently end. It is interesting that Elias (Joe’s ancestor) was born 1770-1780, and thus is the right age to be a son of Enoch. Elias had a son named Thomas (perhaps named after a great-grandfather?), and this Thomas had sons named Enoch and David, and a daughter named Rachel. These are all relatively common given names of the period, but they do suggest a possible relationship back to Enoch, the grandson of Ancestral David.


It is also possible that Joe and/or Chris descend from Ancestral David’s son Isaac. However, nothing is known about Isaac or his descendants and there is no evidence to either prove or disprove this suggestion. Further research is necessary to determine where Joe and Chris fit into this James group, but there is no question that they do indeed fit.


Example 5: The unhappy case of a ninth participant


A participant in the overall James Surname Project (name and kit number withheld for privacy reasons) had researched his James ancestry back to Ancestral David and, in addition, had written a book on his family history. He had traced his ancestry back through Ancestral David’s son Isaac, and the rest of our group was very pleased because he would have been the only known descendant of Isaac. However, when his 12-marker Y-DNA test results came back he had only an 8/12 match (genetic distance of 4) with the rest of the group, and clearly was not related to the rest of the group. FTDNA states that with an 8/12 match, “the odds greatly favor that you have not shared a common male ancestor with this person within thousands of years.”


There are two possible explanations for this difference: 1) the genealogical research was incorrect, or 2) a “non-paternal event” occurred somewhere along his line (e.g., an adoption, an unregistered name change, or marital infidelity). Although an unhappy outcome, this negative result at least showed the participant that he has a problem that he might be able to solve.


Example 6: Determine our James Group Haplogroup


FTDNA defines haplogroups this way: “If we look at the world population as a huge genealogical tree, the haplogroups are the original branches of the tree, which characterized the early human migrations. Therefore, haplogroups are normally associated with geographical regions.” Knowledge of one’s haplogroup could provide an idea about the migration routes of one’s “deep ancestors” (those ancestors who lived much earlier than recorded history).


An approximate estimate of one’s haplogroup may be made using results such as those shown in Figure 2. For example, such an estimate predicted that our little James group belongs to Haplogroup R1b, a group of people who expanded throughout Europe as humans re-colonized after the last Ice Age about 10,000 years ago. Oppenheimer[10] has suggested that the R1b ancestors of the Welsh spent the last Ice Age in refuge from the ice close to the Bay of Biscay near the present France/Spain border, and traveled by land to Wales because the sea levels were about 137 meters (450 feet) lower than today, but this theory has been disputed by others. The R1b Modal Haplogroup is shown in Figure 2. However, further refinement of one’s haplogroup beyond the simple R1b designation requires a different type of DNA test - one that evaluates SNPs.[11]


Warren James in our James group has had a number of SNP tests to further refine his (and the group’s) haplogroup. The results are generally portrayed as a “positive” or a “negative” for a given SNP. Warren’s results, generated at two laboratories (FTDNA and EthnoAncestry®), are:


Positive: M207, M173, M343, P25, M269, S129, S128, S116, and L21.[12]


Negative: M18, M73, S21, S29, S26, M65, M153, SRY2627, S28, M126, M160, S68, M37, M222, and P66.


Although the testing and much of the DNA analysis have been standardized worldwide, the interpretation of haplogroups is still a subject of discussion among analysts. Without going into further detail, the above results put Warren (and the rest of the group) into Haplogroup R1b1b2a1a2f* (where the asterisk denotes that further refinement is necessary but not possible at this time) according to the ISOGG,[13] and R1b1b2a1b5* according to FTDNA. The difference is mainly an issue of timing - ISOGG tends to declare a new haplogroup earlier than FTDNA, which tends to review the data longer. Additional SNPs will need to be discovered and tested for before additional refinement in our haplogroup is possible. At this time additional research (and much more data) is necessary in order to further refine “deep ancestry” migration routes, but the hope is that someday this will be possible. It is also probable that the above haplogroup designations will change as new SNPs are discovered and tested for.


Example 7: Average Mutation Rates For Our James Group


Knowledge of the mutation rates for our James group was undoubtedly not high on the priority list when the various participants joined our group. However, knowledge of mutation rates is important to geneticists in understanding how (and when) Y-DNA mutates, and this is the subject of considerable ongoing discussion and controversy.[14] Therefore, the Y-DNA results for our James group have been included in a genetic/genealogy database maintained by Charles F. Kerchner, Jr.[15]


It has long been recognized that some DYS markers mutate at a higher (or lower) rate than the “average.” For example, estimates of multiples above the “average” rate for some of the faster-mutating markers are: DYS 464c 2.5 times faster, DYS 449 3.6 times faster, DYS 576 4.4 times faster, and DYS CDYb 15.4 times faster. It is also apparent that the average mutation rates in some haplogroups are higher than in other haplogroups. For example, the average marker in Haplogroup R1a appears to mutate about 1.8 times faster than the average marker in Haplogroup R1b.[15]


Kerchner’s predictions for “average” mutation rates (μ) for our James group are: 0.0031 mutations per transmission event for a 12-marker FTDNA test, 0.0030 for a 25-marker test, and 0.0040 for a 37-marker test, where one transmission event (T) is from a father to a son (one generation). This implies that, for a 37-marker (M) profile, the average “lifetime” of a DYS haplotype in our James group is


$$\frac{1}{1 - (1 - \mu)^{M}} = \frac{1}{1 - (1 - 0.004)^{37}} \approx 7.3 \text{ generations.}$$


In other words, on the average, a Y-DNA haplotype in our group retains the same value for about 7.3 generations before a mutation occurs and the haplotype changes. The average mutation rate (μ) can also be used to verify that the number of mutations observed between two persons is reasonable. For example, Warren (or Lee) and Larry differ by 3 mutations in 37 markers. The number of expected mutations can be estimated by


$$\mu \cdot M \cdot T = (0.004)(37)(16) \approx 2.4 \text{ mutations}$$


where T is the number of transmission events (16 events, 7 from Warren to Ancestral David and 9 from Larry to Ancestral David). The prediction of 2.4 mutations compared to the observed 3 mutations is very reasonable considering that all three of Larry‘s mutations are in the faster-mutating markers.
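The two estimates in this example can be reproduced with the short Python sketch below. It is offered only as a check on the arithmetic under the same assumptions as the text: the chance that a haplotype passes unchanged through one transmission is taken as (1 - μ) raised to the number of markers, so the average lifetime is the reciprocal of the chance of a change. The function names are hypothetical; the rates are those quoted above from Kerchner.

```python
def haplotype_lifetime(mu: float, markers: int) -> float:
    """Average number of generations before any one of `markers` markers mutates."""
    p_change = 1.0 - (1.0 - mu) ** markers  # chance of at least one mutation per transmission
    return 1.0 / p_change


def expected_mutations(mu: float, markers: int, transmissions: int) -> float:
    """Expected number of mutations over `transmissions` father-to-son events."""
    return mu * markers * transmissions


# 37-marker rate for the James group, as quoted in the text.
lifetime_37 = haplotype_lifetime(0.004, 37)

# Warren to Ancestral David (7 events) plus Larry to Ancestral David (9 events).
expected_37 = expected_mutations(0.004, 37, 7 + 9)

print(round(lifetime_37, 1))  # 7.3
print(round(expected_37, 1))  # 2.4
```

The same functions may be applied to the 12-marker and 25-marker rates by changing the arguments.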


Example 8: “Are We Related to Jesse James?”


Almost everyone with the surname James has, at one time or another, been asked the question “Are you related to the outlaw Jesse James?” In the case of our little James group, traditional genealogical research has already shown that it is quite unlikely that we are related to the famous outlaw Jesse Woodson James (1847-1882). However, it is always remotely possible that somewhere along our relatively long James line an unknown James man escaped notice, or perhaps even that there was an earlier undiscovered James connection in the British Isles. The James Y-DNA surname project has allowed us to foreclose this remote possibility. One participant in the overall James Project (Kit No. 108303) shares a common male ancestor with the outlaw Jesse James: William James (1754-1805), who was born in England. Hence, his Y-DNA profile should be close to, if not exactly matching, that of Jesse James.[16] Comparing this participant’s 67-marker Y-DNA profile to that of Warren and/or Lee shows a genetic distance of 29, clearly an indication of no relationship. Therefore, those in our little James group can state unequivocally that we are completely unrelated to the outlaw Jesse James.




This Y-DNA surname project has been a worthwhile experience for the participants in our little James group. Six of the participants have been able to validate their research about their descent from Ancestral David. Although their research is not yet complete, two other participants have learned that they either descend from Ancestral David or, at the very least, share a common male James ancestor with Ancestral David. Hence, they now have a genealogical target to aim for. The results of this project have also allowed earlier suggestions to be disproved - that Ancestral David had a brother and father also living in Chester County. There was an unhappy outcome for a ninth participant who learned that he did not fit into this James group, but at least he knows that he has a problem that he may be able to solve. Considerable refinement has been made in characterizing the group’s haplogroup, although considerably more data will be required before our group’s pre-historical ancestors can be further defined. In addition, the average mutation rates for our group have been determined. It was shown that our James group was not related to the outlaw Jesse James. Finally, readers are reminded that caution should be exercised when comparing Y-DNA results from different testing laboratories.




The logic arguments for Examples 3 and 4 were developed by Robert D. McLaren, and were part of his paper “Managing a Large Surname DNA Project, With Some Interesting Results,” presented at the Annual Meeting of the National Genealogical Society held in Virginia in May 2007. The author thanks Bob for letting him use his logic arguments. Thanks are also due to Charles F. Kerchner, Jr. for providing the average mutation rates for our James group. Finally, the author is also indebted to Susan (James) Rosine, Warren’s daughter, for providing the SNP and haplogroup information that comprises Example 6, as well as several useful suggestions that improved this article.


Final Note


This is an expanded written version of a talk delivered to the Olympia Genealogical Society Spring Seminar held in Olympia, WA in 2008.


End Notes


1.     Thomas H. Roderick, PhD, “The Y Chromosome in Genealogical Research: ‘From Their Ys a Father Knows His Own Son,’” National Genealogical Society Quarterly, Vol. 88, June 2000, pp. 122-143.

2.     Although his origin in Radnorshire has not been definitively proved, it is known that David James was Welsh because the inventory of his estate lists books written in the Welsh language.

3.     Robert also tested a seventeenth marker and had a value of 23 at DYS No. 635 (also called GATA-C4), a marker not tested for by the other James participants.

4.     DNA Y-chromosome Segment. A nomenclature system established by international consensus. The DYS is the “name” of each marker.

5.     Although unusual, the observation of a mutation between brothers is not unheard of. For example, Bennett C. Greenspan, the founder and President of FTDNA, relates how he and his brother Jim differ by a value of one at DYS 385a (Bennett had the mutation and passed it on to his son): Bennett Greenspan, “An Insider’s Look at the Genealogy DNA Field,” New England Ancestors, Vol. 5, No. 3, Summer 2003, pp. 21-23.

6.     Brothers Wilbur and Robert employed different DNA testing laboratories, and not all laboratories report results for DYS H4 on the same basis. More information on this issue may be found at the National Institute of Standards & Technology website: http://www.cstl.nist.gov/biotech/strbase/YSTRs/H4_nomenclature.htm. FTDNA (Wilbur’s test) measures GATA-H4, while Familybuilder (Robert’s test) measures TAGA-H4 and, when the appropriate correction was applied, both brothers had the same value (reported herein as GATA-H4).

7.     A difference of one unit at one marker is a “genetic distance” of one. A difference of one unit at each of two markers, or a difference of two units at one marker, is a genetic distance of two, etc.

8.     Haplotype. One person’s set of values for the markers that have been tested. Two individuals that match on all markers but one have two distinct haplotypes.

9.     Chris also matches Warren and Lee 66/67, again a very close relationship.

10.   Stephen Oppenheimer, The Origins of the British - A Genetic Detective Story, New York, NY: Carroll & Graf Publishers, 2006.

11.   Single Nucleotide Polymorphism (SNP). A change in the DNA that happens when a single nucleotide (A, T, G, or C) in the genome sequence is altered.

12.   A person who tests “positive” at L21 has a “G” (Guanine) where almost everyone else has a “C” (Cytosine); i.e., a SNP.

13.   International Society of Genetic Genealogy, http://www.isogg.org.

14.   T. Whit Athey, PhD, “Mutation Rates - Who’s Got the Right Values?,” Journal of Genetic Genealogy, Vol. 3, No. 2, 2007, pp. i-iii.

15.   http://www.kerchner.com/dnamutationrates.htm.

16.   Jesse James’s mtDNA has also been characterized: see Anne C. Stone, James E. Starrs, and Mark Stoneking, “Mitochondrial DNA Analysis of the Presumptive Remains of Jesse James,” Journal of Forensic Sciences, Vol. 46, No. 1, 2001, pp. 173-176.



Figure 2. Thirty-seven-marker results for the eight participants in this portion of the James Surname Y-DNA Project.