Blog post

Monday, September 5, 2016

Visualising DNA matches–FTDNA data

I decided to see what I could learn from my Family Tree DNA (FTDNA) match data when I looked at it using the visualisation tool, NodeXL. For my earlier discussion of this tool, see my post where I investigate Ancestry DNA data with it.

I match with around 960 individuals in FTDNA. There are around 10,000 ‘in common with’ matches between those people. Let's see how 10,000 connections between 960 people look…

NodeXL01_AllPeople

Like a colony of spiders, perhaps?

Each dot represents a person, and each line represents a DNA connection between two individuals. There are two bunched-up areas of dots where my matches have a lot of interconnections, but otherwise there is little structure to be seen. I tried all the different layout algorithms available, but this is as good as it gets on the first pass.
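
For readers who want to poke at the same kind of data outside NodeXL, here is a minimal sketch of the structure behind the chart: people as vertices, ‘in common with’ pairs as undirected links. The function names and the sample names are made up for illustration, and it assumes the roughly 10,000 figure counts each pair once.

```python
from collections import defaultdict


def build_graph(icw_pairs):
    """Map each person to the set of people they are 'in common with'."""
    neighbours = defaultdict(set)
    for a, b in icw_pairs:
        if a == b:
            continue  # ignore a person paired with themselves
        # Links are treated as undirected, so record both directions.
        neighbours[a].add(b)
        neighbours[b].add(a)
    return dict(neighbours)


def density(n_people, n_links):
    """Fraction of all possible person-to-person links that are present."""
    possible = n_people * (n_people - 1) // 2
    return n_links / possible


# Hypothetical sample pairs; real ones would come from an FTDNA export.
sample_pairs = [("Ann", "Bob"), ("Bob", "Cat"), ("Ann", "Cat"), ("Dan", "Eve")]
graph = build_graph(sample_pairs)
print(sorted(graph["Bob"]))

# Using the approximate figures quoted in this post.
print(round(density(960, 10_000), 3))
```

With 960 people there are 460,320 possible pairs, so 10,000 links is only about 2% of them. The chart looks dense, but most pairs of matches are not connected at all.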

Since my father has tested, I can divide my DNA matches into two groups based on whether they also match him. I have called the two groups “Paternal” and “Maternal”, which is possibly not entirely accurate but will be close enough for my purposes here, and redrawn the chart with each group laid out separately.
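
The split itself is simple set arithmetic. The sketch below assumes two lists of match names, one for each kit; the function name and the sample names are hypothetical, not anything FTDNA provides.

```python
def split_by_parent(my_matches, father_matches):
    """Divide my matches by whether they also appear in my father's list."""
    mine = set(my_matches)
    fathers = set(father_matches)
    paternal = mine & fathers   # people who match both of us
    maternal = mine - fathers   # people who match me but not him
    return paternal, maternal


# Hypothetical names for illustration only.
my_matches = ["Ann", "Bob", "Cat", "Dan"]
father_matches = ["Ann", "Bob", "Zed"]

paternal, maternal = split_by_parent(my_matches, father_matches)
print(sorted(paternal))
print(sorted(maternal))
```

This also shows why the labels are only approximate: anyone related to me through both parents lands in “Paternal”, and “Maternal” really just means “does not match my father”.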

NodeXL02_MaternalPaternal

There are clearly a lot of interconnections between my maternal and paternal matches. I’m surprised by the number of interconnections, as I don’t descend from an endogamous population.
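
One way to put a number on that impression is to count the links that join a match in one group to a match in the other. This is only a sketch with made-up names and a hypothetical function; it assumes the ‘in common with’ pairs are available as a simple list.

```python
def count_cross_links(icw_pairs, paternal, maternal):
    """Count distinct links joining a paternal match to a maternal match."""
    seen = set()
    for a, b in icw_pairs:
        crosses = (a in paternal and b in maternal) or (
            a in maternal and b in paternal
        )
        if crosses:
            # frozenset makes (a, b) and (b, a) count as the same link.
            seen.add(frozenset((a, b)))
    return len(seen)


# Hypothetical groups and pairs for illustration only.
paternal = {"Ann", "Bob"}
maternal = {"Cat", "Dan"}
sample_pairs = [
    ("Ann", "Bob"),  # both paternal
    ("Bob", "Cat"),  # crosses the groups
    ("Cat", "Bob"),  # same link listed the other way round
    ("Cat", "Dan"),  # both maternal
    ("Ann", "Dan"),  # crosses the groups
]

cross = count_cross_links(sample_pairs, paternal, maternal)
print(cross, "links cross between the two groups")
```

A count like this says nothing about why two people are connected, only that they are.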

There are two critical facts about the ‘in common with’ relationships that are not shown in the chart:

  • how close the relationship between my DNA matches is, and more importantly
  • whether their relationships are anything to do with my family tree.

It may be possible to incorporate the second point using FTDNA data. It should be possible to incorporate both points using GEDMatch. I hope to attempt this in a future post.

This exercise suggests to me that it would be even more dangerous than I thought to rely on ‘in common with’ data without also inspecting segment data.

2 comments:

  1. Since reading your post about using NodeXL for AncestryDNA yesterday I have been playing with visualizing my FTDNA matches as well. The 'group by cluster' option creates some interesting cluster groups but I am not exactly sure what I am seeing (different areas of England or Ireland?), but a lot of fun anyway!

    Replies
    1. It's great fun, isn't it?! Try mapping longest segment length or shared cM to the vertex size! I'd love to hear about it if you come up with new methods, or make a discovery using it.

      I'm a little wary of working with the raw 'in common with' info for now. It might be better if the most distant relatives are filtered off. I'm contemplating setting up phased chromosome locations as vertices, and linking people to them instead of to each other. I'm not sure that I've explained that well... I know what I mean...

      I'll post again if I try it and it looks promising!
