Twigs of Yore: DIY Ancestry DNA circles

Tuesday, June 14, 2016

DIY Ancestry DNA circles

Ancestry didn’t give me any DNA circles, so I made my own. If you want to join me in the DNA circle loop, then you will need AncestryDNA results and:

The DNAGedcom client ($5 monthly subscription); and
NodeXL (free basic version).

Use the DNAGedcom client to download your Ancestry matches and in-common-with (ICW) results as spreadsheets. You will need to click “Gather Matches” and “Gather ICW”. It’s the most convenient way to get the shared match information from Ancestry.

NodeXL is where the magic happens. It’s an Excel tool for social network analysis. I used NodeXL because it’s in Excel, which I’m familiar with, and it has all the facilities I need in the free version. I don’t know anything about social network analysis, and I didn’t need to in order to get the result I wanted. Follow the instructions on the NodeXL website to get started. It takes a little fiddling to get used to it, but in the familiar Excel interface it’s not as intimidating as it might at first seem.

Now the fun begins!

When you create a file using the template, you will see an extra ribbon, and an area for your charts to display. Those extra features won’t be there when you open Excel as normal, only when you open a spreadsheet from the template.

You will see several tabs. The most important for our purposes are “Vertices” and “Edges”. Think of “Vertices” as people, and “Edges” as relationships between people. The list of Match IDs goes into “Vertices”, and the paired Match IDs in the ICW file go into “Edges”. As it’s Excel, you can cut and paste data into the sheets. I pasted twice on each sheet – the first time with just the match ID numbers in the first column (or two columns for Edges), then the rest of the columns into the “add your own columns here” section.
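If it helps to see the same idea outside Excel, here is a minimal Python sketch of the vertices-and-edges split. The field names (`match_id`, `icw_id`, `shared_cm`) and the sample rows are made up for illustration; the real DNAGedcom spreadsheets have their own column headings. Collapsing reversed pairs into a single edge is also my assumption, not something NodeXL requires.

```python
# Hypothetical sample rows standing in for the match and ICW spreadsheets.
matches = [
    {"match_id": "A1", "name": "Alice", "shared_cm": 52.0},
    {"match_id": "B2", "name": "Bob", "shared_cm": 31.5},
    {"match_id": "C3", "name": "Carol", "shared_cm": 24.0},
]
icw_rows = [
    {"match_id": "A1", "icw_id": "B2"},
    {"match_id": "B2", "icw_id": "A1"},  # same pair, listed the other way round
    {"match_id": "B2", "icw_id": "C3"},
]


def build_graph(matches, icw_rows):
    """Return (vertices, edges): one vertex per match, one edge per unordered pair."""
    vertices = [m["match_id"] for m in matches]
    known = set(vertices)
    edges = set()
    for row in icw_rows:
        a, b = row["match_id"], row["icw_id"]
        # Skip self-pairs and pairs that mention someone not in the match list.
        if a != b and a in known and b in known:
            edges.add(tuple(sorted((a, b))))
    return vertices, sorted(edges)


vertices, edges = build_graph(matches, icw_rows)
print(vertices)
print(edges)
```

The first list corresponds to the first column of the “Vertices” sheet, and each pair corresponds to one row in the first two columns of the “Edges” sheet.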

Click “Refresh Graph” to see a graph of your information. When you first drop match information in you will probably get a big mess of dots and crossing lines. There are options to fix that.

With a bit of fiddling, I came up with this:

Look! I’ve got circles!

Each dot represents a person, each line a DNA relationship between two people. When trying to interpret the information, remember that Ancestry has a cut-off – it won’t show shared matches unless at least one of the people is a fourth cousin or closer to you. At least, that’s how I think it works. I’m not sure if they also have to be fourth cousins or closer to each other to show up. If you can enlighten me on exactly how it works, I’d be grateful.

The point is to remember that because of the cut-off there are likely to be other relationships between the dots that you can’t see. I assume that’s what’s happening with the fan shaped ‘circles’. I had 35 fourth cousins or closer at the time of making this chart and no circles or “New Ancestor Discoveries”.

To get distinct clusters I first used the “Group by cluster…” option on the toolbar.
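To give a feel for what grouping does, here is a rough Python sketch. It is not the algorithm behind NodeXL's “Group by cluster…” option; it uses the much simpler idea of connected components, where two people land in the same group if you can get from one to the other by following lines. The IDs are invented for the example.

```python
from collections import defaultdict


def connected_groups(vertices, edges):
    """Group vertices so that any two linked by a chain of edges share a group."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in vertices:
        if start in seen:
            continue
        # Walk outwards from this vertex, collecting everyone reachable.
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            current = stack.pop()
            group.append(current)
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(group))
    return groups


# Hypothetical match IDs and shared-match pairs.
groups = connected_groups(
    ["A1", "B2", "C3", "D4", "E5"],
    [("A1", "B2"), ("B2", "C3"), ("D4", "E5")],
)
print(groups)
```

A real clustering algorithm goes further and can split one large tangle into several groups, which is why the different options in NodeXL can give different pictures.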

The groups might still be mixed up at this stage. To separate the groups from each other, I clicked the little arrow dropdown to the right of “Circle” (above) and under “Layout options” I chose “Lay out each of the graph’s groups in its own box”.

For the layout I chose “Circle”. Because I wanted DNA circles. You could make a DNA spiral or a sine wave or a grid or a random layout or … but circles work nicely and they help with the circle-envy. This option is available both on the main NodeXL ribbon, and in the settings at the top of the graph area.
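For the curious, a circle layout is just evenly spaced points around a centre. The sketch below is a plain Python illustration of that idea, not NodeXL's own code; the group contents, the radius and the spacing of 3 units between groups are all arbitrary choices for the example.

```python
import math


def circle_layout(ids, centre=(0.0, 0.0), radius=1.0):
    """Place each id at an evenly spaced point on a circle around centre."""
    count = len(ids)
    positions = {}
    for index, vertex_id in enumerate(ids):
        angle = 2 * math.pi * index / count
        x = centre[0] + radius * math.cos(angle)
        y = centre[1] + radius * math.sin(angle)
        positions[vertex_id] = (x, y)
    return positions


# Hypothetical groups; each one gets its own circle, side by side.
groups = [["A1", "B2", "C3", "D4"], ["E5", "F6", "G7"]]
layout = {}
for index, group in enumerate(groups):
    layout.update(circle_layout(group, centre=(3.0 * index, 0.0)))

for vertex_id, (x, y) in layout.items():
    print(f"{vertex_id}: ({x:.2f}, {y:.2f})")
```

Giving each group its own centre is the same idea as laying out each group in its own box.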

“Autofill columns” on the main ribbon lets you easily move information from your own columns into the columns that control the graph’s appearance. There are a lot of options to play with – size and colour of dots, thickness of lines all have potential. I set the size of each dot to the number of Shared cM with me. You can also label the dots using information on the sheet. The obvious label to use is the person’s name.
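Mapping a data column onto dot size amounts to rescaling one range of numbers onto another. Here is a small Python sketch of a straight-line rescale; the size range of 2 to 10 and the sample cM values are assumptions for illustration, and NodeXL's own autofill may scale differently.

```python
def scale_sizes(shared_cm, min_size=2.0, max_size=10.0):
    """Linearly map shared cM values onto a dot-size range."""
    lowest = min(shared_cm.values())
    highest = max(shared_cm.values())
    if highest == lowest:
        # Everyone shares the same amount, so give them all a middling size.
        return {key: (min_size + max_size) / 2 for key in shared_cm}
    span = highest - lowest
    return {
        key: min_size + (value - lowest) / span * (max_size - min_size)
        for key, value in shared_cm.items()
    }


# Hypothetical shared cM totals for three matches.
sizes = scale_sizes({"A1": 20.0, "B2": 60.0, "C3": 100.0})
print(sizes)
```

The smallest match gets the smallest dot, the largest match gets the largest dot, and everyone else falls in proportion between them.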

You need to refresh the graph by clicking “Show graph” when data changes on a worksheet. If you’re only changing display options, you can save the recalculation time by clicking “Lay Out Again”.

There’s a lot of fun to be had just playing with the options. I’ve also tried this with my FTDNA results. For those, I had a much busier chart. Different clustering algorithms had different effects, and the dynamic filter came in useful to clear away matches who sat in distracting “pile up regions” which could be seen as a dense collection of interlinked spots.
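One way to think about clearing away a pile-up is to hide anyone with an unusually large number of lines attached. The Python sketch below does that with a simple threshold. The threshold, the IDs and the choice of "number of connections" as the thing to filter on are all assumptions for the example; the dynamic filter in NodeXL lets you filter on whichever columns you have.

```python
from collections import Counter


def drop_busy_vertices(edges, max_degree):
    """Remove every edge that touches a vertex with more than max_degree edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    busy = {vertex for vertex, count in degree.items() if count > max_degree}
    kept = [(a, b) for a, b in edges if a not in busy and b not in busy]
    return kept, busy


# Hypothetical data: "H" is linked to everyone, like a match in a pile-up region.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
kept, busy = drop_busy_vertices(edges, max_degree=3)
print(kept)
print(busy)
```

With the busy vertex gone, the remaining lines are easier to read, at the cost of hiding some genuine relationships along with the noise.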

In my next post I’ll show you how I used my DIY Ancestry DNA circles to identify a new research lead.

11 comments:

The Brigham City Fort, June 16, 2016 at 4:49 AM
Interesting blog. Maybe we should follow each other's blogs.
thestephensherwoodletters.blogspot.com

Cassmob (Pauleen), June 21, 2016 at 6:24 PM
Shelley, you've excelled (oops bad pun!) yourself. I'm not unfamiliar with Excel but this sounds a tad overwhelming. However I'll save the post and reflect on it further.

A O'Brien, September 5, 2016 at 5:37 PM
Shelley I just wanted to thank you so much for this. I have created my own circles now using this method and it is so interesting!

A O'Brien, June 22, 2017 at 11:24 AM
Hi Shelley, I am still using NodeXL to create circles of my DNA matches and want to thank you again for this post as it has been so useful for my DNA research. Have you shared it on the DNAGedcom group on Facebook or the DNA for Genealogy Aus & NZ Facebook group? I think they would find it very useful too :) Thanks again.

Magda, July 31, 2017 at 2:35 AM
Does the DNAGedcom client download into Chromebook Google spreadsheets? Can hardly wait to try this method.

Unknown, November 11, 2017 at 3:50 PM
Putting two technical suggestions up front for people who made the same mistakes I did:
1. Installed Windows version of NodeXL, loaded Excel, couldn't find template anywhere. Win 10 64 bit and others reported issues. Turns out you have to find the template by name in the Start menu and run it there. Comes up just fine.
2. Built first spreadsheet to use. Pasted selected DNAGEDCOM match into Vertices tab, ICW data for about 5 people into the Edges tab. Worked great. Copied more data into place in the same spreadsheet. Didn't work at all! Totally counter-intuitive. Maybe there's a way to get this iterative copying to work, but I gave up after about 45 minutes. Realized Excel might support a much better approach anyway, and it does. Put your entire match data into the vertices and 100% of your ICW rows into edges. Don't try to pre-select ICW rows to copy. Put them all in there. Then you can Filter on one of the ICW columns, graph, select more rows, graph again, select fewer rows, graph again, etc. This way, you're never trying to paste new data after your initial population of the tabs.
3. For convenience of selecting ICW rows, I put the ICW name into the Label column and the ICW admin name into the first "add your columns here" column on the right. I filter on the ICW admin column most of the time to pick up all the kits that one person administers. But you can filter on any column you want.

Unknown, November 13, 2017 at 6:34 AM
Thanks for the link, I'll work my way through the posts. I'm getting hundreds of Ancestry matches added each week so I opted for the simplest exploration so far (having only spent a couple of evenings on this). I'm needing to add various DNA-based cousins to my genealogy database to help interpret clustering (for example, one cluster all comes through the same pioneers in Perry Co, Pa while another has the right surname for the expected DNA match but through a different PA county - do they intersect in any known way?). Once done, groups closer to me may point to where this DNA entered my own cluster of cousins. The next thing I'll read in the blogs relates to removing clutter from the graphs to see useful structure. I've already realized I should start small and then add in people as seems appropriate. / Tom