Tuesday, June 14, 2016

DIY Ancestry DNA circles

Ancestry didn’t give me any DNA circles, so I made my own. If you want to join me in the DNA circle loop, then you will need AncestryDNA results and:

Use the DNAGedcom client to download your Ancestry matches and in-common-with (ICW) results as spreadsheets. You will need to click “Gather Matches” and “Gather ICW”. It’s the most convenient way to get the shared match information from Ancestry.

NodeXL is where the magic happens. It’s an Excel tool for social network analysis. I used NodeXL because it’s in Excel, which I’m familiar with, and it has all the facilities I need in the free version. I don’t know anything about social network analysis, and I didn’t need to in order to get the result I wanted. Follow the instructions on the website linked above to get started. It takes a little fiddling to get used to it, but in the familiar Excel interface it’s not as intimidating as it might at first seem.

Now the fun begins!

When you create a file using the template, you will see an extra ribbon, and an area for your charts to display. Those extra features won’t be there when you open Excel as normal, only when you open a spreadsheet from the template.

You will see several tabs. The most important for our purposes are “Vertices” and “Edges”. Think of “Vertices” as people, and “Edges” as relationships between people. The list of Match IDs goes into “Vertices”, and the paired Match IDs in the ICW file go into “Edges”. As it’s Excel, you can cut and paste data into the sheets. I pasted twice on each sheet – the first time with just the match ID numbers in the first column (or two columns for Edges), then the rest of the columns into the “add your own columns here” section.
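If you would rather tidy the two lists before pasting them in, the idea can be sketched in a few lines of Python. This is only an illustration of what goes on each sheet, not part of the NodeXL steps. The column names `matchid` and `icwid` are made up for the example, so check the headers in your own DNAGedcom files.

```python
import csv
import io

# Hypothetical column names: check the headers in your own files.
MATCH_ID_COL = "matchid"
ICW_ID_COL = "icwid"


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_vertices(match_rows):
    """One vertex per match: unique match IDs in first-seen order."""
    return list(dict.fromkeys(row[MATCH_ID_COL] for row in match_rows))


def build_edges(icw_rows):
    """One undirected edge per pair of matches who share DNA with each other."""
    edges = set()
    for row in icw_rows:
        a, b = row[MATCH_ID_COL], row[ICW_ID_COL]
        if a != b:
            # Sort each pair so A-B and B-A count as the same edge.
            edges.add(tuple(sorted((a, b))))
    return sorted(edges)


# Tiny made-up sample standing in for the two downloaded spreadsheets.
matches_csv = "matchid,name,sharedcm\nA1,Alice,120\nB2,Bob,45\nC3,Carol,30\n"
icw_csv = "matchid,icwid\nA1,B2\nB2,A1\nB2,C3\n"

vertices = build_vertices(read_rows(matches_csv))
edges = build_edges(read_rows(icw_csv))
print(vertices)
print(edges)
```

The vertex list is what would go in the first column of “Vertices”, and each pair in the edge list fills the first two columns of “Edges”.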

Click “Refresh Graph” to see a graph of your information. When you first drop match information in you will probably get a big mess of dots and crossing lines. There are options to fix that.

With a bit of fiddling, I came up with this:


Look! I’ve got circles!

Each dot represents a person, each line a DNA relationship between two people. When trying to interpret the information, remember that Ancestry has a cut-off – it won’t show shared matches unless at least one of the people is a fourth cousin or closer to you. At least, that’s how I think it works. I’m not sure if they also have to be fourth cousins or closer to each other to show up. If you can enlighten me on exactly how it works, I’d be grateful.

The point is to remember that because of the cut-off there are likely to be other relationships between the dots that you can’t see. I assume that’s what’s happening with the fan shaped ‘circles’. I had 35 fourth cousins or closer at the time of making this chart and no circles or “New Ancestor Discoveries”.

To get distinct clusters I first used the “Group by cluster…” option on the toolbar.
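To get a feel for what grouping does, here is a rough Python sketch of a much simpler stand-in: it puts people in the same group whenever they are linked by any chain of shared matches (connected components). NodeXL’s clustering options are more sophisticated than this and can split one connected blob into several clusters, so treat it as an illustration of the idea only. The IDs are made up.

```python
from collections import defaultdict


def connected_components(vertices, edges):
    """Group vertices that are joined by any chain of edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    groups = []
    seen = set()
    for start in vertices:
        if start in seen:
            continue
        # Walk outwards from this vertex, collecting everyone reachable.
        seen.add(start)
        group = []
        stack = [start]
        while stack:
            v = stack.pop()
            group.append(v)
            for n in neighbours[v]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        groups.append(sorted(group))
    return groups


# Made-up example: two linked groups and one match with no shared matches.
sample_vertices = ["A", "B", "C", "D", "E", "F"]
sample_edges = [("A", "B"), ("B", "C"), ("D", "E")]
groups = connected_components(sample_vertices, sample_edges)
print(groups)
```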


The groups might still be mixed up at this stage. To separate the groups from each other, I clicked the little arrow dropdown to the right of “Circle” (above) and under “Layout options” I chose “Lay out each of the graph’s groups in its own box”.


For the layout I chose “Circle”. Because I wanted DNA circles. You could make a DNA spiral or a sine wave or a grid or a random layout or … but circles work nicely and they help with the circle-envy. This option is available both on the main NodeXL ribbon, and in the settings at the top of the graph area.
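A circle layout is easy to picture in code: space the members of a group evenly around a circle. The Python sketch below shows the general idea under that assumption. It is not NodeXL’s own code, and the function name and parameters are invented for the example.

```python
import math


def circle_layout(members, centre=(0.0, 0.0), radius=1.0):
    """Return an (x, y) position for each member, evenly spaced on a circle."""
    n = len(members)
    positions = {}
    for i, member in enumerate(members):
        angle = 2 * math.pi * i / n  # equal slice of the circle per member
        x = centre[0] + radius * math.cos(angle)
        y = centre[1] + radius * math.sin(angle)
        positions[member] = (x, y)
    return positions


positions = circle_layout(["A", "B", "C", "D"])
for member, (x, y) in positions.items():
    print(member, round(x, 3), round(y, 3))
```

Giving each group its own centre would place each circle in its own patch of the chart.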

“Autofill columns” on the main ribbon lets you easily move information from your own columns into the columns that control the graph’s appearance. There are a lot of options to play with – size and colour of dots, thickness of lines all have potential. I set the size of each dot to the number of Shared cM with me. You can also label the dots using information on the sheet. The obvious label to use is the person’s name.
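Sizing dots by Shared cM boils down to stretching one range of numbers onto another. A minimal Python sketch of a straight-line scaling follows. The size limits of 2 and 20 are arbitrary values picked for the example, not NodeXL defaults, and NodeXL may well scale things differently.

```python
def scale_sizes(shared_cm, min_size=2.0, max_size=20.0):
    """Map each match's shared cM onto a dot size between min_size and max_size."""
    lo = min(shared_cm.values())
    hi = max(shared_cm.values())
    if hi == lo:
        # Everyone shares the same amount, so use a middling size for all.
        return {name: (min_size + max_size) / 2 for name in shared_cm}
    span = hi - lo
    return {
        name: min_size + (cm - lo) / span * (max_size - min_size)
        for name, cm in shared_cm.items()
    }


# Made-up matches and shared cM values.
sizes = scale_sizes({"Alice": 120, "Bob": 70, "Carol": 20})
print(sizes)
```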

You need to refresh the graph by clicking “Refresh Graph” (the same button reads “Show Graph” before the graph is first drawn) when data changes on a worksheet. If you’re only changing display options, you can save the recalculation time by clicking “Lay Out Again”.

There’s a lot of fun to be had just playing with the options. I’ve also tried this with my FTDNA results. For those, I had a much busier chart. Different clustering algorithms had different effects, and the dynamic filter came in useful to clear away matches who sat in distracting “pile up regions” which could be seen as a dense collection of interlinked spots.
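One way to think about clearing away a pile-up is to drop anyone with more links than some threshold. The Python sketch below does that with made-up data. The threshold idea and the function name are assumptions for illustration, not a description of how NodeXL’s dynamic filter works internally.

```python
from collections import Counter


def drop_busy_matches(edges, max_links):
    """Remove every match with more than max_links links, plus their edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    busy = {v for v, d in degree.items() if d > max_links}
    kept = [(a, b) for a, b in edges if a not in busy and b not in busy]
    return kept, sorted(busy)


# Made-up example: H is linked to everyone, A and B are also linked to each other.
sample_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
kept, busy = drop_busy_matches(sample_edges, max_links=3)
print(kept)
print(busy)
```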

In my next post I’ll show you how I used my DIY Ancestry DNA circles to identify a new research lead.


