This is the first post of a series that will demonstrate, step-by-step, how to use a free Excel-based network analysis tool to explore and get more from your Ancestry DNA matches.
You can find an index to this series here.
I first posted on this subject a year ago (under the title DIY Ancestry DNA Circles). Since then, I have experimented and refined the methods I’m using. I think my process is ready for prime time, and that a lot of genetic genealogists would get a lot of value from it. The tool is NodeXL Basic – a free template created by the ‘Social Media Research Foundation’ – which gives Excel the ability to create network charts. It has made a huge difference to how I look at my Ancestry DNA matches.
What you will learn
- how to visualise your Ancestry DNA shared matches,
- find DNA circles for yourself and among you matches
- keep track of little clues that you discover along the way
- update with new information easily
The people I have in mind while writing these posts have a reasonable basic knowledge of how to get around on their computer (open a file, find and click on a menu item), but no special skills in Excel. You don’t need to be able to write formulas (but if you can you’ll be able to supercharge your analysis).
Example
An example of the type of visualisation it’s possible to create from match lists is below. These are very quick, just to provide a general idea of where this exercise is headed. There’s a lot more you can do.
Each dot represents a person on my match list – I could have included name labels but have left them them off for online display. I’ve also added some additional information to some of my matches, and I have retained those labels. Two of my matches in the green cluster descend from John Allsop and Ellen Fearn. Having looked at some of the trees of people in my red cluster, I suspect they may be connected to my ‘Mack’ family ancestors who lived in Belfast.
If I decide to focus on the Mack family, for example, I can see right away a group of close matches whose trees I should examine. I will know which ‘private’ or ‘no tree’ people I could contact in relation to that branch, or who I could ask to upload results to GEDmatch to compare DNA directly.
The dots are sized according to the amount of DNA I share with them. Larger dots are closer relatives. Which brings me to another point. I can see relationships here that I can’t see by navigating through Ancestry shared match pages! On Ancestry, the shared match page for each match only shows people who match and who are (approximately) fourth cousins or closer to you. That means that on the match page for one of your thousands or distant relatives you may see one of your fourth cousins. You won’t see that distant relative on the shared match list for the fourth cousin.
Take a look at the green cluster above. There is a smaller, ‘distant relative’ dot at the bottom of the main grouping who is connected to three of the larger, fourth or closer cousin dots. I don’t see that distantly related person on the my fourth cousin shared match pages! But maybe that person has something in their tree that is the key to working out exactly where those others fit in. I also wouldn’t easily see from navigating ancestry pages, or even with any ease the in common with lists, the big clue that they may be related to my ancestors John Allsop and Ellen Fearn.
Once I get through loading data and navigating the interface, I’ll provide some more detailed examples of use.
What you will need
- Excel 2007 (or more recent) installed on a Windows
machine.
Knowledge of Excel is not required. If you are able to follow the “what you need” steps, you should be able to follow the rest. I will make suggestions for more advanced users of Excel who want to get even more from their. Sorry Mac users, but the tool I’m going to suggest is Windows only. - AncestryDNA match and in-common-with lists
Obviously you will need to have done an AncestryDNA test.
To obtain these lists, use the DNAGedcom client [PDF link with instructions]. There is a very modest cost involved – $5 US for one month of access. It’s the most convenient way to get the shared match information from Ancestry and well worth the money. Follow the directions, and download your match and in-common-with (ICW) lists. - NodeXL Basic Excel Template
Follow the directions on the website to install the Basic (no cost) template.
Take care, as there are links to both a Pro and a Basic version. Its the free Basic version you need – make sure the download link you click has the word “Basic” in it. You should not be prompted for a licence number or payment when you install.
In my next post, I’ll show you how to load your information into the template and generate your first graph.
Still confused!!!
ReplyDeleteGenetic genealogy is a confusing subject!
DeleteThat's for sure.
DeleteOk, I've got it installed and working. Awaiting your next step/s...
ReplyDeleteI've since uninstalled the template as it was causing issues. On startup it keeps asking for the Pro licence. I ignored that and then an error message popped up that Excel had stopped working and shut the program down. Using Excel 2016 (via Office 365 sub).
DeleteI have the same version of Excel as you and haven't had any issue, I'm sorry you haven't had the same experience.
DeleteDid you download the Basic or the Pro version? It's the basic version you need. The file you downloaded should have been called NodeXLBasicExcelTemplate2014Setup.exe. A screen pops up for 20 seconds when you first open a template in the basic version but nothing that prompts for a licence.
I rechecked and found I had downloaded the Pro version in error. Now working as expected, thanks.
Deletecan't.. apparently it's not a trusted something or rather ... :(
ReplyDeleteThe installation page ( http://www.smrfoundation.org/nodexl/installation/ ) gives some instructions about adding an address to trusted sites if you get an error. If you are prepared to add it to the trusted sites list that should probably resolve the problem.
DeleteThat is only for the pro version, which you do not need. The basic version works without any special setup or permissions. Here is the direct link: http://nodexl.codeplex.com/downloads/get/806203
DeleteThanks Shelley - coming late to the party. Have set the afternoon to play (if my payment to DNA Gedcom gets processed by then).
ReplyDeleteDownloading the three files from Ancestry via DNAGedcom client took 7 1/2 hours. Guess I'll do the next step tomorrow.
ReplyDeleteThat first download can be slow.
Deletesubsequent downloads are relatively quick except for the icw files.
Deletei have a MAC. :(
ReplyDeleteYou can find the mac version here:
Deletehttp://www.dnagedcom.com/app/DNAGedcom.app.zip. It supports Ancestry, but not yet 23andMe or FTDNA.
Rob great! Thanks for that tip for MAC users! I have Excel on my Mac
DeleteThis just took me to the usual DNAGedcom connection client page. I'm wondering if there is a mac version of the Excel Template, ie to use on Numbers?
DeleteI'm having this same problem trying to use the Basic Excel template but owning a Mac. Did you ever find a solution?
DeleteNodeXL is Windows only, unfortunately. I've heard that it works on a Mac under Parallels. Alternatively there are a number free programs specifically for network graphs that do run on a Mac. Gephi seems to be popular. It can do most of the things I cover in these posts. Some it does better, some with more data wrangling before loading, but some not at all. I hope you find a solution!
DeleteI'd love to be able to try this, another Mac user though :(. I will say the Matches and ICWs downloaded from Ancestry lickety-split with DNAGedcom. Now I'm off in search of a tool, will take a look at Gephi.
DeleteI'm struggling to understand all of this but I have a basic question, because I don't know enough to ask anything else!
ReplyDeleteWhat is the difference between the output from this product and AncestryDNA Helper
offered by Jeff Snavely?
What fields appear in the resultant spreadsheet?
I haven't used the AncestryDNA Helper myself. My understanding is that only DNAGedcom extracts an in-common-with file, which is the file that shows you which of your matches are related to each other.
Deletedoes it work with Home and Student 2007 Excel
ReplyDeleteI don't know. I've seen someone post elsewhere that they had difficulties with it using Excel 2007.
DeleteHere's a question: I have an older dual platform macbook pro with both Ios and Windows installed on it. I rarely use its Windows platform but want to try to make this work. Can I do the in-common lists on my newer macbook using the DNAGedcom client [PDF link with instructions, and then transfer the results to the windows platform on the older computer once they're finished?
ReplyDeleteIt shouldn't make any difference which platform you use to obtain the csv files. So if NodeXL itself works for you - yes!
DeleteThanks, Shelley. I downloaded two sets of information yesterday to .csv files, one to gather matches and one to gather trees. Am I correct that now I gather ICW info?
DeleteYes, you'll need the in-common-with info.
DeleteIs it possible to create an in-common-with file manually? I'd rather not go into details, but for me, the DNAGedcom subscription is an impossibility at this time.
ReplyDeleteIt doesn't matter how the file was created - if it has pairs of IDs it will work. It would be an extremely time-consuming process though. You can get the match ID numbers from the URL of the match page on Ancestry.
DeleteShelley, I dont have a windows computer, so am thinking about buying one. Would 6gb memory be enough to do this work? Thanks!
ReplyDeleteHi Mary, I really couldn't say. It will depend on how big your file is.
DeleteI did this on another computer and it worked as for getting the graph. I had to do 4th cousins. I can't get them to go in groups and I can not click on dots to show anything. I can only click on the name and it will show up in the graph. When it shows connections in red I can not click on those dots. Everything in the graph is froze. I have done this over and over....
ReplyDeleteDo you have a lot of matches? It might be that your file is too large for your PC to manage. I will be describing a clean up step that might help in my next post.
DeleteI don't have that many 4th. I did mine yesterday and I could click on dots and the names would appear. I did my aunts today with only 4th and my graph keeps trying to move horizontal on top of sheet and when I click on a dot the title bar "document actions" turns orange and nothing happens. And I still can't get it to group in separate boxes.
DeleteI got the clusters separated a bit with diff colored dots but still can not click on dots to give me the names.
DeleteDoes it have to be Excel? Why wouldn't Open Office software work just as well?
ReplyDeleteBecause this process depends on using an add-on program (NodeXL) that was designed for Microsoft Excel.
DeleteHi Shelly and thank you very much for this. It's working well for my Ancestry results and I'm already building my own circles and seeing patterns I haven't seen before.
ReplyDeleteMy question is around FTDNA matches and where I can get my icw file from. I can download my complete list of matches but the icw one eludes me. I see from an earlier blog post of yours that you've been able to do it.
Cheers
Laurie
Hi Loz, the DNAGedcom client will also download FTDNA information.
DeleteHi Shelley
DeleteI've got the files, are there any instructions on which columns to set as Edge and Vertices please?
Thanks Shelly I'm trying that now
ReplyDeleteDo you know if there's a work-around for someone using Excel 2016? I get an error when trying to install NodeXL.
ReplyDeleteIt's supposed to work with Excel 2016. Try running it as admin? And check the information on the NodeXL home page about adding it to trusted sites, or any other help that page can offer. I hope it will work for you. Good luck!
DeleteI'm using it in Excel 2016 without issue.
DeleteIs there any way to get this to work using Excel on an iMac?
ReplyDeleteHi Sarah, sorry I missed this comment earlier. It won't work on the Mac version of Excel as I understand, but some people have said that they have been able to use it with Excel running under Parallels.
DeleteI administer my sister and 2 of my children's DNA tests at Ancestry. Do I download the DNAGedcom files for all of them or just mine to XLNode? Thank you sooo much for sharing this system! I am really looking forward to exploring it. :)
ReplyDeleteShelley, should this nodexl work on excell 2013? I downloaded the basic program, but can't find the template anywhere. Any suggestions?
ReplyDeleteGene
Hi Gene, it should work. See my reply to another comment:
Deletehttp://twigsofyore.blogspot.com.au/2017/07/visualising-ancestry-dna-matchesindex.html?showComment=1501805817748#c561069917028592215
Thanks, yes I found it no problem now, but a couple of questions.
Delete1. When I downloaded the ancestry data I downloaded everything. Took days. Can I just delete lines from the csv below 10 cm ( or higher) to cut down the number of points. First time through it was just a blob.
2. I am not an excel illiterate, but is there a clean way to clear the spreadsheet out and start over?
Deleting the small match records won't help, as they'll still be in the ICW file and show up as edges. You can use the dynamic filter option to temporarily hide them, or 'skip' them to hide them completely. Also try using different clustering algorithms. The Wakita-Tsurumi clustering algorithm usually break it up the most. Have a look at the part 5 and 6 posts for more options that may help.
DeleteIs there any reason to also process the tree file? --Ken Waters
ReplyDeleteNot for the purposes of making the chart.
DeleteI have just accessed this page for the first time and wanted to say you are amazing. I have a very good working knowledge of Excel and to be able to combine my love for genealogy, DNA, computers AND Excel is a dream come true. Thank you so much for making this available.
ReplyDeleteYou're very welcome :-) I'll just mention that the tool isn't mine, only these instructions. I'm very pleased that so many people have been able to take the step into something they didn't realise they could do.
DeleteI have just completed the construction of the Excel Spreadsheets and have imported my ancestry.com DNA data - my advice is to do it very slowly and follow the instructions Step by Step - one step at a time - if you rush it you will not succeed. When you succeed you will be rewarded with an outstanding representation of your DNA data. Thank you so much Shelly - very impressive. Well done.
ReplyDeleteThank you for this information! I am stuck gathering the matches on dnagedcom. I downloaded the client and I begin the process of gathering the matches and after about 2% have loaded, I receive the following message: "object reference not set to an instance of an object". I have basic computer skills and I have no idea why I am getting this message. My match gathering then ends after I click OK.
ReplyDeleteHi Anonymous, I'd suggest contacting DNAGedcom about that. One way to do so is through the user group on Facebook: https://www.facebook.com/groups/DNAGedcomUserGroup
DeleteI have this problem too. I don't have a Facebook account. How can I resolve the problem
DeleteThis comment has been removed by a blog administrator.
DeleteOnce I have downloaded everything per above, do I need to keep my monthly subscription to DNAGedcom?
ReplyDeleteOnce I have downloaded everything per above, do I need to continue my monthly subscription to DNAGedcom?
ReplyDeleteNot if you don't want to :-)
Delete