Genealogy Data Entry Standards

In life I like to do what I can to make the world a cleaner place, and it makes sense to extend this to genealogy... As I browse data online, I routinely find things that drive me nuts. Perhaps it comes out of my database management background, where I like everything neat and organized.

As the number of researchers and the amount of online data increase, it makes sense to apply some sort of standards, or else the possible variations and the resulting confusion will only multiply.

So if you post your data online, here are a few suggestions or pointers for cleaning up your data. Also listed below are a few other websites I have found with other suggestions.
  • Names
    • Unknown spouses: When I search for John Seymour (b. abt 1218) to see if he had any known wives, half of the results are for "Mrs. John Seymour", because that name matches the one I searched for. To me, this makes things more confusing, since I wanted only his records. Even then, many of those entries contain no additional data - just a blank record. My suggestion: if a spouse's name is not known, but you still have data to include, use the name "unknown" or "?", rather than assigning him or her a married name. If no other data is known, then do not include a spouse record at all.
    • Unknown parents: I often notice a person's parent listed as "N.N. Smith" (intended to be "No Name Smith", but could be confused with someone's initials which could be N.N.), or "__ Smith", only to find no additional data after clicking into it. If there is additional data to record, use "__ Smith" or "? Smith", and if there is no other data, then do not include a parent record at all.
    • Patronymic names: For medieval names that include "de", "ap", "von", "le", etc., it is helpful to list the given name as "John de", and the surname as "Seymour", rather than the surname as "de Seymour"... at least in cases where the "de" is optional or dropped in some generations. (Modern cases may be different, as the surname may be more standardized with the prefix). As long as both forms are in use, anyone searching online has to repeat each search for both variations of the name, or else miss half of the results.
    • Abbreviations: Similar to the patronymic names, if you have surnames like "St. Martin", choose whatever form you like, but stay consistent. Including variations such as "St. Martin", "St Martin", and "Saint Martin" at the same time may lead to names not being found.
  • Dates
    • Many genealogy programs can do cross-checks on the dates of related people, to point out errors. Online I often find people who were born before their parents were, or were born years after their parents died. As these point out obvious impossibilities, it makes sense to clean up the data where possible before posting online for everyone to see. If your program has this option, check it out.
    • Added 7/17/05: If at all possible, record the birth and death dates, or make educated guesses based on parents, children, date of marriage, etc. if the actual dates are not known (and record that they are guesses). When cross-checked (mentioned above), the dates may expose errors. If left out, there's nothing to validate the data against.
  • Places
    • When possible, it is helpful to spell place names out in full, except perhaps for common abbreviations such as the 2-letter state abbreviations, and only when it is clear what these mean. Sometimes I have seen place abbreviations and have had to do more digging to find out what they stood for, and this can lead to misinformation...
    • When possible, include the county names and city names. At least in NC, there are cases of a city name and a county name being the same, and the city is not in that county. So "Washington, NC" could be confused for either "Washington Co, NC" or "Washington, Beaufort Co, NC", again, leading to misinformation.
    • While not necessarily an accepted standard by others, I prefer my places in reverse order: Country, State, County, City, Other... Thus when I produce lists sorted by place alphabetically, I get the list both alphabetically and geographically sorted at the same time.
  • Posting Online
    • Many times I have seen one researcher's data posted on the same website (such as at Rootsweb) multiple times - probably one set of data is an update of the other sets? If you do not need to have multiple copies, then please remove the older uploads. Otherwise, the outdated data is still there for people to see, and it increases the amount of data to search through.
  • Links
Above all, intentionally choose your standards, stick to them, and be consistent.
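
A few of the suggestions above (consistent surname forms, date cross-checks, and reverse-order places) lend themselves to a quick automated check before you upload. The Python sketch below is only an illustration, not how any particular genealogy program does it. The `Person` layout, the function names, and the one-year allowance for a child born after a parent's death are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Hypothetical record layout; real genealogy programs store far more.
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def surname_key(surname: str) -> str:
    """Map 'St. Martin', 'St Martin' and 'Saint Martin' to one search key."""
    return re.sub(r"^(st\.?|saint)\s+", "saint ", surname.strip().lower())


def date_problems(child: Person, parent: Person) -> list[str]:
    """List obvious impossibilities between a child and one parent."""
    problems = []
    if child.birth_year is None:
        return problems  # no date recorded, so nothing to validate against
    if parent.birth_year is not None and child.birth_year <= parent.birth_year:
        problems.append(f"{child.name} was born no later than parent {parent.name}")
    # Assumed allowance: a child can be born within a year of a father's death.
    if parent.death_year is not None and child.birth_year > parent.death_year + 1:
        problems.append(f"{child.name} was born more than a year after {parent.name} died")
    return problems


def place_sort_key(place: str) -> tuple[str, ...]:
    """Turn 'City, County, State, Country' into a largest-first sort key."""
    parts = [p.strip() for p in place.split(",") if p.strip()]
    return tuple(reversed(parts))
```

Sorting a list of conventionally ordered places with `sorted(places, key=place_sort_key)` groups them by country, then state, then county, which gives the same alphabetical-and-geographical ordering as storing them in reverse order to begin with. Comparing `surname_key()` values instead of raw surnames is one way to spot where several spellings of the same name have crept into a file.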

If you believe these ideas to be helpful, feel free to refer this page to the listservs you are on, or provide links from your website. Only by spreading the word will the ideas be able to make a difference.

Thanks for your efforts. If anyone has suggestions to update the above, or other ideas to add, please let me know.


Date: 02/11/07; By Chief Scribe;
Well done!