‘The perfect is the enemy of the good’

Figure 1

We have all heard the phrase “the perfect is the enemy of the good,” which originated in an Italian proverb in 1603 and was popularized by Voltaire in 1770. The phrase is very well suited to the topic of searching genealogical databases, and particularly to AmericanAncestors.org.

Over the last year, the NEHGS web team has been researching a wide variety of things that we can do to improve the search experience for our 250,000 members. Along the way it has become clear that one of the bigger problems our members face is the dreaded “0 records returned” message (Figure 1). You just know that the record you are looking for is out there, but you can’t seem to find it when you fill out the search form.

Figure 2

We really do have a very sophisticated, scalable search technology in place, so you are certainly entitled to wonder why this happens. Well, our search engine has been set up to return only records that perfectly match every one of the criteria you specify. So, the more criteria you specify, the less likely it is that you will get the results you are looking for, since all the results returned “must fit.” As genealogists, it would seem intuitive to enter every bit of information you know when you search. However, this works against you if you never get to see the results that match almost all of your criteria.
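
To see why “must fit” matching so easily produces an empty result list, here is a minimal Python sketch. The sample records, the field names, and the `strict_search` function are hypothetical illustrations, not the actual AmericanAncestors.org search engine.

```python
# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"first": "Ebenezer", "last": "Ward", "town": "Horicon", "year": 1850},
    {"first": "Ebenezer", "last": "Ward", "town": "Harricon", "year": 1860},
    {"first": "Mary", "last": "Ward", "town": "Horicon", "year": 1850},
]


def strict_search(records, **criteria):
    """Return only the records that match every criterion exactly."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Two criteria: both Ebenezer Ward records qualify.
two_fields = strict_search(RECORDS, first="Ebenezer", last="Ward")

# Four criteria: each record now fails on at least one field,
# so nothing comes back even though two records are near misses.
four_fields = strict_search(
    RECORDS, first="Ebenezer", last="Ward", town="Horicon", year=1860
)

print(len(two_fields), len(four_fields))
```

In this sketch, adding the town and year turns two matches into none, even though both Ebenezer Ward records agree with three of the four criteria.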

Searching genealogical records is clearly an area where getting good matches would be dramatically better than insisting on perfect ones. For some additional perspective, here are my top four reasons why perfectly good search criteria won’t get the results you expect:

  1. The database does not have one of the fields you entered indexed. If this happens, you get no matches. You can avoid this problem by searching within a specific database and checking the search tips to confirm exactly what fields are available.
  2. Variations in the spelling of names – and even places. Original records often spell the same name or location in several different ways. Our search engine has powerful capabilities to flexibly match name spellings, but it can be defeated by the naming flexibility of our ancestors.
  3. Recording errors in the original data. Often the people writing the original documents make mistakes, which we then index and make available to you. Just last week, I saw census pages recorded for the town of Harricon instead of Horicon, and a death from the turn of the twentieth century where they used the wrong century. In these cases, if you used the correct town or year, you would not get a match.
  4. Indexing errors. Most databases have been indexed by volunteers, who do amazing work making databases available to you. However, with billions of names to search, you have to expect that there will be at least some indexing errors. (When you see one, please send a note to webmaster@nehgs.org – we do fix them!) An indexing error can also result in a record not being returned.
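
The Harricon/Horicon case above shows how a single recording error defeats an exact comparison. As a rough illustration only, the sketch below uses `difflib.SequenceMatcher` from the Python standard library to score how similar two spellings are. The `fuzzy_equal` helper and its 0.75 threshold are assumptions made for this example, and this is not how our search engine matches names.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity score between 0 and 1, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_equal(a, b, threshold=0.75):
    """Treat two spellings as the same when they are similar enough.

    The threshold is an arbitrary value chosen for this illustration.
    """
    return similarity(a, b) >= threshold


exact = "Harricon" == "Horicon"                 # exact comparison fails
close = fuzzy_equal("Harricon", "Horicon")      # similar spellings pass
unrelated = fuzzy_equal("Harricon", "Boston")   # different towns do not

print(exact, close, unrelated)
```

A tolerant comparison like this recovers the misspelled town, but no threshold can anticipate every variation, which is why some records stay hidden from even a flexible search.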

I am sure all of us can come up with a few stories where we eventually found a record that was hidden behind one or more of the problems listed above. And you can overcome some of these issues by starting with very few search parameters, then gradually refining with more criteria. However, we wanted to find a way to make it easier to get better search results and avoid these problems automatically.

Figure 3

So, we have introduced a better approach and have delivered two new search forms – Category Search and Database Search – to make your search experience better. These two new capabilities have many advantages, including:

  • “Best fit” search results – by default, we will return results that match any of your search terms, with the best fit items listed first (labeled in relevance order on the screen). This means you can safely put in the fields you are confident about and you still get results that partially match your request.
  • Focused search fields – only the search fields available for the database or category are presented. This avoids the common problem of using search terms which will yield no results (see Figure 2).
  • Search tips and category descriptions right on the search screen – you can see exactly what fields are available and you don’t need to perform a search or click to another screen to get tips on the available fields. Just be sure to scroll down! (See Figure 3.)
  • Sample page images – One or more sample page images are included for each category and database to provide you a better perspective of what you will see in your search results.
  • Find related databases – if you want to see what other databases are in the same category, you can click on the button at the top right and you will see a list of the relevant databases.
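
The “best fit” idea in the first bullet can be sketched in a few lines of Python. This is a simplified illustration that assumes relevance is just the number of matching fields. The records, field names, and `best_fit_search` function are hypothetical, and the real relevance ordering may well weigh fields differently.

```python
# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"first": "Ebenezer", "last": "Ward", "town": "Horicon", "year": 1850},
    {"first": "Ebenezer", "last": "Ward", "town": "Harricon", "year": 1860},
    {"first": "Mary", "last": "Ward", "town": "Horicon", "year": 1850},
    {"first": "John", "last": "Smith", "town": "Boston", "year": 1790},
]


def best_fit_search(records, **criteria):
    """Return (score, record) pairs for records matching any criterion.

    The score is the number of criteria the record matches, and the
    pairs are ordered with the highest score first.
    """
    scored = []
    for record in records:
        score = sum(
            1 for field, value in criteria.items() if record.get(field) == value
        )
        if score > 0:
            scored.append((score, record))
    # list.sort is stable, so records with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


results = best_fit_search(
    RECORDS, first="Ebenezer", last="Ward", town="Horicon", year=1860
)
for score, record in results:
    print(score, record["first"], record["last"], record["town"], record["year"])
```

With the same four criteria that would return nothing under all-or-nothing matching, this sketch lists both Ebenezer Ward records first, followed by the weaker Mary Ward match, and leaves out the record that matches nothing.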

We hope you’ll give these new search experiences a try. You can find the links to Category Search just below the current search form. Here is an example. To use the new Database Search, just click on any magnifying glass icon on the Databases List A to Z page.

You will find some additional information in the Database News posts, where we announce all enhancements to search: New Database Search Option and New Category Search Option. If you want to stay current, please subscribe at https://dbnews.americanancestors.org/.

About Don LeClair

Don is the Associate Director, Database Search & Systems, at NEHGS. He first got involved with genealogy while in college and spent many a day in the NEHGS library tracing his ancestors through New England and New York. Don also did volunteer indexing work for the library before joining the staff in 2016. Previously, Don had a 30-year career in the software industry working in and leading engineering and product management teams focused on IT Management products. Don has a B.A. and M.B.A. from Boston University.

9 thoughts on “‘The perfect is the enemy of the good’”

  1. I wondered why I was getting off the wall returns or the opposite, 0 returns, just by entering a surname. I also didn’t find any documents for New Jersey, Jersey, “New Jersey”, or “Jersey”. I gave up and went elsewhere. I used to get several pages for my Ebenezer Ward.

    1. Hi Toni,
      Sorry to hear you were having problems searching for Ebenezer Ward. We do have a variety of databases that include New Jersey, and certainly there are Ward and Ebenezer Ward records. But I didn’t see anything with all of the above. Please feel free to send me an email (don.leclair@nehgs.org) if you think some records that used to be there are missing.

  2. I usually try the least restrictive search. Obviously this often results in too many responses. In the past if I would limit by location or date all journals would be limited from the results. Will the changes change this?
    Thank you for your ongoing efforts to improve our searching experience.

    1. There is nothing wrong with starting with a few criteria and adding more. With the changes described above, you would have the benefit of still seeing results that match most of your criteria really well, even if they don’t match one of them.

  3. Some of us may need at least one more option for the search fields: a “fuzzy” search box which would allow us to check off that one field (or more than one field) that might have as many as 30 possible spelling variations. (The Soundex option with which I’m familiar usually doesn’t allow for variable, missing or mistranscribed initial letters in the name fields.) A Boolean search option (operators: AND, OR, NOT, AND NOT) can also be very helpful.

    1. The default search mode will allow for spelling variations in names. It actually seems to be a bit more flexible than Soundex, and we also provide the Soundex options. You can also use wildcard characters. However, even with all of those, there are still cases where you would need to use multiple searches to cover all the possibilities. Boolean options would be something else for us to evaluate for the future.
      Thank you for the suggestion.

  4. In the “best fit” returns, I’ve found that if I keep paging down, I very often find unexpected records that relate to my search. Generally this is because of those darn misspellings in the original document, an erroneous date, wrong town, or somebody just couldn’t read the writing and indexed it wrong, or transcribed it wrong. Some really weird stuff I’d never find with other search methodologies. The human brain can pick up things the computer misses even with “fuzzy” searches. This is maybe just a step beyond random, but it has worked for me often enough it is worth the try. Then I can use other information in the newly discovered record to track down more information, or at least have a clue where to look.
