When we were deciding how our AmericanAncestors.org database search would work, one of the key considerations was that we didn’t want to return search results that contained a lot of ‘noise.’ On other websites, the database architects allowed for a certain (sometimes significant) number of irrelevant search results. This was undoubtedly intended to be helpful, but it is actually quite frustrating. So we decided to do ‘exact’ searches with a couple of twists. The goal was to give results that were exactly what you searched for. We spent quite a lot of time tuning our search algorithm, trying different approaches and analyzing the results. We’re pretty happy with our final approach, but it’s definitely helpful to understand how it works and what the twists are.
I said that our searches are ‘exact,’ but strictly speaking the ‘exact’ portion of the search applies only to the surname, year range, record type, location, and any specific database or database type specified.
Twist #1 is that first names (which can include middle names and initials as well as maiden names) are searched with an ‘any match’ algorithm. So you will get a search result ‘hit’ by searching for given name ‘jon’ where ‘jon’ makes up any separate part of the given/middle/maiden name. For instance, searching our American Canadian Genealogical Society Index of Baptisms, Marriages, and Burials, 1840-2000, with ‘jon’ as a first name (and no other search fields filled in) will return hits for not only ‘Jon’ but also ‘Michael Jon,’ ‘Jon Alfred,’ ‘Jon Robert Jr,’ etc. Why did we do this? Well, often, a middle name or initial or suffix is not known by the searcher, and we feel that requiring an exact match in this case is too limiting. Any of those ‘Jons’ may be the one you’re looking for.
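For readers who like to see the idea spelled out, here is a minimal Python sketch of an ‘any match’ test on given names. It is only an illustration, not our actual search code: the function name given_name_matches, the sample list, and the assumption that name parts are separated by spaces are all made up for this example.

```python
def given_name_matches(query: str, given_names: str) -> bool:
    """Return True if the query equals any separate part of the given names.

    Assumption for this sketch: parts are separated by whitespace, and
    the comparison ignores upper/lower case.
    """
    parts = given_names.lower().split()
    return query.lower() in parts


# Hypothetical sample of indexed given-name fields.
records = ["Jon", "Michael Jon", "Jon Alfred", "Jon Robert Jr", "Jonathan", "John"]

hits = [name for name in records if given_name_matches("jon", name)]
print(hits)  # ['Jon', 'Michael Jon', 'Jon Alfred', 'Jon Robert Jr']
```

In this sketch ‘Jonathan’ is not a hit, because ‘jon’ is only the beginning of that name rather than a separate part of it.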
But what if you don’t want to do an exact search for last names? It is certainly the case that some names have common spelling variations. And, in many cases, last names appear with the ‘as written’ spelling, often phonetically, as written by a town clerk or census enumerator, who often just wrote ‘what they heard.’ Sometimes when these phonetic surnames are indexed, we add spelling variations to make it possible to find the surname without having to resort to Twist #2. In our original database of Yarmouth, Massachusetts Vital Records to 1850,* we indexed last names as written. Those last names often contained many spelling variations, with ‘Eldridge’ being a prime example. In the Yarmouth records, ‘Eldridge’ was occasionally spelled as such and a search for ‘eldridge’ in the original Yarmouth to 1850 VR database results in 248 hits for that spelling. However, the surname variations of ‘Eldredg,’ ‘Eldred,’ ‘Eldreg,’ ‘Eldredge’ and others appear in those Yarmouth records.
Twist #2 involves the use of ‘wildcards.’ A search wildcard is a special character that represents any single character (a question mark, or ‘?’) or a sequence of any characters (an asterisk, or ‘*’). In the Eldridge example, with the original Yarmouth Vital Records to 1850 database, if you search for ‘eld*’ as a last name, you’ll get 540 hits. The asterisk allowed all the various spellings of the name that started with ‘eld’ to be found.*
My own last name ‘Sturgis’ is frequently spelled as ‘Sturges.’ Some branches of the family traditionally used the ‘e’ spelling, but mine usually used the ‘i’ spelling. And sometimes the same individual’s name was spelled with both variations. When I’m researching my own family, I always use the ‘?’ wildcard as part of the last name: ‘sturg?s’. This gets me both spelling variations.
The ‘?’ and ‘*’ wildcards can be used in any of the text fields, including First Name, Last Name, County/City/Town, and keyword. If you haven’t tried using wildcards in searches, now’s the time to try them!
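If it helps to think of the two wildcards in programming terms, the following Python sketch translates a wildcard pattern into a regular expression. It is a rough illustration under my own assumptions (the helper name wildcard_matches, the sample surnames, and case-insensitive matching of the whole name), not a description of how our search engine is built.

```python
import re


def wildcard_matches(pattern: str, name: str) -> bool:
    """Test a name against a pattern using '?' and '*' wildcards.

    '?' stands for exactly one character; '*' stands for any run of
    characters, including none. The whole name must match, ignoring case.
    """
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lower()
    )
    return re.fullmatch(regex, name.lower()) is not None


# Hypothetical sample of indexed surnames.
surnames = ["Eldridge", "Eldredg", "Eldred", "Eldreg", "Eldredge", "Sturgis", "Sturges"]

eld_hits = [s for s in surnames if wildcard_matches("eld*", s)]
sturgis_hits = [s for s in surnames if wildcard_matches("sturg?s", s)]
print(eld_hits)      # the five 'Eld...' spellings
print(sturgis_hits)  # ['Sturgis', 'Sturges']
```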
*Our original Yarmouth, Massachusetts Vital Records to 1850 database has been revised with the addition of first names and last name spelling variations and added to our Massachusetts Vital Records to 1850 database. A large number of previously ‘stand-alone’ databases have been given the same treatment.
Thanks for your article. What wildcard would you use for the name Browne, Brown so you get the name with or without the final ‘e’?
Wildcards at the end of a name are a bit of a problem. If you search for ‘brown*’, you’ll get every name that begins with ‘brown’, including ‘browne’, but you’ll also get ‘brownell’, ‘browning’, etc. And, if you search for ‘brown?’, you won’t get any ‘brown’ results, since the question mark requires a character in its position. Another possibility is to search for last name ‘brown or browne’. (Put an ‘or’ between the names in the last name field.) This works pretty well, but you may find that you sometimes get results where ‘brown’ or ‘browne’ is the spouse in the record.
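To make the difference between these three searches concrete, here is a small Python sketch that compares them on a handful of surnames. The helper names, the sample list, and the way ‘or’ is split out of the query are assumptions for illustration only; the real search may treat the field differently (for example, it can also match a spouse’s surname, which this sketch ignores).

```python
import re


def wildcard_matches(pattern: str, name: str) -> bool:
    """'?' is exactly one character; '*' is any run of characters, including none."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lower()
    )
    return re.fullmatch(regex, name.lower()) is not None


def surname_query_matches(query: str, surname: str) -> bool:
    """Treat ' or ' in the query as separating alternatives (an assumption)."""
    alternatives = re.split(r"\s+or\s+", query.strip().lower())
    return any(wildcard_matches(alt, surname) for alt in alternatives)


# Hypothetical sample of indexed surnames.
names = ["Brown", "Browne", "Brownell", "Browning"]

results = {
    query: [n for n in names if surname_query_matches(query, n)]
    for query in ["brown*", "brown?", "brown or browne"]
}
for query, found in results.items():
    print(query, "->", found)
# brown* -> ['Brown', 'Browne', 'Brownell', 'Browning']
# brown? -> ['Browne']
# brown or browne -> ['Brown', 'Browne']
```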
Again, how would you deal with the multiple ways of spelling Katharine, Katherine, Catherine, Cathryn, Catharine? Would you run it twice with a Kath* and Cath*?
You’ve got it! You could also search for ‘kath* or cath*’ in the first name field to get all the name variations.
I have a two word last name. Some spell with a space, some not, Van Brunt, VanBrunt, or Vanbrunt. If I use a ? between would I get all the spellings in one search? How are spaces/capital letters handled in a search? I always do two (or more) searches, but if I could do one it would be easier. Also any other tricks for two word last name searches? Thank you, I love reading Vita Brevis very much.
The method recommended in the previous replies works in this case, too. If you enter ‘van brunt or vanbrunt’ in the last name field you’ll get hits on both names.
One of my last names is vanHaagen or Van Haagen or van Haagen. I’d love to have a trick for the two word last name searches, too! Thank you!
This was very helpful. I never learned before how to use the “?” wildcard in a search, and actually I’ll try it with your exact example, my own “sturg?s” ancestors, for a start. (They, too, started in Barnstable and then moved to Gorham, Maine.) To clarify: asterisk substitutes for more than one letter; and question mark for when one and only one letter may vary?
Would an asterisk substitute for a space between a last name such as Van Haagen? Thank you! Karen Campbell
The asterisk represents any number of characters, including zero. Searching for ‘sturgis*’ will return all the ‘sturgis’ names plus any that have characters after the ‘sturgis’. (Possibly ‘sturgiss’?) The question mark stands for one letter each time it’s used. So, ‘sturg??’ will return all the ‘sturgis’ and ‘sturges’ records. Both wildcards can be used more than once, at the same time, although it’s not an obviously useful capability.