Choosing a geocoding provider
Monday, December 16th, 2024
Yesterday, when I mentioned my paranoia about third-party dependencies on The Session, I said:
I’ve built in the option to switch between multiple geocoding providers. When one of them inevitably starts enshittifying their service, I can quickly move on to another. It’s like having a “go bag” for geocoding.
(Geocoding, by the way, is when you provide a human-readable address and get back latitude and longitude coordinates.)
My paranoia is well-founded. I’ve been using Google’s geocoding API, which is changing its pricing model from next March.
You wouldn’t know it from the breathlessly excited emails they’ve been sending about it, but this is not a good change for me. I don’t do that much geocoding on The Session—around 13,000 or 14,000 requests a month. With the new pricing model that’ll be around $15 to $20 a month. Currently I slip by under the radar with the free tier.
So it might be time for me to flip that switch in my code. But which geocoding provider should I use?
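As a rough illustration of what that kind of switch can look like (this is not the actual code running on The Session), here’s a minimal Python sketch. The provider names and the `geocode` function are hypothetical, and each provider is a stand-in that returns fixed coordinates rather than calling a real API.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coordinates]]


def provider_a(query: str) -> Optional[Coordinates]:
    # Stand-in: a real version would make an HTTP request to this provider.
    return (0.0, 0.0)


def provider_b(query: str) -> Optional[Coordinates]:
    # Stand-in for a second, interchangeable provider.
    return (1.0, 1.0)


# Every provider sits behind the same function signature,
# so the rest of the site never needs to know which one is in use.
PROVIDERS: Dict[str, Geocoder] = {
    "provider_a": provider_a,
    "provider_b": provider_b,
}

# The "switch": change this one value to move to another provider.
ACTIVE_PROVIDER = "provider_a"


def geocode(query: str, provider: Optional[str] = None) -> Optional[Coordinates]:
    """Look up coordinates using the named provider, or the active one."""
    name = provider or ACTIVE_PROVIDER
    return PROVIDERS[name](query)
```

Because the calling code only ever uses `geocode()`, swapping providers is a one-line change to `ACTIVE_PROVIDER`.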
There are plenty of slop-like listicles out there enumerating the various providers, but they’re mostly just regurgitating the marketing blurbs from the provider websites. What I need is more like a test kitchen.
Here’s what I did…
I took a representative sample of six recent additions to the sessions section of thesession.org. These examples represent places in the USA, Ireland, England, Scotland, Northern Ireland, and Spain, so a reasonable spread.
For each one of those sessions, I’m taking:
- the venue name,
- the town name,
- the area name, and
- the country.
I’m deliberately not including the street address. Quite often people don’t bother including this information so I want to see how well the geocoding APIs cope without it.
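To make that concrete, here’s a small Python sketch of how those four fields might be combined into a single query string. The comma-separated format, the `build_query` name, and the venue in the example are all assumptions for illustration; different providers may prefer structured parameters instead.

```python
from typing import Optional


def build_query(venue: Optional[str], town: Optional[str],
                area: Optional[str], country: Optional[str]) -> str:
    """Join the available fields into one comma-separated query.

    Missing or blank fields are skipped, and no street address is included.
    """
    parts = [venue, town, area, country]
    return ", ".join(p.strip() for p in parts if p and p.strip())


# Hypothetical venue, used only to show the shape of the output.
query = build_query("The Example Arms", "Brighton", "East Sussex", "England")
print(query)  # The Example Arms, Brighton, East Sussex, England
```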
I’ve scored the results on a simple scale of good, so-so, and just plain wrong.
- A good result gets a score of one. This is when the result gives back an accurate street-level result.
- A so-so result gets a score of zero. This is when it’s got the right coordinates for the town, but no more than that.
- A wrong result gets a score of minus one. This is when the result is like something from a large language model: very confident but untethered from reality, like claiming the address is in a completely different country. Being wrong is worse than being vague, hence the difference in scoring.
Then I tot up those results for an overall score for each provider.
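The totting up is simple enough to sketch in a few lines of Python. The `SCORES` mapping and `total_score` function are names I’ve made up for illustration; the two sets of grades mirror the Mapbox and Map Maker rows in the results table.

```python
from typing import Iterable

# The three-point scale: good, so-so, and just plain wrong.
SCORES = {"good": 1, "so-so": 0, "wrong": -1}


def total_score(grades: Iterable[str]) -> int:
    """Add up the score for each graded result."""
    return sum(SCORES[grade] for grade in grades)


mapbox = ["good", "good", "so-so", "good", "good", "wrong"]
map_maker = ["wrong", "so-so", "wrong", "wrong", "wrong", "wrong"]

print(total_score(mapbox))     # 3
print(total_score(map_maker))  # -5
```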
When I tried my six examples with twelve different geocoding providers, these were the results:
Provider | USA | England | Ireland | Spain | Scotland | Northern Ireland | Total |
---|---|---|---|---|---|---|---|
Google | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
Mapquest | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
Geoapify | 0 | 1 | 1 | 0 | 1 | 0 | 3 |
Here | 1 | 1 | 0 | 1 | 0 | 0 | 3 |
Mapbox | 1 | 1 | 0 | 1 | 1 | -1 | 3 |
Bing | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
Nominatim | 0 | 0 | 0 | 0 | -1 | 1 | 0 |
OpenCage | -1 | 1 | 0 | 0 | 0 | -1 | -1 |
Tom Tom | -1 | -1 | 0 | 0 | -1 | 1 | -2 |
Positionstack | 0 | -1 | 0 | -1 | 1 | -1 | -2 |
Locationiq | -1 | 0 | -1 | 0 | 0 | -1 | -3 |
Map Maker | -1 | 0 | -1 | -1 | -1 | -1 | -5 |
Some interesting results there. I was surprised by how crap Bing is. I was also expecting better results from Mapbox.
Most interesting for me, Mapquest is right up there with Google.
So now that I’ve got a good scoring system, my next question is around pricing. If Google and Mapquest are roughly comparable in terms of accuracy, how would the pricing work out for each of them?
Let’s say I make 15,000 API requests a month. Under Google’s new pricing plan, that works out at $25. Not bad.
But if I’ve understood Mapquest’s pricing correctly, I reckon I’ll just squeak in under the free tier.
Looks like I’m flipping the switch to Mapquest.
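For anyone who wants to check the arithmetic behind the Google figures, here’s a small Python sketch. The free allowance of 10,000 requests a month and the rate of $5 per 1,000 requests after that are assumptions inferred from the numbers in this post, not taken from an official price list, so check the provider’s current pricing before relying on them.

```python
def monthly_cost(requests: int,
                 free_requests: int = 10_000,       # assumed free allowance
                 price_per_thousand: float = 5.0    # assumed rate in dollars
                 ) -> float:
    """Estimate a monthly bill, charging only for requests above the free tier."""
    billable = max(0, requests - free_requests)
    return billable / 1000 * price_per_thousand


for requests in (13_000, 14_000, 15_000):
    print(requests, monthly_cost(requests))
# 13000 15.0
# 14000 20.0
# 15000 25.0
```

Under those assumptions, anything up to 10,000 requests a month costs nothing, which is why a provider with a larger free allowance can work out cheaper even if its per-request rate is higher.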
If you’re shopping around for geocoding providers, I hope this is useful to you. But I don’t think you should just look at my results; they’re very specific to my needs. Come up with your own representative sample of tests and try putting the providers through their paces with your data.
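If you already know the correct coordinates for your sample, you could automate the grading instead of eyeballing each result. Here’s a minimal Python sketch that does that using the haversine great-circle distance. To be clear, this is a swapped-in technique, not how the scores above were produced; the thresholds (200 metres for good, 10 kilometres for so-so) and the function names are assumptions you’d want to tune for your own data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def grade(expected: tuple, actual: tuple,
          good_km: float = 0.2, so_so_km: float = 10.0) -> int:
    """Return 1 (good), 0 (so-so), or -1 (wrong) based on distance.

    The thresholds are illustrative assumptions, not established standards.
    """
    distance = haversine_km(expected[0], expected[1], actual[0], actual[1])
    if distance <= good_km:
        return 1
    if distance <= so_so_km:
        return 0
    return -1


# Illustrative coordinates only: a known location and three provider answers.
known = (53.3498, -6.2603)
print(grade(known, (53.3498, -6.2603)))  # 1: same spot
print(grade(known, (53.3598, -6.2603)))  # 0: about a kilometre away
print(grade(known, (54.3498, -6.2603)))  # -1: over a hundred kilometres away
```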
If, for some reason, you want to see the terrible PHP code I’m using for geocoding on The Session, here it is.