gazetteers

Diamond Bay Research presents a brief survey of online gazetteer resources:

GeoNames

GeoNames is the most Open gazetteer, in the sense that the entire database can be downloaded if you want it, and the API is meant to accept public requests. GeoNames is the invention of Marc Wick, a reclusive genius, who developed a way to harvest the major national gazetteers into a global framework, categorize them with a basic feature classification, and fit them into a reasonable hierarchy.

In 2012, requests to the GeoNames API were running at up to 25 million per day, and half of those came from smartphones. The data is Open, in that you are free to use it in applications or store it in a database. Free use is limited to 3000 requests per day, and there is a licensed version for heavy use. What makes GeoNames “free”? Interestingly: commercial sponsorship, primarily from the casino industry.

GeoNames maintains a stable URI for each GeoNames ID, which has given it a ubiquitous presence in the Linked Open Data web. We give GeoNames our highest recommendation.
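As a rough illustration of a GeoNames lookup, here is a minimal Python sketch. It assumes the public `searchJSON` endpoint and a registered account name (the username shown is a placeholder), and it parses a small hand-written sample rather than a live response. The helper names `build_search_url`, `parse_results`, and `geonames_uri` are our own and not part of any GeoNames library.

```python
import json
from urllib.parse import urlencode

SEARCH_ENDPOINT = "http://api.geonames.org/searchJSON"


def build_search_url(name, username, max_rows=5):
    """Return a search URL for a placename (username is your own account)."""
    params = {"q": name, "maxRows": max_rows, "username": username}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def geonames_uri(geoname_id):
    """Return the stable Linked Data URI for a GeoNames ID."""
    return "https://sws.geonames.org/{}/".format(geoname_id)


def parse_results(payload):
    """Pull ID, name, coordinates, and feature code out of a search response."""
    records = []
    for item in json.loads(payload).get("geonames", []):
        records.append({
            "id": item["geonameId"],
            "name": item["name"],
            "lat": float(item["lat"]),   # coordinates arrive as strings
            "lng": float(item["lng"]),
            "feature": item.get("fcode"),
            "uri": geonames_uri(item["geonameId"]),
        })
    return records


# Hand-written sample in the shape of a search response (not live data).
SAMPLE = (
    '{"geonames": [{"geonameId": 2950159, "name": "Berlin", '
    '"lat": "52.52437", "lng": "13.41053", "fcode": "PPLC"}]}'
)

if __name__ == "__main__":
    print(build_search_url("Berlin", "your_username"))
    print(parse_results(SAMPLE))
```

Keeping the URL builder separate from the parser means the parsing can be checked offline, which helps when you are working within the daily request limit.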

Family Search

The LDS effort to create genealogies is based on the Mormon belief that baptism can be performed on the dead. In practical terms, this means that a vast global effort to trace back family histories, and to accurately record information about people who lived in the past, has produced a treasure trove of biographical and geographical authority files. Apparently these records (some 2 million rolls of microfilm containing 2 billion names) are locked away behind 14-ton doors in the Granite Mountain Records Vault near Salt Lake City.

Since the 1990s, the LDS research team has been developing great digital resources to support their genealogies. These can be incredibly helpful. For example, I was able to confirm about half of the missing names in the Japanese Meiji Towns datasets for the G. W. Skinner Japan-T Dataset, using the beta version of their Developers API.

An excellent resource, worth investigating. See the Place Authority.

Nominatim

This is essentially the search engine for all named features in OpenStreetMap. It is also Open and free to download, but only in the form of the OSM database, which is very large. Nominatim draws on a crowd-sourced database, rather than one assembled from national gazetteers, so it does not have the same consistency of admin hierarchy or classifications as GeoNames. But it is open and extensive.
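For comparison with GeoNames, here is a minimal Python sketch of a Nominatim search against the public `nominatim.openstreetmap.org` endpoint. It only builds the request and parses a hand-written sample, so nothing is sent over the network. The sample record and the helper names `build_search_request` and `summarize` are illustrative assumptions. Note that the classification comes back as a free-form OSM tag pair rather than a fixed feature code.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_ENDPOINT = "https://nominatim.openstreetmap.org/search"


def build_search_request(query, user_agent, limit=3):
    """Build (but do not send) a search request with an identifying User-Agent."""
    params = urlencode({"q": query, "format": "jsonv2", "limit": limit})
    return Request(SEARCH_ENDPOINT + "?" + params,
                   headers={"User-Agent": user_agent})


def summarize(payload):
    """Reduce a response to name, coordinates, and the OSM tag pair."""
    return [
        {
            "name": hit["display_name"],
            "lat": float(hit["lat"]),
            "lon": float(hit["lon"]),
            # Crowd-sourced classification: an OSM key=value pair.
            "tag": "{}={}".format(hit.get("category"), hit.get("type")),
        }
        for hit in json.loads(payload)
    ]


# Hand-written sample in the shape of a jsonv2 response (not live data).
SAMPLE = (
    '[{"display_name": "Example Town, Example Country", '
    '"lat": "10.5", "lon": "-20.25", "category": "place", "type": "town"}]'
)

if __name__ == "__main__":
    req = build_search_request("Example Town", "my-gazetteer-test/0.1")
    print(req.full_url)
    print(summarize(SAMPLE))
```

The public server is a shared resource with its own usage policy, so for bulk lookups it is probably better to run your own instance from the OSM download.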

Google Places API

The elephant in the room. Google now posts usage figures for the Places API on its website, citing more than 1 billion monthly active users. When we first wrote this note (in 2012), the usage figures were secret. Google also claims to have 150 million “places” in its database.

The Google Places API is very useful for looking up placenames or reverse geocoding, and the JSON output is easily parsed, similar to GeoNames. Note that Google has strict licensing terms, though it is loose about enforcing them. The fact is that Google “prohibits” storing the information provided by the API in a database, and only “allows” its use directly on a live Google Map. In reality, people are storing this info all over the place. But it is a gray area, and it violates the terms of service.

Owing to frequent changes to the API methods, and the continuous alterations in terms of use and fees, proceed with caution.

OpenPOIs

This was the project of Raj Singh while he was at OGC. The OpenPOI API attempted to create an open, extensible database of all points of interest by harvesting the features from OSM and GeoNames, as well as names from FourSquare check-ins and other sources, into a single database and API. Interesting, but Raj moved on to Cloudant, so I think this project is dormant.

U.S. Board on Geographic Names

The Board on Geographic Names is the U.S. federal body responsible for all official toponyms used by the US government. It has separate divisions for Domestic Names and Foreign Names. It started to release vernacular scripts in the mid-2000s. This is a great resource, but it only exists as .csv files for download. However, if the use case is for specific countries, you could automate the ingest of new versions from the GNS names server as they are posted. These names also get harvested into GeoNames.
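One way to automate that ingest is to compare each new download against the previous one and apply only the differences. The Python sketch below shows the idea on two tiny inline samples. The column names (`id`, `name`, `country`) are hypothetical and will need to be mapped to the real headers and delimiter of the file you download, and `load_names` and `diff_versions` are our own helper names.

```python
import csv
import io


def load_names(csv_text, key="id"):
    """Index the rows of one download by a unique feature ID column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_versions(old, new):
    """Compare two indexed downloads and list added, removed, and changed IDs."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical extracts from two successive versions of a country file.
OLD = "id,name,country\n1,Alpha,XX\n2,Beta,XX\n3,Gamma,XX\n"
NEW = "id,name,country\n1,Alpha,XX\n2,Beta Town,XX\n4,Delta,XX\n"

if __name__ == "__main__":
    print(diff_versions(load_names(OLD), load_names(NEW)))
```

Keying on a stable feature ID rather than on the name itself matters here, because the names are exactly what tends to change between versions.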

Pleiades

The main historical gazetteer project, with open RESTful URIs for all historical places. It focuses on “Classical Western” geography: the Mediterranean, Europe, the Middle East, and Northern Africa. I have made an effort to bring the Digital Ottoman Project members into the Pleiades network of contributors, so it is expanding to Anatolia. The time periods are wonky. The platform is based on Zope, and is slow and a little strange. Their metadata schema is elaborate and repetitive. But it works!

PastPlace

Humphrey Southall’s project to pull all the historical placenames from the UK and a 19th-century world atlas (as well as GeoNames and Wikidata historical places) into a database and API. I find it to be cluttered with multilingual entries and not useful for time-based searches. And their API is flaky.

Temporal Gazetteer

The only Open API for historical placenames that actually works for requests on multiple facets, including Names, Years, Feature Types, and Parent Jurisdictions, in any combination. The speed is optimized using materialized views in a MariaDB (MySQL-compatible) database. The main drawback: it only covers China, Tibet, and late 18th-century Russia. But it does work, and it handles vernacular scripts in UTF-8 encoding (such as Cyrillic, Traditional Chinese, and Tibetan), and the codebase is freely available on GitHub.
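To make the "any combination of facets" idea concrete, here is a small Python sketch of how such a request could be assembled. The base URL is a placeholder, and the parameter names (`n`, `yr`, `ftyp`, `p`, `fmt`) are assumptions made for illustration, so check the project's own documentation for the actual endpoint and parameters. The point is that unset facets are simply left out, and vernacular names are percent-encoded as UTF-8.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; substitute the real API address.
BASE_URL = "https://example.org/placename"

# Hypothetical mapping from facet names to query parameters.
FACET_PARAMS = {"name": "n", "year": "yr", "feature_type": "ftyp", "parent": "p"}


def build_faceted_url(**facets):
    """Build a query URL from any combination of the supported facets."""
    params = {}
    for facet, value in facets.items():
        if facet not in FACET_PARAMS:
            raise ValueError("unknown facet: " + facet)
        if value is not None:          # unset facets are omitted entirely
            params[FACET_PARAMS[facet]] = value
    params["fmt"] = "json"
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return BASE_URL + "?" + urlencode(params)


def decode_query(url):
    """Decode a URL's query string back into parameter lists (for checking)."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    url = build_faceted_url(name="北京", year=1820)
    print(url)
    print(decode_query(url))
```

Round-tripping the URL through `decode_query` is a quick way to confirm that Chinese, Cyrillic, or Tibetan names survive the encoding before any request is sent.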

World Historical Gazetteer

Rising from the ashes of earlier projects at the University of Pittsburgh, the WHG has great potential to establish a useful global historical API. Karl Grossner is the technical lead (meaning things will work as described on the tin!), and there are fruitful synergies with the Pelagios and Recogito projects. Another plus is the effort going into Linked Traces, which will establish a protocol for instantiating datasets of historical travels, routes, and itineraries, a capability sorely missing from historical geospatial applications. Check this one out!