So Unlock Places is a service which searches across many different sources of data, mostly
open geographic data, and provides the user with points or with detailed shapes, footprints,
bounding boxes, representing the places that they're searching for. Unlock also provides
the Unlock Text service, which is a geoparsing service that extracts location information
from documents and leaves them geotagged with the most likely locations. We developed
this software in partnership with the Language Technology Group at the School of Informatics
in Edinburgh, quite a serious dedicated research group of computational linguists.
The Language Technology Group weren't particularly specialists in geographic information, so
Edina came to them and helped develop different algorithms for being able to pick the best
known locations, and that's where we wanted to start adding more data sources to Unlock
Places to provide better coverage, particularly worldwide coverage, so that we could
enable the service to be used by researchers around the world, not just in the UK.
So Unlock began based on Ordnance Survey MasterMap data. Edina provided this, licensed for educational
use only, and a lot of work was done extracting more semantic detail from MasterMap than the
Ordnance Survey products necessarily provide, by looking at names to figure out
feature types and returning more information to the user. So it provides a very rich search
of shapes of rivers, detailed outlines of towns and so on. But, of course, what it was
missing was the ability for the data to be reused in other applications, to take the data
and republish it as part of an academic publication and so on. So, soon after I started at Edina,
we started looking at different open data sources to add to the service, the first of
which is GeoNames, which is a worldwide public domain data set of many millions of points.
And very soon after its launch we also incorporated Ordnance Survey Open Data; we were
one of the first to reuse that data in an application. Ordnance Survey
Open Data for the UK provides us with detailed boundaries for political areas. It provides
post code lookup and geocoding. So it's great to be able to offer that without registration
to anybody, and so really turn Unlock into a more free and open service. And quite recently
we added to the service Natural Earth, which is another public domain project that provides
quite detailed political boundaries worldwide.
So Edina's collaboration with the Language Technology Group began as a series of projects,
mostly looking at historic archived text materials: taking 19th-century parliamentary
reports and population reports, digitizing them, and then extracting and georeferencing
the content, sort of geo-enabling archival collections.
LTG discovered that the quality of the extraction of place names from a document is much higher
the more place names you can successfully identify. And in a lot of the historical cases,
place names were being missed because they weren't in the contemporary gazetteer; the
names have changed. So as a result of this, we started looking at ways to augment
contemporary gazetteer data sources with deeper, historically rich data.
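
To make the point about missed historic names concrete, here is a minimal Python sketch of a gazetteer lookup that falls back to a table of historic variants. It is an illustration only, not Unlock's implementation: the two dictionaries, the function names and the rounded coordinates are all assumptions made up for this example.

```python
from typing import Iterable, Optional, Tuple

Coord = Tuple[float, float]

# Hypothetical contemporary gazetteer: lower-cased name -> (lat, lon).
# Coordinates are rough illustrative values, not authoritative data.
CONTEMPORARY = {
    "york": (53.96, -1.08),
    "edinburgh": (55.95, -3.19),
}

# Hypothetical table mapping historic name variants to contemporary names.
HISTORIC_VARIANTS = {
    "eboracum": "york",
    "jorvik": "york",
}


def resolve(name: str, use_historic: bool = True) -> Optional[Coord]:
    """Return coordinates for a place name, or None if it cannot be resolved."""
    key = name.strip().lower()
    if key in CONTEMPORARY:
        return CONTEMPORARY[key]
    if use_historic and key in HISTORIC_VARIANTS:
        # Fall back to the contemporary entry the historic name points at.
        return CONTEMPORARY[HISTORIC_VARIANTS[key]]
    return None


def resolved_fraction(names: Iterable[str], use_historic: bool) -> float:
    """Fraction of the given names that the gazetteer can resolve."""
    names = list(names)
    hits = sum(1 for n in names if resolve(n, use_historic) is not None)
    return hits / len(names) if names else 0.0
```

With names taken from an old document, such as `["York", "Jorvik", "Eboracum", "Edinburgh"]`, the contemporary table alone resolves half of them; adding the historic variants resolves all four.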
So it's being used in various different ways. It's being used by a map search and ranking
service that various national libraries use to publish their map collections. And it's
also being used by several web services for post code geocoding, such as
being able to search by post code and get the approximate location, or to geocode a
collection of records based on their post codes. Obviously, the further back in history, the
more problematic that becomes. But post code geocoding will get you a rough level of
precision. So how else is it used? We've used Unlock within the CHALICE project,
a historic text mining project. We used it to align some of the historic names
in the English Place-Name Survey with contemporary sources, with GeoNames and also Ordnance Survey
Open Data.
Another project that's used Unlock has taken transcripts of interviews covering a certain
geographic area and used Unlock Text to pick out likely locations of places within the
transcripts and geolocate the centres of activity. That same process is being gone through with
parliamentary transcripts, again to show a focus of where the parliamentary discussion
is happening. The Unlock Text service is also being plugged into institutional repositories,
so when a new learning resource or a publication goes into a repository, Unlock Text is
experimentally used to pick out the locations and provide more metadata for the users,
and make the content more searchable and more linkable across geography, and provide the
sort of missing links between archival collections.
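
As a rough illustration of what "geolocating the centres of activity" could look like once the place mentions in a transcript have been geotagged, the sketch below counts mentions and averages their coordinates. This is an assumed, deliberately naive approach (a plain mean of latitude and longitude, which is only reasonable over a small area) and not a description of how the projects mentioned actually did it; the function name and input shape are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# One geotagged mention: (place name, latitude, longitude).
Mention = Tuple[str, float, float]


def centre_of_activity(mentions: List[Mention]) -> Tuple[str, int, Tuple[float, float]]:
    """Return the most-mentioned place, its count, and the mean coordinate.

    The mean is weighted by mentions simply because repeated mentions
    appear repeatedly in the input list.
    """
    if not mentions:
        raise ValueError("no geotagged mentions to summarise")

    counts = Counter(name for name, _, _ in mentions)
    top_place, top_count = counts.most_common(1)[0]

    mean_lat = sum(lat for _, lat, _ in mentions) / len(mentions)
    mean_lon = sum(lon for _, _, lon in mentions) / len(mentions)
    return top_place, top_count, (mean_lat, mean_lon)
```

Running this per transcript, or per parliamentary session, gives one summary point and one dominant place name for each, which is enough to plot a simple map of where the discussion is focused.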