The NLmaps corpus consists of 2,380 questions about geographical facts that can be answered with the OpenStreetMap (OSM) database (available under the Open Database Licence).
The questions are in English and German, and each has a corresponding Machine Readable Language (MRL) formula via which gold answers can be extracted using a query tool that is based on the Overpass API (available under the Affero GPL v3).
The main purpose of this corpus is to build a semantic parser and to provide an interface to OSM that lets a user ask a question in natural language. The question is parsed into a database query that can be executed against a web-based filtering tool, which returns OSM data on an interactive map.
The parser (available under the Apache License 2.0) can be used to train a model that converts natural language questions to MRL formulae. It approaches semantic parsing as a monolingual statistical machine translation (SMT) task in which one translates from a natural language into a machine readable one. It is closely linked to the cdec decoder.
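To make the "parsing as translation" idea concrete, the following is a minimal sketch of how a nested formula could be flattened into a token sequence that an SMT system can treat as a target sentence. The arity-annotated pre-order encoding (`label@arity`) is a common choice for this kind of setup and is an assumption here, not a description of the parser's actual preprocessing. The function names `parse` and `linearize` are hypothetical, and the formula in the usage note is a hand-written illustration rather than an entry from the corpus.

```python
import re


def parse(formula):
    """Parse a nested functional formula, e.g. f(g('a','b'),h(x)),
    into (label, children) tuples. Quoted values must not contain quotes."""
    # Tokens: quoted strings, bare symbols, and the three structural characters.
    tokens = re.findall(r"'[^']*'|[^\s(),']+|[(),]", formula)
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos]
        pos += 1
        children = []
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1  # skip "("
            while tokens[pos] != ")":
                if tokens[pos] == ",":
                    pos += 1  # skip argument separator
                    continue
                children.append(node())
            pos += 1  # skip ")"
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


def linearize(tree):
    """Pre-order traversal; each token carries its number of children,
    so the tree can be rebuilt from the flat sequence."""
    label, children = tree
    out = [f"{label}@{len(children)}"]
    for child in children:
        out.extend(linearize(child))
    return out
```

For example, `linearize(parse("qtype(count)"))` yields `['qtype@1', 'count@0']`. Because every token records its arity, a decoder output can be checked for well-formedness and converted back into a nested formula before it is executed.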
A first version of the website is running. Feel free to test it but be aware that we are still working on the generalization. Any question you pose will help us!
The NLmaps corpus is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. If you use the corpus in your work, please cite Haas and Riezler (2016).
A few example questions are:
- What is the closest bank with ATMs from the Palace of Holyroodhouse in Edinburgh?
- Where are the closest bank and the closest pharmacy from the Rue Lauriston in Paris?
- Give me the name and location of all tourist related activities that can be accessed with a wheelchair in Heidelberg!
- Where is the closest Indian or Asian restaurant from the cinema Le Cinaxe in Paris?
- What is the homepage of the cinema Die Kamera in Heidelberg?
- What are the names of cinemas that are within walking distance from High Street in Edinburgh?
- How many schools in Edinburgh have a bus stop less than 200 meters away?
- Which driving school is closest to Mannheimer Straße in Heidelberg and where is it?
- What is the name of the closest museum or art centre from Notre Dame in Paris?
- How many historic sites are in the east of Nantes?
- Which cuisines are there in Denver?
- Where is the closest restaurant or bar from the Hawes Pier in Edinburgh?
- How many locations are in the east of Heidelberg where you can play miniature golf?
- Are there any caves in Osterode and if so how many?
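Questions like these ultimately have to become queries over OSM tags. As a rough illustration of what such a query can look like at the Overpass level, the sketch below builds an Overpass QL string for objects with a given tag inside a named area. It is hand-written for this page and is not the query that the NLmaps tooling generates; the function name `build_overpass_query` and the tag choices (`amenity=cinema`, `name`) are assumptions made for the example.

```python
def build_overpass_query(city, key, value, name=None):
    """Return an Overpass QL query for objects tagged key=value
    inside the area whose name tag equals `city`."""

    def quote(text):
        # Escape backslashes and double quotes for an Overpass QL string.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    filters = "[" + quote(key) + "=" + quote(value) + "]"
    if name is not None:
        filters += '["name"=' + quote(name) + "]"

    return "\n".join([
        "[out:json][timeout:25];",
        'area["name"=' + quote(city) + "]->.searchArea;",
        # nwr matches nodes, ways and relations alike.
        "nwr" + filters + "(area.searchArea);",
        # "center" adds a single coordinate for ways and relations.
        "out center;",
    ])
```

Calling `build_overpass_query("Heidelberg", "amenity", "cinema", name="Die Kamera")` gives a query whose results would carry the tags of the matching cinema; whether a homepage is present depends on how the object is tagged in OSM.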
The work was in part supported by the grant RI-2221/2-1 for the project "Grounding Statistical Machine Translation in Perception and Action" funded by the Deutsche Forschungsgemeinschaft (DFG).
Carolin Haas and Stefan Riezler (2016). A Corpus and Semantic Parser for Multilingual Natural Language Querying of OpenStreetMap. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2016), San Diego, CA.