Ruprecht-Karls-Universität Heidelberg

NLmaps

There are two versions of the NLmaps corpus: NLmaps (v1) and its extension, NLmaps v2.

Both versions of the NLmaps corpus consist of questions about geographical facts that can be answered with the OpenStreetMap (OSM) database (available under the Open Database Licence). The questions are in English and have a corresponding Machine Readable Language (MRL) parse. Gold answers can be obtained by executing the gold parses against the OSM database using the NLmaps backend, which is based on the Overpass-API (available under the Affero GPL v3).

NLmaps v1

  • 2,380 question-parse pairs
  • Questions are available in English and German

NLmaps v2

  • 28,609 question-parse pairs
  • Contains all questions from NLmaps v1
  • Questions are available in English
  • Alternative files exist in which locations and POIs (Points of Interest) are masked. This makes it possible to treat semantic parsing and named entity recognition as two separate problems.

Natural Language Interface to OSM

Both versions of the NLmaps corpus were created primarily to build a semantic parser and to provide an interface to OSM that lets a user ask a question in natural language. The question is parsed into a database query, which is executed against a web-based filtering tool; the resulting OSM data is shown on an interactive map.

A demo version of the Natural Language Interface to OSM is available.

Backend

The NLmaps backend (available under the Affero GPL v3) can be used to obtain the answer for a given MRL formula.

Terms of use

Both versions of the NLmaps corpus are available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

If you use NLmaps v1 in your work, please cite Haas and Riezler (2016).

If you use NLmaps v2 in your work, please cite Lawrence and Riezler (2018).

Download

nlmaps.tar.gz
(md5: 1a13e8de1e1d89cae675a388d922f0ab)

nlmaps_v2.tar.gz
(md5: 108dfe7aaa81d93043fc15df4766be90)
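
The checksums listed above can be used to confirm that an archive was downloaded intact. The following is a minimal Python sketch, not part of the NLmaps distribution: it assumes the archives are saved under the names shown above in the current directory, and the helper names (md5_of_file, verify) are illustrative only.

```python
import hashlib
import os

# Checksums as published in the Download section above.
EXPECTED_MD5 = {
    "nlmaps.tar.gz": "1a13e8de1e1d89cae675a388d922f0ab",
    "nlmaps_v2.tar.gz": "108dfe7aaa81d93043fc15df4766be90",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumption: archives sit in the current working directory.
    for name, expected in EXPECTED_MD5.items():
        if not os.path.exists(name):
            print(f"{name}: not found, skipping")
        elif verify(name, expected):
            print(f"{name}: checksum OK")
        else:
            print(f"{name}: checksum MISMATCH")
```

MD5 is suitable here only as a check against corrupted or incomplete downloads; it is not a guarantee of authenticity.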

Example Questions

A few example questions are:

  • What is the closest bank with ATMs from the Palace of Holyroodhouse in Edinburgh?
  • Where are the closest bank and the closest pharmacy from the Rue Lauriston in Paris?
  • Give me the name and location of all tourist related activities that can be accessed with a wheelchair in Heidelberg!
  • Where is the closest Indian or Asian restaurant from the cinema Le Cinaxe in Paris?
  • What is the homepage of the cinema Die Kamera in Heidelberg?
  • What are the names of cinemas that are within walking distance from High Street in Edinburgh?
  • How many schools in Edinburgh have a bus stop less than 200 meters away?
  • Which driving school is closest to Mannheimer Straße in Heidelberg and where is it?
  • What is the name of the closest museum or art centre from Notre Dame in Paris?
  • How many historic sites are in the east of Nantes?
  • Which cuisines are there in Denver?
  • Where is the closest restaurant or bar from the Hawes Pier in Edinburgh?
  • How many locations are in the east of Heidelberg where you can play miniature golf?
  • Are there any caves in Osterode and if so how many?

Acknowledgments

The work was in part supported by the grant RI-2221/2-1 for the project "Grounding Statistical Machine Translation in Perception and Action" funded by the Deutsche Forschungsgemeinschaft (DFG).

Publications

  • Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback. Carolin Lawrence, Stefan Riezler. To appear in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia. (pdf) (bib)
  • NLmaps: A Natural Language Interface to Query OpenStreetMap. Carolin Lawrence, Stefan Riezler. In Proceedings of the International Conference on Computational Linguistics (COLING 2016), Osaka, Japan. (pdf) (bib)
  • A Corpus and Semantic Parser for Natural Language Querying of OpenStreetMap. Carolin Haas, Stefan Riezler. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016), San Diego, CA. (pdf) (bib)