A Spatially Explicit Pretrained Language Model for Named Geographic Entity Recognition and Localization
Topics:
Keywords: Text Mining, Named Entity Recognition, Geographic Entity Localization, Spatially Explicit AI
Abstract Type: Paper Abstract
Authors:
Zhaonan Wang, University of Illinois Urbana-Champaign
Wei Hu, University of Illinois Urbana-Champaign
Bowen Jin, University of Illinois Urbana-Champaign
Minhao Jiang, University of Illinois Urbana-Champaign
Jiawei Han, University of Illinois Urbana-Champaign
Shaowen Wang, University of Illinois Urbana-Champaign
Abstract
Named geographic entities (or geo-entities), the fundamental units of many geospatial datasets, are ubiquitous in unstructured data. Although many studies have used pretrained language models to recognize and extract such geo-entities from text, most, if not all, fail to explicitly consider the geospatial attributes of these entities, including geographic coordinates and contexts. We hypothesize that incorporating such knowledge will improve performance on this information retrieval task, and we therefore propose a novel spatially explicit language model. Built on a BERT backbone, our model takes in geospatial embeddings of the entities and their contexts in a multimodal learning fashion. Furthermore, the extracted geo-entities are localized on maps, which is a nontrivial follow-up task given the heterogeneous nature of geo-entities (which may be points, lines, or polygons). This framework, in which text and maps interact, not only enhances the recognition of named geographic entities in large volumes of text, but also facilitates the understanding of text content from a geospatially explicit perspective.
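The abstract does not specify how the geospatial embeddings are computed or combined with the text representation. As a minimal illustrative sketch only, and not the authors' implementation, the following Python code assumes a multi-scale sinusoidal encoding of a (longitude, latitude) pair that is added element-wise to the embedding of each token known to belong to a geo-entity. The function names `encode_location` and `fuse_text_and_location`, the encoding scheme, and the additive fusion are all hypothetical choices made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def encode_location(lon: float, lat: float, dim: int = 8) -> List[float]:
    """Hypothetical multi-scale sinusoidal encoding of a (lon, lat) pair.

    Assumption: coordinates are in degrees; dim must be a multiple of 4.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be a multiple of 4")
    x = math.radians(lon)
    y = math.radians(lat)
    features: List[float] = []
    for k in range(dim // 4):
        freq = 2.0 ** k  # each scale doubles the spatial frequency
        features += [
            math.sin(freq * x), math.cos(freq * x),
            math.sin(freq * y), math.cos(freq * y),
        ]
    return features


def fuse_text_and_location(
    token_embeddings: Sequence[Sequence[float]],
    token_coords: Sequence[Optional[Tuple[float, float]]],
) -> List[List[float]]:
    """Add a location encoding to each token that has known coordinates.

    Assumption: fusion is element-wise addition; tokens without
    coordinates (None) keep their text embedding unchanged.
    """
    fused: List[List[float]] = []
    for emb, coord in zip(token_embeddings, token_coords):
        if coord is None:
            fused.append(list(emb))
        else:
            geo = encode_location(coord[0], coord[1], dim=len(emb))
            fused.append([t + g for t, g in zip(emb, geo)])
    return fused
```

In a BERT-style model, vectors fused in this way could in principle be supplied in place of the plain token embeddings, so that entity mentions carry location information alongside their textual context.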