SpatialTech: Bruno Martins

On Tuesday, November 17, 2020, the UCSB forum on spatial technology presents

Challenges in resolving place names over text


Bruno Martins

University of Lisbon

11:30 a.m. Tuesday, November 17, 2020 | Zoom*


Toponym resolution concerns the disambiguation of place names in textual documents, in support of applications such as geographical search or the mapping of textually encoded information. Place names are first recognized through a named entity recognition model, and the disambiguation is then achieved by associating each place reference with a unique position on the Earth’s surface, e.g., through the assignment of geospatial coordinates. The toponym resolution task is particularly challenging, given that place references are highly ambiguous (i.e., distinct locations can share the same place name, and multiple names can be used to refer to the same place). In this talk, I will discuss techniques for toponym resolution, with a particular emphasis on a novel deep learning approach. In contrast to most previous methods, the novel approach does not involve matching references in the text against entries in a gazetteer, and instead predicts geospatial coordinates directly. In brief, the neural network architecture considers multiple inputs (e.g., the toponym to disambiguate together with the surrounding words), leveraging pre-trained contextual word embeddings for modeling the textual data. The intermediate representations are then used to predict a probability distribution over possible geospatial regions, and finally to predict the coordinates for the input toponym. I will present evaluation results over different types of corpora (e.g., modern newswire text or historical documents), and I will discuss the impact of model extensions related to (i) the use of external information concerning geophysical terrain properties, including information on terrain development or elevation, among others, and (ii) additional training data collected from Wikipedia articles, to guide and further support model training.
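The last two prediction steps in the abstract (a probability distribution over regions, then coordinates) can be illustrated with a minimal sketch. It is not the speaker's model: the per-region scores stand in for the output of the neural network, the region centroids are made-up inputs, and the choice to average centroids on the unit sphere is an assumption made here for illustration. The names `softmax`, `predict_coordinates`, and the helper functions are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw per-region scores into a probability distribution."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def to_lat_lon(x, y, z):
    """Project a non-zero 3D vector back to latitude/longitude in degrees."""
    norm = math.sqrt(x * x + y * y + z * z)
    return math.degrees(math.asin(z / norm)), math.degrees(math.atan2(y, x))


def predict_coordinates(region_scores, region_centroids):
    """Return (region probabilities, (lat, lon)) for one toponym.

    Assumption: the final coordinates are the probability-weighted mean of
    the region centroids, computed on the unit sphere so that regions on
    either side of the antimeridian average sensibly. Perfectly antipodal
    centroids with equal weight would give a zero vector and are not handled.
    """
    probs = softmax(region_scores)
    x = y = z = 0.0
    for p, (lat, lon) in zip(probs, region_centroids):
        vx, vy, vz = to_unit_vector(lat, lon)
        x, y, z = x + p * vx, y + p * vy, z + p * vz
    return probs, to_lat_lon(x, y, z)


# Illustrative inputs only: three candidate regions and made-up scores.
centroids = [(38.7, -9.1), (51.5, -0.1), (34.4, -119.7)]
probs, (lat, lon) = predict_coordinates([10.0, 0.0, 0.0], centroids)
```

When one region dominates the distribution, the predicted point lies close to that region's centroid; a flatter distribution pulls the prediction toward a compromise between regions.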


Bruno Martins is an assistant professor at the Computer Science and Engineering Department of Instituto Superior Técnico of the University of Lisbon (IST/UL), and a researcher at the Information and Decision Support Systems Lab of INESC-ID, where he works on problems related to the general areas of information retrieval, text mining, and the geographical information sciences. He received his MSc and PhD degrees from the Faculty of Sciences of the University of Lisbon, both in Computer Science. Bruno has been involved in several research projects related to geospatial aspects in information access and retrieval, and he has accumulated significant expertise in addressing challenges at the intersection of language technologies, machine learning, and the geographical information sciences. He and his students have worked on many different application areas, and he is proudest of the many PhD/MSc students who have graduated under his supervision and are now building wonderful careers.


ThinkSpatial: Martin Doerr

On Thursday, September 17, 2020, the UCSB forum on spatial thinking presents

Identifiable Individuals and Reality
What Do We Describe and Why


Dr. Martin Doerr

Foundation for Research and Technology – Hellas (FORTH)

10:00 a.m. Thursday, September 17, 2020 | Zoom link*


Data of empirical-descriptive sciences, such as cultural heritage studies, geography, geology, and biodiversity, are usually kept in predicate-logic-based information systems that refer to things in reality by unique identifiers. This can only work if the features or phenomena referred to are distinct in reality and can be identified diachronically in the same way by independent observers without a dialogue between them. In this presentation, we argue that only a small part of the features in our environment is sufficiently distinct over a useful time span to form “identifiable individuals.” Different ontological categories can provide specific criteria for how parts of reality can be subdivided into “identifiable individuals” that turn out to be useful for modeling the behavior of reality as a result of observation rather than convention, the so-called ontological individuation. We demonstrate (1) that within essentially all such categories there are always cases in which individuality is undecidable, (2) that multiple individuals may overlap in substance in characteristic ways, and (3) that no such individual has precise spatiotemporal boundaries, due to a variety of causes.

We argue that the kinds of conditions allowing for ontological individuation have largely not been studied, nor have the properties that make phenomena such as clouds, stages of growth, flowing matter, and so forth unsuited to individuation. We further propose that the description of delimited situations in such systems, be it after observation or in prediction, needs to relate to identifiable individuals as reference. This epistemic individuation inherits the indeterminacy of the individuals of reference. We further propose that many kinds of scientific description of reality are approximations that can be better processed via outer bounds. As a practical application, we show how adequate individuation criteria can substantially reduce the ambiguity of spatiotemporal gazetteers.
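One way to read the remark on outer bounds is sketched below. This is an illustration under stated assumptions, not the speaker's formalism: an individual with indeterminate boundaries is represented by a box and a year interval that are assumed to enclose its true extent, so that two individuals whose outer bounds are disjoint certainly do not overlap, while everything else stays undecided. The class `OuterBound`, both functions, and the gazetteer entries are hypothetical, and boxes crossing the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OuterBound:
    """A box and year interval assumed to enclose an individual's true extent."""
    south: float
    west: float
    north: float
    east: float
    start_year: int
    end_year: int


def certainly_disjoint(a, b):
    """True if the outer bounds rule out any overlap in space or in time."""
    apart_in_space = (
        a.east < b.west or b.east < a.west
        or a.north < b.south or b.north < a.south
    )
    apart_in_time = a.end_year < b.start_year or b.end_year < a.start_year
    return apart_in_space or apart_in_time


def candidate_places(reference, gazetteer):
    """Keep only the gazetteer entries that the reference could refer to."""
    return [
        name for name, bound in gazetteer.items()
        if not certainly_disjoint(reference, bound)
    ]


# Made-up entries: two places sharing a name, separated in time and space.
gazetteer = {
    "Place A (ancient)": OuterBound(35.0, 23.0, 36.0, 26.0, -700, 400),
    "Place A (modern)": OuterBound(35.0, 23.0, 36.0, 26.0, 1830, 2020),
    "Place A (elsewhere)": OuterBound(10.0, -80.0, 12.0, -78.0, 1830, 2020),
}
mention = OuterBound(35.2, 24.0, 35.6, 25.0, 1900, 1910)
matches = candidate_places(mention, gazetteer)
```

The filter never asserts that two individuals do overlap; it only discards candidates that the outer bounds exclude, which is how the ambiguity of a name is reduced.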


Dr. Martin Doerr is a Research Director at the Information Systems Laboratory and honorary head of the Centre for Cultural Informatics of the Institute of Computer Science, FORTH. He has led the development of systems for knowledge representation and terminology, metadata, and content management, and has led or participated in a series of national and international projects for cultural information systems. His long-standing interdisciplinary work and collaboration with the International Council of Museums on modeling cultural-historical information have resulted, among other things, in an ISO standard, ISO 21127:2006, a core ontology for schema integration across institutions.


