Artificial Intelligence, Computational Approaches, and Geographical Text Analysis to Investigate Early Colonial Mexico
Patricia Murrieta-Flores, Diego Jiménez-Badillo and Bruno Martins
This is a preprint. Published as: Murrieta-Flores, P., Jiménez-Badillo, D., and Martins, B. (2022) Artificial Intelligence, Computational Approaches, and Geographical Text Analysis to Investigate Early Colonial Mexico. Oxford Research Encyclopedia of Latin American History. DOI: 10.1093/acrefore/9780199366439.013.977
Summary
The application of digital technologies within interdisciplinary environments is enabling the development of more efficient methods and techniques for analysing historical corpora at scales that were not feasible before. The project Digging into Early Colonial Mexico is an example of cooperation among archaeologists, historians, computer scientists and geographers, engaged in designing and implementing methods for text mining and large-scale analysis of primary and secondary historical sources, specifically the automated identification of vital analytical concepts linked to locational references, revealing the spatial and geographic context of the historical narrative. As a case study, the project focuses on the Relaciones Geográficas de la Nueva España (Geographic Reports of New Spain, or RGs). This is a corpus of textual and pictographic documents produced in A.D. 1577-1585, which provides one of the most complete and extensive accounts of Mexico and Guatemala's history and the social situation at the time.
The research team is developing digital tools and datasets, including (a) a comprehensive historical gazetteer containing thousands of georeferenced toponyms integrated within a Geographical Information System; (b) two digital versions of the RGs corpus, one fully annotated and ready for information extraction, and another suitable for further experimentation with Machine Learning, Natural Language Processing, and Corpus Linguistics methods; and (c) software tools that support a research method called Geographical Text Analysis (GTA). GTA applies Natural Language Processing based on deep learning algorithms for named entity recognition, disambiguation, and classification to parse texts and automatically mark up words that refer to place names. These place names are then associated with analytical concepts through a technique called Geographic Collocation Analysis. Using the GTA methodology and resources, the research team is investigating questions about the landscape and territorial transformations experienced during the colonisation of Mexico, and seeking to discover social, economic, political and religious patterns in the way of life of Indigenous and Spanish communities of New Spain towards the last quarter of the sixteenth century. All datasets and research products will be released under an open-access licence for the free use of scholars engaged in Latin American Studies or interested in computational approaches to history.
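To make the idea of Geographic Collocation Analysis more concrete, the following is a minimal sketch in Python, not the project's actual software. It assumes that place names have already been identified; a plain lookup against a small set of toponyms stands in here for the deep learning named entity recognition step. It then counts which concept words occur within a fixed number of tokens of each place name inside the same sentence. The function name `geographic_collocations`, the `window` parameter, the toponym and concept lists, and the sample sentences are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def geographic_collocations(text, toponyms, concepts, window=5):
    """Count concept words found near each toponym.

    A concept counts as a collocate when it lies in the same sentence
    as the toponym and at most `window` tokens away from it.
    The toponym lookup is a stand-in for a real NER step (assumption).
    """
    toponyms = {t.lower() for t in toponyms}
    concepts = {c.lower() for c in concepts}
    counts = defaultdict(Counter)

    # Naive sentence split on terminal punctuation (illustrative only).
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token not in toponyms:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in concepts:
                    counts[token][tokens[j]] += 1

    return {place: dict(found) for place, found in counts.items()}


if __name__ == "__main__":
    # Invented sample text, not a quotation from the RGs corpus.
    sample = "En Cholula se coge maiz. En Tepeaca se coge trigo y maiz."
    result = geographic_collocations(
        sample,
        toponyms={"Cholula", "Tepeaca"},
        concepts={"maiz", "trigo"},
    )
    for place, found in result.items():
        print(place, found)
```

In a fuller pipeline, each matched toponym would also be linked to its gazetteer entry so that the collocation counts could be mapped within a Geographical Information System; that linking step is omitted here.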