History Lab: Making Maps of Mexico
Johns Hopkins University - Spring 2021
History Lab: Making Maps of Mexico was an experiment in collective, collaborative research and learning that took place at Johns Hopkins University in Spring 2021. The course was designed to help students learn something about Mexican history in the nineteenth century and something about the basics of qualitative data management and map making.
The course was based around a set of agricultural surveys conducted across Mexico in 1899. We used the spreadsheets municipal officials filled out to think about the history of data, the history of life in rural Mexico, and the potential and the problems inherent in digital humanities.
The class's work is part of Prof. Casey Lurtz's ongoing digital agricultural atlas project, which aims to make this historical data widely available to scholars in an interactive, online format. Through the data cleaning, mapmaking, storytelling, and conversations we had in this class, we thought through some of the challenges that building such an interface poses.
The nine students enrolled in the class, our TA Alexander Young, and Prof. Lurtz came at this project with very different levels of experience and knowledge about all aspects of the material. We worked closely with members of the MSE Library's Data Services team, especially Reina Chano Murray and Lena Denis, to whom we are very grateful for all their help. We approached the semester as a learning process and took Miriam Posner's tweet about constructive, collective learning as our starting point.
This StoryMap serves as a tour through the course: our source material and readings, our conversations, the tools we learned to use, and the interests we developed and explored. As with everything in the class, it represents a collaboration between all the team members involved.
In advance of the 1900 Paris Exposition, the Mexican Agricultural Society, an organization dedicated to the promotion and modernization of Mexico’s rural economy, convinced the Department of Fomento, the government ministry dedicated to Mexico’s development, to use the occasion to compile an agricultural statistical report for the whole country.
Mexican statistics were used for the project of attracting foreign investors to pour money into Mexico. Also, as a way of displaying modernization and forwardness of a nation (by European standards) - Beata on Mauricio Tenorio Trillo, “Mexican Statistics, Maps, Patents, and Governance.” In Mexico at the World’s Fair (1996)
The men who proposed the surveys were quite clear in their purposes, as their letter to the director of Mexico's pavilion in Paris makes apparent:
Archivo General de la Nación de México, Fomento y Obras Publicas: Exposiciones Extranjeras y del País. Caja 67, Expediente 7, translation by Casey Lurtz
From the outset, the project's authors debated what information they already had, what they needed, and how to combine the two. The bureaucrats at the Exposition committee edited the proposed tables sent by the Agricultural Society, sending drafts back and forth, adding and deleting columns to suit what data they thought would sell best on the world stage and, at least in theory, reduce the work done by the municipal officials charged with collecting the statistics.
Archivo General de la Nación de México, Fomento y Obras Publicas: Exposiciones Extranjeras y del País. Caja 67, Expediente 7, translation by Casey Lurtz
Copies of the final spreadsheets were sent to all 2,300 municipalities in Mexico in the summer of 1899. Over the course of the following months, officials from 1,400 municipalities returned their completed surveys to the federal government. In reading through these surveys, it becomes clear that the project of the officials who filled out the spreadsheets was not always the same as that of the officials who created the blank sheets in the first place.
The statistics we are viewing are meant to be attractive, so their accuracy cannot be automatically assumed. [Tenorio] does say that they are valuable documents, and very important to understanding the agricultural moment of the time, but to keep in mind the idealization of the situation. - Dianne on Tenorio Trillo
Spreadsheet for Calvillo, Aguascalientes. Archivo General de la Nación de México, Fomento y Obras Publicas: Exposiciones Extranjeras y del País. Caja 52, Expediente 10.
The spreadsheets were never published, their contents never displayed in Paris. Instead, the completed surveys were filed away at the national archive and the failure of the project dismissed by the planning committee as the fault of recalcitrant municipal officials who failed to return the requested information.
This class and Prof. Lurtz's larger project aim to take the information the surveys contain and make it accessible and useful. With the help of research assistants Lauren MacDonald and Oriol Regue Sendros, Prof. Lurtz transcribed all of the original documents into Excel. She cleaned some of the data, standardizing names and conducting some basic calculations, but the datasets we started this class with were far from uniform and many decisions about how to regularize their contents remained.
In the first weeks of class, each student selected datasets from two different states, each one representing about 100 properties. Prof. Lurtz designed the datasets to maximize coverage of the whole country and make sure that each student would get a sense of the diverse ways in which municipal officials filled out their surveys.
Our datasets
In total, our datasets include more than 2,000 properties, covering about 250 municipalities in 12 states. The full dataset Prof. Lurtz is working from is much larger, but by working with a representative subset of it, we were able to settle on some means of normalizing and cleaning the data for input into mapping software. At the same time, we discussed what would be lost in the cleaning process and how we might work to keep that information available to readers and researchers.
Our conversations about data drew on the work of historians including Joan Scott and Jessica Marie Johnson to think about the appeal of data and the ways in which it flattened human experiences both historically and in the hands of historians.
Statistics should be viewed no differently than qualitative evidence: a piece of information that relies upon the context in which it was collected to tell the full story, rather than a contextless, objective truth - Autumn on Joan Scott, “A Statistical Representation of Work.” In Gender and the Politics of History (1988)
As we tried to establish norms for cleaning our own datasets, it became clear that the historical actors filling out the surveys regularly refused or resisted the categories offered to them by the Mexican Agricultural Society and the Department of Fomento. We read more Mexican history, including work by Antonio Escobar Ohmstede and Matthew Butler, to try to understand how economic, political, and social changes related to the increasing commercialization of land and agriculture did and did not change how people understood their own agricultural undertakings.
Even though we want to maintain the diversity of ways in which people described their landholding, their local climates, their crops, and their tools, we also want to be able to import this information into ArcGIS and use some basic analytical tools to make comparisons across municipalities and regions. To undertake this cleaning process, we used a free program called OpenRefine. Marley Kalt, from the MSE Library's Data Services team, led a workshop to teach us the basics, and we spent the next weeks coming up with standards for our data.
Data from Aguascalientes in the midst of cleaning in OpenRefine
Different teams took on different columns and established norms and JSON code for making the data comparable across municipalities. We standardized capitalization and spelling, settled on units of measure, clarified what words we would use to describe different climates and tools, and did our best to keep space for the oddities and messiness by adding a column for annotations.
We decided to keep "se ignora" because it is an active documentation by the individuals filling out these spreadsheets, but we decided to replace "no hay" or "not indicated" and its equivalents with <null>. - Julianne, Dianne, and Emma, General Standards working group
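The rule in the quote above can be sketched in a few lines of Python. This is only an illustration of the logic, not the JSON the class actually applied in OpenRefine: the function name, the choice of lowercase as the standard capitalization, and the treatment of blank cells as null are all assumptions made for the example.

```python
from typing import Optional

# Values recorded as an explicit statement by the official; these are kept.
KEEP_AS_IS = {"se ignora"}

# Values replaced with null, per the working group's decision quoted above.
# The real list of "equivalents" lives in the class codebook and is not
# reproduced here; these two entries come straight from the quote.
TREAT_AS_NULL = {"no hay", "not indicated"}


def normalize_cell(value: Optional[str]) -> Optional[str]:
    """Return a standardized cell value, or None where the cell becomes null."""
    if value is None:
        return None
    # Collapse stray whitespace and standardize capitalization.
    # Lowercasing is an assumption; the text does not say which case was chosen.
    cleaned = " ".join(value.split()).lower()
    if cleaned in KEEP_AS_IS:
        return cleaned
    # Treating blank cells as null is also an assumption for this sketch.
    if cleaned == "" or cleaned in TREAT_AS_NULL:
        return None
    return cleaned
```

Keeping the two lists separate from the function mirrors the idea of a codebook: the decisions stay visible and editable in one place while the same steps are applied to every dataset.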
Some columns resisted OpenRefine. The team working on the Yield column ended up going through our dataset manually to pull out the crop information where it was available and to come up with a standard form for writing out the different ways municipal officials provided information on yield. The fact that many municipalities just provided a rate of return, rather than including any information about what crop it was for, was a good reminder of the ways that the survey authors' aspirations to integrate this information with other information held by the federal government never came to pass.
We generated a codebook to record our decision-making process, and everyone applied the same steps to their datasets in OpenRefine. This meant that everyone could continue working with their own subsets of data and that, when the time came, we could also start to make comparisons across regions.
We moved to mapmaking by first thinking about what maps do in abstract and concrete terms. We talked about national cartography as a way of thinking that required certain developments in government capacity as well as new ways of thinking. We paired Mary Berry's work on the absence and emergence of maps beyond the very local in medieval and early modern Japan with Raymond Craib on the history of mapping in Mexico in the years around our agricultural surveys. Thinking about surveying and tromping across the countryside got us talking about Google StreetView and GeoGuessr and how close we can get to understanding life before GPS.
How are we certain of what maps are not? - Emma on Mary Berry, “Maps are Strange.” In Japan in Print (2006)
When it came to actually mapping our data, we turned again to our wonderful Data Services librarians, this time Lena Denis and Reina Chano Murray. They guided us through two kinds of activities in ArcGIS Online: georeferencing and creating polygon layers and then joining and analyzing data.
In the 19th century, Mexico's administrative structure included municipalities within districts within states. The agricultural surveys reflected this structure. The districts were done away with after the Mexican Revolution, and so contemporary maps do not include them. We decided to create polygons that represented the 19th century districts as a way to allow for data to be mapped at that level.
Prof. Lurtz initially shared a set of state-level maps published in 1899 that included the outlines of all the districts in each state.
Lena and Reina guided us through the process of tracing these boundaries to create new polygon layers in ArcGIS, layers that could then be used to analyze our data at the district level.
As team members started tracing the districts in some states, it became clear that the 1899 maps were not very precise, much as the Craib reading should have led us to expect. We found another set of maps from 1904 that were better for some states, and so began using these as well.
Our district polygon layers cover much of Mexico for 1900 (or so) and will be made available as a shapefile for historians to use, after some further cleaning and processing by Prof. Lurtz and a digital humanities project she is working with in Mexico called Archivo.MX.
The new district polygon layers are only useful, though, if joined to our cleaned data. Because we don't know the specific locations for each of the properties in our dataset, the closest we could get to locating them was at the municipal level. But, unlike with the district maps above, there are no maps of municipal boundaries from the end of the 19th century. Some of us tried joining straight to the district polygons we had created, but this felt like it wasn't going to capture some of the regional differences we had noticed. Instead, Reina and Lena found us a municipality-level polygon layer from Mexico's National Institute of Statistics and Geography (INEGI), and we learned how to join our data to that. We created summary tables in Excel to allow for comparisons across municipalities, but also imported the property-level data for those who want to see things in greater detail.
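The summary-and-join step described above was done in Excel and ArcGIS Online, but its logic can be sketched in plain Python. Everything specific here is hypothetical: the municipality names, the field names, the units (hectares and pesos), and the numbers are invented for illustration and are not drawn from the surveys.

```python
from collections import defaultdict

# Hypothetical property-level rows; None marks a value cleaned to null.
properties = [
    {"municipality": "Municipio A", "size_ha": 100.0, "value_pesos": 2000.0},
    {"municipality": "Municipio A", "size_ha": 200.0, "value_pesos": 3000.0},
    {"municipality": "Municipio B", "size_ha": 50.0, "value_pesos": None},
]

# Hypothetical attribute rows standing in for a municipal polygon layer.
polygons = [
    {"municipality": "Municipio A"},
    {"municipality": "Municipio B"},
    {"municipality": "Municipio C"},
]


def summarize_by_municipality(rows):
    """Build one summary record per municipality from property-level rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["municipality"]].append(row)

    summary = {}
    for name, group in groups.items():
        # Nulls are skipped rather than counted as zero.
        sizes = [r["size_ha"] for r in group if r["size_ha"] is not None]
        values = [r["value_pesos"] for r in group if r["value_pesos"] is not None]
        summary[name] = {
            "property_count": len(group),
            "total_value_pesos": sum(values) if values else None,
            "avg_size_ha": sum(sizes) / len(sizes) if sizes else None,
        }
    return summary


def join_to_polygons(polygon_rows, summary, key="municipality"):
    """Attach summary fields to each polygon; unmatched polygons get a count of 0."""
    joined = []
    for polygon in polygon_rows:
        attrs = dict(polygon)
        attrs.update(summary.get(polygon[key], {"property_count": 0}))
        joined.append(attrs)
    return joined


summary = summarize_by_municipality(properties)
joined = join_to_polygons(polygons, summary)
```

The join runs from the polygons outward, so a municipality with no returned survey still appears on the map, just with no data attached. Keeping the property-level rows alongside the summary is what allows pop-ups to show individual properties.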
Click on a polygon to see pop-ups for the individual properties as well as summary data for the municipality. (Summary data will be either the first or the last slide.)
The polygons are drawn from Mexico's statistical office's contemporary maps of the country and do not necessarily correspond precisely to the historical boundaries of the 1899 municipalities.
The contemporary municipality of Tamazula, Durango (in red), for example, overlaps with the 1899 municipalities of Amaculí, Copalquín, Remedios, and Tamazula. Clicking through the pop-up lets you see data for each historical municipality.
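One way to think about this many-to-one relationship is as a lookup table from historical municipality to contemporary polygon. The sketch below uses only the Tamazula example named above; the function name and the handling of unmatched names are assumptions, and a real lookup would also need to key on state, since municipality names can repeat.

```python
from collections import defaultdict

# Lookup built from the example in the text: four 1899 municipalities fall
# within the contemporary polygon for Tamazula, Durango.
HISTORICAL_TO_CONTEMPORARY = {
    "Amaculí": "Tamazula",
    "Copalquín": "Tamazula",
    "Remedios": "Tamazula",
    "Tamazula": "Tamazula",
}


def group_for_popups(historical_names, lookup):
    """Group historical municipalities under the polygon that will display them.

    Returns (grouped, unmatched): a dict of contemporary name -> list of
    historical names, and a list of names with no entry in the lookup.
    """
    grouped = defaultdict(list)
    unmatched = []
    for name in historical_names:
        contemporary = lookup.get(name)
        if contemporary is None:
            # Flag the gap for manual review instead of guessing a polygon.
            unmatched.append(name)
        else:
            grouped[contemporary].append(name)
    return dict(grouped), unmatched
```

Returning the unmatched names separately keeps the gaps visible, which matters when contemporary boundaries only approximate the historical ones.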
By using ArcGIS symbology, we started to figure out how to represent regional differences. This map by Rui uses gradations of color to represent the different numbers of properties in municipalities in Michoacán.
This map by Julianne represents variation in the total value of property in parts of Hidalgo.
This map by Dianne overlays circles representing the total size and value of properties in municipalities in Chiapas on a graduated representation of the number of properties in each.
This map by Prof. Lurtz using all the class data looks at the relationship between average property value and average property size.
Seeing this ["the disparity between fantasy and capacity"] a bit in the maps we originally thought we could create with data at the beginning of the course that are now turning out to be much more difficult to visualize. - Julianne on Raymond Craib, "Situated Knowledge," in Cartographic Mexico (2004)
These maps are still works in progress. We have come to recognize that the information in pop-ups is likely just as important as what actually gets displayed on the map. The surveys we are working with were never published, nor were they displayed at the 1900 Paris Exposition, and the process of data cleaning and mapmaking has made the reasons for this very clear. The information they contain is difficult even for us, with all our tools, to turn into something that ArcGIS can map. If nothing else, we know that we need to include the original sources with the final web interface wherever possible, something the map tour below demonstrates.
Our maps alone can only do so much to capture the information that the surveys both contain and allude to. Even with detailed pop-ups and metadata, a lot has been left out. As our final project, team members created StoryMaps to provide greater detail on one theme or subject that caught their interest during the semester.
Using the StoryMap interface, again introduced by our wonderful Data Services librarians, and primary sources we found online thanks to the help of our Latin America librarian Josh Everett, we tried to bring back in some of the nuance and detail that data cleaning and mapping had flattened. Thinking back to Jessica Marie Johnson's article “Markup Bodies: Black [Life] Studies and Slavery [Death] Studies at the Digital Crossroads” and Lara Putnam's admonition to remain aware of how relying on digital archives shapes what we find in her article “The Transnational and the Text-Searchable: Digitized Sources and the Shadows They Cast”, we built StoryMaps to facilitate further exploration and conversation about the diversity of communities and kinds of knowledge that the original 1899 statistical surveys tried to cram into columns and rows.
You can explore some examples of these StoryMaps in the collection below.