Georeferencing Historical Maps

Methods, best practices, and a tutorial in ArcGIS Pro

Overview

The term georeference refers to the process of locating or aligning spatial data to a geographic grid. This tutorial discusses methods and best practices for georeferencing scanned historical maps. Other terms for georeferencing are rubbersheeting, warping, and georectification, and there is little semantic difference among them. In this tutorial, 'georeferencing' refers to the process of aligning map images, 'georectification' refers to the step that takes your georeferencing settings and permanently applies them to the map image, and 'georectified map' refers to the final product of georeferencing.

There are four major uses of georeferencing:

  1. Visualizing Map Extent. A growing interest in georeferencing is to contextualize the maps within geographic space. In this case, not much is done with the map, except to make it known that the cartographer was representing a particular geographic space that we have come to know and represent in other cartographic terms. The precision of alignment is not very important for this purpose.
  2. Data Transfer. The most common reason to georeference historical maps is to transfer features on the map to the GIS. Features can be anything and commonly take the form of boundaries, historical sites of mines, towns, or other points of interest, or even transportation networks or river systems. This goal of georeferencing requires an accurate and consistent alignment. 
  3. 3D and Extrusions. A georeferenced map can be transformed vertically (z values), not just in x-y dimensions. Thus, the map can be 'crumpled' to reflect the actual topography or it can be extruded to reflect the varying 'heights' of different parcel values, such as housing costs or education levels in an urban environment. Such practices reveal the hidden terrain of the historic maps. 
  4. Cartographic Insight. Finally, georeferencing the map might tell us a lot about the process of making the map: the bias of the cartographer, the varying scales and precision of representational space, etc. To georeference, one must know the map well and one learns much in the process. This information can produce valuable insights.

Control points

Georeferencing depends upon the identification of landmarks on the scanned map that can be located in other GIS layers, i.e. layers that have been added to a GIS map frame and which are precisely located and aligned to a geographic grid. In ArcGIS Pro, these associations between a place on the scanned map (i.e. a pixel in the map image matrix) and a place in the map frame (i.e. an XY location within a GIS coordinate system) are called control points. With a minimum of three control points (and ideally many more), ArcGIS will transform your map image: stretching, rotating, and bending the gridded pixels of the image to fit the geographic grid.

Until very recently, map makers did not know the XY coordinates for the geographic features they were trying to map. Rather, the location of the primary features was determined by a secondary set of features, which we might call the base features, whose location was already known, either by means of sets of coordinates or by way of an existing map, which we might call the base map. Thus, the map maker would begin by copying or plotting the base features onto a new map sheet and, afterwards, locating and drawing the primary features in relation to these base features.

Because the primary features of the original map are relative to the base features, they are subject to a double threat of error: the imprecision or error of the base features and the imprecision or error of the primary features. Thus, set control points to the base features, which have less chance of error. To distinguish between the base and primary features, you should know when, why, and by whom the map was created. If the goal of the original map maker was to establish the field boundaries for persons with irrigation rights (as in the example we will use below), do not set your control points to features with this cartographic theme (i.e. do not use field boundaries or waterways such as rivers or canals). Instead, use something like settlement geography (churches or municipal buildings, for instance) or transportation geography (roads or railways, for instance).

Not all basemap features are equally reliable. Often, human geography serves more reliably than physical geography. Before the advent of modern military survey data and the widespread application of remote sensing technologies (beginning no earlier than the 1930s), only large-scale physical geographical features (such as small lakes) should be trusted. Mountain ranges and other meso-scale topographical features were usually sketched in afterwards, often to add color and beauty to the map.

When setting your control points, look at the map’s composition. What is near the center? Is there a central feature? For maps before the mid- to late-19th century, consider setting control points around the periphery of the central feature(s). Then, add a control point in each corner of the map. You will likely find that the central features were more accurately mapped (and better known to the map maker or to their informants) than were peripheral sites, which are often there for context or to make a secondary point.

A georeferencing transformation is a mathematical equation that assigns XY coordinates to every pixel of your map image, essentially interpolating between the control points you have set. ArcGIS Pro offers many types of transformations, each offering a distinct advantage, be it simplicity, preservation of shape, global consistency, or local precision. These can simply shift the data (zero-order polynomial), optimize for global accuracy (first-order polynomial), keep straight lines straight (projective), bend lines at numerous points of inflection (second- and third-order polynomial), or stretch the image between control points, which become fixed (spline), offering localized but not general accuracy. See the ArcGIS Pro 'Overview of georeferencing' and read the content on 'Transforming the raster'.
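To make the idea of a transformation concrete, here is a minimal sketch of a first-order polynomial (affine) fit in plain Python. It is not how ArcGIS Pro implements its transformations; the function names (solve3, fit_affine, apply_affine) and the sample coordinates are hypothetical, chosen only for illustration. The sketch fits X = a*x + b*y + c and Y = d*x + e*y + f to the control points by least squares, which is why three non-collinear points are the minimum and why additional points spread the error globally.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear; cannot fit a transformation")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(source, target):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f.

    source: list of (x, y) pixel positions on the scanned map
    target: list of (X, Y) positions in the map's coordinate system
    Returns ([a, b, c], [d, e, f]).
    """
    if len(source) < 3 or len(source) != len(target):
        raise ValueError("need at least three matching control points")
    # Build the normal equations (A^T A) p = A^T t, where each row of A is [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    at_x = [0.0] * 3
    at_y = [0.0] * 3
    for (x, y), (tx, ty) in zip(source, target):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            at_x[i] += row[i] * tx
            at_y[i] += row[i] * ty
    return solve3(ata, at_x), solve3(ata, at_y)


def apply_affine(coeffs, x, y):
    """Map one pixel position to map coordinates using the fitted coefficients."""
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: four image corners and their map positions.
source_points = [(0, 0), (100, 0), (0, 100), (100, 100)]
target_points = [(500, 900), (700, 900), (500, 700), (700, 700)]
coefficients = fit_affine(source_points, target_points)
center = apply_affine(coefficients, 50, 50)
```

Because every pixel is pushed through the same six coefficients, one badly placed control point degrades the whole image slightly rather than one corner severely, which is the sense in which a first-order polynomial favors global accuracy.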

If the georectified map is important to your research, you should save the control point table and use the raster file's metadata to record observations on the georectification process. It might also be a good idea to generate a point layer of the control points and to record point-specific observations in a notes field within the attribute table.

Tutorial

In the following tutorial, we will produce a georectified map of an irrigation map of the Teotihuacan Valley, located about 30 km from the center of Mexico City. The image (AHA_AS_C3194_E43887_F3.jpg) can be downloaded here.

Analyze Map

A first step before georeferencing is to understand the history and cartography of your map. In post-revolutionary Mexico, land and water rights were often redistributed to local farmers. This map, drawn at the scale of 1:20,000 in 1925 by Pablo Reider, represents the results of an accord between users of water in the Valley of Teotihuacán and the national regulatory agency (Comisión Nacional Agraria). Georectifying this map is an essential step to digitize and transfer features to vector layers. The map can be used to analyze property rights, access to water, and land use after the Mexican Revolution.

Plano de conjunto de la zona irrigada por las aguas del Río de San Juan, Estado de México. Comisión Nacional Agraria, Departamento de Aguas. Archivo Histórico del Agua, Mexico City (Aguas Superficiales, caja 3194, expediente 43887, folio 3).

The map sets out town boundaries using a complex array of thick yellow and red lines. Areas with yellow boundaries indicate a part within the whole, while red lines indicate external municipal boundaries. These colorful lines distract from the primary line types used to signify the boundaries of eight major land owners in the valley. They are drawn in black ink, each with its own unique line type, and described in the map legend. On top of the colored or styled line types, a thin black line is used consistently to denote changes in ownership. Dots are interspersed within the line when there is a point of inflection.

The transportation network is drawn with black lines: railways (cross-hatched), country roads (parallel solid and dashed), and village roads (parallel solid). Village settlement (houses and properties) is represented carefully with gridded rectangles. Churches are usually drawn as circles with a tangential cross, but sometimes as a circle inside of a circle. Larger yards and buildings of rural estates are drawn with multi-sided polygons.

Hydrography and topography are drawn in blue and red, respectively. Double solid blue lines mark primary waterways, single blue lines mark minor waterways, concentric wavy blue lines mark wetlands and reservoirs, and blue dots mark springs. The irrigation zone is delimited by a dashed black border and filled with blue hatching using three different angles. Topographical lines (i.e. contours) are drawn in red ink.

In some cases, we find point features with dashed lines emanating radially to two points of inflection on the boundary lines. Likely, these points and lines represent survey points with known distances between points.

Set Up

Import the map image. Open a new ‘map’ in Pro. Create a folder connection to the downloaded folder with this lesson's data. Import the image (AHA_AS_C3194_E43887_F3.jpg) via 'Add Data' or drag-and-drop from the Catalog pane. You'll get a message telling you that the layer has an 'unknown coordinate system'. In fact, the image lacks a coordinate system, but its rows and columns of pixels behave much like one. Right-click on the new layer in the Contents pane and click 'Zoom to Layer'. You'll see the map located at the origin of the map's coordinate system, which will be off the coast of Africa unless you have changed the coordinate system. If you have, you'll find it at the 0,0 location of that projected coordinate system.

Fit to Display. Zoom to the approximate spatial extent of your map image, which will be about 30 km northeast of Mexico City, roughly between Ecatepec and Teotihuacan. On the ‘Imagery’ ribbon, click ‘Georeference’, which will open a new ‘Georeference’ ribbon. On the 'Georeference' ribbon, click 'Fit to Display', which will relocate the image to the spatial extent of your current display. This will make it much easier to align a control point on the image with one at the appropriate place within the geographic grid.

Add Control Points

It is time to add your control points. Given our knowledge of the map's cartography, it seems likely that the basemap used by Pablo Rieder, the map's cartographer, included the transportation network, especially railways and major roads. The survey would have been conducted with this basemap in hand. I suspect, too, that great care was then taken to locate important churches, given that they coincide well with the road network. With roads, railways, and churches, few places on the map would have been far from an established (mapped) point of reference. The locations of the remaining features were thus dependent upon those of the basemap. This is especially true of hydrography and property boundaries, which would have been thoroughly debated by the property owners in the valley. Given this assessment, we want to place points, first, at intersections of major roads with railways and, second, at intersections between major roads. Then, we will seek to add points at churches.

To begin, click 'Add Control Points'. Choose an appropriate basemap, probably either the satellite imagery or a streetmap. Then, find intersections of railways and roads on the map image, zooming in and out as necessary, and turning the map image layer on and off in the Contents pane. When you find an appropriate place, add a control point to the map image (i.e. the source), turn off the map image layer, and then click the corresponding spot on the basemap (i.e. the target). Don't worry if you add a bad point. You can either escape out of the operation (if you haven't added the target point) or delete the point later in the control point table. Continue adding points based on the transportation network. Then, add some for the churches. Try to add points throughout the spatial extent of the map image, remembering that points outside of the map maker's main AOI (Area of Interest) are likely to be less reliable. Nevertheless, do your best to add at least a few around the outside. Ensure that your control points are widely distributed (for global accuracy) and locally distributed around the core features (for local accuracy).

After you have added at least 10 points (15-18 would be better), it is time to find an algorithm to interpolate the map based upon your control points. Before you choose a transformation, open your control point table and review the points and their residual errors. The forward residual is the distance between where the algorithm predicts the location of a target control point and the target point that you chose; it is reported in the same units as the data frame (i.e. map) spatial reference. The total Root Mean Square (RMS) error summarizes all of the points: it is the square root of the mean of the squared residuals. Errors are unavoidable in historical maps, but you will seek to choose a transformation that minimizes the error. Once you have 4 or more points, you can deselect points with high residuals. Doing so before 4 points are set can cause your raster to shift in the map view quite unexpectedly, although this can be fixed. On the 'Georeference' ribbon, click 'Save'.
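The error figures in the control point table can be reproduced by hand. The following is a minimal sketch, assuming you already have the predicted and chosen target positions as lists of XY pairs; the function names and the sample numbers are hypothetical and are not taken from the Teotihuacan map.

```python
import math


def residuals(predicted, chosen):
    """Distance between each predicted target position and the one you clicked."""
    return [math.hypot(px - cx, py - cy)
            for (px, py), (cx, cy) in zip(predicted, chosen)]


def rms_error(predicted, chosen):
    """Square root of the mean of the squared residuals."""
    errors = residuals(predicted, chosen)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical positions in map units (e.g. meters).
predicted_targets = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
chosen_targets = [(13.0, 14.0), (20.0, 20.0), (30.0, 35.0)]

point_errors = residuals(predicted_targets, chosen_targets)
total_rms = rms_error(predicted_targets, chosen_targets)
```

Because the residuals are squared before averaging, a single poorly placed point raises the total disproportionately, which is why deselecting the worst point often lowers the RMS sharply.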

Try the 'Adjust' transformation, which balances global and local accuracy. Or try the 'Projective' transformation, which keeps straight lines straight. Experiment with the options and choose a transformation that is appropriate to your needs.

Georectify

Once you are satisfied with your control points and the transformation, select 'Save as New' on the 'Georeference' ribbon. This will launch the 'Export Raster' pane. Choose a name, a coordinate system, and an 'Output Format'; the TIFF format is as good as any. In the 'Settings' tab of this tool, you might want to change the resampling method. Nearest Neighbor is the simplest and usually suffices. Nevertheless, Bilinear can improve the smoothness of the imagery, which will make it better for presentation. Click 'Export' and a new layer will be added to your Contents pane. This will be your georectified map image, projected in whichever system you chose. If you skip the next step, you can delete the layer that was being georeferenced. It will end in .jpg, while the new georectified image will end in .tif.
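The difference between the two resampling methods is easier to see with numbers. This is a toy sketch on a 2 x 2 grid of pixel values, not ArcGIS Pro's implementation; the function names and the grid are hypothetical. Nearest neighbor copies the value of the closest original pixel, while bilinear blends the four surrounding pixels by distance.

```python
import math


def nearest_neighbor(grid, x, y):
    """Return the value of the pixel whose center is closest to (x, y)."""
    col = min(max(int(math.floor(x + 0.5)), 0), len(grid[0]) - 1)
    row = min(max(int(math.floor(y + 0.5)), 0), len(grid) - 1)
    return grid[row][col]


def bilinear(grid, x, y):
    """Blend the four pixels surrounding (x, y), weighted by distance."""
    x0 = min(max(int(math.floor(x)), 0), len(grid[0]) - 1)
    y0 = min(max(int(math.floor(y)), 0), len(grid) - 1)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


# Hypothetical pixel values; (x, y) is measured in pixels from the top-left pixel center.
pixels = [[0, 100],
          [100, 200]]
nn_value = nearest_neighbor(pixels, 0.25, 0.25)
bl_value = bilinear(pixels, 0.25, 0.25)
```

Nearest neighbor never invents a value that was not in the scan, so line work and text stay crisp but can look jagged; bilinear produces intermediate values, which smooths the image at the cost of slight blurring.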

Record Work

This step is optional but recommended. Most importantly, take five minutes to fill out the metadata of this newly created file. You'll have to do this step, eventually, if you upload the file to AGOL, so you might as well do it now, while the process is fresh in your mind.

You should also export the control points, using the 'Export Control Points' tool found on the 'Georeference' ribbon. This will write the control points to a small text file with four tab-separated values in each line: SOURCE_X, SOURCE_Y, TARGET_X, TARGET_Y. The units will be in the map frame's spatial reference system.

Finally, it might be helpful to create a feature class of the control points. To do this, import the exported control points text file into Excel. Add a row at the top to label the fields: SOURCE_X, SOURCE_Y, TARGET_X, TARGET_Y. Save and close the spreadsheet. In ArcGIS Pro, go to 'Add Data' and choose 'XY Point Data'. In the pane that opens, choose the spreadsheet and the fields for X (TARGET_X) and Y (TARGET_Y), and choose the coordinate system that is used in the map data frame [WGS 1984 Web Mercator (auxiliary sphere)]. In the new point feature class, you can add a text field for 'Type' and another text field for 'Notes'. You can record types as 'church', 'RAILxRoad', or other site types as deemed appropriate. You can also add notes about uncertainties regarding particular points.
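If you would rather not open a spreadsheet, the header row can be added with a few lines of Python. This is a minimal sketch that assumes the exported file holds exactly four tab-separated values per line, as described above; the function name and the sample coordinates are hypothetical. It also appends empty 'Type' and 'Notes' columns so that they are ready to fill in.

```python
import csv
import io

FIELDS = ["SOURCE_X", "SOURCE_Y", "TARGET_X", "TARGET_Y"]


def control_points_to_csv(tab_text):
    """Convert tab-separated control points to CSV text with a header row."""
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(FIELDS + ["Type", "Notes"])
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        writer.writerow(row + ["", ""])
    return output.getvalue()


# Hypothetical export contents (values invented for illustration).
exported = (
    "1024.5\t2048.0\t-11010000.0\t2240000.0\n"
    "\n"
    "512.0\t300.25\t-11012000.0\t2238000.0\n"
)
csv_text = control_points_to_csv(exported)
```

To use it on a real export, read the text file into a string, pass it to the function, and write the result to a .csv file, which can then be added through 'XY Point Data' in the same way as the spreadsheet.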

More

Are you looking for some more practice? Try some more work on your own:

Begin with PA_Scranton_462064_1950_24000_geo, a USGS 1:24,000 topographical map with a known projection system. You can even test your skills against a georeferenced version of the same image.

If you are looking for a bigger challenge, try MGRR_1855_98688704, the 1855 map of the Manassas Gap Railroad, as the railroad's directors imagined its expansion. Jason Tercha, a PhD candidate at Binghamton University, has worked with this image. He notes that the map "is a challenge since identifying features on the map to current locations can be difficult due to twentieth century suburban sprawl and the fact that part of the railway (the Loudoun Branch and the section east of Haymarket) was never constructed. However, it is not THAT much of a challenge, since nearly all of the towns on the map can be identified as current locations. The Library of Congress has a digital copy that can be downloaded here: https://www.loc.gov/item/98688704/."
