
Princeton at the Turn of the 20th Century Project
Explores methodology and workflows of spatially linking the United States historical population census data to individual houses
Introduction
The project explores the methodology and workflows for spatially linking United States historical population census data to individual houses in both an urban and a rural area. It used several maps of Princeton published around 1900: the maps were georeferenced, the locations listed on the census enumeration forms were extracted from them, and the historical census data were then linked to those locations. The project also describes the challenges and opportunities of using historical census data downloaded from the Ancestry Library database licensed by ProQuest, and the method used to correct transcription errors found in that database.
Princeton map at the turn of the 20th century
At the turn of the 20th century, Princeton was divided into two municipalities, Princeton Borough and Princeton Township. In fact, this division existed from the early 1800s until 2012, when it became one administrative unit again. Princeton Borough was where most of the businesses were located and is also the oldest part of the town; in 1900 all of its houses had numbers as well as street names. This part of Princeton represents urban America. Princeton Township was more rural; it surrounded the Borough, and many of its houses did not have house numbers in 1900 when the census was taken.
Population Census Data
The United States decennial population census records of individuals are released to the public after 72 years. This release gives researchers rich historical records about people living in a particular place at a certain time; if the data are spatially tied to individual houses, they become much richer, because the spatial distribution and patterns of the people living in the area can be explored. The scanned copies of the forms filled in by the census enumerators are made accessible to the public by the United States National Archives and Records Administration through its website. In addition, The Church of Jesus Christ of Latter-day Saints has transcribed the forms and built a spreadsheet for each form. Users can download these through the Ancestry Library Edition database, which is licensed to libraries in the United States by ProQuest, a global information-content and technology company. The interface for viewing the data is intuitive and easy to use; however, there is no easy way to download the whole spreadsheet for a place at once. To download the transcribed census records, a person has to view each form one at a time, click on the “show index” icon, which shows the transcription of the records in spreadsheet style below the scanned form, and then copy each spreadsheet record and paste it into an Excel spreadsheet. There are a total of 100 scanned forms for Princeton, and each form has 50 entries, except for a few forms at the end.
Princeton Enumeration Districts
Princeton was divided into three Enumeration Districts (EDs) in the 1900 census: two EDs covering Princeton Borough and one for Princeton Township. There is no ED boundary map to show how the census data were geographically distributed. The EDs were numbered 0053, 0054, and 0055. The census enumerators were selected from the local area, one for each ED. This can be verified from their names on the census forms and the places where they lived. The enumerators all started taking the census on June 1 but finished on different dates: ED 0053 on June 26, ED 0054 on June 18, and ED 0055 on June 14. ED 0053 had the largest population, followed by ED 0054, and ED 0055 had the smallest.
Copying Census Data
Copying the transcribed census data was time consuming, and organizing the records on a single spreadsheet required some adjustments to how the rows and columns aligned. The 1900 population census asked the following questions: name, relation to head of family, race, sex, birth month, birth year, age, marital status, years of marriage, number of children, place of birth, father’s birthplace, mother’s birthplace, immigration year, years in the United States, naturalization, occupation, months not employed, attended school, can read, can write, can speak English, house owned or rented, house owned free or mortgaged, and farm or house. The instructions given to enumerators for taking the census are explained in detail in Measuring America: The Decennial Censuses from 1790 to 2000, issued in April 2002 by the United States Census Bureau ( https://www.census.gov/prod/2002pubs/pol02-ma.pdf ). Once all the census data are on a single sheet, a researcher can analyze them and develop comprehensive socio-economic characteristics of a place.
Screen shot of AncestryLibrary.com page
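The step of stacking the copied forms into one sheet with aligned columns can be sketched in Python. This is only an illustration, not the spreadsheet procedure used in the project: the column names are a hypothetical subset of the 1900 census questions, and tagging each row with its ED and scanned image number follows the numbering described later in this article.

```python
# Hypothetical subset of columns; the real sheet has one column per census question.
COLUMNS = ["ed", "image", "line", "name", "relation", "birthplace"]

def combine_forms(forms):
    """Stack per-form rows into one table with a fixed column order.

    `forms` is a list of (ed, image_number, rows) tuples, where each row is a
    dict copied from one transcribed form. Fields missing from a row become
    empty strings so that every output row has the same columns.
    """
    combined = []
    for ed, image, rows in forms:
        for line, row in enumerate(rows, start=1):
            record = {col: "" for col in COLUMNS}
            # Keep only known columns so stray fields cannot shift the layout.
            record.update({k: v for k, v in row.items() if k in COLUMNS})
            # Tag the row so it can be traced back to its scanned form.
            record.update({"ed": ed, "image": image, "line": line})
            combined.append(record)
    return combined
```

Each output row carries the ED, image number, and line position, so a doubtful transcription can be checked against the right scanned form.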
Princeton Borough Maps
Building a historical geocoding database needs historical spatial data that have either line features with associated street names and address ranges or point or polygon features with attribute tables associated with a single address. In this project, I built a single address geocoding database. To build this kind of geocoding database, I needed a map that showed the location of each address from that period. Finding maps of Princeton Borough that show either property boundaries or building footprints with house numbers was a bit easier than finding similar maps of Princeton Township.
I managed to find three detailed maps of Princeton Borough published around 1900. The first map was published in 1898 ( https://maps.princeton.edu/catalog/princeton-j9602257x ) and shows building footprints with house numbers. The second map is part of an atlas published in 1905; it has four sheets covering Princeton Borough that show parcel boundaries, street names, house numbers, and owners’ names. The third map I used was the Sanborn map of Princeton published in 1902 ( https://catalog.princeton.edu/catalog/4595321 ). This map comes in 12 sheets and shows detailed information about houses, including house numbers. It is the best map to use for this project, but it does not cover all of Princeton Borough. All three maps have street names and house numbers, but some of the house numbers do not match the 1900 census house numbers. Sometimes the census does not provide a clear address for a person’s residence. To build the location database in those cases, I verified the addresses from other sources and created a location for each household. For example, in the census, Princeton University President Francis Patton was listed as living at 65 B Place in Princeton, although he was living in the Princeton University President’s residence, called Prospect House. None of the Catalogues of Princeton University list Prospect House’s address as 65 B Place, and I did not see any maps that showed that street name. In a situation like this, I created an address based on the house location.
I also found errors in the census where a person was recorded at one address but another official record placed the same person at a different address. An example is the Princeton University Professor of Physics, Cyrus Fogg Brackett, who was listed in the census as living at 4 Washington Road, but in the 1900 Catalogue of Princeton University was listed as living at 4 Prospect Ave., which is on the corner of Prospect Ave. and Washington Road. The closest address to Prospect Ave. on the Washington Road side is 42; there is no 4 Washington Road. The same Catalogue a few years later still listed Professor Brackett as living at 4 Prospect Ave. This shows that the census record is probably incorrect.
Princeton University Faculty Directory, 1900
Princeton Township Maps
Finding maps of Princeton Township published around 1900 that show either property boundaries or building footprints with house numbers was much more challenging. In fact, most of the households in Princeton Township that were enumerated in the 1900 census did not have house numbers. However, I found a few land ownership maps published around 1900 for Mercer County, New Jersey, where Princeton Township and Borough were located, and a few maps of Princeton that included both the Borough and the Township. These maps show the general location of a house and the name of the person who owned the property. They were very helpful in linking a person to a particular location if the person owned property. If the person was not a land owner but was renting a house or a farm, it was difficult to tie the household to a location. However, the census form has a column called “Number of dwelling in order of visitation”. This house visitation number was very helpful in approximating the location of a household relative to a property. To approximate the location of households, I also used a 1920 tax map, the oldest tax map in the Princeton Municipal office, which I acquired for this project, and the first aerial photography covering all of New Jersey, taken in the 1930s. The georeferenced aerial photography helped to identify possible locations of households based on street names and dwelling visitation numbers. The 1920 tax map helped me verify some households that were renting in 1900 but were listed on the tax map as property owners along the same roads where the census had listed them as renters. On a few occasions I had to use 1920 census data to verify a person’s location that was not clearly listed in the 1900 census.
Princeton map from "Map of Mercer County, New Jersey, Philadelphia, Penna. Irving C. Hicks, publisher, 1903"
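The way dwelling visitation numbers narrow down a renter's likely location can be illustrated with a small Python sketch. It is a simplification and not the procedure used in the project, which relied on manual checks against the tax map and aerial photography. The field names and parcel labels are hypothetical, and the sketch assumes the enumerator visited neighboring houses in sequence.

```python
def nearest_verified(households):
    """Suggest an anchor location for each household that has none.

    `households` is a list of dicts with "dwelling" (the visitation number)
    and "location" (a label verified on a map, or None).
    Returns {dwelling: (anchor_location, gap)} for unplaced households, where
    `gap` is the difference in visitation numbers to the anchor household.
    """
    ordered = sorted(households, key=lambda h: h["dwelling"])
    verified = [h for h in ordered if h["location"] is not None]
    suggestions = {}
    for h in ordered:
        if h["location"] is None and verified:
            # Closest verified household in visitation order (assumed nearby).
            anchor = min(verified, key=lambda v: abs(v["dwelling"] - h["dwelling"]))
            gap = abs(anchor["dwelling"] - h["dwelling"])
            suggestions[h["dwelling"]] = (anchor["location"], gap)
    return suggestions
```

A small gap only suggests where to look on the map; locations derived this way should still be marked as approximate rather than verified.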
Building Historical Geocoding Database
Once all the maps were acquired, scanned, and georeferenced, I built a historical single-house address location database rather than a street address-range database. Many house numbers have changed and many streets have been built or removed since 1900, so verifying each address and rebuilding the address ranges for every street might take more or less the same amount of time as building a single-house address database. For example, Woodrow Wilson, who was a Professor at Princeton University before he became the President of the United States, was listed at 50 Library Place in 1900, but the same house in the same place is now 82 Library Place. Likewise, the well-known African-American personality Paul Leroy Robeson, who was born in Princeton in 1898, was listed in the 1900 census as living at 72 Witherspoon Street, but the same house is now 108 Witherspoon Street. For these reasons, I decided to build a geographic database of individual households by extracting a point for each location. Once that decision was made, I went through each street name and house number listed in the census and checked whether it existed on the maps. If it did, I extracted the point and recorded the street name and house number. I did not extract every house number displayed on the georeferenced maps, because some house numbers change from one map to another, probably because new houses were built and numbers had to be added or changed. I wanted to make sure I was extracting the house numbers that appear in the census; extracting every house number shown on the maps might have spatially linked a household to the wrong location.
While creating the geographic database, I found many transcription errors in street names, house numbers, people’s names, occupation labels, etc. For example, Trenton Turnpike was transcribed as Turton Tumeric, Alexander was written AlenAuller, Lawrence Road was written Launi Road, 20 Baker Street was labelled 30 Boker Street, Silvester was Siliester, etc. Almost all of them could be verified and corrected by using the maps that show property owners’ names, house numbers, and street names, and sometimes by going through the house visitation numbers. Most of the house visitation numbers are in sequential order, which helps to verify how the enumerator might have walked from one house to the next. Checking each census household record is very time consuming but also very helpful in correcting many of the transcription errors. To make this checking possible, all the spreadsheet records copied from the enumeration forms were numbered by scanned image and ED number. Since the scanned enumeration forms were organized by ED number, having the image and ED numbers next to each record let me find the scanned form and compare it with the transcription. Without those numbers on the copied spreadsheet, it would have taken a lot of time to find the right scanned form to verify and correct a transcription error. Cleaning the data took a lot of time. The data are a bit messy; however, using them is better than transcribing everything from the census forms myself. Creating the single address database made it apparent that I had to create two separate databases: one with house numbers and street names, and the other with street and household names only. It is also important to distinguish between locations that were verified on a map and locations that were approximated from visitation numbers and other sources. Every census record has a note differentiating those two kinds of location. This will help researchers understand the spatial accuracy of the historical GIS database.
Spreadsheet records with image sheet and ED numbers
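Part of the street-name checking described above can be supported by approximate string matching. The sketch below uses Python's standard `difflib` module to suggest the closest street name from a list read off the georeferenced maps. It is an illustration, not the method used in the project, where corrections were verified by hand; the street list and the 0.5 similarity cutoff are assumptions.

```python
import difflib

# Street names as read from the maps (examples taken from the text above).
MAP_STREETS = ["Trenton Turnpike", "Alexander", "Lawrence Road", "Baker Street"]

def suggest_street(transcribed, streets=MAP_STREETS, cutoff=0.5):
    """Return the map street name most similar to a transcribed one, or None.

    The cutoff is an assumed threshold; a lower value yields more suggestions
    but also more false matches.
    """
    lookup = {s.lower(): s for s in streets}
    hits = difflib.get_close_matches(
        transcribed.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None

for name in ["Launi Road", "Boker Street"]:
    print(name, "->", suggest_street(name))
```

A suggestion is only a candidate: it still has to be confirmed against the scanned enumeration form, using the image and ED numbers attached to the record.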
For the geographic data that had house numbers and street names, I created a geocoding database. For the geographic data that had only street and household names, I did not create geocoding address locators, because none of those locations had house numbers; instead, I joined them directly to the census data. To make the join possible, I created a unique ID for each household in the census data and assigned the same unique ID to the location where that household lived. The census data were then joined to the location data using the unique ID as the common field.
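The join on a shared household ID can be sketched in a few lines of Python. This is a minimal illustration of the idea rather than the GIS join used in the project; the field names (`household_id`, `x`, `y`, `accuracy`) are hypothetical.

```python
def join_census_to_locations(census_rows, locations):
    """Attach location fields to each census row via a shared household ID.

    Returns (joined, unmatched): rows with coordinates added, and rows whose
    household ID has no location record yet.
    """
    by_id = {loc["household_id"]: loc for loc in locations}
    joined, unmatched = [], []
    for row in census_rows:
        loc = by_id.get(row["household_id"])
        if loc is None:
            unmatched.append(row)
        else:
            # "accuracy" carries the verified/approximate note for the location.
            joined.append(
                {**row, "x": loc["x"], "y": loc["y"], "accuracy": loc["accuracy"]}
            )
    return joined, unmatched
```

Because several people share one household, this is a many-to-one join: every person in a household receives the same point. The list of unmatched rows is a quick check for households that still need a location.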
Analyzing the Data
The population of Princeton at the turn of the 20th century was 4,854. A century later, in the 2000 census, the population had grown more than sixfold (30,230). In 1900, 80.33% of the population lived in Princeton Borough (3,900) and 19.67% lived in Princeton Township (955). However, a century later, 55.15% of Princetonians were living in Princeton Township and only 44.85% were living in Princeton Borough. Princeton also saw a shift in the percentage of Black people living in the town. In 1900, 20.60% (1,000) of the population were defined as Black, whereas a century later the share had fallen to 5.82% (1,760). However, the percentage of the White population living in Princeton remained more or less the same: in 1900 it was 79.27% (3,848), whereas in 2000 it was 80% (24,206). Princeton attracted a large immigrant population in 1900, mostly from European countries (612); however, there were also people who had immigrated from Canada, the Caribbean Islands, South America, and Asia (35). The share of the United States born population was 86.48% (4,156) and the immigrant population was 13.56% (647).
Immigrant Population from European Continent
Immigrant Population from non-European Continent
The immigrant population was distributed all over Princeton Borough and Township and there was no clustering of immigrants from one particular country in any part of Princeton.
Red dots are U.S. born and blue dots are foreign born
However, most of the Black population was living in the northern part of Princeton Borough. There were a few Black households who lived outside the Black neighborhood.
Distribution of Black households in Princeton, 1900
The largest number of immigrants came from Ireland, followed by Germany, then England, and then Scotland. The population born in the United States came from 38 different states including the District of Columbia and New Jersey. Most of these people were born in New Jersey (3,044), followed by Pennsylvania (257), New York (245), Virginia (215), and North Carolina (109). The rest came from other states.
Princeton 1900 Census Location
There were 1,100 households living in Princeton in 1900, and the size of the households varied from 1 to 14 people. For example, Mr. Philip Golden, who was born in New Jersey to parents who came from Ireland, was living at 94 Bayard Lane, and 14 people were living at this residence: Mr. Golden, his wife, 5 daughters, 3 sons, and 4 boarders. The household count was created by summarizing the family numbers listed on the enumeration forms.
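Summarizing family numbers into a household count and household sizes can be sketched as follows. This is a hedged illustration, not the spreadsheet summary used in the project: the field names are hypothetical, and the sketch assumes that family numbers are unique only within an ED, so the ED number is part of the key.

```python
from collections import Counter

def household_sizes(rows):
    """Count persons per household, keyed by (ED, family number).

    Assumes each row is one person and that family numbers may repeat
    across EDs, which is why the ED is included in the key.
    """
    return Counter((r["ed"], r["family"]) for r in rows)

# Made-up rows for illustration only.
rows = [
    {"ed": "0053", "family": 1},
    {"ed": "0053", "family": 1},
    {"ed": "0053", "family": 2},
    {"ed": "0054", "family": 1},
]
sizes = household_sizes(rows)
print(len(sizes), min(sizes.values()), max(sizes.values()))
```

The number of keys gives the household count, and the smallest and largest values give the range of household sizes.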
Geocoding Historical Addresses
The historical geographic data I created helped me to build a geocoding database that can be used to geocode any historical Princeton address from 1900. To test this methodology, we scanned the Princeton Business Directory published in 1900 from microfilm and tried to use OCR to convert the scanned directory to text, but it did not work well because of the quality of the microfilm and the advertisements printed in the directory. We therefore transcribed the parts of the directory that OCR could not read properly and created a spreadsheet file. This spreadsheet was geocoded using the historical geographic data created for the 1900 census; all the listings in the directory that had addresses were matched and geocoded to their locations.
Princeton Business Directory, 1900
Princeton 1900 Business Directory Geocoded
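The matching of directory listings to single-address points can be sketched in plain Python. This is a simplified stand-in for the geocoding step, not the GIS tool used in the project: it matches on exact addresses after light normalization, and the business names and coordinates are placeholders, not real data.

```python
import re

def normalize(address):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", address.lower())
    return " ".join(cleaned.split())

def geocode(listings, address_points):
    """Match (name, address) listings to points by normalized address.

    `address_points` maps a historical address to an (x, y) pair.
    Returns (matched, unmatched): {name: (x, y)} and a list of names.
    """
    index = {normalize(a): xy for a, xy in address_points.items()}
    matched, unmatched = {}, []
    for name, address in listings:
        xy = index.get(normalize(address))
        if xy is None:
            unmatched.append(name)
        else:
            matched[name] = xy
    return matched, unmatched

# Placeholder coordinates and hypothetical listings, for illustration only.
points = {"50 Library Place": (0.0, 0.0), "72 Witherspoon Street": (1.0, 1.0)}
listings = [("Example Grocer", "72 Witherspoon Street."), ("Example Tailor", "9 Unknown Lane")]
matched, unmatched = geocode(listings, points)
```

Listings that end up in `unmatched` are the ones to check by hand, for example for a house number that appears in the directory but not in the census.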
The geocoded data could be used together with the census data to fill gaps in either the census or the business directory.