COVID-19 in Toronto Neighbourhoods
Throughout this presentation, various factors will be analyzed for their potential impact on the number of COVID-19 cases by neighbourhood.
Many others have studied socioeconomic factors and their effects on COVID-19 case rates. One article that caught my attention and sparked ideas was "Socio-economic status and COVID-19–related cases and fatalities" by R.B. Hawkins, E.J. Charles and J.H. Mehaffey. They discuss factors such as education, the proportion of Black residents, and income. This is why I decided to dive further into this topic and see whether the Toronto data aligns with their findings.
Hawkins, R., Charles, E., & Mehaffey, J. (2020). Socio-economic status and COVID-19–related cases and fatalities. Public Health, 189, 129-134. doi: 10.1016/j.puhe.2020.09.016
This analysis relies on four data sets:
COVID Cases by Neighbourhood as of the cutoff (extraction) date (December 3, 2020)
City of Toronto Neighbourhood boundaries
2020 Demographic data from Enrichment in ArcGIS Online
Hospitals
First, we must establish the boundaries within which the analysis will be conducted. This map shows the neighbourhood boundaries in Toronto, Ontario (extracted from the City of Toronto open data site).
I published the neighbourhood boundaries, downloaded from the City of Toronto open data site, as a web layer in ArcGIS Online.
I then downloaded the spreadsheet of COVID-19 cases and rates by neighbourhood for Toronto, Ontario, and published it as a hosted table in ArcGIS Online.
Using GeoEnrichment (the Enrich Layer tool), I obtained the average household income and average household size for each neighbourhood.
The Join Features tool was used to link the attributes of the enriched Toronto neighbourhood boundaries with the table containing COVID-19 case data. The end result is a layer that contains the neighbourhood name, COVID-19 case information, income and household size for each neighbourhood. All COVID-19 data is current to December 3, 2020.
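The join itself was done with the Join Features tool in ArcGIS Online. As a rough illustration of what an attribute join on neighbourhood name does, here is a minimal Python sketch. The field names (`neighbourhood`, `case_count`, and so on) and the sample records are hypothetical and are not taken from the actual layers.

```python
# Hypothetical sample records; names and numbers are illustrative only.
enriched_boundaries = [
    {"neighbourhood": "A", "avg_household_income": 95000, "avg_household_size": 2.4},
    {"neighbourhood": "B", "avg_household_income": 72000, "avg_household_size": 3.1},
]
covid_table = [
    {"neighbourhood": "A", "case_count": 120, "rate_per_100k": 850.0},
    {"neighbourhood": "B", "case_count": 310, "rate_per_100k": 2100.0},
]


def join_on_name(boundaries, cases, key="neighbourhood"):
    """Attach case fields to each boundary record that shares the same key value."""
    cases_by_key = {row[key]: row for row in cases}
    joined = []
    for feature in boundaries:
        match = cases_by_key.get(feature[key])
        if match is None:
            continue  # keep only matched neighbourhoods (an inner join)
        joined.append({**feature, **match})
    return joined


joined_layer = join_on_name(enriched_boundaries, covid_table)
for record in joined_layer:
    print(record)
```

Each joined record then carries the neighbourhood name, the case fields, income and household size together, which is the shape of the layer described above.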
A series of maps was produced in ArcGIS Online to visualize the prepared data.
The number of cases on its own does not tell much of a story. It gives no indication of how many people live in each neighbourhood, or of any other factors that may have affected the number of cases.
This is why it is crucial to look at other factors.
This map shows the raw case numbers (purple polygons) and the total population in each neighbourhood (green circles). Although all factors are important, it is intuitive that the higher the population, the more cases there will be.
We can therefore look at the number of cases in a more telling way. This map shows the number of COVID-19 cases per 100,000 people in each neighbourhood.
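The rate used here is the standard normalization of a raw count by population. The sketch below shows the arithmetic; the function name and the sample figures are hypothetical and are not values from the Toronto data.

```python
def cases_per_100k(case_count, population):
    """Return the number of cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return case_count * 100_000 / population


# Two hypothetical neighbourhoods with the same raw case count
print(cases_per_100k(150, 20_000))  # 750.0
print(cases_per_100k(150, 5_000))   # 3000.0
```

The same 150 cases give a rate four times higher in the smaller neighbourhood, which is why the rate is more telling than the raw count.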
This map shows the average annual household income in each Toronto neighbourhood. The larger the circle, the higher the average household income.
This map shows the average annual household income together with the COVID-19 case rate in each neighbourhood.
One of the first observations we can now make is that the higher the income, the lower the case rate in that neighbourhood tends to be. The larger circles represent higher income, and for the most part they sit on top of light pink neighbourhoods, which represent lower case rates.
Another factor that we can look at is average number of people per household. In this map, the larger the house icon, on average the more people per household in that neighbourhood.
When the average household size is compared with the COVID-19 cases per 100,000 people, it can be seen that in many neighbourhoods, the larger the average household size, the higher the case rate. This is especially true in the northeast and northwest corners of the map.
Another interesting way of viewing this relationship is to categorize the data using a 3x3 grid. Each neighbourhood is assigned a colour based on where it falls in that grid. The grid is drawn as a diamond. The bottom square holds neighbourhoods with a low household size and a low case rate, and the top square holds neighbourhoods where both are high. The far-left square marks a high case rate but a low average household size, and the far-right square marks the reverse. The remaining squares represent less extreme combinations of high and low values.
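The 3x3 grid can be thought of as two separate three-class rankings, one per variable, combined into a pair. The sketch below assumes each variable is split into terciles (three equal-count classes). The class breaks actually used by the map are not stated here, so the tercile split, the function names and the sample values are all assumptions.

```python
LABELS = ("low", "medium", "high")


def tercile_class(value, values):
    """Return 0, 1 or 2 depending on which third of the sorted values contains value."""
    ordered = sorted(values)
    n = len(ordered)
    lower, upper = ordered[n // 3], ordered[2 * n // 3]
    if value < lower:
        return 0
    if value < upper:
        return 1
    return 2


def bivariate_class(size, rate, sizes, rates):
    """Return the (household size, case rate) class pair for one neighbourhood."""
    return (LABELS[tercile_class(size, sizes)], LABELS[tercile_class(rate, rates)])


# Hypothetical values for six neighbourhoods
sizes = [1.8, 2.0, 2.4, 2.6, 3.0, 3.3]
rates = [400, 2500, 900, 1200, 700, 2800]

for size, rate in zip(sizes, rates):
    print(size, rate, bivariate_class(size, rate, sizes, rates))
```

In terms of the diamond, `("low", "low")` is the bottom square, `("high", "high")` the top, `("low", "high")` the far left and `("high", "low")` the far right.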
The map below compares the two evaluation methods.
Although the visualizations above are very telling, the conclusions so far rest mostly on visual observation. To quantify these observations and test the hypothesis, further analysis was carried out in ArcMap. The image on the right shows a model of all the steps taken to produce the maps that follow.
In the following work, hospitals have been added as a factor, in the hope of providing more context on why cases are occurring in specific locations.
The purpose of this analysis was to see whether socioeconomic factors and factors related to health care can predict where COVID-19 case rate hotspots will occur.
Using a query builder, the hospitals were separated out of the Toronto health care data, which maps all types of health care locations. Next, a Euclidean distance surface (150 m cells) was computed, in which each cell holds the distance to the nearest hospital. The Toronto neighbourhoods were used as a mask. The surface was then symbolized in 500 m intervals up to 5,000 m (5 km) and beyond. Next, I reclassified the distance values on a scale from 1 to 11, where 1 is closest to a hospital (within 500 m) and 11 is furthest (over 5 km). Lastly, I converted the distance rings to polygons for display in ArcGIS Online.
Hospital Euclidean Distance Surface
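The distance surface and its reclassification were produced with ArcMap tools. The pure-Python sketch below only illustrates the idea on a tiny grid: the distance from each 150 m cell centre to the nearest hospital, followed by 500 m rings mapped to classes 1 to 11. The hospital coordinate and grid size are hypothetical, and the neighbourhood mask is not modelled.

```python
import math

CELL_SIZE_M = 150   # cell size stated in the text
RING_WIDTH_M = 500  # ring interval stated in the text
MAX_CLASS = 11      # class 11 covers everything beyond 5,000 m


def nearest_hospital_distance(x, y, hospitals):
    """Straight-line distance in metres from (x, y) to the closest hospital."""
    return min(math.hypot(x - hx, y - hy) for hx, hy in hospitals)


def distance_class(distance_m):
    """Map a distance to 1 (within 500 m) through 11 (over 5,000 m)."""
    return min(MAX_CLASS, max(1, math.ceil(distance_m / RING_WIDTH_M)))


def distance_surface(n_rows, n_cols, hospitals):
    """Distance to the nearest hospital, measured from each cell centre."""
    surface = []
    for row in range(n_rows):
        y = (row + 0.5) * CELL_SIZE_M
        surface.append([
            nearest_hospital_distance((col + 0.5) * CELL_SIZE_M, y, hospitals)
            for col in range(n_cols)
        ])
    return surface


# One hypothetical hospital at the centre of the first cell (projected metres)
hospitals = [(75.0, 75.0)]
surface = distance_surface(4, 40, hospitals)
classes = [[distance_class(d) for d in row] for row in surface]
print(classes[0])
```

Along the first row, the classes step up by one roughly every three to four cells, reaching 11 once a cell is more than 5 km from the hospital.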
I then took the neighbourhood polygons, which had been enriched with average household income and average household size. These values came from Enrichment in ArcGIS Online, so the data are estimates from Environics Analytics. Next, the neighbourhood polygons were converted to a raster (150 m cell size) and then reclassified using the same 1–11 scale. This time, 1 is high income ($411,544 to $781,904) and 11 is low income ($60,973 to $75,109).
Rasterized average household income by neighbourhood
Step two was essentially repeated, but using average household size instead of average household income. The raster surface (150 m cells) was reclassified from 1 to 11, where 1 is a small household size (1.6 people per household) and 11 is a large household size (an average of 3.3 to 3.5 people).
Rasterized average household size by neighbourhood
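Both reclassifications map a continuous value onto the 1 to 11 scale, with income reversed so that high income scores 1. The sketch below shows one way to express this with a list of class upper bounds. Only the lowest and highest income classes are given in the text, so the nine intermediate bounds here are interpolated placeholders, not the breaks used in the analysis. The function and variable names are hypothetical.

```python
from bisect import bisect_left


def reclassify(value, upper_bounds, reverse=False):
    """Return a class from 1 to len(upper_bounds).

    upper_bounds must be sorted ascending; each entry is the top of one class.
    With reverse=True the largest values receive class 1.
    """
    index = min(bisect_left(upper_bounds, value), len(upper_bounds) - 1)
    rank = index + 1
    return len(upper_bounds) - rank + 1 if reverse else rank


# Known from the text: the lowest class tops out at $75,109 and the highest
# class runs from $411,544 to $781,904. Everything in between is a PLACEHOLDER.
lowest_top, highest_bottom, highest_top = 75_109, 411_544, 781_904
step = (highest_bottom - lowest_top) / 9
income_bounds = [lowest_top + step * i for i in range(10)] + [highest_top]

print(reclassify(70_000, income_bounds, reverse=True))   # low income
print(reclassify(500_000, income_bounds, reverse=True))  # high income
```

For household size the same function would be called without `reverse`, so that the smallest households score 1 and the largest score 11.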
This step combined the three previous steps using a weighted overlay to create a COVID-19 risk score for each cell. The result is a weighted risk surface in which red areas are at high risk of COVID-19 infection and green areas are at low risk. The weightings used were 10% hospital distance, 40% income and 50% household size.
Results of weighted overlay - Raster COVID-19 risk surface
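A weighted overlay is, per cell, a weighted sum of the reclassified scores. The sketch below applies the stated weights (10% hospital distance, 40% income, 50% household size) to 1 to 11 scores. The function name, dictionary keys and sample scores are hypothetical, and the sketch does not reproduce any rounding the ArcMap tool may apply.

```python
# Weights in percent, as stated in the text
WEIGHTS_PCT = {"hospital_distance": 10, "income": 40, "household_size": 50}


def weighted_risk(scores):
    """Combine 1-11 class scores for one cell into a single risk score."""
    return sum(WEIGHTS_PCT[name] * scores[name] for name in WEIGHTS_PCT) / 100


# Hypothetical cells
low_risk_cell = {"hospital_distance": 3, "income": 2, "household_size": 4}
high_risk_cell = {"hospital_distance": 11, "income": 11, "household_size": 11}

print(weighted_risk(low_risk_cell))   # 3.1
print(weighted_risk(high_risk_cell))  # 11.0
```

Because household size and income together carry 90% of the weight, hospital distance shifts the score only slightly, which matches the emphasis of the weighting described above.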
This map looks very similar to the map of COVID-19 cases per 100,000 people (the darker the pink, the higher the case rate). A swipe comparison tool is provided below.
Vector Neighbourhoods symbolized by cases per 100,000 people
This map (showing data published from ArcMap) represents proximity to the hospitals located in Toronto, Ontario. It is a vector representation (polygons) of the Euclidean distance surface.
This map, published from ArcMap, represents the weighted risk surface. Dark red represents the highest-risk areas and light orange represents the lowest-risk areas. (This map is not divided by neighbourhood.) The polygons were converted from the computed raster risk surface for publication in ArcGIS Online.
As mentioned above, with the exception of a few outliers, the two maps generally have their darkest values in the same neighbourhoods. This is especially true in the northwest corner of the city, which suggests that income and household size help explain the high case rate observed in that area.