Interpolation

Finding Suitable Areas to Grow Corn in Utah, USA

Madeline Mulder

November 27, 2022

This StoryMap was created for the Methods in Spatial Analysis Class at the Paris Lodron University of Salzburg as part of the Copernicus in Digital Earth Masters Program.

Assignment

This assignment focused on interpolation. As part of the assignment, we were asked to consider the following tasks:

Create a temperature map for one European country in ArcGIS Online using the data sets introduced in the lab, carefully choosing presentation details like classification and symbols. Discuss your decisions and choices in the map description before you share the map.
In ArcGIS Pro, use the GSOD (or any other) data set and create two IDW interpolation results with different power value settings. Compare the two results (including quantitative statistics) and discuss the different outcomes.
Briefly discuss how you suggest approaching quality assessment of interpolation results. For example, if you create an IDW output based on 12 vs. 18 closest sample points, how can you judge one to be the preferable option?

Scenario

I decided to find my own data for my recreation of this assignment and came up with the following scenario to guide my analysis:

The United States is the world's largest producer of corn. As such, the United States Department of Agriculture (USDA) is looking to maximize the corn harvest within each state. For this project, they would like to focus on the state of Utah in the southwestern United States. Corn is one of Utah's top three most widely produced crops, with an annual production value of more than 20 million US dollars. However, despite this impressive turnout, Utah ranks only 41st out of 50 states in corn production. The USDA wants to know if it is possible to increase corn production in Utah by planting corn earlier in the season. The growing season for corn begins in the early summer months, usually around June, but if the conditions are right, it might be possible to plant seeds a month earlier, in May. The USDA needs to know if there is sufficient rainfall and if the temperature is warm enough in May to support a good corn harvest in Utah.

Background

Utah boasts a wide variety of climatic zones, which leads to great variation across the state in average temperature and rainfall. As you can see in the map below, Utah's climates range from arid cold temperate and semiarid warm continental to humid boreal, and there are even some alpine areas.

United States Climate Zones. Utah is highlighted in red.

The ideal conditions for growing corn are a temperature between 60°F and 77°F (roughly 15°C to 25°C) and between 4 and 8 inches of rain a month. Thus, to identify whether and where it is possible to grow corn in Utah in May, we need to find the areas that allow for crop cultivation (croplands) and that also have an average May temperature above 60°F and an average May rainfall above 4 inches.

Data

To conduct the analysis, we will use a dataset of weather stations with monthly temperature and precipitation averages for the United States between 1981 and 2010. This will allow us to identify which areas of Utah meet our ideal climatic conditions for growing corn. We will also incorporate cropland data from the National Land Cover Database (NLCD) to identify possible areas for growing corn. Finally, as a quality assessment, we will include plant hardiness data from the USDA to assess and verify our result.

The weather data comes as point data as each average temperature and precipitation measurement was tied to a single station. However, this analysis requires us to know the temperature at any location across the state of Utah. So we must create this data and to do so we must turn to interpolation.

US Climate Data

Methods & Results

Defining Our Study Area

Not all of the weather stations collected both temperature and precipitation data, and stations missing either variable cannot be used in our analysis. Thus, before interpolating the weather measurements themselves, we first need to derive our study area from the stations that have both temperature and precipitation data. To do this, we turn to Thiessen polygons. Thiessen polygons assign every location to its nearest station, so each station sits inside its own polygon. Once the polygons are created, we select only those polygons whose station has non-null precipitation values, and these polygons together form our true study area.
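The nearest-station logic behind Thiessen polygons can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ArcGIS Pro tool used in the project: the station coordinates, the precipitation values, and the helper name nearest_station are all hypothetical, and the polygons are represented implicitly by labelling sample locations with their closest station.

```python
import numpy as np

def nearest_station(points, stations):
    """Return, for each point, the index of the closest station (Euclidean distance)."""
    points = np.asarray(points, dtype=float)
    stations = np.asarray(stations, dtype=float)
    # Pairwise distances with shape (n_points, n_stations).
    d = np.linalg.norm(points[:, None, :] - stations[None, :, :], axis=2)
    return d.argmin(axis=1)

# Hypothetical stations in projected coordinates (km).
stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
# Hypothetical May precipitation (inches); NaN marks a station without precipitation data.
precip = np.array([1.2, np.nan, 0.8])

# A few hypothetical locations to label with their Thiessen "owner".
points = np.array([[1.0, 1.0], [9.0, 1.0], [1.0, 9.0], [8.0, 3.0]])
owner = nearest_station(points, stations)

# A location belongs to the study area only if its owning station has precipitation data.
in_study_area = ~np.isnan(precip[owner])
print(owner, in_study_area)
```

Locations owned by the station without precipitation data drop out of the study area, which mirrors selecting only the polygons with non-null precipitation values.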

There are 335 weather stations in Utah but only 280 of them collect both temperature and precipitation data.

After defining our study area, we can use the resulting data to show the average amount of precipitation in May for each of the weather stations that collect both temperature and precipitation data. This will help us get an idea of which locations experience the most rainfall and are thus best suited to growing corn.

Now that we have defined our study area and explored the data a little, we need to transform our point data into a continuous raster for each weather variable (temperature and precipitation) through interpolation. This lets us estimate the temperature and precipitation at every location in the state, not just at the weather station locations.

Temperature

Power: 2

We conduct Inverse Distance Weighting (IDW) interpolation to create a temperature raster for all of Utah. For the first attempt, we use the average May temperature attribute in the weather station layer with the default parameters: a power of 2, a variable search radius, and the 12 closest sample points to calculate each pixel's temperature value.
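To make the parameters concrete, here is a minimal NumPy sketch of IDW with a power and a fixed number of nearest sample points. It is not the ArcGIS Pro implementation: the function name idw, the toy station coordinates, and the temperatures are hypothetical, and the sketch ignores options such as barriers or a maximum search distance.

```python
import numpy as np

def idw(stations, values, targets, power=2.0, k=12):
    """Estimate a value at each target from its k nearest stations,
    weighting each station by 1 / distance**power."""
    stations = np.asarray(stations, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.asarray(targets, dtype=float)
    k = min(k, len(stations))
    # Distance from every target to every station: shape (n_targets, n_stations).
    d = np.linalg.norm(targets[:, None, :] - stations[None, :, :], axis=2)
    # Indices, distances, and values of the k closest stations per target.
    nearest = np.argsort(d, axis=1)[:, :k]
    d_k = np.take_along_axis(d, nearest, axis=1)
    v_k = values[nearest]
    # Clamp tiny distances so a target sitting on a station does not divide by zero.
    w = 1.0 / np.maximum(d_k, 1e-12) ** power
    return (w * v_k).sum(axis=1) / w.sum(axis=1)

# Hypothetical stations (km) and average May temperatures (°F).
stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
temps = np.array([52.0, 58.0, 64.0, 70.0])
targets = np.array([[5.0, 5.0], [0.0, 0.0], [1.0, 1.0]])

estimates = idw(stations, temps, targets, power=2.0, k=12)
print(estimates)
```

A target that is equally far from all stations receives their plain average, while a target located on a station essentially reproduces that station's value. Every estimate stays between the smallest and largest station value in its neighborhood.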

For visualization purposes, we classify our data using geometric intervals, which create classes of nearly equal size with a consistent frequency of observations per class so that the classes can be compared fairly. We use 4 classes, which works out to a difference of about 9 degrees Fahrenheit between classes. With fewer classes and more distinction between them, we are better able to visually identify areas that might fit our needs.

Using the default parameters with a power of 2 for this interpolation with this visualization shows that there are higher temperatures in the southern portion of the state, indicating that is where we should focus our corn growing efforts.

Power: 0.5

However, when we change the power for the interpolation, the results change. Here we decreased the power from 2 to 0.5, which gave more weight to the more distant points within the 12-point neighborhood used for each output cell.

With this decrease in power, there seem to be fewer areas of higher temperature and generally lower temperatures across the state. In fact, the only part of the state identified as having high enough temperatures to support corn production was the southeast.

Power: 5

On the other hand, when we increased the power from 2 to 5, the points closest to each output cell within the 12-point neighborhood were given more weight.

With this increase in power, we can see overall there is a general increase in temperature across the state with additional areas of higher temperature being identified.
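The effect of the power setting can be illustrated by looking at the normalized weights that three sample points would receive. The distances below are hypothetical and the helper idw_weights is only a sketch of the weighting formula (weight proportional to 1 / distance^power), not output from the project data.

```python
import numpy as np

def idw_weights(distances, power):
    """Normalized IDW weights (they sum to 1) for the given distances."""
    w = 1.0 / np.asarray(distances, dtype=float) ** power
    return w / w.sum()

# Hypothetical distances (km) from one output cell to three sample points.
distances = [10.0, 20.0, 40.0]

weights = {p: idw_weights(distances, p) for p in (0.5, 2.0, 5.0)}
for p, w in weights.items():
    print(f"power={p}: {np.round(w, 3)}")
```

With a power of 0.5 the three points contribute fairly evenly, so each cell drifts toward the average of its neighborhood. With a power of 5 the nearest point dominates, so the surface follows individual stations much more closely.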

Precipitation

Power: 2

We can repeat the same IDW process with the precipitation values to create a precipitation raster for all of Utah. For the first attempt, we use the average May precipitation attribute in the weather station layer. Again, we use the default parameters: a power of 2, a variable search radius, and the 12 closest sample points to calculate each pixel's precipitation value.

Using geometric intervals to classify our data, we can see that the northern portion of the state receives the most rainfall, with less and less rainfall as you travel south.

The interpolation with a power of 2 produced a rather nuanced precipitation map. There is a lot of local variation: large swaths of one precipitation class are interrupted by small patches with different precipitation values. It is obvious that the northern portion of the state receives more rainfall, but we can see that there are many patches with less rainfall scattered throughout the state.

Power: 0.5

When we decrease the power so that the more distant points in each neighborhood weigh more, we lose a lot of the nuance in the precipitation pattern seen with a power of 2. Now, with a lower power, we see four distinct precipitation zones with progressively less rainfall as we go south. There are very few, if any, patches of different precipitation values within the zones like those we saw when the power was 2.

Power: 5

Increasing the power alters the outcome significantly as well. This time, the large zones of each precipitation class are once again broken up by patches with different precipitation values. However, these patches are clustered in the middle of the state, and they are fewer and larger than in the power 2 outcome. The general trend of more rainfall in the north still holds, but the transition between classes is much smoother than in the other outputs, and much less area falls into the second class (0.75 - 0.83 in.) than in the other outputs.
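The visual comparison above can be backed up with simple summary statistics of two interpolated rasters. The sketch below assumes the two outputs have been exported as arrays on the same grid. The function compare_rasters and the small arrays are hypothetical and do not contain the actual Utah results.

```python
import numpy as np

def compare_rasters(a, b):
    """Summary statistics for two rasters on the same grid, ignoring NaN (NoData) cells."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = np.abs(a - b)
    return {
        "mean_a": float(np.nanmean(a)),
        "mean_b": float(np.nanmean(b)),
        "std_a": float(np.nanstd(a)),
        "std_b": float(np.nanstd(b)),
        "mean_abs_diff": float(np.nanmean(diff)),
        "max_abs_diff": float(np.nanmax(diff)),
    }

# Hypothetical May precipitation rasters (inches) from two different power settings.
power_2 = np.array([[0.7, 0.8], [0.9, 1.0]])
power_5 = np.array([[0.8, 0.8], [0.9, 0.9]])

stats = compare_rasters(power_2, power_5)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A lower standard deviation points to a smoother surface, and the mean and maximum absolute differences show how far apart the two outputs are cell by cell.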

Before moving on to the last step of the suitability project (combining the rasters to find areas of overlap), it is important to take a moment to evaluate the effectiveness of the parameters we have chosen. To do this, we compare the outcomes of the temperature and precipitation interpolations when we keep the power the same but change the neighborhood size.

Quality Assurance Check

There are 280 stations in Utah that collect both temperature and precipitation data and many of them are very close to each other. To see how changing the number of sample points impacts the outcome we drop the number of sample points in the neighborhood for the weather stations from 12 to 5 but keep all the other variables the same and use a power of 2. Given that many stations are clustered together, a smaller neighborhood may produce more accurate results.

For both the precipitation and temperature examples, the image on the left shows the output for a neighborhood of 12, whereas the image on the right shows the output with a neighborhood of 5.

But which neighborhood produced more accurate results, and how can we judge that? The USDA releases data on plant hardiness zones in the United States, which are based on the average annual extreme minimum winter temperature for any given area. The plant hardiness zones reflect an annual temperature statistic and not the monthly temperature, but we can use them as a benchmark to compare the spatial patterns produced with the two neighborhood sizes and see which neighborhood most closely follows the overall trend. However, given that the plant hardiness zones only deal with temperature, we can only compare our temperature interpolation results to them, not the precipitation results.

On the left: temperature interpolation with a neighborhood of 5. On the right: USDA plant hardiness zones.

It turns out that, in this case, the smaller neighborhood produces a more nuanced temperature map that best reflects the plant hardiness zones. Given this outcome, we will build our final suitability model with the interpolation outputs produced with a power of 2 and a neighborhood of 5.

To find out which areas meet our criteria for growing corn, we can use the Con tool to create Boolean rasters that flag which cells in each raster (the interpolated temperature and precipitation rasters with a neighborhood of 5, as well as the NLCD raster) meet our needs.
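The conditional step works like an if/else applied to every cell. As a rough NumPy analogue of the Con tool (not the ArcGIS call itself), the sketch below uses np.where on small hypothetical rasters. The thresholds come from the criteria stated earlier. Treating NLCD class 82 (Cultivated Crops) as cropland is an assumption about how the land cover layer was filtered.

```python
import numpy as np

# Hypothetical 2 x 2 rasters on a shared grid.
temp_f = np.array([[55.0, 61.0], [63.0, 58.0]])    # average May temperature (°F)
precip_in = np.array([[1.1, 0.8], [0.6, 4.2]])     # average May precipitation (inches)
nlcd = np.array([[82, 82], [41, 82]])              # NLCD land cover codes

# 1 where the condition holds, 0 elsewhere.
temp_ok = np.where(temp_f >= 60.0, 1, 0)
precip_ok = np.where(precip_in >= 4.0, 1, 0)
cropland = np.where(nlcd == 82, 1, 0)              # assumption: 82 = Cultivated Crops

print(temp_ok, precip_ok, cropland, sep="\n")
```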

On the map you can see which areas are identified as croplands according to the NLCD, which areas have an average temperature above 60F in May, and which areas have an average rainfall of at least 4 inches in May.

We can then combine these outputs using the Raster Calculator to find the areas in which all three of these variables overlap.
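Combining Boolean rasters amounts to simple map algebra. A minimal sketch, assuming three hypothetical 0/1 arrays on the same grid (not the project rasters): adding them gives the number of criteria met per cell, and a cell is fully suitable only where that count is 3.

```python
import numpy as np

# Hypothetical Boolean rasters (1 = criterion met, 0 = not met).
temp_ok = np.array([[0, 1], [1, 0]])
precip_ok = np.array([[0, 0], [0, 1]])
cropland = np.array([[1, 1], [0, 1]])

# Number of criteria met in each cell (0 to 3).
score = temp_ok + precip_ok + cropland

# Fully suitable: cropland with both climate criteria met.
high = score == 3
# Medium suitability: cropland plus exactly one of the two climate criteria.
medium = (cropland == 1) & (score == 2)

print(score, high, medium, sep="\n")
```

In this toy grid no cell reaches a score of 3, and only the cropland cells that meet one climate criterion are flagged as medium.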

Unfortunately, there were no areas that met all three conditions. The most suitable (medium) areas met only two of the criteria: these areas were either cropland with a high enough temperature or cropland with enough rainfall. There were no areas that experienced both a high enough temperature and enough rainfall in the month of May to support corn growth.

The total area of land in Utah that met the medium suitability threshold, meaning 2 of the 3 conditions necessary for growing corn, was 762.9 sq km, which is only 0.3% of the total area of Utah.
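An area figure like this comes from counting the flagged cells and multiplying by the area of one cell. The sketch below shows the arithmetic with a hypothetical mask and a hypothetical 1,000 m cell size. The actual cell size of the project rasters is not stated, so the numbers are illustrative only.

```python
import numpy as np

def suitable_area_km2(mask, cell_size_m):
    """Area in square kilometres covered by the True cells of a Boolean raster."""
    n_cells = int(np.count_nonzero(mask))
    return n_cells * cell_size_m ** 2 / 1_000_000

# Hypothetical 4 x 4 suitability mask with three medium-suitability cells.
mask = np.zeros((4, 4), dtype=bool)
mask[0, 0] = mask[1, 2] = mask[3, 3] = True

area = suitable_area_km2(mask, cell_size_m=1000.0)
share = 100.0 * np.count_nonzero(mask) / mask.size
print(f"{area:.1f} sq km, {share:.2f}% of the grid")
```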

Discussion

Unfortunately, it looks like we will have to give the USDA a disappointing answer: based on the historic monthly averages of temperature and precipitation it is not going to be possible to start growing corn earlier in the season in Utah. The northern part of the state experiences the most rainfall, but it is the southern part of the state that experiences high enough temperatures to support corn growth so there are no overlapping areas that meet both the rainfall and temperature requirements for growing corn in May.

Next Steps

As always, there is more that can be done. A few further steps seem especially important for enhancing the methods used and expanding the applications of this project.

Firstly, plant hardiness was used as a means to assess the impact of neighborhood size on the outcome of interpolation. In the case of this project, plant hardiness was not the best benchmark for verification, as it did not closely follow the timeline of the data we were using (annual versus monthly), and plant hardiness only refers to temperature, so there was no way to assess the precipitation outcomes. Thus, for better quality assurance, it would be good to explore using different benchmarks to assess the quality of an interpolated outcome.
Secondly, this project focused on just one state, but it would be interesting to increase the scale to include the entire United States. This would change the neighborhood scale and could lead to some interesting explorations and analyses.
Thirdly, as this project only focused on corn in Utah, it might be interesting to focus either on:
1. The needs of a specific crop across multiple states/the entire United States
2. The needs of all of the crops that are grown in a specific state/across the United States to see where there might be room to expand production
3. A different crop in Utah to see how the output changes.
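For the first of these next steps, one benchmark that needs no external data is leave-one-out cross-validation: each station is removed in turn, its value is predicted from the remaining stations, and the errors are summarized as a root mean square error (RMSE). This was not part of the original analysis. The sketch below uses synthetic stations and hypothetical helper names (idw_at, loo_rmse), so its numbers say nothing about Utah. The same check works for precipitation as well as temperature.

```python
import numpy as np

def idw_at(point, stations, values, power=2.0, k=12):
    """IDW estimate at a single point from its k nearest stations."""
    d = np.linalg.norm(stations - point, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / np.maximum(d[nearest], 1e-12) ** power
    return float(np.sum(w * values[nearest]) / np.sum(w))

def loo_rmse(stations, values, power=2.0, k=12):
    """Leave-one-out RMSE: predict each station from all the others."""
    stations = np.asarray(stations, dtype=float)
    values = np.asarray(values, dtype=float)
    errors = []
    for i in range(len(stations)):
        keep = np.arange(len(stations)) != i
        estimate = idw_at(stations[i], stations[keep], values[keep], power, k)
        errors.append(estimate - values[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic stations: temperature falls toward the north, plus some noise.
rng = np.random.default_rng(0)
stations = rng.uniform(0.0, 100.0, size=(40, 2))
temps = 70.0 - 0.2 * stations[:, 1] + rng.normal(0.0, 1.0, size=40)

rmse_5 = loo_rmse(stations, temps, power=2.0, k=5)
rmse_12 = loo_rmse(stations, temps, power=2.0, k=12)
print(f"RMSE with 5 neighbors: {rmse_5:.3f}, with 12 neighbors: {rmse_12:.3f}")
```

The setting with the lower RMSE reproduces the withheld stations more closely, which gives a quantitative basis for choosing between neighborhood sizes or power values.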