Detecting Floods
Combining Machine Learning & Satellite Data to Measure Earth's Most Common Natural Disaster
"Lord, / when you send the rain / think about it, please, / a little? / Do / not get carried away / by the sound of falling water, / the marvelous light / on the falling water..."
James Baldwin, "Untitled"
Intro
In this project, I attempt to fill a gap in existing flood data by building a consistent, local, daily measure of flooding from satellite data.
The Measurement Challenge
What is a Flood?
In much of the world, the land that is "normally" dry changes over the course of the calendar year due to things like monsoon rains and snowmelt.
To show just how important these seasonal changes are, I use data from the Global Surface Water project.
Below, I have mapped surface water in Bangladesh for two months: March (before the monsoon rains) and September (after the monsoon).
Based on this definition, the ideal floods dataset will count a "flood" as any time surface water shifts significantly from the normal seasonal pattern.
From a research standpoint, creating an objective definition of "significantly" and "normal" is important for clearly defining the sample.
Existing databases mostly rely on newspaper articles or government reports to determine when and where floods occur.
This creates three problems:
- The definition of flooding is inconsistent across places and times.
- There is underreporting of floods in poor countries.
- Short-term floods and flash floods are largely excluded.
#1: Inconsistent Definition
As a simple way of seeing this issue, I compare the two most widely used flood sources: the Dartmouth Flood Observatory Archive (DFO) and the EM-DAT International Disaster Database.
Both of these databases rely on news agencies and governments to document inundation events.
I take all flooding events reported in these databases since 2000, and for each country and year, calculate the number of floods reported in the DFO and in the EM-DAT.
For instance, I count the number of floods in Brazil in 2006, the number in Spain in 2010, and every other combination of country and year.
In the end, there are 2,086 country-years for which at least one of these two databases records a flood.
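The country-year comparison described above can be sketched in a few lines of Python. This is only an illustration of the counting step, not the code behind the paper: the records are made-up placeholders, and the function name `count_by_country_year` is hypothetical.

```python
from collections import Counter

def count_by_country_year(events):
    """Count flood events per (country, year) pair."""
    return Counter((country, year) for country, year in events)

# Made-up placeholder records: one (country, year) entry per reported flood.
dfo_events = [("Brazil", 2006), ("Brazil", 2006), ("Spain", 2010)]
emdat_events = [("Brazil", 2006), ("India", 2008)]

dfo_counts = count_by_country_year(dfo_events)
emdat_counts = count_by_country_year(emdat_events)

# Country-years in which at least one of the two databases records a flood.
country_years = sorted(set(dfo_counts) | set(emdat_counts))

for key in country_years:
    # A Counter returns 0 for missing keys, so a country-year that is
    # absent from one database is counted as zero floods there.
    print(key, dfo_counts[key], emdat_counts[key])
```

If the two databases shared a definition of "flood", the two counts would line up closely across country-years.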
In the paper, I provide further evidence of inconsistent definitions of "flood" in existing databases by looking within datasets over time and comparing datasets to government records I digitize from Bangladesh.
#2: Underreporting in Poor Areas
The reliance of existing datasets on news articles and government reports can generate bias if some places or times are more likely to be covered than others.
As a suggestive test for whether this systematic underreporting exists, I conduct a simple exercise using data from the DFO, perhaps the most widely used of all existing flooding datasets.
I link all floods since 2000 from this database with information on income from the World Bank, and ask: are richer countries more likely to report floods?
Of course, rich and poor countries have different geographic features. I adjust for the total size of each country and a simple function of the country's location, though these results should still be taken as suggestive.
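One way to set up an adjusted comparison like this is an ordinary least squares regression of reported floods on income, with controls for country size and location. The sketch below assumes NumPy and uses synthetic data only. The exact specification is not given here, so the log transforms, the linear latitude and longitude terms, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic placeholder data, NOT real DFO or World Bank figures.
log_income = rng.normal(9.0, 1.0, n)
log_area = rng.normal(12.0, 1.5, n)
lat = rng.uniform(-50, 60, n)
lon = rng.uniform(-120, 150, n)

# Simulated outcome with a known income coefficient of 0.5,
# so we can check that the regression recovers it.
floods = 0.5 * log_income + 0.3 * log_area + rng.normal(0, 0.1, n)

# Design matrix: intercept, income, and the controls (size and location).
X = np.column_stack([np.ones(n), log_income, log_area, lat, lon])
coef, *_ = np.linalg.lstsq(X, floods, rcond=None)

income_coef = coef[1]
print(f"income coefficient: {income_coef:.3f}")
```

A positive income coefficient after adjusting for size and location would be consistent with richer countries reporting more floods, though, as noted above, such a result is only suggestive.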
#3: The Missing Flash Floods
Perhaps because they rely so heavily on government reports and news articles, existing flood databases omit local floods, particularly short-term ones, and instead contain almost exclusively records of large flooding events.
Does this omission matter?
Because we lack data on these floods, this question is difficult to answer. I take a simple approach: ask people living in flood-prone areas.
I survey 2,279 farmers living across 250 villages in the Khulna division of Bangladesh about floods. I ask them to recall every flood they have experienced. For each flooding event, I ask whether that flood damaged their crops.
The Limitations of in situ Data
Separately from databases relying on news articles and governments, another important source of existing measurements comes from direct gauges of water levels from river stations and along coasts, or what is known as in situ data.
The farther away from the initial place of measurement, the less confidence we can have in the flooding measure based on in situ data.
This issue poses the greatest challenge in poor countries, where coverage of these sensors remains quite sparse.
The Power of Remote Sensing
Given these challenges with existing approaches to measuring floods, I turn to satellites as an alternative way of measuring both where and when floods occurred.
Radar-Based Measurement
High-Frequency Remote Sensing Data
Although this radar data is not available every day, other types of remote sensing data are collected daily.
For instance, a pair of NASA satellites has used the MODIS instrument to collect data across seven wavelengths every day for the past two decades.
Unfortunately, what this data gains in frequency it loses in accuracy. Using this information alone to detect surface water would lead to substantial bias because it is easily distorted, for instance by cloud cover.
The main idea of my method is that this distorted data might still contain meaningful signal about floods; the signal just needs to be correctly extracted. I take a data-driven approach to this process using machine learning.
The Best of Both Worlds with Machine Learning
The key step of this paper is to use techniques from supervised machine learning to achieve the best of both worlds: the accuracy of radar instruments with the daily frequency of other inputs.
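The general idea of this step can be sketched as follows: on days when radar data exists, treat the radar-based water classification as the label and the daily band values as features, fit a model, and then predict water on every day. The sketch assumes NumPy and synthetic data. The model used in the paper is not specified here, so a plain logistic regression fitted by gradient descent stands in as an assumption, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic placeholders: 7 band values per observation (features) and a
# radar-derived label (1 = water, 0 = dry) for days with radar coverage.
n_labeled, n_bands = 500, 7
X_labeled = rng.normal(size=(n_labeled, n_bands))
true_w = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0])
noise = rng.normal(0, 0.5, n_labeled)
y_labeled = (X_labeled @ true_w + noise > 0).astype(float)

def add_intercept(X):
    """Prepend a column of ones to the feature matrix."""
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit a logistic regression by gradient descent (stand-in model)."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_water(w, X):
    """Predicted probability of water for each row of band values."""
    return 1.0 / (1.0 + np.exp(-add_intercept(X) @ w))

w = fit_logistic(X_labeled, y_labeled)

# Apply the fitted model to days that have no radar coverage.
X_daily = rng.normal(size=(10, n_bands))
daily_water_prob = predict_water(w, X_daily)

train_accuracy = np.mean((predict_water(w, X_labeled) > 0.5) == (y_labeled == 1))
print(f"training accuracy: {train_accuracy:.2f}")
```

In practice, accuracy would be judged on held-out radar observations rather than on the training data, and a more flexible model could replace the logistic regression without changing the overall structure.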
From Surface Water to Floods
Now that I have created a dataset of surface water, the next step is to generate a measure of flooding by removing the "permanent" water, meaning the water that is supposed to be there.
One important challenge is that where water is supposed to be changes over the course of the calendar year. What's more, over time, rivers can shift and lakes can change size.
Isolating floods from permanent water therefore requires removing this seasonality. To do so, I residualize surface water on a very flexible function of calendar week. This gives me, for each day and each polygon, a measure of how much surface water deviated from typical water coverage, expressed as a percent of the total polygon area.
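A minimal sketch of this residualization for a single polygon is shown below. It assumes that the "flexible function of calendar week" is approximated by a separate mean for each week of the year, which is one simple choice and not necessarily the one used in the paper. The data are synthetic, and the function name `residualize_by_week` is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def residualize_by_week(observations):
    """Subtract each calendar week's mean water share from the observed values.

    observations: list of (date, water_share) pairs for one polygon, where
    water_share is the share of the polygon's area covered by water.
    """
    by_week = defaultdict(list)
    for day, share in observations:
        by_week[day.isocalendar()[1]].append(share)
    week_mean = {week: sum(v) / len(v) for week, v in by_week.items()}
    return [(day, share - week_mean[day.isocalendar()[1]])
            for day, share in observations]

# Synthetic placeholder series: the same seasonal rise in two years,
# plus an extra 0.20 of water in one week of the second year.
obs = []
for year, spike in ((2019, 0.0), (2020, 0.20)):
    start = date(year, 1, 7)
    for k in range(4):
        share = 0.10 + 0.05 * k + (spike if k == 2 else 0.0)
        obs.append((start + timedelta(weeks=k), share))

residuals = residualize_by_week(obs)
largest_day, largest_dev = max(residuals, key=lambda r: r[1])
print(largest_day, round(largest_dev, 2))
```

The seasonal rise is absorbed by the weekly means, so the largest positive residual falls in the week with the extra water. The anomaly itself raises that week's mean, which is one reason a real application would want many years of data per calendar week.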
To turn this share into an indicator for "flood", we need to choose a threshold for this deviation in surface water. The ideal threshold would reflect the intuitive definition of a flood. To measure this, I turn once again to the sample of farmers in Bangladesh.
I asked farmers to list every flood they experienced on their land. I assume that farmers can perfectly remember any floods that occurred in the past eight months. For a given threshold, I can calculate two quantities: the share of farmer-recalled floods that the remote sensing data correctly captures, and the share of floods flagged in the remote sensing data that the farmers themselves did not list.
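The two shares described above can be computed for any candidate threshold as in the following sketch. The deviations and recalled floods are made-up placeholders, and the function name `evaluate_threshold` and the keys are hypothetical.

```python
def evaluate_threshold(deviation, recalled, threshold):
    """Compare threshold-based flood flags with farmer-recalled floods.

    deviation: dict mapping a plot-period key to its deviation in water share.
    recalled: set of keys for which the farmer reported a flood.
    Returns (share of recalled floods that are detected,
             share of detected floods that farmers did not recall).
    """
    detected = {key for key, dev in deviation.items() if dev >= threshold}
    captured = len(detected & recalled) / len(recalled) if recalled else 0.0
    mistaken = len(detected - recalled) / len(detected) if detected else 0.0
    return captured, mistaken

# Made-up placeholder values for five plot-periods.
deviation = {"a": 0.02, "b": 0.08, "c": 0.15, "d": 0.30, "e": 0.12}
recalled = {"c", "d"}

for threshold in (0.05, 0.10, 0.20):
    captured, mistaken = evaluate_threshold(deviation, recalled, threshold)
    print(threshold, captured, round(mistaken, 2))
```

Raising the threshold lowers the share of mistaken floods but eventually starts to miss floods that farmers recall, so the choice of threshold trades one error against the other.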
Validating the New Measure
Testing the validity of this new dataset of flooding is important to give confidence in its accuracy.
Test #1: News Articles
Using news articles to measure flooding can result in important biases, as discussed above. Nevertheless, they can be useful, particularly in major cities that have substantial coverage and in recent years.
I build a new dataset of every article mentioning flooding and related keywords in the city of Chittagong, Bangladesh since 2021. This dataset is still imperfect. Nevertheless, we should expect the amount of surface water in Chittagong's unions to differ before and after a news article.
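A simple version of this before-and-after comparison is sketched below for a single union and a single article. The seven-day window, the synthetic series, and the function names are assumptions made for illustration, not details taken from the paper.

```python
from datetime import date, timedelta

def mean(values):
    """Average of a non-empty list."""
    if not values:
        raise ValueError("no observations in window")
    return sum(values) / len(values)

def before_after_means(series, article_date, window_days=7):
    """Mean surface water share in the windows before and after an article.

    series: dict mapping date -> estimated surface water share.
    The article date itself is counted in the "after" window.
    """
    before_days = [article_date - timedelta(days=d)
                   for d in range(1, window_days + 1)]
    after_days = [article_date + timedelta(days=d)
                  for d in range(window_days)]
    before = [series[d] for d in before_days if d in series]
    after = [series[d] for d in after_days if d in series]
    return mean(before), mean(after)

# Synthetic placeholder series: water share rises on the article date.
start = date(2021, 7, 1)
series = {start + timedelta(days=i): (0.10 if i < 7 else 0.25)
          for i in range(14)}
article_date = date(2021, 7, 8)

before, after = before_after_means(series, article_date)
print(f"before: {before:.2f}, after: {after:.2f}")
```

If the flood measure is valid, the "after" mean should tend to exceed the "before" mean when averaged over many articles.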
Test #2: Government Flood Reports from River Stations
Next, I code up annual reports from the Bangladesh Flood Forecasting and Warning Centre from 2010 to 2020.
For example, one such entry reads: "The Dharla at Kurigram registered several peaks during the monsoon 2010. It crossed its danger level (DL) on 20th July and continues to flow above DL till 12:00 hours of 24th July, 2010 (4 days). It attained its highest level 26.83m on 22nd July to 18:00 hours, which was 33cm above the DL (26.50 m)."
From these entries, I know when water in river stations exceeded the government's definition of a dangerous height.
I match these stations to my measure of flooding and see how the amount of surface water I estimate changes relative to the dates in the reports.
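As an illustration of how such a report entry might be coded up, the sketch below stores the Dharla at Kurigram example as a structured record and computes days relative to the danger-level crossing. The class and method names are hypothetical, and the record layout is an assumption rather than the format used in the paper.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Exceedance:
    """One spell during which a river station stayed above its danger level."""
    river: str
    station: str
    start: date
    end: date
    peak_m: float
    danger_level_m: float

    def peak_above_dl_cm(self):
        """Height of the peak above the danger level, in whole centimetres."""
        return round((self.peak_m - self.danger_level_m) * 100)

    def days_since_crossing(self, day):
        """Days from the crossing date to `day` (negative before the crossing)."""
        return (day - self.start).days

    def contains(self, day):
        """True if `day` falls within the reported spell above the danger level."""
        return self.start <= day <= self.end

# Values taken from the report entry quoted above.
entry = Exceedance("Dharla", "Kurigram",
                   date(2010, 7, 20), date(2010, 7, 24),
                   peak_m=26.83, danger_level_m=26.50)

print(entry.peak_above_dl_cm())                     # centimetres above DL
print((entry.end - entry.start).days)               # length of the spell in days
print(entry.days_since_crossing(date(2010, 7, 18)))
```

Indexing each day by its distance from the crossing date makes it straightforward to line up the estimated surface water around many such events.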
Test #3: River Station Water Level
Finally, I use daily data on the height of water from river stations.
I compare the amount of water I estimate in my measure with the height of water at these stations.
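One simple way to summarize such a comparison is a correlation between the two daily series. The sketch below computes a Pearson correlation by hand on made-up placeholder values. The choice of a Pearson correlation is an assumption for illustration, not necessarily the comparison used in the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up placeholder values for one station over seven days.
station_height_m = [24.1, 24.8, 25.6, 26.4, 26.8, 26.1, 25.0]
estimated_water_share = [0.05, 0.07, 0.12, 0.20, 0.26, 0.17, 0.08]

r = pearson(station_height_m, estimated_water_share)
print(f"correlation: {r:.2f}")
```

A strong positive correlation at stations with good coverage would support the claim that the satellite-based measure tracks actual water levels.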
In the end, this approach can be used to measure flooding around the world at a local level for every day of the past 20 years.
This data can be used to study the impact and incidence of floods, an issue of growing importance under global warming.