TDEC APC Air Quality Sensors Evaluation
Long-Term Evaluation of Low-Cost Sensors in Middle Tennessee

The Tennessee Department of Environment and Conservation-Division of Air Pollution Control (TDEC-APC) is actively researching the utility of small sensor technologies for air quality characterization. As these technologies become more prevalent, it is important for our agency to become familiar with the performance, operations, and potential limitations of these sensors.
In August 2020, TDEC-APC deployed air quality sensors and weather stations at four existing ambient air monitoring sites across Middle Tennessee that are equipped with regulatory-grade air monitors. The sensors and weather stations will remain at these sites for a minimum of one year or until sensor degradation can be measured.
Project Abstract
The Tennessee Department of Environment and Conservation (TDEC) evaluated low-cost sensors for PM2.5 and gaseous pollutants to address the growing interest in their use for air quality characterization. Sensors were selected for study inclusion based on the likelihood of purchase by the public, considering cost, ease of communication, weather ruggedization, and accuracy. Three PM2.5 sensors (Purple Air, Clarity Node, Air Quality Egg) and four gaseous sensors (Clarity Node NO2; Air Quality Egg O3, SO2, and NO2) were tested. Sensors were collocated with regulatory monitors at four sites in Middle Tennessee for a year (Fall 2020 to Fall 2021). Sensors were assessed for data recovery, intercomparison with regulatory monitors, sensor degradation, and performance during special events. During our study, high data recovery, nearly 100%, was observed for PM2.5 and NO2 sensors, with no major signs of sensor degradation. When data adjustments were applied, Purple Air data agreed well with PM2.5 from Federal Equivalent Methods (FEM), with linear slopes close to 1 and r2 > 0.79. Additionally, Clarity Node NO2 tracked well with the collocated Federal Reference Method (FRM) but tended to overpredict it. Measurement biases were observed in sensor data, especially with PM2.5 sensors, during a few special events (dust storms, freezing fog). Given our results, PM2.5 sensors can potentially be used for quantitative air quality applications without a collocated regulatory monitor. In contrast, the Clarity Node NO2 sensors in this study appear to be better suited for qualitative interpretation of NO2 trends/concentrations and may benefit from being collocated with a FRM/FEM. Last, the O3, SO2, and NO2 Air Quality Egg sensors in this study did not capture known trends, revealing the need for further research to improve data validity. These results help inform the air quality community of appropriate applications of low-cost sensors for air quality characterization.
Methods
Existing sites were chosen to allow for side-by-side operation of non-regulatory small sensor technologies and regulatory ambient air monitoring equipment approved by the EPA. Sensor data was compared with the existing regulatory equipment to determine sensor performance against standard measurement techniques. Data from the weather stations was also evaluated to assess the extent to which sensor integrity may have been affected by meteorological variables.
The tables below indicate the sensors evaluated and collocated regulatory monitoring equipment during our study.
Sensors Evaluated in this Study
Collocated Regulatory Equipment by Site and Pollutant
Of note, for the PM2.5 section, most of our analysis focuses on Purple Air sensors, but a brief discussion of the Clarity Node and Air Quality Egg Sensors is included.
*All regulatory data is preliminary and subject to change*
Sensor Evaluation Criteria
Sensors in our study were evaluated based on the following criteria:
- Data Recovery: How much data is generally recovered from various sensors?
- Regulatory Intercomparisons: Can sensors accurately capture the same variability in air pollution as regulatory monitors?
- Sensor Degradation: How long is the life of a sensor before data becomes unreliable/erratic?
- Performance during Special Events: How do sensors perform during wildfires, dust events, etc.?
PM2.5 Evaluation
PM2.5 Sensor Data Recovery
The first goal of our study was to assess data recovery and quality from the sensors. At the four sampling sites, data recovery for PM2.5 Purple Air sensors was generally good during our study, but it varied by individual sensor. For the entire sampling period, the aggregate average of collocated sensors at each site (i.e., 3 to 4 sensors per site) was upwards of 97%. Across individual Purple Air sensors, data recovery was more variable, ranging from 45% to 99% (shown in table below).
Data Recovery of Purple Air sensors at the four sampling sites during our year-long study. Data recovery below 75% is in bold text.
Most data loss observed in our study was attributed to either downtime for periodic sensor maintenance or the presence of erratic data. Erratic data was identified using established quality assurance methods developed by the EPA. Often, erratic data during our sampling period was not associated with a malfunctioning sensor or an air quality event. Instead, some sensors were found to produce noisier data than other collocated sensors without any identifiable reason. At several of our sampling sites, at least one sensor produced relatively noisy data, resulting in some data loss.
These findings underscore the importance of having multiple sensors collocated at a site. In our study, collocating several sensors at a single site not only improved data recovery, but also data quality. For example, data recovery for individual sensors was observed to be as low as 45% (per quarter) across our sampling sites. However, when combining collocated sensor data, recovery was upwards of 97% for the entire study. The increase in data recovery was a result of using collocated sensors to fill in intermittent data gaps from individual sensors. Furthermore, combining collocated sensor data reduced noisy data at each site by dampening data extremes associated with a single sensor. Based on our study, we found having at least three Purple Air sensors at each site optimized data recovery and quality.
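The gap-filling effect described above can be illustrated with a minimal sketch. It assumes hourly values stored as plain lists in which `None` marks a missing or screened-out hour, and it assumes collocated sensors are combined with a simple hour-by-hour mean; the function names and the sample values are hypothetical and are not taken from the study's data or processing code.

```python
from statistics import mean

def data_recovery(series):
    """Percent of expected hourly slots that hold a valid (non-None) value."""
    valid = sum(1 for v in series if v is not None)
    return 100.0 * valid / len(series)

def site_average(sensors):
    """Hour-by-hour mean across collocated sensors, skipping missing values.

    An hour is missing in the combined record only if every sensor is missing.
    """
    combined = []
    for hour_values in zip(*sensors):
        present = [v for v in hour_values if v is not None]
        combined.append(mean(present) if present else None)
    return combined

# Hypothetical hourly PM2.5 (ug/m3) from three collocated sensors.
sensor_a = [8.0, None, 9.0, None, 10.0, 11.0]
sensor_b = [7.0, 8.0, None, None, 12.0, 10.0]
sensor_c = [9.0, 9.0, 8.0, 10.0, None, None]

combined = site_average([sensor_a, sensor_b, sensor_c])
for name, series in [("a", sensor_a), ("b", sensor_b), ("c", sensor_c), ("site", combined)]:
    print(name, round(data_recovery(series), 1))
```

In this toy case each sensor recovers about two thirds of the hours, while the combined site record has no gaps, because the sensors are rarely missing at the same time. Averaging also dampens an extreme reading from any single sensor.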
PM2.5 Sensor-FEM Comparisons
The second goal of our study was to evaluate sensor performance against regulatory grade monitors. In our evaluation, we compared Purple Air PM2.5 sensors against two types of regulatory grade monitors: a beta attenuation monitor (BAM) and an optical monitor by Teledyne (T640x). Both BAM and T640x techniques are widely used for continuous regulatory monitoring of PM2.5 mass and are designated Federal Equivalent Methods (FEMs) by the EPA. All four sites in our study were equipped with FEM-BAMs, while only the Lockeland site had both a FEM-BAM and a FEM-T640x.
During our study period, Purple Air PM2.5 sensors compared well with regulatory monitoring PM2.5 when appropriate adjustments were applied. This finding is consistent with findings from previous updates. Sensor data was adjusted using two well-established methods. The first adjustment factor was developed by the Lane Regional Air Protection Agency (LRAPA) using sensor data from the Western US during wildfires. The second factor was developed by EPA using nationwide sensor comparisons with regulatory monitors.
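For readers unfamiliar with these adjustments, the sketch below shows the commonly published forms of the two corrections. The coefficients are an assumption on our part: the text above does not state which version of each equation was applied, so the values here are the widely cited LRAPA linear correction (applied to the Purple Air CF=ATM channel) and the EPA nationwide correction (applied to the CF=1 channel with relative humidity). The function names and the sample reading are hypothetical.

```python
def lrapa_adjust(pa_cf_atm):
    """Assumed LRAPA form: 0.5 * PA(CF=ATM) - 0.66, in ug/m3."""
    return 0.5 * pa_cf_atm - 0.66

def epa_adjust(pa_cf_1, rh_percent):
    """Assumed EPA nationwide form: 0.524 * PA(CF=1) - 0.0862 * RH + 5.75.

    rh_percent is relative humidity in percent as reported by the sensor.
    """
    return 0.524 * pa_cf_1 - 0.0862 * rh_percent + 5.75

# Hypothetical raw 24-hr sensor average of 20 ug/m3 at 50% relative humidity.
raw = 20.0
print(round(lrapa_adjust(raw), 2))
print(round(epa_adjust(raw, 50.0), 2))
```

Both forms roughly halve the raw reading, which is consistent with raw sensor data running about twice as high as the regulatory monitors.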
The scatterplots below display 24-hr PM2.5 Purple Air data (raw and adjusted) vs FEM-BAM data at each sampling site during the entire sampling period. Across sites, raw sensor PM2.5 generally overestimated regulatory monitoring data. Sensor data was approximately 2 times higher than the FEM-BAM data (slope: 1.99 to 2.3; r2: 9 to 0.87). Once data was adjusted, sensor data closely agreed with regulatory monitor PM2.5 (Slopes: 0.99 to 1.20 and r2: 0.79 to 0.89), indicating the need to adjust sensor data for validity.
Of note, there were no significant differences in sensor-FEM comparisons between the two adjustment methods. Both LRAPA and EPA adjustments resulted in sensor data that compared well with the regulatory monitoring PM2.5. This finding indicates, at least for our study, that no additional, regionally specific data adjustments are needed for good agreement between sensors and regulatory monitors. It may further suggest that LRAPA and EPA adjustment factors can be utilized in other locations/regions; however, additional studies in other locations are recommended to confirm this finding.
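The slope and r2 values quoted for the sensor-FEM comparisons are standard outputs of an ordinary least squares fit. The sketch below shows one way to compute them with the FEM on the x-axis and the sensor on the y-axis; it is an illustration under that assumption, the exact regression settings used in the study are not stated, and the sample values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2

# Hypothetical 24-hr averages (ug/m3): FEM on x, raw sensor on y.
fem = [5.0, 10.0, 15.0, 20.0]
raw_sensor = [11.0, 21.0, 31.0, 41.0]

slope, intercept, r2 = linear_fit(fem, raw_sensor)
print(round(slope, 2), round(intercept, 2), round(r2, 2))
```

A slope near 2 means the sensor reads about twice the FEM value, as seen with raw data; a slope near 1 with a high r2 indicates agreement in both magnitude and variability.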
Scatterplots of 24-hr PM2.5 Purple Air vs FEM-BAM at the four sampling sites during our year-long study
Scatterplot of hourly PM2.5 Purple Air vs. FEM-BAM (red) and FEM T640x (blue) at the Lockeland site
At the Lockeland site, our team had the unique opportunity to compare the Purple Air with two different FEM measurement techniques (FEM BAM vs. FEM T640x). Overall, we found that Purple Air PM2.5 agreed more closely with the FEM T640x than with the FEM-BAM.
This is further supported by the results in the scatterplot above showing 24-hr Purple Air data vs different FEMs at the Lockeland site during the entire study. The Purple Air-T640x comparison has a higher r2 value and a slope closer to 1 than the Purple Air-BAM comparison. As discussed in previous updates, this strong agreement between the Purple Air and T640x method may be in part due to the similarities in measurement techniques between the two methods, which draw upon optical scattering methodologies for PM measurements, whereas the FEM-BAM relies on beta attenuation. This result may be useful for future Purple Air evaluation studies.
PM2.5 Sensor Longevity
The third goal of our study was to assess sensor lifetime, or the extent to which the sensor degraded over time. Sensor degradation was evaluated by examining the sensor’s drift against collocated regulatory monitors (i.e., the difference between PM2.5 from the sensor and the FEM BAM). Positive or negative drifts in sensor data may indicate an issue with the sensor, which, in turn, could affect the validity of its data.
After a year of operation, the sensors did not exhibit significant signs of degradation. With few exceptions, most adjusted sensor data remained within 2 ug/m3 of regulatory PM2.5, and no notable upward or downward drifts were observed. When data was outside of the 2 ug/m3 threshold, it was generally due to an exceptional air quality event or a minor issue with the sensors at the site.
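A drift check of this kind can be sketched as follows: take the daily difference between adjusted sensor PM2.5 and the FEM, count how often it stays within the 2 ug/m3 band, and fit a straight line to the differences over time to look for a sustained upward or downward trend. This is a minimal illustration, not the study's analysis code; the function name, the use of a least squares trend, and the sample values are assumptions.

```python
def drift_summary(sensor, fem, limit=2.0):
    """Return (differences, fraction within +/- limit, trend slope per day).

    sensor and fem are equal-length lists of daily averages in ug/m3.
    """
    diffs = [s - f for s, f in zip(sensor, fem)]
    within = sum(1 for d in diffs if abs(d) <= limit) / len(diffs)

    # Least squares slope of the difference against the day index.
    n = len(diffs)
    mean_day = (n - 1) / 2
    mean_diff = sum(diffs) / n
    num = sum((day - mean_day) * (d - mean_diff) for day, d in enumerate(diffs))
    den = sum((day - mean_day) ** 2 for day in range(n))
    return diffs, within, num / den

# Hypothetical daily averages of adjusted sensor and FEM PM2.5 (ug/m3).
sensor = [10.5, 12.0, 9.0, 15.0, 11.0]
fem = [10.0, 11.0, 9.5, 12.0, 10.5]

diffs, within, trend = drift_summary(sensor, fem)
print(diffs, within, round(trend, 3))
```

A trend slope that stays near zero over many months, together with a high fraction of days inside the band, is the pattern one would expect from a sensor that is not degrading. Days outside the band are worth checking against known air quality events or maintenance logs.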
Based on the results of this study, adjusted Purple Air data appear to be reliable for at least a year, and possibly longer, with minimal, routine maintenance. For our study, routine maintenance was conducted every month and included keeping sensor inlets clean and free from obstructions.
Of note, while most sensors only required monthly checks, some sensors required more frequent checks. For example, sensor data was screened for anomalies several times per week. If data screens indicated a potential issue, our team immediately performed maintenance on the instrument, in addition to the scheduled monthly maintenance, to salvage the sensor and prevent data loss. This proactive maintenance approach could have increased the longevity of the sensors in our study. Our results emphasize the value of both monthly checks and proactive maintenance to maximize sensor lifetime.
Sensor drift analysis at the four sampling sites during our year-long study
PM2.5 Performance during Special Events
The fourth goal of our study was to determine sensor performance during unique air quality events. Our team focused on this goal to define specific instances when the sensor may perform below expectations. Several special events, such as wildfires and dust storms, occurred during our sampling period. Based on our analysis, sensor performance was affected by a variety of air quality events, and the extent to which the sensor response deviated from regulatory monitors often depended upon the type of air quality event.
For example, positive biases in sensor data were observed during a multi-day snow and ice event in February 2021, which resulted in large snow and ice accumulations across Middle Tennessee. The figures below show a timeseries and scatterplot of Purple Air and regulatory FEM PM2.5 during this event at the Lockeland site. As shown below, the Purple Air and FEM-T640x, which are both optical measurement techniques, tracked well with each other during the peak of this event. By comparison, the FEM BAM consistently reported much lower PM2.5 than the FEM T640x and Purple Airs. The Purple Air and FEM-T640x were up to 17 ug/m3 higher than the FEM BAM. The difference in PM2.5 readings during this event may be related to the presence of freezing fog and its impact on various PM2.5 measurement techniques.
Timeseries and scatterplot of hourly Purple Air and FEM PM2.5 at the Lockeland site during a snow & ice event
In contrast, negative biases were observed in sensor data during a dust event in September 2020. This event resulted in moderate PM2.5 values across the US from long-range transport of dust from the Sahara Desert. Shown below are a timeseries figure and scatterplot of PM2.5 from sensors and FEM monitors during the dust event at the Lockeland site. Unlike during the snow and ice event, the Purple Air sensors failed to capture PM2.5 peaks that were observed in regulatory FEM data (both FEM BAM and FEM T640x), and the sensors were up to 15 ug/m3 below the FEMs. These results suggest a potential limitation of Purple Air sensors during dust events.
Timeseries & scatterplot of hourly Purple Air and FEM PM2.5 at the Lockeland site during a dust event
While Purple Air sensors deviated from the FEMs during some events, they performed well during other events. During July 2021, wildfires resulted in approximately 10 days of moderately high PM across the Southeast US region. As shown in the timeseries and scatterplot below, Purple Air PM2.5 closely followed FEM BAM and FEM T640x data during this period, and the scatterplot analysis shows that most of the data fall along the 1:1 line (dashed line in scatterplot). At least in our study, Purple Air sensors appear capable of adequately characterizing PM2.5 during wildfire events.
Timeseries & scatterplot of hourly Purple Air and FEM PM2.5 at the Lockeland site during wildfire events
Considering our collective results, Purple Air performance may be affected during unique air quality events, such as dust events and snow or ice storms. As such, it is important to use caution when interpreting sensor data during these specific events. However, our study was limited to a few events in Middle Tennessee. Further studies in different regions and climates are needed to define specific performance issues during air quality events.
Evaluation of Additional Low-Cost PM Sensors
In addition to the Purple Air sensors, we evaluated two other low-cost sensors, the Clarity Node and the Air Quality Egg, during our study. Both Clarity Node and Air Quality Egg PM sensors use optical measurement technology similar to that of the Purple Air sensors. Like the Purple Air sensors, both Clarity Node and Air Quality Egg PM sensors compared well with the FEM BAM when sensor data was adjusted appropriately. The figures below display scatterplots of EPA-adjusted 24-hr data from the three different sensors (Purple Air, Clarity Node, Air Quality Egg) vs. the FEM BAM during our year-long study. Across sites and sensors, slope and r2 values were close to one (slope range: 0.75 to 1.2; r2 range: 0.72 to 0.89), suggesting that all three sensors reasonably capture the variability and magnitude of PM measurements from the FEM BAM.
Scatterplots of 24-hr PM2.5 from Purple Air, Clarity Node, and Air Quality Egg sensors vs. FEM-BAM
Furthermore, like the Purple Airs, the Clarity Node and Air Quality Egg sensors exhibited high data recovery (>95% recovery across sites and sensors) and did not display significant sensor drift. When comparing data recovery across the three different sensors, it is important to note that aberrant data from Purple Air sensors was omitted based on well-established, relatively rigorous quality assurance steps. In some cases, applying these quality assurance steps resulted in omitting close to half of the data collected from an individual Purple Air sensor during a given quarter (~47-50% data recovery per quarter). On the other hand, data from the Air Quality Egg and Clarity Node was only omitted if it appeared substantially different from collocated sensors or FEMs, which resulted in generally less data omission and, in turn, higher data recovery for both sensors. Even with different quality assurance steps, the Clarity Node and Air Quality Egg performed comparably to the Purple Airs, with all three sensors showing strong correlations with the collocated regulatory monitors.
NO2 Evaluation
Clarity Node NO2
NO2 Sensor Data Recovery
Data recovery for the 8 Clarity Node NO2 sensors
The Clarity Node NO2 sensor had excellent data recovery with essentially 100% data capture. The table above gives the breakdown of the data collection statistics for each Clarity Node NO2 sensor; only one sensor (ID: 1T) missed a single hour from August 1, 2020 to July 31, 2021. On February 16, 2021, temperatures in Middle TN dropped below -10 deg C in the early morning hours, according to temperature data collected by the Clarity Node sensors. This resulted in a total of 9 hours being invalidated across the Loretto site’s two sensors. Negative NO2 values were also measured by each of the Clarity Node sensors, with some measuring negative values more frequently than others. Negative values can occur for several reasons, including interference from other gases and cross-sensitivity, and they can indicate a need for periodic zeroing. Having collocated sensors provides Clarity Node users with a possible method for quality assuring the data collected. By comparing two sensors at each site, we could establish a precision limit to reduce erroneous measurements from any single sensor.
Overall, the precision between the pairs of sensors was particularly good, with the exception of the pair at the Near Road site, where the B9 sensor appeared to be faulty. The Lockeland and Hendersonville sites saw the best correlations between their collocated sensor pairs. Applying a simple QA step in which data pairs are invalidated if they differ by more than a set limit can be useful for removing outliers in the data. Based on data retention statistics, a useful limit could be set somewhere between 5 and 10 ppb, where >75% of the data would still be retained in our study. For the Near Road site, this limit would have resulted in much more significant data loss than at the other three sites; so, for this analysis, a precision-based QA step was not used when comparing the Clarity Node sensor to the regulatory grade monitor at the Near Road site.
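The precision-based QA step can be sketched as below: for each hour, compare the two collocated sensors, invalidate the pair if the absolute difference exceeds the limit, and tabulate how much data would be retained at candidate limits. This is an illustrative sketch only; the function names, the choice to report the pair mean for retained hours, and the sample values are hypothetical.

```python
def retention_by_limit(sensor_1, sensor_2, limits):
    """Fraction of hourly pairs whose absolute difference is within each limit."""
    pairs = [(a, b) for a, b in zip(sensor_1, sensor_2)
             if a is not None and b is not None]
    return {
        limit: sum(1 for a, b in pairs if abs(a - b) <= limit) / len(pairs)
        for limit in limits
    }

def apply_precision_limit(sensor_1, sensor_2, limit):
    """Pair mean for hours that pass the check; None for failed or missing hours."""
    result = []
    for a, b in zip(sensor_1, sensor_2):
        if a is None or b is None or abs(a - b) > limit:
            result.append(None)
        else:
            result.append((a + b) / 2)
    return result

# Hypothetical hourly NO2 (ppb) from two collocated sensors.
no2_a = [12.0, 20.0, 35.0, 8.0, 15.0, 40.0, 22.0, 18.0]
no2_b = [14.0, 19.0, 30.0, 20.0, 16.0, 47.0, 21.0, 12.0]

print(retention_by_limit(no2_a, no2_b, [5.0, 10.0]))
print(apply_precision_limit(no2_a, no2_b, 5.0))
```

Tabulating retention across several limits makes the trade-off explicit: a tighter limit removes more outliers but discards more data, which is why a faulty sensor in a pair can make this step impractical at a given site.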
Collocated sensor performance and data QA options
The after-the-fact discovery of data quality issues with the Near Road B9 sensor highlights the importance of having multiple collocated sensors. Similar to the Purple Air findings, we found that having three Clarity Node sensors at each site would have enhanced the quality of our data by allowing for the removal of extremes and noise from individual sensors.
NO2 Sensor-FRM Comparisons
The Near Road site was the only site collocated with an existing regulatory grade monitor (Thermo 42i-TL). Unlike the Clarity Node sensor that uses electrochemical cell technology, the Thermo 42i-TL uses chemiluminescence to measure NO2 and is recognized by EPA as a federal reference method (FRM).
The figure below shows the comparison between the two Clarity Node sensors and the Thermo 42i-TL. The maximum hourly NO2 value for each day is plotted. Both sensors capture the day-to-day NO2 trends at the Near Road site well, with a noticeable positive bias relative to the FRM. The positive bias is greater with the B9 sensor than with the FJ sensor. Overall, the Clarity Node sensor performed best during periods when NO2 concentrations were higher.
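The daily maximum 1-hr value used in this comparison can be derived from hourly records with a simple grouping step. The sketch below assumes records are (timestamp, value) tuples with `None` for invalid hours; the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def daily_max(hourly):
    """Map each calendar date to the maximum valid hourly value on that date."""
    by_day = defaultdict(list)
    for timestamp, value in hourly:
        if value is not None:  # skip invalidated hours
            by_day[timestamp.date()].append(value)
    return {day: max(values) for day, values in sorted(by_day.items())}

# Hypothetical hourly NO2 records (ppb).
hourly = [
    (datetime(2021, 1, 1, 0), 12.0),
    (datetime(2021, 1, 1, 8), 31.0),
    (datetime(2021, 1, 1, 17), None),
    (datetime(2021, 1, 2, 7), 44.0),
    (datetime(2021, 1, 2, 18), 27.0),
]

for day, peak in daily_max(hourly).items():
    print(day, peak)
```

Applying the same function to the sensor and to the regulatory monitor gives two daily series that can be plotted together or differenced to estimate the bias.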
Timeplot of max 1-hr NO2 concentrations for each day during our year long study period
Scatterplots were generated for both sensors’ datasets to evaluate the overall measure of linear correlation between the sensors and the regulatory monitor.
Scatterplots for the two Clarity Node NO2 sensors vs FRM-Thermo 42i-TL during our year-long study
The Clarity Node FJ sensor demonstrated a significantly better relationship with the Thermo 42i-TL when compared to the B9 sensor. Both sensors overestimated the regulatory grade NO2 data. The overall performance of the two sensors is summarized in the table below:
NO2 Sensor Longevity
Sensor degradation was of interest to this study; however, because sensor performance is strongly related to NO2 levels, seasonal NO2 trends make degradation difficult to ascertain. The sensor’s drift from the regulatory grade monitor was assessed to identify degradation trends. The plot below shows the daily average difference between the Clarity Node FJ sensor and the regulatory monitor.
Daily average difference between the Clarity Node NO2 sensor & FRM-Thermo 42i-TL with trendline
It is important to remember the relationship between sensor accuracy and the magnitude of NO2 when assessing degradation of the Clarity Node sensor. The months with the highest average NO2 were November, December, and March. In the figure above, the sensor error dips closer to zero during these months; however, the average error increases around April and stays slightly higher than in the previous months. The sensor was able to operate for the full length of the study with no maintenance or intervention needed. In the final months of the study, the FJ sensor was still accurately measuring higher levels of NO2 (>40 ppb). There does not appear to be any significant degradation associated with this sensor; however, this comparative FRM assessment was limited to one sensor for one year.
Other Gaseous Evaluations
Air Quality Egg NO2, SO2 & O3 Results
The Air Quality Egg’s inexpensive gaseous sensors were collocated with regulatory grade monitors at the four sites. The electrochemical sensors are calibrated at the factory prior to shipping to the customer. Poor performance was immediately observed for all sensors compared to the regulatory grade monitors. However, future refinement of the factory calibration method could improve "out-of-the-box" performance. The plots below show the average of all regulatory grade monitors in the region (orange series) alongside the Air Quality Egg sensors individually.
AQ Egg NO2
Large swings in NO2 concentrations were observed, with some rates of change in the 100 ppb range. The sensors largely underestimated the measured NO2 concentrations. The regulatory grade NO2 data from the Near Road site is shown below for comparison with the Lockeland (lkNO2) and Near Road (nrNO2) Air Quality Egg NO2 data. No sensor calibrations were performed after deployment.
Daily average of Air Quality Egg NO2 sensors and regulatory grade NO2 (orange)
AQ Egg SO2
Similar results were observed with the Air Quality Egg’s SO2 sensors, with a much larger magnitude of error. In Middle TN, SO2 is measured only at the Near Road site, where little to no SO2 is present. Despite this lack of ambient SO2, the sensors’ SO2 measurements were not static like the regulatory grade data, and there was no correlation with the regulatory monitor. Instead, daily swings of hundreds of ppb were observed, with degradation potentially occurring during the final 6 months of the study period. The individual SO2 sensors did trend together across the region, suggesting cross-sensitivity to other regional ambient gases or to environmental conditions. TDEC staff did not perform any calibration of these sensors.
Daily average of Air Quality Egg SO2 sensors and regulatory grade SO2 (orange)
AQ Egg O3
Regional ozone is measured with regulatory grade monitors at several sites in Middle TN with their daily averages combined in the orange series below. The Hendersonville Air Quality Egg (hvO3) was zeroed in January, bringing the data back to more realistic ozone values. The Air Quality Egg at the Loretto site was relocated to TDEC’s Fairview site (lofvO3), where it could be collocated with a regulatory grade monitor in March. The sensor was also zeroed but did not improve to the same degree that was seen with the Hendersonville sensor. Comparing the regulatory grade ozone data with the Air Quality Egg ozone data reveals an immediate downward drift upon deployment of the sensors at the Hendersonville and Loretto sites. Even after zeroing the sensors, the Air Quality Egg was unable to align with the regulatory grade ozone measurements.
Daily average of Air Quality Egg O3 sensors and regulatory grade O3 (orange)
Overall, these sensors had good data recovery, with minimal downtime. The sensors were calibrated by the manufacturer and did not provide meaningful gaseous measurements "out-of-the-box". None of the Air Quality Eggs showed any correlation with the regulatory grade monitors. It’s important to note that the Air Quality Egg is priced significantly lower than other electrochemical cell sensors on the market that have performed well in other independent evaluations. 1
1 South Coast AQMD Air Quality Sensor Performance Evaluation Center (AQ-SPEC)
Conclusions/Future Uses
The TN Department of Environment and Conservation completed a year-long evaluation of low-cost air sensors for PM2.5 and gaseous pollutants against regulatory monitors in Middle Tennessee. We found that the technology varies considerably across different types of sensors. The table below summarizes our assessment of the low-cost sensors tested in this study.
Summary of sensor evaluation
We classified sensors based on three categories. The green category suggests that the sensor technology is refined, and future research is not necessary to improve technology. Data from these sensors is expected to be reliable, and these sensors do not require a collocated regulatory monitor for verification of data quality. The yellow category suggests that while sensor technology is mature, future collocated sensor-regulatory monitor studies may help address sensor or data issues observed in this study. The red category suggests that further research and development is necessary to address critical operational or data issues observed during our study.
In addition, we conclude the following:
- PM2.5 sensor technology is further along than gaseous sensor technology, but gaseous sensor technology shows promise for the future, with some gaseous sensors further along than others.
- Air Quality Egg gaseous sensors evaluated in this study revealed the need for frequent interventions and/or further instrument development to achieve good agreement with regulatory monitors. It is possible that the shortcomings of these sensors could be addressed with a more expensive sensor or more rigorous factory calibration, but this was not tested in our study.
- For PM2.5 and Clarity Node NO2 sensors, collocated sensor data recovery was close to 100% at sampling sites for the entire sampling period; however, data recovery for individual sensors varied.
- Collocating several sensors at a site, ideally 3 per site, is key to optimizing data recovery and quality.
- PM2.5 sensors operated well during our year-long study with no clear signs of sensor degradation. While Clarity Node NO2 sensors also appeared to operate well, sensor degradation was challenging to evaluate using our dataset due to the seasonal variability of NO2.
- Sensor performance, particularly with PM2.5, is affected by a handful of special events, and caution should be exercised when interpreting sensor data during such events.
Future Uses
- Sensors can be used to assess local air quality trends or make personal exposure decisions.
- Sensors can provide an inexpensive method for filling spatial gaps in air quality not captured by regulatory monitors, determining areas of maximum impact and investigating near source pollution.
- Some sensors, e.g., PM2.5 sensors, may lend themselves to more quantitative use of the data, while other sensors, e.g., NO2 Clarity Nodes, may be better suited for qualitative applications.
- Partnering with air quality experts, who are knowledgeable about quality control measurement standards, is recommended when using low-cost sensors to facilitate data interpretation.
TDEC APC Project Team