
Service Line Inventory: Statistical Method Guidance
This statistical approach may be used by Kentucky public water systems (PWS) to work towards a Service Line Inventory.
Unknown Service lines are defined as lines of unknown material with no documented records.
Deadline for Submitting to DOW: July 1, 2024.
This approach provides the option to develop a service line inventory while reducing the need to inspect every unknown service line for the initial inventory. Division of Water (DOW) highly recommends consulting with DOW staff or seeking external assistance with this method.
This method is not a final solution to completing the inventory. It should be used to supplement other methods to identify service line materials, including records reviews and documenting materials during routine operations.
This method does not eliminate the requirement to submit annual or triennial updated inventories as required by 40 CFR 141.90(e)(3). If using this method, communicate with customers as described in Step 10.
Note: The guidance provided in the document below and this story map may be revised based on future input from the EPA.
Click here to access the full document: Guidance for Using a Statistical Method to Complete a Service Line Inventory.
Two Key Factors in the Success of this Strategy:
- The use of a randomly generated list of unknown service lines to be physically inspected; and
- Ensuring that the entire distribution system is well-represented by the randomly selected service lines.
Before Getting Started
Before starting a statistical approach, the water system must describe how it will address the following:
- A data management plan (i.e., how the data will be stored and handled throughout the lifetime of the project).
- Standard operating procedures and plans for communicating with customers.
- A list of any assumptions made during the statistical approach.
Identification Process
Stop and read before moving forward!
Public water systems (PWS) must first complete a records review before implementing a statistical method. If no lead service lines (LSLs) are identified through the records review, the steps outlined here may be used.
If ANY service line is found to be a lead service line, then the methodology of this document will need to be supplemented with additional steps. Contact the Division of Water for further guidance.
Step 1: Define Strata (Groups) of Service Lines
Groups of unknown service lines must be established so that random sampling is representative of the entire distribution system and population.
First, Separate Data:
This method will require separating out the unknown service lines from the database or spreadsheet that contains all of the system's service lines.
Separate service lines that have been verified (on both the utility and customer sides) via records review. These lines will not be a part of the statistical method. You must still keep complete records of all known lines. If one side of the service line is known and the other side is unknown, classify the line as Unknown.
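The separation rule above can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names (`id`, `utility`, `customer`), and the material labels are assumptions made for the example, not the layout of the DOW template.

```python
def classify_line(utility_side: str, customer_side: str) -> str:
    """Classify a whole service line from the material recorded for each side."""
    sides = [utility_side.strip().lower(), customer_side.strip().lower()]
    if "lead" in sides:
        # Any lead finding means this method must be supplemented (contact DOW).
        return "Lead"
    if all(side == "non-lead" for side in sides):
        return "Non-lead"
    # One or both sides unknown: the whole line is classified as Unknown.
    return "Unknown"


# Hypothetical records, for illustration only (not real data).
service_lines = [
    {"id": "SL-001", "utility": "non-lead", "customer": "non-lead"},
    {"id": "SL-002", "utility": "non-lead", "customer": "unknown"},
    {"id": "SL-003", "utility": "unknown", "customer": "unknown"},
]

known = [r for r in service_lines
         if classify_line(r["utility"], r["customer"]) == "Non-lead"]
unknowns = [r for r in service_lines
            if classify_line(r["utility"], r["customer"]) == "Unknown"]

print([r["id"] for r in unknowns])  # ['SL-002', 'SL-003']
```

The known lines stay in the complete records; only the `unknowns` list moves on to the statistical method.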
See the example below: service lines are separated into known (non-lead) and unknown lines, and the known service lines are then removed from this data set.
Map of the example system: the known service lines are represented in blue and the unknowns in red (not real data).
Second, Organize the Unknowns into Groups:
Table 1: Suggested breakdown of Strata
Now, take the unknown service lines and divide them into at least three groups, called strata (singular: stratum). The exact strata set may vary by water system, but the groupings should be based on building age. Table 1 shows the suggested breakdown of groups.
Choose a set of strata that will adequately represent the entire distribution system, without being too complex.
If there is a large number of buildings of unknown age, choose a different variable to define the groups. Choose a variable that also influences the likelihood of LSLs being present. Alternatively, consider choosing a "simple random sample" of all the service connections; the water system will still be responsible for demonstrating that the samples are representative of the entire distribution system.
See the example below: the unknown service lines are grouped into the strata defined in Table 1.
Map of the example system: the unknown service lines are sorted into four groups based on building age (not real data).
Step 2: Identify how many Service Lines must be Physically Inspected
Table 2: Minimum number of service lines requiring verification
PWSs with fewer than 1,500 Unknown service lines must verify at least 20% of the lines.
PWSs with more than 1,500 Unknown service lines should use Table 2 to determine the minimum number of service lines requiring verification.
In the example water system, there are 100 unknown service lines, so 20% of the lines, or 20 total, must be physically inspected or verified.
If the total number of unknown service lines falls between two values in Table 2, round up to the higher number. It is recommended that systems choose a slightly larger number to verify in order to account for any errors or service lines that must be skipped because of customer refusal or lack of access.
Selecting more than the minimum number of sites can offset future problems, such as having to skip sites due to inability to access them. Sampling more than the minimum also provides insurance that the entire distribution system will be well-represented by the sampling effort and that any replacement samples needed are randomly generated.
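The 20% rule for smaller systems can be sketched as follows. The function name is hypothetical, and rounding up is an assumption drawn from the "at least 20%" wording. Table 2 is not reproduced here, so the sketch defers to it for larger systems. The text above does not address a system with exactly 1,500 unknowns, and the sketch defers that case to Table 2 as well.

```python
def minimum_inspections(unknown_count: int) -> int:
    """Minimum number of unknown service lines to physically verify."""
    if unknown_count < 0:
        raise ValueError("unknown_count cannot be negative")
    if unknown_count < 1500:
        # At least 20% of the unknowns, rounded up (integer ceiling of n / 5).
        return (unknown_count + 4) // 5
    # Larger systems: look the value up in Table 2 of the guidance document.
    raise NotImplementedError("Use Table 2 of the guidance document")


print(minimum_inspections(100))  # 20, matching the example water system
```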
Step 3: Randomly select Service Lines within each Stratum for Physical Inspection
Now, identify the number of service connections in each stratum from Step 1. Based on this, allocate a portion of the total number of samples to each stratum so that the verifications represent the system well.
A breakdown based on the example water system can be seen in Table 3. In this example, the PWS has 100 total Unknown service lines and requires a minimum of 20 lines that must be randomly selected.
Option 1: Take the minimum number of service lines requiring verification and calculate the number of lines that must be verified in each stratum. This is done by multiplying 20 by the percentage of unknowns in each stratum.
Option 2: Choose a slightly larger number of service lines to verify and calculate the number of lines that must be verified in each stratum. This is done by multiplying 25 by the percentage of unknowns in each stratum.
Table 3: Breakdown of the number of samples to randomly generate in each stratum for Options 1 and 2.
As you can see, the total number of samples across the 4 strata is larger than the original number selected for both options. This is due to rounding up when calculating the number of samples required from each group. To ensure that the sample set fully represents the entire distribution system, avoid rounding down.
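The proportional allocation with rounding up can be sketched as below. The stratum counts are hypothetical values chosen for illustration and are not the figures from Table 3; the function name is also an assumption.

```python
def allocate_samples(total_samples: int, stratum_counts: dict) -> dict:
    """Split total_samples across strata in proportion to their unknowns.

    Each stratum's share is rounded up, so the overall total can exceed
    total_samples.
    """
    n_unknown = sum(stratum_counts.values())
    return {
        name: (total_samples * count + n_unknown - 1) // n_unknown  # ceiling
        for name, count in stratum_counts.items()
    }


# Hypothetical stratum sizes (100 unknown lines in total; not real data).
strata = {"Stratum 1": 14, "Stratum 2": 31, "Stratum 3": 33, "Stratum 4": 22}

option_1 = allocate_samples(20, strata)
option_2 = allocate_samples(25, strata)
print(option_1, sum(option_1.values()))  # totals 22 with these counts
print(option_2, sum(option_2.values()))  # totals 27 with these counts
```

With these assumed counts, rounding up raises the totals from 20 to 22 and from 25 to 27, which is the effect described above.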
Now, randomly select sample sites from each stratum. For example, if Option 2 is selected, randomly select four sample sites from Stratum 1. Selection within each stratum must be uniformly random and not based on any specific criteria that can introduce bias. See Appendix C of the guidance document for an easy way to generate a uniformly random set of service lines for inspection.
Note: Have a plan for choosing alternate sites randomly, in the event that a site chosen for physical inspection cannot be sampled for some reason. In order to ensure randomness when needing alternate sites, use these two criteria:
- Outline very specific criteria about when a sample site can be skipped (including documented refusal by homeowner, extreme difficulty accessing one or both sections of the service line, etc.), and
- Use samples in order of the initial randomly generated list, without skipping any unless the criteria outlined above occur.
The need to choose alternate samples later can be avoided by verifying a few more random samples than required, as demonstrated in Table 3.
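One way to keep both the selection and any alternates random is to shuffle each stratum's list once and then work through it in order. The sketch below uses Python's standard `random` module; it is not the Appendix C procedure, and the function names, IDs, and seed are assumptions for illustration.

```python
import random


def draw_inspection_order(line_ids, seed):
    """Return the stratum's service line IDs in a reproducible random order."""
    order = sorted(line_ids)           # fixed starting order
    random.Random(seed).shuffle(order)  # uniform random permutation
    return order


def select_sites(order, n_required, skipped=frozenset()):
    """Take sites in list order, passing over only documented skips."""
    return [line_id for line_id in order if line_id not in skipped][:n_required]


# Hypothetical Stratum 1 with 14 unknown lines (not real data).
stratum_1_ids = [f"SL-{n:03d}" for n in range(1, 15)]
order = draw_inspection_order(stratum_1_ids, seed=2024)

planned = select_sites(order, n_required=4)
# If the second site on the list is a documented refusal, the next site
# on the same randomized list becomes its replacement.
revised = select_sites(order, n_required=4, skipped={order[1]})
```

Recording the seed and the full ordered list alongside the skip documentation makes it possible to show later that no site was passed over without a documented reason.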
Step 4: Ensure each Stratum, and the Distribution System as a Whole, is Well-Represented by the Samples
After randomly selecting the service lines to be sampled, review where they are located within the entire distribution system to ensure that all areas of the system are well-represented by the sample set. This is easiest if the unknown service lines and samples are plotted on a map.
See the example below: the randomly selected service lines that need to be physically verified from each stratum are indicated by stars on the map.
Click on a point to see information about the service connection (not real data).
Another way to test whether the sample set is representative of the entire distribution system is to compare the samples to other variable data in a table or histogram. For example, if the distribution system has distinct areas that all need to be adequately sampled, like zones, census blocks, or meter-reading tracts, those can be used to assess whether the model represents all areas.
In the example distribution system, there may be zones that influence the likelihood that a lead service line is present, other than building age. For example, the area could be split into the following zones:
- Residential Zones in blue (R-1 and R-2)
- Agricultural Zone in green (A-1)
- Business Zone in red (B-1)
The samples and zones are displayed on the map below, but also examine them with a table or histogram to ensure representativeness.
Map of unknown service lines that were randomly selected for verification in each of their zones (not real data). This map represents Option 2.
To examine the percentage of unknown service lines and randomly selected sites in each zone, display the results in a table or histogram. Table 4 displays the percent of unknown service lines that must be sampled in each zone. The minimum required number of samples (20) is used to calculate the percent of samples that are covered in that zone. Option 1 shows the minimum required number of samples, and Option 2 shows more than the minimum.
Table 4: Breakdown of unknown service lines in each zone and the percentage of samples in each
Even though in Step 3 there were sufficient samples in each of the strata chosen, this example shows that there would be insufficient samples in Zone R-1 if only the minimum number of samples were selected, indicating that Zone R-1 may be inadequately sampled. By initially generating more random sample sites (27) than were required (20), the number of sample sites in each zone meets the minimum required percentage of unknowns that must be verified in all zones. This demonstrates the benefits of choosing a slightly larger number in Step 2.
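A zone-by-zone check of this kind can be sketched as follows. The zone counts are hypothetical and are not the values in Table 4. Comparing each zone's sampled share against the 20% minimum is one reading of the check described above, and the function name is an assumption.

```python
from fractions import Fraction

MIN_FRACTION = Fraction(1, 5)  # the 20% minimum from Step 2


def under_sampled_zones(unknowns_by_zone, samples_by_zone,
                        min_fraction=MIN_FRACTION):
    """Return the zones where the sampled share of unknowns is below the minimum."""
    flagged = []
    for zone, n_unknown in unknowns_by_zone.items():
        n_sampled = samples_by_zone.get(zone, 0)
        if n_unknown > 0 and Fraction(n_sampled, n_unknown) < min_fraction:
            flagged.append(zone)
    return flagged


# Hypothetical counts for illustration only (not real data).
unknowns = {"R-1": 40, "R-2": 30, "A-1": 10, "B-1": 20}
option_1_samples = {"R-1": 7, "R-2": 7, "A-1": 3, "B-1": 5}  # smaller draw
option_2_samples = {"R-1": 9, "R-2": 8, "A-1": 3, "B-1": 7}  # larger draw

print(under_sampled_zones(unknowns, option_1_samples))  # ['R-1']
print(under_sampled_zones(unknowns, option_2_samples))  # []
```

The same function can be rerun with a second variable (for example census blocks instead of zones) by swapping in a different pair of count tables.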
Complete this exercise with at least two different variables (do not include the variable used for defining strata in Step 1) to demonstrate that samples are representative of the entire distribution system and population.
Variables may include the geographic data mentioned above, or socioeconomic factors (income, home size, presence of vulnerable populations, disadvantaged community), land value, town/county/community lines, or other factors that are relevant to the area served. Some tools that may help with this are:
- U.S. Census Bureau Census Block Viewer - this is a good tool to view how many households are in each census block in a community.
- EPA's EJ Screen - this is an environmental justice mapping tool that demonstrates socioeconomic trends.
- The Centers for Disease Control and Prevention’s Social Vulnerability Index - these data can help ensure that samples cover vulnerable communities.
- Climate and Economic Justice Screening Tool - this map highlights disadvantaged census tracts.
Take some time to play around with EPA's EJ Screen below. Search for a location of interest and click on the data to the left to view trends on the map. You may find that locations with smaller populations don't display as much information as more populous areas such as Lexington, KY.
Use the EJ Screen to investigate socioeconomic trends in your water system service area or town.
For less populous areas, look at the U.S. Census Block Viewer below. Census blocks break up areas into smaller sections. Use the search bar to look at a specific location or zoom in on the state of Kentucky to see what you can find. Once you zoom in you can click on each block to display more information about it, such as the number of housing units.
This data can be added to a GIS map with your PWS data and used to better understand how well the random samples represent the distribution system.
Step 5: Conduct a Two-point (or more, if needed) Physical Inspection
Physical identification of a minimum of two points of the unknown service lines is required.
The service line must be physically inspected on both sides of the curb stop.
- One point on the customer-owned section must be verified, and
- One point on the utility-owned section must be verified.
Physical identification methods include excavation, televising, in-home inspections, and other emerging methods. All physical inspections should be conducted or overseen by water system personnel.
Identifying Service Line Materials.
Refer to EPA's "Guidance for Developing and Maintaining a Service Line Inventory," Chapter Five, for methods of physically identifying service lines.
Only skip a site from the randomly generated list when absolutely necessary. Clearly document each case and replace each skipped sample with a randomly selected sample, meeting all the same criteria from Steps 1-4. Even if additional sites were generated in Step 2, clearly document the reason for skipping a site and demonstrate that removal of such site will not affect the representativeness outlined in Step 4.
Step 6: Record Results of the Physical Inspection Process
Record the results of each physical inspection in the database. In the DOW Service Line Inventory template: record the material and verification method in the System-owned and Customer-owned Service Line columns. (Verification method: choose the “Field verified” option that applies).
Kentucky DOW Service Line Inventory Template (click here to download the template).
Step 7: Enter Results for Remaining Unknown Service Lines
Complete this step once all randomly selected sites have been verified to have non-lead service lines on both the customer and utility side.
For the remaining Unknowns: record the verification method as Statistical Method or Predictive Model. Record the system-owned and customer-owned service line material as non-lead. Record this as you would other service lines in the database.
Note: If lead service lines are found during physical verification of service lines in any stratum, further investigation will be warranted. Do NOT record the remaining unknowns in the system as non-lead. Consult DOW, at DrinkingWaterCompliance@ky.gov, for further guidance.
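The decision logic of this step can be sketched as below. The record layout and field names are hypothetical and do not reflect the DOW template's columns. The sketch updates only the sides that are still unknown, which is an assumption made so that record-verified sides keep their original verification method.

```python
STAT_METHOD = "Statistical Method or Predictive Model"
SIDES = ("utility", "customer")


def record_remaining_unknowns(inventory: dict, sampled_ids: set) -> int:
    """Mark remaining unknown sides as non-lead; return the number of lines updated.

    Raises ValueError if any sampled line is lead or not yet fully verified.
    """
    for line_id in sampled_ids:
        materials = [inventory[line_id][f"{s}_material"] for s in SIDES]
        if "lead" in materials:
            raise ValueError("Lead found in sample: do not reclassify; consult DOW")
        if "unknown" in materials:
            raise ValueError(f"Inspection of {line_id} is not complete")
    updated = 0
    for line_id, record in inventory.items():
        changed = False
        for s in SIDES:
            if record[f"{s}_material"] == "unknown":
                record[f"{s}_material"] = "non-lead"
                record[f"{s}_method"] = STAT_METHOD
                changed = True
        updated += changed
    return updated


# Hypothetical inventory for illustration only (not real data).
inventory = {
    "SL-001": {"utility_material": "non-lead", "utility_method": "Field verified",
               "customer_material": "non-lead", "customer_method": "Field verified"},
    "SL-002": {"utility_material": "unknown", "utility_method": None,
               "customer_material": "unknown", "customer_method": None},
    "SL-003": {"utility_material": "non-lead", "utility_method": "Records review",
               "customer_material": "unknown", "customer_method": None},
}
n_updated = record_remaining_unknowns(inventory, sampled_ids={"SL-001"})
print(n_updated)  # 2
```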
Step 8: Retain Identification Records
Retaining detailed records is a key part of the Inventory process.
Place an emphasis on this step throughout the entire statistical method process.
Create, compile, and retain documentation of all service line identification efforts. It is especially important to record all evidence of physical inspections of service lines.
DOW may ask PWSs to produce or submit these records. Develop an organization system so records can quickly and easily be identified according to the service line ID from the inventory.
Step 9: Continuously Update Records with New Information
Create a standard procedure to continuously document service line materials during routine operations and maintenance (O+M) and with any distribution system upgrades, and record those into the Service Line Inventory. Any service line whose verification method has been recorded as 'Statistical Method' or 'Predictive Model' should be updated as field-verified information becomes available. In addition, service line materials verified based on historical records alone should be updated if a field inspection identifies a material different from what the historical records stated.
Update the Service Line Inventory with the results of future field inspections during O+M. If any lead service lines are identified, the statistical method will need to be modified; consult with DOW.
Step 10: Communicate with Customers
Because the water system does not have data about the service lines that were not identified with a records review or physically verified, it is important to communicate clearly to customers what the statistical method demonstrates.
Avoiding lead contamination in drinking water is essential to public health.
An example of an appropriate way to present this information is:
[Water system name] found that [n] out of [27] randomly selected service lines were not lead. Therefore, we are 95% confident that fewer than 1% of the unknown service lines are lead. We are going to classify all of those unknown service lines as non-lead. We will continue to document service line materials in the future during routine operations in the distribution system. If you would like [Water system name] to physically inspect your service line to verify the material, please contact us at [phone number].
Remember that some customers will not understand what a service line is, or who owns the pipe that serves their home, so provide definitions and explanations as needed.
In addition to sending the letter to customers, include a brief statement in the Consumer Confidence Report (CCR) that states that a statistical model was used to complete the Service Line Inventory and that customers may contact the water system to request inspection of their service line material.
Submission to DOW
If choosing to use this statistical method, submit the following to DOW prior to submitting the initial Service Line Inventory:
- The following, as described in Appendix A of the official document:
- Description of the data management and storage plan.
- Description of standard procedure(s) used and how it will be communicated to customers.
- A list of any assumptions made.
- Contact information of person(s) who is(are) responsible for executing this method at the utility, and a brief explanation of their qualifications to do so. This could be water system staff, a technical assistance provider, or a consultant.
- Description of the strata used, and what data sources were used to determine the stratification scheme.
- Documentation of physical inspections that have been/will be performed.
- List of circumstances under which sites may be skipped and how skipped sites are replaced with new sites.
- Procedure to be used to ensure that service line materials are documented during routine operations in the months and years after the initial Service Line Inventory is completed. Demonstrate a plan for continual updates to the Service Line Inventory database with information gathered during routine work such as leak repairs, meter replacements, main line updates, etc.
- Example letter that will be sent to customers whose service line is listed as ‘Non-lead’ via the statistical method.
Submit these documents to DOW by July 1, 2024.
These should be submitted via eForm 169: Drinking Water Information and Data Submittal. Click here for instructions using eForm 169.
Definitions of key terms
- LSL = lead service line; the pipe running between the main water line and the building premise plumbing (including sections owned by the utility and owned by the customer), made primarily of lead
- Representative sample set = for the purpose of this document, a representative sample set is one that includes samples from all areas of the distribution system, with an emphasis on ensuring the sample set includes samples from areas that are most likely to have lead service lines (e.g., areas that were constructed in the early 1900s or before).
- Sample = for this statistical method, a sample is a service line to be physically verified. A single sample includes both the utility-owned and the privately-owned sections of service line, as well as both active and inactive service lines.
- Stratum (plural: strata) = a group defined by certain variable(s) used to identify how to focus the selection of samples (e.g., groups of service lines, where each group is defined by the variable: age of building construction)
- Unknown service line = service line of unknown material with no documented records or inspections of the material composition, or where accuracy of existing records is questionable. Records that should be used to identify service line materials are defined in 40 CFR 141.84(a)(3).
- Variable = a criterion that can be measured and that may influence the type of service line material, such as ‘age of construction’ or ‘development zone.’ Variables are used to identify strata for the selection of samples.