Map Medicare spending
Maps are important decision-making tools. They help determine problem areas and indicate where resources can be better spent. But maps don't always present a single truth. Sometimes you can find different views of the truth in the same set of data. In this case, you'll consider the per capita (per person) cost of the Medicare program in 2022. These costs vary appreciably from place to place. When you make a map, you need to make decisions about how to group these varying cost values. Which range of costs is high or low? Your decisions help create spatial patterns, and these patterns lead map users to draw conclusions. This raises concerns as to the best way to visualize data and find reliable patterns.
First, you'll compare some common techniques for classifying (grouping) data and see how your choices affect spatial patterns on the map. You'll work with Medicare cost data aggregated by county. Medicare is a United States government health insurance program covering about 50 million people who are over age 65 or who meet certain medical conditions. Information about the Medicare program is available from the Centers for Medicare & Medicaid Services.
Open the map
In this section, you'll open a map, familiarize yourself with its features and attributes, and save your own version of the map for further work.
- Open the Medicare Spending by County map.
The map appears in Map Viewer showing all the counties in the United States.
The map contains a layer with data on 2022 Medicare spending in each county. You will use this data to style the map to show where there are high and low levels of spending.
- If necessary, sign in to your ArcGIS organizational account.
Note:
If you don't have an organizational account, see options for software access.
- If necessary, at the bottom of the Contents (dark) toolbar, click Expand.
- On the Contents toolbar, click Layers.
The map contains two layers: State Boundaries (visibility is turned off) and Medicare Spending by County.
- In the Layers pane, for the State Boundaries layer, click the Visibility button.
You can now see where the state lines align with the county boundaries.
Next, you will explore the pop-ups for the Medicare Spending by County layer.
- On the map, click a county.
A pop-up appears, showing the name of the county, state, and the amount of Medicare spending in 2022 per capita.
The per capita cost data you'll work with reflects the standardized risk-adjusted cost. It differs from the actual cost in two ways. First, it's standardized to even out differences in wages and the cost of goods and services from one part of the country to another. Second, it's risk adjusted to account for differences in age, sex, existing health conditions, and other relevant demographic factors. The standardized risk-adjusted value is the best estimate of what the actual costs would be if socioeconomic, demographic, and health conditions were uniform across the country.
Note:
For a detailed explanation of how standardization and risk adjustment are calculated, see Medicare Data for the Fee-for-Service Geographic Variation Public Use File: A Methodological Overview (May 2024 Update).
- Click a few other regions to review their pop-ups, and then close the pop-up.
Pop-ups tell you about individual features, but they don't help you see spatial patterns. To see patterns, you must symbolize the data. You'll do this in your own version of the map so you can save the changes.
- On the Contents toolbar, click the Save and open button and choose Save as.
- In the Save map window, for Title, type Medicare Costs per Capita in 2022 and add your name or initials.
Note:
You cannot create two layers in an ArcGIS organization with the same name. Adding your initials to a layer name ensures that other people in your organization can also complete this tutorial. Once a layer has been created, you can rename it in the map to remove your initials, which will not affect the name of the underlying data layer.
- Click Save.
You opened and explored a web map containing Medicare spending by each county in the United States in 2022. You have saved a copy of the map so you can now style the map to answer your research question: Where is there significantly high Medicare spending in the country?
Style by natural breaks
A typical way to present spatial patterns on a map is to associate ranges of data values with a color ramp. There are a few common methods for specifying value ranges. In this section, you'll use the Natural Breaks method.
- In the Contents toolbar, click Layers.
- In the Layers pane, ensure the Medicare Spending by County layer is selected.
The blue line next to the layer name indicates that the layer is selected.
- On the Settings (light) toolbar, click Styles.
- In the Styles pane, for Choose attributes, click Field.
- In the Select fields window, click Standard Payment per capita and click Add.
Once you choose the attribute, available drawing styles are presented. A suggested style is automatically applied and is indicated by a check mark in the Styles pane.
On the map, the counties are now drawn in shades of blue. Darker shades represent regions where Medicare expenditures were higher in 2022.
To better understand the layer style, you will explore the layer's style options.
- In the Styles pane, for the Counts and Amounts (color) style, click Style options.
The Style options pane appears with all the ways you can configure the layer's style. It includes a histogram, which shows the range of values in the field used to style the layer and the colors that symbolize those values.
The histogram provides a lot of helpful information about how your data is distributed and how the layer is styled. The top and bottom of the histogram show the lowest and highest values in your dataset. On the side of the histogram, the middle value is the mean of the data. The values above and below the mean mark one standard deviation from it.
The style is currently using a continuous, unclassed method, meaning the symbol colors simply change gradually from the minimum value to the maximum value.
Next, you will experiment with a classified method. By classifying the data (that is, dividing it into classes or groups), you set the ranges and breaks for each class. Different classification methods produce different classes and therefore different-looking maps.
- Toward the bottom of the Style options pane, turn on Classify data. For Number of classes, type 5.
Choosing five classes provides more variation in the map without there being too many classes, which could make it difficult to see differences between each class.
In the map legend, the range of cost values is grouped into five classes by the default Natural Breaks classification method.
Natural Breaks uses clusters and gaps in the value range to define classes.
One characteristic of the Natural Breaks method is that value ranges may be different from class to class. Here, the value range of the lowest class ($4,244 to $9,263) is $5,019, while the range of the next class ($9,263 to $10,848) is just $1,585. Another characteristic is that classes may have different numbers of members. For example, the highest class includes 118 counties while the lowest includes 536 counties.
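The idea behind Natural Breaks can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm Map Viewer runs: it assumes the goal is to minimize the variation inside each class, and it finds the best split by exhaustive search, which is only practical for tiny inputs. The function names and the toy cost values are hypothetical.

```python
from itertools import combinations

def within_class_ss(group):
    """Sum of squared deviations from the group's mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)

def natural_breaks(values, k):
    """Exhaustively search for the k-class split with the least within-class variation.

    Returns the classes as sorted lists of values. Suitable for small inputs only.
    """
    data = sorted(values)
    best_cost, best_classes = None, None
    # Choose k-1 cut positions between consecutive sorted values.
    for cuts in combinations(range(1, len(data)), k - 1):
        bounds = (0, *cuts, len(data))
        classes = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(within_class_ss(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical per capita costs (not values from the tutorial layer).
toy_costs = [4200, 4500, 4700, 9300, 9500, 9600, 9800, 10000, 12500, 12700]
toy_classes = natural_breaks(toy_costs, 3)
for c in toy_classes:
    print(f"{c[0]:>6,} to {c[-1]:>6,}: {len(c)} values, range {c[-1] - c[0]:,}")
```

With this toy data, the classes follow the gaps in the values, so both the value ranges and the member counts differ from class to class, which mirrors the two characteristics described above.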
It is also important to consider if you have any null values.
- In the Style options pane, under the histogram, turn on Show features with out of range or no values.
The counties in the state of Connecticut now are shown in gray.
In 2022, Connecticut made changes to their county boundaries. The Medicare spending data for 2022 used different boundaries than the current boundaries, resulting in null values. For the purpose of this tutorial, you will exclude this data from your analysis.
Note:
To learn more about the change in Connecticut's counties, see Change to County-Equivalents in the State of Connecticut for 2022 ACS.
Next, you will adjust layer style by changing the color ramp.
- In the Style options pane, click the symbol under Symbol style to change the symbol and specify the symbol settings.
- In the Symbol style window, click the color ramp for Colors. In the Ramp window, choose Purple 2.
Tip:
To see the name of a color ramp, point to the color ramp.
- Click Done.
The layer style updates.
- In the Style options pane, click Done twice.
- On the Contents toolbar, click Save and open and choose Save to save your map.
Explore spatial patterns
You'll explore the spatial patterns on the map by looking at the legend of the Medicare Spending by County layer and then zooming to different geographic areas and opening pop-ups.
- In the Layers pane, click the Visibility button for the State Boundaries layer.
The map shows distinct patterns. High rates of expenditure are shown throughout the South, especially in Texas, Louisiana, Mississippi, and Florida. High levels of spending were also prevalent through the Great Plains region, notably in Oklahoma and Kansas. There are also isolated areas of high spending elsewhere in the country.
- In the Contents toolbar, click Legend.
The legend shows the value range associated with each color. In any classification scheme, class breaks are important because they lead map users to form judgments: in one place costs seem to be high, while in another they seem to be very high. In fact, however, the difference between a given pair of values in different classes may be small.
- On the Contents toolbar, click Bookmarks and in the Bookmarks pane, click Southwest.
- Click one of the counties styled for the middle of the five classes.
San Bernardino County reported a per capita Medicare expenditure of $11,986 in 2022. The middle class ranges from $10,848 to $12,452 per capita.
- Close the pop-up. Click the county in the highest class in Nevada and observe the pop-up information.
In Clark County, the 2022 expenditure was $12,660. The difference between the two regions is only $674, but it's enough to put them in different classes—at least with the Natural Breaks classification method.
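The comparison can be checked with a short Python sketch. It uses only the class breaks quoted in this tutorial; the two upper breaks are not stated, so they are left out. How a value exactly equal to a break is assigned is an assumption here (it goes to the lower class), and the function name is hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the three lowest Natural Breaks classes, as quoted in the text.
# The two upper breaks are not stated in the tutorial, so they are omitted here.
known_upper_bounds = [9263, 10848, 12452]

def class_index(value, upper_bounds):
    """Return the 0-based class index; values above the last bound land one class higher."""
    return bisect_left(upper_bounds, value)

san_bernardino = 11986
clark = 12660

print(class_index(san_bernardino, known_upper_bounds))  # index 2: the middle class
print(class_index(clark, known_upper_bounds))           # index 3: above the $12,452 break
print(clark - san_bernardino)                           # difference in dollars: 674
```

The break at $12,452 falls between the two values, so a difference of $674 is enough to change the color a county receives.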
- Close the pop-up.
- Zoom to the Midwest and Northeast bookmarks and compare other counties' Medicare expenditure.
Some counties had very high amounts of Medicare spending compared to their neighboring counties, such as Monroe County in the southern part of Iowa. In some cases, the amount spent in one county styled in the highest class does appear strikingly high compared to a neighboring county styled in the lowest class. But there may be some examples where the difference does not seem very large despite the class style difference. There are other methods to style the data that might better communicate the differences between counties.
- When you're finished exploring, in the Bookmarks pane, click USA to zoom back to the continental United States.
- In the Layers pane, turn off the State Boundaries layer.
- Save the map.
In the next section, you will experiment with different classification methods.
Classify the data by other methods
Natural Breaks isn't the only available classification method. You'll see to what extent the spatial patterns change when you use the Equal Interval and Quantile methods.
- In the Layers pane, ensure the Medicare Spending by County layer is selected and, on the Settings toolbar, click Styles.
- In the Styles pane, for the Counts and Amounts (color) style, click Style options.
- Under Classify data, for Method, choose Equal interval.
These class breaks are different. The defining characteristic of the Equal interval method is that value ranges are the same among all classes. In this case, the range is about $4,307. A class can have any number of counties, or even no counties.
Although a fairly similar pattern of high and low values is evident, a different impression is created. Fewer regions fall into the lowest and highest classes, making them stand out, and the map has an overall homogeneous appearance.
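As a rough sketch of how Equal interval works, the Python below divides the span between the minimum and maximum into classes of identical width and counts the members of each. The data is hypothetical, the function names are illustrative, and treating each class as including its lower bound is an assumption rather than a documented Map Viewer rule.

```python
def equal_interval_breaks(values, k):
    """Return k+1 break values that split [min, max] into k equal-width classes."""
    low, high = min(values), max(values)
    width = (high - low) / k
    return [low + width * i for i in range(k + 1)]

def equal_interval_counts(values, k):
    """Count how many values fall in each equal-width class."""
    low, high = min(values), max(values)
    width = (high - low) / k
    counts = [0] * k
    for v in values:
        # The maximum value would compute index k, so clamp it into the last class.
        idx = min(int((v - low) / width), k - 1)
        counts[idx] += 1
    return counts

# Hypothetical per capita costs (not values from the tutorial layer).
toy_costs = [4000, 9000, 9500, 10000, 10500, 11000, 11500, 12000, 14000]
breaks = equal_interval_breaks(toy_costs, 5)
counts = equal_interval_counts(toy_costs, 5)
print(breaks)
print(counts)
```

In this toy example every class is 2,000 wide, but the second class is empty and most values crowd into the middle classes, which is the effect that gives an Equal interval map its homogeneous look.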
- In the Style options pane, for Method, choose Quantile.
The classes change again. The defining characteristic of the Quantile method is that all classes have the same number of members (in this case, either 626 or 627 counties). The value ranges among classes may be very different. Here, the value range of the lowest class is $5,266, while the range of the middle class is $714.
In contrast to the previous map, the Quantile method tends to emphasize highs and lows and may exaggerate their importance.
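The Quantile method can be sketched the same way. The Python below sorts the values and cuts them into classes whose sizes differ by at most one, which is why the tutorial data lands in classes of 626 or 627 counties. The data and function name are hypothetical, and how ties at a class boundary are handled in Map Viewer is not covered here.

```python
def quantile_classes(values, k):
    """Split sorted values into k classes whose sizes differ by at most one."""
    data = sorted(values)
    n = len(data)
    # Integer cut positions spread as evenly as possible across the sorted list.
    bounds = [i * n // k for i in range(k + 1)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical per capita costs (not values from the tutorial layer).
toy_costs = [4000, 8800, 9000, 9500, 10000, 10200, 10500, 11000, 11500, 12000, 14000]
classes = quantile_classes(toy_costs, 5)
sizes = [len(c) for c in classes]
ranges = [c[-1] - c[0] for c in classes]
print(sizes)
print(ranges)
```

The class sizes are nearly identical while the value ranges vary widely: the outer classes stretch to absorb the extreme values, and the middle classes are narrow.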
None of the classification methods you've looked at is right or wrong. The Quantile and Equal interval methods give accurate results when data is continuously and evenly distributed throughout the value range. This is often not the case, however. When there are gaps and clusters in the data, the Natural Breaks method is recommended.
In this situation, the data has a fairly normal, or bell-shaped, distribution, as shown by the gray bar chart adjacent to the color ramp. With a distribution this regular, the Quantile method is recommended. Keep in mind that because the Quantile method groups features in equal numbers in each class, the resulting map can be misleading when a dataset is unevenly distributed.
The Quantile method is also a useful style for determining resource allocation. For example, if you need to develop a health policy that supports the areas with the most need, you could use the Quantile method with five classes. The highest class then represents the top 20 percent of counties, which should receive this funding first.
Note:
To learn more about classification methods, see Use style options (Map Viewer) - Classification methods and the video Configure a choropleth map.
- In the Styles pane, click Done.
- Save the map.
Decisions about how to classify data are at least partly subjective. You might like the way a map looks or you might want to convey a certain message. No classification method is wrong, and each may help emphasize an aspect of the data that's not apparent from the others. But you may wonder if it's possible to get a better sense of which spatial patterns are stable and reliable, to know which places definitely stand apart from the others.
The answer is yes: there are analysis techniques that help you group and visualize data in less subjective ways. It is recommended that you try multiple classification methods before choosing the one you will use for presentations or decision-making purposes. Next, you'll explore hot spot analysis and see how statistical evaluation can find spatial clusters of significantly high and low values in your data.
Analyze Medicare spending hot spots
In the previous section, you observed how the spatial patterns on a map change depending on the data classification method. Next, you'll run a hot spot analysis on the data in order to draw more definite conclusions about patterns. Hot spot analysis applies statistical tests to find areas where values are significantly different from the norm.
Find hot spots
In the previous section, the maps you styled showed variation in the amount of Medicare spending around the country. In 2022, spending was relatively higher in states in the South and the Great Plains. Spending was low in New England, parts of the Midwest, the Northwest, and the Rocky Mountain region. But styling the map by different classification methods does not tell you whether these differences are statistically significant. The Find Hot Spots tool identifies statistically significant spatial clustering of high values (hot spots) and low values (cold spots) or data counts using the Getis-Ord Gi* statistic.
- If necessary, ensure that you are signed in to your ArcGIS organizational account and open your saved map, Medicare Costs per Capita in 2022.
- On the Contents (dark) toolbar, click Layers. In the Layers pane, click the Medicare Spending by County layer so it is selected.
- In the Settings (light) toolbar, click Analysis.
- In the Analysis pane, click Tools. In the Tools pane, on the search bar, type hot spot and press Enter.
The Find Hot Spots tool appears in the list of results.
The Find Hot Spots tool employs a spatial statistical technique to identify spatial patterns, providing a confidence level for the presence of high or low-value clusters.
- In the list of results, click the Find Hot Spots tool.
The Find Hot Spots pane appears. At the top of the pane, next to the tool name, the Help button opens a web page with more information about the tool.
The first parameter is Input features. The Input features group includes the Input layer parameter, which is the layer that contains the point or polygon features on which hot spot analysis will be performed.
- In the Find Hot Spots pane, for Input features, choose Medicare Spending by County.
In the Hot spot settings section, Analysis field is the field that will be analyzed for clusters of high values (hot spots) and low values (cold spots). You want to analyze hot and cold spots of Medicare spending.
- For the Analysis field, choose Standard Payment per capita.
The remaining settings can be left to the default settings. Finally, you will provide the name of the layer that will be created when the tool is run.
- For Output name, type Medicare Spending Hot Spots and add your name or initials.
- Click Estimate credits.
Credits are the currency used across ArcGIS Online. They are consumed during specific transactions, such as performing analytics, storing features, and geocoding.
Running this tool will require 3.143 credits.
Note:
To learn more about credits, see Understand credits. You can learn how many remaining credits are in your ArcGIS Online account if your organization administrator has enabled you to view that information. If it is enabled, at the top of the page, click your username and choose My settings. On the My settings page, click Credits to see how many remaining credits are in your account. If it is not enabled, contact your organizational account administrator.
- Click Run.
As the tool runs, you can view its progress by clicking the History tab in the Analysis pane.
After a few minutes, the Medicare Spending Hot Spots layer is added to the map.
On the map, red and blue areas represent statistically significant clusters of high and low costs, respectively. In regions symbolized in white, the amount of spending did not stand out as significantly high or low.
The confidence levels of significance reveal the likelihood of high or low values in the study area being clustered. Hot and cold spots with over 90 percent confidence imply that this spatial clustering is likely not due to random chance, but rather the result of some spatial process. A higher confidence level increases our certainty that the observed patterns are occurring for a specific reason.
The results confirm that spending is in fact significantly high, in the statistical sense, in several states in the South and the Great Plains. One high-spending hot spot that was not as obvious earlier in the tutorial covers counties in New Jersey and around New York City.
- On the Contents toolbar, click Legend.
The labels on the layer legend explain the symbols. For example, a hot spot with 99 percent confidence means there is only a 1 percent chance that a cluster of high costs occurred randomly.
The legend heading above the symbols is generated by a field name alias in the layer table. You'll change this heading to something meaningful later in the tutorial.
- Save the map.
Using the Find Hot Spots analysis tool allowed you to see clearer, statistically significant differences in the amount of Medicare spending in 2022.
Change layer symbology
Users of your map may find it more helpful to see the hot and cold spots displayed with state boundaries rather than county boundaries.
- In the Layers pane, select the Medicare Spending Hot Spots layer and on the Settings toolbar, click the Styles button.
- In the Styles pane, for the Counts and Amounts (color) style, click Style options.
- In the Style options pane, click the Symbol style button.
- In the Symbol style window, for the Outline width, type 0.
- In the Style options pane, click Done twice.
- In the Layers pane, click the Visibility button for the State Boundaries layer.
- Drag the State Boundaries layer to the top of the Layers pane.
The map now only shows the state boundaries.
The hot spots of Medicare spending are in the states in the Gulf Coast region, Oklahoma, Kansas, and New Jersey. The major cold spot regions are in the Northwest, Rockies, portions of the Midwest, New England, and Virginia.
- Save the map.
Understand Hot Spot results
In this section, you will further explore the resulting hot spot layer to better understand what the tool analyzed. First, you will explore the hot spot layer fields that were generated by the Find Hot Spots tool.
- In the Layers pane, for the Medicare Spending Hot Spots layer, click the Options button and choose Show table.
The table appears.
Tip:
To better view the field names in the table, you can close panes and collapse toolbars. You can also point to a field name to see the full field name.
You will configure the table to show the key fields you want to explore and compare the results for different counties to better understand the Find Hot Spots analysis results.
- At the top of the table, click the field visibility button.
- In the Field Visibility window, uncheck Source_ID and Standard Payment per capita, and click Done.
Only five fields are visible in the table.
- On the map, click a red county.
A selected feature highlights in bright cyan.
- In the table, click the Show selected button.
The table filters to only show the selected record.
- On the map, click a blue cold spot county and a white county with no statistical significance.
The counties you clicked add to the filtered table. Now you have three different hot spot result records to compare to one another as you learn more about the resulting fields.
- In the table, observe the GiPValue and GiZScore fields.
The Gi part of the field name refers to the Getis-Ord Gi* (pronounced G-i-star) statistic, which is used to calculate z-scores and p-values. Gi* evaluates each feature together with its neighbors and then compares the local average to the average of all the features in the study area to calculate the probability that this cluster of values is significantly higher or lower than the rest of the study area. In other words, the tool looks at each feature within the context of neighboring features.
Note:
To learn more about the Getis-Ord Gi* statistic, see How Hot Spot Analysis (Getis-Ord Gi*) works.
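For readers who want to see the arithmetic, here is a minimal Python sketch of the Gi* z-score using the commonly published Getis-Ord formula with binary weights and a fixed distance band. It is an illustration under those assumptions, not the Find Hot Spots implementation, and it omits the FDR correction. The function name, the toy points, and the band of 1.5 units are hypothetical.

```python
from math import dist, sqrt

def getis_ord_gi_star(points, values, band):
    """Return a Gi* z-score per feature, using binary weights within a fixed distance band."""
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for p in points:
        # Weight 1 for every feature within the band, including the feature itself.
        weights = [1.0 if dist(p, q) <= band else 0.0 for q in points]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        local_sum = sum(w * v for w, v in zip(weights, values))
        # Compare the local sum with what the global mean would predict.
        numerator = local_sum - mean * w_sum
        denominator = s * sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores

# Ten hypothetical features along a line; the first three have high values.
toy_points = [(float(i), 0.0) for i in range(10)]
toy_values = [10, 10, 10, 1, 1, 1, 1, 1, 1, 1]
toy_z = getis_ord_gi_star(toy_points, toy_values, band=1.5)
print([round(z, 2) for z in toy_z])
```

The feature in the middle of the high-value group gets the largest positive z-score because it and both of its neighbors are high, while features surrounded by low values get negative z-scores.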
The number at the end of the field alias, 174529, is the distance band used to decide the neighborhood size.
The GiPValue field is the p-value, in which a value less than 0.01 indicates statistical significance with 99 percent confidence. The word fixed in the field alias name indicates that the neighborhood method used is fixed distance band.
The GiZScore field is the resulting z-score, which is measured in standard deviations. For example, a z-score of 2 means the amount of Medicare spending in that county was 2 standard deviations above the mean.
Note:
Learn more about z-scores and p-values.
- Notice that the red hot spot record has a z-score of 7.29 and a p-value of 0.00.
Tip:
If you can't remember which record was for which county, you can click the check boxes at the beginning of each record to highlight them on the map.
This means the amount of Medicare spending in this county was more than seven standard deviations above the mean for the country. The low p-value means there is 99 percent confidence that this result is not random.
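The link between the two fields can be sketched in Python. This assumes the p-value is the two-tailed probability under a standard normal distribution, which is the usual convention for z-scores. The function name is hypothetical.

```python
from math import erfc, sqrt

def two_tailed_p(z):
    """Two-tailed p-value for a z-score under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

p = two_tailed_p(7.29)
print(f"{p:.2e}")  # a very small number in scientific notation
print(f"{p:.2f}")  # rounds to 0.00 at two decimals, as shown in the table
```

A p-value displayed as 0.00 is therefore not exactly zero. It is simply too small to show at two decimal places.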
- For this same record, notice the next field, Gi_Bin.
The Gi_Bin field identifies statistically significant hot and cold spots.
- Features in the +/-3 bins reflect statistical significance with a 99 percent confidence level.
- Features in the +/-2 bins reflect a 95 percent confidence level.
- Features in the +/-1 bins reflect a 90 percent confidence level.
- The clustering for features in bin 0 is not statistically significant.
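The bin logic can be approximated in Python as follows. This sketch uses the standard two-tailed critical z-values for 90, 95, and 99 percent confidence (1.65, 1.96, and 2.58) and ignores the FDR correction, so bins computed this way may differ from the Gi_Bin values the tool reports. The function name is hypothetical.

```python
def gi_bin(z):
    """Map a z-score to a bin from -3 to 3 using uncorrected critical values."""
    magnitude = abs(z)
    if magnitude >= 2.58:    # 99 percent confidence
        level = 3
    elif magnitude >= 1.96:  # 95 percent confidence
        level = 2
    elif magnitude >= 1.65:  # 90 percent confidence
        level = 1
    else:                    # not statistically significant
        level = 0
    # Positive z-scores are hot spots, negative z-scores are cold spots.
    return level if z > 0 else -level

for z in (7.29, 1.7, 0.5, -2.0, -3.1):
    print(z, gi_bin(z))
```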
FDR stands for False Discovery Rate, and the FDR correction is applied to Find Hot Spots in Map Viewer by default.
The FDR correction reduces the significance threshold (p-value) to account for the common multiple testing problem in statistical testing and also the spatial dependency due to the repetitive testing across all features in one dataset.
Note:
Learn more about FDR corrections.
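To illustrate how an FDR correction tightens the threshold, here is a Python sketch of the Benjamini-Hochberg procedure, a widely used FDR method. Whether Find Hot Spots follows exactly these steps is an assumption. The p-values are made up for the example, and the function name is hypothetical.

```python
def bh_cutoff(p_values, alpha):
    """Return the largest p-value that passes the Benjamini-Hochberg test (0.0 if none)."""
    ranked = sorted(p_values)
    m = len(ranked)
    cutoff = 0.0
    for rank, p in enumerate(ranked, start=1):
        # Each ranked p-value is compared with a threshold that grows with its rank.
        if p <= alpha * rank / m:
            cutoff = p
    return cutoff

# Hypothetical p-values for ten features.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
alpha = 0.05

uncorrected = [p for p in toy_p if p < alpha]
cutoff = bh_cutoff(toy_p, alpha)
corrected = [p for p in toy_p if p <= cutoff]
print(len(uncorrected), len(corrected), cutoff)
```

In this example, five features pass the uncorrected 0.05 threshold but only two remain significant after the correction, because the weakest results are the ones most likely to be false positives.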
For the red hot spot county record, the Gi_Bin field is 3, meaning this county is determined to have statistically significantly higher Medicare spending with the FDR correction.
The Statistical Significance field is a text field, which can serve as a label for the overall hot spot analysis results.
- Using what you have learned, answer the following questions about the other two records:
- For the record that has a Gi_Bin value of 0, which field explains why it is not statistically significant?
- How many standard deviations below the mean is the cold spot county?
- Which field tells you the cold spot record has a 99 percent confidence level?
Tip:
Consider selecting more counties on the map that are in Gi_Bin +/- 1 or 2 to compare in the table.
Next, you will observe the last field generated by the Find Hot Spots analysis tool.
- In the table, observe the NNeighbor field.
The NNeighbor field name also includes the scale of analysis value like the other fields. But where does this number come from?
To better understand these values, you will view the analysis history results.
- In the Settings toolbar, click Analysis. In the Analysis pane, click the History tab.
- For the Find Hot Spots tool history item, click the options button and click View details.
The Results tab appears for the Find Hot Spots window.
The Results tab provides important details, such as how many outliers were determined and that they were not included in calculating the scale of analysis value.
By default, the tool calculates the optimal fixed distance band by averaging the distance to the nearest 30 neighbors. The fixed distance band used in this analysis is 174,529 meters, which is the value you saw at the end of the field names in the attribute table. For each feature, the features inside the 174,529-meter buffer zone are considered its neighbors. The NNeighbor field tells you how many neighbors fall within that buffer for each feature.
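The two steps described above, deriving a distance band from nearest-neighbor distances and then counting neighbors inside it, can be sketched in Python. This is a simplified illustration: it uses planar distances between hypothetical points, takes k=1 instead of 30 to keep the example small, does not exclude outliers, and does not count a feature as its own neighbor. Those choices and the function names are assumptions, not documented tool behavior.

```python
from math import dist

def average_knn_distance(points, k):
    """Average, over all points, of the mean distance to each point's k nearest neighbors."""
    total = 0.0
    for i, p in enumerate(points):
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(distances[:k]) / k
    return total / len(points)

def neighbor_counts(points, band):
    """Count, for each point, the other points that lie within the distance band."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= band)
        for i, p in enumerate(points)
    ]

# Four hypothetical feature locations; the last one is far from the rest.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
band = average_knn_distance(toy_points, k=1)
counts = neighbor_counts(toy_points, band)
print(band, counts)
```

The distant point inflates the average distance and still ends up with no neighbors, which hints at why the tool leaves outliers out when it calculates the scale of analysis.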
Under Hot Spot Analysis, you also see that the FDR correction determined statistical significance for 2,127 of the 3,123 features.
In this tutorial, you explored classification methods to show where Medicare expenditures were highest across the country. Using the Find Hot Spots tool, you took the analysis further and determined statistically significant areas of high and low Medicare spending. You also explored the hot spot analysis results to better understand the resulting fields and calculations that went into the analysis tool.
While these maps do not provide a causal explanation for why spending was high or low in certain regions of the country, they show that there are geographic differences in health-care spending. Creating maps often invites further exploration and inspires new questions. For example, the maps created in this tutorial raise questions such as: Why is Medicare spending especially high in the Gulf Coast states? Why is New Jersey a lone hot spot in the Northeast?
There are additional spatial statistical methods to help answer these questions, such as a regression analysis. Regression analysis is a statistical technique that helps you understand relationships among variables in your data. Consider taking the analysis further by applying more spatial statistical methods to better understand why health care costs more in some areas than others.
Note:
To learn more about regression analysis, explore the tutorial Determine how location impacts interest rates.
You can find more tutorials in the tutorial gallery.