Map and explore the temperature measurements

First, you'll download the temperature measurements and add them to a map. Then, you'll explore the data with a histogram to confirm that an urban heat island effect is present.

Download and explore the project

You'll download the project containing the temperature measurements and open it in ArcGIS Pro.

  1. Download the Analyze_Urban_Heat_Using_Kriging.zip file.
  2. Locate the downloaded file on your computer.
    Note:

    Depending on your web browser, you may have been prompted to choose the file's location before you began the download. Most browsers download to your computer's Downloads folder by default.

  3. Right-click the file and extract the contents to a convenient location on your computer, such as your Documents folder.
  4. Open the unzipped folder to view the contents.
  5. If you have ArcGIS Pro installed on your computer, double-click Analyze_Urban_Heat_Using_Kriging.ppkx to unpack and open the project.
    Note:

    If you don't have ArcGIS Pro or an ArcGIS account, you can sign up for an ArcGIS free trial.

  6. Sign in using your ArcGIS Online account.
  7. If necessary, open the Madison Temperature map and click the Contents tab.

    Temperature measurements in Madison, Wisconsin

    The Madison Temperature map consists of the Light Gray basemap and two feature layers: Temperature_Aug08_8pm and Block_Groups. The Temperature_Aug08_8pm layer contains 139 points spread across Madison, Wisconsin, covering the city center and surrounding rural areas. Each point represents the location of a sensor measuring temperature changes at 15-minute intervals. The points in the Temperature_Aug08_8pm layer represent temperature measurements in degrees Fahrenheit taken on August 8, 2016, at 8:00 p.m. at each of the sensors.

    Temperature legend

    In the layer, sensor locations are symbolized in shades of yellow to red representing temperature in degrees Fahrenheit. The lightest shade of yellow corresponds to 73 degrees Fahrenheit (22.78 degrees Celsius), and the darkest shade of red corresponds to 86 degrees Fahrenheit (30 degrees Celsius).

  8. On the ribbon, on the Map tab, in the Navigate group, click Explore.
  9. Pan and zoom around the city of Madison to get a sense of the area and the location of sensors.
    Note:
    A city center can be over 10 degrees warmer than the surrounding countryside. In Madison, higher temperatures are found in the middle of the city, and lower temperatures in the surrounding suburban and rural areas. This suggests the presence of the urban heat island effect, but more quantitative analysis is needed to confirm the effect.
  10. In the Contents pane, right-click Temperature_Aug08_8pm, and choose Attribute Table to open the attribute table for this layer.

    The table contains a record of attribute values for each of the 139 individual sensor points. The TemperatureF field maintains the temperature measurement value.

  11. In the Temperature_Aug08_8pm table, right-click the TemperatureF field and choose Sort Descending.

    In the TemperatureF field, the highest temperature recorded is 83.869 degrees Fahrenheit and the lowest recorded value is 73.429.

  12. Close the Temperature_Aug08_8pm table.
  13. On the ribbon, on the Map tab, in the Navigate group, click Explore.
  14. Using the Explore tool, zoom to the location of sensors within the Madison city center.

    The city center is located roughly between Lake Mendota and Lake Monona.

    Note:
    Many of the highest temperature locations in the city center are also close to lakes, which may contribute to higher temperatures in the summer (August) by increasing humidity levels in the surrounding areas. You will ignore this factor in this study, but it may warrant additional exploration later as you refine your workflow.
  15. In the Contents pane, check Block_Groups.
  16. Right-click Block_Groups and choose Zoom To Layer.

    Block groups of Madison, Wisconsin

    The Block_Groups layer represents census block groups in the city of Madison and surrounding townships. Block groups are symbolized by the density of residents over the age of 65, calculated by dividing the population over age 65 by the area of the block group in square kilometers.

    These block groups will serve as the extent of the study area for the exercise. As a final step, you'll predict the average temperature in each block group to locate areas of Madison that are characterized by both high average temperatures and a high density of residents over the age of 65.

  17. In the Contents pane, uncheck Block_Groups.
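
Two pieces of arithmetic appear in the steps above: the Fahrenheit-to-Celsius conversion quoted for the legend, and the density used to symbolize Block_Groups. The following Python sketch illustrates both. The function names and the sample population and area values are hypothetical and do not come from the project data.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def density_per_sq_km(population, area_sq_km):
    """Residents per square kilometer; rejects non-positive areas."""
    if area_sq_km <= 0:
        raise ValueError("area_sq_km must be positive")
    return population / area_sq_km


# Legend endpoints quoted in the lesson.
print(round(fahrenheit_to_celsius(73), 2))  # 22.78
print(round(fahrenheit_to_celsius(86), 2))  # 30.0

# Hypothetical block group: 240 residents over age 65 in 1.5 square kilometers.
print(density_per_sq_km(240, 1.5))  # 160.0
```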

Create a histogram chart for temperature

The first step in developing an interpolation workflow for temperature in Madison is to explore the data and look for interesting features. You can gain a lot of insight by looking at the symbolized points on the map, but you should also explore the data using interactive charts. For this data, a histogram chart is most relevant. The histogram chart allows you to see the distribution of temperature values in order to determine which temperatures are most prevalent in the data points. You will also use selections to identify the points representing the highest and lowest temperature measurements.

  1. In the Contents pane, right-click Temperature_Aug08_8pm, point to Create Chart, and choose Histogram.

    Opening the Histogram chart

    The Histogram pane opens. Initially, it is empty.

  2. In the Histogram pane, click the Properties button to open the Chart Properties pane.
  3. In the Chart Properties pane, on the Data tab, under Variable, for Number, choose TemperatureF.
  4. In the Statistics group, leave Mean checked, and check Median and Std.Dev.

    Statistics for temperature values

    The chart updates to show a histogram of temperature measurements and the chart title Distribution of TemperatureF appears. Additionally, the Statistics group in Chart Properties updates, showing various statistics for the TemperatureF histogram field.

    Histogram of temperature values

    In the chart, a red vertical line is displayed at the mean (average) temperature value (79.4 degrees). Temperature values are spread fairly evenly between the minimum and the maximum, with the largest number of points showing a temperature between 79.5 and 81.3 degrees. The median temperature is displayed in purple and the Standard Deviation in orange.

    In the chart statistics, the Count value is 139 points and the Min and Max temperature values are 73.4 and 83.9 degrees, respectively.

  5. In the Distribution of TemperatureF histogram, drag a box over the left two bins to select all points that represent locations with the lowest temperature measurements.

    Select the lowest temperature measurements

    The points with the lowest temperature measurements are selected on the Madison Temperature map. These lower temperature measurements are located mostly in the suburban and rural areas surrounding the Madison city center.

    Lowest temperature measurements selected on the map

  6. In the Distribution of TemperatureF histogram, drag a box over the last two bins on the right to select locations with the highest temperature measurements.

    Highest temperature measurements selected on the map

    In the Madison Temperature map, most of the highest temperature measurements are located in the downtown city center area of Madison and in adjacent areas to the northeast and southeast of the city center.

  7. Close the Chart Properties and Chart panes.
  8. On the ribbon, on the Map tab, in the Selection group, click Clear to unselect features.
  9. In the Project pane, save your Madison Temperature project.
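
The histogram exploration above can also be sketched outside ArcGIS Pro. The Python example below computes the same kinds of summary statistics and then picks out the readings in the two leftmost and two rightmost bins, much like dragging a selection box over the chart. It is an illustration only: the readings are hypothetical (not the lesson's 139 values), the bin count is arbitrary, and the sample standard deviation is an assumption about the convention used.

```python
import statistics


def histogram_bins(values, bin_count):
    """Split the value range into equal-width bins; return edges and members."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count
    if width == 0:
        raise ValueError("all values are identical; cannot build bins")
    edges = [low + i * width for i in range(bin_count + 1)]
    members = [[] for _ in range(bin_count)]
    for index, value in enumerate(values):
        # The maximum value belongs to the last bin, not a bin past the end.
        bin_index = min(int((value - low) / width), bin_count - 1)
        members[bin_index].append(index)
    return edges, members


# Hypothetical readings in degrees Fahrenheit.
temps = [73.4, 74.1, 75.0, 76.2, 77.8, 78.5, 79.1, 79.6, 80.2, 80.9,
         81.3, 82.0, 82.7, 83.9]

print("mean:", round(statistics.mean(temps), 2))
print("median:", round(statistics.median(temps), 2))
print("std dev:", round(statistics.stdev(temps), 2))  # sample standard deviation

edges, members = histogram_bins(temps, bin_count=5)
coolest = members[0] + members[1]    # left two bins
warmest = members[-2] + members[-1]  # right two bins
print("coolest readings:", [temps[i] for i in coolest])
print("warmest readings:", [temps[i] for i in warmest])
```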

You've now used a histogram to explore distribution of temperature measurements. You found that higher temperature measurements were situated in and around the city center, and that lower temperature measurements were observed in the surrounding suburban and rural areas. This distribution of the temperature values strongly suggests the presence of the urban heat island effect. Next, you'll use the Geostatistical Wizard to interpolate temperature measurements to create a temperature map for the entire city of Madison and surrounding townships.


Interpolate temperature using simple kriging

Previously, you mapped and explored the distribution of temperature measurements in Madison, Wisconsin, on August 8, 2016, at 8:00 p.m. By looking at the points symbolized with a graduated yellow-to-red color range and using selections in the histogram chart, you found strong visual evidence of the urban heat island effect at that date and time. Next, you'll use the Geostatistical Wizard to interpolate the point temperature measurements and create a continuous surface that predicts the temperature at every location in Madison and surrounding areas.

Interpolate temperature using simple kriging

The Geostatistical Wizard is a guided step-by-step environment for building and validating interpolation models. At each step in the model-building process, you'll make important choices that will affect the final temperature map. You can learn more about the Geostatistical Wizard in Get started with Geostatistical Analyst in ArcGIS Pro.

  1. If necessary, open your project.
  2. On the ribbon, click the Analysis tab. In the Tools group, click Geostatistical Wizard.

    Open the Geostatistical Wizard

    The Geostatistical Wizard opens and shows the available interpolation methods in the left pane and dataset options in the right pane.

  3. Under Geostatistical methods, choose Kriging/CoKriging.

    Choose the kriging option.

    The right side of the Geostatistical Wizard updates to show applicable Kriging/CoKriging options.

  4. Under Input Dataset 1, confirm or, if needed, set the following parameters:

    • Source Dataset: Temperature_Aug_08_8pm
    • Data Field: TemperatureF

    Choose the temperature measurements.

    By choosing Temperature_Aug_08_8pm as the source dataset and TemperatureF as the data field, you specify that you want to perform kriging on the temperature measurements. By not providing a second dataset, you'll perform kriging rather than cokriging. You can learn more about cokriging in Understanding cokriging.

  5. Click Next.

    On the second page of the Geostatistical Wizard, you'll specify which type of kriging you want to perform and configure options applicable to that type of kriging.

  6. In the left pane, under Simple Kriging, confirm that Prediction is checked.

    Choose the prediction output for simple kriging.

    Note:

    Simple kriging is one of the oldest and most-studied kriging models, and it will serve as a robust baseline for temperature interpolation. Choosing the Prediction option specifies that you want to predict the value of the temperature. Other options allow different types of outputs. You can learn more about the other output options in What output surface types can the interpolation models generate?

  7. For Dataset #1, change Transformation type to None.

    This parameter specifies that you won't perform any transformations.

    Do not apply a transformation.

  8. Click Next.

    The Semivariogram/Covariance Modeling page opens.

    Geostatistical Wizard semivariogram

  9. In General Properties, change Function Type to Semivariogram.

    This parameter updates the graph from covariance to semivariogram.

    Change to the semivariogram view.

    The semivariogram is the mathematical backbone of kriging, and fitting a valid semivariogram is almost always the most difficult and time-consuming step in building a kriging model.

    Graph of semivariogram

    Note:

    The semivariogram can be considered a quantification of Waldo Tobler's First Law of Geography: "Everything is related to everything else, but near things are more related than distant things."

    The semivariogram defines how similar the values of the points are given how far apart they are. The x-axis of the semivariogram is the distance between any two data points, and the y-axis is the semivariance, which is half the expected squared difference between the values of the two points. For any two locations on the map, you can use a semivariogram to estimate the similarity in the data values of the two locations. Because near points are more similar than distant points, the semivariogram typically increases with distance before eventually becoming flat.

    The semivariogram pane is composed of three sections:

    • Semivariogram—The graph in the upper left of the pane, containing binned values (red points), averaged values (blue crosses), and the semivariogram model (blue curve).
    • General Properties—The parameters in the right pane of the page, used to configure the shape of the blue semivariogram model.
    • Semivariogram map—Located on the lower left of the page, used to detect anisotropy. Anisotropy will not be discussed in these lessons.

    Semivariogram map

    The semivariogram is configured by three parameters that are found in General Properties:

    Semivariogram configuration

    • Nugget—The value of the semivariogram at the y-axis, which represents the expected squared difference in the value of points that are zero distance apart. While in theory the expected squared difference for these points should be zero, a nugget value greater than zero often occurs due to microscale variation and measurement errors.
    • Major Range—The distance where the semivariogram becomes flat. If two points are separated by a distance larger than the major range, the points are considered uncorrelated.
    • Partial Sill—The value of the semivariogram at the major range is called the sill. The partial sill is calculated by subtracting the nugget from the sill and represents the expected squared difference in value between points that are spatially uncorrelated. This value provides information about the variance of the underlying spatial process.
    Note:

    The details of the semivariogram parameters do not need to be deeply understood for these lessons. You can learn more about the nugget, range, and sill in Understanding a semivariogram: The range, sill, and nugget.

    The goal of the semivariogram page is to configure the parameters in General Properties such that the blue semivariogram passes as closely as possible through the middle of the binned and averaged values in the semivariogram graph.

    Note:

    The binned (red points) and averaged (blue crosses) values in the semivariogram graph are calculated directly from the input points using sectors that are defined by the Lag Size and Number of Lags parameters in General Properties. These averaged and binned values are together called an empirical semivariogram. The semivariogram model (blue curve) is then fitted to this empirical semivariogram using a simple curve-fitting algorithm. This process does not need to be understood for these lessons, but you can learn more about the procedure in Empirical semivariogram and covariance functions.

  10. In General Properties, change Model #1 to Spherical.

    Change to a spherical semivariogram model.

    Watch as the blue semivariogram slightly changes after changing the model.

    Note:
    There are many ways to fit a semivariogram to the same binned and averaged points, and every semivariogram model will estimate a different semivariogram for the same binned and averaged points. All semivariogram models will honor the same nugget, range, and sill, but they will have slightly different shapes.
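
To make the roles of the nugget, partial sill, and major range concrete, here is a minimal Python sketch of the commonly used textbook form of the spherical model. The parameter values are hypothetical, not the values the wizard estimates for this dataset, and the software's internal parameterization may differ in detail.

```python
def spherical_semivariogram(distance, nugget, partial_sill, major_range):
    """Spherical model: starts at the nugget and is flat beyond the range."""
    if distance < 0 or major_range <= 0:
        raise ValueError("distance must be >= 0 and major_range must be > 0")
    if distance >= major_range:
        return nugget + partial_sill  # the sill
    ratio = distance / major_range
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Hypothetical parameters: nugget 0.5, partial sill 4.0, major range 10000.
for h in (0, 5000, 10000, 20000):
    print(h, spherical_semivariogram(h, 0.5, 4.0, 10000))
```

The curve starts at the nugget, climbs as distance increases, and stays at the sill (nugget plus partial sill) for any distance at or beyond the major range.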

    There is a lot of detail packed into the semivariogram page, and it is often difficult even for experienced geostatisticians to determine the appropriate parameters of a semivariogram. For this reason, the Optimize model button was created.

  11. In General Properties, click the Optimize model button.

    Optimize the semivariogram.

    The purpose of the Optimize model button is to automate finding a nugget, major range, and partial sill that result in the smallest root mean square cross-validation error (cross-validation will be shown and explained later in this lesson). Because this optimization can sometimes take a long time to calculate, it is not done automatically by default.

    After optimization, the semivariogram and parameters are updated. These are the values that you will use for your first kriging model.

  12. Click Next.

    Searching Neighborhood page of the Geostatistical Wizard

    The wizard updates to display the Searching Neighborhood page, which consists of a preview of the prediction map along with parameters that control the searching neighborhood.

    Note:

    You can click anywhere in the preview surface and see the predicted value at that location in the Identify Result section on the lower right. Alternatively, you can type an x,y coordinate, and the center of the searching circle will move to the specified location.

    Each prediction is based on neighboring input points, and this page allows you to control how many neighbors will be used and which direction the neighbors will come from. Because your temperature measurements are evenly spread over the map, the default searching neighborhood does not need to be altered. If the input points were more clustered or unevenly spaced, you would need to account for this in the searching neighborhood.

  13. In Identify Result, change X to 571000 and Y to 290000. Press Enter after each entry.

    Change the prediction location.

    The center of the searching circle moves to the specified x,y coordinate in the middle of a hot part of the city.

    Have you pinpointed the center of a heat island at this location? No. Heat islands don't really have a center—they tend to spread out across a city.

    Preview of location with high temperature

    At this x,y location, Identify Result predicts that the temperature is 83.26 degrees with a standard error of 0.51 degrees. Standard errors quantify the uncertainty in the predicted values. The larger the standard error of the prediction, the higher the uncertainty in the predicted value.

    Note:

    If the predictions are normally distributed, you can construct margins of error for each predicted value based on this rule: Double the standard error and add it to and subtract it from the predicted value to create a 95 percent confidence interval.

    • In this location, for example, the lower bound of the 95 percent confidence interval is (83.26 – 2 * 0.51) = 82.24.
    • The upper bound of the confidence interval is (83.26 + 2 * 0.51) = 84.28.

    Therefore, the best estimate for the temperature at this location is 83.26 degrees Fahrenheit, but you can be 95 percent confident that the true temperature is somewhere between 82.24 and 84.28 degrees Fahrenheit.
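
The rule in the note above is simple enough to express as a small Python helper. This sketch assumes, as the note does, that the predictions are normally distributed; the function name is hypothetical.

```python
def confidence_interval_95(prediction, standard_error):
    """Approximate 95 percent interval: prediction plus or minus two standard errors."""
    margin = 2.0 * standard_error
    return prediction - margin, prediction + margin


# Values reported by Identify Result at the location above.
lower, upper = confidence_interval_95(83.26, 0.51)
print(f"{lower:.2f} to {upper:.2f}")  # 82.24 to 84.28
```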

  14. For Identify Result, change X to 572000 and Y to 307000. Press Enter after each entry.

    Change prediction location.

    The prediction location moves to the top of the study area in the coldest part of the map. The predicted value for this location is about 75.22 degrees with a standard error of 1.76. At this location, the standard error is much larger. This is because there are fewer temperature measurements toward the top of the map than there are in the city center. This results in larger uncertainty in temperature predictions in areas with fewer measurements.

    Next, you'll explore the cross-validation page. The cross-validation page displays various numerical and graphical diagnostics that allow you to assess how well your interpolation model fits your data. Cross-validation is a leave-one-out validation method that sequentially hides each input point and uses all remaining points to predict back to the location of the hidden point. The measured value at the hidden point is then compared to the prediction value from cross-validation; the difference between these two values is called the cross-validation error.

  15. Click Next to display the cross-validation page.
    Note:

    The logic of cross-validation is that if your interpolation model is accurate and reliable, the remaining points should be able to accurately predict the measured value of the hidden point. If the predictions from cross-validation are close to the measured temperature values, this gives you confidence that your model can accurately predict temperature values at new locations.
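
The leave-one-out procedure described above can be sketched in a few lines of Python. To keep the example short, it uses inverse distance weighting as a stand-in predictor rather than kriging, so it illustrates only the hide-and-predict loop, not the wizard's model. The function names and sample points are hypothetical, and the error is taken as predicted minus measured.

```python
import math


def idw_predict(known_points, x, y, power=2.0):
    """Inverse distance weighted prediction (a simple stand-in for kriging)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in known_points:
        distance = math.hypot(px - x, py - y)
        if distance == 0:
            return value  # exact hit on a known point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def leave_one_out_errors(points, predict=idw_predict):
    """Hide each point in turn, predict it from the rest, return the errors."""
    errors = []
    for index, (x, y, measured) in enumerate(points):
        remaining = points[:index] + points[index + 1:]
        predicted = predict(remaining, x, y)
        errors.append(predicted - measured)
    return errors


# Hypothetical sensors: (x, y, temperature in degrees Fahrenheit).
sensors = [(0, 0, 80.0), (1, 0, 81.0), (0, 1, 79.0), (1, 1, 80.0)]
print([round(e, 2) for e in leave_one_out_errors(sensors)])
```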

  16. Review the Summary panel on the right side of the cross-validation page.

    The summary is useful for quickly assessing the overall accuracy and reliability of the model. Each summary statistic provides different information about the model.

    Cross-validation summary statistics

    Each entry lists the diagnostic, its value, and its significance.

    • Count: 139. The number of input points.
    • Mean (the average of the cross-validation errors): 0.144. This provides a measure of bias. A biased model is one that tends to predict values that are either too high or too low on average. If the model is unbiased, this value should be close to zero.
    • Root-Mean-Square (the square root of the mean squared error): 1.775. This RMS measures how close the predicted values are to the measured values on average. The smaller the value, the more accurate the predictions.
    • Mean Standardized (a standardized version of the mean error): 0.044. A value close to zero indicates that the model is unbiased. Because this value is standardized, it can be compared between different models that use different data and units.
    • Root-Mean-Square Standardized (a standardized version of the root mean square): 1.075. This value quantifies the reliability of the standard errors of prediction. This value should be close to one. Significant deviation from one indicates that the standard errors of prediction are not accurate. It is standardized, so it can be compared between different models.
    • Average Standard Error (the average of the standard errors at the input point locations): 1.568. This value should be close to the root mean square. If this value significantly deviates from the root mean square, this indicates that the standard errors may not be accurate.

    Overall, these statistics indicate that your kriging model is acceptably accurate.

    • The Mean statistic indicates that on average the temperature predictions are 0.14 degrees too high, which is a small amount of bias and should not be concerning.
    • The Root-Mean-Square statistic indicates that on average the predictions differed from the measured values by a little less than two degrees.
    • The Root-Mean-Square Standardized statistic is larger than one, which indicates that the standard errors are being slightly underestimated.
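
The summary statistics can be computed from the cross-validation errors and their standard errors, as the Python sketch below shows. It is a minimal sketch that follows the descriptions in this lesson; the function name and input values are hypothetical. In particular, the average standard error is computed here as a plain average, and the exact formula used by the software may differ (for example, by averaging squared standard errors before taking the square root).

```python
import math


def cross_validation_summary(errors, standard_errors):
    """Summary statistics from cross-validation errors and standard errors."""
    n = len(errors)
    standardized = [e / s for e, s in zip(errors, standard_errors)]
    return {
        "count": n,
        "mean": sum(errors) / n,
        "root_mean_square": math.sqrt(sum(e * e for e in errors) / n),
        "mean_standardized": sum(standardized) / n,
        "root_mean_square_standardized": math.sqrt(
            sum(z * z for z in standardized) / n
        ),
        # Plain average, following the lesson's description (an assumption).
        "average_standard_error": sum(standard_errors) / n,
    }


# Hypothetical errors (predicted minus measured) and standard errors.
summary = cross_validation_summary(
    errors=[1.0, -1.0, 2.0, -2.0],
    standard_errors=[1.0, 1.0, 2.0, 2.0],
)
for name, value in summary.items():
    print(name, round(value, 3))
```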
  17. On the graphical diagnostics pane, click the Predicted tab to select it, if necessary.

    Predicted versus measured cross-validation graph

    The Predicted graph displays a scatterplot of the measured values (x) versus the cross-validation predictions (y) for each input point. In addition, a blue regression line is fitted to the data and a gray reference line is used to compare the blue regression line to the ideal. If your interpolation model is valid, the predictions should be approximately equal to the measured values, so the regression line would follow a 45-degree angle.

    In your graph, the blue regression line follows the reference line very closely, which gives you further confidence in the accuracy of your model.

  18. Click the Error tab.

    Measured versus error cross-validation graph

    Notice in the Error graph, your blue regression line is decreasing. This indicates that the interpolation model is smoothing the data, meaning that large values are being underpredicted, and smaller values are being overpredicted. Some degree of smoothing occurs in almost every geostatistical model, and in this result, smoothing is not severe.
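
The smoothing pattern can be checked numerically by fitting a least-squares line to the errors against the measured values: a negative slope means that high values tend to be underpredicted and low values overpredicted. The Python sketch below uses hypothetical measured and predicted values and assumes the error is predicted minus measured.

```python
def least_squares_slope(x_values, y_values):
    """Slope of the ordinary least-squares line through (x, y) pairs."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values)
    )
    variance = sum((x - mean_x) ** 2 for x in x_values)
    return covariance / variance  # assumes x_values are not all identical


# Hypothetical values in degrees Fahrenheit.
measured = [74.0, 76.0, 78.0, 80.0, 82.0, 84.0]
predicted = [75.0, 76.5, 78.0, 80.0, 81.5, 83.0]
errors = [p - m for p, m in zip(predicted, measured)]

slope = least_squares_slope(measured, errors)
print("slope of error versus measured:", round(slope, 3))
print("smoothing present:", slope < 0)
```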

  19. Click the Normal QQ Plot tab to display the distribution of standardized errors versus the equivalent quantiles from the standard normal distribution.

    Normal QQ Plot cross-validation graph

    In the Normal QQ Plot graph, if the red dots fall close to the gray reference line, it indicates that the predictions follow a normal distribution. In your graph, the red points do generally fall close to the reference line, but there are some deviations, especially for the points on the upper right part of the graph. While interpreting QQ plots is not an exact science, your graph indicates that you are justified in assuming that the predictions follow a normal distribution.
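
A normal QQ plot pairs each sorted standardized error with the quantile that a standard normal distribution would give at the same rank. The Python sketch below builds those pairs with the standard library. The standardized errors are hypothetical, and the plotting position (i + 0.5) / n is one common convention, not necessarily the one the wizard uses.

```python
from statistics import NormalDist


def normal_qq_pairs(standardized_errors):
    """Pair sorted standardized errors with standard normal quantiles."""
    n = len(standardized_errors)
    ordered = sorted(standardized_errors)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i + 0.5) / n is an assumed convention.
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Hypothetical standardized errors.
pairs = normal_qq_pairs([-1.3, 0.4, -0.2, 1.1, 0.0])
for theoretical, observed in pairs:
    print(round(theoretical, 3), observed)
```

Points that fall close to the line where the two values are equal suggest that the standardized errors are approximately normally distributed.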

  20. Click Finish.

    The final page of the wizard is the Method Report page, which displays all the parameters and settings that were used for the interpolation.

  21. On the Method Report page, click OK.

    The Geostatistical Wizard closes and a layer named Kriging, showing predicted temperature values, is added to the Contents pane of your map.

    Simple kriging temperature surface

Explore the Kriging layer on the map

In the previous section, you used the Geostatistical Wizard to interpolate the temperature measurements using simple kriging. You finished by creating a geostatistical layer of your kriging results. Geostatistical layers are custom layers that are only created and analyzed in the Geostatistical Analyst extension. They allow fast visualization and analysis, and they can be exported to raster or feature formats. In this section, you'll explore your geostatistical layer on the map.

  1. In the Contents pane, uncheck the Temperature_Aug_08_8pm layer.
  2. Expand the Kriging layer legend to review the symbology used to indicate warmer and cooler interpolated temperatures.

    Legend of simple kriging layer

    The urban heat island effect is clear just from looking at the map. The highest predicted temperatures are in the downtown area of Madison, with temperatures generally in the range of 80 to 84 degrees. Lower predicted temperatures are in the surrounding suburban and rural areas, with temperatures in the range of 73 to 78 degrees.

  3. On the ribbon, click the Map tab. In the Navigate group, click Explore.
  4. Click several locations on the map to preview predicted temperatures and the standard error of the prediction. Make sure to click some areas in the middle of the city as well as some locations in the suburban and rural areas outside the city.

    Pop-up with predicted temperature and standard error

    As you investigate higher predicted temperature locations in the middle of the city, notice the associated lower standard errors. It is safe to assume that the predicted temperatures are higher due to the urban heat island effect, and the standard errors are lower because there are more temperature measurements in the middle of the city.

  5. In the Contents pane, for the Kriging layer, collapse the legend and turn the layer off.
  6. Turn on Temperature_Aug_08_8pm.
  7. Save the project.

You used the Geostatistical Wizard to create a map predicting the temperature in Madison, Wisconsin, on August 8, 2016, at 8:00 p.m. You started with 139 points measuring the temperature across the city. In the first lesson, you found evidence of the urban heat island effect by exploring the temperature measurements using symbology and the histogram chart. To verify this observation, you used the Geostatistical Wizard to interpolate the temperature measurements using simple kriging. By creating a continuous map predicting the temperature across Madison and surrounding townships, you confirmed that there is nearly a 10-degree difference in temperature between the middle of the city and surrounding rural areas.

Next, you'll interpolate the temperature measurements again using a newer type of kriging called empirical Bayesian kriging. You'll then compare the results from empirical Bayesian kriging to the results from simple kriging.


Interpolate temperature using empirical Bayesian kriging

Previously, you explored the temperature measurements in Madison, Wisconsin, and used the Geostatistical Wizard to create a simple kriging layer predicting the temperature across the entire city, which confirmed the presence of the urban heat island effect. The simple kriging model that you created is a classical kriging model, and it is the exact kind of model that you would expect to find in geostatistical textbooks and published scientific journals. In recent years, however, the rapid increase in computer processing power has led to the development of more sophisticated kriging models that are both more accurate and easier to configure. In this lesson, you will interpolate the temperature measurements using one of these new kriging models known as empirical Bayesian kriging.

Empirical Bayesian kriging (EBK) was developed specifically to overcome some of the more difficult theoretical and practical limitations of classical kriging. By far, the biggest limitation of classical kriging is the assumption that one single semivariogram can accurately represent the spatial structure of the data everywhere. Recall that the semivariogram represents the expected squared difference in data value for pairs of points that are a given distance apart. Regardless of where the points are on the map, if two pairs of points are the same distance apart, they are assumed to have the same expected squared difference in data values. However, for most datasets this assumption is not reasonable. One semivariogram model may fit best in one part of the map and a completely different semivariogram model may fit best in a different part of the map. In situations like this, you cannot hope to find a single semivariogram model that accurately represents the data everywhere on the map.

Even if there were a single semivariogram that fit well everywhere in the dataset, you would still need to estimate it. Unfortunately, the mathematical equations behind classical kriging assume that the semivariogram has been modeled perfectly, and any inaccuracy in the semivariogram parameters will not be properly accounted for in the predictions and standard errors. Because the math of kriging is based entirely on this single semivariogram, it is critical to estimate it as well as you possibly can. This is why there are so many parameters that can be used to change the shape of a semivariogram: you need as much flexibility as possible to accommodate all of the possible spatial structures of different datasets.

Empirical Bayesian kriging overcomes these problems through a process of subsetting and simulation. EBK starts by dividing the input data into small subsets. In each subset, a semivariogram is estimated automatically, and this semivariogram is used to simulate new data values in the subset. These simulated data values are then used to estimate a new semivariogram for the subset. This simulation and estimation process repeats many times, and it results in many simulated semivariograms in each subset. These simulations are then mixed together to produce the final prediction map.

By estimating the semivariograms on small subsets, different semivariograms will be estimated in different regions of the study area. This allows the model to change locally, and you no longer need to assume that a single semivariogram model can fit the data everywhere. Additionally, by simulating many semivariograms in each subset, you do not have to worry as much about the accuracy of any single semivariogram. When all math is based on a single semivariogram, you must be very careful to make sure that it is as good as it possibly can be, but when many semivariograms are simulated, it is not critical that each of them be perfect.

Perform empirical Bayesian kriging in the Geostatistical Wizard

You'll use the Geostatistical Wizard to interpolate the temperature measurements using empirical Bayesian kriging.

Note:

Due to the computational cost of the simulations in EBK, many mathematical operations are optimized for different processors. Depending on the hardware of your computer, you may get slightly different results in this section. These differences can be as large as 1 percent in some cases.

  1. If necessary, open your project.
  2. On the ribbon, on the Analysis tab, in the Tools group, click Geostatistical Wizard.
  3. For Geostatistical methods, choose Empirical Bayesian Kriging.
  4. Under Input Dataset, for Source Dataset, choose Temperature_Aug_08_8pm.
  5. For Data Field, choose TemperatureF.

    Provide empirical Bayesian kriging inputs.

  6. Click Next to update the Empirical Bayesian Kriging semivariogram and preview.

    The top left pane displays a preview of the interpolated surface with a searching circle centered in the middle of the data extent.

    EBK Preview surface

    The lower right displays Identify Result.

    EBK Identify results

    General Properties shows parameters for the semivariograms and the searching neighborhood.

    EBK General properties

    Parameters in General Properties provide control over subsets and simulations in EBK:

    • Subset Size specifies the number of points in each subset.
    • Overlap Factor allows you to control how much these subsets overlap each other.
    • Number of Simulations controls how many semivariograms will be simulated in each subset.

    The Simulated semivariograms (blue lines) and Empirical semivariogram (blue crosses) are displayed in the lower left. The median semivariogram is solid red, and the first and third quartiles are displayed as dashed red lines.

    EBK Semivariogram graph

  7. In General Properties, for Subset Size, type 50 and press Enter.

    Update Subset Size.

    The preview surface updates to reflect the new subset size. With 139 input points, using a subset size of 50 will create approximately three subsets. This ensures that the semivariograms will be sufficiently estimated at a local level, while still maintaining enough points in each subset to reliably estimate the semivariogram parameters.

  8. In Identify Result, change X to 571000 and Y to 290000. Press Enter after each entry.

    Predict at a specified location.

    The predicted temperature at this location is about 83.39 degrees with a standard error of 0.63 degrees. In the previous lesson, simple kriging predicted 83.26 degrees with a standard error of 0.51 degrees at this same location.

    Note:

    Both simple kriging and EBK predict nearly the same temperature, but there is a notable difference in the standard errors of the predictions. This is because simple kriging almost always underestimates standard errors due to only using a single semivariogram. While a larger standard error in EBK seems to imply that EBK has larger uncertainty than simple kriging, the truth is that the standard errors of simple kriging are incorrectly low.

    At this location (571000, 290000), the semivariograms seem to pass through the averaged values (blue crosses) fairly well, particularly at short distances. The averaged values at the largest distances tend to be on the lower end of the spectrum, but it is most critical to properly model the semivariogram at short distances, as these are the distances that will contribute most to the predicted values.

    Simulated semivariograms at the specified location

  9. In Identify Result, change X to 572000 and Y to 307000. Press Enter after each entry.

    The prediction location moves to the top of the study area in the coldest part of the map. The predicted value for this location (572000, 307000) is about 74.14 degrees with a standard error of 2.28. Simple kriging predicted about 75.22 degrees with a standard error of 1.76. This time, the two predictions differ by about a full degree, but this is likely due to the larger uncertainty in the predicted values at this location. This uncertainty is reflected in the standard errors, which are much larger than at the previous location.

    Simulated semivariograms at new specified location

  10. Click other locations on the preview surface to see the predicted values and the simulated semivariograms until you are satisfied that the semivariograms seem to fit the averaged values well almost everywhere on the map.
  11. Click Next to display the cross-validation page.

    As with simple kriging, the cross-validation page displays summary statistics on the right and graphical diagnostics on the left. In the EBK summary statistics, there are now three additional statistics that did not appear in simple kriging:

    • Average CRPS—This statistic simultaneously quantifies the accuracy and stability of the model, and it should be as small as possible. Unfortunately, it has no direct interpretation, and it can only be used to compare different interpolation models.
    • Inside 90 Percent Interval—The percent of cross-validation points contained in a 90 percent prediction interval. This value should be close to 90. Your value of 89.928 is nearly perfect.
    • Inside 95 Percent Interval—The percent of cross-validation points contained in a 95 percent prediction interval. This value should be close to 95. Your value of 96.403 is quite close to the ideal value of 95.

    The following table shows a comparison of cross-validation summary statistics from EBK and simple kriging:

    Note:

    Your values may differ slightly from the table below due to rounding.

    Summary statistic              Simple kriging  EBK
    Mean                           0.144           0.158
    Root-Mean-Square               1.775           1.715
    Mean Standardized              0.044           0.049
    Root-Mean-Square Standardized  1.075           0.995
    Average Standard Error         1.568           1.684

    • Larger Mean and Mean Standardized values in EBK indicate that it has slightly more bias than simple kriging, but overall both models have very small amounts of bias.
    • The slightly lower Root-Mean-Square value indicates that on average EBK predicts slightly more accurate temperature values.

    The biggest difference in the two models is that the standard errors in EBK are much more accurate.

    • The larger Average Standard Error value in EBK shows that on average, EBK is estimating larger standard errors than simple kriging.
    • The nearly perfect Root-Mean-Square Standardized value in EBK (recall that ideally it should be one) indicates that these standard errors are being more correctly estimated.
    • The Average Standard Error value of EBK also matches the Root-Mean-Square value more closely than it does in simple kriging.

    Taken together, this is strong evidence that the EBK model is more reliable than the simple kriging model.

  12. Confirm that the graphical diagnostics pane is displaying the Predicted graph.

    Predicted versus measured cross-validation graph

    The graph shows predicted values from cross-validation versus measured values. The blue regression line is so close to the gray reference line that you can hardly see the reference line. In simple kriging, the regression line was not as perfectly aligned with the reference line. This should give you further confidence that the EBK model is more reliable.

  13. Click the Error tab.

    Measured versus error cross-validation graph

    Like the simple kriging model before, the blue regression line is slightly decreasing, which indicates that the model has performed smoothing of the data, but this smoothing is not severe.

  14. Click the Normal QQ Plot tab.

    Normal QQ plot for empirical Bayesian kriging

    The red points very closely follow the gray reference line. There is still some deviation from the reference line for the largest values, but this deviation is smaller than it was in simple kriging. Based on this graph, you can safely assume that the predictions follow a normal distribution.

  15. Click Finish.
  16. On the Method Report page, click OK.

    The Geostatistical Wizard closes and the Empirical Bayesian Kriging geostatistical layer is added to the Contents pane. This layer has the same symbology as the Kriging layer, so they can be visually compared.

    Empirical Bayesian kriging temperature surface

  17. In the Contents pane, turn off Temperature_Aug_08_8pm. Turn on Kriging and keep Empirical Bayesian Kriging turned on too. Click Empirical Bayesian Kriging to select it.
    Note:

    Slight variations may be noticed as a result of rounding.

  18. On the ribbon, on the Appearance tab, in the Effects group, click Swipe. Swipe up and down or left and right to display the difference between the Empirical Bayesian Kriging and Kriging layers.

    Compare empirical Bayesian kriging to simple kriging results

  19. On the Map tab, in the Navigate group, click Explore.
  20. Click several locations on the map to preview predicted temperatures and the standard error of the prediction. Make sure to click some areas in the middle of the city as well as some locations in the suburban and rural areas outside of the city.
  21. When finished, turn off Empirical Bayesian Kriging and Kriging, and collapse their legends, if necessary.
  22. Save the project.

You've interpolated the temperature measurements using empirical Bayesian kriging in the Geostatistical Wizard. As with simple kriging in the previous lesson, you could confirm the presence of an urban heat island on the prediction map; the center of the city is notably warmer than the surrounding areas. Using cross-validation, you showed that EBK produced a moderately more accurate temperature prediction map, particularly for the standard errors of predicted temperatures.
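
The summary statistics compared in this section can be sketched in a few lines. The example below assumes the usual definitions (error is predicted minus measured, and a standardized error is the error divided by its standard error) and assumes normally distributed prediction errors when building the intervals. The function names and the four sample sensors are hypothetical, and the Geostatistical Wizard may compute its intervals differently.

```python
import math
from statistics import NormalDist

def cross_validation_summary(measured, predicted, std_errors):
    """Compute common cross-validation summary statistics.

    Errors are predicted minus measured; standardized errors divide each
    error by its prediction standard error.
    """
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "mean": sum(errors) / n,
        "rms": math.sqrt(sum(e ** 2 for e in errors) / n),
        "mean_standardized": sum(standardized) / n,
        "rms_standardized": math.sqrt(sum(z ** 2 for z in standardized) / n),
        "average_standard_error": math.sqrt(sum(s ** 2 for s in std_errors) / n),
    }

def percent_inside_interval(measured, predicted, std_errors, level=0.90):
    """Percent of points inside a two-sided normal prediction interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    inside = sum(
        abs(m - p) <= z * s for m, p, s in zip(measured, predicted, std_errors)
    )
    return 100.0 * inside / len(measured)

# Hypothetical values for four sensors (degrees Fahrenheit).
measured = [80.0, 82.0, 75.0, 78.0]
predicted = [81.0, 81.0, 76.0, 75.0]
std_errors = [1.0, 1.0, 1.0, 1.0]
stats = cross_validation_summary(measured, predicted, std_errors)
coverage = percent_inside_interval(measured, predicted, std_errors, level=0.90)
```

As a check on the reported interval statistics, 100 × 125 / 139 ≈ 89.928 and 100 × 134 / 139 ≈ 96.403, so the EBK values are consistent with 125 and 134 of the 139 sensors falling inside the 90 and 95 percent intervals.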

Next, you'll use an even more sophisticated version of kriging called EBK Regression Prediction, which will allow you to incorporate the locations of impervious surfaces into the interpolation.


Incorporate explanatory variables with EBK Regression Prediction

Previously, you learned how to use the Geostatistical Wizard to interpolate temperature measurements in Madison, Wisconsin, on August 8, 2016, at 8:00 p.m. You first used a classical interpolation method called simple kriging. You then learned to use a more modern and robust method called empirical Bayesian kriging (EBK) that provided moderately more accurate predictions using fewer parameters and settings. In this lesson, you will learn how to incorporate explanatory variables into the interpolation using EBK Regression Prediction.

An explanatory variable (sometimes called a covariate) is any dataset that is related to the variable you are investigating and can be incorporated into a model to improve its accuracy or reliability. As the name implies, EBK Regression Prediction is a regression-kriging method that is a hybrid of EBK and linear regression. EBK Regression Prediction allows you to use explanatory variable rasters that you know are related to the variable you are interpolating.

For these temperature measurements, you will incorporate the locations of impervious surfaces into the interpolation. Impervious surfaces are important contributors to urban heat islands because these surfaces (usually buildings and other manmade structures) trap the heat in the middle of dense cities and prevent it from diffusing into surrounding rural areas.

A deep understanding of regression is not required to complete this lesson, but a little background will be helpful. Both kriging and regression make predictions by explicitly separating an estimate of the average value and an estimate of the error:

Prediction = Average + Error

In regression, the average component of the prediction is estimated with a weighted sum of explanatory variables, and the error component is assumed to be random noise. In this sense, all of the predictive power in regression comes from the average component, and the error component is just noise that you want to minimize.

In kriging, however, the predictive power comes from the error component, and the average is equal to the average of the measured values of all the input points (or some other specified constant). The error component is estimated by the semivariogram and the values of the neighboring points. If the values of the neighbors tend to be above the average value of all input points, the error component will be positive, and the prediction will be larger than the average value of all the points. Conversely, if the values of the neighbors are below the average, the error component will be negative, and the prediction will be lower than the average.

At their mathematical cores, regression operates only on the average component and kriging operates only on the error component. Regression-kriging, however, operates on both components at the same time. It simultaneously estimates the average using linear regression and the error component using EBK. Because both kriging and regression are special cases of regression-kriging, EBK Regression Prediction has higher predictive power than either kriging or regression individually.
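
The decomposition Prediction = Average + Error can be illustrated with a small two-step sketch: a least-squares line supplies the average component, and the regression residuals are interpolated to supply the error component. Two simplifications are assumed here and should be read as such. Inverse distance weighting stands in for kriging of the residuals, and the two components are estimated one after the other rather than simultaneously as in EBK Regression Prediction. All names and sample values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one explanatory variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted average of values at a target location."""
    weights = []
    for (px, py), value in zip(points, values):
        distance = math.hypot(px - target[0], py - target[1])
        if distance == 0.0:
            return value  # target coincides with a data point
        weights.append(1.0 / distance ** power)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def predict(points, impervious, temperature, target, target_impervious):
    """Prediction = Average (regression) + Error (interpolated residual)."""
    slope, intercept = fit_line(impervious, temperature)
    residuals = [t - (intercept + slope * i) for i, t in zip(impervious, temperature)]
    average = intercept + slope * target_impervious
    error = idw(points, residuals, target)
    return average + error

# Hypothetical sensors: (x, y) in kilometers, impervious percent,
# and temperature in degrees Fahrenheit.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
impervious = [10.0, 40.0, 20.0, 70.0]
temperature = [75.0, 79.5, 76.0, 84.0]
estimate = predict(points, impervious, temperature, (0.5, 0.5), 35.0)
```

If the residuals were all zero, the result would reduce to plain regression, and if the slope were zero, it would reduce to interpolation around a constant average, which mirrors the two special cases described above.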

Note:

Due to the computational cost of the simulations in EBK and EBK Regression Prediction, many mathematical operations are optimized for different processors. Depending on the hardware of your computer, you may get slightly different results in this section. These differences can be as large as 1 percent in some cases.

Incorporate an Impervious Surface layer from the Living Atlas

In this section, you'll add a raster layer from the ArcGIS Living Atlas of the World and extract the Impervious Surface values within your study area. This layer comes from the National Land Cover Database (NLCD) and the value of each cell represents the proportion of the cell that is impervious to water as a result of development.

  1. If necessary, open your project.
  2. On the ribbon, on the Map tab, in the Layer group, click Add Data.
  3. In the Add Data window, expand Portal and click Living Atlas.

    Choose Living Atlas in the Add Data browser.

  4. In the search box, type Impervious and press Enter.
  5. In the search results, locate and choose USA NLCD Impervious Surface 2011.

    Choose Impervious Surface.

  6. Click OK to add the layer to your map.
    Note:

    It may take a few minutes for the layer to load.

    Impervious Surface map

    The USA NLCD Impervious Surface 2011 layer covers the entire continental United States, but your study area covers the extent of the Madison, Wisconsin, area. As a result, you'll create a subset of the source data to the extent of your study area by using the Extract By Mask geoprocessing tool.

  7. On the ribbon, on the Analysis tab, in the Geoprocessing group, click Tools.

    Open the Geoprocessing pane.

    The Geoprocessing pane opens.

  8. In the Geoprocessing pane search box, type extract.
  9. In the search results, click Extract by Mask.

    Search for the Extract by Mask tool.

  10. In the Extract by Mask tool, set the following parameters:

    • For Input raster, choose USA NLCD Impervious Surface 2011.
    • For Input raster or feature mask data, choose Block_Groups.
    • For Output raster, type Impervious_Surfaces.

    The output raster will be saved in the default geodatabase of the project.

    Provide parameters for the Extract by Mask tool.

    In addition to extracting Impervious Surface values within your study area, you also want to update the coordinate system to the same projection as the rest of your data and additionally resample the source data to a more suitable cell size of 100 meters. These changes will allow faster calculations later in the lesson.

  11. In the Geoprocessing pane, click the Environments tab and change the following parameters:
    • For Output Coordinate System, choose Block_Groups.
    • For Cell Size, type 100.

    The output coordinate system is now set to match the Block_Groups layer, which uses NAD_1983_2011_Wisconsin_TM, and the output raster will be resampled to a cell size of 100 meters.

    Provide environments settings for the Extract by Mask tool.

  12. Click Run.

    You don't need USA NLCD Impervious Surface 2011 anymore, so you will remove it.

  13. In the Contents pane, right-click USA NLCD Impervious Surface 2011 and choose Remove.

    Clipped Impervious Surface layer

    Your raster layer Impervious_Surfaces is a subset of the USA NLCD Impervious Surface 2011 layer and contains extracted values covering the extent of the Block_Groups layer that are resampled to 100-meter cell size in the correct projection needed for your analysis.

  14. Using the Explore tool, zoom to the city center.

    The highest percentages of impervious surfaces are in the middle of the city and along transportation corridors, while the suburban and rural areas surrounding the city have fewer impervious surfaces and generally higher percentages of vegetation and open space.

    Impervious surfaces in urban corridors

    There are also no impervious surface values covering the lakes. As a result, EBK Regression Prediction will not make temperature predictions across lakes. This is desirable because all your source temperature measurements were taken on the land, and are thus unlikely to reliably predict temperature over the lakes. Temperature variation across water is driven by different factors than land temperatures.

Create a scatter plot of temperature and impervious surfaces

You have strong reason to believe that impervious surfaces are related to and contribute to urban heat, but you need to quantify this assumption. To visualize the relationship, you'll extract the values of the Impervious_Surfaces layer and add these values to the temperature layer, and then visualize the relationship using a scatter plot.

  1. In the Geoprocessing pane, click Back twice to get back to the search box.

    Press back to return to the search page.

  2. In the search box, type extract values. In the search results, click Extract Values to Points.
  3. Set the following parameters:

    • For Input point features, choose Temperature_Aug_08_8pm.
    • For Input raster, choose Impervious_Surfaces.
    • For Output point features, type Impervious_Points.

    Provide the parameters for the Extract Values to Points tool.

  4. Click Run.

    The Impervious_Points layer is added to the Contents pane of the map. This layer is identical to the Temperature_Aug_08_8pm layer except that source points have a new field named RASTERVALU appended. This attribute represents the impervious surface value extracted from the Impervious_Surfaces raster layer for each point location.

  5. In the Contents pane, for Impervious_Points, right-click the point symbol and change the symbol color to green.

    Change the point symbol color to green.

  6. In the Contents pane, right-click Impervious_Points, point to Create Chart, and choose Scatter Plot.
  7. If necessary, click the Properties button in the chart area to open the Chart Properties pane. In the Chart Properties pane, set the following parameters:

    • For X-axis number, choose TemperatureF.
    • For Y-axis number, choose RASTERVALU.

    The chart updates to display the scatter plot and is titled Relationship between TemperatureF and RASTERVALU.

    Scatter plot of temperature versus impervious surfaces

    The scatter plot shows a clear positive relationship between the measured temperature (TemperatureF) and the percentage of impervious surfaces (RASTERVALU). In addition, the relationship appears to be roughly linear, as the trend line appears to pass through the middle of the points. The higher the percentage of impervious surfaces, the higher the temperature. This linear relationship between the variables is important because linear regression rests on this assumption.

  8. When you are done exploring the Relationship between TemperatureF and RASTERVALU scatter plot, close both the chart and chart properties panes.
  9. In the Contents pane, remove the Impervious_Points layer.

    You only needed this layer to review the scatter plot.

  10. Turn off the Impervious_Surfaces layer.
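
The visual impression from the scatter plot can also be quantified. The following minimal sketch computes the Pearson correlation coefficient and the least-squares line for paired temperature and impervious surface values. The five value pairs are hypothetical and are not taken from the Impervious_Points layer.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(x, y):
    """Slope and intercept of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical pairs: temperature (degrees Fahrenheit) on the x-axis and
# impervious surface percentage on the y-axis, as in the chart.
temperature_f = [74.0, 76.0, 78.0, 80.0, 82.0]
impervious_pct = [5.0, 25.0, 30.0, 55.0, 60.0]
r = pearson_r(temperature_f, impervious_pct)
slope, intercept = least_squares_line(temperature_f, impervious_pct)
```

A correlation close to 1 together with a positive slope would support treating the relationship as positive and roughly linear, which is the assumption linear regression relies on.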

Interpolate temperature using the EBK Regression Prediction tool

In the previous section, you verified that impervious surfaces are an important explanatory variable for predicting temperature in Madison, Wisconsin. In this section, you'll use the EBK Regression Prediction geoprocessing tool to interpolate the temperature measurements using the impervious surfaces as an explanatory variable. You'll then compare the cross-validation results from EBK Regression Prediction to the previous two kriging models and apply meaningful symbology to your results.

Note:

EBK Regression Prediction can be executed from both the Geostatistical Wizard and a geoprocessing tool. The primary advantage of using a geoprocessing tool is the ability to incorporate the tool in a model or script for automation and documentation of a workflow, while using the Geostatistical Wizard is an excellent way to explore data and test various interpolation techniques and parameters before committing to one specific choice.

  1. In the Geoprocessing pane, click the Back button. In the search box, type EBK.
  2. In the search results, click EBK Regression Prediction.
  3. In the EBK Regression Prediction tool, set the following parameters:

    • For Input dependent variable features, choose Temperature_Aug_08_8pm.
    • For Dependent variable field, choose TemperatureF.
    • For Input explanatory variable rasters, choose Impervious_Surfaces.
    • For Output prediction raster, type Temperature_Prediction.

    EBK Regression Prediction parameters

  4. In the tool, expand Additional Model Parameters. For Maximum number of points in each local model, type 50.

    Change the maximum number of points in each local model.

    This parameter specifies that each subset will have 50 points, which matches the value used in EBK in the previous lesson.

  5. Click Environments. For Extent, choose Block_Groups.

    The control updates to As Specified Below and the output minimum and maximum extent values are updated to match the minimum and maximum extent of the Block_Groups layer.

    Change the extent to the block groups.

  6. Click Run.
    Note:

    It may take several minutes for the tool to run. The results are added to the Contents pane upon completion.

    Two layers, named EBKRegressionPrediction1 and Temperature_Prediction, are added to the Contents pane.

  7. In the Contents pane, turn off Temperature_Prediction.

    Temperature predictions from EBK Regression Prediction

    The EBKRegressionPrediction1 layer shows the same interpolation pattern of urban heat as both simple kriging and EBK, but it shows considerably more detail. The contours are more refined, and the temperature values change over much shorter distances. No interpolation has occurred over the lakes, and as a result, you see a more realistic-looking temperature map. Whether this added detail reflects higher accuracy once again needs quantitative verification using cross-validation.

  8. In the Contents pane, right-click EBKRegressionPrediction1 and choose Cross Validation to display a cross-validation window.

    Cross Validation of EBKRegressionPrediction1 layer

    This window is identical to the final page of the Geostatistical Wizard and allows you to explore the geostatistical layer's results. Summary statistics are organized on the right and graphical diagnostics on the left.

    Cross-validation statistics for EBK Regression Prediction

    The following table compares summary statistics for this EBK Regression Prediction as well as for the EBK and simple kriging you completed in previous lessons:

    Note:

    You may notice slight variations due to rounding.

    Summary statistic              Simple kriging  EBK     EBK Regression Prediction
    Average CRPS                   N/A             0.894   0.713
    Inside 90 Percent Interval     N/A             89.928  91.971
    Inside 95 Percent Interval     N/A             96.403  93.431
    Mean                           0.144           0.158   0.068
    Root-Mean-Square               1.775           1.715   1.300
    Mean Standardized              0.044           0.048   0.031
    Root-Mean-Square Standardized  1.075           0.994   0.950
    Average Standard Error         1.568           1.684   1.353

    • For EBK Regression Prediction, the Average CRPS value is about 20 percent lower than EBK, and the Root-Mean-Square value is about 25 percent lower than EBK. These are both strong indications that EBK Regression Prediction is more accurate than EBK or simple kriging.
    • The smaller Mean and Mean Standardized values also show that EBK Regression Prediction has the lowest level of bias, and the Average Standard Error value is closely aligned with the Root-Mean-Square value.
    • There is some evidence that the standard errors are being slightly overestimated because the Root-Mean-Square Standardized value is less than one. The Inside 90 Percent Interval and Inside 95 Percent Interval values (91.971 and 93.431 percent, respectively) also differ slightly from the expected 90 and 95 percent, but the standard errors look accurate overall.

    Based on these statistics, EBK Regression Prediction is clearly the most accurate and reliable of the three kriging models.

  9. Confirm that the Predicted tab is active in the graphical diagnostics pane.

    Predicted versus measured cross-validation graph for EBK Regression Prediction

    In the Predicted graph, the regression line (blue) is almost perfectly aligned with the reference line (gray). There is a lot of variability in the points around the regression line, but this graph should give you further confidence in the accuracy of the model.

  10. Click the Error tab.

    Measured versus error cross-validation graph for EBK Regression Prediction

    Like the two models before, the regression line in the Error graph is trending down. This indicates some smoothing in the model, but once again, the smoothing is not severe.

  11. Click the Normal QQ Plot tab.

    Normal QQ plot for EBK Regression Prediction

    Points in the Normal QQ Plot graph fall closer to the reference line than in either of the previous two models. Even the largest values fall very close to the line. There is some minor deviation from the line for the smallest values, but you can safely assume that the predictions follow a normal distribution based on this graph.

    Based on the numerical and graphical cross-validation diagnostics, you now have strong evidence that the EBK Regression Prediction model provides the most accurate predictions of the three models you have used in these lessons. This is the model that will serve as your recommended procedure for interpolating temperature in Madison, Wisconsin.

    Now that you have decided on using the EBK Regression Prediction model, you'll apply attractive and meaningful symbology to the Temperature_Prediction raster.

  12. Close the Cross Validation window.
  13. In the Contents pane, turn off EBKRegressionPrediction1. Turn on Temperature_Prediction.

    You will now apply more meaningful symbology to Temperature_Prediction by importing a custom stretch renderer from an existing layer file.

  14. In the Contents pane, right-click Temperature_Prediction and choose Symbology.
  15. In the Symbology pane, click the Menu button and choose Import.

    Import symbology to the temperature prediction raster

  16. On the Import Symbology dialog box, browse to the location where you extracted the downloaded project in the first lesson, double-click analyze-urban-heat-using-kriging, and choose EBKRP_Symbology.lyrx.

    The EBKRP_Symbology.lyrx file contains predefined symbolization methods and properties suitable for the Temperature_Prediction layer.

  17. Close the Symbology pane.

    Symbolized temperature prediction raster

    The layer is symbolized with a stretched color scheme ranging from 73 degrees Fahrenheit in the lightest shade of yellow to 86 degrees in the darkest shade of red. This color ramp matches the one that was used for temperature measurement points in the Temperature_Aug_08_8pm layer.

    The urban heat effect is obvious just by viewing the layer. The hottest temperatures are in the middle of the city, and the coldest temperatures are in the surrounding rural areas. However, by including the impervious surfaces layer, you are getting far greater detail in the predicted surface. In some areas, you can even pick out urban corridors and view how the heat flows between the buildings and along the highways and freeways.

  18. Pan and zoom around the map to investigate any areas that interest you. Click several locations within the city center and suburban and rural areas to identify predicted temperature.
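
The percentage comparisons quoted for the cross-validation table in this section can be reproduced with simple arithmetic. The sketch below uses the Average CRPS and Root-Mean-Square values listed for EBK and EBK Regression Prediction. The helper name is hypothetical.

```python
def percent_reduction(baseline, value):
    """Percent by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline

# EBK compared with EBK Regression Prediction, using the table values.
crps_drop = percent_reduction(0.894, 0.713)  # Average CRPS
rms_drop = percent_reduction(1.715, 1.300)   # Root-Mean-Square
```

Rounded to one decimal place, the reductions are 20.2 and 24.2 percent, which agrees with the "about 20 percent" and "about 25 percent" figures given for the comparison.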

Estimate the average temperature within each block group

In this section, you'll predict the average temperature within each of the block groups using zonal statistics. You'll then join the predictions to the block groups and apply relevant symbology to visualize average temperatures.

  1. In the Contents pane, turn off Temperature_Prediction. Turn on Block_Groups.
  2. In the Geoprocessing pane, click the Back button, and search for zonal statistics. In the search results, click Zonal Statistics as Table.
  3. In the Zonal Statistics as Table tool, set the following parameters:

    • For Input raster or feature zone data, choose Block_Groups.
    • For Zone field, choose OBJECTID.
    • For Input value raster, choose Temperature_Prediction.
    • For Output table, type Mean_Temperature.
    • For Statistics type, choose Mean.

    Zonal Statistics as Table parameters

    Choosing Mean for the statistics type indicates that you want to determine the average of all temperature predictions within a block group.

  4. Click Run.

    The table appears in the Contents pane, under the Standalone Tables section. It contains 269 records, one for each of the 269 block groups in the study area. In the table, the OBJECTID field identifies individual block groups and the Mean field contains the average predicted temperature within each block group.

    Next you'll join the Mean_Temperature table to the block groups in order to add the Mean field values to each individual block group polygon.

  5. In the Geoprocessing pane, click the Back button, and search for Add Join. In the search results, click Add Join.
  6. In the Add Join tool, set the following parameters:

    • For Layer Name or Table View, choose Block_Groups.
    • For Input Join Field, choose OBJECTID.
    • For Join Table, choose Mean_Temperature.
    • For Output Join Field, choose OBJECTID.

    Add Join parameters

  7. Click Run.

    Attribute fields from the Mean_Temperature table are now joined to block groups using the OBJECTID to identify each unique block group.

  8. In the Contents pane, right-click Block_Groups and choose Attribute Table.
  9. In the Block_Groups attribute table, scroll to the far right and confirm that the Mean field has been appended to the table.

    This field contains the average predicted temperature for each block group.

  10. Close the Block_Groups attribute table.

    Next, you'll symbolize the block groups by the predicted average temperature and apply symbology from an imported layer file.

  11. In the Geoprocessing pane, click the Back button, and search for Apply Symbology. Open the Apply Symbology From Layer tool.
  12. In the Apply Symbology From Layer tool, set the following parameters:

    • For Input Layer, choose Block_Groups.
    • For Symbology Layer, browse to the location where you extracted the downloaded project, double-click analyze-urban-heat-using-kriging, and choose BG_temperature.lyrx.
    • Under Symbology Fields, for Type, verify that the value is Value field.
    • For Source Field, verify that the value is Mean_Temperature.MEAN.
    • For Target Field, verify that the value is MEAN.

    Apply Symbology From Layer parameters

  13. Click Run.

    The block group symbology updates to show each block group polygon shaded by the average predicted temperature within that block group. The color range used is the same as the original Temperature_Aug08_8pm layer. The average temperature follows the same patterns as the prediction raster: the hottest block groups are located in and around the center of the city, and the coldest block groups are in the surrounding suburban and rural areas.

  14. Open the pop-ups for several block groups that show high mean temperatures.

    Predicted average temperature in each block group
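
The join performed in the steps above can be sketched in Python. This is an illustration rather than part of the lesson: `join_by_key` is a hypothetical helper that mimics a one-to-one attribute join on made-up records, and `add_join_with_arcpy` assumes that ArcGIS Pro's arcpy package is available and that the layer and table names match those in the Contents pane.

```python
def join_by_key(features, table, key="OBJECTID"):
    """Append the fields of the matching table row to each feature (one-to-one join)."""
    lookup = {row[key]: row for row in table}
    joined = []
    for feature in features:
        # Features without a matching table row keep only their own fields.
        match = lookup.get(feature[key], {})
        extra = {name: value for name, value in match.items() if name != key}
        joined.append({**feature, **extra})
    return joined


def add_join_with_arcpy():
    """Equivalent join through arcpy; runs only in ArcGIS Pro's Python environment."""
    import arcpy  # deferred so join_by_key works without ArcGIS Pro installed

    # Layer and table names are assumed to match the lesson's Contents pane.
    return arcpy.management.AddJoin(
        "Block_Groups", "OBJECTID", "Mean_Temperature", "OBJECTID"
    )


# Made-up records, for illustration only (not values from the lesson data).
block_groups = [
    {"OBJECTID": 1, "DensityOver65": 120000.0},
    {"OBJECTID": 2, "DensityOver65": 40000.0},
    {"OBJECTID": 3, "DensityOver65": 15000.0},
]
mean_temperature = [
    {"OBJECTID": 1, "MEAN": 81.6},
    {"OBJECTID": 2, "MEAN": 78.2},
]

joined = join_by_key(block_groups, mean_temperature)
print(joined[0])
```

As with the Add Join tool, the block group records themselves are unchanged; the table's fields are simply made available alongside them.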

Identify block groups with high numbers of vulnerable residents

In the previous section, you used zonal statistics to predict the average temperature within each of the block groups. In this section, you'll use a query to identify any block groups that have both high average temperatures and a high density of residents over the age of 65. Residents over 65 are most susceptible to heat-related illnesses, so priority for remedial measures should be given to areas of Madison that have the highest numbers of these at-risk residents. You'll build a query expression to select all block groups where the mean temperature is greater than 81 degrees Fahrenheit and the density of residents 65 years of age or older is greater than 100,000 people per square kilometer.

  1. In the Geoprocessing pane, search for Select Layer.
  2. In the search results, click Select Layer By Attribute.

    Query expressions use the following syntax:

    Field name + Operator + Value or Field

  3. In the Select Layer By Attribute tool, set the following parameters:

    • For Input Rows, choose Block_Groups.
    • For Selection type, choose New selection.

  4. In the Expression group, click New expression.
  5. Create the expression Mean is Greater Than 81. You may need to remove values after the decimal.

    Build an expression for mean temperature

  6. Click Add Clause.

    Expressions can include additional clauses or conditions that are connected to the original clause using a connector such as And or Or. Connectors indicate whether one or both clauses need to be true to select a feature.

  7. In the Expression group, click Add Clause to add a second clause to your query.
  8. Create the expression And DensityOver65 is Greater Than 100000.

    Build an expression for Population over 65

  9. Click Enter.

    Verify your expressions and make adjustments if necessary.

    Select Layer By Attribute expressions

    This expression selects block groups with an average temperature above 81 degrees Fahrenheit and a density of residents over the age of 65 that is greater than 100,000 people per square kilometer.

  10. Click Run.
  11. Close the Geoprocessing pane.

    Block groups with the highest temperature and density of residents over 65 years old

    Five block groups are selected based on your criteria. They are all located in downtown areas and areas along transportation corridors, and they represent the areas of the city where there is high potential for heat-related illnesses in the vulnerable population. In an emergency, these are the areas that should be prioritized by health care authorities.

    As a final check, you'll create a scatter plot of the average temperature versus the density of elderly residents to visualize the overall relationship.

  12. In the Contents pane, right-click Block_Groups, point to Create Chart, and choose Scatter Plot. If necessary, click Properties in the chart area to open the Chart Properties pane.
  13. In Chart Properties, for X-axis number, choose Mean.
  14. For Y-axis number, choose DensityOver65.

    Scatter plot of average temperature versus density of elderly residents

    The scatter plot updates to show the relationship between average temperature and density of elderly residents. The five selected block groups remain selected in the scatter plot and indicate occurrences where the average temperature is above 81 degrees Fahrenheit and the density of residents over the age of 65 is above 100,000.

    There appears to be no relationship between average temperature and density of elderly residents. The trend line is very flat with a slightly negative slope, and the scatter plot does not show any marked patterns. This is good news, because it means that residents over 65 do not tend to live in the hottest parts of Madison, Wisconsin.

  15. In the Relationship between MEAN and DensityOver65 chart, click the single point located at the top of the graph.

    The selected block group has a high density of elderly residents (over 700,000) and falls in the middle of the temperature range (around 80 degrees Fahrenheit).

    Block group with very high density of elderly residents

    Because this block group has such a high density of elderly residents, the temperature of the block group should be closely monitored by the emergency managers of Madison, Wisconsin. Fortunately, on August 8 at 8:00 p.m., this block group did not experience higher temperatures compared to the rest of Madison.

  16. Save the map.
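
The two-clause query used in this section can be sketched in Python. This is a minimal illustration, not the lesson's method: `build_where_clause` and `select_block_groups` are hypothetical helpers, the records are made up, and the arcpy call assumes ArcGIS Pro's Python environment. On a joined layer, ArcGIS Pro may require qualified field names (for example, Mean_Temperature.MEAN), so the field names below are assumptions.

```python
def build_where_clause(temp_field="MEAN", min_temp=81,
                       density_field="DensityOver65", min_density=100000):
    """Combine two clauses with And, following Field name + Operator + Value."""
    return f"{temp_field} > {min_temp} And {density_field} > {min_density}"


def select_block_groups(rows, min_temp=81, min_density=100000):
    """Keep only the rows for which both clauses are true."""
    return [
        row for row in rows
        if row["MEAN"] > min_temp and row["DensityOver65"] > min_density
    ]


def select_with_arcpy(where_clause):
    """Run the selection through arcpy; requires ArcGIS Pro's Python environment."""
    import arcpy  # deferred so the helpers above work without ArcGIS Pro

    return arcpy.management.SelectLayerByAttribute(
        "Block_Groups", "NEW_SELECTION", where_clause
    )


# Made-up records, for illustration only (not values from the lesson data).
rows = [
    {"OBJECTID": 1, "MEAN": 81.6, "DensityOver65": 120000.0},  # both clauses true
    {"OBJECTID": 2, "MEAN": 82.0, "DensityOver65": 40000.0},   # hot, low density
    {"OBJECTID": 3, "MEAN": 79.5, "DensityOver65": 700000.0},  # dense, not hot
]

where_clause = build_where_clause()
selected = select_block_groups(rows)
print(where_clause)
print([row["OBJECTID"] for row in selected])
```

Because the clauses are connected with And, a block group that satisfies only one condition is excluded. Using Or instead would select all three sample records.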

Share your work

You've completed your analysis of temperature in Madison, Wisconsin, for August 8, 2016, at 8:00 p.m. You've developed a workflow for identifying block groups with high numbers of at-risk individuals and performed several different types of kriging. After comparing their results, you applied attractive and meaningful symbology. All you need to do now is to identify an efficient and suitable way to deliver your results to authorities and the public.

ArcGIS offers several ways for you to share your findings, each appropriate for different audiences. The traditional, static approach is to create a layout that can be printed or exported to a PDF or an image file. For a more dispersed audience, you could consider a more dynamic approach and share results online in the form of a web package, web layer, or web map.

Output options

Printed maps are still popular and offer a more accessible way to share results with many users. It is also possible to export a map to various image formats, such as PNG or JPEG, that can be embedded in a presentation for use by those who do not have access to GIS software. Maps can also be exported to a PDF file that users can interact with by turning layers on and off.

Printed maps, PDF files, and images are generally the result of creating a map layout. A map layout allows you to communicate your map's message to users, so depending on the purpose, you'll need to make decisions based on the audience and the goal of the map.

When designing a layout, take note of the following elements:

  • Page size
  • Scale
  • Extent
  • Landscape or portrait orientation
  • Basemaps
  • Operational layers
  • Group layers
  • Coordinate system
  • Annotation

The addition of map elements further helps to communicate the message of your map to your audience and may include many of the following elements:

  • Title
  • Map frame
  • Legend
  • North arrow
  • Scale bar
  • Overview or reference map
  • Supporting text (author, information about data, date)
  • Chart
  • Logo
  • Coordinate grids

When sharing dynamic content, options include publishing layers, maps, data, and projects in the form of various package types or as a web layer or web map. Users can access shared content directly through ArcGIS Pro or through ArcGIS Online. Packages are intended for sharing projects between ArcGIS Pro users, while web layers and web maps can be seen by a broader audience over the Internet.

If you choose to share parts of your ArcGIS Pro project or the entire project, you can create a package. Packages include layer packages, map packages, or project packages. Packages can be saved locally, or they can be shared on ArcGIS Online so that users can download your maps and data. When other users access a package that you have shared, they can unpack it locally and edit and modify the local copy of the shared map, layer, or project packages.

  • Layer packages contain layer properties and the source data referenced by the layer.
  • Map packages contain layer properties for each layer in the map, and the source data referenced by all layers.
  • Project packages contain layer properties, maps, layouts, referenced data, models, toolboxes, geodatabases, and all other associated project elements.

When packaging a layer, you perform the following steps:

  • Select whether to save the package to a file or upload it to your ArcGIS Online account.
  • Provide a name for the package.
  • Give the package an item description.
  • Provide tags.
  • Set sharing options.
  • Analyze the package and correct any errors.
  • Share the package.

A web layer is like a feature layer in ArcGIS Pro, but it is hosted online instead of stored locally on a computer. Web layers are used for map visualization and can be edited and queried. Web layers can be created from any feature layers you have in an ArcGIS Pro project.

A web map is an interactive collection of map layers that can be used to create maps for visualization, editing, querying, and analysis. Web maps always contain one basemap and additional supporting operational layers. Web maps are often used for generating apps such as Story Maps.

When sharing a web layer, you perform the following steps:

  • Provide a name for the web layer.
  • Select features or tiles to share.
  • Supply the web layer with an item description.
  • Provide tags.
  • Set sharing options.
  • Analyze the web layer and correct any errors.
  • Share the web layer.

For examples and instructions on how to create some of these outputs, see the following lessons:

  • Get Started with ArcGIS Online walks you through creating a web app.
  • Cartographic Creations in ArcGIS Pro shows a detailed, professional layout view, with explanatory text and elements.
  • Get Started with Story Maps shows how to combine a web map with storytelling to create a high-quality and highly accessible story map.

In this lesson, you learned how to develop a workflow to access interpolation procedures for analyzing urban heat in Madison, Wisconsin. By exploring the temperature measurements on the map and performing interpolation, you verified the presence of a suspected urban heat island in downtown Madison.

To make a temperature map for all of Madison, you first interpolated the data using simple kriging, one of the oldest and most researched geostatistical methods. This resulted in a scientifically and statistically defensible baseline for the interpolation. Once this baseline was established, you improved the results of the interpolation by using empirical Bayesian kriging (EBK). By using locally simulated semivariograms, you improved the accuracy and stability of the interpolated temperatures. Using a scatter plot chart, you then determined that the locations of impervious surfaces were strongly related to temperature, and you incorporated this information into the interpolation using EBK Regression Prediction. This resulted in a 25 percent reduction in the root-mean-square cross-validation error compared to EBK.
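
To make the reported error reduction concrete, the following sketch shows how a root-mean-square error and a percent reduction between two models could be computed. The function names and numbers are hypothetical and chosen only so the arithmetic is easy to follow; they are not the cross-validation results from the lesson.

```python
from math import sqrt


def rmse(errors):
    """Root-mean-square of a list of prediction errors (predicted minus measured)."""
    return sqrt(sum(error ** 2 for error in errors) / len(errors))


def percent_reduction(baseline, improved):
    """Percent by which the improved value is lower than the baseline value."""
    return (baseline - improved) / baseline * 100


# Hypothetical RMSE values in degrees Fahrenheit, for illustration only.
baseline_rmse = 2.0   # stands in for the first model's RMSE
improved_rmse = 1.5   # stands in for the second model's RMSE

reduction = percent_reduction(baseline_rmse, improved_rmse)
print(reduction)
```

A lower root-mean-square cross-validation error means that, on average, predictions at the held-out sensor locations fall closer to the measured temperatures.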

You completed the workflow by querying and locating census block groups in Madison that have the highest average temperature and the highest density of residents over the age of 65, who are at highest risk for heat-related illnesses.

Using selections, you identified five block groups with an average temperature above 81 degrees Fahrenheit and a population density of residents over the age of 65 above 100,000 people per square kilometer. A scatter plot chart revealed that the population density of elderly residents does not seem to be correlated with temperature. This was a desirable result because if elderly residents tended to live in the hottest parts of the city, that would pose extra challenges for emergency managers and health care providers when trying to mitigate the effects of extreme heat events.
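
The scatter plot's trend line and the strength of the relationship can also be checked numerically. The sketch below fits an ordinary least-squares line and computes the Pearson correlation coefficient. This is an assumed way to quantify what the chart shows, not a description of how ArcGIS Pro draws its trend line, and the sample values are made up.

```python
from math import sqrt


def trend_and_correlation(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient, from -1 to 1
    return slope, intercept, r


# Made-up values, for illustration only: mean temperature (degrees Fahrenheit)
# and density of residents over 65 for three imaginary block groups.
temperatures = [78.0, 80.0, 82.0]
densities = [50000.0, 30000.0, 50000.0]

slope, intercept, r = trend_and_correlation(temperatures, densities)
print(slope, r)
```

A slope and correlation coefficient near zero, as in this made-up sample, correspond to the flat trend line described in the lesson. Values of r near 1 or -1 would indicate a strong linear relationship.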

The urban heat island effect is present in virtually every major city in the world, and the workflow you developed in these lessons can be used to analyze other cities and other dates. During the creation of these lessons, various potential explanatory rasters were investigated, including elevation, distance to industry, distance to open spaces, population density, and canopy cover. These variables did not significantly improve the interpolation results for Madison, Wisconsin, on August 8, 2016, at 8:00 p.m., but any of these (and many more) could prove useful for interpolating temperature in other urban settings. You are encouraged to attempt to repeat these exercises using temperature data from different cities on different days. You may find that different explanatory variables are useful for different locations and dates, and you should try to find the variables that work best for your data.

You can find more lessons in the Learn ArcGIS Lesson Gallery.