Import census tracts and join tables

You will prepare an indicator layer for New York City. You will combine attributes from several existing data sources into one spatial layer. By having all the attributes in the census tracts layer, you can map and analyze the tracts using the additional attributes. You will use existing spatial and tabular data from the American Community Survey (ACS), raster data to measure tree canopy, and proximity analysis to measure access to specific women's resources to build the indicator layer.

Download data and prepare the project

First, you'll download the tutorial data, create a project, and connect the project to the data folder.

  1. Download the data that you'll use in the tutorial.
  2. In Microsoft File Explorer, create a folder on the C:\ drive named IndicatorData.
  3. Extract the contents of the downloaded .zip file to the IndicatorData folder.
  4. Start ArcGIS Pro. If prompted, sign in using your licensed ArcGIS organizational account.
    Note:

    If you don't have access to ArcGIS Pro or an ArcGIS organizational account, see options for software access.

    When you open ArcGIS Pro, you're given the option to create a new project or open an existing one. If you've created a project before, you'll see a list of recent projects.

  5. Under New Project, click Map.

    Map under New Project

  6. In the New Project window, for Name, type Indicators, for Location, accept the default folder, and click OK.

    New project named Indicators in the New Project window

    Now that you have a project, you will add a folder connection so you can easily access your data.

  7. View the Catalog pane. If you don't see it, on the ribbon, click the View tab, and in the Windows group, click Catalog Pane.

    Catalog Pane in the Windows group on the View tab

  8. In the Catalog pane, right-click Folders and choose Add Folder Connection.

    Add Folder Connection for the Folders folder in the Catalog pane

  9. In the Add Folder Connection window, browse to the IndicatorData folder, select it, and click OK.
  10. Expand the IndicatorData folder to view its contents.

    Contents of the IndicatorData folder under the Folders folder

    In the folder, there are several .csv and .xlsx files that store attribute information. There is also a shapefile named nyct2020 that you will import into the geodatabase. Once you import the shapefile into the geodatabase as a feature class, it will act as the foundation for the remaining indicator information. Joining the tabular information to the spatial data allows you to analyze and visualize all the information.

Import a shapefile into the geodatabase

You have downloaded the data, created a project, and connected to a folder to access the data. Now you will import the shapefile into the geodatabase.

  1. In the Catalog pane, expand Databases.

    Every project in ArcGIS Pro comes with a default geodatabase, named the same as the project. The project geodatabase is called Indicators.gdb.

  2. Right-click Indicators.gdb, point to Import and choose Feature Class(es).

    Feature Class(es) option in the Import menu for the Indicators database in the Catalog pane

    The Feature Class To Geodatabase tool opens.

  3. For Input Features, click the browse button, browse to the IndicatorData folder, click nyct2020, and click OK.

    The nyct2020.shp file selected for Input Features in the Feature Class To Geodatabase tool pane

    Because you right-clicked the geodatabase and chose to import data into it, the Output Geodatabase parameter is already set to the project geodatabase.

  4. Click Run.
  5. In the Catalog pane, expand the Indicators geodatabase.

    The nyct2020 census data imported into the Indicators geodatabase in the Catalog pane

    The shapefile was converted and stored in the geodatabase as a polygon feature class. A feature class is a collection of features that share the same geometry type: point, line, or polygon.

  6. From the Indicators geodatabase, click the nyct2020 layer and drag it onto the map to add it.

    The nyct2020 layer showing census tracts on the map

    The census tracts appear on the map and are displayed using a default color.

    Note:

    Your census tracts may display using a different color than what is shown in the image.

  7. In the Contents pane, click the nyct2020 layer once to select it and click it again to make the name editable.
  8. Type NY Census Tracts and press Enter.

    The NY Census Tracts layer renamed

    You have imported a shapefile into a geodatabase feature class, added it to a map, and renamed the layer. Now that you have the foundational spatial dataset in the database, you will add information to it through table joins.

Explore tabular data

Next, you will join ACS data to the census tracts. The ACS data for the entire state of New York is currently in a .csv file, or a nonspatial format. You will join the two datasets based on a common attribute to incorporate the ACS data into the census tracts.

  1. In the Catalog pane, expand Folders and expand the IndicatorData folder.

    The Data4Join.csv file contains the ACS data for the whole state that you want to join to the NY Census Tracts layer.

  2. Drag the Data4Join.csv file onto the map.

    Data4Join.csv file under the IndicatorData folder in the Catalog pane

    The .csv file appears in the Contents pane under Standalone Tables.

    The Data4Join.csv file under the Standalone Tables section in Contents pane

    Tables such as a .csv file do not have a spatial component, so they are listed in the Standalone Tables section of the Contents pane. While tabular data doesn't appear on the map by default, you can use it to enhance your feature layers by joining data, or if the table has coordinates, you can display the data based on the coordinates.

    Next, you'll inspect the table.

  3. Right-click the Data4Join.csv table and choose Open.

    Open option for the Data4Join.csv table in the Contents pane

    The table contains many attributes that you can use for mapping. Currently, the table is in .csv format and doesn't have an OBJECTID field, which means you cannot join it with another layer. Also, the GEO_ID field that you will use as the matching field in the join has a different data type than the corresponding field in the census tracts layer. To join tables, the matching fields must have the same data type.

  4. In the table, click the options button and choose Fields View.

    Fields View option in the table options menu

  5. Locate the GEO_ID field and notice that its Data Type value is Big Integer.

    Big Integer field in the Fields view of the Data4Join.csv table

  6. In the Contents pane, right-click NY Census Tracts and choose Attribute Table.

    Attribute Table option for the NY Census Tracts layer in the Contents pane

  7. As you did with the .csv table, click the options button and choose Fields View.

    The GEOID field in the census tracts layer contains the same information as the GEO_ID field in the table, but it is in text format.

    GEOID field properties in the Field View for the NY Census Tracts attribute table

    The field types must match for a join to work properly. To ensure you can use this table in a join, you will import it into the geodatabase and add and calculate a text field to store the information.

  8. Close all tables and Fields views.

Prepare the data for a join

Now that you have identified the need to import the .csv table into your geodatabase and add and calculate a field to use for the join, you will perform these operations to prepare the data appropriately.

  1. In the Catalog pane, right-click the Indicators geodatabase, point to Import, and choose Table.
  2. For Input Table, click the drop-down menu and choose Data4Join.csv.

    Data4Join.csv chosen for the Input Table parameter in the Table To Geodatabase tool pane

  3. Click Run.
  4. In the Catalog pane, in the Indicators geodatabase, right-click Data4Join.csv and choose Rename. Type ACS_Data and press Enter.

    ACS_Data table in the Indicators geodatabase in the Catalog pane

  5. Add the ACS_Data table to the map.
  6. In the Contents pane, right-click Data4Join.csv and choose Remove.

    Remove for the Data4Join.csv table in the Contents pane

    Before you join the tables, you must add a text field and calculate it.

  7. Right-click the ACS_Data table, point to Data Design, and choose Fields.
  8. At the bottom of the fields list, click Click here to add a new field.

    Click here to add a new field at the bottom of the Fields view for the ACS_Data table

  9. For Field Name, type GEOID, and for Data Type, choose Text.

    GEOID field properties set to Text Data Type in the ACS_Data table

  10. On the ribbon, on the Fields tab, in the Manage Edits group, click Save.

    Save in the Manage Edits group on the Fields tab

  11. Close the Fields view.
  12. Open the ACS_Data table.
  13. Scroll to the end and find the GEOID field. Right-click it and choose Calculate Field.

    Calculate Field for the GEOID field in the ACS_Data table

  14. In the Expression section, for Fields, double-click GEO_ID to add it to the expression.

    Calculate Field expression setting the GEOID field to equal GEO_ID value

    You are using the GEO_ID field to populate the GEOID field that you added.

  15. Click OK.

    Resulting calculated field for the GEOID field in the ACS_Data table

    The field is now of the correct type and populated with the correct information.

  16. Close the table.

    You are now ready to perform the join. In this case, you will join the ACS_Data table to the NY Census Tracts layer to supplement your spatial data.
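The reason for the extra text field can be sketched in plain Python. This is not how ArcGIS Pro performs the calculation internally; it is a minimal illustration, and the sample GEOID values and the Total_Female_Population column are hypothetical.

```python
# Hypothetical sample values; real GEOIDs come from the ACS table and the tracts layer.
acs_rows = [
    {"GEO_ID": 36061000100, "Total_Female_Population": 1200},
    {"GEO_ID": 36061000201, "Total_Female_Population": 950},
]
tract_geoids = ["36061000100", "36061000201"]  # text, as in the tracts layer

# An integer key never equals a text key, even when the digits are identical.
matches_before = [row for row in acs_rows if row["GEO_ID"] in tract_geoids]

# Rough equivalent of adding a text GEOID field and calculating it from GEO_ID.
for row in acs_rows:
    row["GEOID"] = str(row["GEO_ID"])

matches_after = [row for row in acs_rows if row["GEOID"] in tract_geoids]

print(len(matches_before), len(matches_after))
```

Before the conversion, no rows match; after it, every row finds its tract.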

Join ACS data to the census tracts layer

Next, you will join the ACS_Data table to the census tracts layer. The join matches records on the GEOID field, which is now stored as text in both datasets.

  1. In the Contents pane, right-click the NY Census Tracts layer, point to Joins and Relates and choose Add Join.

    Add Join option in the Joins and Relates menu for the NY Census Tracts layer in the Contents pane

    The Add Join window appears. Here, you can input the parameters for the join, such as the tables involved and the matching fields.

  2. In the Add Join tool, enter or verify the following parameters:
    • For Input Table, verify that NY Census Tracts is selected.
    • For Input Field, verify that GEOID is selected.
    • For Join Table, verify that ACS_Data is selected.
    • For Join Field, verify that GEOID is selected.
    • Uncheck Keep all input records.
    • For Join Operation, choose Join one to first.

    Add Join entered to join ACS_Data fields to the NY Census Tracts table

    You have entered all the parameters for the join. Next, you will validate the join to ensure it will work properly before you run the tool.

  3. Click the Validate Join button.

    The Message window appears.

    Message confirming the Add Join tool successfully completed the join

    There are 2,325 matching records for the join, which is the same as the number of census tracts in the feature layer. Even though the ACS table contains data for the whole state of New York, only the records that match the tracts, based on the common fields, will be joined.

  4. In the Message window, click Close, and in the Add Join window, click OK.

    The join is complete, but there is no visible change on the map. Where you will see the difference is in the layer's attribute table.

  5. Open the NY Census Tracts attribute table.
  6. Scroll to the right and notice the ACS_Data fields.

    ACS_Data fields joined to the NY Census Tracts table

    Now all the fields from the ACS_Data table are joined to the census tracts based on the common field.

  7. Close the table.
  8. From the Quick Access Toolbar, click Save to save the project.

    Save project button on the Quick Access toolbar

    You have joined the ACS attributes to the census tracts layer. Now you can use these fields for symbology, labeling, and analysis.
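The join settings you chose can be pictured with a small plain-Python sketch. It assumes that Join one to first keeps the first join row found for each key and that unchecking Keep all input records drops unmatched tracts. The function name, the sample keys, and the Median_Income column are hypothetical.

```python
def join_one_to_first(input_rows, join_rows, key, keep_all=False):
    """Attach the first matching join row to each input row."""
    first_match = {}
    for row in join_rows:
        # setdefault keeps the first row seen for each key and ignores later ones.
        first_match.setdefault(row[key], row)

    joined = []
    for row in input_rows:
        match = first_match.get(row[key])
        if match is None and not keep_all:
            continue  # "Keep all input records" unchecked: drop unmatched rows
        merged = dict(row)
        for name, value in (match or {}).items():
            if name != key:
                merged["ACS_Data." + name] = value
        joined.append(merged)
    return joined


tracts = [{"GEOID": "A"}, {"GEOID": "B"}, {"GEOID": "C"}]
acs = [
    {"GEOID": "A", "Median_Income": 50000},
    {"GEOID": "A", "Median_Income": 99999},  # duplicate key, ignored
    {"GEOID": "B", "Median_Income": 62000},
    {"GEOID": "Z", "Median_Income": 41000},  # no matching tract, never joined
]
result = join_one_to_first(tracts, acs, "GEOID")
print(len(result))
```

Statewide rows without a matching tract (key Z here) are never attached, which mirrors why only the New York City tracts receive ACS values.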

Export the joined layer

The join exists only virtually in the layer; the joined fields are not stored in the layer's underlying data source. You will export the census tracts layer as a feature class to store the joined fields with the census tract features.

  1. In the Contents pane, right-click NY Census Tracts, point to Data and choose Export Features.

    Export Features option in the Data menu for the NY Census Tracts layer in the Contents pane

    The Export Features window appears. The Input Features parameter is already set properly because you right-clicked the layer to export it.

  2. For Output Feature Class, replace the default name with NY_ACS_Tracts.

    Output Feature Class set to NY_ACS_Tracts in the Export Features window

  3. Click OK.
  4. Remove the NY Census Tracts layer from the map.
  5. In the Contents pane, rename NY_ACS_Tracts to NYC Census Tracts.

    Now the census tracts layer you have in the map contains all the attributes from the ACS table and is its own data source. If you share this data source in any form, all the attributes will be present.

  6. Save the project.

Add and calculate fields

Next, you will add and calculate two percentage fields: one for education level and one for reproductive age.

  1. In the Contents pane, right-click NYC Census Tracts, point to Data Design, and choose Fields.
  2. Scroll to the bottom and click Click here to add a new field two times.

    Two rows appear. Next, you will edit the field properties.

  3. For the first row, enter the following properties:
    • For Field Name, type or copy and paste Bachelors_degree_higher_women.
    • For Alias, type or copy and paste Is a bachelor's degree or higher attainable for women?
    • For Data Type, choose Double.
  4. For the second field, enter the following properties:
    • For Field Name, type or copy and paste Percent_reproductive_age.
    • For Alias, type or copy and paste What percent of women are at reproductive age?
    • For Data Type, choose Double.

    Field Name, Alias, and Data Type set for the two new fields in the NY Census Tracts Fields view

    Note:

    The green boxes next to the field indicate there are unsaved changes.

  5. On the ribbon, on the Fields tab, in the Manage Edits group, click Save.
  6. Close the Fields view.

    To measure the indicators for education and reproductive health, you will calculate the fields as percentages.

  7. Open the attribute table for NYC Census Tracts and scroll to the end of the table to see the two fields you just added.

    Two new fields at the end of the attribute table for NY Census Tracts

  8. Right-click Is a bachelor's degree or higher attainable for women? and choose Calculate Field.
  9. In the Expression section, for Bachelors_degree_higher_women =, copy and paste the following expression: (!Women_getting_a_Bachelor_s_Degree_or_higher! / !Total_Female_Population_for_Education!) * 100

    Expression to calculate percent education in the Calculate Field window

  10. Click the green check mark to validate the expression and click Apply.

    A message window appears and states a warning because not all records have a value. This is fine and you will proceed.

  11. Close the warning window and click OK.

    Calculated education percentage field in the NYC Census Tract attribute table

    You have calculated the percentage of women with a bachelor's degree or higher. Next, you will calculate the other field in a similar manner.

  12. In the attribute table, right-click What percent of women are at reproductive age? and choose Calculate Field.
  13. In the Expression section, for Percent_reproductive_age =, clear the existing expression.
  14. Copy and paste the following expression: (!Women_at_reproductive_age_15_to_44! / !Total_Female_Population_for__reproductive_health!) * 100

    Expression to calculate reproductive age percentage in the Calculate Field window

  15. Click Apply.

    A similar warning appears, which is fine and expected.

  16. Close the warning window and click OK.

    Calculated percent of women at reproductive age in the NYC Census Tract attribute table

    In the United States, higher educational attainment is generally associated with higher income. When you look at this table, consider whether women living in these areas have good models of success, which can be measured by median income, educational attainment, and earnings relative to men. The percentage of women of reproductive age can help gauge the impact of changes in state laws, such as an abortion ban, and can guide increased outreach for gender-specific health services.

  17. Close the table and save the project.

    You have added and calculated two fields to account for key indicators in your analysis: percentages of women at certain education levels and at reproductive age.
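The percentage calculation, and one plausible source of the warning, can be sketched in plain Python. The sketch assumes the warning comes from tracts whose input values are null or zero; the function name, shortened field names, and sample numbers are hypothetical.

```python
def percent(numerator, denominator):
    """Return numerator / denominator * 100, or None when it cannot be computed."""
    if numerator is None or denominator is None or denominator == 0:
        return None  # the record would stay empty (null) in the table
    return numerator / denominator * 100


rows = [
    {"women_bachelors": 450, "total_women": 1200},
    {"women_bachelors": None, "total_women": None},  # tract with no ACS values
    {"women_bachelors": 0, "total_women": 0},        # tract with no population
]
results = [percent(r["women_bachelors"], r["total_women"]) for r in rows]
print(results)  # [37.5, None, None]
```

Records that cannot be calculated stay empty, so they can be excluded from later mapping rather than shown as 0 percent.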

You have created an indicator layer from existing data sources, including a shapefile and CSV file. You imported the data into the geodatabase, added fields, joined the data, exported it, and calculated fields. Next, you will create an environmental indicator using raster data.


Use raster data to create a tree canopy layer

In this section, you will prepare an indicator to measure tree canopy. Tree canopy is a common environmental indicator and can be paired with others, such as temperature, to provide a fuller picture of an area. Historically, trees have often been unequally distributed across American cities, and tree canopy is a luxury in an urban space such as New York City. You'll use tree canopy as an environmental indicator to understand tree distribution and which women have access to shade.

Explore a land-cover image

You will start by adding land-cover data created from lidar data for New York City. The image classifies land cover into eight types.

  1. Return to the project in ArcGIS Pro.
  2. Go to the IndicatorData folder connection and expand the Land_Cover folder.

    Land-cover image in Land_Cover folder in the Catalog pane

    This image is a 6-inch resolution land-cover raster dataset for New York City.

  3. Add the image to the map.
  4. In the Contents pane, turn off the NYC Census Tracts layer.

    Land-cover image added to map

    This image layer has been classified into eight classes. Next, you'll review the image data by exploring its attribute table.

  5. Open the attribute table for the NYC_2017_LiDAR_LandCover.img layer.

    Notice the eight land-cover classifications that are present in the table. There are 7,446,483,259 cells in the raster classified as Tree Canopy.

    Tree canopy cell count in the attribute table for NYC_2017_LiDAR_LandCover.img

    When you think of places such as New York City or other urban spaces, you probably think of all the buildings, sidewalks, and busy streets. This reality makes trees and grass a luxury.

  6. Close the table.

Reclassify the land-cover image

Of the eight land-cover classifications in the image, you are only interested in the Tree Canopy classification. Next, you'll use a geoprocessing tool to reclassify the image and isolate only the cells classified as tree canopy.

  1. On the ribbon, click the Analysis tab. In the Geoprocessing group, click Tools.

    Tools in the Geoprocessing group on the Analysis tab

    The Geoprocessing pane appears. From here, you can search for tools by name or by the toolbox they are stored in.

  2. In the Geoprocessing pane, in the Find Tools bar, type reclass. Click the Reclassify (Spatial Analyst Tools) tool.

    Reclassify tool in the Geoprocessing pane

  3. In the Reclassify tool, set the following parameters:

    • For Input raster, click the drop-down menu and choose NYC_2017_LiDAR_LandCover.img.
    • Ensure that Reclass field is set to Class.
    • In the Reclassification table, for the Tree Canopy row, leave the value in the New column set to 1. Change the value in the New column for all other classes, except for NODATA, to 0.

    Parameters set in the Reclassify tool pane

  4. For Output raster, click the Browse button and browse to the IndicatorData folder. For Name, type TreeCanopyNYC.tif and click Save.

    Output raster parameter set to TreeCanopyNYC.tif in the Reclassify tool pane

    Note:

    Depending on your system, the Reclassify tool may take up to 20 minutes to complete.

    Alternatively, you can download the results data to use the TreeCanopyNYC.tif image file. To use this data instead, download and extract the .zip file to your computer and add it to your project in place of TreeCanopyNYC.tif.

  5. Click Run.

    When the image is finished processing, it appears on the map.

    Resulting image showing the Tree canopy layer with two classes

  6. In the Contents pane, remove the NYC_2017_LiDAR_LandCover.img layer.

    The TreeCanopyNYC.tif layer has two classes: tree canopy (1) and all other land-cover classifications (0). You can use this raster to calculate the tree canopy variable that will be the measure for the environment indicator.

  7. Save the project.
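The reclassification logic can be sketched on a tiny grid in plain Python. This is only an illustration of the 1/0 remapping, not the Reclassify tool itself. Every class name other than Tree Canopy is hypothetical, and NoData is represented by None.

```python
NODATA = None  # stand-in for NoData cells, which are left unchanged


def reclassify(grid, keep_value):
    """Return a grid with 1 where a cell equals keep_value, 0 elsewhere, NoData kept."""
    return [
        [NODATA if cell is NODATA else int(cell == keep_value) for cell in row]
        for row in grid
    ]


# Hypothetical 2 x 4 land-cover grid.
land_cover = [
    ["Tree Canopy", "Tree Canopy", "Buildings", "Roads"],
    ["Water", "Tree Canopy", "Buildings", NODATA],
]
tree_canopy = reclassify(land_cover, "Tree Canopy")
print(tree_canopy)  # [[1, 1, 0, 0], [0, 1, 0, None]]
```

Because tree canopy cells become 1 and everything else becomes 0, summing the cells later gives a direct count of tree canopy cells.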

Next, you will use the Zonal Statistics as Table tool to summarize the amount of tree canopy in each census tract.

Summarize tree canopy within each census tract

For the indicator, you're interested in the presence of trees: a higher value represents more tree cover, which is a positive environmental factor. To determine tree cover in each census tract, you will summarize the tree canopy cells based on the census tract polygons.

  1. In the Geoprocessing pane, click the back button. Search for and open the Zonal Statistics as Table (Spatial Analyst Tools) tool.

    Zonal Statistics as Table tool in the Geoprocessing tool pane

    This tool will summarize the number of tree canopy cells within each census polygon and provide a count of the total number of cells within each zone (polygon). This will allow you to calculate the percentage of the polygon cells covered with trees.

  2. In the Zonal Statistics as Table tool, enter the following parameters:
    • For Input Raster or Feature Zone Data, choose NYC Census Tracts.
    • For Zone Field, choose GEOID [GEOID].
    • For Input Value Raster, choose TreeCanopyNYC.tif.
    • For Output Table, type TreePixels.
    • For Statistics Type, choose Sum.

    Zonal Statistics as Table parameters set

    Note:

    Depending on your system, the Zonal Statistics as Table tool may take up to 30 minutes to complete.

    Alternatively, you can download the results data, extract the zip file, and add the TreePixels table to your project.

  3. Click Run.

    When the tool completes, the TreePixels table appears in the Contents pane under Standalone Tables.

    TreePixels table under the Standalone Tables section in the Contents pane

  4. Open the TreePixels table.

    The table contains two columns of interest: COUNT, which is the total number of pixels within each census tract, and SUM, which is the sum of tree canopy pixels.

    COUNT and SUM fields in the TreePixels table

    You'll calculate the percent of tree canopy for each census polygon using the following formula: PctTreeCanopy = (Sum / Count) * 100.

  5. In the attribute table, click Calculate.

    Calculate button in the table for TreePixels

    The Calculate Field tool appears. Previously, you created fields before opening the Calculate Field tool. This time, you will create the field and calculate it simultaneously.

  6. In the Calculate Field tool, for Field Name (Existing or New), type PctTreeCanopy.
  7. For Field Type, choose Double (64-bit floating point).
  8. Under Expression, for PctTreeCanopy =, build the expression (!SUM! / !COUNT!)*100.

    Calculate Field parameters to calculate the percent of tree canopy in each census tract

  9. Click OK.

    PctTreeCanopy field added and calculated in the TreePixels table

    The PctTreeCanopy field appears at the end of the attribute table and is calculated.

    The PctTreeCanopy value represents the percentage of the census tract with tree cover and is the measure for the environment indicator.

  10. Close the TreePixels table, turn off TreeCanopyNYC.tif, and save the project.
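How COUNT, SUM, and PctTreeCanopy relate can be sketched with two small grids in plain Python. This is a simplified stand-in for the Zonal Statistics as Table tool: it assumes the zones are already rasterized to the same grid as the tree canopy cells, and the zone labels A and B are hypothetical.

```python
from collections import defaultdict


def zonal_sum_and_count(zone_grid, value_grid):
    """Accumulate COUNT and SUM of value cells for each zone label."""
    stats = defaultdict(lambda: {"COUNT": 0, "SUM": 0})
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            if zone is None or value is None:
                continue  # skip cells outside any zone or with NoData
            stats[zone]["COUNT"] += 1
            stats[zone]["SUM"] += value
    return dict(stats)


zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
]
canopy = [  # 1 = tree canopy, 0 = everything else
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
table = zonal_sum_and_count(zones, canopy)
pct_tree_canopy = {zone: s["SUM"] / s["COUNT"] * 100 for zone, s in table.items()}
print(pct_tree_canopy)  # {'A': 75.0, 'B': 25.0}
```

Because the value raster holds only 1 and 0, SUM divided by COUNT is the fraction of each zone covered by tree canopy.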

You have reclassified a land-cover image to isolate the cells that you want to include in the indicator (tree canopy) and summarized the tree cover by census tract. Now you know the percentage of tree cover in each census tract in New York City. The TreePixels table is ready to be joined to the NYC Census Tracts layer.


Add an indicator based on proximity

The next indicators you create will measure access. Organizations often need to determine where things such as gender-based resources are located and then how accessible those locations are. Access is usually measured as proximity to a location. You will create point layers that represent the locations of women's facilities and then buffer the facilities by a half mile to determine proximity to them. You will do the same with eviction locations because studies have shown that Black and brown women are often negatively impacted by forced ejectments. You want to know the areas in New York City where women are experiencing forced ejectments from their homes or rentals.

Create points from a table

You have worked with tabular data throughout this tutorial, but so far all of it has been nonspatial, meaning it had no spatial component such as coordinates. Next, you will map evictions from a table that contains the coordinates of their locations.

  1. In the Catalog pane, from the IndicatorData folder, add Evictions.csv to the map.
  2. Open the Evictions.csv table and scroll to the right until you see the Latitude and Longitude fields.

    Latitude and Longitude fields

    The Latitude and Longitude fields store the coordinates for each eviction. You will use these fields to map the evictions as points on the map.

  3. Close the table.
  4. On the ribbon, on the Map tab, in the Layer group, click XY Table To Point.

    XY Table To Point tool

    The XY Table To Point tool appears in the Geoprocessing pane.

  5. In the XY Table To Point tool, set or verify the following parameters:
    • For Input Table, choose Evictions.csv.
    • For Output Feature Class, replace the default name with Evictions.
    • For X Field, verify that Longitude is selected.
    • For Y Field, verify that Latitude is selected.
    • For Coordinate System, verify that GCS_WGS_1984 is selected.

    XY Table To Point tool parameters

    The XY Table To Point tool chooses smart parameter defaults based on the field names.

  6. Click Run.

    Eviction points on the map

    Note:

    A warning about null values may appear. You can ignore it.

    Next, you will add a table containing women's facilities and map those locations using the same tool.

  7. From the Catalog pane, add Womens_Facilities.csv to the map.

    Womens_Facilities.csv file

  8. On the Map tab, click XY Table To Point.
  9. In the XY Table To Point tool, set the following parameters:
    • For Input Table, choose Womens_Facilities.csv.
    • For Output Feature Class, change the name to WomensResources.
    • For X Field, choose Location 2.
    • For Y Field, choose Location 1.
    • For Coordinate System, verify that GCS_WGS_1984 is selected.

    Women's facilities parameters

  10. Click Run.
  11. In the Contents pane, turn off Evictions to see the WomensResources points.

    Women's facilities on map

    Note:

    To see the points better, you can change the color.

    You have created two feature layers from nonspatial tables to map important criteria for the indicators.
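Turning coordinate columns into points can be sketched with Python's csv module. The sample rows and the Docket column are hypothetical, and the sketch simply counts and skips rows with empty coordinates, which may differ from how the XY Table To Point tool handles them.

```python
import csv
import io

# Hypothetical sample rows; the real Evictions.csv has many more columns.
sample_csv = """Docket,Latitude,Longitude
1001,40.7128,-74.0060
1002,,
1003,40.6782,-73.9442
"""


def rows_to_points(text, x_field, y_field):
    """Build simple point records from a CSV string, skipping unusable rows."""
    points, skipped = [], 0
    for row in csv.DictReader(io.StringIO(text)):
        try:
            x, y = float(row[x_field]), float(row[y_field])
        except (TypeError, ValueError):
            skipped += 1  # empty or non-numeric coordinates cannot be mapped
            continue
        points.append({"x": x, "y": y, "attributes": row})
    return points, skipped


# X is longitude and Y is latitude, matching the tool parameters above.
points, skipped = rows_to_points(sample_csv, "Longitude", "Latitude")
print(len(points), skipped)  # 2 1
```

Swapping the X and Y fields is a common mistake; it would place New York City points at a latitude of about -74, far from their true location.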

Filter data to only show certain types of features

Now that you have all the points on the map, you will narrow the focus of your analysis to a specific type of eviction. You're only interested in ejectments, so you will filter the data to keep only those records. A big part of analysis is narrowing the focus of your data to include only what is relevant, such as tree canopy cover and ejectments.

  1. Open the attribute table for Evictions.
  2. Scroll and locate the Ejectment field.

    Ejectment field

    You'll use this field to make the attribute selection.

  3. In the table, click Select By Attributes.

    Select By Attributes

  4. For Where, click the drop-down menu and choose Ejectment.
  5. For the second drop-down menu, keep is equal to and for the last drop-down menu, choose Ejectment.

    Selection expression

  6. Click OK.
  7. In the lower-left corner of the table, click Show Selected Records.

    Show Selected Records button

    Now only the selected records show. There should be 67 records selected. You will switch the selection to select the features that you don't want to use and delete them.

  8. In the table, click Switch Selection.

    Switch Selection button

    Now, 89,835 records that you don't need are selected.

  9. Click Delete Selection.

    Delete Selection button

  10. Click Yes to confirm the deletion.
  11. Click Show All Records.

    Show All Records button

  12. Close the table and save the project.

    Now the Evictions table contains only the 67 records that you want to include in your analysis.
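The select, switch, and delete sequence can be sketched with a short list of records in plain Python. The id values and the Not an Ejectment value are hypothetical; only the Ejectment field name and its Ejectment value come from the steps above.

```python
evictions = [
    {"id": 1, "Ejectment": "Ejectment"},
    {"id": 2, "Ejectment": "Not an Ejectment"},
    {"id": 3, "Ejectment": "Not an Ejectment"},
    {"id": 4, "Ejectment": "Ejectment"},
]

# Select By Attributes: Ejectment is equal to Ejectment.
selected = {r["id"] for r in evictions if r["Ejectment"] == "Ejectment"}

# Switch Selection: everything that was not selected.
switched = {r["id"] for r in evictions} - selected

# Delete Selection: remove the switched records, keeping only ejectments.
evictions = [r for r in evictions if r["id"] not in switched]

print(sorted(selected), sorted(switched), len(evictions))
```

The records that remain are exactly the ones in the original selection, which is why the table ends with the same count that Select By Attributes reported.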

Create walk-time buffers

Next, you will incorporate proximity to the evictions and women's facilities into your analysis. You will create half-mile buffers around the features to represent walking distance.

  1. In the Geoprocessing pane, search for and open the Pairwise Buffer tool.

    Pairwise Buffer tool in the Geoprocessing pane

  2. In the Pairwise Buffer tool, set the following parameters:
    • For Input Features, choose WomensResources.
    • For Output Feature Class, replace the default with ResourcesBuffer.
    • For Distance, type 0.5.
    • Under Linear Unit, choose US Survey Miles.
    • For Method, choose Geodesic (shape preserving).
    • For Dissolve Type, choose Dissolve all output features into a single feature.

    Parameters entered in the Pairwise Buffer tool to create buffers around the WomensResources layer

  3. Click Run.
  4. In the Contents pane, ensure that the only visible layers, aside from the basemaps, are WomensResources and ResourcesBuffer.

    ResourcesBuffer layer on the map

    You have created buffers for the resources points. Next, you will create buffers for the evictions features.

  5. In the Pairwise Buffer tool pane, which is still open, update the following parameters:
    • For Input Features, choose Evictions.
    • For Output Feature Class, replace the default with EvictionsBuffer.

    Input Features and Output Feature Class parameters updated in the Pairwise Buffer tool to create buffers around the Evictions point layer

  6. Click Run.
  7. In the Contents pane, turn off WomensResources and ResourcesBuffer and turn on Evictions and EvictionsBuffer.

    EvictionsBuffer layer added to the map

    You have created layers to represent half-mile buffers around the evictions and women's resources points. Having these buffers allows you to incorporate proximity into your indicator preparation.
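What a half-mile buffer means for a single location can be sketched as a distance test in plain Python. The Pairwise Buffer tool builds geodesic polygons; this sketch instead uses the haversine formula on a sphere, so treat it as an approximation. The function names and coordinates are hypothetical.

```python
import math

US_SURVEY_FOOT_M = 1200 / 3937              # US survey foot expressed in meters
HALF_MILE_M = 0.5 * 5280 * US_SURVEY_FOOT_M  # 0.5 US survey miles, about 804.67 m
EARTH_RADIUS_M = 6371008.8                   # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_half_mile(point, facility):
    """True when point lies inside the half-mile circle around facility."""
    return haversine_m(*point, *facility) <= HALF_MILE_M


facility = (40.7500, -73.9900)  # hypothetical facility (latitude, longitude)
near = (40.7540, -73.9900)      # roughly 445 m north of the facility
far = (40.7600, -73.9900)       # roughly 1,112 m north of the facility

print(within_half_mile(near, facility), within_half_mile(far, facility))
```

A buffer polygon answers the same question for every location at once, which is what lets you overlay it on the census tracts in the next section.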

Create indicator tables

Now you are ready to create the indicator tables.

  1. In the Geoprocessing pane, click the back arrow. Search for and open the Tabulate Intersection tool.
  2. In the Tabulate Intersection tool, set the following parameters:
    • For Input Zone Features, choose NYC Census Tracts.
    • For Zone Fields, choose GEOID [GEOID].
    • For Input Class Features, choose EvictionsBuffer.
    • For Output Table, type EvictionsIndicator.
    • For Sum Fields, choose SHAPE_Area.

    Parameters entered in the Tabulate Intersection tool

  3. Click Run.

    In the Contents pane, the EvictionsIndicator table appears under Standalone Tables.

    EvictionsIndicator table added

    Next, you will create the indicator table for women's resources.

  4. In the Tabulate Intersection tool, change only the following parameters:
    • For Input Class Features, choose ResourcesBuffer.
    • For Output Table, change the name to ResourcesIndicator.

    Parameters updated in the Tabulate Intersection tool pane for the ResourcesIndicator table parameters

  5. Click Run.

    In the Contents pane, the ResourcesIndicator table appears under Standalone Tables.

  6. Open both indicator tables.
  7. Click the tab for one of the tables and drag it until you see the options for docking. Dock it to the right of the other table.

    ResourcesIndicator table docked to the right side of the pane.

    Each table contains a PERCENTAGE field, which stores the percentage of each tract's area that falls within the buffer. The two fields measure proximity to two different things.

    Indicator tables side by side to compare the PERCENTAGE field

    Higher percentage values for evictions are bad because they represent forced unhousing of people. Access to women's resources, on the other hand, is a positive measure, so higher percentages in that table mean increased access to gender-specific services.

  8. Undock the table, close both tables, and save the project.

    Next, you will join the evictions and resources indicator tables to census tracts so you have percentages of each for each tract.
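The idea behind the PERCENTAGE field can be sketched in a few lines of Python. This is a simplified illustration, not the tool's implementation: it assumes axis-aligned rectangles in place of real tract and buffer polygons, and the tract names and coordinates are made up.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (an assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles; 0 when they do not overlap."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(width, 0.0) * max(height, 0.0)


def percent_covered(zone: Rect, class_feature: Rect) -> float:
    """Percentage of the zone's area that the class feature covers."""
    return 100.0 * overlap_area(zone, class_feature) / zone.area


# Hypothetical tracts (zones) and one hypothetical dissolved buffer (class).
tracts = {
    "tract_A": Rect(0, 0, 10, 10),
    "tract_B": Rect(10, 0, 20, 10),
    "tract_C": Rect(20, 0, 30, 10),
}
buffer_zone = Rect(5, 0, 20, 10)

percentages = {
    geoid: percent_covered(rect, buffer_zone) for geoid, rect in tracts.items()
}
for geoid, pct in percentages.items():
    print(geoid, pct)
```

In this sketch, a tract that lies entirely inside the buffer scores 100 and a tract that only touches the buffer's edge scores 0.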

Organize the Contents pane

Now that you have all the data that you want for the indicators, you will quickly organize the Contents pane before you join the data. You'll create a group layer to help organize the layers.

  1. In the Contents pane, press Ctrl and click all the layers except NYC Census Tracts to simultaneously select them.

    Selected layers in the Contents pane

  2. Right-click one of the selected layers and choose Group.

    Group option for all the selected layers in the Contents pane

    This groups all selected layers in a group called New Group Layer.

  3. Click the name New Group Layer one time to select it and click it again to make it editable.
  4. For the name, type Working Data.

    Working Data group layer expanded in the Contents pane

    Next, you will join indicator data.

Join indicator tables to census tracts

You have three indicators in stand-alone tables: TreePixels, EvictionsIndicator, and ResourcesIndicator. To get this information into the census tracts, you will perform three join operations to append the fields from the indicator tables to the census tracts.

  1. In the Contents pane, right-click NYC Census Tracts, point to Joins and Relates, and choose Add Join.
  2. In the Add Join tool, enter the following parameters:
    • For Input Table, choose NYC Census Tracts.
    • For Input Field, choose GEOID [GEOID].
    • For Join Table, choose TreePixels.
    • For Join Field, choose GEOID.
    • Leave Keep all input records checked.
    • For Join Operation, choose Join one to first.

    Parameters entered in the Add Join tool to join the TreePixels table fields to the NYC Census Tracts layer

  3. Click OK.

    Nothing happens on the map, but the attributes are appended to the NYC Census Tracts table. You will complete the other two joins and explore the table.

    Next, you'll repeat the join for the EvictionsIndicator and ResourcesIndicator tables.

  4. Open the Add Join tool for the NYC Census Tracts layer and enter the following parameters:
    • For Input Table, choose NYC Census Tracts.
    • For Input Field, choose GEOID (there are many now due to the joins, but any will work).
    • For Join Table, choose EvictionsIndicator.
    • For Join Field, choose GEOID.
    • For Join Operation, choose Join one to first.
    • Leave Keep all input records checked.
  5. Click OK.

    Finally, you will join the ResourcesIndicator table to the census tracts.

  6. Open the Add Join tool for the NYC Census Tracts layer and enter the following parameters:
    • For Input Table, choose NYC Census Tracts.
    • For Input Field, choose GEOID (there are many now due to the joins, but any will work).
    • For Join Table, choose ResourcesIndicator.
    • For Join Field, choose GEOID.
    • For Join Operation, choose Join one to first.
    • Leave Keep all input records checked.
  7. Click OK.

    You have joined all the tables that you need to the NYC Census Tracts layer. Next, you will export the joined layer to its own feature class and clean up the fields in the process.
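The join settings used above can be illustrated with plain Python dictionaries. This is a minimal sketch of the logic, not the Add Join tool itself: the function name, the GEOID values, and the percentages are hypothetical, and it assumes that unmatched tracts receive empty values.

```python
def join_one_to_first(input_rows, join_rows, key, join_name):
    """Left join that keeps every input row and uses only the first match."""
    first_match = {}
    for row in join_rows:
        # setdefault ignores later rows that repeat a key ("join one to first").
        first_match.setdefault(row[key], row)

    # Collect the non-key field names that the join table contributes.
    join_fields = []
    for row in join_rows:
        for field in row:
            if field != key and field not in join_fields:
                join_fields.append(field)

    joined = []
    for row in input_rows:
        match = first_match.get(row[key])
        out = dict(row)
        for field in join_fields:
            # Unmatched input rows are kept, with empty (None) joined values.
            out[f"{join_name}.{field}"] = match.get(field) if match else None
        joined.append(out)
    return joined


# Hypothetical sample data; real GEOID values are longer codes.
tracts = [{"GEOID": "T1"}, {"GEOID": "T2"}, {"GEOID": "T3"}]
evictions = [
    {"GEOID": "T1", "PERCENTAGE": 12.5},
    {"GEOID": "T1", "PERCENTAGE": 99.0},  # duplicate key, ignored
    {"GEOID": "T3", "PERCENTAGE": 40.0},
]

result = join_one_to_first(tracts, evictions, "GEOID", "EvictionsIndicator")
for row in result:
    print(row)
```

Prefixing the joined field with the table name mirrors how the joined fields are distinguished later in the tutorial, for example EvictionsIndicator.PERCENTAGE and ResourcesIndicator.PERCENTAGE.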

Export census tracts

The NYC Census Tracts layer now has four tables joined to it. As you did earlier with the join, you'll export the layer to its own data source.

  1. In the Contents pane, right-click NYC Census Tracts, point to Data, and choose Export Features.
  2. In the Export Features tool, change the Output Feature Class parameter to Indicators.

    When you join data, you append many fields to one table, and you may want to delete some fields or rename some field aliases to make the data clearer. Next, you will clean up the fields before you export the data.

  3. Expand Fields, check Use Field Alias as Name, and click Edit.

    Use Field Alias as Name

    The Field Properties window appears. You'll keep only the fields for the exploratory analysis and rename the indicator fields.

  4. If necessary, point to the vertical divider next to the Fields section and resize it so you can see the full field aliases.

    Resizing the divider

  5. In the Fields section, click What's the median income for women? In the Properties section, for Alias, type Median Income Women.

    Alias changed

  6. Using the same workflow, change the alias for each of the following fields as stated:
    • Change Are women earning more than men? to Pay Equity.
    • Change Is there an abortion ban? Yes or No to Abortion Ban.
    • Change Are child marriages legal? Yes or No to Child Marriages.
    • Change Percent White Women to White Women.
    • Change Percent Black Women to Black Women.
    • Change Percent American Indian or Alaska Native Women to AIAN Women.
    • Change Percent Asian Women to Asian Women.
    • Change Percent Native Hawaiian or Other Pacific Islander Women to NHOPI Women.
    • Change Percent Mixed Race Women to Mixed Race Women.
    • Change Percent Hispanic or Latino Women to Hispanic or Latino Women.
    • Change EducationForWomen to Education.
    • Change WomenAtReproductiveAge to Women at Reproductive Age.
    • Change PctTreeCanopy to Tree Canopy.
    • Change PERCENTAGE (EvictionsIndicator.PERCENTAGE) to Evictions.
    • Change PERCENTAGE (ResourcesIndicator.PERCENTAGE) to Gender Based Resources.

    Alias name updated

    Next, you will delete some fields that you don't need.

  7. In the Fields list, click Total Female Population for Education and click the Remove button.

    Remove button

  8. In the same manner, remove the following fields:
    • Women getting a Bachelor's Degree or higher.
    • Total Female Population for reproductive health.
    • Women at reproductive age 15 to 44.
  9. Click OK to close the Field Properties window and click OK again to run the export.

    The Indicators layer appears on the map and in the Contents pane.

  10. Open the attribute table for the Indicators layer and scroll to the right until you see the updated aliases being used as the field header.

    Field aliases in table

Modifying the aliases during the export was a good way to make the table easier to interpret. Now you have all the indicators available in the tracts layer. You can use those indicator fields for symbology, labeling, querying, and analysis.
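The field cleanup performed during the export amounts to a rename map plus a list of fields to drop. The following sketch shows that idea on a single record. It assumes each record is a dictionary keyed by field alias, uses only a few of the aliases from the steps above, and the attribute values are made up.

```python
# A subset of the alias changes from the tutorial steps.
ALIAS_MAP = {
    "What's the median income for women?": "Median Income Women",
    "Are women earning more than men?": "Pay Equity",
    "PERCENTAGE (EvictionsIndicator.PERCENTAGE)": "Evictions",
    "PERCENTAGE (ResourcesIndicator.PERCENTAGE)": "Gender Based Resources",
}

# A subset of the fields removed in the tutorial steps.
FIELDS_TO_REMOVE = {
    "Total Female Population for Education",
    "Total Female Population for reproductive health",
}


def export_row(row, alias_map, fields_to_remove):
    """Copy a record, dropping unwanted fields and renaming the rest."""
    out = {}
    for name, value in row.items():
        if name in fields_to_remove:
            continue
        # Fields without an entry in the map keep their original name.
        out[alias_map.get(name, name)] = value
    return out


# Hypothetical record with made-up values.
sample_row = {
    "GEOID": "T1",
    "What's the median income for women?": "41000",
    "Total Female Population for Education": 1200,
    "PERCENTAGE (EvictionsIndicator.PERCENTAGE)": 12.5,
}

exported = export_row(sample_row, ALIAS_MAP, FIELDS_TO_REMOVE)
print(exported)
```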

You have created point layers from coordinates in tables to map evictions and women's resources. You buffered the evictions and women's resources points by a half-mile and used the buffers to create indicators for each variable. You also performed several joins to get all the indicators into the census tracts layer and exported it to its own feature class. The two indicator tables you created measure proximity, but for very different reasons. Higher percentages for evictions are bad because they represent forced unhousing, but it is important to highlight areas burdened by this issue. On the other hand, access to women's resources is a positive measure because women have more support in these areas. Next, you'll dig deeper into the data relationships using exploratory data analysis.


Explore the data using charts and symbology

Now that you have all the indicators in one layer, you will explore the variables in a scatter plot matrix to gain a better understanding of their relationships. An important part of conducting any analysis is to evaluate the resulting data after calculations are complete. This helps you determine whether the data distribution is skewed, which could affect your analysis, and whether additional adjustments or methods are needed for the most accurate results.

Explore the indicator data

You will create a scatter plot matrix to compare the relationship between each indicator. This is a helpful way to determine positive and negative correlations and the degree or magnitude of those correlations.

  1. In the Contents pane, right-click the Indicators layer, point to Create Chart, and choose Scatter Plot Matrix.

    Scatter Plot Matrix in the Create Chart menu for the Indicators layer in the Contents pane

    The Chart Properties pane and an empty chart window appear. When you set properties in the Chart Properties pane, the chart will automatically display and update in the chart window.

  2. In the Variables section, click Select.

    Select under Numeric fields in the Chart Properties pane

    A list of the attributes in the Indicators layer appears. For a scatter plot matrix, you must select at least three variables. One of the variables that you want to explore is Median Income Women, but it does not appear in the list.

  3. Open the Fields view for the Indicators layer.
  4. Locate the Median Income Women field and view its Data Type.

    Median Income Women Data Type set to Text in the Fields view for the Indicators layer

    The Median Income Women field has a type of Text. You cannot plot a text field in a scatter plot matrix, so you must add a numeric field and calculate it to store the income values.

  5. Using the skills you have performed in this tutorial, add a field called WomensMedianIncome with an Alias of Womens Median Income and a Data Type of Double.
  6. Calculate the WomensMedianIncome based on the Median Income Women field.

    Calculation expression for median income in the Calculate Field window

    You can disregard any warnings in the calculation.

    Womens Median Income field populated with value from the Median Income Women field

  7. In the Chart Properties pane, click Select.
  8. In the variables list, check the boxes for Pay Equity, Education, and Womens Median Income.

    The selected variables are listed.

    Selected variables appear under the Numeric fields in the Chart Properties pane

    The variables appear on the scatter plot matrix.

    Scatter plot matrix showing the relationship between the three specified fields

  9. Under Trend, click Show trend line.

    Show trend line option checked under Trend in the Chart Properties pane

    A trend line appears in each scatter plot to indicate the direction of the relationship between that pair of variables.

    Scatter plot matrix with trend lines visible, showing all three correlations have a positive trend

  10. In the Matrix Layout section, for Lower left, verify that Scatterplots is selected, and for Upper right, click the drop-down menu and choose Pearson's r.

    Lower left set to Scatterplots and Upper right set to Pearson's r in the Matrix Layout section in the Chart Properties pane

    The scatter plot matrix allows you to explore many relationships in a single chart. It visualizes the bivariate relationship between the variables you selected. Next, you'll explore the relationship of economic outcomes for white, Black, and Latino women.

  11. In the Chart Properties pane, for Variables, click Select and check the boxes for White Women, Black Women, and Hispanic or Latino Women.

    The White Women, Black Women, and Hispanic or Latino Women fields checked in the list of variables for Numeric fields in the Chart Properties pane

    These mini plots show r-values with diverging colors that correspond to the strength and direction of the relationship.

    Scatter plot showing the relationship between the six variables

    Next, you'll sort the mini plots.

  12. In the Chart Properties pane, in the Sort section, click the Sort by drop-down menu and choose Pearson's r. For Target field, choose Womens Median Income, and for Sort direction, choose Descending.

    Sort parameters set in the Chart Properties pane

    Pearson's r values range from -1 to +1. There are three relationships to look for in the scatter plot matrix:

    • Positive correlation, values close to +1.
    • No correlation, values close to 0.
    • Negative correlation, values close to -1.

    Three plots show a strong positive relationship, with values of 0.8, 0.55, and 0.6, respectively. Next, you'll explore the variables for each of the relationships.

    Three plots with strong positive relationships in the scatter plot matrix

  13. In the chart, click the box with the Pearson's r value of 0.8.

    The corresponding scatter plot for Education and Womens Median Income is outlined in the scatter plot matrix.

    Scatter plot that outlines when the 0.8 Pearson's r is selected

    The plot with a value of 0.8 represents the relationship between the Education and Womens Median Income variables. It is expected that as education increases, income would also increase.

  14. Click the box with the r-value of 0.55.

    Scatter plot that outlines when the 0.55 Pearson's r is selected

    The plot for the White Women variable is outlined. There is a strong positive relationship between the percentage of white women and median income, so as the percentage of white women increases, so does the median income.

  15. Click the box with the r-value of 0.6.

    The plot showing the relationship between the White Women and Education variables is outlined. Based on the chart, as the percentage of white women increases, the percentage of women with a bachelor's degree or higher also increases. Next, you'll explore whether there is a similar relationship for Black women.

  16. In the chart, click the boxes with the r-values of -0.26 and -0.32.

    Plots that outline when the -0.32 and -0.26 r-values are selected

    The plots for the Black Women variable are outlined. The relationships between Black women, income, and education show a negative correlation; therefore, as the percentage of Black women increases, both income and education decrease.

  17. To explore the relationship between Hispanic or Latino women, income, and education, click the r-values of -0.43 and -0.47.

    Scatter plots that are outlined when the r-values for the relationships between Hispanic or Latino women, income, and education are selected

    The relationships between Hispanic or Latino women, income, and education show a negative correlation; therefore, as the percentage of Hispanic or Latino women increases, both income and education decrease.

  18. Select the box with the r-value of -0.63.

    Relationship between Black and white women in the scatter plot matrix

    The selected plot represents the relationship between the percentages of Black and white women. The negative r-value means that as the percentage of one group increases, the percentage of the other decreases. Therefore, it is likely that these two groups often don't live in the same areas.

  19. Close any open windows except the map. Close the Chart Properties pane and save the project.

    You've just explored the data using a scatter plot matrix with Pearson's r values. If you were to use these indicators in an index, you would consider whether they are important to the outcomes and whether the indicator is the focus of the index. For example, you wouldn't include race or ethnicity in the index value calculations; however, you may use these factors to disaggregate the index. Pay equity is another example. It is a derived variable that compares the income of women to that of their male counterparts. Pay equity provides great insight into gender parity in income, but for an index with the current set of indicators, you may want to exclude it because you already have median income as a variable. It would fit better if you expanded the topic areas into subindices, such as an economics subindex that combines median income, pay equity, and a few other data points.
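Two ideas from the steps above can be sketched in Python: converting a text field to numbers and computing Pearson's r for a pair of variables. This is a minimal illustration with made-up values. The tutorial's actual Calculate Field expression appears only in the screenshot, so the to_number function is a hypothetical stand-in, and the sketch assumes that values that cannot be converted are left empty and skipped.

```python
import math


def to_number(text):
    """Return the text as a float, or None when it cannot be parsed."""
    try:
        return float(str(text).replace(",", ""))
    except (TypeError, ValueError):
        return None


def pearson_r(xs, ys):
    """Pearson's r for paired values, skipping pairs with a missing value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical tract values: percent with a degree and income stored as text.
education = [10.0, 20.0, 30.0, 40.0, 50.0]
income_text = ["30000", "41,000", "-", "62000", "70000"]

income = [to_number(value) for value in income_text]
r = pearson_r(education, income)
print(f"Pearson's r for education and income: {r:.2f}")
```

Skipping incomplete pairs is a design choice for this sketch; how the chart treats empty values is not stated in the tutorial.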

Map an indicator

Now that you've explored the indicator data using a scatter plot matrix and gained an understanding about the variables, you will display the Indicators layer using bivariate symbology. You'll create a relationship map of education and income. Relationship maps show a visual representation of two variables. This will help you see the interaction of the indicators in more than one dimension, which is often referred to as superdiversity or intersectionality.

  1. In the Contents pane, right-click the Indicators layer and click Symbology.

    The Symbology pane appears.

  2. For Primary symbology, click the drop-down menu and choose Bivariate Colors.

    Bivariate Colors for Primary symbology in the Symbology pane

  3. For Field 1, choose Education.
  4. For Field 2, choose Womens Median Income.
  5. For Method, verify that Quantile is selected.
  6. For Grid Size, verify that 3 x 3 is selected and keep the Pink-Blue-Purple color scheme.

    Field 1, Field 2, Method, and Grid Size parameters set in the Symbology pane

    Next, you'll change the outline color.

  7. For Template, click the existing color.

    Template in the Symbology pane

  8. Click the Properties tab. For Outline color, click the existing color and choose Gray 30%.

    Gray 30% for Outline color on the Properties tab in the Symbology pane

  9. For Outline width, change the current value to 0.2 pt.

    Outline width set to 0.2 pt

  10. Click Apply.

    Indicators layer styled by bivariate colors

    This symbolizes the relationship between education and median income from low to high. Areas where both education and median income for women are high are shaded purple. These areas are primarily in Manhattan and a portion of Brooklyn.

  11. Change the name of the layer to Education x Median Income for Women.
  12. Save the project.

    You've just completed two methods for exploratory data analysis: charts and mapping. Using charts, you can investigate relationship strength and identify indicators to exclude from an index. Typically, these will be highly correlated indicators that can skew index values. Mapping visualizations allow you to see patterns of multiple indicators, which is key to understanding social processes.
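The logic of a 3 x 3 bivariate map with the Quantile method can be sketched as follows: each field is split into three classes with roughly equal counts, and the pair of classes selects one of nine grid cells. This is a rank-based approximation with made-up values. The exact break values and tie handling in ArcGIS Pro may differ, and the function names are hypothetical.

```python
LEVELS = ("Low", "Medium", "High")


def quantile_classes(values, n_classes=3):
    """Assign each value a class from 0 to n_classes - 1 with near-equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, index in enumerate(order):
        # Integer division spreads the ranks evenly across the classes.
        # Tied values may land in different classes in this simple sketch.
        classes[index] = rank * n_classes // len(values)
    return classes


def bivariate_cells(field_1, field_2, n_classes=3):
    """Pair the class of field 1 with the class of field 2 for each record."""
    return list(
        zip(quantile_classes(field_1, n_classes), quantile_classes(field_2, n_classes))
    )


# Hypothetical values for six tracts.
education = [12, 45, 30, 60, 25, 80]
income = [28000, 52000, 95000, 75000, 30000, 40000]

cells = bivariate_cells(education, income)
labels = [f"{LEVELS[e]} education, {LEVELS[i]} income" for e, i in cells]
for label in labels:
    print(label)
```

A record labeled High education, High income corresponds to the purple corner of the grid described above.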

In this tutorial, you were introduced to the geographic approach to racial equity and social justice and applied it to indicator development. You prepared indicator layers using the American Community Survey data to obtain education, pay, and income data. You also learned how to reclassify imagery and calculate tree canopy based on pixels in the polygon tracts. Then, you developed an indicator based on proximity to look at access to gender-based resources. The final step was to perform an exploratory data analysis that you can use to identify highly correlated indicators, which can skew an index.

You can apply this indicator development methodology to other areas of interest around the world and can include data specific to your community. When preparing your own indicators, use data processing and indicators specific to your long-term goals, outcomes, and populations. You can find more on exploratory data analysis in this blog post.

You can find more tutorials in the tutorial gallery.