Prepare data for analysis

Before you begin your analysis of where to implement the mentoring program, you'll filter and adapt some data from a couple of sources.

Open the project

First, you'll download and open the default project data in ArcGIS Pro.

  1. Download the Identify_Schools.zip file.
  2. Locate the downloaded file on your computer.
    Note:

    Depending on your web browser, you may have been prompted to choose the file's location before you began the download. Most browsers download to your Downloads folder by default.

  3. Right-click the file and extract it to a location you can easily find, such as your Documents folder.
  4. Open the extracted Identify_Schools folder.

    The folder contains an ArcGIS Pro packaged project file (.ppkx) and a comma-separated values (.csv) table.

  5. If you have ArcGIS Pro installed on your machine, double-click the Identify_Schools.ppkx file. If prompted, sign in using your licensed ArcGIS account.
    Note:

    If you don't have access to ArcGIS Pro or an ArcGIS organizational account, see options for software access.

    Chatham County schools and high school attendance zones

    The project contains a map that shows the schools in Chatham County, Georgia, United States, and the attendance zones for the high schools. The schools are represented as point features, and the zones are represented as polygon features.

    These two layers were downloaded from the SAGIS Open Data Portal, the authoritative data clearinghouse for the area.

Add a definition query

There are many more schools than there are high school attendance zones. That's because the schools layer contains all schools, not just public high schools. You'll often find that data from authoritative sources contains more features and attributes than you need for your project.

  1. In the Contents pane, right-click Schools and click Attribute Table.

    Attribute Table option for the Schools layer

    Each row in this table has values for one school. The text at the bottom of the table tells you that there are 100 schools. The table has fields that contain values that describe each school.

    The TYPE field specifies the type of school for each point. Since you are interested in high school graduation rates, you'll use this attribute to filter the layer so it only shows high schools. The SCHOOL_TYP field identifies public and private schools.

  2. In the Contents pane, double-click Schools.

    The Layer Properties window appears.

  3. In the Layer Properties window, click the Definition Query tab and click New definition query.

    New definition query option

    A new query, Query 1, appears in the Definition Query pane. On the Where line, there are drop-down lists that allow you to choose a field, a relationship, and a value.

  4. Choose the field TYPE, choose is equal to, and choose High.

    Query that reads Type is equal to High

    This query will filter the layer to show high schools. Only the public schools are of interest for this project, so you'll add another clause to the query before running it.

  5. Click Add Clause.

    Another line appears in the Definition Queries pane.

  6. Leave the And operator and create the query SCHOOL_TYP is equal to Public School.

    Query that reads And SCHOOL_TYP is equal to Public School

    The clauses are joined by an And operator, so only school point features that are both high schools and public schools will be displayed on the map.

    The SQL switch at the upper right of the query builder allows you to see (and edit) the query clauses represented as Structured Query Language (SQL) code, but most of the time it is simpler to use these controls to build queries.

  7. Click Apply.
  8. Click OK.

    Map showing only public high schools

    You've filtered the Schools layer with a definition query. Now the layer only shows public high schools, the schools that are important for this part of the analysis.

Hide extra fields

The Schools layer has several fields that are not needed for this project. You'll hide them.

  1. In the attribute table for the Schools layer, click the header for the GRADES field.

    The field changes color to indicate that the field has been selected.

  2. Press Ctrl while clicking the headers for the AFFILIATIO, RESTRICTIO, ADDRESS, CITY, and ZIP fields.

    Now, multiple fields are selected.

  3. Right-click RESTRICTIO and choose Hide Field.

    Hide Field option

    These fields are still in the data table, but they are now hidden.

    The AttendanceZones layer also has fields that are not needed now. Since you'll be hiding most of the fields in the layer, you'll use the Fields view to hide the fields.

  4. In the Contents pane, right-click AttendanceZones, point to Data Design, and choose Fields.

    Fields option

    The Fields view appears.

  5. In the Fields view, uncheck the header box for Visible to turn off visibility for all fields.

    Visible checkbox turned off

  6. Check the Visible box next to NAME to turn that field back on.

    Visible checkbox for the NAME field

  7. On the ribbon, on the Fields tab, in the Changes group, click Save.
  8. Close the Fields view pane.
  9. In the Contents pane, right-click AttendanceZones and choose Attribute Table.

    The attribute table for the AttendanceZones layer appears and the fields that you don't need for the analysis are hidden. Only the NAME field is displayed.

Compare the two tables

Next, you'll compare the tables for the Schools and AttendanceZones layers.

  1. If necessary, open the attribute tables for the AttendanceZones and Schools layers.

    The two tables may be arranged as two tabs within a single table pane. To compare them side-by-side, you'll drag one of the tabs out of that pane.

  2. Drag the AttendanceZones tab and dock it to the side of the attribute table for the Schools layer.

    AttendanceZones table tab to side dock zone

    As you drag the pane—represented by a blue shadow—docking targets appear in the center of the table view and at the edges of the application window. Each target represents an area where the pane can be positioned.

  3. In the Schools table, right-click the NAME field header and choose Sort Ascending.
  4. In the AttendanceZones table, right-click the NAME field header and choose Sort Ascending.

    Based on the two tables, most public high school students attend a school located in their attendance zone, but some attend schools that draw students across zone boundaries.

    Eight high schools—Beach, Groves, Islands, Jenkins, Johnson, New Hampstead, Windsor Forest, and School of Liberal Studies (at Savannah High School)—are in both tables.

    Schools in both tables

    However, there are three additional schools are not in the AttendanceZones zones. The names of these schools are Savannah Arts Academy, Savannah Early College (at Savannah High School), and Woodville Tompkins High / Twilight Program. These three schools draw students from across attendance zones as magnet schools.

    In the United States, magnet schools are public schools or education programs that offer a wide range of distinctive curriculum programs, often specializing in an academic subject, such as math, science, or technology. The purpose of magnet schools is to provide students access to specialized academic opportunities that are not bound by residential geography and school zone designations. Acceptance into magnet school programs varies by school districts but is often offered based on a combination of merit, a lottery system, and other criteria.

Add information by joining tables

Neither of the two current layers contains student counts or graduation rates for the schools. Frequently, you'll have to enhance existing data by combining it with data from other sources. Luckily, a friend in your office found a state report that has the numbers and has entered them in a comma-separated value text file, or .csv (CSV is a very simple table format). You'll use the state school ID code value to join this data to the Schools layer.

  1. On the ribbon, on the Map tab, in the Layer group, click the Add Data button.

    Add Data button

    The Add Data window appears.

  2. Browse to your extracted Identify_Schools folder, click GraduationRates.csv, and click OK.

    The table is added to the Contents pane under Standalone Tables.

    This table has a field named SCHOOLCODE that has the state ID code for each school. The Schools layer has the same data values in a field named School_ID.

    You'll connect the two tables by joining them on the matching values in these fields.

    Note:

    Sometimes it is possible to use a name field as a join field, but numeric or code fields are usually better. Names often have variations in spelling, capitalization, and completeness that prevent them from matching. For example, the NAME fields in the Schools and AttendanceZones layers use different naming conventions for school names. The Schools table has the name combined with High (for example, Beach High) and the AttendanceZones table has the name combined with High School (for example, Beach High School). These differences may not be significant to a person, but they would prevent the tables from joining correctly.

  3. In the Contents pane, right-click Schools, point to Joins and Relates, and choose Add Join.

    Add Join option

    The Add Join window appears.

  4. In the Add Join window, enter the following parameters:
    • For Input Table, confirm that Schools is selected.
    • For Input Join Field, choose SCHOOL_ID.
    • For Join Table, confirm GraduationRates.csv is selected.
    • For Join Table Field, choose SCHOOLCODE.

    Add Join tool parameters

    Note:

    The alert symbol next to Input Join Field is there because the field is not indexed. Indexing join fields improves performance, but it is not necessary for such a small table. By clicking Validate Join, the tool will verify that the expression works.

  5. Click OK.

    The GRCLASSZ (high school class size) and GRDNUM (graduation numbers) fields are joined to the feature class. Now you can calculate the graduation rates.

  6. Close the AttendanceZones table.

Add fields to hold new values

Even when you have data in tables, the tables may not contain exactly the information that you need. In this case, the table contains two relevant fields that hold the graduating class size and the number of students who graduated. Fortunately, the graduation rates can be calculated from these values. You'll add a field to store the results before doing the calculation.

  1. In the Schools table, click Add.

    Add button

    The Fields view for Schools appears with an editable new field at the bottom of the table.

  2. For Field Name, type GradRate. For Alias, type Graduation Rate.
  3. Set the Data Type to Double.

    New field parameters

  4. On the ribbon, on the Fields tab, in the Changes group, click Save.

    While you're adding fields to the table, you'll add two more fields to store two other pieces of information about the schools—whether or not they are magnet schools (schools drawing from across all of the zones), and whether they fall above or below the state graduation rate average.

  5. At the bottom of the Fields view, click Click here to add a new field.

    A new field is added.

  6. For the new field, change the following parameters:
    • For Field Name, type Magnet.
    • For Data Type, choose Text.
    • For Length, type 5.

    New field row for Magnet

    This field will allow you to differentiate between schools that draw students from a single zone and those that draw from across zones.

  7. Add another field with the following parameters:
    • For Field Name, type Under82.
    • For Data Type, choose Text.
    • For Length, type 5.

    This field will allow you to differentiate between schools that are above and below the state graduation rate of 82 percent.

  8. On the ribbon, on the Fields tab, in the Changes group, click Save.
  9. Close the Fields view tab.

    The new fields are added to the end of the Schools table. You may need to scroll to the right in the table to see them.

    Schools table with new fields

    Now that the fields have been added, you're ready to calculate the graduation rate for each school and tag schools with other values for your analysis.

Calculate new values

Next, you'll calculate a graduation rate for each school. To calculate a graduation rate, you'll divide the number of students who graduated (GRDNUM) by the total number of students in the class (GRDCLASSZ). You'll multiply the result by 100 to get the graduation rate expressed as a percentage.

  1. In the Schools attribute table, right-click Graduation Rate and choose Calculate Field.

    Calculate Field option

    The Calculate Field tool appears.

  2. In the Calculate Field tool, under Expression, in the Fields column, double-click GRDNUM.

    GRDNUM field

    The text !GraduationRates.csv.GRDNUM! is added to the field calculation expression box.

    Because this is a calculation on a layer that has a join, the source table name is added as a prefix to the field name, and they are delimited by exclamation marks to indicate that this is a field name.

  3. Click the division button (/).
  4. In Fields column, double-click GRCLASSZ.
  5. In the expression box, type parentheses around the expression. At the end of the expression, type * 100.

    Finished expression

    The finished expression reads:

    (!GraduationRates.csv.GRDNUM! / !GraduationRates.csv.GRCLASSZ!) * 100

    That is, the GRDNUM field value (number of graduates), divided by the GRCLASSZ field value (class size), multiplied by 100. Both GRDNUM and GRVLSSZ are prefixed with GraduationRates.csv to indicate that they are in the GraduationRates.csv table. The percentage value will be stored in Schools.GradRate, the GradeRate field of the Schools table.

  6. Click OK.

    The graduation rates appear in the Graduation Rate column of the Schools table.

    There are three schools—Savannah Arts Academy, Savannah Early College (at Savannah High School), and Woodville Tompkins High / Twilight Program—that draw students from across the attendance zones. You'll add values to the Magnet field to differentiate their graduation rates from the other schools.

  7. In the Schools attribute table, click the row header for Savannah Arts Academy to select it. Press the Ctrl key while clicking the row numbers for Savannah Early College and Woodville Tompkins.

    The rows for each school should be highlighted to indicate that they are selected.

    Magnet schools in the attribute table

  8. In the Schools table, right-click the Magnet field header and choose Calculate Field.
  9. In the Calculate Field tool, in the expression box, type "Yes".

    Include the quotation marks around the word.

    Expression set to "Yes"

  10. Click Apply. Do not close the Calculate Field tool.

    The value Yes is added to the Magnet attribute for the three selected schools.

  11. In the Schools table, click the Switch Selection button.

    Switch Selection button

    The other eight schools are now selected.

  12. In the Calculate Field tool, edit the expression to "No" and click OK.

    The values for the Magnet field are updated.

  13. Click the Clear Selection button.

    Clear Selection button

    The selection is cleared.

Export the data to a new feature class

Now that you've added the new data to the Schools layer, the high school graduation rate data is ready for you to analyze. You'll export it to a new feature class to save the joined fields and remove the need for the definition query.

  1. In the Contents pane, right-click Schools, point to Data, and choose Export Features.

    Export Features option

    The Export Features window appears.

  2. Confirm that the Input Features parameter is set to Schools.
  3. For Output Feature Class, type HighSchools.

    Export Features parameters

  4. Click OK.

    The new HighSchools layer is added to the map.

  5. In the Contents pane, uncheck Schools to turn the layer off.
  6. Close the Schools table.
  7. On the Quick Access Toolbar, click Save to save the project.

    Save button on the Quick Access Toolbar

You've downloaded, opened, and explored the initial data for the project. You also changed the fields that layers show and added new fields. You've joined data to the schools from a .csv file, calculated the graduation rate, and coded schools as magnet or non-magnet schools. You've exported a new layer of just the high schools with all of the information that you've added.


Explore the data

Next, you'll explore the data. You'll use a chart of the data to get a sense of how the data is distributed, and how the graduation rates relate to specific locations around the county.

View a histogram of the graduation rates

First, you'll open the HighSchools layer attribute table and sort it.

  1. Right-click the HighSchools layer and choose Attribute Table.
  2. In the HighSchools table, right-click the header of the Graduation Rate field and choose Sort Ascending.

    Sort Ascending option

  3. Right-click the header of the Graduation Rate field and choose Statistics.

    A chart appears with the graduation rates plotted as a histogram and the Chart Properties pane appears, showing descriptive statistics for the field values. The mean (average) graduation rate for the schools is 88.63 percent, higher than the state average of 82 percent. The lowest graduation rate is 75.3 percent, and the highest is 100 percent.

    By default, the data is divided into eight bins for the histogram, though this is adjustable.

    The numbers in this table are shown with more decimal places than necessary. Next, you'll change the number formatting for the chart.

  4. In the Chart Properties pane, click the Axes tab. In the X-axis section, for Number format, click the Determine display formatting for numeric field types button.

    Determine display formatting for numeric field types button

  5. In the pane that appears, for Category, choose Numeric. Under Rounding, for Decimal places, type 1.

    Numeric category parameters

    The numeric format for Graduation Rates updates on the x-axis of the chart.

  6. Click Apply.
  7. Close the Chart Properties pane.
  8. Click the first column in the histogram.

    When you click the bar for the lowest value bin, the two school points that fall in this bin are selected. They are highlighted on the map and in the HighSchools attribute table.

  9. Click the other columns to see the locations of schools in other bins.
  10. Click the column with the highest graduation rate.

    The three schools with the highest gradation rate are selected.

  11. In the HighSchools attribute table, review the attributes of the selected schools.

    The three schools with the highest graduation rates are magnet schools. Magnet schools draw students from across the attendance zones. Acceptance into a magnet school may be based on several criteria, including merit, a lottery, or having a sibling who has attended the school before. Magnet schools tend to attract students who are particularly interested in a specialty curriculum or program the school offers, such as arts, science and math, or career and technical education. Magnet schools may also have access to special funding and recruit educators with special interests in the program. These may be some of the reasons the magnet schools have the highest graduation rates among this group of schools in your analysis.

  12. In the HighSchools table, click Clear to remove the selection.

    It may be useful for your organization to study these schools to identify reasons for their high graduation rates that could also be applied in the mentoring program.

  13. Close the HighSchools attribute table and the chart pane.

Filter out the magnet schools

Because the magnet schools draw from across the attendance zones, and because they have the highest graduation rates, they will not be useful in choosing the high schools and middle schools for the mentoring program. You'll add a definition query to filter them out.

  1. In the Contents pane, double-click HighSchools.

    The Layer Properties window appears.

  2. In the Layer Properties window, click the Definition Query tab and click New definition query.
  3. Build the expression Where Magnet is equal to No.
  4. Click Apply and click OK.

    The magnet schools are no longer shown on the map

    Non-magnet schools on the map

Change the symbols for the high schools

Next, you'll change the symbology of the HighSchools points to better show where the higher- and lower-performing schools are located.

  1. Right-click HighSchools and choose Symbology.

    Symbology option

    The Symbology pane appears.

  2. In the Symbology pane, for Primary symbology, choose Graduated Colors. For Field, choose Graduation Rate.

    The default classification method is Natural Breaks.

    Graduated Colors symbology and Graduation Rate field

    In this color scheme, schools with highest graduation rates are drawn as dark blue points, growing lighter as the graduation rate decreases.

    Graduation rates shown by graduated color symbols

    The mean graduation rate for high schools was 88.6, while the graduation rate for the state as a whole for 2020 was about 82 percent. Your organization wants to start the mentoring program in schools that are performing below the state average graduation rate of 82 percent.

    Next, you'll change the symbology to emphasize the schools with rates under 82 percent.

  3. In the Symbology pane, for Classes, choose 2.

    Classes set to 2

  4. In the Classes tab, in the Upper value column, double-click the box in the first row. Set the first upper value to 82 and press Enter.

    Upper value of the first class set to 82

  5. In the Upper value column, double-click the box in the second row. Set the upper value to 100 and press Enter.
  6. Click the symbol for the first category.

    First symbol

  7. In the symbol gallery, click Circle 3.

    Circle 3 in the gallery

    The schools with graduation rates below the state average are emphasized by this symbology.

    Schools with graduation rates under 82 percent

    These schools seem to be clustered in the north-central part of the county. This answers part of your organization's question—where the high schools with graduation rates below the state average are. These schools will be sites for the ninth-grade mentoring program. You still need to determine which schools serving eighth-graders should participate.

  8. In the Contents pane, turn on the Schools layer and turn off the HighSchools layer.

    The HighSchools layer includes the data from the GraduationRates.csv, so you can remove the joined data in the Schools layer.

  9. Right-click Schools, point to Joins and Relates, and choose Remove All Joins.

    Remove All Joins option

  10. In the Remove Joins window that appears, click Yes.

Show the locations of schools with an eighth grade

You'll change the definition query on the Schools layer to show schools that serve eighth-graders.

  1. In the Contents pane, double-click Schools.

    The Layer Properties window appears.

  2. In the Layer Properties window, in the Definition Query tab, point to Query 1 and click the Remove definition query button.

    Remove definition query button

  3. In the Remove Definition Query window that appears, click Yes.

    The query is removed.

    Next, you'll create a query to find schools that are classified as middle schools, K-8 schools, or K-12 schools—schools in any of these categories include eighth-grade students. You'll also add a clause to limit the results to public schools.

    Earlier in this tutorial, you used an And operator to join the two clauses of the query, displaying points where the TYPE field value was High and the SCHOOL_TYP field value was Public School. You wanted the query to only show the schools that fulfilled both these queries.

    In this query, you'll use Or operators to combine three clauses that will find schools with the values Middle Or K-8 Or K-12 in the TYPE field, and use the And operator to find schools where the SCHOOL_TYP field value is Public School.

  4. Click New definition query.

    First, you'll build the query to display middle schools.

  5. For Query 1, build the expression Where TYPE is equal to Middle.
  6. Click Add Clause and build the expression And TYPE is equal to K-8.

    Query built for TYPE is equal to K-8

  7. Click Apply and click OK.

    The map updates to reflect the query.

    No school points visible on the map

    There are no records in the table. What could be the problem?

  8. On the Contents pane, double-click Schools.

    The Layer Properties window appears on the Definition Query tab.

    The problem here is a logic error. The query is using the wrong operator to connect the clauses. By using the operator And, the query selects schools where the value of the TYPE field is both equal to Middle and equal to K-8. This expression results in no matches, because the school points only have one text string value stored in this field. It can't be equal to both text values at the same time.

    When you add a second clause to a query expression in the Definition Query builder, the default operator is And because that is a common choice. If this field were numeric, it might make sense to have two clauses working on the same field joined by an And operator. For example, you might want to view only schools that had between 100 and 300 students, so you could make a query to find points where StudentBody >= 100 And StudentBody <= 300.

    The solution is to edit the query to use an Or operator.

  9. For Query 1, click Edit.
  10. Change And to Or.

    Or operator

  11. Click Apply and click OK.

    Now you can see schools that either have the TYPE field equal to Middle or equal to K-8. There are 26 schools that match this query.

  12. Open the Layer Properties window for the Schools layer.
  13. For Query 1, click Edit. Add a third clause that reads Or TYPE is equal to K-12.

    Query for Type = Middle or K-8 or K-12

  14. Click Apply and click OK.

    Now you can see schools that either have the TYPE field equal to Middle, K-8, or K-12. There are 26 schools that match this query.

    Map of schools with eighth-graders

Exclude private schools and public charter schools

Some of these schools are private schools and some are public charter schools, which, like the magnet high schools, draw from across the district. You'll exclude the private schools and the public charter schools. To do that, you'll add a clause.

  1. Open the Layer Properties window for the Schools layer.
  2. On the Definition Query tab, for Query 1, click Edit.
  3. Click Add Clause and build the expression And SCHOOL_TYP is equal to Public School.

    Be sure to set the operator to And this time.

    Fourth clause added to the query

  4. Click Apply and click OK.
  5. Open the Schools attribute table.

    There are still 26 schools that match the query. This result is unexpected because some of those schools should have been removed due to the query. Also, there are still private and public charter schools listed in the table. What is going on?

  6. Open the Layer Properties window for the Schools layer. For Query 1, click Edit.

    Part of the expression is indented now.

    Query expression with wrong order of operations

  7. Turn on the SQL toggle button to view the query as an SQL expression.

    SQL toggle button

    In the SQL expression, you can see that parentheses are grouping (TYPE = 'K-12' And SCHOOL_TYP = 'Public School') together.

    The problem is that in SQL there is an order of operations, like in arithmetic. The rule is that And operators are evaluated first, followed by Or operators. The parentheses were added automatically to indicate that rule was being followed. In this case, the query is actually showing schools that match both TYPE = K-12 And SCHOOL_TYP = Public School, or schools where TYPE = K-8 or TYPE = Middle.

  8. In the SQL view of the expression, edit the parentheses so that they surround TYPE = 'Middle' Or TYPE = 'K-8' Or TYPE = 'K-12'.

    The full query expression should be: (TYPE = 'Middle' Or TYPE = 'K-8' Or TYPE = 'K-12') And SCHOOL_TYP = 'Public School'

    Edited query expression to adjust order of operations

  9. Click Apply and click OK.

    The query now only returns the public schools that may have eighth-grade students.

  10. Open the Layer Properties window for Schools.
  11. For Query 1, click Edit. If necessary, turn off SQL.

    Indentation showing correct grouping of clauses

    The change you made to the SQL code is reflected in the grouping and indentation of the clauses. You can also change the grouping of clauses in this view.

  12. Click the bar next to the clause to select K–8 schools, press Ctrl, and click the bar next to the clause to select Middle schools.

    Select clauses to group

    When two or more clauses are selected, the two buttons at the top of the query become active. These allow you to change the indentation levels to group or ungroup clauses to control the order of operations.

  13. Click Cancel and close the Layer Properties window. Close the Schools attribute table.

View the schools together

Now that you've found the schools that have eighth-grade students, you can view these schools alongside the high schools.

  1. Turn on the HighSchool layer.
  2. In the Contents pane, right-click the point symbol for the Schools layer and choose a bright green color, such as Quetzal Green.

    Quetzal Green color

    The schools with eighth-graders are now bright green.

    Schools and high schools

  3. Save the project.

The question is now which of those schools with eighth grades should also get the mentors. You can select all of the schools in attendance zones of a participating high school. You can select schools within some distance of a participating high school. Or you can use demographic data to help make the decision.

You've found the high schools with graduation rates below the state average. These will be sites for the ninth-grade mentoring program. You've also used a definition query to show the qualifying public schools in the district with eighth-graders. Next, you'll view and analyze some demographic data to help decide which of these schools to include in the program.


Use demographic data to select schools

Next, you'll add some demographic data to the map to help make the decision about which schools to include. You'll inspect the map to see the relationship between graduation rate and demographic factors, and then use data enrichment to quantify the problem. Finally, you'll use data enrichment to identify the schools with eighth-grade students to participate in the mentoring program.

Add data from Living Atlas

First, you'll add a layer of demographic data from ArcGIS Living Atlas of the World.

  1. On the ribbon, on the View tab, in the Windows group, click the Catalog Pane button.

    Catalog Pane in the View tab

    The Catalog pane appears.

  2. In the Catalog pane, click Portal.

    Portal tab

  3. Click the Living Atlas tab. In the search bar, type ACS and press Enter.

    ACS search in Living Atlas tab

    ACS stands for American Community Survey, a demographic and housing data project based on the United States Census Bureau's annual sampling of data.

    Note:

    To read more about ACS data in ArcGIS Living Atlas, read the Census and ACS topic.

    There are many ACS layers. You'll refine the search query to find a specific one.

  4. In the search bar, type ACS poverty and press Enter.
  5. Right-click ACS Poverty Status Variables - Boundaries and choose Add To Current Map.

    Add To Current Map option

    The polygons are census tracts. Smaller tracts tend to be in urban areas with higher concentrations of people.

    ACS data on the map with the school points

    On the map, the areas with darker colors have higher poverty rates. The schools with lower graduation rates seem to fall within smaller, urban tracts near the middle of the county, with higher poverty rates.

  6. On the map, click one of the tracts near a red point.

    The tract's pop-up appears. The pop-up shows the tract number, the number of people living in the tract, the number of people living below the federal poverty line, and the number of children living in poverty.

    Tract 114 pop-up

    The histogram shows the age structure of the population (younger age classes are on the left; older are on the right).

    The pop-up reveals the number of children who are living in poverty in this tract, and that a large fraction of the total tract population is living in poverty.

  7. Click a light-colored tract near the western edge of the county.

    Tract 108.21 pop-up

    The pop-up reveals that in this census tract, there are relatively fewer children living in poverty, and a small fraction of the total tract population is living in poverty.

    Based on a review of these two tracts, the school with lower graduation rate is in an area with a higher poverty rate, compared to the area where the school with a higher graduation rate is located. However, several schools with higher graduation rates are located in areas with higher poverty rates.

    Observing two points is not sufficient to draw a conclusion that there is a correlation between poverty rate and graduation rates. Next, you'll use the Enrich tool to further examine the attributes in areas with high and low graduation rates.

  8. Close the pop-up.
  9. Save the project.

Add demographic data to the attendance zones

Another way to examine factors that may contribute to lower graduation rates is to summarize demographic data by attendance zone. This will allow a quantitative comparison.

  1. Click the Analysis tab, and in the Tools section, click the Enrich tool.

    Enrich tool button

    The Geoprocessing pane appears, showing the Enrich tool. This tool allows you to add demographic information from ArcGIS Online to your features.

  2. In the Geoprocessing pane, for Input Features, choose AttendanceZones.

    You'll use the default output name, AttendanceZones_Enrich.

  3. For Variables, click the add button.

    Add button for the Variables parameter

    The Data Browser window appears.

  4. In the Data Browser window, double-click Education.

    Education category in the Data Browser

  5. Double-click the Educational Attainment folder.
  6. Under 2023 Educational Attainment (Esri), check the 2023 Pop Age 25+: High School/No Diploma box. Click the percentage button (%) to add the percentage of adults who attended high school but did not get a diploma and click the number button (#) to unselect it.

    Percentage of adults who attended high school but did not graduate

    Note:

    Data is periodically updated. Use the most recent data available.

  7. On the side pane, click Categories and double-click Income.
  8. Double-click Household Income.
  9. Under 2023 Income (Esri), check the 2023 Median Household Income box.

    2023 Median Household Income variable

    The 2023 Income (Esri) and 2023 Educational Attainment (Esri) groups of variables are current-year estimates compiled by Esri demographers from most up-to-date census data and other data sources. They are continuously updated with the most recent information.

    What other demographic variables might be useful for this analysis? You can browse through the different categories, or search for other keywords. For this tutorial, you'll only use the two variables you chose.

  10. Click OK.

    The variables are added to the Enrich tool pane.

    Note:

    The Enrich tool consumes credits based on the number of features enriched and the number of variables added (at a rate of 1 credit per 100 feature-variables). If you add more demographic variables, the tool will consume 0.08 credits per additional variable.

    Learn more about credits

  11. In the Geoprocessing pane, click Estimate Credits to see how many credits this tool will consume. Click Run.

    The tool runs. The new AttendanceZones_Enrich layer has the information from the attendance zones, plus data from the two demographic variables that you added.

Join the high schools to the attendance zones

The attendance zones now have attributes that you can use to quantify differences between the attendance zones for schools with higher graduation rates and schools with lower rates. The attendance zones do not have the graduation rates, but you can use a spatial join to add this information from the HighSchools layer.

  1. In the Geoprocessing pane, click the Back button.

    Back button

  2. In the search box, type Spatial Join. In the list of search results, click the Spatial Join tool.

    Spatial Join tool in search results

    The Spatial Join tool appears.

  3. For Target Features, choose AttendanceZones_Enrich

    These features will get values from the schools they contain added to their attributes.

  4. For Join Features, choose HighSchools and accept the other default values.

    The Join Operation parameter is set to Join one to one, as there are the same number of high schools and zones.

    Because the HighSchools layer has a filter, only the features that appear will be processed.

    Spatial Join tool parameters

  5. Click Run.

    The tool runs and the AttendanceZones__SpatialJoin layer is created. Its polygons now have the attributes (including graduation rate) from the HighSchools points. If the zones had the school code on them, you could have accomplished this with a table join, but the attendance zones polygons did not have a field with that value.

Calculate a value using Python

Earlier you added a variable named Under82 to the HighSchools layer. Because of the spatial join, this variable is now also in the attribute table for the AttendanceZones__SpatialJoin layer. Next, you'll add values in this field to make it easier to chart your demographic data.

You'll use a field calculation to set the values of this field to Above or Below. You could do it the way you set the Magnet variable earlier. A quicker way to do it in this case, since there is already a Graduation Rate field that contains relevant information, is to create a Python function and use it in the calculation to add Above when the values are above and Below when the values are below.

  1. Open the attribute table for AttendanceZones__SpatialJoin, right-click Under82, and choose Calculate Field.

    The Calculate Field window appears.

  2. Copy the following script snippet and paste it into the Code Block box:

    def classify_school(gradrate):
        if gradrate >= 82:
            return "Above"
        if gradrate < 82:
            return "Below"

    Script for classify_schools function code block in the Calculate Field tool

    This script is a Python function. A function is block of code that does some task. In this case, the task is to check values in one field and assign values to another. Functions start with def, for define a function.

    The function's name is classify_school, and it takes one parameter, named gradrate, in parentheses after the function name. After the name and parameter, the colon indicates that the next code will be executed by the function.

    The next lines are indented by four (or eight) spaces. Consistent indentation is important in Python, as it defines which lines belong together and when they are executed.

    The function has two conditional branches, specified by the if statements.

    If the value the function receives in the gradrate parameter is greater than or equal to 82, it will return the string value of Above.

    If the value the function receives in the gradrate parameter is less than 82, it will return the string value of Below.

  3. Above the Code Block box, for Under82 =, type classify_school().

    Classify_school function in the expression box

  4. Click between the parentheses. In the Fields list, double-click Graduation Rate.

    This adds the field name as the parameter for the function. It looks like this:

    classify_school(!GradRate!)

    Classify_school function with field parameter in Expression box

    Note:

    The actual field name is inserted, not the field alias (Graduation Rate). The field name is delimited with exclamation points.

    When the tool is run, the Expression box code will be executed for the Under82 field for each row of the table. This code calls the function classify_schools that you defined in the Code Block box. For each row, it will check if the value in GradRate is above or below 82 and update the value of the Under82 field with the result the function returns.

  5. In the Code Block box, click the end of the last line of code and press Enter.

    The yellow highlight is no longer visible, indicating that the code has run.

    Code in the Code Block box

  6. Click OK.

    In the attribute table, the Under82 field now shows the values Above and Below according to the Python code.

    In this case, there are only a few features and values to update, so it would not have taken much longer to calculate these values using manual selection, or using the Select by Attribute tool. However, when there are more different values that you want to calculate, and more conditions, Python expressions can save you time.

  7. Save the project.

Graph the results

Now that you have the attendance zones, demographic data, and graduation rates and Above or Below categories on the same features, you can look at the relationships between the variables. Charting is a good way to do this.

  1. In the Contents pane, right-click AttendanceZones__SpatialJoin, point to Create Chart, and choose Bar Chart.

    The Chart Properties pane appears.

  2. In the Chart Properties pane, for Category or Date, choose NAME.
  3. Under Numeric Field(s), click Select. Check 2023 Median Household Income and click Apply.

    Choose 2023 Median Household Income variable

    The chart updates.

    2022 Median Household Income summarized by School Attendance Zone

    There are differences in median household income between the different high school attendance zones.

  4. In the Chart Properties pane, click the General tab. For Chart title, type Median Household Income by Attendance Zone.

    Chart title parameter

  5. In the Chart pane, click the Export button and choose Export As Graphic.

    Export As Graphic option

    The Export window appears.

  6. Save the image of the chart on your computer as Median_HH_Income_by_Attendance Zone as a .png file.

    You can add this image to a report document or presentation.

  7. In the Chart Properties pane, click the Data tab. For Split by (optional), choose Under82.

    Split by parameter

    The chart updates to show each school by the Under82 field value.

    School attendance zones income split by Under82 variable

    When you split the values by whether the attendance zone is above or below the 82 percent graduation rate, you can see that lower median household income is somewhat associated with lower graduation rates. The break point seems to be at a median household income of about $55,000. You could export an image of this chart to use in a presentation.

  8. Close the chart view and the attribute table.
  9. In the Contents pane, uncheck the ACS Poverty Status Variables - Boundaries layer.

    Now that you've identified a relationship between median income and graduation rate, you will use data enrichment to find schools with eighth-graders in areas of lower median household income. Income doesn't seem to be the only factor here, since two schools with rates over 82 percent also have relatively low values for attendance zone median household income. You could explore the relationship of other demographic variables to graduation rates and use one or more additional variables to make the decision.

Enrich within a distance

To find schools with eighth-graders that have lower median household income, you could select the ones that fall in the same high school attendance zones as the participating high schools. However, the zones are large and may contain several neighborhoods with different characteristics. The middle, K–8, and K–12 schools are smaller than the high schools and each draws from a smaller area. Since you don't have polygons representing the attendance zones for these schools, you'll approximate their neighborhood by specifying a search distance in the Enrich tool.

  1. On the ribbon, click the Analysis tab. In the Tools group, click the Enrich tool.
  2. In the Enrich tool pane, for Input Features, choose Schools.

    You'll accept the default output name, Schools_Enrich.

  3. For Variables, click the add button.

    The Data Browser window appears.

  4. In the Data Browser window, on the search bar, type median household income and press Enter.
  5. Check the 2023 Median Household Income box and click OK.
  6. In the Enrich tool pane, confirm that Distance or time is set to 1 and that Units is set to Miles.
  7. At the top of the tool pane, click estimate credits. Click Run.

    The tool runs, summarizing the median household income for a circle with a radius of one mile around each school and add that data to the output Schools table.

    Using a radius of one mile around the points gives a sample area of 3.14 square miles. This is larger than the two-square-mile minimum recommended size for using the Enrich tool on an area. Smaller areas will contain less reliable results.

    The Enrich tool also allows you to summarize demographic data using walking- or driving-time zones around points, so you could get demographic data for areas within a 10-minute walk of each school or a 15-minute drive. For this tutorial, a simple circle is fine.

Find the schools with low median household income

You'll symbolize the schools by the income values that you just added and select the schools for the eighth-grade mentoring program.

  1. In the Contents pane, right-click Schools_Enrich and choose Symbology.
  2. In the Symbology pane, for Primary Symbology, choose Graduated Symbols.
  3. For Field, choose 2023 Median Household Income.

    Symbology pane parameters

    The symbology for the Schools_Enrich layer is updated.

    Schools symbolized by nearby median income

    Small circles represent lower median incomes near a school, and larger circles represent higher median incomes. Most of the schools with lower median household income within a one-mile radius are also close to the three high schools with graduation rates under 82 percent.

    Next, you'll select schools with eighth-graders for the program. Since the breakpoint between the high schools with over 82 percent graduation rate and under 82 percent graduation rate was about a $55,000 median household income, you'll use the same median household income value as the selection criterion for the schools with eighth-graders.

  4. On the ribbon, click the Map tab. In the Selection group, click Select By Attributes.
  5. In the Select By Attributes window, for Input Rows, choose Schools_Enrich.
  6. Under Expression, build the expression Where 2023 Median Household Income is less than or equal to 55000.

    Select By Attributes parameters

  7. Click OK.

    The schools are selected.

    Selected schools where the median household income is less than or equal to $55,000

  8. Open the attribute table for the Schools_Enrich layer.

    The selected schools ar4 highlighted in the table. The TYPE value for five schools is Middle; the value for two is K-8.

  9. On the table header, click the Copy Selection button.

    Copy Selection button

    You can paste the selected rows into a spreadsheet or text document to record the schools you've chosen. You could also save the selected features as a new layer or calculate a field value to indicate that these schools were selected for the program. You could also select on the HighSchools layer by attribute and copy those rows into your document.

  10. Save the project.

In this tutorial, you prepared data for analysis, used definition queries to select the right features to display, joined tables, calculated values, symbolized and charted data, spatially joined layers, and selected features by attribute to query for specific values. You used a Python function to automate the classification of values in one field into classes in another. You also used ArcGIS Living Atlas and the Enrich tool to add demographic data to your project to support your decision-making.

You can find more tutorials in the tutorial gallery.