Geocode facilities and competitors

Healthcare organizations typically weigh multiple factors when planning to expand into a new market or attempting to better serve existing populations. The process may include assessing the location of existing facilities belonging to your network and your competitors, measuring the accessibility of a given location based on a community's means and modes of travel (for example, driving, walking, or the availability of public transit), and examining the distribution of current patients. However, you must take care when using individual patient data. Legal and ethical guidelines require special handling for protected health information (PHI) and personally identifying information (PII).

Download the data

First, you'll download the data used in this tutorial.

Note:

This data is fictitious. It has been created for the purpose of demonstrating the workflow in this tutorial. It is designed to look plausible for the workflow and is structured similarly to data that you might use in this situation, but due to the legal limitations on sharing real data of this type, it is entirely made up. Do not rely upon this data. Do not attempt to draw conclusions or make real-world decisions based on this data. Do not use this data to train AI or machine learning models; the results will be inaccurate. The addresses in this dataset are real addresses, for the purposes of enabling a demonstration of geocoding and to provide plausible data to work with, but the data has no real relation to these addresses. Any names or attribute values associated with these addresses in the datasets are made up and have nothing to do with any actual persons or conditions at these locations.

  1. Download the Protect_Patient_Data_Zipped_Folder.zip folder.

    A file called Protect_Patient_Data_Zipped_Folder.zip is downloaded to your computer. Depending on your browser and settings, it may be saved in your Downloads folder, or on your Desktop.

  2. Right-click the downloaded file and choose Extract All.

    Extract All option

  3. Specify the output folder location and click Next.

    This is a password-protected zip archive. A password window appears.

  4. For Password, type I_Understand_This_Is_Fictitious_Data and click OK.
    Note:

    Use of this password indicates that you understand that the data is fictitious.

    The file is extracted to your computer as a folder.

  5. Browse to and open Protect_Patient_Data_Zipped_Folder.

    The folder contains a file called ProtectPatientData.ppkx. A .ppkx file is an ArcGIS Pro project package, a compressed file for sharing projects that may contain maps, data, and other files that you can open in ArcGIS Pro.

  6. Double-click ProtectPatientData.ppkx to open it in ArcGIS Pro. If prompted, sign in with your ArcGIS account.
    Note:

    If you don't have access to ArcGIS Pro or an ArcGIS organizational account, see options for software access.

    A map of Nashville, Tennessee appears. The patients point layer shows the location of fictitious current patients of the health system. You'll geocode and append a file of new fictitious patients to this layer later. First, you'll geocode a table of your healthcare facilities.

  7. In the Contents pane, uncheck the patients layer.

    You'll examine this layer and append new patients to it later.

Map facilities

Hospital strategists need a geographic perspective to make good decisions for the growth, efficiency, relevance, and sustainability of their organization. An improved understanding of the network helps support the diverse communities it serves and helps to close gaps and provide better care. A first step in gaining that insight is to map the various health system facilities, such as the main hospital campus and the network of primary care clinics, within the community. This foundational data can serve as the basis for several calculations such as the following: 

  • Determination of service areas
  • Examination of population characteristics
  • Identification of gaps in services and outcomes
  • Proximity of competitors

You'll map your facilities and those of your competitors. This workflow will consume credits, but you'll have the option to estimate how many credits will be used in an operation prior to running a tool.

  1. If the Catalog pane is not open, on the ribbon, click View. In the Windows group, click Catalog Pane.

    Catalog Pane button

  2. In the Catalog pane, expand Folders, ProtectPatientData, commondata, and userdata.

    Files in userdata folder

    The userdata folder is where files that have been packaged with an ArcGIS Pro project package are stored. This folder contains three comma-separated value (.csv) files with the data that you will add to your project.

  3. Click CompetitorLocations.csv. Press the Shift key while clicking new_patients.csv.

    The three .csv files are selected.

  4. Drag the files onto the map.

    Three .csv files

    The tables do not have geometries, so you don't see them on the map, but they are added to the Contents pane in the Standalone Tables group.

    Standalone Tables group in the Contents pane

    You'll make point features from the addresses in these tables by geocoding them. Hospital and clinic locations are not PHI or PII, so you do not need to take any special precautions in this process.

  5. In the Contents pane, right-click FacilityLocations.csv and choose Geocode Table.

    Geocode Table option

    The Geocode Table pane appears.

  6. In the Geocode Table pane, click Start.

    Start button

  7. For Input Locator, choose ArcGIS World Geocoding Service.

    ArcGIS World Geocoding Service set as Input Locator

    Note:

    If you do not see the ArcGIS World Geocoding Service option in the list, you must sign in to your ArcGIS Online organization.

    The ArcGIS World Geocoding Service consumes ArcGIS Online credits at a rate of 40 credits per 1,000 addresses. This table has addresses for eight facilities, so geocoding it will consume 0.32 credits.

    Note:

    If your organization has purchased ArcGIS StreetMap Premium, you may also see that locator as an option. With ArcGIS StreetMap Premium, the street data and locator are hosted within your organization's firewall, while the ArcGIS World Geocoding Service is hosted in ArcGIS Online. Both of these locators are built with reference data that covers streets, their directionality, speed limits, address numbering, and more. When you're making data-driven decisions that will impact the lives of real people, confidence in your geocoding results is critical. For more information, see the documentation for ArcGIS World Geocoding Service and the ArcGIS StreetMap Premium product page.

  8. Click Next.

    Next, you'll set the fields used to store address information. To decide which fields to use, you'll look at the attribute table.

  9. Click Attribute table.

    Attribute table button

    The table appears. It has multiple fields that store various parts of the address.

    Facilities table

    ArcGIS World Geocoding Service will use the data from the Address, City, State, and Zip fields to determine the facility locations.

  10. Accept the default value of More than one field and click Next.

    The geocoding tool detects the fields and maps them to some of the fields that the locator can use for identifying locations.

    Fields mapping

    If you had a table with different field names that the locator could not automatically match, you could choose the correct fields in the drop-down lists.

  11. Click Next.
  12. For Output, click the Browse button. In the Output window, for Name, type Facilities.

    Name parameter in the Output window

    The location for the new feature class defaults to protectpatientdata.gdb, the default geodatabase for the project.

  13. Click Save.
  14. Confirm Preferred Location Type is set to Address Location. For Output Fields, choose Minimal.

    Output Fields parameter

    This option will reduce the number of fields added to your output feature class. Since your primary goal is to visualize these locations, it's unlikely that you will need comprehensive geocoding fields as part of your output.

    Note:

    You'll still get some new fields containing information about the geocoding process, such as matched address, confidence scoring, address type, and match type, but the Minimal option prevents duplication of your input fields and prevents the addition of other geocoding output fields that may be empty.

  15. Click Next. For Country, check United States.

    Country set to United States

  16. Click Next. Leave all of the categories unchecked and click Finish.

    The ArcGIS World Geocoding Service is designed to match a wide variety of address types. By leaving these categories unchecked, you allow the locator to match addresses with the greatest flexibility. Later, you'll specify a more limited set of address types to match.

  17. Click estimate credits.

    Estimate credits link

    The message updates to show the number of credits that the geocoding process will consume.

    Estimate of 0.32 credits

    The estimate of 0.32 credits is what you would expect for geocoding eight addresses.

  18. Click Run.

    When the geocoding process finishes, you're prompted to rematch addresses. Since the message reports that no addresses were unmatched, you do not need to rematch.

  19. In the Geocoding Completed window, click No.

    The eight facilities in your network are added to the map.

    Facilities added to the map

    You've geocoded your healthcare system's locations.
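
The credit estimate shown in the tool can be checked by hand from the rate quoted earlier (40 credits per 1,000 addresses). The following is a minimal sketch of that arithmetic only; the function name is hypothetical, and the actual rate charged to your organization may differ from the one stated in this tutorial.

```python
from decimal import Decimal

# Rate quoted in this tutorial: 40 credits per 1,000 geocoded addresses.
# Treat this as an assumption; check your organization's current rates.
CREDITS_PER_1000_ADDRESSES = Decimal("40")


def estimate_geocode_credits(address_count: int) -> Decimal:
    """Estimate the credits consumed by geocoding address_count addresses."""
    if address_count < 0:
        raise ValueError("address_count must not be negative")
    return CREDITS_PER_1000_ADDRESSES * address_count / Decimal(1000)


# Eight facility addresses, as in the FacilityLocations.csv table.
facilities_estimate = estimate_geocode_credits(8)
print(facilities_estimate)  # 0.32
```

Decimal arithmetic is used here so that small fractional credit values print exactly as they appear in the tool rather than as rounded floating-point numbers.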

Map competitors

Next, you'll geocode the locations of your major regional competitor, Tennessee Star Medical Group. The table of competitor locations has addresses formatted in the same columns as the table of your facilities. It contains the addresses of six locations.

  1. In the Contents pane, right-click CompetitorLocations.csv and choose Geocode Table.

    When you geocoded your facilities, you followed the six-step guided workflow. This time, you'll skip that and work directly with the tool.

  2. In the Geocode Table pane, click Go to Tool.

    Go to Tool button

  3. For Input Locator, choose ArcGIS World Geocoding Service.

    Input Locator parameter

    Because you opened the Geocode Table tool by right-clicking the CompetitorLocations.csv table and choosing to geocode it, the Input Table box already contains CompetitorLocations.csv. When you chose ArcGIS World Geocoding Service, the fields from the table were automatically mapped to some of the fields that the service uses.

    Mapped fields

  4. For Output, click the Browse button.
  5. For Name, type Competitors. Click Save.
  6. For Output Fields, choose Minimal.
  7. For Country, check United States.
  8. Click estimate credits.

    The process will consume 0.24 credits.

  9. Click Run.

    When the tool finishes, you are prompted to start the rematch process. All of the addresses were matched, so you don't need to rematch.

  10. Click No.
  11. Close the Geocode Table tool and any open tables.

    You've geocoded your facilities and those of your competitor. Because healthcare facility location data is not PII or PHI, you used the standard method for geocoding, which can be applied to anything that is not specifically protected. Since this is not protected information, you can also feel comfortable storing the resulting point layers on your desktop or within your ArcGIS organization.

Symbolize facilities

You'll examine the distribution of your facilities and those of your competitor on the map. To make it clearer, you'll change the symbols for the points.

  1. In the Contents pane, right-click Facilities and choose Symbology.

    Symbology option

  2. In the Symbology pane, for Primary symbology, choose Unique Values.

    Unique Values option

  3. For Field 1, choose FacilityType.

    FacilityType option

    The FacilityType field contains two values, 1 and 2. The code 1 indicates a hospital and the code 2 indicates a primary care center. You'll set symbols for the different types of facilities.
  4. On the Classes tab, click the point symbol in the row for code 1.

    Facility Type 1 symbol

  5. In the Search bar, type Hospital pushpin and press Enter. In the list of results, click the largest Hospital pushpin symbol.

    Hospital pushpin symbol

  6. In the Symbology pane, click the back button.
  7. Click the point symbol in the row for code 2.

    Facility Type 2 symbol

  8. Search for Hospital. In the list of results, click the largest Hospital circle symbol.

    Hospital circle symbol

    You've symbolized the hospitals and clinics in your network.

    Hospitals symbolized on the map

Symbolize competitors

Next, you'll use the same symbols for the competitors, but choose different colors. Since you've already done the work to set the symbology for your own facilities, you'll import it and then change the colors to make the competing facilities visually distinct.

  1. In the Contents pane, click the Competitors layer to select it.
  2. In the Symbology pane, click the options button and choose Import symbology.

    Import symbology option

    The Apply Symbology From Layer tool appears.

  3. For Symbology Layer, choose Facilities.

    Symbology Layer parameter set to Facilities

  4. Click Run.

    The symbology of the Facilities layer is imported and applied to the Competitors layer.

    Competitors symbology updated

  5. Right-click the code 1 symbol and choose a medium shade of gray.

    Competitor location color set to gray

  6. Change the code 2 symbol to the same shade of gray.
  7. Close the Symbology pane.

    Map showing facilities and competitors

    You can now see the distribution of competing facilities and your own facilities together on the map. You'll add labels to get more information as you explore the map.

  8. In the Contents pane, right-click Facilities and choose Label.
  9. Right-click Competitors and choose Label.
  10. Zoom in to some of the places where your facilities are close to your competitors and check the names.

    Your primary hospital facility and Tennessee Star's primary hospital facility are located on opposite sides of the downtown Nashville area. Even so, it's likely that those two locations see a great deal of overlap in the populations they serve.

    A couple of the primary care locations in your health system are also close to some competitor locations:

    • Tennessee Star Medical Group—North and Nashville Memorial Primary Care—Grizzard are two primary care facilities in fairly close proximity.
    • Tennessee Star Medical Group—Nashville Primary Care South and Nashville Memorial Primary Care—North are also in fairly close proximity, but it's unlikely that the 24-hour clinic and the family medical practice will serve the same communities under the same conditions. There are likely to be either demographic differences in the patients at the two locations or a difference in the type of care being sought.
    • Tennessee Star Medical Group—Nashville Primary Care South-East and Nashville Memorial Primary Care—Antioch appear to be the two most remote locations for each medical group, farthest from Nashville's city center. However, it is unclear how much their service areas may overlap at this stage.
  11. Right-click Facilities and choose Label.

    The labels are turned off.

  12. Turn off labels for the Competitors layer.

    You can gather a great deal of useful information by displaying this data on the map and performing a brief visual analysis. However, other GIS tools will allow you to dive deeper into the data to enhance your understanding of the communities these medical networks serve. This analysis will allow you to draw some conclusions to support making business decisions, and is a first step to provide more equitable healthcare to the Nashville region.

Determine service areas

Now that you've mapped your healthcare facilities, you can determine their service areas. Creating a service area is like buffering a point. When you buffer a point, you specify a straight-line distance, and a circle is created using that distance. When you create a service area around a point, you also specify a type of radius, but unlike a buffer, the radius represents either the maximum time or distance that can be traveled along a network, such as a road or sidewalk network. The result is a service area polygon indicating the ability to reach the point within the time or distance you specified. Getting accurate results for this calculation is another reason to ensure you're using the best available reference data, like the ArcGIS World Geocoding Service or ArcGIS StreetMap Premium.

Health research has shown that travel time is a more important indicator of access to care than travel distance.
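
To make the difference between a buffer and a service area concrete, the following minimal sketch finds which points on a toy road network can be reached within a travel-time cutoff. It is not how the Service Area solver is implemented and it does not produce polygons; the network, node names, and travel times are all hypothetical, and the search is a standard shortest-path (Dijkstra) traversal limited by the cutoff.

```python
import heapq


def reachable_within(graph, origin, cutoff_minutes):
    """Return {node: travel_minutes} for nodes reachable from origin within the cutoff.

    graph maps each node to a list of (neighbor, minutes) road segments.
    """
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for neighbor, edge_minutes in graph.get(node, []):
            total = minutes + edge_minutes
            if total <= cutoff_minutes and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


# Hypothetical road network: travel times in minutes between junctions.
roads = {
    "clinic": [("A", 5), ("B", 12)],
    "A": [("clinic", 5), ("C", 8)],
    "B": [("clinic", 12), ("D", 10)],
    "C": [("A", 8), ("D", 20)],
    "D": [("B", 10), ("C", 20)],
}

within_15 = reachable_within(roads, "clinic", 15)
within_30 = reachable_within(roads, "clinic", 30)
print(sorted(within_15))  # ['A', 'B', 'C', 'clinic']
print(sorted(within_30))  # ['A', 'B', 'C', 'D', 'clinic']
```

In this toy network, junction D is reachable within 30 minutes but not within 15, regardless of how close it might be in a straight line. That is the behavior a travel-time service area captures and a circular buffer does not.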

  1. On the ribbon, click the Analysis tab. In the Workflows group, click Network Analysis and choose Service Area.

    Service Area option

    A message appears that says Creating new analysis layer. When the process is complete, the Service Area layer is added to the Contents pane.

    The group layer includes sublayers for the inputs and outputs of the analysis. The colors of the service area features are randomly assigned, so your map may not match the colors shown in this tutorial.

    Service Area layer in the Contents pane

    You'll load your healthcare facilities into the Facilities sublayer, and when you run the analysis, the Polygons sublayer will show the resulting drive-time polygons. You won't need the Lines sublayer for this analysis. The three Barriers sublayers allow you to specify places where the travel network is not passable. For this analysis, it is not necessary to add any barriers.

  2. In the Contents pane, click the Service Area layer.

    Service Area layer

    Clicking this group layer activates the Service Area Layer tab on the ribbon. This tab contains tools for conducting service area analyses.

  3. On the ribbon, click the Service Area Layer tab.

    Service Area Layer tab

  4. In the Input Data group, click Import Facilities.

    Import Facilities button

    The Add Locations tool appears.

  5. For Input Locations, choose Facilities.

    Input Locations parameter

    This option will load your healthcare facilities features into the Facilities sublayer of the Service Area layer.

    The Input Network Analysis Layer parameter is set to Service Area and the sublayer is set to Facilities.

  6. Click OK.
    Note:

    You may see a warning that the Name field of the input data is longer than the Name field of the Facilities sublayer. This discrepancy is due to the way fields with text data are imported from .csv files. The address values in the field are much shorter than 500 characters, so they were not truncated.

    The Facilities sublayer now contains your Facilities points. They are drawn on the map as solid colored circles with a black outline.

  7. Close the Add Locations window.
  8. On the ribbon, on the Service Area Layer tab, in the Travel Settings group, confirm that Mode is set to Driving Time.
  9. For Cutoffs, edit the values to be 15, 30.

    Cutoffs values set to 15, 30

    This option will calculate 15-minute and 30-minute drive-time service areas.

    The service area analysis tool consumes credits based on the number of input features and the number of drive times. This layer has eight input features and you are solving drive times for 15 and 30 minutes, so 16 service areas will be created, at a cost of 0.5 credits per service area, or eight credits total.

  10. On the Service Area Layer tab, in the Analysis group, click Run.

    Run Service Area analysis

    The analysis process runs for a few seconds. When it completes, the Polygons layer appears in the Contents pane, showing the 15- and 30-minute drive-time polygons.

    Polygons layer showing cutoff times

    The service area polygons also appear on the map.

    Service areas 15 and 30 minute drive times

    The 30-minute drive-time service area covers nearly all of Nashville's city limits, and therefore anyone living in the urban center who has a car will likely be able to travel to a Nashville Memorial location within 30 minutes. The drive-time service area extends out along highways, where higher speeds allow greater distance to be covered within the specified time. The 15-minute drive-time service area is smaller, covering most of the center of Nashville. Drivers in this area could reach a facility within 15 minutes.

    It is worth thinking about how much of Nashville Memorial's patient population does not have access to a car. The same procedure you used to calculate drive-time service areas can also be applied to walking-time service areas. If you wanted to do this, you'd create a new Service Area layer, load the facilities into it, set the Mode option to Walking Time, and run the service area analysis again.

    Walking Time setting

    The walking-time service areas cover a much smaller fraction of the city. You could analyze your patient demographics to find places where clusters of patients do not have cars and provide additional services there or supply transportation to and from those locations.

    Walking-time service areas

    An additional factor worth considering would be that some patients in Nashville's urban center might be less likely to have access to a car but might have better access to public transit. If you have public transit data, you can take that into account in your analysis.

    If you wanted, you could also use these techniques to analyze the service areas of the Tennessee Star Medical Group, but for this tutorial, you won't.

  11. On the Quick Access Toolbar, click the Save Project button.

    Save Project button

Now that you've mapped your facilities and determined the 15- and 30-minute drive-time service areas, you can examine the relationship of your facilities to your patients.
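
The credit cost of the service area analysis above follows from the figures given in the steps: one service area per facility per cutoff, at 0.5 credits each. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the per-polygon rate is the one quoted in this tutorial, not a guaranteed price.

```python
from decimal import Decimal

# Rate quoted in this tutorial: 0.5 credits per service area polygon.
CREDITS_PER_SERVICE_AREA = Decimal("0.5")


def estimate_service_area_credits(facility_count, cutoffs):
    """Return (polygon_count, estimated_credits) for a service area analysis."""
    polygon_count = facility_count * len(cutoffs)
    return polygon_count, CREDITS_PER_SERVICE_AREA * polygon_count


# Eight facilities solved for 15- and 30-minute cutoffs.
polygon_count, total_credits = estimate_service_area_credits(8, [15, 30])
print(polygon_count, total_credits)
```

Adding a third cutoff, or rerunning the analysis with a walking-time mode, would add to the total in the same way, so it can be worth estimating before each run.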


Geocode patients

Geocoding the patient addresses from your network will help you visualize where the fictitious Nashville Memorial patients are located. This is a first step in determining gaps in access. It will allow you to determine if there are large numbers of patients living outside of the service areas for your facilities.

Because geocoding with the ArcGIS World Geocoding Service requires sending your patient address data outside of your organizational network and firewall-protected devices, you must consider regulations protecting personal information. The following steps use fictitious data. In the future, if you are geocoding personal health information for your organization, be sure that you have permission to access and use such data for your intended workflows.

You'll split the new patient data table into two tables, one containing PII and a key value and another containing the addresses and a matching key value. This will allow the address data to be passed to the ArcGIS World Geocoding Service without sending any more information than is necessary to obtain the point locations. When the points are returned by the geocoding process, you can join the patient data back to them, using the shared key.

While this is a best practice as you geocode your data using the ArcGIS World Geocoding Service, it alone is not sufficient to protect private data. The ArcGIS World Geocoding Service is different from most other geocoders since it is designated and approved as an ArcGIS Online Health Insurance Portability and Accountability Act (HIPAA) eligible service, meaning that it is validated for alignment with HIPAA guidelines and any data submitted to the service is processed in a way that provides protection according to those guidelines. This specific service is only available for geocoding in the United States.
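
The split-and-rejoin idea described above can be sketched in plain Python. This is an illustration of the concept only, not the ArcGIS Pro workflow you'll follow in the next sections: the records, field values, key start, and key step are all made up, and the geocoding step is replaced by a stand-in that simply marks each row as matched.

```python
def split_records(records, address_fields, key_field="TempKey", start=1000, step=7):
    """Split records into (pii_rows, address_rows) linked only by an arbitrary key."""
    pii_rows, address_rows = [], []
    for offset, record in enumerate(records):
        key = start + offset * step  # arbitrary key, unrelated to any patient ID
        address_rows.append(
            {key_field: key, **{name: record[name] for name in address_fields}}
        )
        pii_rows.append(
            {key_field: key,
             **{name: value for name, value in record.items()
                if name not in address_fields}}
        )
    return pii_rows, address_rows


def rejoin_by_key(geocoded_rows, pii_rows, key_field="TempKey"):
    """Attach the held-back PII to each geocoded row that shares a key."""
    pii_by_key = {row[key_field]: row for row in pii_rows}
    return [
        {**pii_by_key[row[key_field]], **row}
        for row in geocoded_rows
        if row[key_field] in pii_by_key
    ]


# Entirely fictitious records for illustration.
records = [
    {"Name": "Fictitious Person A", "PatientID": "P-001",
     "Address": "1 Example St", "City": "Nashville", "State": "TN", "Zip": "00000"},
    {"Name": "Fictitious Person B", "PatientID": "P-002",
     "Address": "2 Sample Ave", "City": "Nashville", "State": "TN", "Zip": "00000"},
]
ADDRESS_FIELDS = ["Address", "City", "State", "Zip"]

pii_rows, address_rows = split_records(records, ADDRESS_FIELDS)

# Stand-in for the geocoding service: only address_rows would leave your network.
geocoded_rows = [{**row, "Status": "M"} for row in address_rows]

rejoined = rejoin_by_key(geocoded_rows, pii_rows)
print(address_rows[0])
```

Only `address_rows` contains what would be sent for geocoding; names and patient identifiers stay in `pii_rows` until the join.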

Import the CSV file

The CSV file is read-only. You'll import it to a new table in your database and add a field to hold the temporary key values.

  1. On the ribbon, click the Analysis tab. In the Geoprocessing group, click Tools.

    Tools button

  2. Search for table to geodatabase.

    Table to Geodatabase tool search

  3. In the list of search results, click the Table To Geodatabase tool.
  4. For Input Table, choose new_patients.csv.

    Input Table parameter

  5. For Output Geodatabase, click the Browse button.
  6. In the Output Geodatabase window, expand Project and Databases. Click protectpatientdata.gdb to select it.

    Output geodatabase

  7. Click OK. In the Geoprocessing pane, click Run.
  8. In the Catalog pane, expand Databases and protectpatientdata.gdb.
  9. Right-click new_patients_csv and choose Rename.

    Rename option

  10. Rename the table to new_patients and press Enter.
  11. Drag the new_patients table onto the map.

    The table is added to the Standalone Tables section of the Contents pane.

    Contents pane showing the new_patients table

    You no longer need the .csv files in the Contents pane, so you'll remove them.

  12. Press Shift while clicking each of the three .csv files to select them. Right-click them and choose Remove.

    Remove option

  13. Right-click new_patients and choose Open.

    Open option

    The table contains address information but also some information about the patients that is considered PII.

    Information in the tables that is PII

    To protect this information, you'll split it off from the addresses. You'll need to create a field to store a temporary key value that you'll use to rejoin the address table to the PII table.

    You might think of using the existing PatientID field, but since this could identify an individual and might be linked to other patient records in other parts of your healthcare system, it is better to assign an arbitrary temporary value for this purpose.

Add a temporary key value

Next, you'll add a new field to the table to hold the temporary key value.

  1. In the table, click Add.

    Add button

    The Fields view appears.

  2. In the new row at the bottom of the field list, for Field Name, type TempKey.

    Field Name parameter

    You can accept the default Data Type value of Long.

  3. On the ribbon, on the Standalone Table tab, in the Manage Edits group, click Save.

    Save button

  4. Close the Fields view.

    Now that you have the TempKey field, you'll calculate new values for it. These values are the keys you'll use to join the patient data back to the address points after you geocode the addresses.

  5. In the new_patients table, scroll to the TempKey field.
  6. Right-click the column header for the TempKey field and choose Calculate Field.

    Calculate Field option

  7. In the Calculate Field window, for Helpers, scroll down to Sequential Number.

    Sequential Number function

  8. Double-click Sequential Number.

    The SequentialNumber() function is added to the TempKey = box. This line will run the helper function code in the Code Block box.

    The Python code defining the SequentialNumber() function is added to the Code Block box.

    Python code to add sequential numbers

    The Code Block box contains Python code that allows you to specify a starting value, pStart, and an interval, pInterval. When the Calculate Field tool is run, it will assign each row in the TempKey field a sequential value starting at the pStart value and incremented each time by the pInterval value.

  9. Edit the value for pStart to 539.

    Code Block parameter with pStart set to 539

    The value 539 is a randomly chosen offset so the TempKey field values don't match the OBJECTID field values. You could choose another integer value that fits within the long integer data type, but 539 works well in this case.

  10. Edit the value for pInterval to 3.

    Code Block parameter with pInterval set to 3

    The value will increment by 3 for each row.

  11. Click the Verify button.

    Verify button

    The Python code is validated to ensure it is still correct after your edits.

    Note:

    If you do not get the Expression is valid message, click the Clear button, add the SequentialNumber() function, and update the starting and increment values again.

  12. Click OK.

    The TempKey field values are updated. The new values start at 539 and increment by 3 for each row.

    TempKey values

    These values do not relate directly to the patient records, but they will allow you to join your patient PII back to the points after you geocode the address table.
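
For reference, the edited helper behaves like the sketch below. This approximates the code that the Sequential Number helper places in the Code Block box, with the two edits from the steps above applied; the exact text in your Calculate Field window may differ slightly, and the list comprehension at the end is only a stand-in for the tool evaluating the expression once per row.

```python
rec = 0


def SequentialNumber():
    """Return pStart on the first call, then pInterval more on each later call."""
    global rec
    pStart = 539    # first key value, chosen so keys don't mirror OBJECTID
    pInterval = 3   # increment between consecutive rows
    if rec == 0:
        rec = pStart
    else:
        rec = rec + pInterval
    return rec


# Stand-in for Calculate Field calling the expression for the first four rows.
temp_keys = [SequentialNumber() for _ in range(4)]
print(temp_keys)  # [539, 542, 545, 548]
```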

Create a table without PII

Next, you'll create a table to geocode that contains only the address fields and the TempKey field.

  1. In the Catalog pane, under protectpatientdata.gdb, right-click new_patients and choose Copy.

    Copy option for the table

  2. Right-click protectpatientdata.gdb and choose Paste.

    Paste option for the table

    A copy of the table, new_patients_1, is added to the geodatabase.

  3. Right-click new_patients_1 and choose Rename. Type patients_no_identifiers.

    This output name is for clarity about the process for this tutorial. In a real-world situation, it would be better to avoid using the word patient in a file or table name.

  4. Drag the patients_no_identifiers table onto the map.
  5. In the Contents pane, right-click patients_no_identifiers and choose Open.

    patients_no_identifiers table

    The patients_no_identifiers table is a copy of the new_patients table. You'll remove the PII fields from it.

  6. Click the header of the Name column.

    Name column

    The column is selected.

  7. Press the Shift key and click the header of the PatientID column.

    The columns for Name and PatientID, and the columns between them, are selected.

    Multiple columns in table selected

  8. Right-click a header of one of the selected columns and choose Delete.

    Delete option

    You're asked to confirm that you want to delete the selected fields.

  9. Click Yes.

    The Name, Sex, RaceSelected, Hispanic_Latino, Lang_Pref_Home, and PatientID fields are deleted. You're left with the columns needed for geocoding and the TempKey column that you will use to join the original table to the geocoded results.

    Table without PII fields

Geocode the table

Now that you have the patient address data without any other personal or health identifiers, you'll geocode the table to visualize the patient locations on the map.

  1. In the Contents pane, right-click patients_no_identifiers and choose Geocode Table.
  2. In the Geocode Table pane, click Go to Tool.
  3. For Input Locator, choose ArcGIS World Geocoding Service.

    ArcGIS World Geocoding Service option

  4. Confirm that the input fields of the table are mapped correctly to the ArcGIS World Geocoding Service fields.

    Field mapping

  5. For Output, click the Browse button. For Name, type new_patient_locations and click Save.
  6. For Output Fields, choose Minimal.
  7. For Country, check United States.
  8. For Category, click the drop-down menu and expand Address. Check Subaddress, Point Address, and Street Address.

    Address categories

    Since you only want to map real patient addresses, you'll only check those options. If all categories were allowed, you might get results mapped to intersections or businesses rather than real addresses that are likely to be a patient's home. It will be easier for you to focus on results that don't have a good address match if those records remain unmatched as opposed to being matched to the other categories.

  9. Click estimate credits.

    Geocoding the table of new patient addresses will use 3.88 credits.

  10. Click Run.

    The geocoding process matches 94 of the new records, but three are unmatched.

    Records can be unmatched for a number of reasons. When geocoding with the ArcGIS World Geocoding Service, most addresses would normally find an approximate match by falling back to less precise address types like the centroid of a ZIP Code polygon. Since you specified only Subaddress, Point Address, and Street Address match types as acceptable, the rows that have missing information and do not match these categories of location are unmatched.

  11. In the Geocoding Completed window, click Yes.

    The Rematch Addresses - new_patient_locations pane appears, showing the first unmatched record.

    Rematch Addresses pane

    There seems to have been a data entry or table management error. The street address appears to be split, so part of the street name is in the City field, while the City value is in the Subregion field.

    In a real-world situation, you could contact the patient, or the source of your new_patients.csv file, to get the correct address and ensure that it is corrected elsewhere in your system, but for the purposes of this tutorial, you'll fix the error and proceed.

  12. Edit the following parameters:
    • For Address, type 840 Old Lebanon Dirt Rd.
    • For City, type Hermitage.
    • For Subregion, type Davidson County.

    Edited address

    After you have edited the fields to have the correct values, you can rematch the address.

  13. Click Apply.

    The address is matched to a point location with a score of 100, a high reliability value.

  14. Click the Match button.

    Match button

  15. Click the Save Edits button and click Yes.

    Save Edits button

    The other two addresses are no longer valid, so you'll delete them from the table.

  16. In the new_patient_locations table, scroll down to the two addresses with a U in the Status column.
  17. Click the row number next to the first record to select the row. Press Ctrl and click the other row to select them both.

    Unmatched address rows selected

  18. On the toolbar at the top of the table, click Delete. Click Yes.
  19. In the Rematch Addresses pane, click the Save Edits button. Click Yes.
  20. Close the Rematch Addresses pane.
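
The effect of the Category setting can be sketched in plain Python. This is not the ArcGIS World Geocoding Service or the Geocode Table tool; it is a minimal illustration that assumes each record comes back with a list of candidate matches. The record keys, the candidate dictionaries, and the addr_type and score names are hypothetical and chosen only for this example.

```python
# Hypothetical illustration: keep only candidates whose address type is allowed.
# The type names and scores below are made up for this sketch.
ALLOWED_TYPES = {"Subaddress", "PointAddress", "StreetAddress"}


def pick_match(candidates, allowed_types=ALLOWED_TYPES):
    """Return the best-scoring candidate of an allowed type, or None if unmatched."""
    allowed = [c for c in candidates if c["addr_type"] in allowed_types]
    if not allowed:
        return None  # the record stays unmatched instead of falling back
    return max(allowed, key=lambda c: c["score"])


# Record 1 has a precise candidate; record 2 only has a ZIP Code centroid.
candidates_by_record = {
    1: [{"addr_type": "PointAddress", "score": 100},
        {"addr_type": "Postal", "score": 95}],
    2: [{"addr_type": "Postal", "score": 98}],
}

results = {key: pick_match(cands) for key, cands in candidates_by_record.items()}
unmatched = [key for key, match in results.items() if match is None]
print(unmatched)
```

In this sketch, record 2 is left unmatched even though a less precise candidate exists, which mirrors why the three records in the tutorial were flagged for review.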

Join patient information and points

Now that you have all the new patient records geocoded, you'll join the protected data, such as name, race, and sex, back to them.

  1. In the Geoprocessing pane, search for and open the Join Field tool.

    Join Field tool

  2. For Input Table, choose new_patient_locations.

    Input Table set to new_patient_locations

  3. Set the following parameters:
    • For Input Field, choose TempKey.
    • For Join Table, choose new_patients.
    • For Join Field, choose TempKey.
  4. For Transfer Fields, choose Name.

    Name field

    Another drop-down menu appears below Name.

    Second drop-down menu

  5. Add the following fields to Transfer Fields:
    • Sex
    • RaceSelected
    • Hispanic_Latino
    • Lang_Pref_Home
    • PatientID

    Transfer Fields parameter

  6. Click Run.

    The PII fields are joined back onto the geocoded patients.

    PII fields in the table

    To comply with privacy guidelines, you must save your data to your local machine or to a secured server in your ArcGIS Enterprise environment rather than storing your output data in ArcGIS Online. Storing health data with PII in ArcGIS Online is not yet approved by the same HIPAA rules that permit geocoding with the ArcGIS World Geocoding Service.

  7. Close all open tables.
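
The strip, geocode, and rejoin pattern can be sketched in plain Python. This is not the Join Field tool; it is a minimal illustration of the logic, assuming rows are held as dictionaries. The helper names strip_pii and join_field, the sample rows, and the placeholder X and Y values are hypothetical.

```python
# Hypothetical illustration of stripping PII, then rejoining it on TempKey.
PII_FIELDS = ["Name", "Sex", "RaceSelected", "Hispanic_Latino",
              "Lang_Pref_Home", "PatientID"]


def strip_pii(rows, pii_fields=PII_FIELDS):
    """Return copies of the rows without the identifying fields."""
    return [{k: v for k, v in row.items() if k not in pii_fields} for row in rows]


def join_field(target_rows, join_rows, key, transfer_fields):
    """Copy transfer_fields from join_rows onto target_rows where key matches."""
    lookup = {row[key]: row for row in join_rows}
    joined = []
    for row in target_rows:
        source = lookup.get(row[key], {})
        extra = {field: source.get(field) for field in transfer_fields}
        joined.append({**row, **extra})
    return joined


# Made-up rows; they do not come from the tutorial data.
new_patients = [
    {"TempKey": 1, "Name": "Patient A", "Lang_Pref_Home": "Spanish",
     "Address": "100 Example St", "City": "Sampletown"},
    {"TempKey": 2, "Name": "Patient B", "Lang_Pref_Home": "English",
     "Address": "200 Example Ave", "City": "Sampletown"},
]

no_identifiers = strip_pii(new_patients)
# Stand-in for geocoding: attach placeholder coordinates to each address row.
geocoded = [{**row, "X": 0.0, "Y": 0.0} for row in no_identifiers]
rejoined = join_field(geocoded, new_patients, "TempKey", ["Name", "Lang_Pref_Home"])
print(rejoined[0])
```

Only the address fields and TempKey pass through the geocoding stand-in; the identifying fields are reattached afterward by matching on TempKey.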

Append the new patients

The new_patient_locations feature class is ready to be appended to your main patients feature class.

  1. In the Geoprocessing pane, click the Back button. Search for and open the Append tool.

    The Append tool takes one or more layers of data and copies them into another layer.

  2. For Input Datasets, choose new_patient_locations.
  3. For Target Dataset, choose patients.
  4. For Field Matching Type, choose Use the field map to reconcile field differences.

    Use the field map to reconcile differences option

    This option allows you to append new_patient_locations into patients even though their schemas do not match exactly. The TempKey field is not present in the patients layer. If you used the default option, both layers would have to have exactly the same fields.

    Since they don't match exactly, you can drop the values in TempKey by not mapping them to an existing field in patients. These values were only useful for rejoining the PII to the points, and you do not want to keep them.

    The field map contains an Output Fields section and a Source section. Each of the fields in the patients layer is listed in Output Fields. You can click them and see what fields in the Source layer will be mapped to them.

    Field mapping allows you to direct data from one field into a field of another name. For example, if the patients layer had a field named State instead of Region, you could click that field and specify that data in the Region field of the new_patient_locations layer would be appended into it.

  5. Click Run.

    The new patients are added to the patients layer. Now you can use that layer in your analysis to identify clusters of patients from a population that may be underserved.

  6. In the Contents pane, turn on the patients layer.
  7. Save the project.
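
The field map behavior described in this section can be sketched in plain Python. This is not the Append tool; it is a minimal illustration that assumes rows are dictionaries. The append_rows helper and the sample values are hypothetical, and the State and Region mapping follows the hypothetical example given in the steps above.

```python
# Hypothetical illustration of appending rows through a field map.
def append_rows(target_rows, target_fields, input_rows, field_map=None):
    """Append input_rows to target_rows, keeping only the target's fields.

    field_map maps an output field name to the source field that feeds it.
    Source fields that are not mapped to any output field are dropped.
    """
    field_map = field_map or {}
    for row in input_rows:
        new_row = {}
        for out_field in target_fields:
            source_field = field_map.get(out_field, out_field)
            new_row[out_field] = row.get(source_field)
        target_rows.append(new_row)
    return target_rows


# Made-up rows; the target uses State where the input uses Region.
patients = [{"Name": "Patient X", "State": "TN"}]
new_patient_locations = [{"Name": "Patient A", "Region": "TN", "TempKey": 1}]

append_rows(patients, ["Name", "State"], new_patient_locations,
            field_map={"State": "Region"})
print(patients[-1])
```

Because TempKey is not mapped to any field in the target, its values are not carried over, which is the outcome you want once the PII has been rejoined.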

You have geocoded your new_patients table using a method that protects PII and complies with HIPAA rules. You have updated the main patients layer with this new data, and now you can use this securely stored local data in an analysis.


Identify an underserved population

In Nashville, 17 percent of the population does not speak English at home. Among this group, the most common language spoken at home is Spanish. Nearly half of this group speaks Spanish only. When information and care are not delivered in a familiar language, patients may experience decreased satisfaction with care, reduced access to preventative care, and worse health outcomes.

Your health system wants to better serve Spanish-speaking patients and would like to see if they cluster spatially near certain clinics, or if they are in areas that have less access to clinics, as measured by the service areas. Understanding the linguistic preferences and needs of patients in a service area will help you ensure your health system is providing adequate translation and interpretation services. It is important that all patients are provided with information and services they can fully understand to make informed decisions about their health.

Identify patients outside of the service area

You'll combine the information you derived for the service areas with the patient data to find patients who are not within the 15-minute drive-time service area.

  1. If necessary, in the Contents pane, check the Service Area group layer to show it.
  2. In the Service Area group layer, right-click Polygons and choose Attribute Table.
  3. In the Polygons attribute table, right-click the column header for FromBreak and choose Sort Descending.

    Sort Descending option

  4. Scroll down the table and click the row header for the first row that has a value of 0 in the FromBreak field.

    First FromBreak value of 0 selected

  5. Press the Shift key while clicking the row header of the last row that has a value of 0 in the FromBreak field.

    All FromBreak 0 value rows selected

    All of the 0–15 minute drive-time service area polygons are selected.

  6. In the Contents pane, right-click Polygons and choose Zoom to Layer.

    Zoomed to the 0-15 minute drive time polygons

  7. In the Geoprocessing pane, click the Back button. Search for and open the Select Layer By Location tool.

    Select Layer By Location tool

  8. For Input Features, choose patients.

    Input Features set to patients

  9. For Selecting Features, choose Service Area\Polygons.

    Selecting Features set to Service Area\Polygons

    A notification appears below the Selecting Features input box, indicating that the Service Area\Polygons feature class has a selection and that eight records will be processed. When you saw a similar message earlier, you needed to clear the selection so all of the features would be processed. In this case, you have deliberately selected the 0–15 minute drive-time service area polygons to use them to create the selection. You do not need to clear the selection.

  10. Click Run.

    Patients within the 0–15 minute drive-time service area

    The patients within the 0–15 minute drive-time service area polygon are selected. This is a step toward the information that you want, which is the inverse of this set. That is, you want all of the patients that are not in this range.

  11. Open the patients table.

    Most of the patients are selected. You'll switch the selection to select the patients that are outside of the 15-minute drive-time areas to help identify underserved areas.

  12. Click Switch.

    Switch button

    The selection switches and now 984 records are selected. These patients are the ones who are outside of the 0–15 minute drive-time service area. These patients may have a more difficult time accessing services than those who are closer to facilities.

    Note:

    If the number of patients selected isn't exactly the same, it's fine to proceed.
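
The select-then-switch logic can be sketched in plain Python. This is not the Select Layer By Location tool; it is a minimal illustration that uses a standard ray-casting point-in-polygon test. The coordinates, the single square standing in for the service area polygons, and the helper name are hypothetical.

```python
# Hypothetical illustration: select points inside polygons, then switch the selection.
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that a horizontal ray to the right of the point crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up service area and patient locations in arbitrary units.
service_areas = [[(0, 0), (10, 0), (10, 10), (0, 10)]]
patients = {1: (5, 5), 2: (15, 5), 3: (2, 8), 4: (-1, -1)}

selected = {
    oid for oid, (x, y) in patients.items()
    if any(point_in_polygon(x, y, poly) for poly in service_areas)
}
# Switching the selection keeps every patient that was not selected.
switched = set(patients) - selected
print(sorted(switched))
```

The switched set holds the patients outside every service area polygon, which corresponds to the records left selected after you click Switch.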

Identify Spanish speakers

Now that you have selected the patients who are outside of the 15-minute drive-time service area, you'll select the Spanish speakers in this group.

  1. In the Geoprocessing pane, click the Back button. Search for and open the Select Layer By Attribute tool.
  2. For Input Rows, choose patients.

    The tool indicates that there is a selection. Only selected records will be processed when the tool is run.

  3. For Selection Type, choose Select subset from the current selection.

    Select subset from the current selection option

  4. For Expression, click Select a field and choose Lang_Pref_Home.

    Lang_Pref_Home field

    The Lang_Pref_Home field contains the preferred language spoken in the patients' homes.

  5. Accept the default comparison operator, is equal to. In the value drop-down menu, choose Spanish.

    Spanish option

  6. Click Run.

    The tool selects the patients who have the value of Spanish in the Lang_Pref_Home field from within the already selected set of patients outside of the 0–15 minute drive-time service areas.

  7. Uncheck the Polygons layer.

    You can now see the selected features.

    Selected Spanish-speaking patients outside 0-15 minute drive time

    There are two small clusters in the northeast and southwest of the area.
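
The Select subset from the current selection option can be sketched in plain Python. This is not the Select Layer By Attribute tool; it is a minimal illustration that assumes the current selection is a set of row IDs. The select_subset helper and the sample rows are hypothetical.

```python
# Hypothetical illustration of narrowing an existing selection by attribute.
def select_subset(rows, current_selection, field, value):
    """Keep only the already selected rows whose field equals value."""
    return {oid for oid in current_selection if rows[oid].get(field) == value}


# Made-up rows keyed by row ID.
rows = {
    1: {"Lang_Pref_Home": "Spanish"},  # Spanish, but inside the service area
    2: {"Lang_Pref_Home": "English"},
    3: {"Lang_Pref_Home": "Spanish"},
}
outside_service_area = {2, 3}  # stand-in for the switched selection

spanish_outside = select_subset(rows, outside_service_area, "Lang_Pref_Home", "Spanish")
print(spanish_outside)
```

Row 1 has the value Spanish but is not returned, because it was not part of the starting selection. Choosing a new selection instead of a subset would have included it.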

Export the selected features

You've identified a set of patients who meet this particular set of selection criteria. You'll save these points to their own feature class so you'll have them to work with, without going through all the selection steps. In a service expansion planning project, you may have multiple populations of interest, so you can generate several sets of patient points. For example, you can use data enrichment to identify patients in neighborhoods where the rate of car ownership is low, do a walk-time service area analysis, and select out clusters of patients who are outside a convenient walking distance.

  1. In the Contents pane, right-click patients, point to Data and choose Export Features.

    Export Features option

    The Input Features parameter is already set to your patients layer. The tool notifies you that there is a selection and that only the selected records will be processed.

  2. In the Export Features pane, for Output Feature Class, type patients_underserved_1.

    Output Feature Class parameter

    You'll accept the default output location. Since this is a subset of the patients layer, it contains PII and must be stored securely on your local machine or on an ArcGIS Enterprise server behind your firewall.

  3. Click OK.

    The tool runs and the patients_underserved_1 layer is added to the map.

  4. Uncheck the patients layer.

    Underserved patients on the map

    The northeast cluster you've identified as potentially underserved patients is close to a competitor location: Tennessee Star Medical Group—Primary Care Hermitage. Given this fact, this group, while not served within your network, is perhaps not as underserved as it appears at first glance. If you were to run a drive-time service area analysis, you would likely find that many patients within this cluster would be within a 15-minute drive of the competitor clinic.

    Cluster near competitor

    In contrast, the other cluster to the southwest has no nearby medical facilities, within your network or within the competitor network.

    Cluster not close to competitor

    While it might eventually be worth building a location near the northeast cluster and the competitor location, it's likely that constructing a new provider facility near the southwest cluster will provide the most benefit to the community.

    You could continue your analysis by identifying other clusters of underserved patients who meet different criteria and combine those results to plan your expansion.

  5. Save the project.

In this tutorial, you learned about geocoding health information and service area analysis in the context of strategic planning for a health system. When geocoding patient data, it is critical to take patient privacy regulations and guidelines into consideration. Using fictitious data, you learned how to apply the standard geocoding process for non-private health data and the HIPAA-aligned geocoding process for protected health data. You also learned how to strip identifying fields from a patient address table and rejoin them after the fact so that only address information is used for geocoding. You learned that while the ArcGIS World Geocoding Service meets HIPAA guidelines, it is still a best practice to strip those fields. Patient data will be protected when you use the ArcGIS World Geocoding Service within the United States. You also learned that you must take care when storing or hosting patient data.

The objective of this tutorial is for you to increase your confidence and understanding related to geocoding sensitive data. Please also understand that the topic of data security, and especially health data security, is broad and this tutorial alone is not sufficient to cover every potential threat.

To further enhance your knowledge on this topic, particularly as it relates to geographic information, try the tutorial De-identify health data for visualization and sharing, which covers various masking and data aggregation techniques that will further protect your sensitive information. As always, you should be aware of and follow your own organizational policies and procedures associated with personally identifiable information and protected health information.

You can find more tutorials in the tutorial gallery.