Geocode facilities and competitors

Health-care organizations typically weigh multiple factors when planning to expand into a new market or attempting to better serve existing populations. The process may include assessing the location of existing facilities belonging to your network and your competitors, measuring the accessibility of a given location based on a community’s means and modes of travel (for example, driving, walking, or the availability of public transit), and examining the distribution of current patients. However, you must take care when using individual patient data. Legal and ethical guidelines require special handling for protected health information (PHI) and personally identifying information (PII).

Map facilities

Hospital strategists need a geographic perspective to make good decisions for the growth, efficiency, relevance, and sustainability of their organization. An improved understanding of the network helps support the diverse communities it serves and helps to close gaps and provide better care. A first step in gaining that insight is to map the various health system facilities, such as the main hospital campus and the network of primary care clinics, within the community. This foundational data can serve as the basis for several calculations such as the following: 

  • Determination of service areas
  • Examination of population characteristics
  • Identification of gaps in services and outcomes
  • Proximity of competitors

You will map your facilities and those of your competitors. This workflow will consume credits, but you will have the option to estimate how many credits will be used in an operation prior to running a tool.

  1. Download the Protect_Patient_Data_Zipped_Folder.zip zipped project folder.

    A file called Protect_Patient_Data_Zipped_Folder.zip is downloaded to your computer. Depending on your browser and settings, it may be saved in your Downloads folder, or on your Desktop.

  2. Locate the downloaded file on your computer and use a zip utility to extract the zipfile to a folder. Specify the output folder location and click Next.

    Specify a location on your computer for the extracted Protect_Patient_Data_Zipped_Folder folder.

    This is a password-protected zip archive. A password window appears.

  3. For Password, enter the password I_Understand_This_Is_Fictitious_Data and click OK.

    Password needed.

    Use of this password indicates that you understand that the data is fictitious.

    The zipfile is extracted to your computer as a folder.

  4. In the Windows file explorer, open Protect_Patient_Data_Zipped_Folder.

    The folder contains a file called ProtectPatientData.ppkx.

    A .ppkx file is an ArcGIS Pro Project Package, a compressed file for sharing projects that may contain maps, data, and other files that you can open in ArcGIS Pro.

  5. Double-click ProtectPatientData.ppkx to open it in ArcGIS Pro. If prompted, sign in with your ArcGIS account.

    A map of Nashville, Tennessee appears.

    The patients point layer shows the location of fictitious current patients of the health system. You will geocode and append a file of new fictitious patients to this layer later.

    Note:
    This data is fictitious. It has been created for the purpose of demonstrating the workflow in this tutorial. It is designed to look plausible for the workflow and is structured similarly to data that you might use in this situation, but due to the legal limitations on sharing real data of this type, it is entirely made up. Do not rely upon this data. Do not attempt to draw conclusions or make real-world decisions based on this data. Do not use this data to train AI or ML models; the results will be inaccurate. The addresses in this dataset are real addresses, for the purposes of enabling a demonstration of geocoding and to provide plausible data to work with, but the data has no real relation to these addresses. Any names or attribute values associated with these addresses in the datasets are made up and have nothing to do with any actual persons or conditions at these locations.

    The first step is to geocode a table of your health-care facilities.

  6. In the Contents pane, uncheck the patients layer.

    You will examine this layer and append new patients to it later.

  7. If the Catalog pane is not open, on the ribbon, click View and in the Windows group, click Catalog Pane.

    Open the Catalog pane.

  8. In the Catalog pane, expand Folders, ProtectPatientData, commondata, and userdata.

    The userdata folder is where files that have been packaged with an ArcGIS Pro project package are stored.

    This folder contains three comma-separated value (.csv) files with the data that you will add to your project.

  9. Click CompetitorLocations.csv, and press the Shift key while clicking new_patients.csv.

    The three .csv files are selected.

  10. With the three files selected, click one and drag all three onto the map.

    Drag .csv files onto the map to add them to the project.

    The tables do not have geometries, so you don't see them on the map, but they are added to the Contents pane in the Standalone Tables group.

    The .csv files are in the Standalone Tables group in the Contents pane.

    You will make point features from the addresses in these tables by geocoding them. Hospital and clinic locations are not PHI or PII, so you do not need to take any special precautions in this process.

  11. In the Contents pane, right-click FacilityLocations.csv, and click Geocode Table.

    Right-click FacilityLocations and click Geocode Table.

    The Geocode Table pane appears.

  12. In the Geocode Table pane, click Start.

  13. For the Input Locator, click the drop-down list and choose ArcGIS World Geocoding Service, and click Next.

    Choose the ArcGIS World Geocoding Service.

    If you do not see the ArcGIS World Geocoding Service option in the list, you must sign in to your ArcGIS Online organization.

    The ArcGIS World Geocoding Service consumes ArcGIS Online credits at a rate of 40 credits per 1,000 addresses. This table has addresses for eight facilities, so geocoding it will consume 0.32 credits.

    Note:

    If your organization has purchased ArcGIS StreetMap Premium, you may also see that locator as an option. With ArcGIS StreetMap Premium, the street data and locator are hosted within your organization's firewall, while the ArcGIS World Geocoding Service is hosted in ArcGIS Online. Both of these locators are built with reference data that covers streets, their directionality, speed limits, address numbering, and more. When you’re making data-driven decisions that will impact the lives of real people, confidence in your geocoding results is critical. For more information, see the documentation for ArcGIS World Geocoding Service, and the ArcGIS StreetMap Premium product page.

  14. Click the Attribute table button to see the fields of the FacilityLocations.csv table.

    Click the Attribute table button.

    The table of facilities has multiple fields for the parts of the address.

    Facilities table

    ArcGIS World Geocoding Service will use the data from the Address, City, State, and Zip fields to determine the facility locations.

  15. Accept the default value of More than one field and click Next.

    The geocoding tool detects the fields and maps them to some of the fields that the locator can use for identifying locations.

    Fields are mapped.

    If you had a table with different field names that the locator could not automatically match, you could choose the correct fields in the drop-down lists.

  16. For Output, click the Browse button and on the Output window, in the Name text box, type Facilities and click Save.

    Name the output Facilities.

    The location for the new feature class defaults to protectpatientdata.gdb, the default geodatabase for the project.

  17. For Preferred Location Type, accept the default value of Address Location.
  18. For Output Fields, click the drop-down list, click Minimal, and click Next.

    Choose Minimal.

    This will reduce the number of fields added to your output feature class. Since your primary goal is to visualize these locations, it’s unlikely that you will need comprehensive geocoding fields as part of your output.

    Note:
    You will still get some new fields containing information about the geocoding process, such as matched address, confidence scoring, address type, and match type, but the Minimal option prevents duplication of your input fields and prevents the addition of other geocoding output fields that may be empty.

  19. For Country, check United States and click Next.

    Check United States.

  20. Leave all of the categories unchecked on the Limit By Categories pane, and click Finish.

    Leave all categories unchecked.

    The ArcGIS World Geocoding Service is designed to match a wide variety of types of addresses. By leaving these unchecked you are allowing the locator to match addresses with the greatest flexibility. Later, you will specify a more limited set of address types to match.

  21. Click the estimate credits link to see how many credits the process will use.

    Estimate credits.

    The message updates to show the number of credits that the geocoding process will consume.

    Estimate of 0.32 credits

    The estimate of 0.32 credits is what you would expect for geocoding eight addresses.

  22. Click Run.

    When the geocoding process finishes, you will be prompted to rematch addresses. Since the message reports that no addresses were unmatched, you do not need to rematch.

  23. Click No.

    The eight facilities in your network are added to the map.

    Facilities added to the map.

    You’ve geocoded your health-care system’s locations. Next, you’ll repeat the process to geocode and visualize the competition in the region.

  24. Click Save Project to save your project.
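
The credit estimate shown in the steps above can be checked by hand. The following is a minimal sketch of that arithmetic in Python, assuming only the rate quoted in this tutorial (40 credits per 1,000 addresses); the function name is hypothetical, and current credit rates should be confirmed for your own organization.

```python
# Rate quoted in this tutorial: 40 credits per 1,000 geocoded addresses.
# This constant is an assumption taken from the text, not read from any service.
CREDITS_PER_1000_GEOCODES = 40


def estimate_geocode_credits(address_count):
    """Return the estimated credits for geocoding address_count addresses."""
    if address_count < 0:
        raise ValueError("address_count must not be negative")
    return round(address_count * CREDITS_PER_1000_GEOCODES / 1000, 4)


# Eight facility addresses, as in the FacilityLocations.csv table.
facility_credits = estimate_geocode_credits(8)
print(facility_credits)
```

For the eight facility addresses this gives 0.32, which matches the estimate reported by the tool. The same function can be applied to any other table once you know its row count.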

Map competitors

The next step is to geocode the locations of your major regional competitor, Tennessee Star Medical Group. The table of competitor locations has addresses formatted in the same columns as the table of your facilities. It contains the addresses of six locations.

  1. In the Contents pane, right-click CompetitorLocations.csv and click Geocode Table.

    When you geocoded your facilities, you followed the six-step guided workflow. Now you'll skip that and work directly with the tool.

  2. On the Geocode Table pane, click Go to Tool.

    Click Go to Tool.

  3. Click the Input Locator drop-down list and click ArcGIS World Geocoding Service.

    For Input Locator, choose the ArcGIS World Geocoding Service.

    A note appears at the top of the tool that the operation will consume credits. You will only be geocoding six locations, so it will cost a fraction of a credit.

    Because you opened the Geocode Table tool by right-clicking the CompetitorLocations.csv table and choosing to geocode it, the Input Table box already contains CompetitorLocations.csv. When you chose the ArcGIS World Geocoding Service, the fields from the table were automatically mapped to some of the fields that the ArcGIS World Geocoding Service uses.

    CompetitorLocations fields mapped to the locator.

  4. For Output, click the Browse button and in the Name box, type Competitors, and click Save.
  5. For Output Fields, click the drop-down list and click Minimal.
  6. For Country, click the drop-down list and check United States.
  7. Click the estimate credits link to see how many credits the process will use.

    The process will consume 0.24 credits.

  8. Click Run.

    When the tool finishes, you are prompted to start the Rematch process.

  9. All of the addresses were matched, so click No.
  10. Close the Geocode Table tool.

    You've geocoded your facilities and those of your competitor.

    Because health-care facilities location data is not PII or PHI, you used the standard method for geocoding, which can be applied to anything that is not specifically protected. Since this is not protected information, you can also feel comfortable that your resulting point layers can be stored on your desktop, within your ArcGIS Enterprise environment, or in ArcGIS Online.

  11. Click Save Project to save your project.

Change the symbology of the facilities

You will examine the distribution of your facilities and those of your competitor on the map. To make it clearer, you will change the symbols for the points.

  1. In the Contents pane, right-click Facilities and click Symbology.

    Right-click Facilities and click Symbology.

  2. In the Symbology pane, click the Primary symbology drop-down list and click Unique Values.

    Choose Unique Values symbology.

  3. Click the Field 1 drop-down list and click FacilityType.

    Choose FacilityType for the unique values symbology field.

    The FacilityType field contains two values, 1 and 2. The code 1 indicates a hospital, and the code 2 indicates a primary care center. You will set symbols for the different types of facilities.

  4. Click the point symbol in the row for Facility Type 1.

    Click Facility Type 1 symbol.

  5. In the Search bar, type Hospital pushpin, press Enter, and click the largest of the hospital pushpin symbols.

    Search for and select the large Hospital pushpin symbol.

  6. In the Symbology pane, click the back button.
  7. Click the point symbol in the row for Facility Type 2.

    Click Facility Type 2 symbol.

  8. In the Search bar, type Hospital, press Enter, and click the largest of the hospital circle symbols.

    Hospital circle symbol

    You've symbolized the hospitals and clinics in your network.

Set the symbology for the competitors

Next, you'll use the same symbols for the competitors, but choose different colors. Since you've already done the work to set the symbology for your own facilities, you can import it and then change the colors to make the competing facilities visually distinct.

  1. In the Contents pane, click the Competitors layer.
  2. In the Symbology pane, click Options and click Import symbology.

    Click options and click Import symbology.

  3. In the Geoprocessing pane, for Symbology Layer, click the drop-down list, click Facilities, and click Run.

    Import symbology from the Facilities layer.

    The symbology of the Facilities layer is imported and applied to the Competitors layer.

    Competitors have Facilities symbology.

  4. Right-click the Hospital symbol and in the color palette, change the color to a medium shade of gray.

    Make competitor locations gray.

  5. Use the same method to change the color of the symbol for competing primary care centers.
  6. Close the Symbology pane.

    Map showing facilities and competitors.

    You can now see the distribution of patients, competing facilities, and your own facilities together on the map.

    You can add labels to get more information as you explore the map.

  7. In the Contents pane, right-click Facilities and click Label.
  8. In the Contents pane, right-click Competitors and click Label.
  9. Zoom in to some of the places where your facilities are close to your competitors and check the names.

    Your primary hospital facility and Tennessee Star’s primary hospital facility are located on opposite sides of the downtown Nashville area. Even so, it’s likely that those two locations see a great deal of overlap in the populations they serve.

    A couple of the primary care locations in your health system are also close to some competitor locations.

    • Tennessee Star Medical Group—North and Nashville Memorial Primary Care—Grizzard are two primary care facilities in fairly close proximity.

    • Tennessee Star Medical Group—Nashville Primary Care South and Nashville Memorial Primary Care—North are also in fairly close proximity, but it’s unlikely that the 24-hour clinic and the family medical practice will serve the same communities under the same conditions. There are likely to be either demographic differences in the patients at the two locations or a difference in the type of care being sought.
    • Tennessee Star Medical Group—Nashville Primary Care South-East and Nashville Memorial Primary Care—Antioch appear to be the two most remote locations for each medical group, farthest from Nashville’s city center. However, it is unclear how much their service areas may overlap at this stage.

  10. Click Save Project to save your project.

You can gather a great deal of useful information by displaying this data on the map and performing a brief visual analysis. However, other GIS tools will allow you to dive deeper into the data to enhance your understanding of the communities these medical networks serve. This analysis will allow you to draw some conclusions to support making business decisions, and is a first step to provide more equitable healthcare to the Nashville region.


Determine service areas

Once health-care facilities are mapped, you can calculate their service areas. Creating a service area is like buffering a point. When you buffer a point, you specify a straight-line distance, and a circle is created using that distance. When you create a service area around a point, you also specify a type of radius, but unlike a buffer, the radius represents either the maximum time or distance that can be traveled along a network, such as a road or sidewalk network. The result is a service area polygon indicating the ability to reach the point within the time or distance you specified. Getting accurate results for this calculation is another reason to ensure you’re using the best available reference data, like the ArcGIS World Geocoding Service or ArcGIS StreetMap Premium.
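
To make the difference between a buffer and a service area concrete, the following sketch computes which junctions of a small road network can be reached within a travel-time cutoff. It uses a standard shortest-path search (Dijkstra's algorithm) over a made-up graph; it is an illustration of the idea only, not the method or data that ArcGIS Network Analyst uses, and all junction names and travel times are hypothetical.

```python
import heapq


def reachable_within(graph, start, cutoff_minutes):
    """Return {node: travel_minutes} for nodes reachable from start within the cutoff.

    graph maps each node to a list of (neighbor, minutes) pairs.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, edge_minutes in graph.get(node, []):
            total = minutes + edge_minutes
            if total <= cutoff_minutes and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


# Hypothetical road network: travel times in minutes between made-up junctions.
ROADS = {
    "clinic": [("highway_ramp", 4), ("side_street", 6)],
    "highway_ramp": [("clinic", 4), ("suburb", 9)],
    "side_street": [("clinic", 6), ("old_town", 12)],
    "suburb": [("highway_ramp", 9), ("far_town", 20)],
    "old_town": [("side_street", 12)],
    "far_town": [("suburb", 20)],
}

within_15 = reachable_within(ROADS, "clinic", 15)
within_30 = reachable_within(ROADS, "clinic", 30)
print(sorted(within_15))
print(sorted(within_30))
```

In this toy network, the 15-minute result reaches the suburb along the faster highway route but not old_town, even though both could lie at a similar straight-line distance from the clinic. A buffer would treat them the same; a network-based service area does not.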

Determine service areas for your facilities

Health research has shown that travel time is a more important indicator of access to care than travel distance.

  1. On the ribbon, click the Analysis tab, and in the Workflows group, click Network Analysis and click Service Area.

    You will see a message that says Creating new analysis layer. When the process is complete, the Service Area layer is added to the Contents pane.

    The group layer includes sublayers for the inputs and outputs of the analysis. The colors of the service area features are randomly assigned, so your map may not match the colors shown in this tutorial.

    Service areas input and output layers are added to the Contents pane.

    You will load your healthcare facilities into the Facilities sublayer, and when you run the analysis, the Polygons sublayer will show the resulting drive-time polygons. You will not need the Lines sublayer for this analysis. The three Barriers sublayers allow you to specify places where the travel network is not passable. For this analysis, it is not necessary to add any barriers.

  2. In the Contents pane, click Service Area.

    Click Service Area.

    Clicking this group layer activates the Service Area Layer tab on the ribbon, which contains tools for conducting service area analyses.

  3. On the ribbon, click the Service Area Layer tab.

    Click Service Area Layer tab.

  4. On the Service Area Layer tab, in the Input Data group, click Import Facilities.

    Click Import Facilities.

  5. On the Add Locations tool, click the Input Locations drop-down list and click the Facilities layer.

    Set the Input Locations to the Facilities layer.

    This will load your health-care facilities features into the Facilities sublayer of the Network Analysis Service Area layer.

    Input Network Analysis Layer is set to Service Area and the sublayer is set to Facilities.

  6. Click OK to run the Add Locations tool.

    You may see a warning that the Name field of the input data is longer than the Name field of the Facilities sublayer. This is due to the way fields with text data are imported from .csv files. The address values in the field are much shorter than 500 characters, so they were not truncated.

    The Facilities sublayer now contains your Facilities points. They are drawn on the map as solid colored circles with a black outline.

  7. On the ribbon, on the Service Area Layer tab, in the Travel Settings group, verify that the Mode value is Driving Time.
  8. Edit the Cutoffs box values to be 15, 30.

    This will calculate 15-minute and 30-minute drive-time service areas.

    The service area analysis tool consumes credits based on the number of input features and the number of drive times. This layer has eight input features and you are solving drive times for 15 and 30 minutes, so 16 service areas will be created, at a cost of 0.5 credits per service area. This will consume a total of eight credits.

  9. On the Service Area Layer tab, in the Analysis section, click Run.

    Click Run to run the Service Area analysis.

    The analysis process runs for a few seconds. When it completes, the Polygons layer symbology updates to show the 15- and 30-minute drive-time polygons and the service area polygons are added to the map.

    The symbology for the Polygons layer is updated.

    Service areas 15 and 30 minute drive times

    The 30-minute drive-time service area covers nearly all of Nashville’s city limits, and therefore anyone living in the urban center who has a car will likely be able to travel to a Nashville Memorial location within 30 minutes. The drive-time service area extends out along highways, where higher speeds allow greater distance to be covered within the specified time. The 15-minute drive-time service area is smaller, covering most of the center of Nashville. Drivers in this area could reach a facility within 15 minutes.

    It is worth thinking about how much of Nashville Memorial’s patient population does not have access to a car. The same procedure you used to calculate drive-time service areas can also be applied to walking-time service areas. If you wanted to do this, you'd create a new Service Area layer, load the facilities into it, set the Mode option to Walking Time, and run the service area analysis again.

    Walking time settings

    The walking-time service areas cover a much smaller fraction of the city. You could analyze your patient demographics to find places where clusters of patients do not have cars and provide additional services there or supply transportation to and from those locations.

    Walking-time service areas

    An additional factor worth considering would be that some patients in Nashville’s urban center might be less likely to have access to a car but might have better access to public transit. If you have public transit data, you can take that into account in your analysis. Read more about leveraging transit data in network analysis in the help.

    You could also use these techniques to analyze the service areas of the Tennessee Star Medical Group.

  10. Click Save Project to save your project.

Now that you've mapped your facilities and determined the 15- and 30-minute drive-time service areas, you can examine the relationship of your facilities to your patients.
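
The credit cost of the service area analysis follows from simple arithmetic. The sketch below reproduces it, assuming the rate quoted in this tutorial (0.5 credits per service area) and one service area per facility per cutoff; the function name is hypothetical.

```python
# Rate quoted in this tutorial: 0.5 credits per generated service area.
CREDITS_PER_SERVICE_AREA = 0.5


def estimate_service_area_credits(facility_count, cutoffs):
    """Return (number_of_service_areas, estimated_credits).

    One service area is assumed per facility per cutoff value.
    """
    areas = facility_count * len(cutoffs)
    return areas, areas * CREDITS_PER_SERVICE_AREA


# Eight facilities with 15- and 30-minute cutoffs, as in the steps above.
areas, credits = estimate_service_area_credits(8, [15, 30])
print(areas, credits)
```

With eight facilities and two cutoffs, this yields 16 service areas and 8 credits. Adding a third cutoff, or running a separate walking-time analysis, would increase the cost proportionally.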


Geocode patients

Geocoding the patient addresses from your network will help you visualize where the fictitious Nashville Memorial patients are located. This is a first step in determining gaps in access. It will allow you to determine if there are large numbers of patients living outside of the service areas for your facilities.

Because geocoding with the ArcGIS World Geocoding Service will require sending your patient address data outside of your organizational network and firewall-protected devices, you must consider regulations protecting personal information.

Again, the following steps use fictitious data. In the future, if you are geocoding personal health information for your organization, be sure that you have permission to access and use such data for your intended workflows.

You will split the new patient data table into two tables, one containing PII and a key value and another containing the addresses and a matching key value. This will allow the address data to be passed to the ArcGIS World Geocoding Service without sending any more information than is necessary to obtain the point locations. When the points are returned by the geocoding process, you can join the patient data back to them, using the shared key.

While this is a best practice as you geocode your data using the ArcGIS World Geocoding Service, it alone is not sufficient to protect private data. The ArcGIS World Geocoding Service is different from most other geocoders since it is designated and approved as an ArcGIS Online Health Insurance Portability and Accountability Act (HIPAA) Eligible service, meaning that it is validated for alignment with HIPAA guidelines and any data submitted to the service is processed in a way that provides protection according to those guidelines. This specific service is only available for geocoding in the United States. For more information, refer to https://trust.arcgis.com/en/privacy/hipaa.htm.
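
The split-and-rejoin approach described above can be sketched in plain Python, independent of ArcGIS. In this illustration, the PII field names are those listed later in this tutorial, while the address column names (Address, City, State, Zip), the sample rows, and the placeholder coordinates are assumptions made up for the example. The geocoding step itself is replaced by a stand-in that only adds empty coordinate fields.

```python
# PII fields named in this tutorial; address column names are assumed.
PII_FIELDS = ["Name", "Sex", "RaceSelected", "Hispanic_Latino",
              "Lang_Pref_Home", "PatientID"]
ADDRESS_FIELDS = ["Address", "City", "State", "Zip"]  # assumption
KEY_FIELD = "TempKey"


def split_records(records):
    """Split rows into a PII table and an address-only table sharing a key."""
    pii_rows = [{KEY_FIELD: r[KEY_FIELD], **{f: r[f] for f in PII_FIELDS}}
                for r in records]
    address_rows = [{KEY_FIELD: r[KEY_FIELD], **{f: r[f] for f in ADDRESS_FIELDS}}
                    for r in records]
    return pii_rows, address_rows


def rejoin(pii_rows, geocoded_rows):
    """Attach PII attributes back to geocoded rows using the shared key."""
    pii_by_key = {row[KEY_FIELD]: row for row in pii_rows}
    return [{**row, **pii_by_key[row[KEY_FIELD]]} for row in geocoded_rows]


# Entirely made-up sample rows for illustration.
SAMPLE_RECORDS = [
    {"TempKey": 539, "Name": "Example Person A", "Sex": "F",
     "RaceSelected": "Not stated", "Hispanic_Latino": "No",
     "Lang_Pref_Home": "English", "PatientID": "P-0001",
     "Address": "100 Example St", "City": "Nashville", "State": "TN", "Zip": "37201"},
    {"TempKey": 542, "Name": "Example Person B", "Sex": "M",
     "RaceSelected": "Not stated", "Hispanic_Latino": "No",
     "Lang_Pref_Home": "English", "PatientID": "P-0002",
     "Address": "200 Sample Ave", "City": "Nashville", "State": "TN", "Zip": "37203"},
]

pii_table, address_table = split_records(SAMPLE_RECORDS)

# Stand-in for the geocoder: only address_table would leave your network.
geocoded = [{**row, "X": None, "Y": None} for row in address_table]

joined = rejoin(pii_table, geocoded)
print(sorted(address_table[0]))
```

Only address_table, which holds the address fields and the key, would be sent for geocoding. The PII stays local and is reattached afterward through the TempKey values.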

Import the .csv file

The .csv file is read-only. You will import it to a new table in your database and add a field to hold the temporary key values.

  1. On the ribbon, click the Analysis tab, and in the Geoprocessing section, click Tools.

    Click the Analysis tab and click Tools.

  2. In the Search box, type table to geodatabase.

    Search for the Table to Geodatabase tool.

    The search returns the Table To Geodatabase geoprocessing tool.

  3. Click the Table To Geodatabase tool.
  4. On the Table To Geodatabase tool, in the Input Table drop-down list, choose new_patients.csv.

    Input table set to new_patients.

  5. For Output Geodatabase, click the Browse button.
  6. In the Output Geodatabase pane, expand Project and expand Databases, then click protectpatientdata.gdb, and click OK.

    Choose the target geodatabase for the table.

  7. Click Run.
  8. Click the Catalog tab.
  9. In the Catalog pane, expand Databases and expand protectpatientdata.gdb.
  10. Right-click new_patients_csv and click Rename.

    Rename the table.

  11. Delete the _csv at the end of the table name, so the table name is new_patients, and press Enter.
  12. Click the new_patients table and drag it onto the map.

    The new_patients table is added to the Standalone Tables section of the Contents pane.

    The new_patients table is added to the Contents pane.

    You no longer need the .csv files in your project, so you can remove them. Keep the table you just added, new_patients, in the project.

  13. Press Shift while clicking each of the three .csv files to select them, right-click one of them, and click Remove.

    Remove the .csv files.

  14. Right-click new_patients and click Open.

    Open the new_patients table.

    The table contains address information but also some information about the patients that is considered PII.

    The new_patients table contains PII

    To protect this information, you will split it off from the addresses. You'll need to create a field to store a temporary key value that you will use to rejoin the address table to the PII table.

    You might think of using the existing PatientID field, but since this could identify an individual and might be linked to other patient records in other parts of your health-care system, it is better to assign an arbitrary temporary value for this purpose.

  15. Click Save Project to save your project.

Add a field for the temporary key value

Next, you'll add a new field to the table to hold the temporary key value.

  1. At the top of the table, in the Field row, click Add.

    Click Add.

    The Fields: new_patients design pane appears.

  2. In the new row at the bottom of the field list, in the Field Name box, type TempKey.

    Add the TempKey field.

    You can accept the default Data Type value of Long.

  3. On the ribbon, on the Standalone Table tab, in the Changes section, click Save.

    Save the change.

  4. Close the Fields: new_patients tab.

    Close the field design tab.

Calculate a temporary key value

Now that you have the TempKey field, you will calculate new values for it. These values will be the keys used to join the patient data back to the address points after you geocode the addresses.

  1. In the new_patients table, scroll to the right to the TempKey field.
  2. Right-click the column header for the TempKey field and click Calculate Field.

    Right-click TempKey and click Calculate Field.

  3. On the Calculate Field tool, scroll down the Helpers section to Sequential Number.

    Scroll Helpers to Sequential Number.

  4. Double-click Sequential Number.

    This adds SequentialNumber() to the TempKey = field box. This line will run the helper function code in the Code Block box.

    The Python code defining the SequentialNumber() function is added to the Code Block box.

    Code block with Python code to add sequential numbers

    The Code Block box contains Python code that allows you to specify a starting value, pStart, and an interval, pInterval. When the Calculate Field tool is run, it will assign the TempKey field of each row in the new_patients table a sequential value, starting at the pStart value and incrementing by the pInterval value for each subsequent row.

  5. Edit the value for pStart to be 539.

    Edit the pStart value.

    The value 539 is a randomly chosen offset so the TempKey field values don't match the OBJECTID field values.

    You could choose another integer value that fits within the Long Integer data type, but 539 works well in this case.

  6. Edit the value for pInterval to be 3.

    Edit pInterval.

    This will increment the value by 3 for each row.

  7. Click the Verify button to validate that the Python code is still correct after your edits.

    Click the Verify button.

    You will see the Expression is valid message. If you do not get this message, click the Clear button, add the SequentialNumber() function again, and update the starting and increment values again.

  8. Click OK to run the Calculate Field tool.

    The TempKey field values are updated. The new values start at 539 and increment by 3 for each row.

    TempKey values assigned

    These values do not relate directly to the patient records, but they will allow you to join your patient PII back to the points after you geocode the address table.

  9. Click Save Project to save your project.
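
For reference, the logic of the edited code block can be sketched as follows. This approximates the Sequential Number helper with the pStart and pInterval values from the steps above; the exact text that ArcGIS Pro inserts into the Code Block box may differ between versions. The list comprehension at the end is an assumption added here to simulate the tool calling the expression once per row.

```python
rec = 0  # module-level counter shared across calls


def SequentialNumber():
    """Return the next key: pStart on the first call, then pInterval more each call."""
    global rec
    pStart = 539    # first key value, offset so keys do not match OBJECTID
    pInterval = 3   # step between consecutive keys
    if rec == 0:
        rec = pStart
    else:
        rec = rec + pInterval
    return rec


# Simulate Calculate Field evaluating the expression for five rows.
temp_keys = [SequentialNumber() for _ in range(5)]
print(temp_keys)
```

The first five rows receive 539, 542, 545, 548, and 551. In general, the row at zero-based position i receives 539 + 3 * i.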

Create a table without personal information to geocode

Now you will create a table to geocode that contains only the address fields and the TempKey field.

  1. In the Catalog pane, in the protectpatientdata.gdb, right-click new_patients and click Copy.

    Copy the new_patients table.

  2. In the Catalog pane, right-click protectpatientdata.gdb and click Paste.

    Paste the copy of the table into the geodatabase.

    A copy of the table, named new_patients_1, is added to the geodatabase.

  3. Right-click new_patients_1 and click Rename.
  4. For the table name type patients_no_identifiers.

    This output name is for clarity about the process for this tutorial. In a real-world situation it would be better to avoid using the word "patient" in a file or table name.

  5. Click the patients_no_identifiers table and drag it onto the map.
  6. In the Standalone Tables section of the Contents pane, right-click patients_no_identifiers and click Open.

    The patients_no_identifiers table is a copy of the new_patients table. You will remove the PII fields from it next.

  7. Click the header of the Name column.

    The Name column is selected.

  8. Press the Shift key and click the header of the PatientID column.

    This selects the columns for Name and PatientID, and the columns between them.

    Multiple columns are selected.

  9. Right-click the header of one of the selected columns and click Delete, and then click Yes to confirm that you want to delete these fields.

    Delete the selected columns that contain PII.

    This will delete the Name, Sex, RaceSelected, Hispanic_Latino, Lang_Pref_Home, and PatientID fields.

    The columns are removed from the table, and you are left with the columns needed for geocoding and the TempKey column that you will use to join the original table to the geocoded results.

    The PII fields are removed from the table.
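
The effect of this section can be sketched outside ArcGIS Pro in plain Python. The field names below follow the tutorial, but the row values and the helper name strip_pii are made up for illustration, and the address fields are reduced to two for brevity.

```python
# Field names follow the tutorial; the row values are made up for illustration.
PII_FIELDS = {"Name", "Sex", "RaceSelected", "Hispanic_Latino",
              "Lang_Pref_Home", "PatientID"}


def strip_pii(rows, pii_fields=PII_FIELDS):
    """Return copies of the rows without any of the identifying fields."""
    return [
        {field: value for field, value in row.items() if field not in pii_fields}
        for row in rows
    ]


new_patients = [
    {"Name": "Sample Patient A", "Sex": "F", "RaceSelected": "Not stated",
     "Hispanic_Latino": "No", "Lang_Pref_Home": "English", "PatientID": "P-0001",
     "Address": "100 Example St", "City": "Nashville", "TempKey": 539},
    {"Name": "Sample Patient B", "Sex": "M", "RaceSelected": "Not stated",
     "Hispanic_Latino": "Yes", "Lang_Pref_Home": "Spanish", "PatientID": "P-0002",
     "Address": "200 Example Ave", "City": "Hermitage", "TempKey": 542},
]

# The copy keeps only the address fields and the temporary key.
patients_no_identifiers = strip_pii(new_patients)
print(sorted(patients_no_identifiers[0]))  # ['Address', 'City', 'TempKey']
```

The original rows are left untouched, which mirrors working on a copy of the table rather than on new_patients itself.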

Geocode the table

Now that you have the patient address data without any other personal or health identifiers, you will geocode the table to visualize the patient locations on the map.

  1. In the Contents pane, right-click patients_no_identifiers and click Geocode Table.
  2. In the Geocode Table pane, click Go to Tool.
  3. For Input Locator, click the drop-down list and choose ArcGIS World Geocoding Service.

    Set the input locator.

  4. Verify that the input fields of the table are mapped correctly to the ArcGIS World Geocoding Service fields.

    Check the mapping of the fields.

  5. For the Output, click the Browse button and type new_patient_locations.
  6. For Output Fields, click the drop-down list and click Minimal.
  7. For Country, click the drop-down list and check United States.
  8. For Category, click the drop-down list and expand Address.
  9. Check Subaddress, Point Address, and Street Address.

    Check categories of address.

    Since you only want to map real patient addresses, you do not want to include things like Intersections and Distance Markers.

    If all categories remained checked, you might get results that are mapped to intersections or businesses rather than to real addresses that are likely to be a patient’s home. It will be easier for you to focus on results that don’t have a good address match if those records remain unmatched, as opposed to being matched to the other categories.

  10. Click the estimate credits link.

    Geocoding the table of new patient addresses will use 3.88 credits.

  11. Click Run.

    The geocoding process matches 96 of the new records, but one is unmatched.

    Records can be unmatched for a number of reasons. When geocoding with the ArcGIS World Geocoding Service, most addresses would normally find an approximate match by falling back to less precise address types like the centroid of a ZIP code polygon. Since you specified only Subaddress, Point Address, and Street Address match types as acceptable, the rows that have missing information and do not match these categories of location are unmatched.

  12. Click Yes.

    One record was not matched. Click Yes to rematch it.

    The rematch dialog box opens, showing the unmatched record.

    Unmatched record

    There seems to have been a data entry or table management error here. The street address appears to be split, so part of the street name is in the City field, while the City value is in the Subregion field.

    In a real-world situation, you could contact the patient, or the source of your new_patients.csv file, to get the correct address and ensure that it is corrected elsewhere in your system, but for the purposes of this tutorial, you will fix the error here and proceed.

  13. Edit the Address field to have the complete address 840 Old Lebanon Dirt Rd.
  14. Edit the City field to have the value Hermitage.
  15. Edit the Subregion field to have the value Davidson County.

    Corrected address fields.

    After you have edited the fields to have the correct values, you can rematch the address.

  16. Click Apply.

    The address is matched to a point location with a score of 100, a high reliability value.

  17. Click Match.

  18. Click Save Edits.

    Save edits.

  19. Close the Rematch Address pane.

    See the documentation for more information about rematching addresses.

  20. Click Save Project to save your project.
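
The credit estimate reported in step 10 can be checked with simple arithmetic. The sketch below assumes a rate of 40 credits per 1,000 geocoded records; that rate is an assumption here (it is consistent with the 3.88 credits shown for the 97 records in this table), so confirm the current rate for your organization before relying on it.

```python
# Assumed rate: 40 credits per 1,000 geocodes (check your organization's current rate).
CREDITS_PER_1000_GEOCODES = 40


def estimate_geocode_credits(record_count, rate_per_1000=CREDITS_PER_1000_GEOCODES):
    """Estimate the credits consumed by geocoding record_count addresses."""
    return record_count * rate_per_1000 / 1000


# The new patients table in this tutorial has 97 records.
print(estimate_geocode_credits(97))  # 3.88
```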

Join the patient information to the points

Now that you have geocoded all of the new patient records, you can join the protected data, such as name, race, and sex, back to them.

  1. In the Geoprocessing pane, search for join field and click the Join Field tool.

    Search for join field.

  2. For Input Table, click the drop-down list and choose the new_patient_locations layer.

    Set the input table to new_patient_locations

  3. For Input Field, click the drop-down list and click TempKey.
  4. For Join Table, click the drop-down list and click new_patients.
  5. For Join Field, click the drop-down list and click TempKey.
  6. Accept the default value for Transfer Method, Select transfer fields.
  7. Click the Transfer Fields drop-down list and click Name.

    Choose Name.

  8. Another drop-down list appears below Name.

    Another field picker appears.

  9. Use the same method to add the following fields to Transfer Fields: Sex, RaceSelected, Hispanic_Latino, Lang_Pref_Home, and PatientID.

    The transfer fields are added

  10. Click Run.

    The PII fields are joined back onto the geocoded patients.

    PII fields joined back to the records after geocoding

    To comply with privacy guidelines, you must save your data to your local machine or to a secured server in your ArcGIS Enterprise environment rather than storing your output data in ArcGIS Online. Storing health data with PII in ArcGIS Online is not yet approved by the same HIPAA rules that permit geocoding with the ArcGIS World Geocoding Service.

  11. Click Save Project to save your project.
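
Conceptually, the join in this section is a key lookup: each geocoded point carries a TempKey, and the matching row of the original table supplies the identifying fields. The plain-Python sketch below illustrates that logic only. It is not the Join Field tool, which adds the fields to the input table in place; the function name join_fields, the coordinates, and the sample values are hypothetical.

```python
TRANSFER_FIELDS = ["Name", "Sex", "RaceSelected", "Hispanic_Latino",
                   "Lang_Pref_Home", "PatientID"]


def join_fields(input_rows, join_rows, key, transfer_fields):
    """Copy transfer_fields from join_rows onto input_rows where key values match."""
    lookup = {row[key]: row for row in join_rows}
    joined = []
    for row in input_rows:
        match = lookup.get(row[key], {})  # empty dict when no key matches
        extra = {field: match.get(field) for field in transfer_fields}
        joined.append({**row, **extra})
    return joined


# Made-up rows standing in for the original table with PII.
new_patients = [
    {"TempKey": 539, "Name": "Sample Patient A", "Sex": "F",
     "RaceSelected": "Not stated", "Hispanic_Latino": "No",
     "Lang_Pref_Home": "English", "PatientID": "P-0001"},
    {"TempKey": 542, "Name": "Sample Patient B", "Sex": "M",
     "RaceSelected": "Not stated", "Hispanic_Latino": "Yes",
     "Lang_Pref_Home": "Spanish", "PatientID": "P-0002"},
]

# Made-up geocoded points; row order does not need to match the table.
new_patient_locations = [
    {"TempKey": 542, "x": -86.60, "y": 36.20},
    {"TempKey": 539, "x": -86.78, "y": 36.16},
]

joined = join_fields(new_patient_locations, new_patients, "TempKey", TRANSFER_FIELDS)
print(joined[0]["Name"])  # Sample Patient B
```

Because the match is made on TempKey rather than on row position, the result is correct even if geocoding reorders the records.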

Append the new patients

The new_patient_locations feature class is ready to be appended to your main patients feature class.

  1. In the Geoprocessing pane, search for append and click the Append tool.
  2. On the Append tool, click the drop-down list for Input Datasets and click new_patient_locations.

    The Append tool takes one or more layers of data and copies them into another layer.

  3. On the Append tool, click the drop-down list for Target Dataset and click patients.
  4. Click the Field Matching Type drop-down list and click Use the field map to reconcile field differences.

    Use the field map.

    This will allow you to append new_patient_locations into patients, in spite of the fact that their schemas do not match exactly. The TempKey field is not present in the patients layer. If you used the default option, both layers would have to have exactly the same fields.

    Since they don't match exactly, you can drop the values in TempKey by not mapping them to an existing field in patients. These values were only useful for rejoining the PII to the points, and you do not want to keep them.

    The field map contains an Output Fields section and a Source section. Each of the fields in the patients layer is listed in Output Fields. You can click them and see what fields in the Source layer will be mapped to them.

    It helps that both tables have essentially the same field names.

    Field mapping allows you to direct data from one field into a field of another name. For example, if the patients layer had a field named State instead of Region, you could click that field and specify that data in the Region field of the new_patient_locations layer would be appended into it.

  5. Click Run.

    The 97 new patients are added to the patients layer. Now you can use that layer in your analysis to identify clusters of patients from a population that may be under-served.

  6. In the Contents pane, check the patients layer to show the distribution of patients.
  7. Click Save Project to save your project.

You have geocoded your new_patients table using a method that protects PII and complies with HIPAA rules. You have updated the main patients layer with this new data, and now you can use this securely stored local data in an analysis.
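
The field-mapping idea used by the Append step can be sketched in plain Python. This is an illustration of the concept, not the Append tool: the helper name append_with_field_map and the sample rows are hypothetical, and the State and Region schema difference is the hypothetical example described above rather than the real schema of the patients layer.

```python
def append_with_field_map(target_rows, source_rows, field_map):
    """Append source rows to target_rows.

    field_map maps each target field name to the source field that feeds it.
    Source fields that are not mapped (such as TempKey) are dropped.
    """
    for source in source_rows:
        target_rows.append(
            {target_field: source.get(source_field)
             for target_field, source_field in field_map.items()}
        )
    return target_rows


patients = [{"Name": "Existing Patient", "City": "Nashville", "State": "TN"}]
new_patient_locations = [
    {"Name": "Sample Patient A", "City": "Hermitage", "Region": "TN", "TempKey": 539},
]

# Hypothetical schema: the target uses State where the source uses Region.
field_map = {"Name": "Name", "City": "City", "State": "Region"}
append_with_field_map(patients, new_patient_locations, field_map)
print(patients[-1])  # {'Name': 'Sample Patient A', 'City': 'Hermitage', 'State': 'TN'}
```

Leaving TempKey out of the map is what discards it, which matches the intent of not carrying the temporary key into the main layer.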


Identify locations to plan extended service

In Nashville, 17 percent of the population does not speak English at home. Among this group, the most common language spoken at home is Spanish. Nearly half of this group speaks Spanish only. When information and care are not delivered in a familiar language, patients may experience decreased satisfaction with care, reduced access to preventative care, and worse health outcomes.

Your health system wants to better serve Spanish-speaking patients and would like to see if they cluster spatially near certain clinics, or if they are in areas that have less access to clinics, as measured by the service areas. Understanding the linguistic preferences and needs of patients in a service area will help you ensure your health system is providing adequate translation and interpretation services. It is important that all patients are provided with information and services they can fully understand to make informed decisions about their health.

Identify patients outside of the 15-minute drive-time service area

You can now combine the information you derived for the service areas with the patient data to find patients who are not within the 15-minute drive-time service area.

  1. In the Contents pane, check the Service Area group layer to show it.
  2. In the Service Area group layer, right-click Polygons and click Attribute Table.
  3. In the Polygons attribute table, right-click the column header for FromBreak and click Sort Descending.

    Sort descending on the FromBreak field.

  4. Scroll down the table, and click the row header for the first row that has a value of 0 in the FromBreak field.

    Select the first 0 value in FromBreak.

  5. Press the Shift key while clicking the row header of the last row that has a value of 0 in the FromBreak field.

    Press Shift while clicking the last one to select the 0 values.

    All of the 0–15 minute drive-time service area polygons are now selected.

  6. In the Contents pane, right-click Polygons and click Zoom to Layer.

    Zoom to the polygons layer.

  7. In the Geoprocessing pane, in the Search box, type select by location.
  8. Click the Select Layer By Location tool.

    Select Layer By Location tool

  9. For Input Features, click patients.

    Input Features set to patients

  10. For Selecting Features, click Service Area\Polygons.

    Selecting Features set to Service Area\Polygons

    A notification appears below the Selecting Features input box indicating that the Service Area\Polygons feature class has a selection, and that eight records will be processed. When you saw a similar message earlier, you needed to clear the selection so all of the features would be processed. In this case, you have deliberately selected the 0–15 minute drive-time service area polygons to use them to create the selection. You do not need to clear the selection.

  11. Click Run.

    Patients within the 0–15 minute drive-time service area are selected.

    The patients within the 0–15 minute drive-time service area polygon are selected. This is a step toward the information that you want, which is the inverse of this set. That is, you want all of the patients that are not in this range.

  12. Open the patients table.

    There are 4,687 patients selected, out of a total of 5,429.

  13. Click Switch.

    Switch button

    This switches the selection. Now there are 742 out of 5,429 patients selected. These patients are the ones who are outside of the 0–15 minute drive-time service area. These patients may have a more difficult time accessing services than those who are closer to facilities.

  14. Click Save Project to save your project.
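
The select-then-switch pattern above can be sketched in plain Python: select the points that fall inside a polygon, then invert the selection against the full set. The ray-casting test below is a simplified stand-in for the Select Layer By Location tool (points exactly on a boundary may be treated differently), and the square polygon and point coordinates are made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up service area (a square) and patient points keyed by object ID.
service_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
patients = {1: (5, 5), 2: (15, 5), 3: (2, 8), 4: (-1, -1)}

selected = {oid for oid, (x, y) in patients.items()
            if point_in_polygon(x, y, service_area)}
switched = set(patients) - selected  # equivalent of clicking Switch

print(sorted(selected))  # [1, 3]
print(sorted(switched))  # [2, 4]
```

The same subtraction explains the counts in the tutorial: 5,429 total patients minus 4,687 selected leaves 742 outside the service area.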

Identify the Spanish-speaking subset of these patients

Now that you have selected the patients who are outside of the 15-minute drive-time service area, you will select the Spanish speakers in this group.

  1. In the Geoprocessing pane, in the Search box, type select by attribute.
  2. Open the Select Layer By Attribute tool.
  3. For Input Rows, choose the patients layer.

    The tool indicates that there is a selection. Only selected records will be processed when the tool is run.

  4. For Selection Type, choose Select subset from the current selection.

    Select subset from the current selection option.

  5. In the Expression section, in the Where input box, click Select a field and click Lang_Pref_Home.

    Select Lang_Pref_Home

    The Lang_Pref_Home field contains the preferred language spoken in the patients' homes.

  6. Accept the default comparison operator, is equal to.
  7. In the value drop-down list, click Spanish.

    Choose Spanish.

  8. Click Run.

    The tool selects the patients who have the value of Spanish in the Lang_Pref_Home field from within the already selected set of patients outside of the 0–15 minute drive-time service areas.

  9. Uncheck the Polygons layer so you can see these selected features.

    Selected Spanish-speaking patients outside 0–15 minute drive time.

    There seems to be a cluster on the southwest side of town and another on the northeast side.
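
The Select subset from the current selection option narrows an existing selection instead of starting over. A minimal plain-Python sketch of that logic follows; the helper name select_subset, the object IDs, and the sample values are hypothetical.

```python
def select_subset(rows, current_selection, field, value):
    """Keep only the already-selected IDs whose field equals value."""
    return {oid for oid in current_selection if rows[oid][field] == value}


# Made-up patient attributes keyed by object ID.
patients = {
    1: {"Lang_Pref_Home": "Spanish"},
    2: {"Lang_Pref_Home": "English"},
    3: {"Lang_Pref_Home": "Spanish"},
    4: {"Lang_Pref_Home": "Spanish"},
}

# Suppose these IDs are the patients outside the 0-15 minute service area.
outside_service_area = {2, 3, 4}

spanish_outside = select_subset(patients, outside_service_area,
                                "Lang_Pref_Home", "Spanish")
print(sorted(spanish_outside))  # [3, 4]
```

Patient 1 speaks Spanish at home but is excluded because it was not part of the starting selection, which is the behavior that distinguishes this option from a new selection.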

Export the selected features

You've identified a set of patients who meet this particular set of selection criteria. You can save these points to their own feature class so you will have them to work with, without going through all the selection steps. In a service expansion planning project, you may have multiple populations of interest, so you can generate several sets of patient points. For example, you can use data enrichment to identify patients in neighborhoods where the rate of car ownership is low, do a walk-time service area analysis, and select out clusters of patients who are outside a convenient walking distance.

  1. In the Contents pane, right-click the patients layer, point to Data and click Export Features.

    Export Features

    The Input Features parameter is already set to your patients layer. The tool notifies you that there is a selection and that only the selected features will be processed. This is what you want to do, so you can proceed.

  2. In the Export Features pane, for the Output Feature Class, type patients_underserved_1.

    Name the output feature class

    Accept the default output location, your project geodatabase, protectpatientdata.gdb. Since this is a subset of the patients layer, it contains PII and must be stored securely on your local machine or on an ArcGIS Enterprise server behind your firewall.

  3. Click OK.

    The tool runs and the patients_underserved_1 layer is added to the map.

  4. Uncheck the patients layer so you can see the patients_underserved_1 layer.

    The patients_underserved_1 layer is on the map.

    Looking at the two large clusters you've identified as potentially underserved patients, notice that the northeast cluster is close to a competitor location: Tennessee Star Medical Group—Primary Care Hermitage. Given this fact, this group, while not served within your network, is perhaps not as under-served as it appears at first glance. If you were to run a drive-time service area analysis, you would likely find that many patients within this cluster would be within a 15-minute drive of the competitor clinic.

    Cluster near competitor

    In contrast, the other cluster to the southwest has no nearby medical facilities, within your network or within the competitor network.

    Cluster not so close to competitor

    While it might eventually be worth building a location near the northeast cluster and the competitor location, it’s likely that constructing a new provider facility near the southwest cluster will provide the most benefit to the community.

    You could continue your analysis by identifying other clusters of under-served patients who meet different criteria and combine those results to plan your expansion.

  5. Click Save Project to save your project.

In this tutorial, you've learned about geocoding health information and service area analysis in the context of strategic planning for a health system. When geocoding patient data, it is critical to take patient privacy regulations and guidelines into consideration. Using fictitious data, you learned how to apply the standard geocoding process for non-private health data and the HIPAA-aligned geocoding process for protected health data. You learned how to strip identifying fields from a patient address table and rejoin them after geocoding, so that only address information is used for geocoding. You learned that while the ArcGIS World Geocoding Service meets HIPAA guidelines, it is a best practice to strip those fields. Patient data will be protected when you use the ArcGIS World Geocoding Service within the United States. You also learned that you must take care when storing or hosting patient data.

The objective of this tutorial is for you to increase your confidence and understanding related to geocoding sensitive data. Please also understand that the topic of data security, and especially health data security, is broad and this tutorial alone is not sufficient to cover every potential threat.

To further enhance your knowledge of this topic, particularly as it relates to geographic information, we recommend completing the tutorial "De-identify health data for visualization and sharing," which covers various masking and data aggregation techniques that will further protect your sensitive information. As always, you should be aware of and follow your own organizational policies and procedures associated with personally identifiable information and protected health information.

You can find more tutorials in the tutorial gallery.