Install the R-ArcGIS bridge and start statistical analysis

In this lesson, you'll install the R-ArcGIS bridge and begin analyzing your dataset.

Download R and RStudio

First, you'll download and set up R and RStudio, a free integrated development environment for R. RStudio helps you work in R by providing a coding environment, a built-in viewer for charts and graphs, access to CRAN (the Comprehensive R Archive Network, which contains thousands of R packages), and other useful features. (If you already have R and RStudio installed, skip to the next section.)

  1. If necessary, download R 3.2.2 or later. Accept all defaults in the installation wizard.

  2. If necessary, download RStudio Desktop. Accept all defaults in the installation wizard.

Create an ArcGIS project

Now you'll add data to an ArcGIS project to create a map of San Francisco crimes.

  1. Download the file.
  2. Locate the downloaded file on your computer and extract its contents to a folder named San-Francisco.
  3. Open the San-Francisco folder.

    The folder contains the SF_Crime geodatabase, which has crime data that you'll add to a map.

  4. Start ArcGIS Pro. If prompted, sign in using your licensed ArcGIS account.

    If you don't have ArcGIS Pro or an ArcGIS account, you can sign up for an ArcGIS free trial.

  5. From New, Blank Templates, click Map.
  6. In the Create a New Project window, for Name type Crime Analysis and save it to your San-Francisco folder. Uncheck the Create a new folder for this project box.
  7. In the Catalog pane, on the Project tab, expand Folders and then expand the San-Francisco folder.
  8. Expand the SF_Crime geodatabase, right-click the San_Francisco_Crimes feature class, and choose Add To Current Map.

    Map showing raw data

    The map shows locations where crimes occurred from January 2014 through December 2014 in San Francisco.

Install the R-ArcGIS bridge

Once the R-ArcGIS bridge is installed, you can begin reading and writing data to and from ArcGIS and R. You can also begin running script tools that reference an R script.

  1. On the ribbon, click the Project tab.

    Project tab on the ribbon

  2. Click Options. In the Options window, in the Application list, click Geoprocessing.

    Options window

  3. In the R-ArcGIS Support section, select your desired R home directory.

    All versions of R that are installed on your computer will appear in the list. Select R 3.2.2 or a later version.

    If you haven't installed the R-ArcGIS bridge, a warning appears indicating that you need to install the arcgisbinding R package to connect R with ArcGIS. You can automatically download and install the arcgisbinding package, download the package separately, or install the package from a file. If you previously installed the R-ArcGIS bridge, an installed message appears indicating the version of your arcgisbinding package. You're presented with options to check for updates, download the latest version, or update from a file.

  4. If applicable, click the icon next to the warning, and select the option to automatically download and install the arcgisbinding package. Otherwise, check for updates and ensure that you have the latest version of the package.
  5. In the Options window, click OK.
  6. Click the Back button to return to the open map that contains the data on which you want to perform spatial and statistical analysis.

Aggregate point data by counts within a defined location

At first glance, the map may be overwhelming, and it may be difficult to understand what the data represents. Before you start your analysis, you need to aggregate crime counts by space and time. Aggregation reveals spatial and temporal relationships in your data that may not have been visible previously. It summarizes your crime points in space-time bins, each holding the count of crimes that occurred within one spatial unit during one time increment. You choose the size of both.
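To make the idea of a space-time bin concrete, here is a minimal sketch in plain Python. It is not how the ArcGIS tool is implemented: square cells stand in for the hexagon bins used in this lesson to keep the arithmetic short, and the function names, the 300-meter cell size, and the sample coordinates are all illustrative assumptions.

```python
from collections import Counter
from datetime import date
import math


def space_time_bin(x, y, when, cell_size):
    """Return a (column, row, year, month) key for one event.

    Square cells are a simplification; the lesson itself uses hexagons.
    """
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    return (col, row, when.year, when.month)


def aggregate(events, cell_size=300.0):
    """Count events per space-time bin.

    events: iterable of (x, y, date) tuples with x and y in meters.
    """
    return Counter(space_time_bin(x, y, when, cell_size) for x, y, when in events)


# Hypothetical sample events (projected coordinates in meters), not lesson data.
events = [
    (120.0, 80.0, date(2014, 1, 5)),
    (250.0, 290.0, date(2014, 1, 20)),
    (130.0, 95.0, date(2014, 2, 2)),
    (640.0, 310.0, date(2014, 1, 9)),
]

counts = aggregate(events)
for key, n in sorted(counts.items()):
    print(key, n)
```

Each key identifies one bin in space and one month in time, so two crimes in the same cell during the same month become a single count of 2. Shrinking the cell size or the time step produces more bins with fewer crimes in each.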

  1. If necessary, open the Geoprocessing pane. (On the Analysis tab, in the Geoprocessing group, click Tools.)
  2. In the search box, type Create Space Time Cube and press Enter.
  3. In the results, click Create Space Time Cube By Aggregating Points to open the tool. Complete the following parameters:
    • For Input Features, choose San_Francisco_Crimes.
    • For Output Space Time Cube, browse to your San-Francisco folder and name the output
    • For Time Field, choose Dates.
    • For Time Step Interval, type 1 and choose Months.
    • For Time Step Alignment, accept the default.
    • For Aggregation Shape Type, choose Hexagon grid.
    • For Distance Interval, type 300 and choose Meters.

    Create Space Time Cube By Aggregating Points tool

    These parameter values specify the size and shape of the space-time bins that you are creating. Because your data covers the year 2014, one month is a natural time step. Additionally, your department wants to analyze crimes at a local level, so you select a small distance interval value. Hexagon bins are selected because they are preferable in analyses that include aspects of connectivity or movement paths.

  4. Click Run.

    The Create Space Time Cube By Aggregating Points tool creates a netCDF file, which allows you to view spatial patterns and trends over time. The tool aggregated the 74,760 points in the San_Francisco_Crimes layer into 3,510 hexagons (the polygon bins). Each hexagon represents an area of approximately 78,000 square meters. The Distance Interval and Time Step Interval parameters impact the number of resulting bins and the size of each bin. These values can be chosen based on prior knowledge of the analysis area, or the tool will calculate values for you based on the spatial distribution of your data. You can confirm that the Create Space Time Cube By Aggregating Points tool successfully created the file by checking the Messages window.
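The figure of approximately 78,000 square meters per hexagon can be checked with a short calculation. The sketch below assumes that the 300-meter distance interval is the hexagon's height (the distance between two parallel sides), which is consistent with the area quoted above. The function name is illustrative.

```python
import math


def hexagon_area_from_height(height_m):
    """Area of a regular hexagon whose parallel sides are height_m apart."""
    return (math.sqrt(3) / 2) * height_m ** 2


area = hexagon_area_from_height(300)
print(round(area))  # 77942
```

Because the area grows with the square of the height, doubling the distance interval to 600 meters would roughly quadruple the area of each bin and reduce the number of bins accordingly.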

Analyze crime hot spots

Next, you'll analyze where statistically significant clusters of crime are emerging and receding throughout the city. Your analysis will help the department anticipate problems and evaluate the effectiveness of resource allocation for their crime prevention measures.

  1. In the Geoprocessing pane, search for and open the Emerging Hot Spot Analysis tool.
  2. Complete the following parameters:
    • For Input Space Time Cube, browse to and select the location where you stored your newly created space time cube.
    • For Analysis Variable, choose COUNT.
    • For Output Features, browse to your San-Francisco folder and name the output San_Francisco_Crimes_Hot_Spots.
    • For the remaining parameters, accept the defaults.

    Emerging Hot Spot Analysis tool

    By using the default value for Neighborhood Distance, you are letting the tool calculate a distance band for you based on the spatial distribution of your data. The Neighborhood Time Step value is set to one time step interval (one month in this case) by default. These settings are ideal for an exploratory analysis; however, if you knew the optimal distance band and time step interval for your analysis, you could set them.

  3. Click Run.

    The tool runs and its results are added to the map. (The warning message informs you of the value that the tool used for the Neighborhood Distance parameter.)

  4. Turn off the San_Francisco_Crimes layer to see the results more clearly.

    Emerging Hot Spot Analysis Results map

    Trends in statistically significant hot and cold spots are shown on the map. Red areas indicate that over time there has been clustering of high numbers of crime, and blue areas indicate that over time there has been clustering of low numbers of crime. Each location is categorized based on the trends in clustering over time.

    The dark red hexagon bins are persistent hot spots. These are locations that have been statistically significant hot spots for 90 percent of all of your time slices. However, these locations do not have a discernible increase or decrease in the intensity of clustering of crime counts over time.

    In contrast, the light red hexagon bins with beige outlines are intensifying hot spots. These are locations that have been statistically significant hot spots for 90 percent of all of your time slices. In addition, these are locations where the intensity of clustering of crime counts is increasing over time, and that increase is statistically significant.

    Conversely, the dark blue bins are persistent cold spots. These are areas where crime is statistically, and persistently, less prevalent. The light blue outlined bins are intensifying cold spots, the opposite of intensifying hot spots: clusters of low crime counts in these bins are becoming more intense over time. In other words, the cold spots are getting colder.

    The department needs to be especially concerned about the areas where crime is persistent or intensifying. They may move resources to these areas from the places where crime cold spots occur.

  5. Save the project.
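The difference between a persistent and an intensifying hot spot, as described above, comes down to two checks for each bin: was it a hot spot in at least 90 percent of the time slices, and is there a statistically significant upward trend in clustering intensity? The sketch below illustrates those two checks in plain Python. It is a simplified illustration, not the tool's exact procedure. The trend test shown is Mann-Kendall, a common nonparametric choice. The 1.96 threshold, the function names, and the sample series are assumptions, and the real tool distinguishes more categories than the two handled here.

```python
import math
from collections import Counter


def mann_kendall_z(values):
    """Mann-Kendall trend z-score with the standard correction for ties."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference
    ties = Counter(values).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    if s == 0 or var_s == 0:
        return 0.0
    # Continuity correction: shrink |S| by one before standardizing.
    return (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)


def classify_hot_spot(hot_flags, intensity, z_crit=1.96):
    """Label one bin from its per-time-slice hot spot flags and intensities.

    z_crit is an assumed significance threshold for the trend test.
    """
    share_hot = sum(hot_flags) / len(hot_flags)
    if share_hot < 0.9:
        return "not classified in this sketch"
    z = mann_kendall_z(intensity)
    if z > z_crit:
        return "intensifying hot spot"
    if abs(z) <= z_crit:
        return "persistent hot spot"
    return "not classified in this sketch"


# Hypothetical intensity series for 12 monthly time slices.
always_hot = [True] * 12
rising = [2.0, 2.1, 2.3, 2.4, 2.6, 2.7, 2.9, 3.0, 3.2, 3.3, 3.5, 3.6]
steady = [3.0, 3.2, 2.9, 3.1] * 3
rarely_hot = [True] * 6 + [False] * 6

print(classify_hot_spot(always_hot, rising))
print(classify_hot_spot(always_hot, steady))
print(classify_hot_spot(rarely_hot, rising))
```

The first bin has a steadily rising intensity and is labeled intensifying, the second fluctuates without a trend and is labeled persistent, and the third is a hot spot in only half of the time slices, so neither label applies.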

In this lesson, you installed the R-ArcGIS bridge, prepared your data for statistical analysis, and started using some of the available tools. In the next lesson, you'll add attributes to your dataset, allowing you to draw conclusions from your analysis about what factors likely influence the occurrence of crime.