Install the R-ArcGIS bridge and locate data for analysis

First, you'll install the R-ArcGIS bridge and obtain environmental data for your analysis.

Download RStudio

You'll download and set up R and RStudio, a free integrated development environment for R. RStudio helps you work in R by providing a coding platform with access to CRAN, the Comprehensive R Archive Network, which contains thousands of R libraries, a built-in viewer for charts and graphs, and other useful features. (If you already have R and RStudio installed, skip to the next section.)

  1. If necessary, download R 3.2.2 or later. Accept all defaults in the installation wizard.
  2. If necessary, download RStudio Desktop. Accept all defaults in the installation wizard.

Create a project

Next, you'll create a project in ArcGIS Pro and add the data to it. Then, you'll create a map of where African buffalo have been observed.

  1. Download the African-Buffalo.zip file and extract it to a folder named African-Buffalo.
  2. Start ArcGIS Pro. If prompted, sign in using your licensed ArcGIS Pro account.
    Note:

    If you don't have ArcGIS Pro or an ArcGIS account, you can sign up for an ArcGIS free trial.

  3. Under New, click Catalog.
  4. In the Create a New Project window, name the project Ecological Niche Factor Analysis. Save it to your African-Buffalo folder and uncheck the Create a new folder for this project box.

    Create a New Project

    A blank project opens to the Catalog view. Using the Catalog, you'll connect to the African-Buffalo geodatabase.

  5. In the Catalog view, double-click Folders. Expand African-Buffalo and open ENFA.gdb.

    Make sure you open the ENFA.gdb geodatabase that you downloaded for the lesson. The other geodatabase in the folder was created by default when you created the project and is empty.

  6. Right-click African_Buffalo_Locations, click Add To New, and choose Map.

    Add shapefile to the map

    A map is added to your project with the African_Buffalo_Locations feature class shown. African_Buffalo_Locations shows where buffalo were observed from February 2005 through December 2006 in Kruger National Park, South Africa.

Install the R-ArcGIS bridge

Once the R-ArcGIS bridge is installed, you can begin reading and writing data to and from ArcGIS and R. You can also begin running script tools that reference an R script.

  1. On the ribbon, click the Project tab.

    Project tab on the ribbon

  2. Click Options. In the Options pane, in the Application list, click Geoprocessing.
  3. In the R-ArcGIS Support section, select your desired R home directory.

    Options window

    Note:

    All versions of R that are installed on your computer will appear in the list. Select R 3.2.2 or a later version.

    If you haven't installed the R-ArcGIS bridge, a warning appears indicating that you need to install the arcgisbinding R package to connect R with ArcGIS. You can automatically download and install the arcgisbinding package, download the package separately, or install the package from a file. If you previously installed the R-ArcGIS bridge, an installed message appears indicating the version of your arcgisbinding package. You're presented with options to check for updates, download the latest version, or update from a file.

  4. If applicable, click the icon next to the warning, and choose Download latest version to install the arcgisbinding package from GitHub. Otherwise, check for updates and ensure that you have the latest version of the package.
  5. In the Options window, click OK.
  6. Click the Back button to return to the open map that contains the data on which you want to perform spatial and statistical analysis.

Add watershed data

To perform an ecological niche factor analysis, you must first define a study area. This area should include the regions where buffalo are present, as well as locations where buffalo have yet to be observed, so the environmental and climate characteristics can be compared. While there are many methods for delineating a study area, you'll use the watersheds that encompass the African buffalo observations. Once you have a study area, you'll acquire data pertaining to the environment in that region. Since African buffalo do not have many predators, research has shown that their home territory tends to be determined based on substrate characteristics, like bedrock and soil, which in turn help determine the available plant life and water on which they depend. To acquire data about these characteristics of the region, you can make use of the four datasets that comprise Esri's Ecological Land Units (ELUs). The ELUs describe the bioclimate, landform, lithology, and land-cover characteristics for the entire world.

  1. On the ribbon, click the View tab. In the Windows group, click Catalog Pane.
  2. In the Catalog pane, expand Folders and African-Buffalo and open ENFA.gdb. Right-click Elefantes_Incomati_Watersheds and choose Add To Current Map.

    Elefantes and Incomati watershed data

    The Elefantes and Incomati basins will make up the study area that you'll consider when performing the ecological niche factor analysis. Next, you'll add environment and climate data for this area.

  3. In the Catalog pane, click Portal and click All Portal.

    All Portal data

  4. In the search bar, type World Bioclimates. Right-click the World Bioclimates image service and choose Add To Current Map.

    World Bioclimates image service

  5. In the Contents pane, right-click World Bioclimates and click Zoom to Layer.

    The Ecological Land Unit datasets have data for the entire world.

  6. In the Catalog pane, add the following image services from the portal.
    • World Ecological Facets Landform Classes
    • World Lithology
    • World Land Cover ESA 2010
    • World Distance to Water
    • World Population Estimate 2016
    Note:

    Make sure to add the layer that is an image service and not a different data type. Image service layers have a specific icon. You can also hover over each layer and check that Type is set to Imagery.

    You now have references to the image services containing the data you hope to use for your analysis. However, you currently have more information than you need because these imagery layers cover the entire world. You will use your study area to make a local copy of the data that you can use in an ecological niche factor analysis. Additionally, all six raster layers must have the same cell size before you can work with them together. To ensure all output data has the same cell size and is clipped to the extent of the study area, you will set several geoprocessing environment settings that geoprocessing tools will use as output settings.

  7. On the ribbon, click the Analysis tab. In the Geoprocessing group, click Environments.

    Set geoprocessing environments

    The Environments pane opens. Current Workspace is set to Ecological Niche Factor Analysis.gdb, the geodatabase that was created with the project. You'll use this to save the copies you make of the imagery service layers.

  8. For Output Coordinate System and Extent, choose Elefantes_Incomati_Watersheds.
  9. Under Raster Analysis, for Cell Size, type 928.

    Set geoprocessing environments

    This is the rounded size of the largest cell. When cells are resampled, it is important to use the largest dimensions for accuracy. Zones of smaller cells can be averaged to create a single cell at the larger dimension, but larger cells that are broken down don't accurately show the data for that area.

    Tip:

    To check the cell size of a raster, open the Layer Properties window. In the Layer Properties, click the Source tab and expand Raster Information.

  10. Click OK.

    These settings will be saved as the default environment when you use geoprocessing tools in this project. Next, you'll use the Copy Raster tool.

  11. On the ribbon, click the Analysis tab. In the Geoprocessing group, click Tools.
  12. In the Geoprocessing pane, search for and open the Copy Raster (Data Management Tools) tool.
  13. On the Parameters tab, set the following:
    • Input Raster: World Bioclimates
    • Output Raster Dataset: Bioclimates_SA
    • Pixel Type: 8 bit unsigned

    Copy raster to geodatabase

    If you want to override the geoprocessing environments settings you just specified, you can click the Environments tab and change them.

  14. Click Run.

    The tool runs and creates a local copy of the raster imagery layer based on the extent of your study area. You'll repeat this process for the rest of the ELU data.

  15. For the other imagery services, change the Input Raster and Output Raster Dataset parameters as follows:
    • World Ecological Facets Landform Classes: Ecological_Facets_SA
    • World Lithology: Lithology_SA
    • World Land Cover ESA 2010: Land_Cover_SA
    • World Distance to Water: Distance_To_Water_SA
    • World Population Estimate 2016: Population_SA
  16. In the Contents pane, remove all the imagery layers that you added from the portal.

    Layers for analysis

  17. Save the map.
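The reasoning behind the cell-size setting in step 9 (resample to the largest cell rather than the smallest) can be illustrated with a small base R sketch. The 4 x 4 grid, its values, and the helper name aggregate_blocks are hypothetical and only stand in for a fine-resolution raster; this is not how ArcGIS Pro resamples internally.

```r
# Hypothetical 4 x 4 grid of fine cells (values are made up for illustration).
fine <- matrix(1:16, nrow = 4, ncol = 4)

# Average each block of factor x factor fine cells into one coarse cell.
aggregate_blocks <- function(m, factor) {
  stopifnot(nrow(m) %% factor == 0, ncol(m) %% factor == 0)
  row_group <- rep(seq_len(nrow(m) / factor), each = factor)
  col_group <- rep(seq_len(ncol(m) / factor), each = factor)
  out <- matrix(NA_real_, nrow = max(row_group), ncol = max(col_group))
  for (i in unique(row_group)) {
    for (j in unique(col_group)) {
      out[i, j] <- mean(m[row_group == i, col_group == j])
    }
  }
  out
}

# Each coarse cell is a true summary of the four fine cells it covers.
coarse <- aggregate_blocks(fine, 2)
print(coarse)
```

Going in the opposite direction, splitting a coarse cell into finer ones, can only repeat the coarse value, which is why the tutorial resamples every layer to the largest cell size.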

You've installed the R-ArcGIS bridge, created a study area to conduct your analysis in, and acquired and prepared your data for statistical analysis.


Prepare Ecological Land Unit data for analysis

Previously, you installed the R-ArcGIS bridge and downloaded the data for your statistical analysis. Then, in ArcGIS, you created a study area based on watersheds in Africa that intersect the locations where African buffalo have already been observed. Because the environment and climate data you have classify the land into various categories, next you'll adjust the data so it can be used in a quantitative analysis. You can do this with the Con and Focal Statistics tools in ArcGIS.

Reclassify environment categories

Now that you have all the data for your study area, you'll prepare it for use in your ecological niche factor analysis. ENFA requires that the data you provide be continuous; however, several of the layers you are working with contain categorical data, which is represented as integer values. One way to convert a categorical raster to continuous data is to use neighborhood statistics that quantify the prevalence of a category in the neighborhood surrounding each cell. To do this, you'll start by reclassifying all cells of a specific category to the number 1 and all cells of other categories to the number 0. Because this process is repeated for every category in each of the four categorical datasets, you'll create a model to automate the process. First, you'll create geodatabases to save the results.

  1. In the Catalog pane, expand Folders. Right-click African-Buffalo and click New, and then click New File Geodatabase.

    Create a new file geodatabase

  2. Name the new geodatabase Con_Results.gdb and create another geodatabase named Focal_Statistics_Results.gdb.
  3. On the ribbon, click the Analysis tab. In the Geoprocessing group, click ModelBuilder.

    A blank model window opens. You can switch between the model and the map by clicking the tabs at the top of the pane.

    Model and Map tabs

    First, you'll add an iterator. Iterators run a process multiple times based on the data you enter.

  4. If necessary, on the ribbon, click the Model tab. In the Insert group, click Iterators and choose Iterate Field Values.

    Iterate Field Value tool

    Iterate Field Values and its output, Value, are added to the model pane. They will appear unavailable until you add parameters.

  5. Double-click Iterate Field Values. In the Parameters window, for Input Table, choose Bioclimates_SA.
  6. For Field, choose Value, and for Data Type, choose Double. Click OK.

    Iterate Field Values parameters

    The Iterate Field Values iterator is now configured to loop through each category with values of the Bioclimates_SA layer. To pass these categories to your desired tool, Con, you will add the Con tool to your model and connect the two.

  7. In the Geoprocessing pane, search for the Con (Spatial Analyst Tools) tool. Click the result and drag it into the model pane.

    Add Con tool to model

    The Con tool is the equivalent of an if/else statement. If the conditions you specify are met, the tool will return the value x that you define; if they are not met, it will return y.

  8. Click Value and drag the mouse to connect it to Con. In the pop-up window, define the arrow as Precondition.

    Precondition the Con tool

  9. Click Bioclimates_SA and connect it to Con. In the pop-up, define Bioclimates_SA as the Input conditional raster.
  10. Double-click Con. For Expression, click the New Expression drop-down and select Create a new expression in SQL. Type Value = %Value%.
  11. For Input true raster or constant value, type 1. For Input false raster or constant value, type 0.

    This SQL query will set the condition equal to each category value that the raster iterates through in turn. For each cell where the condition is met, a value of 1 will be returned. If the condition is not met, a value of 0 will be returned.

  12. For Output raster, click the browse button and browse to your Con_Results geodatabase. Name the output Bioclimates_Con%Value%.

    Con parameters

  13. Click OK.

    Currently, your model is set to iterate through each category with values in your input layer and pass those category values to the Con tool. For each cell where the condition is true, meaning the category of that cell matches the current category, the Con tool writes a 1. In all other cases, it writes a 0. This process repeats for each category with values, so a new raster is created for each category that the iterator identifies and is saved to the Con_Results geodatabase with the value it represents in its name. Enclosing the variable name in percent signs (%Value%) is ModelBuilder's inline variable substitution: in both the SQL expression and the output raster name, %Value% is replaced with the current value from the iterator.
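The reclassification that the iterator and Con tool perform can be sketched in base R as follows. This is only an illustration of the logic, not the ModelBuilder implementation; the 3 x 3 grid and its class codes are hypothetical.

```r
# Hypothetical categorical raster: each integer is a class code (made-up values).
categories <- matrix(c(14, 14, 19,
                       19, 32, 32,
                       14, 32, 19), nrow = 3, byrow = TRUE)

# The iterator visits each category that actually occurs in the raster.
values <- sort(unique(as.vector(categories)))

# For each category, mirror the Con expression "Value = %Value%":
# matching cells become 1, all other cells become 0.
binary_layers <- lapply(values, function(v) ifelse(categories == v, 1, 0))

# Name each output after the category it represents, as the model does.
names(binary_layers) <- paste0("Bioclimates_Con", values)

print(binary_layers[["Bioclimates_Con14"]])
```

Because every cell belongs to exactly one category, the binary layers sum to 1 at every cell, which is a quick way to check that no category was missed.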

Add focal statistics

Next, you'll use these results to produce raster layers that capture the prevalence of each category. Focal statistics allow you to define a neighborhood around each raster cell and calculate a statistic, such as a sum, from the cells contained in that neighborhood. This will give you a sense of which categories occur where and how frequently. By calculating focal statistics on each result from the Con tool, you will produce quantitative values that represent each category and its prevalence, which you can then use in your ecological niche factor analysis.
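As a rough illustration of a focal sum, the base R sketch below counts the 1s around each cell of a binary layer. It assumes a square 3 x 3 window for simplicity, whereas the steps that follow use a circular neighborhood; the function name focal_sum and the input grid are hypothetical.

```r
# Sum the values in a square window around each cell.
# Windows are clipped at the raster boundary, so edge cells see fewer neighbors.
focal_sum <- function(m, radius = 1) {
  out <- matrix(0, nrow = nrow(m), ncol = ncol(m))
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      rows <- max(1, i - radius):min(nrow(m), i + radius)
      cols <- max(1, j - radius):min(ncol(m), j + radius)
      out[i, j] <- sum(m[rows, cols])
    }
  }
  out
}

# Hypothetical binary layer for one category (1 = category present).
binary <- matrix(c(1, 1, 0, 0,
                   1, 0, 0, 0,
                   0, 0, 0, 1,
                   0, 0, 1, 1), nrow = 4, byrow = TRUE)

# Higher values mean the category is more prevalent around that cell.
prevalence <- focal_sum(binary)
print(prevalence)
```

The output is no longer a 0/1 indicator but a count that varies smoothly across the grid, which is the continuous form that ENFA needs.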

  1. In the Geoprocessing pane, search for Focal Statistics (Spatial Analyst Tools). Drag the result into your model pane.

    The result from the Con tool will be passed through Focal Statistics, so you'll connect the two.

  2. Drag Bioclimates_Con%Value% to Focal Statistics. In the pop-up menu, define the connection as Input Raster.

    Connect Focal Statistics

  3. Double-click Focal Statistics.

    Input Raster should already be filled with the result of the Con tool.

  4. For Output Raster, browse to your Focal_Statistics_Results.gdb and name your output Bioclimates_FS%Value%.

    This will give each output a unique name based on the category being run.

  5. For Neighborhood, choose Circle. For Statistics type, choose Sum. Accept all the other defaults.
  6. Click OK.

    For each category in the World Bioclimates layer, this tool will construct a circular neighborhood and sum all the cells with a value of one present in each neighborhood. The ultimate result will be a new raster that communicates how commonly the first category with values for the World Bioclimates layer occurs and where those locations are.

  7. On the ribbon, in the Run group, click Run.

    The model will start running through each category in the World Bioclimates Study Area layer. It will find that category 14 is the first category that contains values. It will then pass this category and its values to the Con tool, and the result will be passed to the Focal Statistics tool. This process will then repeat for the next category with values until all the categories with values have been processed. In the case of World Bioclimates, the final category is category 32.

    When you see the drop shadow under your tools in the ModelBuilder window, the tools have finished running.

  8. In the Catalog pane, expand Con_Results.gdb and Focal_Statistics_Results.gdb. If necessary, right-click each and choose Refresh.

    Each geodatabase should have 8 new rasters in it. The results from the Con tool are raster layers composed of ones and zeros to represent where each category is present and where it is not. The results from the Focal Statistics tool are raster layers summarizing the number of each category present in circular neighborhoods around each location. This result is what will be used in your ecological niche factor analysis. Now, you will repeat this same process for each of the other layers composing the Ecological Land Units, namely, Ecological Facets, Lithology, and Land Cover.

  9. On the ribbon in the Model group, click Save As and name the model Bioclimates.

    The model will be saved to the Ecological Niche Factor Analysis toolbox. It is accessible from the Catalog pane to rerun or edit as necessary. Now, you'll use the model to run three more variables. Each variable must be run separately because only one iterator can be used per model.

  10. Double-click Bioclimates_SA and change the input variable to Ecological_Facets_SA.

    The model will update the iterator, value, and input conditional raster accordingly.

  11. Double-click Con and change the Output raster to Ecological_Facets_Con%Value%. Click OK.
  12. Double-click Focal Statistics and change the Output raster to Ecological_Facets_FS%Value%. Click OK and click Run.
  13. When the model is finished running, save it as Ecological Facets, and then edit the model parameters to rerun the model for the following layers:
    • Lithology_SA
      • Input conditional raster: Lithology_SA.
      • Con output raster: Lithology_Con%Value%.
      • Focal Statistics output raster: Lithology_FS%Value%.
      • Save the model as Lithology.
    • Land_Cover_SA
      • Input conditional raster: Land_Cover_SA.
      • Con output raster: Land_Cover_Con%Value%.
      • Focal Statistics output raster: Land_Cover_FS%Value%.
      • Save the model as Land_Cover.

    You do not need to do this process for the Population or Distance to Water layers, because they already contain numerical measures.

  14. Close the model pane and save the project.

You've created models to automate two separate processes. Through this model creation, you saved yourself the time of manually running these tools multiple times for each raster layer, and you produced models that can be shared with others wanting to duplicate this workflow.


Prepare buffalo presence data for analysis

Previously, you reclassified your data and summarized the results to produce multiple new rasters containing information about how common certain environmental and climate characteristics are in your study area. By constructing a study area based on watersheds, you are working with an area substantially larger than the regions where buffalo have been observed. A benefit of this is that any edge effects introduced by running the Focal Statistics tool will be confined to the outer edges of your study region and thus will not impact the analysis. Edge effects occur because neighborhoods constructed at the edges of a defined area do not always contain the same number of cells as those constructed in the middle of the study area, since you reach a point where there are no more cells to include. There are many ways to handle edge effects, but in this case, having a large enough study area and taking care when interpreting any results at its borders is enough to account for them.
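To see why edge effects arise, the following base R sketch counts how many cells fall inside a neighborhood at each position of a small grid. It assumes a hypothetical 5 x 5 raster and a square 3 x 3 window clipped at the boundary; the circular neighborhood used earlier behaves the same way in principle.

```r
# Hypothetical raster dimensions and window radius (1 gives a 3 x 3 window).
n_rows <- 5
n_cols <- 5
radius <- 1

# For every cell, count how many cells its clipped window actually contains.
window_cells <- matrix(0L, nrow = n_rows, ncol = n_cols)
for (i in seq_len(n_rows)) {
  for (j in seq_len(n_cols)) {
    rows <- max(1, i - radius):min(n_rows, i + radius)
    cols <- max(1, j - radius):min(n_cols, j + radius)
    window_cells[i, j] <- length(rows) * length(cols)
  }
}

# Corners, edges, and interior cells have different neighborhood sizes.
print(window_cells)
```

A focal sum at a corner can therefore never reach the values possible in the interior, which is why results near the border of the study area should be read with caution.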

Create a raster with buffalo locations

Next, you'll prepare your data on buffalo sightings. The original data contains one point for each location where a buffalo was sighted. Recall that the ENFA requires rasters as input. To convert the point locations to a raster, you'll use the Point to Raster tool. This tool creates a raster where each cell contains a count of the number of points that fall within the cell. To begin this process, you'll create a raster with the same extent as your study area, allowing you to match environmental information and buffalo presence to determine their preferred ecological niches.
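The point-to-raster conversion with a Sum cell assignment can be approximated in base R as below. The coordinates, the 3 x 3 grid, and the lower-left origin at (0, 0) are assumptions made for illustration; only the cell size of 928 and the visible field come from the lesson.

```r
# Hypothetical sightings in map units; 'visible' is 1 for every observation.
sightings <- data.frame(
  x = c(100, 400, 450, 1900, 2500),
  y = c(100, 300, 800, 1000, 2700),
  visible = 1
)

cell_size <- 928
n_rows <- 3   # assumed grid: 3 x 3 cells with its lower-left corner at (0, 0)
n_cols <- 3

# Locate the cell that contains each point. Row 1 is the top row of the raster.
col_id <- floor(sightings$x / cell_size) + 1
row_id <- n_rows - floor(sightings$y / cell_size)

# Start with NoData everywhere, then sum 'visible' for the points in each cell.
counts <- matrix(NA_real_, nrow = n_rows, ncol = n_cols)
for (k in seq_len(nrow(sightings))) {
  r <- row_id[k]
  cc <- col_id[k]
  current <- if (is.na(counts[r, cc])) 0 else counts[r, cc]
  counts[r, cc] <- current + sightings$visible[k]
}

print(counts)
```

Cells that received no points stay NA, which corresponds to the NoData cells you will see in the output of the tool.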

  1. In the Geoprocessing pane, open Point to Raster (Conversion Tools).
  2. For Input Features, choose African_Buffalo_Locations. For Value field, choose visible.

    For all cells where buffalo were observed, the visible field has a value of 1.

  3. For Output Raster Dataset, choose the Ecological Niche Factor Analysis geodatabase and name the raster African_Buffalo_Locations_Raster.
  4. For Cell assignment type, choose Sum.

    In locations where multiple buffalo were observed, all the values will be summed to get a count of buffalo for that location.

  5. Make sure Cellsize is 928 and click Run.

    This is the size of the largest raster cell. Earlier, you resized all the other rasters to this cell size using geoprocessing environments. Because those environments are still set, each additional raster you produce will follow those parameters. They will all have the same extent as the Elefantes_Incomati_Watersheds layer and have a cell size of 928.

  6. In the Contents pane, uncheck all layers except for African_Buffalo_Locations_Raster and the Topographic basemap to turn them off.

    The new raster has a default color ramp of black to white. You'll change this to make it easier to see.

  7. If necessary, in the Contents pane, expand the African_Buffalo_Locations_Raster layer and right-click the color ramp, and then expand the color ramp options.

    Change default color ramp

  8. Choose a green to red color ramp.

    Buffalo presence raster

    The cells with values show up in the same locations as the points representing African buffalo locations. All other cells contain NoData values because the buffalo were not present in those locations. However, you want these locations to reflect the fact that based on your data, no buffalo were observed there. To do that, you'll change your NoData cells to contain zeros instead. In this case, zeros do not reflect that buffalo are completely absent from that location; instead, they have yet to be seen there. This is an important distinction and part of what makes an ecological niche factor analysis so powerful.

Create a buffalo presence raster

An important part of knowing where buffalo are and the habitat characteristics they prefer is knowing where they are not. First, you'll create a raster that identifies the NoData cells, where buffalo have not been observed. In this raster, cells where buffalo have been observed have a value of 0, and cells that contained NoData values have a value of 1. While this might seem contradictory at first, this result will help you produce your final buffalo presence raster. Then, you'll use it to replace every NoData cell with a 0 while keeping the observed counts in all other cells. The resulting raster records how many buffalo have been observed in each cell of your study area.

  1. In the Geoprocessing pane, open Is Null (Spatial Analyst Tools).
  2. For Input Raster, choose African_Buffalo_Locations_Raster, and for Output Raster Dataset, name the result African_Buffalo_NoData_Locations_Raster.
  3. Click Run.

    NoData locations

  4. In the Geoprocessing pane, open Con (Spatial Analyst Tools).
  5. Set the following parameters:
    • Input conditional raster: African_Buffalo_NoData_Locations_Raster
    • Input true raster or constant value: 0
    • Input false raster or constant value: African_Buffalo_Locations_Raster
    • Output raster: African_Buffalo_Presence_Raster
  6. Click Run.
  7. In the Contents pane, remove African_Buffalo_Locations_Raster and African_Buffalo_NoData_Locations_Raster.

    You've now created a presence raster for African buffalo containing locations that are weighted by the counts of buffalo observed there over the course of the study. You did this by first providing the result from the Is Null tool as the conditional raster. This raster layer consisted of ones and zeros, reflecting where NoData values were present and where they were not. In all cases where a NoData value existed, the value of that cell was 1, representing true. By setting the Con tool's Input true raster or constant value to 0, all of these locations had their values changed to 0 to reflect no buffalo being observed at those locations. In all other locations, where a NoData value was not present, the values from your results of the Point to Raster tool were used. By combining these two results, you kept the part you wanted from each tool run to communicate how commonly buffalo were found in each cell of your study area. The resulting presence raster can now be used in conjunction with all your environmental and climate raster data layers to perform an ecological niche factor analysis.
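The Is Null and Con combination described above amounts to the following base R sketch. The small count raster is hypothetical; the two ifelse() calls stand in for the two geoprocessing tools.

```r
# Hypothetical output of Point to Raster: counts where buffalo were seen, NA elsewhere.
counts <- matrix(c(NA, NA, 1,
                   NA, NA, 1,
                    3, NA, NA), nrow = 3, byrow = TRUE)

# Is Null: 1 where the cell is NoData, 0 where it holds a value.
is_null <- ifelse(is.na(counts), 1, 0)

# Con: where the condition is true (NoData), write 0;
# otherwise keep the original count from the Point to Raster result.
presence <- ifelse(is_null == 1, 0, counts)

print(presence)
```

The result has no missing cells: observed locations keep their counts, and every other cell holds 0, meaning only that no buffalo have been recorded there yet.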

Combine environmental and buffalo presence data

To simplify working with all this information, you'll combine your multiple single-band ecological raster data layers and your buffalo presence raster into one multiband raster, which can then be used in R to convey information regarding the environments African buffalo have been observed and not observed in.

  1. In the Geoprocessing pane, open Composite Bands (Data Management Tools).
  2. For Input Rasters, browse to your Focal_Statistics_Results geodatabase. Select every Focal Statistics result by clicking the first raster and pressing Shift while clicking the last raster.
  3. Click OK.

    All 53 rasters from Focal_Statistics_Results.gdb are added to the Input Rasters list.

  4. At the end of the Input Rasters list, click browse next to the empty input box and add Population_SA, Distance_To_Water_SA, and African_Buffalo_Presence_Raster from the Ecological Niche Factor Analysis geodatabase.
  5. For Output Raster Dataset, type ENFA_Environmental_Buffalo_Attributes and save the raster to the Ecological Niche Factor Analysis geodatabase.
  6. Click Run.

    Composite raster of buffalo presence

    You have now created a 56-band raster containing all the environmental, climate, and buffalo presence data needed for your ecological niche factor analysis model.

  7. Save the project.
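Conceptually, compositing bands stacks single-band rasters of identical shape into one multiband dataset. The base R sketch below shows the idea with three tiny hypothetical layers; the layer names and values are made up, and the real composite has 56 bands.

```r
# Three hypothetical single-band rasters with the same 2 x 3 shape.
band_a   <- matrix(1:6, nrow = 2)                  # e.g., one focal statistics result
band_b   <- matrix(7:12, nrow = 2)                 # e.g., distance to water
presence <- matrix(c(0, 2, 0, 0, 1, 0), nrow = 2)  # e.g., buffalo counts, added last

# Stack the layers along a third dimension: rows x columns x bands.
composite <- array(c(band_a, band_b, presence), dim = c(2, 3, 3))

# Flatten to a table with one row per pixel and one column per band,
# the shape in which the data is analyzed later in R.
pixels <- matrix(composite, ncol = dim(composite)[3])
print(pixels)
```

Because the presence raster is added last, it ends up in the final column, which is why the order in which you add rasters to the Input Rasters list matters.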

You've created a presence raster for African buffalo and combined all your environmental and climate data into one multiband raster, ready for analysis.


Perform ecological niche factor analysis in R

Previously, you finalized all of your data creation and aggregation work by producing one raster containing all of the ecological details for your study area and the incidence of buffalo. You will now transfer this multiband raster dataset into R using the R-ArcGIS bridge so you can build an ENFA model and interpret its results. While the bridge has always supported vector data and allowed its easy transfer between R and ArcGIS, raster support is a new feature of the bridge. As such, there are several new functions designed to enable seamless connections between R and ArcGIS for raster data, and you'll use them in this section.

Bridge your data into R

You'll work in RStudio to perform an ecological niche factor analysis on your data. Because you've installed the R-ArcGIS bridge, the data in your ArcGIS Pro project is connected to and accessible from RStudio.

  1. If necessary, open your Ecological Niche Factor Analysis project in ArcGIS Pro, and then open RStudio.

    Next, you'll run a command that loads all the functions for the arcgisbinding package. Then you'll run another command that performs a quick check to ensure that the bridge is running correctly and that R recognizes the version of ArcGIS Pro you're using. Both commands need to be run each time you start a new session of RStudio.

  2. In the R console, type the following code and press Enter:
    library(arcgisbinding)
  3. In the R console, type the following code and press Enter:
    arc.check_product()

    The arc.check_product() function causes the RStudio Console to print information regarding your ArcGIS product and license. Now, paths to shapefiles, feature classes from geodatabases, tables, single-band and multiband rasters, and image service URLs are all valid arguments to use in the arc.open() function.

  4. Use the arc.open() function. For its argument, type the full path to the composite multiband raster containing all your data (ENFA_Environmental_Buffalo_Attributes) and press Enter.
    Note:

    You may have saved your project data to a different location than shown in the code example. If you're copying and pasting the code, update the path accordingly. Additionally, if you copy and paste the file path, make sure to use forward slashes (/) in the path. R treats a backslash (\), which most file browsers use as the path separator, as an escape character in strings, so each backslash in a pasted Windows path must be replaced with a forward slash or doubled (\\).

    data_path <- arc.open("C:/African-Buffalo/Ecological Niche Factor Analysis.gdb/ENFA_Environmental_Buffalo_Attributes")

    In addition to the path to your data, this object contains the spatial and attribute information for your ArcGIS data and can now be used in other functions.

  5. Install and load the raster package.
    install.packages('raster')
    library(raster)

    The raster package provides functions needed to interact with the raster data you've been working with. The arc.raster() function both converts the data from an ArcGIS data type to an R object and allows you to customize it. With arc.raster(), you can construct a subset of the data based on the number of rows and columns you are looking to work with and, additionally, create subsets by a specific band or multiple bands.

    Note:

    If you have trouble installing packages programmatically, on the ribbon of RStudio, click Tools and click Install Packages. In the Install Packages window, you can search for and install any available package.

  6. Use the arc.raster() function. For the first argument, put data_path as the object you're working with. Because you've already performed data manipulation in ArcGIS, you'll bring the data in as it is, so no second argument is needed.
    arc_raster <- arc.raster(data_path)

    The arc.raster() function will convert your data to an R object with a class of arc.raster. Finally, you will finish the process of bringing your ArcGIS raster data into R by converting your arc.raster object to a RasterBrick object. A RasterBrick, from the raster package you loaded, is the R equivalent of a multiband raster. This will allow the script to read all 56 bands of the ENFA_Environmental_Buffalo_Attributes raster you created earlier. The raster package provides functions for manipulating, analyzing, and modeling gridded spatial data, and through the bridge, you can transfer all the spatial attributes of your data from ArcGIS into R without worrying about a loss of information.

  7. Use the as.raster() function. For the first argument, use arc_raster as the object you are converting to a RasterBrick object.
    r_raster <- as.raster(arc_raster)

    Your data has been bridged from ArcGIS Pro to RStudio and converted from an ArcGIS data type to an R object that you'll be able to perform analysis on.
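If you paste a path copied from a Windows file browser, the note in step 4 about slashes applies. One way to handle this in base R, shown here as a small sketch, is to convert the separators before passing the path to arc.open(). The path is the example location used above; yours may differ.

```r
# A Windows-style path. Inside an R string, each backslash must be written as \\.
windows_path <- "C:\\African-Buffalo\\Ecological Niche Factor Analysis.gdb\\ENFA_Environmental_Buffalo_Attributes"

# Replace every backslash with a forward slash.
# fixed = TRUE treats "\\" as a literal backslash rather than a regular expression.
r_path <- gsub("\\", "/", windows_path, fixed = TRUE)

print(r_path)
```

The converted string can then be used as the argument in step 4, for example arc.open(r_path).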

Perform component analysis

Now that your data is in R, you can begin using any of the libraries and their associated functions to perform analysis in R. First, you'll account for the correlations that exist between many of your environmental attributes. Ecological data is often highly correlated, and correlated variables carry redundant information because they communicate the same relationship. You'll also need to identify the combinations of variables that explain the most variation in your data by using an analysis like principal component analysis (PCA). PCA is a common method for handling multicollinearity and for reducing dimensionality. The dudi.pca() function from the ade4 package will allow you to perform PCA on your data and will return an object ready for use in an ENFA analysis.
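To show what PCA does with correlated variables, the sketch below uses base R's prcomp() on simulated data. This is a stand-in for illustration only: the lesson itself uses dudi.pca() from ade4, and the variable names and simulated values here are hypothetical.

```r
set.seed(42)
n <- 200
base <- rnorm(n)

# Two strongly correlated variables and one independent variable (simulated).
env <- data.frame(
  var_a = base + rnorm(n, sd = 0.1),
  var_b = 2 * base + rnorm(n, sd = 0.1),
  var_c = rnorm(n)
)

# Center and scale each variable, then rotate onto the principal axes.
pca <- prcomp(env, center = TRUE, scale. = TRUE)

# Share of the total variance carried by each component.
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(explained, 3))

# Scores on the first two components; unlike the inputs, these are uncorrelated.
scores <- pca$x[, 1:2]
print(round(cor(scores), 3))
```

The two correlated inputs collapse largely onto a single component, so fewer components than variables are enough to retain most of the variation.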

  1. Use the as.data.frame() function to convert the pixels contained in your arc_raster object to a data frame, because the functions you'll use later require data frame input. Then, create an index column and fill it with row numbers to assist with later calculations.
    variables <- as.data.frame(arc_raster$pixel_block())
    variables$pixel_index <- 1:nrow(variables)
  2. Use the na.omit() function to remove any rows with NAs, or missing values, from your data frame. Then, store your index values in a new vector object named pixel_index and remove the index column from the data frame.
    variables <- na.omit(variables)
    pixel_index <- variables$pixel_index
    variables$pixel_index <- NULL

    You'll use the pixel_index object when creating a new prediction raster to match locations. To handle the correlations in your environmental and climate data, you will separate it from the buffalo presence data.

  3. Define two subsets in R. One contains all the ecological data, which will define the study area in terms of environment and climate. The second contains the counts of buffalo observed at each location.
    ecological_variables <- variables[, -56]
    buffalo_presence <- variables[, 56]

    Next, you'll run the principal components analysis on your environmental and climate attributes.

  4. Use the library() function to load the ade4 package into RStudio. If necessary, uncomment the install.packages("ade4") line.
    #install.packages("ade4")
    library(ade4)
  5. Run the dudi.pca() function on your environmental variables, represented by the ecological_variables object. For the additional arguments, set scannf to FALSE so that the function does not display the eigenvalue bar plot and prompt you for the number of axes, and set nf = 10 to keep the first 10 principal components (axes).
    dudi_obj <- dudi.pca(ecological_variables, scannf = FALSE, nf = 10)

    Now that you've handled correlations between your environmental data and summarized it in 10 principal components, you can perform an ecological niche factor analysis. The buffalo presence data will serve as a weight of importance.

  6. Load the adehabitatHS package. If necessary, uncomment the install.packages("adehabitatHS") line.
    #install.packages("adehabitatHS")
    library(adehabitatHS)
  7. Perform an ENFA analysis by entering the result of the PCA along with the information stored in the buffalo_presence object. These receive the additional arguments of scannf = FALSE and nf = 2.
    enfa_result <- enfa(dudi_obj, buffalo_presence, scannf = FALSE, nf = 2)

    As before, setting scannf to FALSE skips the eigenvalue plot and its prompt. Setting nf to 2 keeps two specialization axes in the result, which summarize the environmental characteristics that most restrict where African buffalo live.
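
The index bookkeeping in steps 1 through 3 can be sketched on a small table. This is an illustration under stated assumptions: the six-pixel data frame and its column names are made up, and the presence counts are assumed to sit in the last column, as they do in the lesson's 56-band raster.

```r
# Toy "pixel table": 6 pixels, 2 environmental bands plus a presence band.
variables <- data.frame(
  band_1   = c(0.2, NA, 0.5, 0.9, 0.1, NA),
  band_2   = c(10, 12, NA, 15, 11, 13),
  presence = c(0, 1, 0, 2, 0, 1)
)

# The presence counts are in the last column (column 56 in the lesson).
presence_col <- ncol(variables)

# Record each pixel's original position before any rows are dropped.
variables$pixel_index <- seq_len(nrow(variables))

# Drop every row that has a missing value in any column.
variables <- na.omit(variables)

# Keep the surviving positions, then remove the helper column.
pixel_index <- variables$pixel_index
variables$pixel_index <- NULL

# Split by column position into environmental data and presence counts.
ecological_variables <- variables[, -presence_col]
buffalo_presence <- variables[, presence_col]

pixel_index        # original positions of the pixels with complete data
buffalo_presence   # presence counts for those same pixels
```

Storing pixel_index before calling na.omit() is what later lets you place each predicted value back at the correct location in the study area.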

Produce a habitat suitability map

Now that you have fit your ecological niche factor analysis, you can use the results from your model to produce a habitat suitability map for your study area. This map will provide a measurement for every location in your study area on how habitable it is for African buffalo. This measurement is based on what is known as the Mahalanobis distance. The Mahalanobis distance is a measurement between the current habitat pixel being considered and the defined ecological niche calculated in your ENFA. The smaller the distance, the closer that particular location is to the species' preferred niche, making it a desirable location. These prediction values can then be used to identify the best location to establish as a new conservation area for the buffalo based on their environmental and climate preferences.
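
As a rough illustration of the Mahalanobis distance itself, the following sketch uses base R's mahalanobis() function on synthetic data. The temperature and rainfall values are invented, and the sketch does not reproduce how the ENFA prediction defines the niche internally; it only shows how a location that resembles the niche receives a smaller distance than one that does not.

```r
# Synthetic "niche": environmental conditions at locations where a species was seen.
set.seed(1)
niche <- cbind(
  temperature = rnorm(100, mean = 25, sd = 2),
  rainfall    = rnorm(100, mean = 600, sd = 50)
)

# Summarize the niche by its mean vector and covariance matrix.
niche_center <- colMeans(niche)
niche_cov    <- cov(niche)

# Two candidate pixels: one typical of the niche, one far from it.
candidates <- rbind(
  typical  = c(temperature = 25, rainfall = 610),
  atypical = c(temperature = 35, rainfall = 300)
)

# stats::mahalanobis() returns the squared Mahalanobis distance of each row.
d2 <- mahalanobis(candidates, center = niche_center, cov = niche_cov)
d2
```

Unlike a plain Euclidean distance, this measure accounts for the different scales and correlations of the variables, so a 10-unit difference in rainfall is not treated the same as a 10-degree difference in temperature.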

  1. Load the sp package. If necessary, uncomment the install.packages("sp") line.
    #install.packages("sp")
    library(sp)
  2. Create a new spatial pixels data frame from your study area.
    raster_dim <- dim(arc_raster)[c(2,1)]
    grd <- GridTopology(arc_raster$extent[1:2], arc_raster$cellsize, raster_dim)
    spg <- SpatialPixelsDataFrame(grd, data.frame(d=rep(0, raster_dim[1]*raster_dim[2])))
    spg <- spg[pixel_index,1]
  3. Use the predict() function to apply your fitted ecological niche factor analysis model to the spatial pixels data frame you created in step 2, producing a habitat suitability value for each pixel. Then, use the plot() function to view the result.
    habitat_suitability <- predict(enfa_result, spg)
    plot(habitat_suitability)
  4. If necessary, on the lower right box of RStudio, click Plots.

    Show plots in RStudio

    Now that you have produced a prediction map regarding habitat suitability, you can write this output back to ArcGIS to map and visualize before using it to determine another region within your study area that is suitable for African buffalo.

    RStudio plot
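
The subsetting in step 2, spg[pixel_index, 1], keeps only the pixels that had complete data. The following sketch shows the same idea with a plain matrix instead of sp classes: values computed for the valid pixels are placed back at their original positions, and the remaining cells stay NA. The grid size, index values, and suitability values are hypothetical, and the row-by-row pixel numbering is an assumption of this sketch.

```r
# Hypothetical 3-row by 4-column study area, stored as 12 pixels.
n_rows <- 3
n_cols <- 4

# Positions of the pixels that survived na.omit(), and one value for each.
pixel_index <- c(1, 2, 5, 6, 7, 12)
suitability <- c(0.8, 1.5, 0.3, 2.2, 4.0, 9.1)

# Start from an all-NA grid and fill only the valid positions.
full_grid <- rep(NA_real_, n_rows * n_cols)
full_grid[pixel_index] <- suitability

# Reshape to a matrix; byrow = TRUE assumes pixels are numbered row by row.
suitability_matrix <- matrix(full_grid, nrow = n_rows, ncol = n_cols, byrow = TRUE)
suitability_matrix
```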

Write results to ArcGIS

After you have finished your work in R, you can use the bridge to write your desired results back to ArcGIS for further mapping and analysis. First, you'll convert the data back into a raster format that ArcGIS can read.

  1. Define your predictions raster with the same spatial reference as your original data.
    habitat_suitability@proj4string@projargs <- arc.fromWktToP4(arc_raster$sr$WKT)
  2. Use the arc.write() function to transfer your data back to ArcGIS. For the first argument, type the path to your Ecological Niche Factor Analysis file geodatabase followed by the output name, Habitat_Suitability_Raster. For the second argument, enter the R object you want to write back.
    arc.write("C:/African-Buffalo/Ecological Niche Factor Analysis.gdb/Habitat_Suitability_Raster", habitat_suitability)

    Once you run this command, the bridge will write this output data from R to your specified location.

  3. Return to your Ecological Niche Factor Analysis project in ArcGIS Pro.
  4. In the Catalog pane, open your Ecological Niche Factor Analysis geodatabase. If necessary, right-click the geodatabase and click Refresh.
  5. Right-click Habitat_Suitability_Raster and choose Add To Current Map.
  6. In the Contents pane, expand Habitat_Suitability_Raster if necessary, and right-click the color ramp.
  7. Choose the same red to green color scheme as earlier.

    Habitat suitability raster

    The areas in shades of green represent the locations with a small Mahalanobis distance, meaning they are close to the buffalo's preferred environmental habitats. Areas in shades of yellow and red have larger Mahalanobis distances, indicating that the characteristics of these locations make them less suitable for buffalo based on their observed environmental preferences.

You've used the R-ArcGIS bridge to transfer your data from ArcGIS to R. You also used R to prepare your data and perform a principal component analysis and an ecological niche factor analysis before using the bridge to transfer the results back to ArcGIS. You then created a map to visualize your result.


Locate a new conservation region

Previously, you performed an ecological niche factor analysis in R and used the R-ArcGIS bridge to transfer your data and results to and from R. The final results you transferred back to ArcGIS consist of a raster layer for your study area containing measurements on how suitable each location is to African buffalo based on their preferences. Next, you'll use these measurements to locate a new conservation region by using tools in ArcGIS before sharing your results online for others to view.

Adjust habitat suitability scale

To begin this process, you'll start by rescaling your raster according to the habitat suitability of each cell in your study area. The value contained in each cell is the Mahalanobis distance, where the higher the value, the farther the area is from the habitat preferences of African buffalo. To make this more intuitive, you can use ArcGIS to rescale your values so that the higher the value, the more desirable the area to the buffalo.

  1. In the Geoprocessing pane, open the Rescale by Function tool.
  2. For Input raster, choose Habitat_Suitability_Raster, and name the Output raster Habitat_Suitability_Rescaled_Raster.
  3. For Transformation function, choose Linear.
  4. Click Calculate Statistics.

    The minimum, maximum, and other raster statistics fill in.

    Note:

    Because Focal Statistics and other statistical tools don't always sample identical areas, your minimum, maximum, and other statistics may vary.

  5. For From scale, type 2349908, and for To scale, type 4.2202078475384E-05. Click Run.

    These are the maximum and minimum cell values present in your raster layer, in that order.

    Rescale by Function tool

    By selecting Linear and making the extents of your scale the minimum and maximum of your data, you're reversing your values for a descending scale. In the resulting layer, areas in green, or favorable areas for buffalo, display a higher value.
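
The arithmetic behind this reversal can be sketched in a few lines of R. This is a sketch under an assumption: it treats the Linear function as mapping the lowest input value to From scale and the highest input value to To scale. The rescale_linear() helper and the five distance values are hypothetical and are not part of ArcGIS or the bridge.

```r
# Hypothetical Mahalanobis distances for five cells (smaller = more suitable).
distances <- c(0.5, 3, 10, 42, 100)

# Linear rescale: the input minimum maps to from_scale and the input
# maximum maps to to_scale.
rescale_linear <- function(x, from_scale, to_scale) {
  x_min <- min(x, na.rm = TRUE)
  x_max <- max(x, na.rm = TRUE)
  from_scale + (x - x_min) / (x_max - x_min) * (to_scale - from_scale)
}

# Using the maximum as from_scale and the minimum as to_scale flips the order.
rescaled <- rescale_linear(distances,
                           from_scale = max(distances),
                           to_scale   = min(distances))
rescaled
```

With these endpoints, the transformation reduces to maximum + minimum - value, so the cell with the smallest distance receives the largest rescaled value while the overall range stays the same.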

Interpret ENFA results

To identify a new region for conservation from your rescaled habitat suitability raster, you can use ArcGIS to search for the area that best matches the buffalo's environmental preferences, as identified by your ecological niche factor analysis, while placing it outside existing regions. The tool uses a parameterized region-growing algorithm: candidate regions are grown and compared against an evaluation metric to determine the best region under your specified constraints.
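
The idea of comparing candidates against an evaluation metric can be illustrated with a deliberately simplified stand-in. The sketch below does not implement region growing or the Locate Regions algorithm; it scores every fixed-size square window of a made-up suitability matrix by its mean value and selects the highest-scoring one. All values and sizes are hypothetical.

```r
# Hypothetical 6 x 6 suitability surface (higher = more suitable).
set.seed(7)
suitability <- matrix(runif(36), nrow = 6, ncol = 6)
suitability[4:6, 1:3] <- suitability[4:6, 1:3] + 2   # plant a clearly better patch

# Score every 3 x 3 window by its mean suitability.
window <- 3
n_r <- nrow(suitability) - window + 1
n_c <- ncol(suitability) - window + 1
scores <- matrix(NA_real_, nrow = n_r, ncol = n_c)
for (i in seq_len(n_r)) {
  for (j in seq_len(n_c)) {
    scores[i, j] <- mean(suitability[i:(i + window - 1), j:(j + window - 1)])
  }
}

# The best candidate is the window with the highest mean score.
best <- which(scores == max(scores), arr.ind = TRUE)[1, ]

# Output in the same spirit as the tool: 1 = selected, 0 = not selected.
region <- matrix(0L, nrow = nrow(suitability), ncol = ncol(suitability))
region[best[["row"]]:(best[["row"]] + window - 1),
       best[["col"]]:(best[["col"]] + window - 1)] <- 1L
region
```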

  1. In the Geoprocessing pane, open the Locate Regions (Spatial Analyst Tools) tool.
  2. For Input raster, choose Habitat_Suitability_Rescaled_Raster. For Output raster, type New_Conservation_Region.
  3. For Input raster or feature of existing regions, choose African_Buffalo_Locations and click Run.
    Note:

    Because of the size of your study area, the processing might take some time. The tool is not done running until the layer is added to the Contents pane.

    New conservation area

    The Locate Regions tool compares the average suitability scores of the candidate regions and selects the one with the highest score as the optimal region. The output is a new raster layer that contains zeros to represent locations that have not been selected and ones to represent those that have. This area is relatively far from people and close to water, and it has the habitat characteristics that buffalo like. The new region has been established near one of the areas already favored by African buffalo, so you can expand substantially into areas deemed very favorable to the buffalo based on your analysis results. Because you want to share your results as a web layer, you'll make them look more visually appealing.

  4. In the Contents pane, expand New_Conservation_Region if necessary. Right-click the symbol for 0 and choose No Color.
  5. Right-click the symbol for 1 and change the color to Cordovan Brown.

    Cordovan Brown color

  6. On the ribbon, click the Appearance tab. In the Effects group, change the transparency to 55 percent.
  7. Check African_Buffalo_Locations to turn the layer on, and expand it if necessary.
  8. Right-click the circle symbol and choose Cocoa Brown (to the right of Cordovan Brown).

    Layer symbology

    The transparency of the layer allows you to see that some of the suggested area overlaps existing reserves and national parks. This is the data you'll share as a web layer to show others your findings and do more analysis on the new conservation area.

    Conservation area symbology

Share results as a web layer

Now that you've completed your analysis, you can prepare these results to be shared so others can examine your work, provide feedback, and use your workflow to replicate results or perform their own analyses.

  1. In the Contents pane, right-click New_Conservation_Region and click Sharing. Choose Share As Web Layer.
  2. In the Share As Web Layer pane, name your layer African_Buffalo_Conservation_Areas. Add your initials to differentiate your results from the rest of the organization.
  3. For Layer Type, choose Tile.
  4. For Summary, type Suggested conservation area, calculated using an Ecological Niche Factor Analysis conducted using the R-ArcGIS Bridge. This result measures habitat suitability for African buffalo observed in Kruger National Park based on data from the Ecological Land Unit dataset. This region accounts for the buffalo's ideal habitat, including bioclimates, soil type, land cover, and distance to water. The values are a rescaled Mahalanobis distance calculated from performing an ecological niche factor analysis in R using the adehabitatHS package.
  5. For Tags, type relevant search terms, such as African Buffalo, South Africa, ENFA, and Ecological Land Units. Press Enter after each tag.
  6. For Sharing Options, check the box for Everyone.

    Share As metadata

  7. Click the Content tab.
  8. Make sure that African_Buffalo_Conservation_Areas is the only layer listed under My Content.

    Layer to share as web layer

  9. Under Finish Sharing, click Analyze. There are no warnings, so click Publish.
  10. When the publishing process is finished, click Manage Web Layer.

    The results of your analysis will be available as a web layer that you can share with others to look at your results and perform further analysis, if necessary.

Challenge

The R-ArcGIS bridge can be integrated within ArcGIS in multiple ways. One of these includes creating script tools that can be used like any other geoprocessing tool in ArcGIS. Turn your R script from this lesson into a script tool that can then be used in ArcGIS for others to reproduce your results or to perform their own ecological niche factor analysis. The benefit of a script tool is that users of your tool can easily interact with R functionality. Since script tools honor selections made by users in Pro, they can be included in models along with other ArcGIS or R/Python script tool functionality. They can also be shared as a web tool or GP service and offer a tremendous amount of flexibility when it comes to applying powerful analytical methods. To help you get started, consider checking out one of the sample script tools included on the R-ArcGIS bridge GitHub page, or take our web course on creating script tools with the R-ArcGIS bridge.
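
As a starting point for the challenge, here is a minimal skeleton of how the lesson script might be wrapped for a script tool. It assumes the tool_exec(in_params, out_params) entry point used by the bridge's sample tools. The parameter order (one input raster path, one output raster path) and the use of the last band as the presence band are hypothetical choices for this sketch, and the body repeats the lesson's calls without adding error handling.

```r
# Skeleton of an R script tool: ArcGIS calls tool_exec() with the tool's
# input and output parameters as lists.
tool_exec <- function(in_params, out_params) {
  library(arcgisbinding)
  library(ade4)
  library(adehabitatHS)
  library(sp)

  # Hypothetical parameter order: first input = multiband raster path,
  # first output = path of the suitability raster to create.
  input_raster_path  <- in_params[[1]]
  output_raster_path <- out_params[[1]]

  # Read the raster and convert its pixels to a data frame.
  arc_raster <- arc.raster(arc.open(input_raster_path))
  variables <- as.data.frame(arc_raster$pixel_block())
  presence_col <- ncol(variables)            # assumes presence is the last band
  variables$pixel_index <- seq_len(nrow(variables))
  variables <- na.omit(variables)
  pixel_index <- variables$pixel_index
  variables$pixel_index <- NULL

  # PCA followed by ENFA, as in the lesson.
  dudi_obj <- dudi.pca(variables[, -presence_col], scannf = FALSE, nf = 10)
  enfa_result <- enfa(dudi_obj, variables[, presence_col], scannf = FALSE, nf = 2)

  # Build the prediction grid and predict habitat suitability.
  raster_dim <- dim(arc_raster)[c(2, 1)]
  grd <- GridTopology(arc_raster$extent[1:2], arc_raster$cellsize, raster_dim)
  spg <- SpatialPixelsDataFrame(grd, data.frame(d = rep(0, prod(raster_dim))))
  spg <- spg[pixel_index, 1]
  habitat_suitability <- predict(enfa_result, spg)

  # Restore the spatial reference and write the result back to ArcGIS.
  habitat_suitability@proj4string@projargs <- arc.fromWktToP4(arc_raster$sr$WKT)
  arc.write(output_raster_path, habitat_suitability)

  return(out_params)
}
```

Taking the paths from in_params and out_params instead of hard-coding them is what lets other users run the same analysis on their own rasters from the Geoprocessing pane.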

Based on the results of this analysis, you've identified more land that could be conserved to promote the growth of the African buffalo. Conserving the identified land could have major benefits to the community. African buffalo are a major tourist attraction, and because the habitat area is between Kruger National Park in South Africa and Limpopo National Park in Mozambique, the infrastructure for tourism already exists. Additionally, because you know African buffalo should be kept away from domesticated cattle, this area could be quarantined to ensure no diseases spread between the species.

Through this lesson, you learned how to install and set up the R-ArcGIS bridge and how to use the bridge to transfer data between ArcGIS and R, and you have seen one of the possible ways R can enhance your ArcGIS workflows through its powerful statistical libraries.

You can find more lessons in the Learn ArcGIS Lesson Gallery.