Download species observation data

The first step is acquiring observation data, or presence data, for your species of interest. In this section, you'll download data from the Global Biodiversity Information Facility (GBIF), which combines observation data from multiple sources for scientific use. Then, you'll explore the observation points in ArcGIS Pro to ensure that the license and data types are correct for the types of analysis you want to perform. You'll also check for duplicate data to prevent overrepresentation in your analysis.

Set up your ArcGIS project

First, you'll set up the ArcGIS Pro project where you'll be working with the data. Then, you'll add two items from ArcGIS Living Atlas. The first is a custom geoprocessing tool, the Download Species Occurrence Points tool, which will allow you to download species points for a selected area of interest from GBIF directly to your ArcGIS Pro project. The second is a layer showing the national boundary of Spain, which you'll use to clip data later in the tutorial.

  1. Start ArcGIS Pro. If prompted, sign in using your licensed ArcGIS organizational account.
    Note:

    If you don't have access to ArcGIS Pro or an ArcGIS organizational account, see options for software access.

    When you open ArcGIS Pro, you're given the option to create a new project or open an existing one. If you've created a project before, you'll see a list of recent projects.

  2. Under New Project, click Map.

    Create a project using the Map template.

  3. In the New Project window, for Name, type EuropeanBadger_Habitat. Leave Location unchanged and confirm that Create a folder for this project is checked.
  4. Click OK.

    Now, you'll add two items from the portal. The first is the Download Species Occurrence Points geoprocessing sample, and the second is a layer showing the boundaries of Spain. This layer will be used to clip and constrain environmental data.

  5. On the ribbon, click the View tab. In the Windows group, choose Catalog Pane.

    Open the Catalog pane.

    The Catalog pane appears. The Catalog pane can be used to add items to a project; view, create, and manage items; and get information about item properties.

  6. In the Catalog pane, click the Portal tab and choose Living Atlas.

    Choose the Living Atlas portal.

  7. In the search bar, search for Download Species Occurrence Points or paste the item ID 927944e867624504bfd6c489b0d2aec7 and press Enter.
  8. Right-click the Download Species Occurrence Points geoprocessing sample and choose Add To Project.

    Add the Download Species Occurrence Points tool to the project.

    The geoprocessing sample is added to the project.

  9. Search for the Spain Country Boundary layer. Find the Spain Country Boundary feature layer owned by esri_dm and drag the result onto the map.

    Add the Spain Country Boundary layer to the map.

    The layer draws on the map and is added to the Contents pane as ESP_Country.

Download animal observation data from GBIF

Next, you'll download animal observation data from the Global Biodiversity Information Facility (GBIF) using the Download Species Occurrence Points geoprocessing sample. GBIF is a global data repository that collects data on where species have been recorded. Data is included from multiple sources, such as iNaturalist, and formatted into a common schema for broad use. First, you'll open the species page on the GBIF site to confirm the genus and species of the European badger. The Download Species Occurrence Points geoprocessing sample requires this information to be entered using correct nomenclature, with the genus name capitalized and the species name in lowercase.

Note: Depending on the protection status of the species, location data may be obscured to prevent poaching or other interference. The European badger is classified by the IUCN Red List as Least Concern, so location data is not obscured.

  1. Open the GBIF page for Meles meles.

    The species overview page appears. This page shows information about the badger including photos that have been submitted with occurrence records, a map of where sightings have occurred, and a description of the animal's activity and ecology.

  2. Scroll down and read the Description information, paying attention to the Activity and Biology Ecology sections.
  3. Click the Metrics tab and explore statistics about sightings.

    Click the Metrics tab.

    Badgers have been widely sighted across Europe, primarily in the warmer months. In colder climates, badgers hibernate to escape winter weather. Next, you'll download occurrence data to your project.

  4. In ArcGIS Pro, in the Catalog pane, click the Project tab and expand the Toolboxes group.

    There are two toolboxes in the project, a default one added when you created the project, and DownloadSpeciesOccurrencePoints.pyt.

  5. Expand the DownloadSpeciesOccurrencePoints.pyt toolbox, then right-click Download Species Occurrence Points and choose Open.

    Open the Download Species Occurrence Points tool.

    The Download Species Occurrence Points (GBIF) tool opens.

  6. For Scientific Name, type Meles meles.

    Note:
    The Scientific Name parameter requires correct capitalization. Ensure that the genus name is capitalized and the species name is lowercase.

    Next, you'll draw a study area. The Download Species Occurrence Points (GBIF) tool requires areas of interest to have fewer than 300 vertices. The Spain boundary polygon you added to the map is too detailed for this tool, so you'll sketch a rough polygon.

  7. For Study Area, click the drawing tool and choose Polygons.

    Sketch a polygon for the study area parameter.

    Different editing templates appear below the Study Area parameter. By default, the Polygon template is selected, but you can also choose other templates such as circles, rectangles, and freehand lines.

  8. Click the map to draw a rough polygon around Spain.

    Study area around Spain

    The sketched study area can be rough, as long as it encompasses all of Spain. After downloading the data, you'll use the Pairwise Clip tool to extract only the observation points within the country borders.

  9. For Output Species Occurrence Points, type Melesmeles_GBIF_[date], substituting in the date you downloaded the file. Check the Generate and register DOI box.

    When the box is checked, a Digital Object Identifier (DOI) will be generated and registered with GBIF.org to act as a permanent address to a description of the occurrence records downloaded via the tool. This is required for proper attribution of the data source.

  10. Click Run.

    When the tool is finished running, the Melesmeles_GBIF_[date] layer is added to the Contents pane.

    Species observation points
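The two input constraints mentioned in the steps above (genus capitalized and species lowercase, and a study area with fewer than 300 vertices) can be checked before you run the tool. The following is a minimal Python sketch, not part of the Download Species Occurrence Points sample. The function names, the simplified name pattern, and the sample polygon are assumptions made for illustration.

```python
import re

# Simplified pattern: capitalized genus, one space, lowercase species.
# Real names can include hyphens or subspecies, which this sketch ignores.
BINOMIAL_PATTERN = re.compile(r"^[A-Z][a-z]+ [a-z]+$")

MAX_VERTICES = 300  # the tutorial states the tool needs fewer than 300 vertices


def is_valid_binomial(name):
    """Return True if name looks like 'Genus species' with correct capitalization."""
    return bool(BINOMIAL_PATTERN.match(name))


def has_allowed_vertex_count(polygon):
    """Return True if a polygon, given as a list of (x, y) tuples, is simple enough."""
    return len(polygon) < MAX_VERTICES


# Hypothetical rough study area around Spain in longitude/latitude.
rough_study_area = [(-10.0, 35.5), (4.5, 35.5), (4.5, 44.5), (-10.0, 44.5)]

print(is_valid_binomial("Meles meles"))
print(is_valid_binomial("meles Meles"))
print(has_allowed_vertex_count(rough_study_area))
```

A four-vertex sketch like the one above is well under the limit, which is why a rough hand-drawn polygon works where the detailed Spain boundary does not.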

Explore animal observation data

Next, you'll explore the badger observation data you added to ArcGIS Pro. You'll analyze the data you extracted and ensure that the license and data types are correct for the types of analysis you want to perform. You'll also clip the output layer to create a dataset that contains only points in your area of interest, Spain.

  1. In the Contents pane, right-click the Melesmeles_GBIF_[date] layer and choose Attribute Table.

    Open the table.

    The table contains information about each sighting. Each observation point has descriptive information, including a unique identifier, method of observation, and license type. For this tutorial, you'll only use data with license types that allow public use. These licenses are CC0 1.0, which indicates data that is in the public domain, and CC BY 4.0, which denotes data that can be shared and adapted as long as there is attribution of the original data source or owner and a description of what has been changed. You'll delete the records that don't match these license types.

    Depending on the organization or purpose you're preparing this data for, you may be able to use data licensed under different terms. For example, if you're working on academic research, it's likely that you're able to use all the data in the layer, including the records designated for noncommercial use. When in doubt, only use data you're certain is in the public domain.

  2. On the ribbon, if necessary, click the Map tab. In the Selection group, click Select By Attributes.

    The Select By Attributes tool opens.

  3. In the Select By Attributes tool, build the query Where license is equal to http://creativecommons.org/licenses/by-nc/4.0/legalcode.

    Filter query

  4. Click OK.

    Points that have noncommercial licensing are now selected on the map and in the attribute table. Selected records are highlighted in blue.

  5. At the top of the attribute table, click Delete. In the Delete window, click Yes.

    The highlighted records are deleted from the layer. Now that the dataset contains only points that you have permissions to use, you'll use the Pairwise Clip tool to remove records outside of Spain.

  6. In the Geoprocessing pane, click the back button. Search for and open the Pairwise Clip tool.

    When the Pairwise Clip tool opens, a banner at the top of the tool warns you that the layer has pending edits.

  7. In the Pairwise Clip tool, click Save Edits.

    Save edits made to the layer.

    The edits you made to the layer are now saved and the points with noncommercial licensing are permanently deleted from the layer.

  8. For Input Features, choose the Melesmeles_GBIF_[date] layer. For Clip Features, choose ESP_Country.
  9. Name the Output Feature Class EuropeanBadger_points and click Run.

    Run the Pairwise Clip tool.

  10. In the Contents pane, uncheck the Melesmeles_GBIF_[date] and Download Species Occurrence Points (GBIF) Study Area (Polygons) layers and close the attribute table. Right-click the new EuropeanBadger_points and choose Attribute Table.

    Study area points

    At the time this data was processed, there were 2,678 observation points that met the licensing and other selection requirements. Your dataset might be different.
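The license filter and clip performed in the steps above could also be scripted. The sketch below is one possible approach under stated assumptions, not the tutorial's method: the helper names are hypothetical, the field name license is taken from the query shown earlier, and the arcpy function only runs inside an ArcGIS Pro Python environment. It deletes rows with an update cursor rather than a selection, so an empty match can't accidentally delete every feature.

```python
NONCOMMERCIAL_LICENSE = "http://creativecommons.org/licenses/by-nc/4.0/legalcode"


def build_license_where_clause(field, license_url):
    """Build an SQL where clause that matches one license URL."""
    escaped = license_url.replace("'", "''")  # escape single quotes for SQL
    return f"{field} = '{escaped}'"


def remove_license_and_clip(points, boundary, output, where_clause):
    """Delete matching points, then clip the remainder to the boundary.

    Requires ArcGIS Pro. arcpy is imported here so the helper above
    can be used and tested without it.
    """
    import arcpy

    # Delete only the rows that match the where clause.
    with arcpy.da.UpdateCursor(points, ["OID@"], where_clause) as cursor:
        for _ in cursor:
            cursor.deleteRow()
    arcpy.analysis.PairwiseClip(points, boundary, output)


where = build_license_where_clause("license", NONCOMMERCIAL_LICENSE)
print(where)
```

In the Python window of ArcGIS Pro, a call might look like remove_license_and_clip("Melesmeles_GBIF_[date]", "ESP_Country", "EuropeanBadger_points", where), with the layer name replaced by your own.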

Check field types

Now that the data has been clipped to only contain the points within your area of interest, you'll check that you have all the field types you need for analysis. Depending on how data is collected and processed, it's common to find that numeric data is stored as text, or dates are stored as numbers. In this analysis, the date field is important to account for potential seasonal shifts. To ensure that the date field is in the right format for later analysis, you'll check the Fields view. Then, you'll create a Calendar Heat Chart to evaluate whether there are seasonal patterns to the observations.

  1. On the ribbon, click the Table tab. In the Field group, click Fields.

    The Fields view opens. This table lists all the attribute fields in the EuropeanBadger_points layer, their alias, data type, and other information.

  2. Scroll through the Fields view until you see the eventDate attribute. Confirm that the Data Type is set to Date Only.

    Date Only field type

    The eventDate field is in the correct format for use in the analysis. If this field wasn't already present, you could also calculate a new date field by concatenating the year, month, and day fields in the table. Notice that these fields are currently stored as numeric fields.

  3. Close the Fields view.

    Now that you've confirmed the data type for the date field, you can use it to create a chart.

  4. In the Contents pane, right-click the EuropeanBadger_points table, click Create Chart, and choose Calendar Heat Chart.

    Create a Calendar Heat Chart.

    A blank Chart window appears.

  5. In the Chart Properties pane, for Date, choose the eventDate field.

    Chart the eventDate field.

    The chart populates, showing a heat chart of the month and day when sightings took place. Sightings occurred year-round, with more occurring during the cooler fall and winter months.

    Chart of badger observations by month
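As noted above, a date field can be rebuilt from the numeric year, month, and day fields when no date field exists, and a seasonal pattern comes down to counting observations by month. The following Python sketch illustrates both ideas on hypothetical rows. It isn't how the Calendar Heat Chart is computed, and the function names are assumptions.

```python
from collections import Counter
from datetime import date


def build_event_date(year, month, day):
    """Combine numeric year, month, and day values into a date, or None if incomplete."""
    if None in (year, month, day):
        return None
    return date(int(year), int(month), int(day))


def count_by_month(dates):
    """Count observations per calendar month (1-12), skipping missing dates."""
    return Counter(d.month for d in dates if d is not None)


# Hypothetical attribute rows: (year, month, day).
rows = [(2019, 11, 3), (2020, 1, 15), (2020, 11, 20), (2021, None, None)]
event_dates = [build_event_date(*row) for row in rows]
monthly_counts = count_by_month(event_dates)
print(sorted(monthly_counts.items()))
```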

Remove excess observation points

Another consideration during data preparation is repetition in the dataset. Depending on the collection method, some of the points might be overrepresentative of badger locations, such as studies tracking animal movement rather than reporting single sightings. You'll check for clusters of points and remove any that might bias the analysis.

Note:

If you plan to use an analysis method such as Presence Only Prediction (MaxEnt), the data preparation you'll do in this section is included within the geoprocessing tool. But if you plan to use other analysis methods such as regression, these are necessary steps to prepare your data.

  1. On the map, zoom in to the cluster south of Seville.

    Zoom in to the cluster of points south of Seville.

    This cluster of points lies within Doñana National Park and appears to represent animal tracks, which means that each set of points likely represents a single animal.

  2. On the ribbon, on the Map tab, in the Selection group, click Select and draw a rectangle on the map around the points within Doñana National Park.
  3. At the bottom of the attribute table, click Show Selected Records.

    Show Selected Records

    The table is filtered to show only the selected records. Depending on how you drew your selection, roughly half of the observation points fall within the national park, and you can see that most of the points were gathered through a tracking study. To avoid overrepresenting this area in future analysis, you'll thin these points.

  4. In the Geoprocessing pane, search for and open the Delete Identical tool.

    At the top of the tool is a warning that the tool modifies the input dataset. The Delete Identical tool permanently removes points from the input feature layer. The Melesmeles_GBIF_[date] layer isn't modified, so you can still access the deleted points there if you need them.

  5. For Input Dataset, choose EuropeanBadger_points and leave the Use the selected records toggle button turned on.
  6. For Fields, choose Shape. For XY Tolerance, choose 500 Meters and click Run.

    Delete identical records.

    When the tool finishes running, the attribute table will need to be refreshed because some of the selected records are now deleted.

  7. Close the Attribute table.
  8. On the ribbon, in the Selection group, click Clear to remove the selection.

    There are still many points within the national park, but they've been thinned.

  9. Reopen the Attribute table and check the number of remaining presence points that is listed at the bottom of the table.

    Depending on the points you had selected, this number may vary. Generally, you'll want to create the same number of background points as you have observations, so make sure to check your specific points.

  10. Save the project.
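The effect of thinning with a 500-meter tolerance can be illustrated with a simple greedy filter: a point is kept only if it lies farther than the tolerance from every point already kept. This Python sketch shows the idea only. It is not the algorithm used by the Delete Identical tool, and the function name and sample coordinates are hypothetical.

```python
import math


def thin_points(points, tolerance):
    """Keep a point only if it is farther than tolerance from every kept point.

    points are (x, y) tuples in a projected coordinate system whose units
    match tolerance (for example, meters).
    """
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) > tolerance for kx, ky in kept):
            kept.append((x, y))
    return kept


# Hypothetical tracking fixes in meters: three close together, one far away.
track = [(0, 0), (100, 50), (300, 300), (2000, 0)]
thinned = thin_points(track, tolerance=500)
print(thinned)
```

The loop compares every point with every kept point, which is fine for a few thousand observations but would need a spatial index for much larger datasets.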

In this section, you downloaded Meles meles observation data to ArcGIS Pro using the Download Species Occurrence Points (GBIF) tool. You then did some initial data preparation, removing observation points with restrictive licensing and points outside your area of interest, checking field types, and removing duplicative points. In the next section, you'll add environmental variables to prepare the data for analysis.


Map presence and pseudo absence points

Species distribution modeling can be done in several ways using several statistical methods. Many of these methods require both presence and absence data, or in your case, pseudo absence data. They also require environmental data to determine what kinds of climate and habitat conditions are suitable for the animal species. Now that you have presence data downloaded and cleaned, you can generate the pseudo absence or background data and extract environmental attribute data at each location.

Generate randomly sampled pseudo absence points

Now that your presence data is ready, you'll generate pseudo absence points. The simplest method is to use random generation within the study area. To ensure that presence and pseudo absence points are equally weighted, you'll create the same number of background points as you have presence points.

  1. In the Geoprocessing pane, search for and open the Create Spatial Sampling Locations tool.

    The Create Spatial Sampling Locations tool generates sample locations within a continuous study area using simple random, stratified, systematic (gridded), or cluster sampling designs.

  2. Enter the following parameters and click Run:

    • Input Study Area: ESP_Country
    • Output Features: ESP_randomsample
    • Sampling Method: Simple random
    • Number of Samples: The number of points in your EuropeanBadger_points table

    Create a layer of randomly sampled points.

    The layer of random points within Spain is added to the map. This dataset can now be combined with your EuropeanBadger_points layer.

    Randomly sampled points

  3. Close the EuropeanBadger_points attribute table.
  4. In the Geoprocessing pane, search for and open the Merge tool.
  5. For Input Datasets, choose ESP_randomsample and EuropeanBadger_points. For Output Dataset, type badger_sample_set.

    Within the Merge tool, you can decide what fields to add to the new layer, and you can create new ones. You'll add a new field named Presence that you'll use to differentiate the observation points from the GBIF data and the background points from the random sample.

  6. For Field Matching Mode, choose Use the field map to reconcile field differences.
  7. For Field Map, click the Add Fields drop-down menu and choose Add Empty Field.

    Add an empty field.

  8. Rename the NewField to Presence and press Enter.

    By default, the Presence field is set to be a Text field.

  9. Point to the Presence field and click Edit.

    Edit the Presence field.

  10. In the Field Properties window, click Type and choose Short.

    Set the Presence field type to Short.

  11. In the Field Properties window, click OK, then run the Merge tool.

    Note:

    The Presence field will have a warning indicating that it's empty.

  12. In the Contents pane, uncheck ESP_randomsample and EuropeanBadger_points to turn the layers off. Right-click the badger_sample_set layer and click Attribute Table.

    To distinguish the presence and absence points in your new layer, you'll calculate values for the Presence field. Typically, presence points are shown with a value of 1 and background points are given a value of 0. As you scroll through the table, notice that the merged points have a lot of null data fields. You'll use these null fields to select the background points.

  13. In the table, click Select By Attributes. In the Select By Attributes window, build the expression Where kingdom is null and click Apply.

    Select points where kingdom attribute is null.

  14. In the attribute table, scroll until you see the Presence field. Right-click the Presence column name and choose Calculate Field.

    Calculate the Presence field.

  15. In the Calculate Field window, for Presence =, type 0 and click OK.
  16. In the Select By Attributes window, check the Invert Where Clause box and click OK.

    Invert the Where clause.

  17. Right-click the Presence column name and choose Calculate Field. Build the expression Presence = 1 and click OK.

    Now the features are coded with a 1 value for observed presence, and 0 value for pseudo absence.

  18. On the ribbon, click Clear to clear the selection. Close the attribute table and save the project.
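The workflow above (draw as many random background points as there are observations, merge the two sets, and code them 1 and 0) can be sketched in plain Python. This is an illustration under assumptions, not the implementation of the Create Spatial Sampling Locations or Merge tools: it uses rejection sampling in planar coordinates, and the function names, the triangular study area, and the observation points are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test for a point against a polygon given as (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points_in_polygon(polygon, count, seed=None):
    """Draw count uniformly random points inside polygon by rejection sampling."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


def merge_with_presence(presence_points, background_points):
    """Label observations with Presence = 1 and background points with Presence = 0."""
    merged = [{"x": x, "y": y, "Presence": 1} for x, y in presence_points]
    merged += [{"x": x, "y": y, "Presence": 0} for x, y in background_points]
    return merged


# Hypothetical triangular study area and two observation points.
study_area = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
observations = [(1.0, 1.0), (2.0, 3.0)]
background = random_points_in_polygon(study_area, len(observations), seed=42)
sample_set = merge_with_presence(observations, background)
print(len(sample_set))
```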

Prepare environmental data

Next, you'll locate and prepare environmental variables that might help determine the presence of badgers. Remember from GBIF that badgers prefer good vegetation cover within foraging habitats. From the animal description in GBIF, you know that in central Spain, badgers prefer mid-elevation mountain areas with woodland and pastures, and avoid lower elevations.

  1. Download the SpainPortugalElev.zip file to your computer and unzip it to the ArcGIS project folder you're working in.

    This file contains one tif file named SpainPortugalElev. This file was created from two raster images downloaded from USGS EROS Archive - Digital Elevation - Global Multi-resolution Terrain Elevation Data 2010 that were mosaicked together to cover the whole of Spain, then clipped to the countries of Spain and Portugal. For more information on creating a mosaic dataset, refer to the documentation. You'll use this raster image to create a slope dataset for Spain.

    You can access more detailed slope and elevation data from ArcGIS Living Atlas. However, because of data export limitations, which limit exports to 4,000x4,000 pixels at a time, the ArcGIS Living Atlas data isn't the best choice for a study area this large.

  2. In the Catalog pane, click the Project tab and expand the Folders group, then expand the EuropeanBadger_Habitat project folder.
  3. Locate the SpainPortugalElev.tif image you unzipped, and drag it onto the map.

    Note:

    If you're prompted to build pyramids and calculate statistics for the layer, click OK.

  4. In the Contents pane, uncheck the badger_sample_set and ESP_Country layers to turn them off.

    Elevation data

    The elevation raster draws on the map. You can use this raster to calculate slope, another variable that may help determine badger habitat.

    Note:
    If the image doesn't immediately appear, make sure the layer is selected in the Contents pane. On the ribbon, click the Raster Layer tab, then in the Rendering group, click DRA. DRA is short for dynamic range adjustment, which automatically adjusts your active stretch type as you navigate around your image based on the pixel values in your current display.

  5. In the Geoprocessing pane, search for and open Surface Parameters (Spatial Analyst Tools).
  6. Enter the following parameters and click Run:

    • Input surface raster: SpainPortugalElev.tif
    • Output Raster: Spain_Slope
    • Input analysis mask: ESP_Country
    • Parameter type: Slope
    • Local surface type: Quadratic
    • Slope measurement: Degree

    The Spain_Slope layer is added to the map. It shows slope values in degrees.

    Slope layer derived from the elevation raster

    The next environmental layer you want to find is land cover. For this, you'll use the European Space Agency's WorldCover 2021 data. WorldCover maps 11 land cover types.

  7. In the Catalog pane, click the Portal tab and choose Living Atlas.
  8. Search for the ESA WorldCover layer and drag it onto the map.

    Add land cover to the map.

    The WorldCover layer draws on the map. This layer contains 11 land cover classes at 10-meter resolution.

    ESA WorldCover layer showing land cover for Spain
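To see what a slope value in degrees means, it can help to compute one by hand from a small elevation window. The sketch below uses central differences on a 3x3 window, which is a deliberate simplification. The Surface Parameters tool fits a quadratic surface instead, so its results will differ. The function name and the sample window are hypothetical.

```python
import math


def slope_degrees(window, cell_size):
    """Slope in degrees at the center of a 3x3 elevation window.

    window is three rows of three elevations (north row first); cell_size
    is in the same linear units as the elevations.
    """
    dz_dx = (window[1][2] - window[1][0]) / (2 * cell_size)  # east minus west
    dz_dy = (window[0][1] - window[2][1]) / (2 * cell_size)  # north minus south
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical window: elevation rises 10 m per 10 m cell toward the east.
window = [
    [100, 110, 120],
    [100, 110, 120],
    [100, 110, 120],
]
print(round(slope_degrees(window, cell_size=10), 1))
```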

Prepare bioclimatic data

In addition to slope, elevation, and land cover, other variables that may help model badger habitat are bioclimatic. You'll add several layers from the CHELSA Bioclimate Projections project. The CHELSA layers provide downscaled estimates of climate and bioclimate variables averaged over 30-year periods 2011-2040, 2041-2070, and 2071-2100 based on CMIP6 ISIMIP3b. The CMIP6 ISIMIP3b climate experiments use Shared Socioeconomic Pathways (SSPs) to model future climate scenarios. For this study, you'll use SSP3-7.0 over the period 2011-2040, or early century. This scenario is appropriate, as it closely resembles the current global situation, including conflict between countries, wealth disparities, and social inequity. All time horizons and SSP scenarios can be accessed from the Multidimensional tab.

  1. In the Catalog pane, click the Portal tab and choose Living Atlas.
  2. In the search bar, type CHELSA and press Enter.

    The 19 CHELSA Bioclimate Projection layers appear. These bioclimate predictors were defined by the USGS. Depending on the analysis you want to run, the study area, and species behavior, all of these bioclimate layers may be applicable. For the purpose of this tutorial, you'll select three to add to the badger_sample_set layer. If you choose to add more variables, your processing times may be longer.

  3. Press the Ctrl key and click the following CHELSA layers to select them, then drag them onto the map.

    • Annual Mean Temperature (Bio1)
    • Annual Precipitation (Bio12)
    • Temperature Seasonality (Bio4)

    Add the Bioclimate Projections layers.

    The three bioclimate layers are added to the map. Each layer contains three scenarios: SSP2-4.5, SSP3-7.0, and SSP5-8.5, to model potential future conditions depending on greenhouse gas emissions, political and social policy, and other changes. This information is stored in each layer in multidimensional format. For modeling in this project, you'll use SSP3-7.0 over the period 2011-2040. This scenario represents high emissions and the average change that may occur compared to historical climate averages in the early-century time frame.

  4. In the Contents pane, click the Annual Mean Temperature (Bio1) layer to select it. On the ribbon, click the Multidimensional tab.
  5. In the Current Display Slice group, change Variable to SSP370 and StdTime to 2025-12-01T00:00:00.

    Multidimensional raster settings

    Note:

    The time periods 1981-2010, 2011-2040, 2041-2070, and 2071-2100 are denoted in the multidimensional raster by their mid-point years as 1995, 2025, 2055, and 2085.

  6. For the Annual Precipitation (Bio12) and Temperature Seasonality (Bio4) layers, change Variable to SSP370 and StdTime to 2025-12-01T00:00:00.

    The bioclimate variables are now ready to be added to the badger sample set.
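The naming conventions described above (a scenario label such as SSP3-7.0 stored as the variable SSP370, and each 30-year period denoted by its mid-point year) can be expressed as two small helpers. This is a hedged sketch with hypothetical function names. Only the SSP370 variable name is confirmed by the steps above; the same pattern is assumed for the other scenarios.

```python
def variable_name(scenario):
    """Convert a scenario label such as 'SSP3-7.0' to the variable name 'SSP370'."""
    return scenario.replace("-", "").replace(".", "")


def midpoint_year(period):
    """Return the mid-point year of a period such as '2011-2040'."""
    start, end = (int(part) for part in period.split("-"))
    return (start + end) // 2


for period in ("1981-2010", "2011-2040", "2041-2070", "2071-2100"):
    print(period, midpoint_year(period))
print(variable_name("SSP3-7.0"))
```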

Extract environmental data

Now that you have habitat and bioclimate data in the project, you'll use the Extract Multi Values to Points tool to get the raster values for each point location. These values are appended to the input table.

  1. In the Geoprocessing pane, search for and open the Extract Multi Values to Points tool.
  2. For Input point features, choose badger_sample_set.
  3. For Input rasters, click Add Many and choose Toggle All Checkboxes.

    Add many rasters.

    Six raster layers are added to the Input rasters parameter: Spain_Slope, SpainPortugalElev, LandCover, Annual Precipitation (Bio12), Temperature Seasonality (Bio4), and Annual Mean Temperature (Bio1).

  4. For Output field name, rename the fields as follows, matching each name to its raster:

    • annual_mean_temp
    • annual_precip
    • temp_seasonality
    • landcover
    • slope
    • elevation

    Before running the tool, you'll set the Processing Extent to the Spain country boundary you've been using. Because the WorldCover and Bioclimate layers are global datasets, setting the Processing Extent will help you to extract only the data you need.

  5. Click the Environments tab.

    Click the Environments tab.

  6. Expand the Processing Extent group. Click Extent of a Layer and choose the ESP_Country layer.

    Set the processing extent.

  7. Click Run.

    This tool may take some time to run. When the tool is done, the badger_sample_set layer has six new variables in the attribute table. You'll use the Data Engineering tools to explore the data you just added.
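The extraction step above could also be run from the Python window in ArcGIS Pro. The sketch below pairs each raster layer with its output field name and passes the pairs to the Extract Multi Values to Points tool, with the processing extent limited to the boundary layer. The helper names are hypothetical, the layer names are taken from the tutorial text and may differ in your Contents pane, and the arcpy function requires ArcGIS Pro with a Spatial Analyst license.

```python
def build_raster_field_pairs(field_names_by_raster):
    """Turn a {raster layer: output field} mapping into [[raster, field], ...] pairs."""
    return [[raster, field] for raster, field in field_names_by_raster.items()]


def extract_to_points(points, field_names_by_raster, extent_layer):
    """Append one field per raster to the point layer (requires ArcGIS Pro)."""
    import arcpy  # imported here so the helper above works without ArcGIS Pro

    arcpy.CheckOutExtension("Spatial")
    # Limit processing to the extent of the boundary layer.
    arcpy.env.extent = arcpy.Describe(extent_layer).extent
    pairs = build_raster_field_pairs(field_names_by_raster)
    arcpy.sa.ExtractMultiValuesToPoints(points, pairs, "NONE")


# Layer names as listed in the tutorial; adjust them to match your Contents pane.
FIELD_NAMES = {
    "Annual Mean Temperature (Bio1)": "annual_mean_temp",
    "Annual Precipitation (Bio12)": "annual_precip",
    "Temperature Seasonality (Bio4)": "temp_seasonality",
    "LandCover": "landcover",
    "Spain_Slope": "slope",
    "SpainPortugalElev": "elevation",
}

print(build_raster_field_pairs(FIELD_NAMES)[0])
```

Keeping the raster-to-field mapping in one dictionary makes it harder to assign a field name to the wrong raster, which is easy to do when the tool lists the rasters in a different order.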

Use data engineering

Next, you'll use the Data Engineering tools to explore the data. With the Data Engineering tools in ArcGIS Pro, you can explore, visualize, clean, and prepare your data for analysis. In this section, you'll use Data Engineering tools to better understand the environmental variables you've extracted to your sample set.

  1. In the Contents pane, right-click the badger_sample_set layer and choose Data Engineering.

    Open the Data Engineering tools.

    The Data Engineering view opens. The type of data preparation you choose to do depends on the type of modeling you want to use to create your habitat suitability model. For example, if you're planning to use regression analysis, you can use the Transform tool to transform skewed data to a normal distribution.

  2. In the Fields pane, hold the Shift key and select all six fields you just extracted to the sample set: annual_mean_temp, annual_precip, temp_seasonality, landcover, slope, and elevation.
  3. Drag the selected fields into the empty Statistics pane in the middle of the window.

    Add fields to the Statistics pane.

    The environmental data you've collected for your habitat modeling project is added to the Statistics pane.

  4. In the Data Engineering view, on the ribbon, click Calculate.

    Calculate statistics

    Statistics for the fields are calculated, including the mean, unique values, and outliers. You can use these statistics to start identifying patterns in your data.

  5. In the Statistics pane, scroll to the Outliers column.

    The field with the most statistical outliers is annual precipitation.

  6. Right-click the Outliers record for the annual_precip field and choose Select Outliers.

    Select outliers on the map.

    The outliers are selected on the map. A lot of the outlier points are in northern Spain in or near the Cantabrian and Pyrenees Mountains.

    Outliers selected on the map

    To visualize these values, you'll use a histogram.

  7. In the Statistics pane, right-click the histogram for the annual_precip field. Click Open Histogram.

    Open the histogram.

    The histogram for the annual_precip field opens. The outliers that you selected are shown on the histogram. Using the histogram, you can see that all the outliers are on the high side, in areas with higher annual precipitation. Next, you'll look at the average annual temperature.

  8. On the Distribution of annual_precip histogram, on the ribbon, click Clear Selection, then close the histogram.

    Clear the selected points.

  9. In the Statistics pane, right-click the Chart Preview for annual_mean_temp and choose Open Histogram.

    The histogram opens. To understand what temperatures badgers might prefer, you'll select badger presence points.

  10. On the ribbon, click the Map tab. In the Selection group, click Select By Attributes.
  11. In the Select By Attributes window, clear any existing expressions and build the expression Where Presence is equal to 1. Click OK.

    The presence points are selected on the map and in the chart.

    Presence points selected on the map

    Based on the chart, it appears that badgers prefer warmer temperatures.

    Tip:

    If you don't see both the selected data and nonselected data on the chart, on the chart ribbon, in the Filter group, ensure the Selection filter is turned off.

    You can use the Data Engineering tools to examine the other bioclimate variables and make changes to the data as needed.

  12. Clear the selection and close the chart.

    The last step before using the data for modeling is to add attribution.
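A common way to flag statistical outliers like the high precipitation values above is the interquartile range (IQR) rule: values more than 1.5 times the IQR below the first quartile or above the third quartile are flagged. The sketch below applies that rule to hypothetical values. Whether the Data Engineering view uses exactly this definition is an assumption, and the function name is illustrative.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical annual precipitation values; one wet mountain site stands out.
annual_precip = [410, 455, 480, 500, 520, 540, 575, 600, 630, 1850]
print(find_outliers(annual_precip))
```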

Edit metadata

Now that your dataset is cleaned and contains the environmental variables needed to begin modeling, your final step is to ensure that the layer has the correct attribution. Many of the observation points had a CC BY 4.0 license, which requires attribution. When you used the Download Species Occurrence Points (GBIF) tool earlier, you generated a Digital Object Identifier (DOI) that will form the basis for this attribution.

  1. On the ribbon, click the Analysis tab. In the Geoprocessing group, choose History.

    Open the geoprocessing history.

  2. Right-click the Download Species Occurrence Points (GBIF) record and choose View Details.

    The tool details window appears. This window records all the parameters you ran the tool with as well as messages that were generated during its run. The DOI is recorded on the Messages tab.

  3. Click the Messages tab.
  4. Read the Citation Requirements section, then scroll to the bottom of the table where the DOI is printed.
  5. Copy the DOI for your dataset.

    Copy the DOI.

    Now you'll create your citation.

  6. On the ribbon, click the View tab. In the Windows group, choose Catalog View.

    Open the Catalog View.

    The Catalog view opens. The Catalog view and Catalog pane, which you've worked with so far in this tutorial, have many similarities, but metadata can only be edited in the Catalog view.

  7. In the Catalog view, expand Databases and EuropeanBadger_Habitat.gdb, then click the badger_sample_set layer.

    Open the badger_sample_set layer's metadata.

    The layer's metadata appears. Currently, the metadata is empty except for the Geoprocessing history, which shows the Join Field and Calculate Field tools you ran.

  8. On the ribbon, on the Catalog tab, in the Metadata group, click Edit.

    Edit the layer's metadata.

    The metadata editor opens.

  9. Enter the following information in the Metadata pane:

    • Title: European Badger Sample Dataset
    • Tags: species modeling, Meles meles, European badger
    • Summary (Purpose): This dataset was created in the tutorial Sample species and environmental data for distribution modeling to model European badger (Meles meles) habitat in Spain.
    • Description (Abstract): The European badger is an important species, providing three main ecosystem services: seed dispersal, topsoil disturbances and microhabitat creation. To model its habitat in Spain, animal observation data was downloaded from GBIF, recorded in the Presence field with a value of 1. Pseudo-absence or background points were generated and merged with the observation data. Environmental data, including slope, elevation, land cover, and bioclimate variables, were extracted to these points.

  10. In the Credits section, enter the following citation and paste the unique DOI that was generated for your dataset.

    GBIF.org ([ACCESS DATE]) GBIF Occurrence Download. DOI:[Your DOI]

    Enter the GBIF citation in the Credits section.

  11. At the bottom of the metadata editor, click New Bounding Box.

    Add a new bounding box.

  12. Enter the following coordinates:

    • West: -17.7532431
    • East: 5.6396581
    • South: 26.8567504
    • North: 44.3051478

  13. On the ribbon, on the Metadata tab, click Save.
  14. Close the metadata editor and save the project.
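The citation and bounding box entered above follow fixed patterns, so they can be assembled and sanity-checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, the ISO date format is an assumption (the tutorial only specifies an access date), and the DOI shown is a placeholder, not a real download.

```python
from datetime import date


def gbif_citation(access_date, doi):
    """Format the Credits citation using the template from the tutorial."""
    return f"GBIF.org ({access_date.isoformat()}) GBIF Occurrence Download. DOI:{doi}"


def is_valid_bounding_box(west, east, south, north):
    """Check that a longitude/latitude bounding box is ordered and in range."""
    return -180 <= west < east <= 180 and -90 <= south < north <= 90


# The DOI below is a placeholder, not a real download.
print(gbif_citation(date(2024, 3, 12), "10.xxxx/placeholder"))
print(is_valid_bounding_box(-17.7532431, 5.6396581, 26.8567504, 44.3051478))
```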

Now you have a dataset on the European badger that you can use for species distribution modeling. The dataset contains both presence and pseudo absence points as well as environmental data about the slope, elevation, land cover, temperature, and more. This information can be used in models such as MaxEnt or random forest prediction to do species distribution modeling.