Construct a data pipeline

To create the final dataset requested by the stakeholders, your data pipeline needs to transform the data by enabling location, removing unnecessary attributes, and calculating new fields.

You'll add the initial table containing information about the capital projects, filter the dataset, create its geometry using coordinates, reproject the data, and calculate a field.

Create a data pipeline

First, you'll sign in to your Enterprise portal and create an empty data pipeline.

If necessary, sign in to your ArcGIS organizational account.
On the ribbon, click the app launcher button. Choose Data Pipelines.
A browser tab opens to a gallery showing any existing data pipelines you own.
Note:
If you are using Enterprise 12.0, you will see Beta next to the app.
Click Create data pipeline.
If necessary, in the Welcome to Data Pipelines window, click Skip to skip the tour.
The Data Pipelines Editor appears.
Note:
To see the names of the buttons on the Editor toolbar, click the Expand button at the bottom of the toolbar.
This editing environment allows you to add data inputs, provides access to tools to transform data, and allows you to write the processed data out to a feature layer.

Add a CSV table as an input

Now that you have the Data Pipelines Editor open, you'll add your first input dataset. This is the table of capital projects from the DPR. Since this table is frequently updated, an exported table would quickly be out of date. Therefore, you'll access the .csv table directly from its source.

Open the New York City OpenData (NYC OpenData) page for the Capital Project Tracker dataset.
A browser tab opens to an overview of the Capital Project Tracker dataset. This provides valuable information, such as how often the dataset is updated, when it was last updated, and a description for each column in the .csv table.
Click the Export button.
The Export dataset window appears.
In the Export dataset window, click API endpoint.
Note:
The number of rows you see in this dataset and other datasets throughout this tutorial may vary slightly from the images provided due to changes in this dataset over time.
When you click the API endpoint button, the Export format switches to JSON. The API endpoint allows access to the dataset through a URL.
While Data Pipelines accepts JSON data formats, you'll change the data format to CSV.
For Data format, choose CSV.
Next, you'll copy the URL for this dataset.
For Version, click SODA2.
Click Copy to clipboard.
Now, you'll add this .csv table to your data pipeline as a URL input.
On the Editor toolbar, click Inputs. In the Inputs pane, under File, choose URL.
Note:
If the URL parameter is disabled, verify with your administrator that the ArcGIS Data Pipelines server aligns with the requirements for this tutorial so you can add data by a URL.
The Add a URL window appears.
Click the URL text box and press Ctrl+V to paste the URL you copied from the NYC OpenData website.
The Response format parameter is set automatically.
Click Add.
The URL element is added to the canvas.
The element's name is derived from the name of the .csv table you accessed. You'll change its name next.
On the Element action bar, click the Rename button.
In the text box, clear any text, type Capital Project Tracker, and press Enter. Expand the element so that the name is visible.
Since the element is selected in the canvas, the URL pane is open. Here, you can configure or reconfigure any selected element on the canvas.
Next, you'll preview the dataset you added.
In the URL pane, click Preview.
The Capital Project Tracker window appears. It's currently showing the table preview. By previewing your data, you'll know what your data will look like when you run the data pipeline.
Note:
You can also preview your data by clicking the Preview button on an element's action bar.
The top of the table indicates that the number of records is 1,006. This may not match the number of rows on the OpenData website. When the numbers do not match, it is because some records contain a line-break character that splits the record across multiple lines, so each extra line is counted as a record. In the URL pane, you can account for this.
Note:
The data is updated often, so if the number of records is different, that is okay.
In the URL pane, turn on Has multiline data.
Click Preview again.
The preview refreshes. The number of records is now 1,000.
Scroll through the table to observe the data provided by NYC OpenData.
On the side of the table, click the Map preview button.
Since this is only a table and no geometry has been defined, a map preview is not available. You'll allow for a map preview in a later section.
Click the Schema button.
This lists all the fields in the dataset and their field types. Throughout the rest of the tutorial, you'll use a number of these fields to transform your data, including currentphase, designstart, latitude, and longitude.
Click the Messages button.
If there were any warnings or errors in your preview dataset, they would be listed here.
Close the preview window.
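The effect of the Has multiline data setting can be illustrated outside Data Pipelines. The following is a minimal Python sketch, not a description of how the tool is implemented: the sample rows and the title column are made up for illustration, and only the fmsid and currentphase field names come from the tutorial. It shows why counting physical lines overstates the number of records when a quoted value contains a line break.

```python
import csv
import io

# Hypothetical sample: the P-2 record has a line break inside a quoted value,
# so that single record spans two physical lines.
SAMPLE_CSV = (
    "fmsid,title,currentphase\n"
    'P-1,"Playground reconstruction",construction\n'
    'P-2,"Comfort station\nand plaza",design\n'
    "P-3,Pathway repaving,construction\n"
)


def count_physical_lines(text):
    """Count data lines by splitting on line breaks, ignoring quoting."""
    return len(text.splitlines()) - 1  # subtract the header line


def count_records(text):
    """Count records with a CSV parser that honors quoted line breaks."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    return sum(1 for _ in reader)


naive_count = count_physical_lines(SAMPLE_CSV)
parsed_count = count_records(SAMPLE_CSV)
print("Physical lines:", naive_count)
print("Parsed records:", parsed_count)
```

The line-based count is higher than the parsed count, which mirrors the drop in the record count after the multiline setting is turned on.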

Filter data by attribute

Now that you've added the .csv table to the data pipeline, you'll use a tool element to filter the dataset so that it only includes capital projects whose current phase is construction and whose latitude and longitude values are valid (not 0).

On the Editor toolbar, click Tools.
The Tools pane appears. The tools listed, by category, allow you to manipulate the datasets in your data pipeline. You'll add the Filter by attribute tool to remove any row whose current phase is not construction. You'll also filter out any rows that have a latitude or longitude value of 0.
In the Tools pane, under Clean, click Filter by attribute.
An element is added to the canvas. It needs to be connected to an existing element that contains data and configured.
Move the Filter by attribute element to the right of the URL element.
In the Filter by attribute pane, under Input dataset, click Dataset. In the Select dataset window, choose Capital Project Tracker.
The two elements are connected. Data will flow from the .csv file into the Filter by attribute tool when the data pipeline runs.
Note:
You can also connect elements in a data pipeline by dragging the pointer from the output port of one element to the input port of another element.
Next, you'll configure the filter to exclude any records that have a value of 0 for latitude or longitude and only show those rows that have a current phase value of construction.
In the Filter by attribute pane, click Build new query.
The Query builder window appears.
Ensure that Expression is selected and click Next.
For the first expression, set the field to latitude and set the operator to does not equal. For the value, type 0.
Next, you'll add another expression for longitude.
Click the Expression button.
Write a second expression where longitude does not equal 0.
Add another expression and have it query the rows where currentphase equals construction.
Note:
For the value, use the drop-down list to select a value rather than type it.
Click Add.
In the Filter by attribute pane, click Preview.
The preview window appears. Below the title of this table is a count of the number of records. Previously, there were about 1,000 records. Now, because of the filters you applied, there are fewer than 100.
Scroll through the table and observe the values for the latitude, longitude, and currentphase fields.
These values meet the criteria of your query.
Close the preview window.
When the tool element was added to the canvas, it was given the default name of Filter by attribute. You'll change its name to make it more meaningful.
On the Filter by attribute action bar, click Rename and type Filter for Construction Phase. Resize the element so that the name is visible.
Before adding more elements, you'll save your data pipeline.
On the Editor toolbar, click Save and open and choose Save as.
The Save data pipeline window appears.
For Title, type Capital Projects Data Pipeline.
Click Save.
The data pipeline is saved and the title appears above the Tools pane.
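The logic of the three query expressions can be sketched in Python. This is an illustration only: the sample rows are hypothetical, and it assumes that the three expressions are combined so that a row must satisfy all of them, which matches the stated goal of keeping construction-phase projects with nonzero coordinates.

```python
# Hypothetical sample rows; field names follow the tutorial's schema.
rows = [
    {"fmsid": "A", "currentphase": "construction", "latitude": 40.78, "longitude": -73.96},
    {"fmsid": "B", "currentphase": "design", "latitude": 40.70, "longitude": -73.99},
    {"fmsid": "C", "currentphase": "construction", "latitude": 0, "longitude": 0},
    {"fmsid": "D", "currentphase": "construction", "latitude": 40.65, "longitude": 0},
]


def passes_filter(row):
    """Return True when a row satisfies all three expressions (assumed AND)."""
    return (
        row["latitude"] != 0
        and row["longitude"] != 0
        and row["currentphase"] == "construction"
    )


filtered = [row for row in rows if passes_filter(row)]
print([row["fmsid"] for row in filtered])
```

Only the row that is in the construction phase and has nonzero values for both coordinates is kept.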

Create point geometry

Next, you'll use the latitude and longitude columns in the filtered dataset to provide this dataset with a geometry that is viewable on a map.

On the Editor toolbar, click Tools. In the Tools pane, under Construct, click Create geometry.
The Create geometry element is added to the canvas and its associated pane appears.
Move the Create geometry element to the right of the Filter by attribute element.
Click and drag from the Filter by attribute element's output port to the Create geometry element's input port.
The two elements are connected. Next, you'll configure the Create geometry element. Since this table contains latitude and longitude values, you can create a point geometry.
In the Create geometry pane, for Geometry type, choose Point. For Geometry format, choose XYZ.
Additional parameters appear. These parameters are used to determine which fields in your table contain X, Y, and Z values. Your dataset does not have Z values; this parameter will not be used.
For X field, choose longitude. For Y field, choose latitude.
Click Preview.
In the preview window, click the Map preview button.
Note:
Based on your organization's settings, the basemap in your map preview may be different from the examples.
The locations of the capital projects are visible on the map. You can click features to see their attributes in a pop-up window.
Close the preview window.
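The mapping of table fields to point coordinates can be sketched in Python using the GeoJSON convention, where a position is ordered as longitude (X) and then latitude (Y). This is a minimal sketch with a hypothetical sample row; the function name to_point_feature is made up for illustration and is not part of Data Pipelines.

```python
def to_point_feature(row, x_field="longitude", y_field="latitude"):
    """Build a GeoJSON-style point feature from a table row.

    The X value comes from the longitude field and the Y value from the
    latitude field, matching the X field and Y field parameters above.
    """
    x = float(row[x_field])
    y = float(row[y_field])
    # Keep every other attribute as a property of the feature.
    properties = {k: v for k, v in row.items() if k not in (x_field, y_field)}
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": properties,
    }


# Hypothetical sample row; CSV values typically arrive as text.
sample_row = {"fmsid": "A", "latitude": "40.78", "longitude": "-73.96"}
feature = to_point_feature(sample_row)
print(feature["geometry"]["coordinates"])
```

Swapping the two fields is a common mistake and would place New York City points in the wrong part of the world, which is why the map preview is a useful check.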

Project point data

Your points were created with latitude and longitude values using the WGS 1984 geographic coordinate system. This is not an ideal coordinate system for New York City. You'll project your data to a more appropriate coordinate system.

Note:

If you're not familiar with coordinate systems, read Coordinate Systems: What's the Difference.

In the Tools pane, under Format, click Project geometry.
The Project geometry element is added to the canvas.
Move the Project geometry element to the right of the Create geometry element.
Click and drag from the Create geometry element's output port to the Project geometry element's input port.
The two elements are connected. Next, you'll configure the Project geometry element.
In the Project geometry pane, for Spatial reference, click the drop-down menu.
The Browse coordinate systems window appears.
For a projected coordinate system, you'll use NAD 1983 (2011) StatePlane New York Long Isl FIPS 3104 (Meters). Its ID number is 6538.
In the Browse coordinate systems window, in the search box, type 6538. Select NAD 1983 (2011) StatePlane New York Long Isl FIPS 3104 (Meters).
Click Done.
The spatial reference of the data will be projected to a more suitable coordinate system.

Calculate a new field

As a final step for preparing your initial input dataset, you'll calculate a new field. The dataset contains the designstart field, which records when the design of each project started. You'll calculate an additional field that expresses the time elapsed since that date in years and days.

In the Tools pane, under Construct, click Calculate field.
The Calculate field element is added to the canvas.
Move the Calculate field element to the right of the Project geometry element.
Click and drag from the Project geometry element's output port to the Calculate field element's input port.
The two elements are connected. Next, you'll configure the Calculate field element. You'll start by providing the new field with a name.
In the Calculate field pane, for New field name, type Elapsed_Time, and for New field type, choose String.
Note:
Field names cannot contain special characters, such as spaces.
Next, you'll write an expression to calculate the field. This tool uses ArcGIS Arcade expressions to calculate fields.
Note:
To learn more about ArcGIS Arcade, read Learn ArcGIS Arcade in Four Easy Steps.
Under Arcade expression, click Author Arcade expression.
The Arcade expression window appears. Here, you can write Arcade expressions to calculate field values. You'll copy and paste code that returns the number of years and days since a capital project's design began.
In the Arcade expression window, clear the sample code.

Copy and paste the following code into the Arcade expression window:

//Calculate the time elapsed since designstart as years and days

//Determine the total number of days since the design started
var TotalDays = DateDiff(Now(), $record.designstart, "days")

//Determine the days left over after whole years (approximation: assumes 365-day years)
var RemainderDays = Floor(TotalDays % 365)

//Determine the number of whole years
var RemainderYears = Floor(DateDiff(Now(), $record.designstart, "years"))

//Format the final text to account for year(s) and day(s)
if(RemainderYears == 1 && RemainderDays == 1){
  return RemainderYears + " year and " + RemainderDays + " day"
}
else if (RemainderYears == 1 && RemainderDays != 1){
  return RemainderYears + " year and " + RemainderDays + " days"
}
else if (RemainderYears != 1 && RemainderDays == 1){
  return RemainderYears + " years and " + RemainderDays + " day"
}
else{
  return RemainderYears + " years and " + RemainderDays + " days"
}

The desired format of this calculation is X years and Y days. To do this, the expression first determines the number of days since a capital project design started. Since the number of days may be more than one year, the code divides the number of days by 365 and returns the remainder value. This returns the Y value in the desired format. Then, the expression calculates the number of years since the project began. This is the X value in the desired format. The last part of the expression, starting on line 12, formats the years and days text to make them singular or plural based on the number of days or years since the design started.

Click Save.
The Elapsed_Time field is added to the table and calculated.
In the Calculate field pane, click Preview.
In the preview window, scroll to the Elapsed_Time field.
For each project, the number of years and days since the project began is recorded in an understandable format.
Close the preview window.
You'll rename this element to clarify the field that it calculates.
Rename the Calculate field element to Calculate Elapsed Time.
Expand the element so that its full name is visible.
Finally, you'll save your data pipeline.
On the Editor toolbar, click Save and open and choose Save.
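To check the logic of the elapsed-time expression outside the Arcade editor, the same steps can be approximated in Python. This is a sketch under stated assumptions, not a translation that is guaranteed to match Arcade's DateDiff results exactly: the helper names whole_years and elapsed_text are hypothetical, whole years are counted by calendar anniversaries, and the leftover days use the same 365-day remainder as the expression, so leap days can shift the result by a day or two.

```python
from datetime import date


def whole_years(start, today):
    """Count completed calendar years between two dates."""
    years = today.year - start.year
    # Subtract a year if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (start.month, start.day):
        years -= 1
    return years


def elapsed_text(start, today):
    """Format elapsed time as 'X years and Y days' with singular/plural words."""
    total_days = (today - start).days
    remainder_days = total_days % 365  # same 365-day approximation as the expression
    years = whole_years(start, today)
    year_word = "year" if years == 1 else "years"
    day_word = "day" if remainder_days == 1 else "days"
    return f"{years} {year_word} and {remainder_days} {day_word}"


# Fixed dates are used instead of the current date so the output is repeatable.
print(elapsed_text(date(2021, 3, 1), date(2023, 6, 9)))
```

Passing the reference date as a parameter, rather than reading the current date inside the function, makes the formatting easy to verify with known inputs.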

So far, you've added a .csv table and begun to transform the capital projects data. You filtered the data, gave it point geometry using coordinates, reprojected it to an appropriate coordinate system, and calculated a field that provides the time elapsed since each project's design started.

Perform spatial joins

At this point, the capital project data has been added and partially formatted, but it still needs attribution from other datasets. For each capital project, you need to determine which neighborhood tabulation area and community district it falls within. Both the neighborhood tabulation areas and community districts exist as publicly available polygon datasets. You'll add these two datasets to your data pipeline and use spatial joins to append the neighborhood and district names to each capital project.

Add a GeoJSON as an input

First, you'll add the neighborhood tabulation areas dataset to your data pipeline. They're available on the NYC OpenData website in a GeoJSON format.

Open the NYC OpenData page for the 2020 Neighborhood Tabulation Areas (NTAs) - Tabular dataset.
A browser tab opens to an overview of the 2020 Neighborhood Tabulation Areas (NTAs) - Tabular dataset. Like the Capital Project Tracker dataset, this page provides an overview of the dataset and how frequently it's updated.
Click the Export button.
The Export dataset window appears. Since this dataset contains fewer than 1,000 rows, you won't need to change to URL like you did with the Capital Project Tracker dataset.
In the Export dataset window, set the following parameters:
- Click API endpoint.
- For Data format, choose GeoJSON.
- For Version, select SODA2.
Note:
The number of rows you see in this dataset may vary from the image above due to changes in this dataset over time.
As you did with the .csv file previously, you'll copy the URL for this dataset so you can add it to your data pipeline.
Click Copy to clipboard.
In the Data Pipelines Editor, on the Editor toolbar, click Inputs. In the Inputs pane, under File, choose URL.
The Add a URL window appears.
For URL, paste the URL you copied from the NYC OpenData website.
The Response format parameter is automatically set to GeoJSON.
Click Add.
The URL element is added to the canvas.
Again, the name is unintuitive, so you'll rename this element.
Rename the URL element to Neighborhood Tabulation Areas and resize the element to make the full name visible.
Move the element under the Project geometry element.
Next, you'll preview the dataset you added.
In the URL pane, click Preview.
In the preview window, observe the fields provided with this dataset. The ntaname field is the attribute that you'll be adding to the capital projects point layer using a spatial join.
Click the Map preview button.
A map appears with the neighborhoods drawn as polygons.
Close the preview window and save your data pipeline.

Project polygon data

Like the capital project points, the neighborhoods GeoJSON uses the WGS 1984 geographic coordinate system. Therefore, you'll add another Project geometry tool to project the neighborhoods to the same state plane zone that you used for the capital project locations. To save time, you'll copy the existing Project geometry element.

On the canvas, select the Project geometry element.
Press Ctrl+C to copy the element.
Press Ctrl+V to paste the element onto the canvas.
Move the Project geometry element to the right of the URL element for Neighborhood Tabulation Areas.
Click and drag from the URL element's output port to the Project geometry element's input port.
The two elements are connected. Because you copied this element, the coordinate system is already selected. Now, the neighborhoods dataset uses the proper coordinate system.

Spatially join the capital projects and the neighborhoods

Now that both of your datasets are using the same coordinate system, you'll add a spatial join to your data pipeline. This spatial join will determine which neighborhood each capital project point falls within and add the neighborhood attributes to the capital project point.

On the Editor toolbar, click Tools. In the Tools pane, under Integrate, click Join.
The Join element is added to the canvas.
Move the Join element to the right of the Calculate field and Project geometry elements.
Next, you'll connect the Calculate field and the Project geometry elements to the Join element. The Join element has two input ports. The upper input port is for the target dataset, which is the dataset that will have additional attributes added to it. The lower input port is for the join dataset, which is the dataset that contributes its attributes to the target dataset. In this case, you want the capital projects to receive the attributes from the neighborhood dataset. Therefore, the Calculate field element is the target dataset and it will be connected to the upper input port.
Click and drag from the Calculate field element's output port to the Join element's upper input port. Click and drag from the Project geometry element's output port to the Join element's lower input port.
The two input elements are connected to the Join element. In the Join pane, the Target dataset and Join dataset are filled in based on the elements you linked.
Next, you'll set the Join element to use a spatial relationship.
In the Join pane, under Spatial relationship, turn on Use spatial relationship.
Additional parameters appear. The Target geometry and Join geometry parameters are automatically completed. But you still need to choose a Spatial relationship. This defines how the target and join datasets are joined. Since the capital project points fall inside of the neighborhood polygons, you'll use the Intersects relationship.
For Spatial relationship, choose Intersects.
Click Preview.
The preview window appears. This dataset continues to represent the capital project points. In the table preview, the fields that initially appear are from the capital project points.
In the table preview, scroll to the end of the table and find the ntaname field.
The fields at the far end of the table are the fields from the neighborhoods. Now, for each capital project, you know the neighborhood that it falls within.
As you perform more joins, the number of fields becomes cumbersome, especially since many of these fields have not been requested by your stakeholders. Later, you'll remove the unnecessary attribute fields.
Close the preview window.
In the next section, you'll add a second spatial join. To prevent confusion, you'll rename the first Join element.
Rename the Join element to Neighborhood Join.
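The idea behind a spatial join can be sketched in plain Python. The following is a simplified illustration, not how the Join tool is implemented: it uses a ray-casting point-in-polygon test on made-up square polygons in planar coordinates, the function names are hypothetical, and it keeps unmatched points with an empty value, which is a choice of this sketch rather than a statement about the tool's behavior.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's Y value can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons, fields):
    """Copy the listed polygon fields onto each point that falls inside a polygon."""
    joined = []
    for point in points:
        row = dict(point)
        for name in fields:
            row[name] = None  # sketch choice: unmatched points keep an empty value
        for polygon in polygons:
            if point_in_polygon(point["x"], point["y"], polygon["ring"]):
                for name in fields:
                    row[name] = polygon[name]
                break
        joined.append(row)
    return joined


# Hypothetical target (points) and join (polygons) datasets.
projects = [
    {"fmsid": "A", "x": 5, "y": 5},
    {"fmsid": "B", "x": 15, "y": 2},
    {"fmsid": "C", "x": 25, "y": 5},
]
neighborhoods = [
    {"ntaname": "West", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"ntaname": "East", "ring": [(10, 0), (20, 0), (20, 10), (10, 10)]},
]

result = spatial_join(projects, neighborhoods, ["ntaname"])
print([(row["fmsid"], row["ntaname"]) for row in result])
```

As in the data pipeline, the output still represents the target points, now carrying the ntaname attribute from the polygon that contains them.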

Create a feature layer from a URL

Some stakeholders requested that the final output also contain information about the community district that each capital project falls within. To accomplish this, you'll use another Join element, but first you need the polygon dataset containing the community districts. You'll access the URL for the feature layer from its item details page and create a new item in your content based on it.

In a new browser tab, go to the item details page for the community districts layer.
The item details page for the NYC_Community_Districts layer appears.
Under Details, for URL, click the Copy button.
Next, you'll add the feature layer to your content using the URL you copied.
In another browser tab, access your Enterprise portal and click the Content tab.
Click New item.
In the New item window, click URL.
For URL, paste the URL you copied from the item details page.
Click Next and click Save.
Click the Content tab.
Now you have a feature layer in your content that you can add to your data pipeline.

Add a feature layer as an input

Next, you'll add the NYC_Community_Districts feature layer from your content to your data pipeline.

On the Editor toolbar, click Inputs. In the Inputs pane, under ArcGIS, choose Feature layer.
The Select a feature layer window appears showing your My content folder that contains the NYC_Community_Districts layer.
Note:
Depending on the layers you have in My content, you may have to scroll or search for the layer.
Find the NYC_Community_Districts layer, click Select layer and click NYC_Community_Districts.
Click Add.
A Feature layer element is added to the canvas.
Rename the NYC_Community_Districts element to Community Districts.
Move the Feature layer element under the Project geometry element for the neighborhoods dataset.
On the canvas, select the Community Districts element. In the Feature layer pane, click Preview.
The BoroCD field is the attribute that you'll be adding to the capital projects point layer using a spatial join.
Click the Map preview button.
A map appears with the community districts drawn as polygons.
Close the preview window.

Project a feature layer

The feature layer you added uses the Web Mercator (auxiliary sphere) projected coordinate system. To ensure data accuracy, you'll project this feature layer so that it uses the same coordinate system as the reprojected capital project points.

On the canvas, select one of the Project geometry elements.
Press Ctrl+C to copy the element.
Press Ctrl+V to paste the element onto the canvas.
Move the Project geometry element to the right of the Feature layer element.
Click and drag from the Feature layer element's output port to the Project geometry element's input port.
Note:
If the copied Project geometry element shows an error, delete it. Then, instead of copying an existing element, add a new Project geometry tool from the Format category in the Tools pane and configure it as you did the first one.
The two elements are connected. Now, the community districts dataset uses the proper coordinate system.

Spatially join the capital projects and the community districts

Now that the community districts dataset has been projected, you'll perform a second spatial join to determine which community district each capital project falls within.

On the Editor toolbar, click Tools. In the Tools pane, under Integrate, click Join.
The Join element is added to the canvas.
Move the Join element to the right of the first Join and Project geometry elements.
Next, you'll connect the first Join and the Project geometry elements to the second Join element. The first Join element will be the target dataset parameter and the Project geometry element will be the join dataset parameter.
Click and drag from the Neighborhood Join element's output port to the second Join element's upper input port. Click and drag from the Project geometry element's output port to the second Join element's lower input port.
Next, you'll set the second Join element to use a spatial relationship.
In the Join pane, under Spatial relationship, turn on Use spatial relationship.
For Spatial relationship, choose Intersects.
Click Preview.
The preview window appears.
In the table preview, scroll to the end of the table.
The first fields you see are from the capital projects dataset. Next, you see the fields from the neighborhood dataset. Finally, at the end of the table, you see the fields from the community districts dataset. Now, each capital project has information about the community district that it falls within.
Close the preview window.
Since there is another Join element on the canvas, you'll rename this second Join element for clarity.
Rename the second Join element to Community District Join. Resize the element so that the full name of the element is visible.
Save your data pipeline.

You added two public polygon layers with attribution that you wanted to add to the capital projects dataset. One dataset is a GeoJSON from the NYC OpenData site, and the other is a feature layer that you added as a web service item to your content. Then, you projected both datasets and spatially joined them to the capital projects dataset.

Clean the data

After adding data and spatially joining it, you have all the attributes that were requested by the stakeholders in various departments. However, there are many other fields that are unnecessary and make the attribute table difficult to navigate. Additionally, some of the requested fields have names that are difficult to interpret.

Next, you'll clean up the attributes before the results are written to an output dataset.

Select fields

First, you'll select only the fields of interest to your stakeholders. This includes several fields from the capital projects dataset, the Elapsed_Time field you calculated, the ntaname field, and the BoroCD field.

On the Editor toolbar, click Tools. In the Tools pane, under Clean, click Select fields.
The Select fields element is added to the canvas.
Move the Select fields element to the right of the Community District Join element.
Next, you'll connect the Community District Join and Select fields elements.
Click and drag from the Community District Join element's output port to the Select fields element's input port.
Next, you'll choose which fields you want the output dataset to contain.
In the Select fields pane, under Fields, click Field.
The Select fields window appears. You'll choose the fields that are of interest to your stakeholders. You'll also choose the GEOMETRY field. This field is necessary for you to be able to display your output dataset as points. Otherwise, the output would be a nonspatial feature layer or hosted table.
In the Select fields window, select the following fields:
- fmsid
- currentphase
- GEOMETRY
- Elapsed_Time
- ntaname
- BoroCD
Click Done.
The fields that will be included in the output are listed.
In the Select fields pane, click Preview.
The preview window appears.
Instead of having an overwhelming number of fields, your output dataset will only contain these six fields that were requested by the stakeholders.
Click the Map preview button.
Since you included the GEOMETRY field in the selected fields, a map of the capital project points is visible.
Close the preview window.

Update fields

Now that you have the fields of interest for your stakeholders, you'll change some of their names to make them more readable.

On the Editor toolbar, click Tools. In the Tools pane, under Format, click Update fields.
The Update fields element is added to the canvas.
Next, you'll connect the Select fields and Update fields elements.
Move the Update fields element to the right of the Select fields element.
Click and drag from the Select fields element's output port to the Update fields element's input port.
Next, you'll choose which fields you want to update and configure them. When updating fields, you can update their name and field type. You'll update three of the fields' names.
The first field you'll update is fmsid. This field originated from the Capital Project Tracker dataset and contains a project identification number.
In the Update fields pane, under Updates, for Field to update, click the drop-down menu.
The Select field window appears. You'll choose the field that you want to update.
In the Select field window, click fmsid.
Next, you'll provide an updated name for this field.
For New field name, type Project_ID.
Note:
As with the Calculate field tool, field names cannot contain special characters, such as spaces.
The first field has been updated. You'll update two more fields, the ntaname and BoroCD fields.
Note:
If you wanted to change a field's type, like string to integer, you could do that using the New field type parameter.
Next, you'll add more fields to update.
Click Add.
In the same manner, update the ntaname field to Neighborhood.
Click Add again and update the BoroCD field to Community.
The three fields you set to update are listed.
Click Preview.
In the table preview, the column headings have been updated. Your table's field headings are more intuitive for your stakeholders.
Close the preview window and save your data pipeline.
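The combined effect of the Select fields and Update fields steps can be sketched in Python. This is an illustration only: the sample row, its values, and the extra field are hypothetical, while the field names and new names are the ones used in the steps above.

```python
# Fields kept by the Select fields step, in the order listed in the tutorial.
SELECTED_FIELDS = ["fmsid", "currentphase", "GEOMETRY", "Elapsed_Time", "ntaname", "BoroCD"]

# Renames applied by the Update fields step.
RENAMES = {"fmsid": "Project_ID", "ntaname": "Neighborhood", "BoroCD": "Community"}


def select_and_rename(row):
    """Keep only the selected fields and apply the new field names."""
    return {RENAMES.get(name, name): row[name] for name in SELECTED_FIELDS}


# Hypothetical joined row with one extra field that should be dropped.
joined_row = {
    "fmsid": "A",
    "currentphase": "construction",
    "GEOMETRY": (1.0, 2.0),
    "Elapsed_Time": "2 years and 100 days",
    "ntaname": "West",
    "BoroCD": 101,
    "extra_field": "not requested",
}

cleaned = select_and_rename(joined_row)
print(list(cleaned))
```

Fields that are not in the rename mapping, such as currentphase and GEOMETRY, keep their original names.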

Create an output feature layer

Thus far, your data pipeline ingests and transforms your data. As a final step, this data will be loaded into a feature layer.

On the Editor toolbar, click Outputs. In the Outputs pane, under ArcGIS, click Feature layer.
The Feature layer element is added to the canvas.
Move the Feature layer element to the right of the Update fields element.
Next, you'll connect the Update fields and Feature layer elements.
Click and drag from the Update fields element's output port to the Feature layer element's input port.
Next, you'll configure the output settings for the feature layer that will be created. With ArcGIS Data Pipelines, it's also possible to have the output replace an existing feature layer or add and update features in an existing feature layer.
In the Feature layer pane, perform the following:
- Under Output settings, ensure that Output method is set to Create.
- For Output name, type DPR Capital Projects.
Click Preview.
What you see in the preview window is what will be written to your feature layer when you run the data pipeline.
Close the preview window.
Resize the Feature layer element so that its full name is visible.
Your data pipeline is complete.
If your data pipeline's elements are disorganized, the Auto layout diagram button repositions elements to better see the flow of inputs, tools, and outputs.
On the Canvas action bar, click Auto layout diagram.
The elements on the canvas are repositioned.
Save your data pipeline.

You cleaned up the data created from the previous spatial joins. You removed unnecessary fields and renamed fields whose names were unintuitive. Finally, you set the data pipeline to write the output dataset to a feature layer in your Enterprise organization.

Review the results

Next, you'll run the data pipeline that you created and explore the results. Then, you'll set the data pipeline to run automatically on a schedule to keep the information current.

Run the data pipeline

Now that your data pipeline is complete, you'll run it to create a feature layer.

On the Canvas action bar, click Run.
The Latest run details window appears and opens to the Run details tab. This window provides you with information as the data pipeline runs. It also displays any warnings or errors that occur during processing.
Note:
The number of features may differ from what is shown in the example. This is expected.
Processing takes about a minute. After the data pipeline completes, you'll explore the results.
In the Latest run details window, click the Output results tab.
This tab lists any outputs created by the data pipeline. The DPR Capital Projects feature layer is listed.
Next, you'll review your feature layer's item details and share the feature layer with your organization.
For the DPR Capital Projects layer, click Options and choose View details.
A browser tab opens to the DPR Capital Projects item details page.
This page provides information about the feature layer created by the data pipeline. Next, you'll share the results with your organization.
Click the Share button.
The Share window appears.
In the Share window, for Set sharing level, choose Organization.
Click Save.
The DPR Capital Projects layer is now shared with your organization for others to access. When the data pipeline runs, it will update this feature layer for anyone who has added it to their maps or apps.
Note:
When you create a data pipeline, it is stored as an item in your Enterprise account. This item does not need to be shared with your organization for users to access a data pipeline's output feature layer.
Next, you'll view the result on a map.
Click Open in Map Viewer.
A map opens and the DPR Capital Projects feature layer is added.
Click one of the points.
A pop-up appears with the attributes that you specified in the data pipeline.
This dataset is available to be symbolized, analyzed, and configured further for your stakeholders' web maps and apps.
Close the pop-up window.

Update the data pipeline

Your data pipeline ran successfully and you now have a feature layer representing DPR Capital Projects. However, the source data updates regularly and your stakeholders want the latest information reflected in their web maps and apps. You'll update the Feature layer output element to replace the DPR Capital Projects feature layer every time the data pipeline runs in the future.

Note:
If your organization's data pipelines only need to be run once, updating the Feature layer element is not necessary.

In the Data Pipelines Editor, close the Latest run details window.
On the canvas, click the Feature layer element representing your output feature layer.
In the Feature layer pane, under Output settings, change Output method to Replace.
The Feature layer parameter appears. This parameter tells the data pipeline which feature layer in your organization to replace when the data pipeline runs in the future.
For Feature layer, verify that the DPR Capital Projects layer is selected.
Caution:
Be careful when selecting a feature layer to replace. If you choose the incorrect feature layer, data could be irreversibly lost.
Now, when the data pipeline runs again, it will overwrite the data in the existing DPR Capital Projects feature layer instead of trying to create a new layer, which avoids errors.
Save the data pipeline.
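
The difference between the Create and Replace output methods can be sketched with a toy model in Python. This is an illustration only, not the ArcGIS API. The `LayerStore` class and its `write` method are hypothetical, and the model assumes that a second Create run fails because a layer with the same name already exists. The add and update method mentioned earlier is not modeled.

```python
class LayerStore:
    """Toy stand-in for the feature layers in an organization (hypothetical)."""

    def __init__(self):
        self.layers = {}  # layer name -> list of feature dictionaries

    def write(self, name, features, method="create"):
        if method == "create":
            # Assumption: Create always makes a new layer, so a second run
            # with the same output name cannot succeed.
            if name in self.layers:
                raise ValueError(f"layer {name!r} already exists")
            self.layers[name] = list(features)
        elif method == "replace":
            # Replace needs an existing target and discards its old features.
            if name not in self.layers:
                raise KeyError(f"layer {name!r} does not exist")
            self.layers[name] = list(features)
        else:
            raise ValueError(f"unsupported output method: {method!r}")


store = LayerStore()
store.write("DPR Capital Projects", [{"id": 1}], method="create")
# Later runs use Replace so that the same layer is overwritten in place.
store.write("DPR Capital Projects", [{"id": 2}, {"id": 3}], method="replace")
```

In this model, the first run creates the layer and every later run swaps its contents, which mirrors the reason you switched the output method after the initial run.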

Schedule the data pipeline

Since the input datasets are subject to change, you'll schedule the data pipeline to run automatically in the future.

Click ArcGIS Data Pipelines.
Click Manage scheduling.
Now, you'll create a task. A task allows you to control how frequently your data pipeline runs.
Click Create task.
The Create task window appears. Here, you'll choose the data pipeline that you created.
Select Capital Projects Data Pipeline.
Click Next.
You'll schedule your data pipeline to run automatically, which refreshes the target feature layer with the latest information. In a production environment, you might set this to run monthly, daily, or more frequently based on how often your input datasets update.
First, you'll give this task a title.
For Title, type DPR Capital Projects Update.
You'll have this data pipeline run every 15 minutes.
For Repeat type, choose Minute. For Repeat interval, leave the default value of 15 minutes.
Next, for this tutorial, you'll limit the task to a single run.
For End, choose After number of runs. For Number of runs, type 1.
Note:
To learn more about scheduling tasks, read Schedule a data pipeline task.
Click Save.
The task is visible and informs you when it will run next.
Note:
If you want to edit, pause, or delete a task, click the Options button at the far end of the table. Additionally, you can click the link to view or edit your data pipeline.
After the task runs, you can see the task run history.
Click the DPR Capital Projects Update task.
Note:
It may take a minute or two for the task to run. You can also click the Refresh button to see whether it ran.
The Task runs pane shows the task and the status of completed runs. A green check mark indicates that the run succeeded. A red hexagon indicates that the run failed.
Under Output results, an overview of the results from the data pipeline is shown.
You can see that the data was replaced as you specified.
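
The task settings used above (a repeat interval in minutes and an end condition based on the number of runs) can be reasoned about with a short Python sketch. The `planned_runs` function is hypothetical and is not part of ArcGIS. It assumes that the first run happens at the task's start time, and the start time shown is made up for illustration.

```python
from datetime import datetime, timedelta


def planned_runs(start, interval_minutes=15, number_of_runs=1):
    """Return the planned start time of each run.

    Assumption: the first run starts at `start` and each later run
    starts one interval after the previous one.
    """
    step = timedelta(minutes=interval_minutes)
    return [start + i * step for i in range(number_of_runs)]


# Tutorial settings: repeat every 15 minutes, end after 1 run.
tutorial_runs = planned_runs(datetime(2025, 1, 1, 9, 0))

# A production-style variant: three runs, 15 minutes apart.
three_runs = planned_runs(datetime(2025, 1, 1, 9, 0), 15, 3)
```

With the number of runs set to 1, the interval has no practical effect, which is why the tutorial can leave the default of 15 minutes in place.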

In this tutorial, you built a data pipeline to integrate data from various dynamic sources, added additional attributes, removed extraneous attributes, renamed fields, and wrote the results to a feature layer. You also set the data pipeline to automatically run on a schedule. By configuring a data pipeline, you can skip the tedious process of manually manipulating data and updating feature layers every time there's an update to the source data.

You can find more tutorials in the tutorial gallery.

Construct a data pipeline: Configure a data pipeline to ingest publicly available data and calculate an additional attribute. (25 minutes)
Perform spatial joins: Use spatial joins to add attributes from other geographic datasets. (20 minutes)
Clean the data: Rename and reduce the number of fields before creating the output dataset. (15 minutes)
Review the results: Set the data pipeline to run on a schedule and review your results. (10 minutes)
