Analyze aggregated data in ArcGIS Insights

Import a workbook package

You will start by importing a workbook package with the results from the previous tutorial into ArcGIS Insights. If you already completed the previous tutorial, skip to the next section.

  1. Download the analyze-aggregated-data.insightswbk file.

    The item is an Insights workbook package containing all the data and analysis from the Create a workbook in ArcGIS Insights tutorial.

  2. Do one of the following:
    • To complete the tutorial in Insights in ArcGIS Online, go to ArcGIS Insights and click Sign in. Sign in to your ArcGIS organizational account.
    • To complete the tutorial in Insights in ArcGIS Enterprise, sign in to your portal. On the ribbon, next to your username, click the App Launcher button and choose Insights.
    • To complete the tutorial in Insights desktop, launch the Insights desktop app and sign in to your ArcGIS organizational account, if prompted.
    Note: To access Insights, your organization’s administrator must grant you a license for it. If you don’t have an ArcGIS account or a license for Insights, see options for software access.

    You will import the package you downloaded.

  3. On the home page, click the Workbooks tab.
  4. Click Import and open the file that you downloaded in step 1.

    The workbook is loaded onto your Workbook page.

Count accidents in each ward

Next, you'll start your analysis of the traffic accidents. As you learned in the previous tutorial when you looked at the fields in the Collisions dataset, not all collisions involved cyclists. Using the Number of cyclists field, you'll filter the dataset to only show accidents that involved cyclists.

  1. In ArcGIS Insights, ensure you are on the Workbooks tab.
  2. In the list of workbooks, click the name of the Ottawa cycling accidents workbook to open it.

    Workbook card title

  3. If necessary, click the map card to activate its toolbar.
  4. Click the Card filter button.

    Card filter button

    The New filter pane appears.

  5. Expand the menu and choose Number of cyclists.

    A histogram appears. A histogram is a type of chart that plots the distribution of data. Each bar on the histogram represents the points that fall within the same range of values for the Number of cyclists field.

    The Number of cyclists field contains a count of accidents involving cyclists at each location. For many of these locations, the count is 0, meaning no accident involving a cyclist occurred. You'll change the filter so that only locations with a count of 1 or more are included.

  6. Click the left node of the histogram, change 0 to 1, and press Enter.

    Histogram with left node value changed to 1

    The node's position on the histogram updates.

  7. Click Apply and close the Card filters pane.

    The map updates to show only collisions that involved cyclists.

    Map filtered to only show collisions with cyclists

    A result dataset called Collisions is also added to the data pane. Result datasets have an orange icon in the data pane. Because the result dataset has the same name as the original dataset, you'll rename it.

  8. Point to the Collisions dataset and click the Rename dataset button. Type Collisions with cyclists and press Enter.

    Result dataset renamed Collisions with cyclists

    Next, you'll aggregate the number of collisions involving cyclists by city ward. That way, you can compare the number of collisions across different areas of the city.

  9. In the data pane, drag the Wards dataset onto the map card and drop it on the Spatial aggregation drop zone.

    Spatial aggregation drop zone

    The Spatial Aggregation pane appears.

  10. In the Spatial Aggregation pane, ensure that Choose an area layer is set to Wards, Choose a layer to summarize is set to Collisions with cyclists, and Style by is set to Count of Collisions with cyclists.

    Spatial Aggregation parameters

  11. Click Run.

    The collisions involving cyclists are aggregated into the city wards. The map updates to show the number of collisions involving cyclists in each ward, with larger point symbols corresponding to more collisions. In the data pane, the Collisions with cyclists result dataset is replaced by the Spatial Aggregation 1 result dataset.

    Map with collisions involving cyclists aggregated by ward
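The filter and aggregation performed in this section can be sketched outside Insights. The following is a minimal illustration, not how Insights implements spatial aggregation: the ward extents are hypothetical rectangles (real wards are irregular polygons), the collision records are made up, and the function names are invented for this example.

```python
from collections import Counter

# Hypothetical ward extents as (min_x, min_y, max_x, max_y) rectangles.
# Real wards are irregular polygons; rectangles keep the sketch short.
WARDS = {
    "Ward A": (0, 0, 10, 10),
    "Ward B": (10, 0, 20, 10),
}

# Made-up collision records: a location plus the number of cyclists involved.
COLLISIONS = [
    {"x": 2, "y": 3, "cyclists": 0},
    {"x": 4, "y": 8, "cyclists": 1},
    {"x": 12, "y": 5, "cyclists": 2},
    {"x": 15, "y": 1, "cyclists": 0},
    {"x": 7, "y": 7, "cyclists": 1},
]


def filter_cyclists(records, minimum=1):
    """Keep records whose cyclist count is at least `minimum` (the card filter)."""
    return [r for r in records if r["cyclists"] >= minimum]


def ward_of(record, wards):
    """Return the name of the first ward whose rectangle contains the point."""
    for name, (min_x, min_y, max_x, max_y) in wards.items():
        if min_x <= record["x"] < max_x and min_y <= record["y"] < max_y:
            return name
    return None  # the point falls outside every ward


def count_by_ward(records, wards):
    """Count records per ward, keeping wards that contain no records."""
    counts = Counter({name: 0 for name in wards})
    for record in records:
        name = ward_of(record, wards)
        if name is not None:
            counts[name] += 1
    return dict(counts)


with_cyclists = filter_cyclists(COLLISIONS)
cyclist_counts = count_by_ward(with_cyclists, WARDS)
print(cyclist_counts)
```

Setting the left node of the histogram to 1 corresponds to `minimum=1` here, and the per-ward count corresponds to the Count of Collisions with cyclists field.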

Normalize accidents

The largest symbols are in central Ottawa, where more urban wards are located. The city of Ottawa covers a large and diverse area, encompassing urban, suburban, and rural regions. It's possible that urban wards have higher numbers of collisions involving cyclists simply because they have more collisions overall. It would be helpful to show the wards with the highest proportion of collisions involving cyclists, instead of the highest total count. You can calculate proportions in your data using a process called normalization.

To normalize the data, you need the total number of collisions and the number of collisions involving cyclists. Your aggregation already has collisions involving cyclists, while the Collisions dataset contains the total number of collisions. You'll add the Collisions dataset to the Spatial Aggregation 1 dataset by running another spatial aggregation.
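As a rough illustration of the idea, normalization here amounts to dividing each ward's count of collisions involving cyclists by its total count of collisions. The sketch below assumes made-up counts for three hypothetical wards, and the `normalize` function is an invented name, not part of Insights.

```python
def normalize(part_counts, total_counts):
    """Divide each ward's cyclist-collision count by its total collision count."""
    proportions = {}
    for ward, total in total_counts.items():
        if total == 0:
            proportions[ward] = None  # no collisions, so the proportion is undefined
        else:
            proportions[ward] = part_counts.get(ward, 0) / total
    return proportions


# Made-up counts for three hypothetical wards.
cyclist_counts = {"Urban ward": 45, "Suburban ward": 12, "Rural ward": 2}
total_counts = {"Urban ward": 500, "Suburban ward": 400, "Rural ward": 100}

proportions = normalize(cyclist_counts, total_counts)
for ward, value in proportions.items():
    label = "n/a" if value is None else f"{value:.2f}"
    print(f"{ward}: {label}")
```

With these invented numbers, the urban ward has both the highest count and the highest proportion, but the two rankings do not have to agree in general, which is why the normalized map is worth making.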

  1. Drag the Collisions dataset onto the map and drop it on the Spatial aggregation drop zone.

    The Spatial Aggregation pane appears.

  2. In the Spatial Aggregation pane, ensure that Choose an area layer is set to Spatial Aggregation 1, Choose a layer to summarize is set to Collisions, and Style by is set to Count of Collisions.

    Spatial Aggregation parameters

    These parameters will aggregate the Collisions dataset by the areas in the Spatial Aggregation 1 dataset. These areas are the same as the city wards.

  3. Click Run.

    A result dataset called Spatial Aggregation 2 is created.

  4. In the data pane, expand Spatial Aggregation 2.

    The last two fields in the Spatial Aggregation 2 dataset are Count of Collisions with cyclists and Count of Collisions. The first field was added when you ran spatial aggregation the first time, and the second was added when you ran it the second time. You'll use these fields to style and normalize your map.

  5. On the map, click the Count of Collisions arrow.

    Map legend arrow

    The Layer options pane appears.

  6. Click the Symbology tab.

    Symbology tab

    The Symbology tab of the Layer options pane contains parameters for styling the layer. You'll style the layer by the number of collisions involving cyclists and divide that number by the total count of collisions to normalize the data. You'll also change the symbol type from size to color, because color is more appropriate for depicting proportional data.

  7. Change the following parameters:
    • For Style by, choose Count of Collisions with cyclists.
    • For Symbol type, choose Counts and amounts (Color).
    • Expand Classification. For Classification type, choose Equal interval.
    • For Divide Count of Collisions with cyclists by, choose Count of Collisions.

    Layer options pane parameters

    The changes are applied to the map. Areas with higher proportions of accidents involving cyclists are darker colors.

    Map with normalized collisions involving cyclists

    Even when normalized, the central urban wards have the highest proportions of accidents involving cyclists.

  8. Close the Layer options pane.

    Now that you've created your map, you'll rename the map card to give it a more descriptive name.

  9. Click the empty area of your page to deactivate the map card. Click Card 1, type Collisions with cyclists by ward, and press Enter.
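The Equal interval classification chosen in step 7 divides the range of values into classes of equal width. The following is a minimal sketch of that general technique, not the Insights implementation: the proportions are made up, and the number of classes (5) is an assumption for illustration.

```python
def equal_interval_breaks(values, num_classes=5):
    """Return the upper bound of each class, splitting [min, max] into equal widths."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    return [low + width * i for i in range(1, num_classes + 1)]


def classify(value, breaks):
    """Return the zero-based index of the first class whose upper bound covers value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guard against floating-point round-off at the maximum


# Made-up normalized proportions for five hypothetical wards.
proportions = [0.01, 0.02, 0.04, 0.06, 0.11]

breaks = equal_interval_breaks(proportions)
classes = [classify(p, breaks) for p in proportions]
print(breaks)
print(classes)
```

A higher class index corresponds to a darker color on the map. Because the classes have equal width, a single unusually high value can leave most wards in the lowest classes.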

Create a combo chart

Next, you'll create a combo chart that shows the total number of collisions as a line chart and the number of collisions involving cyclists as a column chart. This chart will help you visualize the data in a nonspatial way. In ArcGIS Insights, charts and maps are linked if they use the same data, so you can interact with the chart to affect what is displayed on the map.

  1. In the data pane, for the Spatial Aggregation 2 dataset, click the circles next to NAME, Count of Collisions with cyclists, and Count of Collisions.

    Selected fields

    All three fields are selected.

  2. Drag the fields onto the empty part of the page next to the map card, point to the Chart drop zone, and drop the fields on Combo Chart.

    Combo Chart drop zone

    The chart is created, but due to the small default size of the card, some of the data may be difficult to read.

  3. Drag the handle on the right side of the combo chart to expand its size until all of the ward names are displayed.

    Handle to expand the chart size

  4. Click an empty area on the page to deactivate the combo chart card. Click Card 2 and rename the card Collisions by ward.

    Next, you'll use your map and chart to determine which ward has the highest proportion of cycling accidents. Your map has one ward with a particularly dark symbol, so you'll point to it to learn more about it.

  5. Click the map card to activate it and zoom in to the darkest wards in the center of the city. Point to the ward with the darkest color.

    Default pop-up

    The pop-up contains the proportion of collisions that involve cyclists, 0.1124 or about 11 percent. The line indicates that the values of all wards range from about 0.01 to 0.11, which means that this ward has the highest value. The average value is 0.04.

    The pop-up doesn't contain the name of the ward, so you'll configure it to show the NAME field, which contains ward names.

  6. In the data pane, for the Spatial Aggregation 2 dataset, point to the Location field and click the Display field button.

    Display field button

  7. For Choose Display Field, choose NAME.
  8. Click the Spatial Aggregation 2 arrow to collapse its fields.
  9. Point to the same ward.

    Configured pop-up

    The pop-up now shows the ward's name, Somerset. You can compare what you see on the map to the combo chart. On the chart, Somerset ward has both the highest total number of collisions (the line) and the highest number of collisions involving cyclists (the column).

    Combo chart with Somerset ward highlighted

    Based on the chart, it appears that the total number of collisions and the number of collisions involving cyclists are almost the same. However, the line chart and the column chart use different scales. The scale for the count of collisions with cyclists is on the left side of the chart, while the scale for the count of collisions is on the right side. The scale for the latter is 10 times larger.

    Outside of Somerset ward, there seems to be a general pattern that wards with more total collisions also have more collisions involving cyclists. One exception is College ward, which has one of the highest numbers of total collisions, but a relatively low number of collisions involving cyclists.

  10. On the ribbon, click the Save button.
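The effect of the two scales on the combo chart can be checked with a little arithmetic. The sketch below uses made-up counts and hypothetical axis maximums (they are not values from the tutorial data) to show why a column and a line point can be drawn at the same height even though one value is 10 times the other.

```python
def relative_height(value, axis_max):
    """Return how far up its own axis a value is drawn, from 0.0 to 1.0."""
    return value / axis_max


# Made-up counts for one hypothetical ward.
cyclist_collisions = 90   # drawn as a column against the left axis
total_collisions = 900    # drawn as a line against the right axis

left_axis_max = 100       # hypothetical left-axis maximum
right_axis_max = 1000     # hypothetical right-axis maximum, 10 times larger

column_height = relative_height(cyclist_collisions, left_axis_max)
line_height = relative_height(total_collisions, right_axis_max)

print(column_height, line_height)             # same height on the chart
print(total_collisions / cyclist_collisions)  # actual ratio between the counts
```

Reading each series against its own axis avoids mistaking equal heights for equal counts.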

In this tutorial, you used spatial aggregation, normalization, and a combo chart to determine which wards have the largest numbers of collisions and the largest proportion of collisions involving cyclists. In the next tutorial, you will look more closely at the collisions in Somerset ward to determine the relationship between the collision locations and bike route types.