Train a model

Land-Use Land-Cover (LULC) classification provides an overview of the general categories of land use and land cover across broad geographic areas, based on remotely sensed imagery. It plays a vital role in environmental monitoring, resource management, biodiversity conservation, disaster risk reduction, and climate change analysis, supporting the systematic monitoring of changes in land use, efficient allocation of resources, preservation of habitats, identification of areas prone to disasters, and evaluation of climate change impacts.

You'll use the Train Using AutoDL tool to train multiple models and identify the best-performing model to classify land cover from Synthetic Aperture Radar (SAR) imagery.

Download and explore the data

To get started, you'll download a project that contains the training data that you will use.

  1. Download the AutoDL_tutorial.ppkx package.

    The AutoDL_tutorial.ppkx file is downloaded to your computer.

    Note:

    A .ppkx file is an ArcGIS Pro project package and may contain maps, data, and other files that you can open in ArcGIS Pro. Learn more about managing .ppkx files in this guide.

    The project package is about 4.3 GB, so the download may take some time.

  2. Locate the downloaded file on your computer.

    Note:

    Depending on your web browser, you may be prompted to choose the file's location before you begin the download. Most browsers download to your computer's Downloads folder by default.

  3. Double-click the AutoDL_tutorial.ppkx project package.
    Note:

    The packaged project is extracted to your Documents folder. This process extracts the compressed project and the data to a new location and may take some time. The path to the project will be similar to C:\Users\username\Documents\ArcGIS\Packages\AutoDL_tutorial_7bd31e. The last several digits of the last folder name are randomly generated.

    If you want to start from the original state of the project, you can rename or delete this folder and double-click the AutoDL_tutorial.ppkx project package file to create a copy of the project.

    Note:

    If you don't have access to ArcGIS Pro or an ArcGIS organizational account, see options for software access.

    ArcGIS Pro version 3.2 or later is required for the Train Using AutoDL tool.

  4. If prompted, sign in with your ArcGIS Online organization account.

    A map appears, showing part of Germany. An image layer, LULCRaster2018.tif, displays on top of the topographic basemap.

    The land-use land-cover raster layer is shown on the map.

    The LULCRaster2018.tif layer is a classified raster showing Level-1 LULC classes. This classification system provides a broad categorization of the earth's surface into general land-use and land-cover types, such as urban areas, agricultural land, forests, water bodies, and wetlands. It serves as the most basic level of classification, offering an overview of major land categories for large-scale analysis.

  5. On the ribbon, on the Map tab, in the Navigate section, click Bookmarks and choose the Speyer bookmark.

    Speyer bookmark

    The map zooms to the southwestern part of the LULCRaster2018.tif layer.

    Detail view of built-up area near a river, with agricultural and wooded land

    The layer shows a built-up area near a river, with forest and agricultural areas.

  6. In the Contents pane, uncheck the LULCRaster2018.tif layer and check the SARImagery2018.tif layer.

    The SARImagery2018.tif layer is checked in the Contents pane.

    The SARImagery2018.tif layer is derived from remotely sensed Sentinel-1 Ground Range Detected (GRD) SAR imagery from 2018. This layer has a 10-meter resolution and is stored in TIFF format with three bands.

    Note:

    The original SAR imagery was downloaded and processed to prepare it for analysis. It consists of two polarization bands, VV and VH. Using the Band Arithmetic raster function, a derived VV/VH band was created by dividing the VV band by the VH band. This band combination is useful because it highlights differences in scattering behavior, which allows you to make inferences about the surface characteristics. The Composite Bands geoprocessing tool was then used to create a composite of VV, VH, and VV/VH with 8-bit unsigned pixel depth. A scripted sketch of this preprocessing appears after this step.

    One of the websites where you can download Sentinel-1 GRD data for anywhere on earth for free is the ASF Data Search Vertex website.

    Detail view of SAR imagery of the same area

    Sentinel-1 data is collected regularly, which allows comparison of different images to detect change over time.

    Manually classifying every pixel of this imagery into Level-1 LULC classes would be a long and tedious process. A deep learning model can automate LULC classification, so you can regularly acquire new imagery, classify it, and detect change relative to older images. In this tutorial, you'll test whether this workflow can be used to update your land-use land-cover data annually and to identify and report changes over time.
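If you prefer to script the preprocessing described in the note above rather than run the raster functions and geoprocessing tools interactively, the following is a minimal sketch with arcpy. It assumes the Image Analyst extension is available; the file paths are placeholders, and details such as the pixel-scaling option are assumptions to verify against the documentation, not the settings used to produce the tutorial data.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

vv = r"C:\data\S1_VV.tif"  # placeholder path to the VV polarization band
vh = r"C:\data\S1_VH.tif"  # placeholder path to the VH polarization band

# Stack VV and VH so that band arithmetic can reference them as B1 and B2.
arcpy.management.CompositeBands([vv, vh], r"C:\data\S1_VV_VH.tif")

# Derive the VV/VH band (method 0 = user-defined expression).
ratio = arcpy.ia.BandArithmetic(r"C:\data\S1_VV_VH.tif", "B1/B2", 0)
ratio.save(r"C:\data\S1_ratio.tif")

# Composite VV, VH, and VV/VH, then convert the result to 8-bit unsigned pixel depth.
arcpy.management.CompositeBands([vv, vh, r"C:\data\S1_ratio.tif"],
                                r"C:\data\SARComposite.tif")
arcpy.management.CopyRaster(r"C:\data\SARComposite.tif",
                            r"C:\data\SARImagery_8bit.tif",
                            pixel_type="8_BIT_UNSIGNED",
                            scale_pixel_value="ScalePixelValue")
```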

Explore the training data

The project package contains the training samples that you'll use to train and identify a best-performing model. These samples were created using the Export Training Data For Deep Learning tool. The steps to prepare this data are beyond the scope of this tutorial. You can read more about how it is done.

  1. On the ribbon, click the View tab. In the Windows section, click Catalog Pane.

    Catalog Pane button

  2. In the Catalog pane, expand Folders, expand the AutoDL_tutorial folder, and expand the trainingdata folder.

    Training data location

    The trainingdata folder contains the training data that you will use.

    The images folder contains image chips, extracted from the SARImagery2018.tif layer using the Export Training Data For Deep Learning tool.

    The labels folder contains label images showing the classified land cover types.

    Training a deep learning model to classify this imagery involves presenting the model with the image chips and with the matching labels, allowing the model to learn which classes are associated with which SAR band combinations.
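The tutorial does not require you to create these chips yourself, but as a point of reference, a minimal sketch of how such chips might be exported with arcpy is shown below. The paths, tile size, and stride are illustrative assumptions rather than the settings used to build the packaged training data, the Image Analyst extension is assumed to be available, and parameter names should be verified against the documentation for your ArcGIS Pro version.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster=r"C:\data\SARImagery2018.tif",      # source imagery for the chips
    out_folder=r"C:\data\trainingdata",           # receives the images and labels folders
    in_class_data=r"C:\data\LULCRaster2018.tif",  # classified raster used as labels
    image_chip_format="TIFF",
    tile_size_x=256, tile_size_y=256,             # chip size in pixels (illustrative)
    stride_x=128, stride_y=128,                   # overlap between chips (illustrative)
    metadata_format="Classified_Tiles")           # metadata format used for pixel classification
```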

Train multiple models

One of the challenges with deep learning is determining which model architecture to use. This process of choosing and training a model can be confusing and lengthy, as the models have different strengths and weaknesses, and they require different inputs and parameters. The Train Using AutoDL tool will allow you to select a set of model architectures to train. It will then train and test them and report which model performed best.

The Train Using AutoDL tool trains the set of deep learning models by building training pipelines and automating much of the training process. This includes data augmentation, model selection, hyperparameter tuning, and batch size deduction. Its outputs include performance metrics of the best model on the training data and a trained deep learning model package (.dlpk file) that can be used in the Extract Features Using AI Models or Classify Pixels Using Deep Learning tools to classify other images.

Note:

Using the deep learning tools in ArcGIS Pro requires that you have the correct deep learning libraries installed on your computer. If you do not have these files installed, save your project, close ArcGIS Pro, and follow the steps in the Get ready for deep learning in ArcGIS Pro instructions. These instructions also explain how to check whether your computer's hardware and software can run deep learning workflows and include other useful tips. When you are finished, you can reopen your project and continue with the tutorial.
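The steps that follow set up and run the tool in its dialog. For reference only, a roughly equivalent run could also be scripted, as in the hedged sketch below. It assumes the tool is exposed as arcpy.ia.TrainUsingAutoDL, the parameter order mirrors the tool dialog, the paths are placeholders, and the network names and the boolean for Save Evaluated Models are assumptions to verify against the Train Using AutoDL documentation for your ArcGIS Pro version.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")
arcpy.env.processorType = "GPU"  # train on the GPU
arcpy.env.gpuId = "0"            # adjust if your CUDA-enabled GPU has a different ID

arcpy.ia.TrainUsingAutoDL(
    r"C:\path\to\AutoDL_tutorial\trainingdata",       # Input Training Data folder
    r"C:\path\to\AutoDL_tutorial\ClassifiedSARLULC",  # Output Model folder
    None,                                             # no pretrained model to fine-tune
    4,                                                # Total Time Limit (Hours)
    None,                                             # AutoDL Mode: accept the default (Basic)
    ["HRNet", "PSPNetClassifier", "UnetClassifier"],  # Neural Networks to train and evaluate
    True)                                             # Save Evaluated Models
```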

  1. On the ribbon, on the Analysis tab, in the Geoprocessing section, click Tools.

    Tools button in the Geoprocessing section of the Analysis tab of the ribbon

  2. In the Geoprocessing pane, in the search box, type train using autodl. In the search results, click the Train Using AutoDL tool.

    The Train Using AutoDL tool in the search results

  3. On the Train Using AutoDL tool pane, for Input Training Data, click the Browse button. Browse to Folders and the AutoDL_tutorial folder, click the trainingdata folder, and click OK.

    Input Training Data browse dialog box with trainingdata folder selected

  4. For Output Model, click the Browse button. Browse to the AutoDL_tutorial folder, type ClassifiedSARLULC, and click Save.

    Output Model browse dialog box with new name ClassifiedSARLULC added

    This creates a new folder to contain the output trained model or models.

    Next, you'll specify how long the tool should spend on training the models.

  5. For Total Time Limit (Hours), type 4.

    Train Using AutoDL tool with Total Time Limit set to 4 hours

    The tool will work on the task for up to four hours. Depending on your computer's GPU, it may use the entire time, or it may complete the task in a shorter time.

    Based on the format of the training data, you'll see a list of supported neural networks specific to pixel classification.

  6. Expand the Advanced Options section.

    Advanced Options section

  7. For Neural Networks, click Add Many.

    Neural Networks Add Many button

    The Neural Networks list appears.

    Neural Networks selection list

  8. Check the following neural networks:
    • HRNet
    • PSPNetClassifier
    • UnetClassifier
  9. Click Add.

    The neural networks are added to the list of networks to be trained and evaluated by the Train Using AutoDL tool.

    Networks added to the list for training and evaluation

    These are neural networks that classify pixels in a raster using semantic segmentation. They are commonly used for land-cover classification.

  10. Check the Save Evaluated Models check box.

    Save Evaluated Models check box is checked.

    You have specified that the Train Using AutoDL tool should run for four hours training and evaluating the three models.

    Note:

    If you have a computer with a suitable GPU, you can run the tool to train and evaluate the three models for the next four hours. To learn more about GPUs and how they are used for deep learning processes, see the Check for GPU availability section in the Get ready for deep learning in ArcGIS Pro tutorial. Optionally, you can skip the training step and review a folder that has been prepared for you with all of the outputs of the tool. If you are not going to run the model training process, read the next four steps and start working again in the next section, Review the model training results.

  11. Click the Environments tab.

    Environments tab

  12. In the Processor Type section, for GPU ID, type 0.

    GPU ID set to 0

    If your CUDA-enabled GPU has a different GPU ID, use that ID number. This may be necessary when your computer has more than one GPU.

  13. Optionally, click Run.

    The process will run for up to four hours.

    You can view messages about the status of the process as the tool runs.

  14. At the bottom of the Train Using AutoDL pane, click View Details.

    View Details link

  15. Click the Messages tab.

    Tool messages on Messages tab

    When the process is complete, you can view the outputs in the Messages pane.

    Model training result metrics messages

    In the training process, the tool randomly selects 10 percent of the training dataset to reserve for validation and trains the models on the other 90 percent. As training proceeds, the model calculates how well the predicted values match the values in the validation dataset. The table summarizes how well each model performed. Partly because of this random selection of validation samples, training models with this tool is not deterministic. The tool also randomly sets some initial conditions each time it is run. For more information, see the help for the tool. Given the same set of training data, different models may be chosen as the best model, and different values will appear in this table.

    The table includes columns for both Training Loss and Validation Loss. Training Loss shows how well the model learned from the training data. Validation Loss shows how well what the model learned carried over to the validation set; in effect, it indicates how well the learning generalized. Lower values for these two measures indicate better training performance, so in this case, the UnetClassifier model performed best in terms of Training Loss and Validation Loss.

    Accuracy and Dice measure how well the model correctly classified pixels, and higher values are better. In this case, the PSPNetClassifier model had a higher Accuracy value than the other models. A short worked example of these metrics follows these steps.

    The Learning Rate is a hyperparameter used in the training of the neural networks. If you did not specify the value, it will be calculated by the training tool. The tool attempts to optimize the learning rate, weighing speed against quality. The resulting Learning Rate value listed in the table is primarily of interest if, as an advanced user, you want to continue training the model and need guidance on choosing the Learning Rate value for that additional training.

    The Time column indicates how long it took to train each model. The time for the first model is longer than for the others because some data processing overhead is incurred for the first model and then reused by the models that follow.

    The backbone is the default model backbone. If you set the AutoDL Mode parameter to Advanced instead of Basic, multiple backbones may be tried.

  16. Close the Messages pane.
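Accuracy and Dice, which appear in the metrics table above, are easy to illustrate with a toy example. The snippet below is an illustrative NumPy computation, not the tool's internal implementation; the two small label arrays are made up, and in practice the tool reports aggregates over all validation chips and classes.

```python
import numpy as np

ground_truth = np.array([[1, 1, 2], [2, 3, 3], [1, 2, 3]])  # toy ground-truth labels
prediction   = np.array([[1, 2, 2], [2, 3, 3], [1, 2, 1]])  # toy predicted labels

# Pixel accuracy: the fraction of pixels whose predicted class matches the label.
accuracy = np.mean(prediction == ground_truth)

# Dice for class 2: 2 * |intersection| / (|predicted as 2| + |labeled as 2|).
pred2, true2 = (prediction == 2), (ground_truth == 2)
dice_class2 = 2 * np.logical_and(pred2, true2).sum() / (pred2.sum() + true2.sum())

print(accuracy, dice_class2)  # about 0.78 and 0.86 for these toy arrays
```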

Review the model training results

The project package you downloaded includes a zipped folder of the Train Using AutoDL tool results. You'll review these now.

  1. In the Catalog pane, browse to Folders and the AutoDL_tutorial folder. Right-click the userdata folder and choose Copy Path.

    Copy Path option

  2. In Microsoft File Explorer, paste the path into the path box.

    File Explorer showing the path to the userdata folder

    The path will be similar to: C:\Users\username\Documents\ArcGIS\Packages\AutoDL_tutorial_7bd31e\userdata

    Inside this folder there are two zip archives. The LULCClassifierModel.zip archive contains the results created by running the Train Using AutoDL tool with the settings specified earlier.

    The TrainingData.zip archive contains the data used to create the training data.

  3. Right-click the LULCClassifierModel.zip archive and choose Extract All.

    Extract All option

  4. Click Extract.

    Extract window with the compressed file tool showing the output path

  5. Open the LULCClassifierModel folder.

    Extracted folder in File Explorer

    This folder contains several outputs from running the tool. These include the following:

    • ModelCharacteristics is a folder containing images used in the README.html file.
    • models is a folder containing all the trained models that were evaluated on the subset of training data.
    • ArcGISImageClassifier is a Python script with code used in classifying imagery for the training process.
    • ClassifiedSARLULC.dlpk is a complete package of all the files stored in the model output folder, including the trained model, the model definition file, and the model metrics file. This package can be shared to ArcGIS Online and ArcGIS Enterprise as a trained model item for others to use.
    • ClassifiedSARLULC.emd is a model definition file that contains model information about the tile size, classes, model type, and so on. It is a plain JSON file that you can open in any text editor (a quick way to inspect it is sketched after these steps).
    • ClassifiedSARLULC.pth is a pretrained weights file, usually saved in a PyTorch format.
    • model_metrics.html is an HTML page with details about the learning rate used and the accuracy of the trained model.
    • README.html is an HTML page with details about the evaluation of models and accuracy of the best-performing model.

    The contents of the extracted folder in File Explorer

  6. Double-click the README.html file.

    The page opens in a browser tab. This page shows information about the best-performing model, how it compared to the other models, and how well it was able to classify LULC from your input training data.

    The Training and Validation loss section displays a graph of the amount of error that was present as the model trained over time. When the tool ran, 90 percent of your input data was used to train the model and 10 percent was used to validate the model to determine its accuracy. Ideally, you would see these loss values decrease and converge as the number of images (batches) processed increases over the available training time.

    Best Performing Model Report

    Note:

    Because the validation samples are randomly selected from the set of training image chips, and because some hyperparameters are randomly set to start training, the Training Loss and Validation Loss metrics can be different each time the tool is run, even on the same training dataset.

    In this graph, you can see that for the first 60 batches of images processed, the error shown by the Validation line is high but varies quite a bit for each batch. After 60 batches, the amount of error in the Validation process decreases and varies less, except for a peak at 120. The Train line shows a more steadily decreasing amount of error.

    The Analysis of the model section displays the precision of the classes of data. Your model technically had five classes: four for land cover and one for NoData. A higher precision value means the model is more confident in its results. You can read more about interpreting the precision and accuracy statistics of deep learning tools.

    Analysis of the model table

    Finally, this page shows a few sample chips comparing your original LULC training data, Ground Truth, on the left and the model's Predictions on the right. Ideally, the prediction should closely match the original ground truth.

  7. Close the README page in your web browser.
  8. On the Quick Access toolbar, click the Save Project button.

    Save Project button
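Before moving on, you can optionally check what the model expects from input imagery by inspecting the ClassifiedSARLULC.emd file mentioned above. The model definition file is plain JSON; the snippet below is a minimal sketch, the path is a placeholder, and key names such as ModelType, ImageHeight, and Classes are typical examples that can vary by model type.

```python
import json

# Adjust this placeholder path to where you extracted LULCClassifierModel.zip.
emd_path = r"C:\path\to\LULCClassifierModel\ClassifiedSARLULC.emd"

with open(emd_path) as f:
    emd = json.load(f)

print(sorted(emd.keys()))                  # list everything the file records
for key in ("ModelType", "ImageHeight", "ImageWidth", "Classes"):
    print(key, "->", emd.get(key))         # .get avoids an error if a key is absent
```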

You've trained a deep learning model for LULC classification on Sentinel-1 imagery taken in 2018 and found that the best-performing model for that task is based on the UnetClassifier architecture. Next, you'll use this model to automatically classify land cover in Sentinel-1 imagery taken in 2024.


Apply the model

Once a deep learning model is created, it can be used to quickly classify land cover on similar data captured at different dates. This allows you to monitor land cover change over time. As an example, you'll take the model that you created and use it to classify Sentinel-1 imagery captured in 2024 in the same geographic area.

Use the trained model to classify new imagery

Next, you'll use the deep learning model to classify Sentinel-1 imagery collected in 2024 with the Classify Pixels Using Deep Learning tool.
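The steps below run the tool from its dialog. For reference, a hedged arcpy sketch of the same operation is shown here; the paths are placeholders, and the argument string format and batch size are assumptions to verify against the Classify Pixels Using Deep Learning documentation for your ArcGIS Pro version.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")
arcpy.env.processorType = "GPU"  # run inference on the GPU
arcpy.env.gpuId = "0"            # adjust if your CUDA-enabled GPU has a different ID
arcpy.env.cellSize = 10          # match the 10-meter cell size the model was trained on

classified = arcpy.ia.ClassifyPixelsUsingDeepLearning(
    r"C:\path\to\SARImagery2024.tif",                        # imagery to classify
    r"C:\path\to\ClassifiedSARLULC\ClassifiedSARLULC.dlpk",  # trained model package
    "batch_size 4")                                          # raise to 8 or 16 with >= 8 GB of GPU VRAM

classified.save(r"C:\path\to\ClassifiedLULC2024.tif")
```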

  1. In ArcGIS Pro, click the Deploy Model map tab.

    Deploy Model map tab

    The Deploy Model map shows the SARImagery2024.tif layer.

    Deploy Model map

  2. In the Geoprocessing pane, click the Back button.
  3. In the search box, type classify pixels using deep learning. In the search results, click the Classify Pixels Using Deep Learning tool.

    Search result for the Classify Pixels Using Deep Learning tool

  4. On the Classify Pixels Using Deep Learning tool pane, for Input Raster, choose SARImagery2024.tif.

    Input Raster set to SARImagery2024.tif

  5. For Output Raster Dataset, type ClassifiedLULC2024.

    Output Raster Dataset parameter set to ClassifiedLULC2024

  6. For Model Definition, click the Browse button and browse to the ClassifiedSARLULC folder. Click the ClassifiedSARLULC.dlpk deep learning package.
    Note:

    If you did not train the model on your machine, you can use the trained model that is provided with the project. In the project folder structure, browse to the userdata\ClassifiedSARLULC\ClassifiedSARLULC folder, and click the ClassifiedSARLULC.dlpk deep learning package.

    The ClassifiedSARLULC.dlpk file in the ClassifiedSARLULC folder.

    The tool with Model Definition file specified.

    After the deep learning package is loaded by the tool, the model Arguments table appears. You'll accept the default values. You can reduce processing time by increasing the batch size to 8 or 16, if you have a computer with a GPU with at least 8 GB of dedicated VRAM. If your GPU has less than 8 GB of VRAM, you may need to decrease the batch size to 2.

    Arguments table

  7. Click the Environments tab.
  8. In the Raster Analysis section, for Cell Size, type 10.

    The cell size of the SAR data doesn't exactly match the cell size the model was trained on, so you can specify that the output should have a cell size of 10.

  9. For Processor Type, choose GPU.

    The process of classifying this image may take 40 minutes or more. Optionally, you can skip running the tool and view the tool output, which is provided in the project package.

  10. In the Processor Type section, for GPU ID, type 0.

    If your CUDA-enabled GPU has a different GPU ID, use that ID number. This may be necessary when your computer has more than one GPU.

  11. Optionally, click Run.

    If you run the tool, when it finishes, view the results on the map.

    Note:

    The image class colors will be randomly assigned. You can change them to suit your preferences. Right-click a class symbol, and in the color palette, choose a color to your liking.

  12. If you didn't run the tool, click the Results map tab to see tool results.

    The classified imagery appears.

    Results map

    Note:

    Deep learning is not a deterministic process, so the results you obtain may be slightly different.

    You can compare the 2024 raster to the 2018 raster to detect large-scale land-use change over time. Now that the deep learning model has been trained, you can apply it to new SAR imagery every year, or more frequently. This trained model can become part of an effective workflow to monitor land-cover change over time. A simple scripted sketch of such a comparison follows these steps.

  13. Press Ctrl+S to save the project.
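As a follow-up, one illustrative way to flag where the classification changed between 2018 and 2024 is a raster-algebra comparison, as in this hedged sketch. The paths are placeholders, it assumes the two classified rasters share the same class values and alignment, and the output only marks where the class changed rather than describing the type of change.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

lulc_2018 = arcpy.Raster(r"C:\path\to\LULCRaster2018.tif")
lulc_2024 = arcpy.Raster(r"C:\path\to\ClassifiedLULC2024.tif")

# 1 where the predicted class differs between the two years, 0 where it is unchanged.
change_mask = arcpy.ia.Con(lulc_2018 != lulc_2024, 1, 0)
change_mask.save(r"C:\path\to\LULCChange_2018_2024.tif")
```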

In this tutorial, you used the Train Using AutoDL tool to train multiple models to classify land cover from Sentinel-1 SAR imagery and automatically identify which performed best. You then applied the best-performing trained model to more recent imagery.

Note:
Esri provides over 60 pretrained models in ArcGIS Living Atlas to expedite the process of classifying imagery and detecting objects. These models are free to download, and you can deploy them directly on compatible imagery inputs. You can also fine-tune these pretrained models on your own training data; this usually takes less time than training a model from scratch. For example, see the Detect objects with a deep learning pretrained model and Map floods with SAR data and deep learning tutorials.

Visit this tutorial series for more deep learning tutorials.

For more information on preparing SAR imagery for deep learning workflows, see this tutorial.