Assisted Modeling Tool

Use of the Assisted Modeling tool requires participation in the Alteryx Analytics Beta program. Visit the Alteryx beta program, also known as the Alteryx Customer Feedback Program, to find out more. All Alteryx Beta Program notifications and disclaimers apply to this content. Functionality described here may or may not be available as part of the Beta Program. To send feedback about the documentation, send an email message to helpfeedback@alteryx.com.

Step-by-Step Tutorial

The Assisted Modeling tool simplifies the model-building process. With Assisted Modeling, you're guided through the process of building and evaluating several predictive models and selecting the one that best suits your business use case. Assisted Modeling helps you identify a target, set data types, select features, select the most relevant algorithms, and build your models.

At each step, Alteryx analyzes your dataset and the choices you've made so far, makes further suggestions, and then lets you make the final decision.

To get started, follow the steps below. If you want to practice with sample datasets, visit Practice Assisted Modeling.

Step 1. Create Samples

Example workflow showing a dataset split using the Create Samples tool

Sample workflow (yours may be different)

As a best practice, you should split your dataset into two datasets: one for training the model and another for testing it. You can then compare the model's training accuracy with how well it performs on the test data.

It is often a good idea to manually validate your model. The Assisted Modeling tool applies a cross-validation method behind the scenes when building your predictive model, but with your own validation dataset in hand, you are better prepared to apply your model to new, unseen data. If the model does not perform well on the test data, then you have a basis for tuning the model. You might also want to keep a holdout sample available. To find out more about these all-important samples, visit the Alteryx Community Data Science blog post, Holdouts and Cross Validation: Why the Data Used to Evaluate your Model Matters.

One way to create samples is with the Create Samples Tool.

To do this:

  1. Prepare and clean your dataset.
  2. Make sure your dataset is ready for training a predictive model. To learn more about why this is a crucial part of the data science lifecycle, visit the Alteryx Community Data Science blog, The Data Science Lifecycle.
  3. Create a new workflow and label it Samples.
  4. Split your dataset into training and testing datasets using the Create Samples Tool. Alteryx recommends 80% estimation and 20% validation.
  5. Use the Output Data Tool to output the smaller sample to an Alteryx database file and save it for later.

Once you've completed these steps, you can start building a machine learning pipeline.
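The 80/20 split that the Create Samples tool performs can be sketched in plain Python. This is only an illustration of the idea, not what Designer does internally; the dataset and column names here are hypothetical.

```python
import random

# Hypothetical dataset: a list of records (rows). In Designer, the
# Create Samples tool performs this split for you.
records = [{"id": i, "value": i * 2} for i in range(100)]

random.seed(42)          # fix the seed so the split is reproducible
shuffled = records[:]
random.shuffle(shuffled)

cut = int(len(shuffled) * 0.8)  # 80% estimation, 20% validation
training = shuffled[:cut]
testing = shuffled[cut:]

print(len(training), len(testing))  # 80 20
```

Shuffling before slicing matters: taking the first 80% of an ordered file can bias both samples toward whatever the sort order reflects.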

Splitting your dataset into test and train datasets is only one of several options you could use for validating a machine learning pipeline. For example, you could also perform cross validation.
To compare the different models in the background, Assisted Modeling uses a sampling and training technique known as 3-fold cross validation to generate model evaluation scores such as accuracy and log loss.
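Assisted Modeling runs cross validation for you behind the scenes, but the mechanics are worth understanding. The sketch below implements 3-fold cross validation by hand on a toy dataset with a deliberately trivial threshold "model"; everything in it (the data, the train and accuracy functions) is an illustrative assumption, not Alteryx's implementation.

```python
import random

# Toy labeled dataset: (feature, label) pairs where the true rule is x > 50.
random.seed(0)
data = [(x, int(x > 50)) for x in range(100)]
random.shuffle(data)

k = 3
folds = [data[i::k] for i in range(k)]  # deal rows into 3 folds

def train(rows):
    # Hypothetical "model": pick a threshold halfway between the
    # largest negative and smallest positive feature values seen.
    positives = [x for x, y in rows if y == 1]
    negatives = [x for x, y in rows if y == 0]
    return (min(positives) + max(negatives)) / 2

def accuracy(threshold, rows):
    correct = sum(1 for x, y in rows if int(x > threshold) == y)
    return correct / len(rows)

scores = []
for i in range(k):
    holdout = folds[i]                                   # 1 fold to score on
    rest = [row for j in range(k) if j != i for row in folds[j]]  # 2 folds to train on
    model = train(rest)
    scores.append(accuracy(model, holdout))

print(sum(scores) / k)  # mean cross-validation accuracy
```

Each row is scored exactly once by a model that never saw it during training, which is why the averaged score is a fairer estimate than accuracy on the training data itself.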

Step 2. Create Model Pipeline

Example workflow showing a machine learning pipeline output from assisted modeling

Sample workflow (yours may be different)

As a best practice, you should create a workflow to train and save your model. This workflow contains your model pipeline, which Assisted Modeling creates for you during the Assisted Modeling process.

To create an assisted modeling pipeline:

  1. Create a new workflow and label it "training only".
  2. Drag an Input Data Tool onto the canvas and connect it to your training dataset. You can use the 80% sample you created and saved.
  3. Click the Assisted Modeling tool in the Machine Learning tool palette and drag it to the workflow canvas, connecting it to your existing workflow. At minimum, you must have an input tool, such as the Input Data Tool, already connected to your dataset.
  4. Run your workflow to enable Assisted Modeling. If Assisted Modeling does not display, click the Assisted Modeling tool on the canvas and then click Run.
  5. Select the Assisted option.
  6. Click Start Assisted Modeling.
  7. Click Start Building in the onboarding window. Onboarding is an introductory tutorial explaining the steps in Assisted Modeling. You can dismiss onboarding so that it does not display again.

  8. Follow the steps in Assisted Modeling to build and select a model.

Step 3. Validate Your Model

Sample workflow showing how the model and a small sample dataset connect to the Score tool

Sample workflow (yours may be different)

As a best practice, you should test, or validate, your model. This test is a simulation of the way the model will behave once it sees new data. You can compare predictions between the dataset used to build the model and the sample you set aside for testing. The predicted outcomes may not be exactly the same, but they should not be vastly different.

  1. Create a copy of your workflow and label it Testing.
  2. Using an Input Data Tool, connect your 20% sample dataset to the D anchor on the Predict Tool.
  3. Add a Browse tool and then run the workflow to evaluate results.

Did the model perform as well or better than it did in the Assisted Modeling tool? If yes, you can create more workflows and connect unseen datasets in place of the test data input. If your test dataset is representative of the training dataset, you can expect results to be nearly the same.
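The comparison described above amounts to computing accuracy on both datasets and looking at the gap. The sketch below uses hypothetical scored rows; in Designer, the actual-versus-predicted values would come from the Predict tool's output, and the 0.10 gap threshold is an illustrative rule of thumb, not an Alteryx recommendation.

```python
# Hypothetical scored output: (actual label, predicted label) pairs for
# the training dataset and for the 20% test sample.
train_rows = [(1, 1), (0, 0), (1, 1), (0, 1), (1, 1), (0, 0)]
test_rows = [(1, 1), (0, 0), (1, 0), (0, 0), (1, 1)]

def accuracy(rows):
    return sum(1 for actual, predicted in rows if actual == predicted) / len(rows)

train_acc = accuracy(train_rows)
test_acc = accuracy(test_rows)
print(f"train={train_acc:.2f} test={test_acc:.2f}")

# A large gap suggests overfitting and is a basis for tuning the model.
if train_acc - test_acc > 0.10:
    print("Consider tuning: test accuracy lags training accuracy.")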

If the model did not perform well on the test data, then you have a basis for tuning the model. To find out more, visit the Alteryx Community Data Science blog, Hyperparameter Tuning Black Magic.