Simulation Scoring Tool
The Simulation Scoring tool draws samples from an approximation of a model object's error distribution. Whereas standard scoring returns only the predicted mean value, Simulation Scoring also accounts for the error distribution to provide a range of possible values.
This tool uses the R Tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R Tool. See Download and Use Predictive Tools.
Connect inputs
- M anchor: The model object produced by one of the R-based predictive modeling tools.
- V anchor: Optional. The validation dataset to use when connecting a non-Linear Model (non-LM). Alteryx tools that create non-LM models are Logistic Regression Tool, Count Regression Tool, Gamma Regression Tool, Boosted Model Tool, Decision Tree Tool, Forest Model Tool, Naive Bayes Classifier Tool, Neural Network Tool, Spline Model Tool, Stepwise Tool, and Support Vector Machine Tool.
- If you are scoring an LM model, the error distribution can be directly sampled due to the properties of LMs.
- If you are scoring other models (non-LM), homoscedasticity of the error distributions with respect to the predictors is assumed. This allows a single error distribution to be calculated by scoring the model against a validation set. That error distribution is then sampled and added to the score results for the incoming data.
- S anchor: The simulation data to score. This must contain all of the fields (with identical types and names) used to create the associated predictive model.
Warning
Do not connect this input when the incoming model object uses a Linear Regression Tool.
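The non-LM procedure described in this section (score a validation set, collect the errors, then sample those errors and add them to the scores for the incoming data) can be illustrated with a minimal Python sketch. This is not the tool's actual R implementation: the function `simulate_scores`, the `predict` callable, and the toy model are all hypothetical names chosen for illustration, and sampling residuals with replacement is an assumption about how the error distribution is drawn from.

```python
import random


def simulate_scores(predict, validation, new_records, draws=1, seed=None):
    """Residual-resampling sketch for a non-LM model (illustrative only).

    predict     -- hypothetical callable mapping a record to a point prediction
    validation  -- list of (record, actual_value) pairs
    new_records -- records to score
    draws       -- number of simulated values per record
    """
    rng = random.Random(seed)
    # Empirical error distribution: actual minus predicted on the validation set.
    residuals = [actual - predict(rec) for rec, actual in validation]
    if not residuals:
        raise ValueError("validation set must not be empty")
    results = []
    for rec in new_records:
        point = predict(rec)
        # Homoscedasticity assumption: one residual pool serves every record.
        results.append([point + rng.choice(residuals) for _ in range(draws)])
    return results


def toy_predict(rec):
    # Toy model (an assumption for illustration): prediction is 2 * x.
    return 2.0 * rec["x"]


toy_validation = [({"x": 1.0}, 2.5), ({"x": 2.0}, 3.5), ({"x": 3.0}, 6.0)]
simulated = simulate_scores(
    toy_predict, toy_validation, [{"x": 10.0}], draws=5, seed=42
)
```

In this toy setup the validation residuals are 0.5, -0.5, and 0.0, so each simulated value for `x = 10` is the point prediction 20.0 shifted by one of those three errors.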
Configure the tool
- Name results of score simulation: The field name for the generated results. The field name must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). Note that R is case-sensitive.
- The number of records to score at a time: The tool can break the input data into chunks and score one chunk at a time, thereby avoiding R's in-memory processing limitation. This option sets the maximum number of incoming records in each chunk of data.
- How many samples from error distribution per iteration: The number of draws from the model's error distribution for each incoming record.
- Set Random Seed: (Optional) Specify a random seed. This option is hidden if there is a seed field in the data to be scored.
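How the configuration options above interact can be sketched in Python under some loudly stated assumptions. The helper names (`chunked`, `score_in_chunks`, `noisy_score`) are hypothetical, the naming rule is interpreted as ASCII letters only, and emitting one output row per draw is an assumed layout rather than documented behavior of the tool.

```python
import random
import re
from itertools import islice

# Assumed reading of the naming rule: an ASCII letter first,
# then letters, digits, "." or "_".
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._]*")


def chunked(records, chunk_size):
    """Yield successive lists of at most chunk_size records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def score_in_chunks(records, score_one, field_name, chunk_size, draws, seed=None):
    """Score records chunk by chunk, taking `draws` samples per record."""
    if not FIELD_NAME.fullmatch(field_name):
        raise ValueError(f"invalid result field name: {field_name!r}")
    if chunk_size < 1 or draws < 1:
        raise ValueError("chunk_size and draws must be positive")
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    out = []
    for chunk in chunked(records, chunk_size):
        # Only one chunk is held for scoring at a time.
        for rec in chunk:
            for _ in range(draws):
                row = dict(rec)  # copy so the input record is not mutated
                row[field_name] = score_one(rec, rng)
                out.append(row)
    return out


def noisy_score(rec, rng):
    # Hypothetical scorer: the value of x plus standard normal noise.
    return rec["x"] + rng.gauss(0.0, 1.0)


rows = score_in_chunks(
    [{"x": 1}, {"x": 2}, {"x": 3}],
    noisy_score,
    field_name="Sim.Score_1",
    chunk_size=2,
    draws=2,
    seed=7,
)
```

With three input records and two draws per record, `rows` holds six rows, and rerunning with the same seed reproduces the same simulated values.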
View the output
- D anchor: The data to be scored, along with the simulated score.