Create Samples Tool

The Create Samples tool randomly splits the input records into two or three samples. You specify the percentage of records to place in the estimation and validation samples. If the two percentages total less than 100%, the remaining records go to the holdout sample.

Configure the tool

  1. Estimation sample percent: The percentage of the data to be placed in the estimation sample (between 1% and 99%).
  2. Validation sample percent: The percentage of the data to be placed in the validation sample (between 1% and 99%).
  3. Random seed: An integer value between 1 and 1000. Changing this value changes which sample each row of the data is placed in. Unless there is a specific reason to change it, the default value of 1 is recommended.
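
The limits listed above can be checked before a run. The Python sketch below is illustrative only and is not the tool's own code. The function name check_settings is hypothetical, and the sketch assumes that the two percentages may not total more than 100.

```python
def check_settings(estimation_pct, validation_pct, seed=1):
    """Validate hypothetical Create Samples settings; return the holdout percent."""
    for name, pct in (("estimation", estimation_pct), ("validation", validation_pct)):
        if not 1 <= pct <= 99:
            raise ValueError(f"{name} percent must be between 1 and 99, got {pct}")
    if estimation_pct + validation_pct > 100:
        # Assumption: the two samples together cannot exceed the whole data set.
        raise ValueError("estimation and validation percents total more than 100")
    if not (isinstance(seed, int) and 1 <= seed <= 1000):
        raise ValueError(f"seed must be an integer between 1 and 1000, got {seed}")
    # Whatever is left over after the two samples goes to the holdout sample.
    return 100 - estimation_pct - validation_pct
```

For example, check_settings(60, 30) returns 10, meaning 10% of the records would fall in the holdout sample.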

View the output

There are 3 outputs from the Create Samples tool:

  • E anchor: The Estimation stream contains a random sample of input records. The number of records in this stream equals the Estimation sample percent of the total record count.
  • V anchor: The Validation stream contains a random sample of input records. The number of records in this stream equals the Validation sample percent of the total record count.
  • H anchor: The Holdout stream will include any leftover records that were not placed in either the Estimation or Validation samples.

If there is an odd number of records and Estimation and Validation are both set to 50%, the Estimation output stream will have one more record than the Validation stream.
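
The split and the odd-record case can be sketched in Python as follows. This is a minimal illustration, not the tool's actual algorithm: the function name create_samples is hypothetical, and the cut-off rule (shuffle the records, then assign them by position against cumulative percentage thresholds) is an assumption chosen because it reproduces the behavior described above.

```python
import random

def create_samples(records, estimation_pct, validation_pct, seed=1):
    """Split records into (estimation, validation, holdout) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # the same seed gives the same assignment
    n = len(shuffled)
    # Thresholds are kept scaled by 100 so every comparison uses integers only.
    e_cut = estimation_pct * n
    v_cut = (estimation_pct + validation_pct) * n
    estimation, validation, holdout = [], [], []
    for i, rec in enumerate(shuffled):
        if i * 100 < e_cut:
            estimation.append(rec)
        elif i * 100 < v_cut:
            validation.append(rec)
        else:
            holdout.append(rec)  # leftover records
    return estimation, validation, holdout

# Five records split 50/50: Estimation gets the extra record.
e, v, h = create_samples(range(5), 50, 50)
print(len(e), len(v), len(h))  # 3 2 0
```

Under this assumed rule, the fractional record at a boundary always goes to the earlier sample, which is why Estimation receives three of the five records and Validation two.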