In Chapter 7, Evaluating and Optimizing Models, you learned many important concepts about model tuning. Let’s now explore this topic from a practical perspective.
In order to tune a model on SageMaker, you have to call create_hyper_parameter_tuning_job and pass the following main parameters: HyperParameterTuningJobName, HyperParameterTuningJobConfig, and TrainingJobDefinition.
In SageMaker, the metric that is used to evaluate the candidate models and select the best one is known as the objective metric.
In the following example, you are configuring HyperParameterTuningJobConfig for a decision tree-based algorithm. You want to find the best value for the max_depth hyperparameter, which controls the depth of the tree.
In IntegerParameterRanges, you have to specify the name of the hyperparameter that you want to tune, along with the minimum and maximum values it can assume during the search.
Important note
Each type of hyperparameter must fit into one of the parameter range sections: categorical, continuous, or integer parameters.
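For instance, if you also wanted to tune a continuous hyperparameter such as XGBoost's eta, it would go into ContinuousParameterRanges instead. A minimal, hypothetical entry could look like this:

"ContinuousParameterRanges": [
    {"MaxValue": "0.5", "MinValue": "0.01", "Name": "eta"}
],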
In ResourceLimits, you specify the maximum number of training jobs, along with the number of parallel jobs that you want to run. Remember that the goal of the tuning process is to execute many training jobs with different hyperparameter settings so that the best one can be selected as the final model; that's why you have to specify these training job execution limits.
You then set up your search strategy in Strategy and, finally, the objective metric in HyperParameterTuningJobObjective:
tuning_job_config = {
    "ParameterRanges": {
        "CategoricalParameterRanges": [],
        "ContinuousParameterRanges": [],
        "IntegerParameterRanges": [
            {"MaxValue": "10", "MinValue": "1", "Name": "max_depth"}
        ]
    },
    "ResourceLimits": {"MaxNumberOfTrainingJobs": 10, "MaxParallelTrainingJobs": 2},
    "Strategy": "Bayesian",
    "HyperParameterTuningJobObjective": {"MetricName": "validation:auc", "Type": "Maximize"}
}
The second important configuration you need to set is TrainingJobDefinition. Here, you have to specify all the details regarding the training jobs that will be executed. One of the most important settings is TrainingImage, which refers to the container that will be started to execute the training process. This container, as expected, must implement your training algorithm.
This example uses a built-in algorithm, eXtreme Gradient Boosting (XGBoost), so you can retrieve the training image as follows:
import sagemaker

# region holds your AWS Region, for example, 'us-east-1'
training_image = sagemaker.image_uris.retrieve('xgboost', region, '1.0-1')
Then, you can go ahead and set your training definitions:
training_job_definition = {
    "AlgorithmSpecification": {"TrainingImage": training_image, "TrainingInputMode": "File"},
Next, you have to specify the data input configuration, which is also known as the data channels. In the following section of code, you are setting up two data channels – train and validation:
    "InputDataConfig": [
        {
            "ChannelName": "train", "CompressionType": "None", "ContentType": "csv",
            "DataSource": {
                "S3DataSource": {"S3DataDistributionType": "FullyReplicated", "S3DataType": "S3Prefix", "S3Uri": s3_input_train}
            }
        },
        {
            "ChannelName": "validation", "CompressionType": "None", "ContentType": "csv",
            "DataSource": {
                "S3DataSource": {"S3DataDistributionType": "FullyReplicated", "S3DataType": "S3Prefix", "S3Uri": s3_input_validation}
            }
        }
    ],
You also need to specify where the results will be stored:
    "OutputDataConfig": {
        "S3OutputPath": "s3://{}/{}/output".format(bucket, prefix)
    },
Finally, you set the resource configuration, the execution role, the static hyperparameters, and the stopping condition. In the following section of code, you are requesting two instances of type ml.c4.2xlarge, each with 10 GB of storage:
    "ResourceConfig": {"InstanceCount": 2, "InstanceType": "ml.c4.2xlarge", "VolumeSizeInGB": 10},
    "RoleArn": <<your_role_name>>,
    "StaticHyperParameters": {
        "eval_metric": "auc", "num_round": "100", "objective": "binary:logistic",
        "rate_drop": "0.3", "tweedie_variance_power": "1.4"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 43200
    }
}
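With both structures in place, you can finally submit the tuning job through the boto3 SageMaker client. The following is a minimal sketch, assuming a client named smclient and a hypothetical job name of my-tuning-job:

import boto3

smclient = boto3.client('sagemaker')

# Submit the tuning job using the two structures built previously
smclient.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName='my-tuning-job',  # hypothetical name
    HyperParameterTuningJobConfig=tuning_job_config,
    TrainingJobDefinition=training_job_definition
)

# Once the job completes, you can check which training job performed best
response = smclient.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName='my-tuning-job'
)
print(response['BestTrainingJob']['TrainingJobName'])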