Important note
Note that this configuration file refers to two other variables, bucket and prefix, which you should replace with your bucket name and prefix key (if needed), respectively. It also refers to s3_input_train and s3_input_validation, two variables that point to the training and validation datasets in S3.
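As a minimal sketch of how these variables might be defined, the following snippet builds the two S3 locations from bucket and prefix. The bucket name, the prefix, the train and validation folder names, and the s3_uri helper are all hypothetical placeholders, so adapt them to your own S3 layout.

```python
# Hypothetical placeholder values: replace them with your own bucket and prefix.
bucket = "my-example-bucket"
prefix = "tuning-example"


def s3_uri(bucket, prefix, channel):
    """Build an S3 URI, skipping the prefix when it is empty."""
    parts = [bucket]
    cleaned_prefix = prefix.strip("/")
    if cleaned_prefix:
        parts.append(cleaned_prefix)
    parts.append(channel)
    return "s3://" + "/".join(parts)


# The folder names "train" and "validation" are assumptions about the S3 layout.
s3_input_train = s3_uri(bucket, prefix, "train")
s3_input_validation = s3_uri(bucket, prefix, "validation")

print(s3_input_train)
print(s3_input_validation)
```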
Once you have set your configurations, you can spin up the tuning process:
smclient.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="my-tuning-example",
    HyperParameterTuningJobConfig=tuning_job_config,
    TrainingJobDefinition=training_job_definition,
)
Next, let’s find out how to track the execution of this process.
Once you have started the tuning process, there are two additional steps that you might want to take: tracking the progress of the tuning job and selecting the winning model (that is, the one with the best set of hyperparameters).
To find your tuning job, go to the SageMaker console and navigate to Hyperparameter tuning jobs. You will then find a list of executed tuning jobs, including yours:
Figure 9.11 – Finding your tuning job
If you open your tuning job by clicking on its name, you will find a summary page, which includes the most relevant information regarding the tuning process. On the Training jobs tab, you will see all the training jobs that have been executed:
Figure 9.12 – Summary of the training jobs in the tuning process
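If you prefer to track the tuning job from code rather than from the console, the following sketch reads the same summary information through boto3. It assumes that smclient is a boto3 SageMaker client, as in the earlier call; summarize_tuning_job is a hypothetical helper name introduced here for illustration.

```python
def summarize_tuning_job(smclient, job_name):
    """Return the tuning job status and the count of training jobs per status."""
    response = smclient.describe_hyper_parameter_tuning_job(
        HyperParameterTuningJobName=job_name
    )
    return {
        # Overall status of the tuning job, for example InProgress or Completed
        "status": response["HyperParameterTuningJobStatus"],
        # Number of training jobs per status (Completed, InProgress, and so on)
        "counters": response["TrainingJobStatusCounters"],
    }


# Usage (requires boto3 and valid AWS credentials):
# import boto3
# smclient = boto3.client("sagemaker")
# print(summarize_tuning_job(smclient, "my-tuning-example"))
```

Calling this helper periodically gives you a simple way to poll the job until its status is no longer InProgress.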
Finally, if you click on the Best training job tab, you will find the best set of hyperparameters for your model, including a handy button for creating a new model based on those best hyperparameters that have just been found:
Figure 9.13 – Finding the best set of hyperparameters
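The same information is also available programmatically. The following sketch, again assuming that smclient is a boto3 SageMaker client, extracts the best training job and its hyperparameters from the tuning job description. The best_hyperparameters helper is a hypothetical name, and it returns None when no best training job has been reported yet.

```python
def best_hyperparameters(smclient, job_name):
    """Return the best training job found so far, or None if there is none yet."""
    response = smclient.describe_hyper_parameter_tuning_job(
        HyperParameterTuningJobName=job_name
    )
    best = response.get("BestTrainingJob")
    if best is None:
        return None
    metric = best["FinalHyperParameterTuningJobObjectiveMetric"]
    return {
        "training_job": best["TrainingJobName"],
        "metric_name": metric["MetricName"],
        "metric_value": metric["Value"],
        # Hyperparameter values chosen by the tuner for this training job
        "hyperparameters": best["TunedHyperParameters"],
    }


# Usage (requires boto3 and valid AWS credentials):
# import boto3
# smclient = boto3.client("sagemaker")
# print(best_hyperparameters(smclient, "my-tuning-example"))
```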
As you can see, SageMaker is very intuitive, and once you know the main concepts behind model optimization, working with SageMaker should be easier. You now understand how to use SageMaker for your specific needs. In the next section, you will explore how to select the instance type for various use cases and how to secure your notebooks.
SageMaker uses a pay-for-usage model with no minimum fee.
When you think about instances on SageMaker, it all starts with an EC2 instance, which is responsible for all your processing. It is a managed EC2 instance: these instances do not show up in the EC2 console, and you cannot SSH into them. The names of these instance types start with the ml. prefix (for example, ml.m5.2xlarge).
SageMaker offers instances of the following families:
The following table compares the vCPU count and memory of the 2xlarge instance type from each family:
t3.2xlarge | m5.2xlarge | r5.2xlarge | c5.2xlarge | p3.2xlarge | g4dn.2xlarge |
8 vCPU, 32 GiB | 8 vCPU, 32 GiB | 8 vCPU, 64 GiB | 8 vCPU, 16 GiB | 8 vCPU, 61 GiB | 8 vCPU, 32 GiB |
Table 9.1 – A table showing the CPU and memory ratio of different instance types
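To make the ratio explicit, the following short sketch computes the memory available per vCPU from the values in Table 9.1. The ml. prefix is added to show the corresponding SageMaker instance names; whether a given type is available depends on the feature you use (notebooks, training, or endpoints) and on your region.

```python
# (vCPU, memory in GiB) for each 2xlarge instance type, taken from Table 9.1
instances = {
    "t3.2xlarge": (8, 32),
    "m5.2xlarge": (8, 32),
    "r5.2xlarge": (8, 64),
    "c5.2xlarge": (8, 16),
    "p3.2xlarge": (8, 61),
    "g4dn.2xlarge": (8, 32),
}

# Memory-to-CPU ratio: GiB of memory per vCPU
ratios = {name: memory / vcpu for name, (vcpu, memory) in instances.items()}

# Print from the most compute-oriented to the most memory-oriented type
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"ml.{name:<13} {ratio:.3f} GiB per vCPU")
```

With the same number of vCPUs across the row, the c5 type has the least memory per vCPU (2 GiB) and the r5 type the most (8 GiB), while t3, m5, and g4dn all sit at 4 GiB per vCPU.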