* Open a `Terminal` in VSCode by selecting `View -> Terminal`, then clone your repository: `git clone <URL to your repo>`
* Add your training code to the `models/model1/` directory as `train.py` (or `score.py` for scoring); just rename the existing examples for later use as reference
* In case your code lives in an `ipynb` notebook, you can easily convert it to a Python script using these commands:

  ```
  pip install nbconvert
  jupyter nbconvert --to script training_script.ipynb
  ```

* Adapt the `train.py` outline so that the source for the data path is passed in as a parameter (a short sketch follows after this list)
* If you use a `conda env`, run `conda env export > temp.yml` from the correct Conda env and merge the dependencies into `aml_config/train-conda.yml` (make sure to keep the `azureml-*` specific dependencies!)
* If you use `pip` with a `requirements.txt`, run `pip freeze > requirements.txt` and merge `requirements.txt` into `aml_config/train-conda.yml` (make sure to keep the `azureml-*` specific dependencies!)
* Alternatively, add your dependencies directly to `aml_config/train-conda.yml` (make sure to keep the `azureml-*` specific dependencies!)
* Make sure your model gets written to the `outputs/` folder, e.g., using `joblib`. Adapt your code to something like this:

  ```python
  import joblib, os

  output_dir = './outputs/'
  os.makedirs(output_dir, exist_ok=True)
  joblib.dump(value=clf, filename=os.path.join(output_dir, "model.pkl"))
  ```
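For the data path parameter mentioned in the list above, a minimal `argparse` sketch could look like the following. The argument name `--data-path` and its default are only placeholders, so adapt them to whatever your script expects:

```python
import argparse
import os

# Accept the data location as a parameter so the same script runs locally,
# in the local Docker container (where the data is mounted to /data),
# and later on AML Compute
parser = argparse.ArgumentParser()
parser.add_argument("--data-path", type=str, default="./data",
                    help="Directory containing the training data")
args = parser.parse_args()

if os.path.isdir(args.data_path):
    print("Files found in data path:", os.listdir(args.data_path))
```

In the runconfigs used in the next steps, this is the parameter you would point to `/data` via the `arguments` setting.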
This is the target architecture we'll use for this section:
* Open `aml_config/train-local.runconfig` in your editor:
  * Update the `script` parameter to point to your entry script (default is `train.py`)
  * Update the `arguments` parameter, point your data path parameter to `/data`, and adapt the other parameters
  * In the `environment -> docker` section, change `arguments: [-v, /full/path/to/sample-data:/data]` so that it contains the full path to your data folder on your disk
* Open a `Terminal` in VSCode (select `View -> Terminal` to open the terminal) and attach the folder to your workspace:

  ```
  az ml folder attach -g <your-resource-group> -w <your-workspace-name>
  # Using the defaults from before: az ml folder attach -g aml-demo -w aml-demo
  cd models/model1/
  ```

* Run the training against the local instance, i.e., run `train-local.runconfig` against the local host (either the Compute Instance or your local Docker environment):

  ```
  az ml run submit-script -c train-local -e aml-poc-local
  ```

  Here, `-c` refers to the `--run-configuration-name` (which points to `aml_config/<run-configuration-name>.runconfig`) and `-e` refers to the `--experiment-name`.
* Check the run under `Experiments` in the UI
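If you want metrics from your training to show up under the experiment's runs, your `train.py` can log them through the `azureml-core` `Run` object. A minimal sketch; the metric name and value are placeholders:

```python
from azureml.core import Run

# Returns the current AML run when the script is submitted via `az ml run submit-script`,
# or an offline run object when the script is executed outside of AML
run = Run.get_context()

# Log a scalar metric so it shows up in the Experiments view
run.log("accuracy", 0.87)  # placeholder value; log your real metric here
```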
This is the target architecture we'll use for this section:

* Navigate to the `Storage Account` that belongs to the AML workspace (it should be named similar to the workspace, with some random number appended), then select `Blob Containers` and find the container named `azureml-blobstore-...`
* Create a new folder called `training_data` in this container and upload your training data into it; alternatively, you can do this from the CLI:

  ```
  az storage account keys list -g <your-resource-group> -n <storage-account-name>
  az storage container create -n <container-name> --account-name <storage-account-name>
  az storage blob upload -f <file_name.csv> -c <container-name> -n file_name.csv --account-name <storage-account-name>
  ```

* Register the data as a `File Dataset`: under `Datasets`, click `+ Create dataset`, then select `From datastore` and follow through the dialog, naming the dataset `training_data`
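If you prefer working from Python, you can also look up the registered dataset with the `azureml-core` SDK. This is just a sketch and assumes a workspace config is available locally (for example, the one written by `az ml folder attach` in the earlier steps):

```python
from azureml.core import Workspace, Dataset

# Load the workspace from the local config (e.g., created by `az ml folder attach`)
ws = Workspace.from_config()

# Fetch the File Dataset registered above and print its id,
# which is the value needed later in the runconfig's data section
dataset = Dataset.get_by_name(ws, name="training_data")
print(dataset.id)
```

The printed `id` is the same value that `az ml dataset list` reports in a later step.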
* Create a new Compute Cluster: navigate to `Compute --> Compute clusters` and click `+ New`
  * Set the `Compute name` to `cpu-cluster`
  * Select a `Virtual Machine type` (depending on your use case, you might want a GPU instance)
  * Set `Minimum number of nodes` to 0
  * Set `Maximum number of nodes` to 1
  * Set `Idle seconds before scale down` to e.g., 7200 (this keeps the cluster up for 2 hours and hence avoids startup times)
  * Click `Create`
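The same cluster can also be created programmatically with the `azureml-core` SDK. The following is a minimal sketch mirroring the UI settings above; the `vm_size` is just an example and should be swapped for a SKU that fits your workload:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

# Mirror the UI settings: scale down to zero when idle, allow at most one node,
# and keep idle nodes around for 2 hours to avoid cold-start times
config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_D2_V2",            # example CPU SKU; pick a GPU SKU if needed
    min_nodes=0,
    max_nodes=1,
    idle_seconds_before_scaledown=7200,
)

cluster = ComputeTarget.create(ws, "cpu-cluster", config)
cluster.wait_for_completion(show_output=True)
```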
* Open `aml_config/train-amlcompute.runconfig` in your editor:
  * Update the `script` parameter to point to your entry script
  * Update the `arguments` parameter, point your data path parameter to `/data`, and adapt the other parameters
  * Update the `target` section and point it to the name of your newly created Compute Cluster (default is `cpu-cluster`)
  * Get your dataset's `id` using the command line: `az ml dataset list`
  * In the `data` section, replace the `id` with your dataset's id:

    ```yaml
    data:
      mydataset:
        dataLocation:
          dataset:
            id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx # replace with your dataset's id
        ...
        pathOnCompute: /data # Where your data is mounted to
    ```

  * If you need `cuda` drivers, use a GPU base image, e.g., `baseImage: mcr.microsoft.com/azureml/base-gpu:openmpi3.1.2-cuda10.1-cudnn7-ubuntu18.04`, and make sure the `cuda` version matches your library version
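Since the CUDA version of the base image has to match the version your framework was built against, it can help to print that version from inside your training environment. A small sketch, assuming PyTorch is the library in question:

```python
import torch

# The CUDA version this PyTorch build was compiled against should match
# the base image's CUDA version (e.g., 10.1 for the image above)
print("PyTorch version:", torch.__version__)
print("Built against CUDA:", torch.version.cuda)
print("GPU available:", torch.cuda.is_available())
```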
* Run `train-amlcompute.runconfig` against the AML Compute Cluster:

  ```
  az ml run submit-script -c train-amlcompute -e aml-poc-compute -t run.json
  ```

  Here, `-t` stands for `--output-metadata-file` and is used to generate a file that contains metadata about the run (we can use it to easily register the model in the next step).
* Check the run under `Experiments` in the UI
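If you are curious what the `-t` flag actually wrote out, the generated `run.json` is plain JSON and can be inspected before it is used in the next step. A quick sketch, assuming the file sits in the current directory:

```python
import json

# run.json was produced by the -t/--output-metadata-file flag of the previous command
with open("run.json") as f:
    run_metadata = json.load(f)

print(json.dumps(run_metadata, indent=2))
```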
* Register the model using `run.json`, which references the last training run:

  ```
  az ml model register -n demo-model --asset-path outputs/model.pkl -f run.json \
    --tag key1=value1 --tag key2=value2 --property prop1=value1 --property prop2=value2
  ```

  Here, `-n` stands for `--name`, under which the model will be registered. `--asset-path` points to the model's file location within the run itself (see the `Outputs + logs` tab in the UI). Lastly, `-f` stands for `--run-metadata-file`, which is used to load the previously created file that references the run from which we want to register the model.

Great, you have now trained your Machine Learning model on Azure using the power of the cloud. Let's move on to the next section, where we look into moving the inferencing code to Azure.