Pass information to and from your notebook step
The following sections describe ways to pass information to your notebook as environment variables and parameters.
Pass environment variables
Pass environment variables as a dictionary to the environment_variables argument of your NotebookJobStep, as shown in the following example:
environment_variables = {"RATE": 0.0001, "BATCH_SIZE": 1000}

notebook_job_step = NotebookJobStep(
    ...
    environment_variables=environment_variables,
    ...
)
You can read the environment variables in the notebook with os.getenv(), as shown in the following example:
# inside your notebook
import os

print(f"ParentNotebook: env_key={os.getenv('env_key')}")
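Values read with os.getenv() come back as strings (or as the supplied default when the variable is unset), so numeric settings such as RATE and BATCH_SIZE need an explicit conversion inside the notebook. The following is a minimal sketch under that assumption. The setdefault lines only simulate what the job would set, and the fallback values are hypothetical.

```python
import os

# Simulate the variables the job would set; in a real run these two lines are unnecessary.
os.environ.setdefault("RATE", "0.0001")
os.environ.setdefault("BATCH_SIZE", "1000")

# os.getenv returns a string, so convert to the numeric type the notebook expects.
# The second argument is a hypothetical fallback used only when the variable is unset.
rate = float(os.getenv("RATE", "0.001"))
batch_size = int(os.getenv("BATCH_SIZE", "32"))

print(f"rate={rate}, batch_size={batch_size}")
```

Converting at the point of use keeps the rest of the notebook free of string handling.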
Pass parameters
When you pass parameters to the first Notebook Job step in your NotebookJobStep instance, you can optionally tag a cell in your Jupyter notebook to indicate where to apply new parameters or parameter overrides. For instructions about how to tag a cell in your Jupyter notebook, see Parameterize your notebook.
You pass parameters through the parameters argument of the Notebook Job step, as shown in the following snippet:
notebook_job_parameters = {
    "company": "Amazon",
}

notebook_job_step = NotebookJobStep(
    ...
    parameters=notebook_job_parameters,
    ...
)
Inside your input notebook, your parameters are applied after the cell tagged with parameters, or at the beginning of the notebook if you don't have a tagged cell.
# this cell is in your input notebook and is tagged with 'parameters'
# your parameters and parameter overrides are applied after this cell
company='default'

# in this cell, your parameters are applied
# prints "company is Amazon"
print(f'company is {company}')
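The ordering rule above (overrides land after the tagged cell, or at the top when no cell is tagged) can be illustrated with a toy model. This sketch is not how the service or any notebook library implements injection. The cell representation, the inject_parameters helper, and the injected-parameters tag name are all hypothetical and exist only for illustration.

```python
from typing import Any


def inject_parameters(cells: list[dict[str, Any]], params: dict[str, Any]) -> list[dict[str, Any]]:
    """Return a new cell list with one extra cell that assigns the parameters.

    Toy model only: each cell is a dict with 'source' and 'tags' keys.
    """
    injected = {
        "source": "\n".join(f"{name} = {value!r}" for name, value in params.items()),
        "tags": ["injected-parameters"],
    }
    for index, cell in enumerate(cells):
        if "parameters" in cell["tags"]:
            # Place overrides right after the tagged cell so they replace its defaults.
            return cells[: index + 1] + [injected] + cells[index + 1 :]
    # No tagged cell: apply the parameters at the beginning of the notebook.
    return [injected] + cells


notebook = [
    {"source": "company = 'default'", "tags": ["parameters"]},
    {"source": "result = f'company is {company}'", "tags": []},
]

namespace: dict[str, Any] = {}
for cell in inject_parameters(notebook, {"company": "Amazon"}):
    exec(cell["source"], namespace)  # run the cells in order, as a notebook would

print(namespace["result"])
```

Because the injected cell runs after the default assignment, the default stays in the notebook as documentation while the passed value takes effect.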
Retrieve information from a previous step
The following discussion explains how you can extract data from a previous step to pass to your Notebook Job step.
Use the properties attribute
You can use the following properties with the previous step's properties attribute:
- ComputingJobName: The training job name
- ComputingJobStatus: The training job status
- NotebookJobInputLocation: The input Amazon S3 location
- NotebookJobOutputLocationPrefix: The path to your training job outputs, more specifically {NotebookJobOutputLocationPrefix}/{training-job-name}/output/output.tar.gz, containing outputs
- InputNotebookName: The input notebook file name
- OutputNotebookName: The output notebook file name (which may not exist in the training job output folder if the job fails)
The following code snippet shows how to extract parameters from the properties attribute.
notebook_job_step2 = NotebookJobStep(
    ...
    parameters={
        "step1_JobName": notebook_job_step1.properties.ComputingJobName,
        "step1_JobStatus": notebook_job_step1.properties.ComputingJobStatus,
        "step1_NotebookJobInput": notebook_job_step1.properties.NotebookJobInputLocation,
        "step1_NotebookJobOutput": notebook_job_step1.properties.NotebookJobOutputLocationPrefix,
    },
)
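Inside the second step's notebook, the values passed this way are available as ordinary variables. As a minimal sketch, the archive location can be assembled from the prefix and job name using the path pattern described for NotebookJobOutputLocationPrefix. The output_archive_uri helper and the placeholder values below are hypothetical. In a real run, the two variables would be supplied by the parameters above.

```python
def output_archive_uri(output_prefix: str, job_name: str) -> str:
    """Build {prefix}/{job-name}/output/output.tar.gz, tolerating a trailing slash."""
    return f"{output_prefix.rstrip('/')}/{job_name}/output/output.tar.gz"


# Hypothetical stand-ins for the injected parameters
# step1_NotebookJobOutput and step1_JobName.
step1_NotebookJobOutput = "s3://example-bucket/notebook-jobs"
step1_JobName = "example-job-name"

archive_uri = output_archive_uri(step1_NotebookJobOutput, step1_JobName)
print(archive_uri)
```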
Use JsonGet
If you want to pass parameters other than the ones previously mentioned and the JSON outputs of your previous step reside in Amazon S3, use JsonGet. JsonGet is a general mechanism that can directly extract data from JSON files in Amazon S3.
To extract data from JSON files in Amazon S3 with JsonGet, complete the following steps:
1. Upload your JSON file to Amazon S3. If your data is already uploaded to Amazon S3, skip this step. The following example demonstrates uploading a JSON file to Amazon S3.

   import json
   from sagemaker.s3 import S3Uploader

   output = {
       "key1": "value1",
       "key2": [0, 5, 10]
   }

   json_output = json.dumps(output)

   with open("notebook_job_params.json", "w") as file:
       file.write(json_output)

   S3Uploader.upload(
       local_path="notebook_job_params.json",
       desired_s3_uri="s3://path/to/bucket"
   )

2. Provide your S3 URI and the JSON path to the value you want to extract. In the following example, JsonGet returns an object representing index 2 of the value associated with key key2 (10).

   from sagemaker.workflow.functions import Join, JsonGet

   NotebookJobStep(
       ...
       parameters={
           # the key job_key1 returns an object representing the value 10
           "job_key1": JsonGet(
               s3_uri=Join(on="/", values=["s3:/", ..]),
               json_path="key2[2]"  # value to reference in that json file
           ),
           "job_key2": "Amazon"
       }
   )
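To see which value a json_path such as key2[2] refers to, you can walk the same path through the JSON data locally. The sketch below is only an illustration of the path semantics, not how JsonGet is implemented. JsonGet is resolved when the pipeline runs, whereas resolve_json_path is a hypothetical helper that handles just dotted keys and bracketed list indexes.

```python
import re
from typing import Any


def resolve_json_path(document: Any, json_path: str) -> Any:
    """Walk a path such as 'key2[2]' or 'a.b[0]' through parsed JSON data."""
    current = document
    # Each match is either a key name (first group) or a bracketed list index (second group).
    for key, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", json_path):
        current = current[key] if key else current[int(index)]
    return current


# Same data as the uploaded notebook_job_params.json example.
output = {"key1": "value1", "key2": [0, 5, 10]}

value = resolve_json_path(output, "key2[2]")
print(value)
```

Checking a path against a local copy of the file before wiring it into the pipeline can catch a wrong key or an out-of-range index early.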