Saving and restarting hybrid jobs using checkpoints - Amazon Braket

You can save intermediate iterations of your hybrid jobs using checkpoints. To create checkpoint files, add the lines marked #ADD to the algorithm script example from the previous section.

    from braket.aws import AwsDevice
    from braket.circuits import Circuit
    from braket.jobs import save_job_checkpoint #ADD
    import os

    def start_here():

        print("Test job starts!!!!!")

        device = AwsDevice(os.environ["AMZN_BRAKET_DEVICE_ARN"])

        #ADD the following code
        job_name = os.environ["AMZN_BRAKET_JOB_NAME"]
        save_job_checkpoint(
            checkpoint_data={"data": f"data for checkpoint from {job_name}"},
            checkpoint_file_suffix="checkpoint-1",
        )
        #End of ADD

        bell = Circuit().h(0).cnot(0, 1)
        for count in range(5):
            task = device.run(bell, shots=100)
            print(task.result().measurement_counts)

        print("Test hybrid job completed!!!!!")

When you run the hybrid job, it creates the file <jobname>-checkpoint-1.json in the checkpoints directory of your hybrid job artifacts. The default checkpoints path is /opt/jobs/checkpoints. The hybrid job script remains unchanged unless you want to change this default path.
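The script above writes a single checkpoint before the loop starts. In an iterative algorithm you would more likely save progress as the loop runs. The following is a minimal sketch of that pattern, not a documented Braket recipe: run_iterations and step are hypothetical names, and the save parameter is a stand-in so that the loop can be exercised outside a hybrid job. Inside a job you would pass save_job_checkpoint from braket.jobs, which is called here with the same keyword arguments as in the script above.

```python
from typing import Any, Callable, Dict, List


def run_iterations(
    total_iterations: int,
    step: Callable[[int], Any],
    save: Callable[..., None],
    suffix: str = "checkpoint-1",
) -> List[Any]:
    """Run `step` once per iteration and save a checkpoint after each one.

    `save` must accept the keyword arguments used with save_job_checkpoint
    in this section: checkpoint_data and checkpoint_file_suffix.
    """
    results: List[Any] = []
    for iteration in range(total_iterations):
        results.append(step(iteration))
        # Assumption of this sketch: the same suffix is reused on every save,
        # so each save targets the same checkpoint file name.
        save(
            checkpoint_data={"iteration": iteration, "results": list(results)},
            checkpoint_file_suffix=suffix,
        )
    return results


if __name__ == "__main__":
    # Stand-in for save_job_checkpoint so the sketch runs locally.
    saved: List[Dict[str, Any]] = []

    def fake_save(checkpoint_data, checkpoint_file_suffix):
        saved.append({"suffix": checkpoint_file_suffix, **checkpoint_data})

    run_iterations(3, step=lambda i: i * i, save=fake_save)
    print(saved[-1])
```

The keys iteration and results are chosen for illustration only. Store whatever state your algorithm needs to pick up where it left off.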

To start a hybrid job from a checkpoint generated by a previous hybrid job, import the loader in your algorithm script with from braket.jobs import load_job_checkpoint. The logic to load the checkpoint in your algorithm script is as follows.

    checkpoint_1 = load_job_checkpoint(
        "previous_job_name",
        checkpoint_file_suffix="checkpoint-1",
    )

After loading this checkpoint, you can continue your logic based on the content loaded into checkpoint_1.

Note

The checkpoint_file_suffix must match the suffix previously specified when creating the checkpoint.
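How you continue depends on what the earlier job stored. As a minimal sketch, assume the loaded checkpoint is a dictionary holding an iteration counter and a results list (keys chosen for this illustration, not defined by Amazon Braket). The hypothetical helper resume_from takes the loader as a parameter so that it can run outside a hybrid job; inside a job you would pass load_job_checkpoint from braket.jobs. Keeping the suffix in one constant is one way to satisfy the matching requirement in the note above.

```python
from typing import Any, Callable, Dict, List, Tuple

CHECKPOINT_SUFFIX = "checkpoint-1"  # must match the suffix used when saving


def resume_from(
    load: Callable[..., Dict[str, Any]],
    previous_job_name: str,
) -> Tuple[int, List[Any]]:
    """Return (next_iteration, results_so_far) from a previous job's checkpoint.

    Assumes the checkpoint dictionary holds "iteration" (the last completed
    iteration) and "results"; both keys are illustrative.
    """
    checkpoint = load(previous_job_name, checkpoint_file_suffix=CHECKPOINT_SUFFIX)
    next_iteration = checkpoint["iteration"] + 1
    return next_iteration, list(checkpoint["results"])


if __name__ == "__main__":
    # Stand-in for load_job_checkpoint so the sketch runs locally.
    def fake_load(job_name, checkpoint_file_suffix):
        return {"iteration": 2, "results": [0, 1, 4]}

    start, results = resume_from(fake_load, "previous_job_name")
    # Continue the remaining iterations from where the previous job stopped.
    for iteration in range(start, 5):
        results.append(iteration * iteration)
    print(results)
```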

Your orchestration script must specify the ARN of the previous hybrid job, as shown in the line marked #ADD.

    from braket.aws import AwsQuantumJob

    job = AwsQuantumJob.create(
        source_module="source_dir",
        entry_point="source_dir.algorithm_script:start_here",
        device_arn="arn:aws:braket:::device/quantum-simulator/amazon/sv1",
        copy_checkpoints_from_job="<previous-job-ARN>", #ADD
    )
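If the same orchestration script serves both fresh runs and restarts, the extra argument can be added only when a previous job exists. The following sketch builds the keyword arguments as a plain dictionary. build_job_kwargs is a hypothetical helper, not part of the Braket SDK, and the other values are copied from the call above. In the orchestration script you would then call AwsQuantumJob.create(**kwargs).

```python
from typing import Dict, Optional


def build_job_kwargs(previous_job_arn: Optional[str] = None) -> Dict[str, str]:
    """Assemble keyword arguments for AwsQuantumJob.create.

    When previous_job_arn is given, the job is asked to copy that job's
    checkpoints; otherwise the argument is left out entirely.
    """
    kwargs = {
        "source_module": "source_dir",
        "entry_point": "source_dir.algorithm_script:start_here",
        "device_arn": "arn:aws:braket:::device/quantum-simulator/amazon/sv1",
    }
    if previous_job_arn:
        # Only request the checkpoint copy when there is a job to copy from.
        kwargs["copy_checkpoints_from_job"] = previous_job_arn
    return kwargs


if __name__ == "__main__":
    print(build_job_kwargs())
    print(build_job_kwargs("<previous-job-ARN>"))
```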