Troubleshooting machine learning inference
Use the troubleshooting information and solutions in this section to help resolve issues with your machine learning components. For the public machine learning inference components, check the component logs for error messages.
If a component is installed correctly, then the component log contains the location of the library that it uses for inference.
Issues
- Failed to fetch library
- Cannot open shared object file
- Error: ModuleNotFoundError: No module named '<library>'
- No CUDA-capable device is detected
- No such file or directory
- RuntimeError: module compiled against API version 0xf but this version of NumPy is <version>
- picamera.exc.PiCameraError: Camera is not enabled
- Memory errors
- Disk space errors
- Timeout errors
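When you scan a long component log, it can help to match each line against the error text of the issues listed above. The following is a minimal sketch, not part of the ML components. The function name classify_log_line and the SIGNATURES table are hypothetical, and the signatures are simply substrings taken from the error messages in this section.

```python
# Hypothetical helper: map a log line to one of the issues in this section
# by case-insensitive substring match. Order matters: the generic
# "No such file or directory" signature is checked last because the
# shared object error message also contains that text.
SIGNATURES = [
    ("Failed to fetch", "Failed to fetch library"),
    ("cannot open shared object file", "Cannot open shared object file"),
    ("ModuleNotFoundError", "Error: ModuleNotFoundError: No module named '<library>'"),
    ("No CUDA-capable device is detected", "No CUDA-capable device is detected"),
    ("module compiled against API version", "RuntimeError: module compiled against API version"),
    ("PiCameraError", "picamera.exc.PiCameraError: Camera is not enabled"),
    ("exitCode=137", "Memory errors"),
    ("no space left on device", "Disk space errors"),
    ("No such file or directory", "No such file or directory"),
]


def classify_log_line(line):
    """Return the matching issue title, or None if no signature matches."""
    lowered = line.lower()
    for signature, issue in SIGNATURES:
        if signature.lower() in lowered:
            return issue
    return None
```

You could call classify_log_line on each line of a log file and jump to the matching issue for every result that isn't None.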
Failed to fetch library
The following error occurs when the installer script fails to download a required library during deployment on a Raspberry Pi device.
Err:2 http://raspbian.raspberrypi.org/raspbian buster/main armhf python3.7-dev armhf 3.7.3-2+deb10u1 404 Not Found [IP: 93.93.128.193 80]
E: Failed to fetch http://raspbian.raspberrypi.org/raspbian/pool/main/p/python3.7/libpython3.7-dev_3.7.3-2+deb10u1_armhf.deb 404 Not Found [IP: 93.93.128.193 80]
Run sudo apt-get update and deploy your component again.
Cannot open shared object file
You might see errors similar to the following when the installer script fails to download a required dependency for opencv-python during deployment on a Raspberry Pi device.
ImportError: libopenjp2.so.7: cannot open shared object file: No such file or directory
Run the following command to manually install the dependencies for opencv-python:
sudo apt-get install libopenjp2-7 libilmbase23 libopenexr-dev libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libgtk-3-0 libwebp-dev
Error: ModuleNotFoundError: No module named '<library>'
You might see this error in the ML runtime component logs (variant.DLR.log or variant.TensorFlowLite.log) when the ML runtime library or its dependencies aren't installed correctly. This error can occur in the following cases:
- If you use the UseInstaller option, which is enabled by default, this error indicates that the ML runtime component failed to install the runtime or its dependencies. Do the following:
  - Configure the ML runtime component to disable the UseInstaller option.
  - Install the ML runtime and its dependencies, and make them available to the system user that runs the ML components. For more information, see the documentation for the ML runtime component that you use.
- If you don't use the UseInstaller option, this error indicates that the ML runtime or its dependencies aren't installed for the system user that runs the ML components. Do the following:
  - Check that the library is installed for the system user that runs the ML components. Replace ggc_user with the name of the system user, and replace tflite_runtime with the name of the library to check.
  - If the library isn't installed, install it for that user. Replace ggc_user with the name of the system user, and replace tflite_runtime with the name of the library. For more information about the dependencies for each ML runtime, see the documentation for the ML runtime component that you use.
  - If the issue persists, install the library for another user to confirm whether this device can install the library. The user could be, for example, your user, the root user, or an administrator user. If you can't install the library successfully for any user, your device might not support the library. Consult the library's documentation to review requirements and troubleshoot installation issues.
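One way to check whether a library is visible to a given Python environment is to ask the import system to locate it without importing it. The following is a minimal sketch that uses the standard importlib.util.find_spec function. The helper name is_importable is hypothetical, and the result only describes the user and Python interpreter that run the script, so you would need to run it as the system user that runs the ML components.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the module without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises for dotted names whose parent package is missing,
        # and for some malformed names. Treat both as "not importable".
        return False


if __name__ == "__main__":
    # tflite_runtime is the example library name used in this section.
    print("tflite_runtime importable:", is_importable("tflite_runtime"))
```

If the function returns False for the user that runs the ML components but True for another user, the library is likely installed only for that other user.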
No CUDA-capable device is detected
You might see this error when you use GPU acceleration. Run the following command to enable GPU access for the Greengrass user.
sudo usermod -a -G video ggc_user
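To confirm that the user was added to the group, you can look up the group membership on the device. The following is a minimal sketch that uses the standard grp and pwd modules, which are available on Unix-like systems only. The helper name user_in_group is hypothetical. Group changes generally apply only to processes started after the change, so this check describes the account database and not an already running process.

```python
import grp
import pwd


def user_in_group(user, group):
    """Return True if the user belongs to the group, as a supplementary
    member or through the user's primary group."""
    try:
        group_entry = grp.getgrnam(group)
        user_entry = pwd.getpwnam(user)
    except KeyError:
        # The user or the group doesn't exist on this device.
        return False
    return user in group_entry.gr_mem or user_entry.pw_gid == group_entry.gr_gid


if __name__ == "__main__":
    # ggc_user and video are the names used in the command above.
    print("ggc_user in video:", user_in_group("ggc_user", "video"))
```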
No such file or directory
The following errors indicate that the runtime component was unable to set up the virtual environment correctly:
- MLRootPath/greengrass_ml_dlr_conda/bin/conda: No such file or directory
- MLRootPath/greengrass_ml_dlr_venv/bin/activate: No such file or directory
- MLRootPath/greengrass_ml_tflite_conda/bin/conda: No such file or directory
- MLRootPath/greengrass_ml_tflite_venv/bin/activate: No such file or directory
Check the logs to make sure that all runtime dependencies were installed correctly. For more information about the libraries installed by the installer script, see the documentation for the ML runtime component that you use.
By default, MLRootPath is set to /greengrass/v2/work/component-name/greengrass_ml. To change this location, include the DLR runtime or TensorFlow Lite runtime component directly in your deployment, and specify a modified value for the MLRootPath parameter in a configuration merge update. For more information about configuring components, see Update component configurations.
Note
For the DLR component v1.3.x, you set the MLRootPath parameter in the configuration of the inference component, and the default value is $HOME/greengrass_ml.
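To see which virtual environment files exist under a given MLRootPath, you can check for the four paths named in the error messages. The following is a minimal sketch. The helper name find_environment_files is hypothetical, and the relative paths are copied from the errors in this section. Presumably only the entry that matches your runtime (DLR or TensorFlow Lite) and environment type (conda or venv) is expected to exist.

```python
from pathlib import Path

# Relative paths taken from the error messages in this section.
EXPECTED_RELATIVE_PATHS = [
    "greengrass_ml_dlr_conda/bin/conda",
    "greengrass_ml_dlr_venv/bin/activate",
    "greengrass_ml_tflite_conda/bin/conda",
    "greengrass_ml_tflite_venv/bin/activate",
]


def find_environment_files(ml_root_path):
    """Return a dict that maps each expected relative path to True if the
    file exists under ml_root_path, or False otherwise."""
    root = Path(ml_root_path)
    return {rel: (root / rel).is_file() for rel in EXPECTED_RELATIVE_PATHS}


if __name__ == "__main__":
    # Replace this placeholder with the MLRootPath value on your device.
    for rel, exists in find_environment_files("/path/to/MLRootPath").items():
        print(("found    " if exists else "missing  ") + rel)
```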
RuntimeError: module compiled against API version 0xf but this version of NumPy is <version>
You might see the following errors when you run machine learning inference on a Raspberry Pi running Raspberry Pi OS Bullseye.
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xd
ImportError: numpy.core.multiarray failed to import
This error occurs because Raspberry Pi OS Bullseye includes an earlier version of NumPy than the version that OpenCV requires. To fix this issue, run the following command to upgrade NumPy to the latest version.
pip3 install --upgrade numpy
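The two hexadecimal values in the error message are version numbers: the first is the version that the module was compiled against, and the second is the version that the installed NumPy provides. The following is a minimal sketch that parses them so that you can compare them as integers. The helper name parse_api_mismatch is hypothetical, and the pattern assumes the message wording shown above.

```python
import re

# Matches the wording of the RuntimeError shown in this section.
_PATTERN = re.compile(
    r"API version (0x[0-9a-f]+) but this version of numpy is (0x[0-9a-f]+)",
    re.IGNORECASE,
)


def parse_api_mismatch(message):
    """Return (compiled_against, installed) as integers, or None if the
    message doesn't match the expected wording."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    error = ("RuntimeError: module compiled against API version 0xf "
             "but this version of numpy is 0xd")
    compiled, installed = parse_api_mismatch(error)
    # 0xf is 15 and 0xd is 13, so the installed NumPy is the older side.
    print(compiled, installed, installed < compiled)
```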
picamera.exc.PiCameraError: Camera is not enabled
You might see the following error when you run machine learning inference on a Raspberry Pi running Raspberry Pi OS Bullseye.
picamera.exc.PiCameraError: Camera is not enabled. Try running 'sudo raspi-config' and ensure that the camera has been enabled.
This error occurs because Raspberry Pi OS Bullseye includes a new camera stack that isn't compatible with the ML components. To fix this issue, enable the legacy camera stack.
To enable the legacy camera stack
- Run the following command to open the Raspberry Pi configuration tool.
  sudo raspi-config
- Select Interface Options.
- Select Legacy camera to enable the legacy camera stack.
- Reboot the Raspberry Pi.
Memory errors
The following errors typically occur when the device does not have enough memory and the component process is interrupted.
- stderr. Killed.
- exitCode=137
We recommend a minimum of 500 MB of memory to deploy a public machine learning inference component.
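By common shell convention, an exit code above 128 means that the process was terminated by a signal whose number is the exit code minus 128. For exitCode=137, that is 137 - 128 = 9, which is SIGKILL on Linux and is consistent with the Killed message. The following is a minimal sketch that decodes such an exit code with the standard signal module. The helper name signal_from_exit_code is hypothetical.

```python
import signal


def signal_from_exit_code(exit_code):
    """Return the signal name for an exit code that follows the
    128 + signal number convention, or None if it doesn't apply."""
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        # No signal with this number exists on this platform.
        return None


if __name__ == "__main__":
    print(signal_from_exit_code(137))
```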
Disk space errors
The no space left on device error typically occurs when a device does not have enough storage. Make sure that there is enough disk space available on your device before you deploy the component again. We recommend a minimum of 500 MB of free disk space to deploy a public machine learning inference component.
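Before you redeploy, you can check the free space on the file system that holds your Greengrass installation. The following is a minimal sketch that uses the standard shutil.disk_usage function. The helper name has_enough_disk_space is hypothetical, the default path / is an assumption that you should replace with the relevant mount point, and 500 MB is interpreted here as 500 * 1024 * 1024 bytes.

```python
import shutil

# Assumption: 500 MB is interpreted as 500 * 1024 * 1024 bytes.
MIN_FREE_BYTES = 500 * 1024 * 1024


def has_enough_disk_space(path="/", min_free_bytes=MIN_FREE_BYTES):
    """Return True if the file system that contains path has at least
    min_free_bytes of free space."""
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    free_mb = shutil.disk_usage("/").free / (1024 * 1024)
    print(f"free: {free_mb:.0f} MB, enough: {has_enough_disk_space('/')}")
```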
Timeout errors
The public machine learning components download machine learning model files that are larger than 200 MB. If the download times out during deployment, check your internet connection speed and retry the deployment.
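To judge whether your connection is a likely cause, you can estimate how long a model download takes at a given speed. The following is a rough sketch with a hypothetical helper name. It ignores protocol overhead and speed fluctuations, and it assumes that the file size is in megabytes and the connection speed is in megabits per second.

```python
def estimate_download_seconds(size_megabytes, speed_megabits_per_second):
    """Estimate the download time in seconds.

    One megabyte is 8 megabits, so time = size * 8 / speed.
    """
    if speed_megabits_per_second <= 0:
        raise ValueError("speed must be greater than zero")
    return size_megabytes * 8 / speed_megabits_per_second


if __name__ == "__main__":
    # A 200 MB file over an 8 Mbit/s connection.
    print(estimate_download_seconds(200, 8))
```

Compare the estimate with the point at which your deployment times out to decide whether a faster connection is needed before you retry.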