Troubleshooting Agent Issues - AWS Elastic Disaster Recovery

Troubleshooting Agent Issues

Error: Installation Failed

When the installation of the AWS Replication Agent on a source server fails while the Installer file is running, you will receive an error message.

This type of error means that the Agent was not installed on the source server, and therefore the server will not appear on the AWS Elastic Disaster Recovery Console. After you fix the issue that caused the installation to fail, you need to rerun the Agent Installer file to install the Agent.

This app can't run on your PC error – Windows

If you encounter the error "This app can't run on your PC" when trying to install the AWS Replication Agent on your Windows 10 source machine, try the following.

This error usually indicates that you are running the 32-bit version of Windows 10. To verify this:

1. Use the Windows key + I keyboard shortcut to open the Settings app.

2. Click System.

3. Click About.

4. Under System type, you will see two pieces of information: if it says 32-bit operating system, x64-based processor, then it means that your PC is running a 32-bit version of Windows 10 on a 64-bit processor.

If it says 32-bit operating system, x86-based processor, then your computer doesn't support Windows 10 (64-bit).

At the moment, AWS Elastic Disaster Recovery supports only 64-bit operating systems.

If your OS is indeed 64-bit, something else may be blocking the installation of the Agent. The block comes from the Windows operating system itself, so you need to identify the cause (for example, a broken registry key).

Is having a mounted '/tmp' directory a requirement for the Agent?

The only requirement is to have enough free space; '/tmp' does not need to be a separate mount. The separate free-space requirement for '/tmp' applies only if '/tmp' is a separate mount. If '/tmp' is not a separate mount, it falls under '/', which has a 2 GiB free-space requirement that also covers '/tmp'.
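To tell which case applies on a given server, you can compare the mount point of '/tmp' with '/'. The following is a minimal sketch, not part of the Agent tooling; the function name mount_point_of is made up for illustration, and it assumes mount points that contain no spaces.

```shell
#!/bin/sh
# Print the mount point of the file system that holds a directory.
# df -P prints one POSIX-format line per file system; the last field
# of the second line is the mount point (assumes no spaces in it).
mount_point_of() {
    df -P "$1" | awk 'NR==2 {print $NF}'
}

tmp_mount=$(mount_point_of /tmp)
if [ "$tmp_mount" = "/" ]; then
    echo "/tmp is part of the root file system; the free-space requirement for / covers it"
else
    echo "/tmp is on a separate mount ($tmp_mount); check its free space separately"
fi
```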

Installation Failed – Old Agent

Installation may fail due to an old AWS Replication Agent. Ensure that you are attempting to install the latest version of the AWS Replication Agent. You can learn how to download the Agent here.

Installation Failed on Linux Machine

If the installation failed on a Linux source server, check the following:

  1. Free Disk Space

    Free disk space on the root directory – verify that you have at least 2 GB of free disk space on the root directory (/) of your Source machine. To check the available disk space on the root directory, run the following command: df -h /

    Free disk space on the /tmp directory – for the duration of the installation process only, verify that you have at least 500 MB of free disk space on the /tmp directory. To check the available disk space on the /tmp directory, run the following command: df -h /tmp

    The output of each command shows the size, used space, and available space of the file system that contains the directory. Check the value in the Avail column.
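The free-space checks above can also be scripted. This is a rough sketch rather than an official check: the function names free_mb and has_free_mb are hypothetical, and the threshold is passed as an argument so that you can supply the requirement for each directory. The example call uses the 500 MB figure given for /tmp.

```shell
#!/bin/sh
# Print the space available on the file system holding a directory, in MiB.
# df -Pk prints POSIX-format output in 1024-byte blocks; field 4 is "Available".
free_mb() {
    df -Pk "$1" | awk 'NR==2 {print int($4 / 1024)}'
}

# Succeed if directory $1 has at least $2 MiB available.
has_free_mb() {
    avail=$(free_mb "$1")
    [ "$avail" -ge "$2" ]
}

# 500 MB is the /tmp requirement stated above; check / the same way
# by passing the root directory requirement as the second argument.
if has_free_mb /tmp 500; then
    echo "/tmp: OK ($(free_mb /tmp) MiB available)"
else
    echo "/tmp: not enough free space ($(free_mb /tmp) MiB available, 500 required)"
fi
```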

  2. The format of the list of disks to replicate

    During the installation, when you are asked to enter the disks you want to replicate, do NOT use apostrophes, brackets, or disk paths that do not exist. Type only existing disk paths, and separate them with a comma, as follows:

    /dev/xvda1,/dev/xvda2

  3. Version of the Kernel headers package

    Verify that the kernel-devel/linux-headers package you have installed is exactly the same version as the kernel you are running.

    The version number of the kernel headers must be identical to the version number of the kernel. To handle this issue, follow these steps:

    1. Identify the version of your running kernel.

      To identify the version of your running kernel, run the following command:

      uname -r

      The 'uname -r' output version should match the version of one of the installed kernel headers packages (kernel-devel-<version number> / linux-headers-<version number>).

    2. Identify the version of your kernel-devel/linux-headers.

      To identify the version of your kernel-devel/linux-headers, run the following command:

      On RHEL/CENTOS/Oracle/SUSE:

      rpm -qa | grep kernel

      Note: This command looks for kernel-devel.

      On Debian/Ubuntu:

      apt-cache search linux-headers

    3. Verify that the folder that contains the kernel-devel/linux-headers is not a symbolic link.

      Sometimes, the content of the kernel-devel/linux-headers, which match the version of the kernel, is actually a symbolic link. In this case, you will need to remove the link before installing the required package.

      To verify that the folder that contains the kernel-devel/linux-headers is not a symbolic link, run the following command:

      On RHEL/CENTOS/Oracle/SUSE:

      ls -l /usr/src/kernels

      On Debian/Ubuntu:

      ls -l /usr/src

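      In the output, a symbolic link is marked by an l at the start of the permissions column and an arrow (->) pointing to its target. If the entry for your kernel version shows neither, the kernel-devel/linux-headers folder is not a symbolic link.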

    4. [If a symbolic link exists] Delete the symbolic link.

      If you found that the content of the kernel-devel/linux-headers, which match the version of the kernel, is actually a symbolic link, you need to delete the link. Run the following command:

      rm /usr/src/<LINK NAME>

      For example: rm /usr/src/linux-headers-4.4.1

    5. Install the correct kernel-devel/linux-headers from the repositories.

      If none of the already installed kernel-devel/linux-headers packages match your running kernel version, you need to install the matching package.

      Note: You can have several kernel headers versions simultaneously on your OS, and you can therefore safely install new kernel headers packages in addition to your existing ones (without uninstalling the other versions of the package.) A new kernel headers package does not impact the kernel, and does not overwrite older versions of the kernel headers.

      Note: For everything to work, you need to install a kernel headers package whose version number exactly matches that of the running kernel.

      To install the correct kernel-devel/linux-headers, run the following command:

      On RHEL/CENTOS/Oracle/SUSE:

      sudo yum install kernel-devel-`uname -r`

      On Debian/Ubuntu:

      sudo apt-get install linux-headers-`uname -r`

    6. [If no matching package was found] Download the matching kernel-devel/linux-headers package.

      If no matching package was found on the repositories configured on your machine, you can download it manually from the Internet and then install it.

      To download the matching kernel-devel/linux-headers package, navigate to the following sites:

  4. The make, openssl, wget, curl, gcc and build-essential packages

    Note: Usually, the existence of these packages is not required for Agent installation. However, in some cases where the installation fails, installing these packages will solve the problem.

    If the installation failed, make sure that the make, openssl, wget, curl, gcc, and build-essential packages are installed and available on your current path.

    To verify the existence and location of the required packages, run the following command:

    which <package>

    For example, to locate the make package:

    which make
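To check all of the listed tools in one pass, a loop such as the following may help. It is only a sketch: the function name missing_tools is hypothetical, and it uses the POSIX command -v built-in in place of which. Note that build-essential is a Debian/Ubuntu meta-package rather than an executable, so it cannot be located this way and is left out of the loop; on those systems dpkg -s build-essential reports whether it is installed.

```shell
#!/bin/sh
# Print the name of each given command that is not found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools make openssl wget curl gcc)
if [ -z "$missing" ]; then
    echo "All listed tools were found on PATH"
else
    echo "Not found on PATH:"
    echo "$missing"
fi
```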

  5. Error: urlopen error [Errno 110] Connection timed out

    This error occurs when outbound traffic is not allowed over TCP Port 443. Port 443 needs to be open outbound to the AWS Elastic Disaster Recovery Manager.
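One way to test outbound connectivity on port 443 from the source server is with curl. The sketch below makes several assumptions that are not from this guide: the endpoint name drs.us-east-1.amazonaws.com is only an example of a regional endpoint and should be replaced with the one for your Region, and the DRS_HOST variable and the can_reach_443 function name are hypothetical.

```shell
#!/bin/sh
# Try an HTTPS connection to host $1 on port 443, allowing at most $2 seconds
# (default 10) for the connection to be established.
# curl exits with 0 on success, 28 on a timeout, and other non-zero codes
# for failures such as DNS errors (6) or refused connections (7).
can_reach_443() {
    curl --silent --output /dev/null --connect-timeout "${2:-10}" "https://$1/"
}

# Assumed endpoint name; set DRS_HOST to the endpoint for your Region.
HOST=${DRS_HOST:-drs.us-east-1.amazonaws.com}

if can_reach_443 "$HOST"; then
    echo "Outbound TCP 443 to $HOST is open"
else
    rc=$?
    if [ "$rc" -eq 28 ]; then
        echo "Connection to $HOST:443 timed out; check outbound firewall rules"
    else
        echo "Could not connect to $HOST:443 (curl exit code $rc)"
    fi
fi
```

A timeout (exit code 28) is the case that corresponds to the Errno 110 error; other exit codes point to a different problem, such as name resolution.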

  6. PowerPath support

    To check whether the source server is running PowerPath, run the following command:

    powermt check

    If it is, contact AWS Support for instructions on how to install the AWS Replication Agent on such machines.

  7. Error: You need to have root privileges to run this script

    Make sure you run the installer either as root or by adding sudo at the beginning:

    sudo python installer_linux.py
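For the kernel headers check in item 3 of the list above, the comparison with the running kernel and the symbolic link test can be combined in a short script. This is a sketch under assumptions: it presumes the headers are installed in /usr/src/kernels/<version> (the usual kernel-devel location) or /usr/src/linux-headers-<version> (the usual linux-headers location), which may not hold on every distribution, and the function name headers_dir is hypothetical.

```shell
#!/bin/sh
# Print the headers directory for kernel version $1, searching under $2
# (default /usr/src). Fails if no entry for that version exists.
headers_dir() {
    version=$1
    src_root=${2:-/usr/src}
    for dir in "$src_root/kernels/$version" "$src_root/linux-headers-$version"; do
        # -e follows links, so also test -L to catch dangling symbolic links.
        if [ -e "$dir" ] || [ -L "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

running=$(uname -r)
if dir=$(headers_dir "$running"); then
    if [ -L "$dir" ]; then
        echo "$dir is a symbolic link; remove it before installing the package"
    else
        echo "Found kernel headers for $running in $dir"
    fi
else
    echo "No kernel headers found for $running under /usr/src"
fi
```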

Installation Failed on Windows Machine

If the installation failed on a Windows Source server, check the following:

  1. .NET Framework

    Verify that .NET Framework version 3.5 or above is installed on your Windows Source servers.

  2. Free disk space

    Verify that there is at least 1 GB of free disk space on the root directory (C:\) of your Source servers for the installation.

  3. net.exe and sc.exe location 

    Verify that the net.exe and/or sc.exe files, located by default in the C:\Windows\System32 folder, are included in the PATH Environment Variable.

    1. Navigate to Control Panel > System and Security > System > Advanced system settings.

    2. On the System Properties dialog box Advanced tab, click the Environment Variables button.

    3. On the System Variables section of the Environment Variables pane, select the Path variable. Then, click the Edit button to view its contents.

    4. On the Edit System Variable pane, review the defined paths in the Variable value field. If the path of the net.exe and/or sc.exe files does not appear there, manually add it to the Variable value field, and click OK.

Windows – Installation Failed - Request Signature

If the AWS Replication Agent installation fails on Windows with the following error:

botocore.exceptions.ClientError: An error occurred (InvalidSignatureException) when calling the GetAgentInstallationAssetsElastic Disaster RecoveryInternal operation: {"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

Try rerunning the installer in PowerShell instead of CMD. At times, when the installer is run in CMD, the AWS Secret Key is not pasted properly into the installer, which causes the installation to fail.

Error – driver was compiled for a different kernel not loading

This error may manifest if a significant amount of time has passed between when you performed a failover and when you are performing a failback.

This error may occur on the source server or on the recovery instance. You can identify this error by looking at the agent log in /var/lib/aws-replication-agent/agent.log.0
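A quick way to look for this error is to search the log for the message text. The sketch below assumes that the log line contains the phrase "compiled for a different kernel", taken from the error name above; the exact wording in your log may differ, and the function name log_has_kernel_mismatch is hypothetical.

```shell
#!/bin/sh
# Succeed if the given log file contains the driver/kernel mismatch message.
# The search phrase is an assumption based on the error name in this guide.
log_has_kernel_mismatch() {
    grep -qi "compiled for a different kernel" "$1" 2>/dev/null
}

LOG=/var/lib/aws-replication-agent/agent.log.0
if log_has_kernel_mismatch "$LOG"; then
    echo "Driver/kernel mismatch found in $LOG"
else
    echo "No driver/kernel mismatch message found in $LOG (or the file could not be read)"
fi
```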

To fix this issue on a recovery instance, reboot the recovery instance and then reinstall the AWS Replication Agent as a recovery instance.

To fix this issue on a source server, reboot the source server and then reinstall the AWS Replication Agent.

Error – certificate verify failed

This error (CERTIFICATE_VERIFY_FAILED) may indicate that the OS does not trust the certification authority used by our endpoints. To resolve this issue, try the following steps:

  1. Open Microsoft Edge or Internet Explorer to update the operating system trusted root certificates. This will work if the operating system does not have restrictions to download the certificates.

  2. If the first step does not resolve the issue, download and install the Amazon Root Certificates manually.