
Task 2: Defining processes for identifying, collecting, and storing metadata

In the previous task, you validated the initial discovery data, the migration strategies, and the migration patterns for your large migration. In this task, you identify what metadata is required and decide how you will collect it. This task consists of the following steps:

- Step 1: Define the required metadata
- Step 2: Build the metadata storage and collection processes
- Step 3: Document metadata requirements and collection processes in a runbook

As you complete the steps in this section, consider the entire migration cycle from a metadata perspective: portfolio assessment, wave planning, migration, testing, and post-cutover activities. Then analyze all of the use cases, including related ones, for each migration pattern. Thinking about the information that you need to complete the full migration process helps you identify all of the metadata for that pattern.

Step 1: Define the required metadata

Before you can determine the required metadata attributes, you must understand the migration pattern. For example, you need different metadata for migrating a server to Amazon EC2 and for migrating a database to Amazon RDS. Most patterns are made up of many small tasks. In order to perform the migration pattern, you need to know what metadata attributes are required and then collect the metadata for that application. You must determine and gather the required metadata in the initialization stage so that you can perform the migration efficiently and without delay in the implementation stage.

The person or team that defines the metadata attributes begins by defining the steps and tasks needed to perform the migration pattern. The tasks determine what metadata is needed, so working through each task builds a comprehensive collection of the required metadata. The person who determines what metadata is required typically needs to have a comprehensive understanding of how to complete the migration pattern. Coordination with the person writing the migration runbook might be required. For more information, see the Migration playbook for AWS large migrations.

During a large migration, many processes across all workstreams depend on metadata. Timely and accurate metadata has a broad and significant impact on the success of a large migration.

In this step, you define the pattern or task and then use the definition to identify the metadata required.

Identify the key components of the migration patterns and supporting tasks

In this step, for each migration pattern or supporting task, you define the key components, such as the action, source object, target object, and tools used. You then name the pattern or task based on your answers.

Supporting tasks include the operational activities that the portfolio and migration workstreams need to perform during the migration, such as wave planning, application prioritization, dependency analysis, governance, disaster recovery, performance testing, or user-acceptance testing. Because you need metadata to support these tasks, you perform these steps for both the migration patterns and the supporting tasks.

Action – Identify the migration strategy or supporting task. Remember that one action might have other actions associated with it. For example, you might want to define operations for migration. Example actions include:
- Migration strategy, such as rehost, replatform, or relocate
- Wave planning
- Application prioritization and dependency analysis
- Operation
- Governance
- Disaster recovery
- Testing, such as performance testing or user-acceptance testing (UAT)
Source object – Identify the source object on which the action will be performed. Example source objects include:
- Waves
- Server
- Database
- File share
- Application
Tools – Identify the services or tools used to perform the action. You might use more than one tool or service. Example tools include:
- AWS Application Migration Service
- AWS DataSync
- AWS Database Migration Service (AWS DMS)
- AWS Backup
- Performance monitoring tools
Target object – Identify the target object, service, or location where the source will reside when the action is complete. Example objects, services, or locations include:
- Amazon Elastic Compute Cloud (Amazon EC2)
- Amazon Relational Database Service (Amazon RDS)
- Amazon Elastic File System (Amazon EFS)
- Amazon Elastic Container Service (Amazon ECS)
- Wave plan
Pattern name – Combine your answers to the previous steps as follows:

<action> <source object> on/to <target object> using <tool>

The following are examples:
- Rehost (action) waves, applications, or servers (source object) to Amazon EC2 (target object) using Application Migration Service or Cloud Migration Factory (tools)
- Replatform (action) file shares (source object) to Amazon EFS (target object) using DataSync (tool)
- Replatform (action) databases (source object) to Amazon RDS (target object) using AWS DMS (tool)
- Performance monitoring (action) of applications (source object) on Amazon EC2 (target object) using Amazon CloudWatch (tool)
- Back up (action) servers (source object) on Amazon EC2 (target object) using AWS Backup (tool) after migration
- Wave planning (action) waves, applications, or servers (source object) to create a wave plan (target object)

The following is an example of how you might record Pattern 1: Rehost to Amazon EC2 using Application Migration Service or Cloud Migration Factory from the migration patterns table.

Pattern ID	1
Pattern name	Rehost to Amazon EC2 using Application Migration Service or Cloud Migration Factory
Action	Rehost migration
Source object	Waves, applications, or servers
Tools	Application Migration Service or Cloud Migration Factory
Target object	Amazon EC2
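The naming convention above can be sketched in a few lines of Python. This is a minimal illustration, not part of the playbook templates: the `MigrationPattern` class and its field names are hypothetical, and the only thing taken from the text is the `<action> <source object> on/to <target object> using <tool>` format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationPattern:
    """One row of a migration patterns table (field names are illustrative)."""
    pattern_id: int
    action: str
    source_object: str
    target_object: str
    tools: tuple[str, ...] = ()

    def name(self, preposition: str = "to") -> str:
        # Follows: <action> <source object> on/to <target object> using <tool>
        text = f"{self.action} {self.source_object} {preposition} {self.target_object}"
        if self.tools:
            text += " using " + " or ".join(self.tools)
        return text


pattern_1 = MigrationPattern(
    pattern_id=1,
    action="Rehost",
    source_object="waves, applications, or servers",
    target_object="Amazon EC2",
    tools=("Application Migration Service", "Cloud Migration Factory"),
)
print(pattern_1.name())
```

Keeping the components as separate fields, rather than storing only the combined name, makes it easier to look up every pattern that shares a target object or tool when you determine the required metadata.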

Determine the metadata required for each pattern or task

Now that you have defined the pattern or task, you determine the metadata required for the source object, target object, tools, and other business information. To explain this process, this playbook uses Pattern 1: Rehost to Amazon EC2 using Application Migration Service or Cloud Migration Factory from the migration patterns table as an example. Note that for some patterns or tasks, some steps might not apply.

Analyze the target object – Working backwards from the target object, manually create the object and identify the metadata needed to support it. Capture the metadata as demonstrated in the following table.

For example, when you create an EC2 instance, you must choose an instance type, storage type, storage size, subnet, security group, and tags. The following table includes examples of metadata attributes that you might need if your target object is an EC2 instance.

Attribute name	Object type	Description or purpose
`target_subnet`	Target EC2 instance	Subnet of the target EC2 instance
`target_subnet_test`	Target EC2 instance	Test subnet of the target EC2 instance
`target_security_group`	Target EC2 instance	Security group of the target EC2 instance
`target_security_group_test`	Target EC2 instance	Test security group of the target EC2 instance
`IAM_role`	Target EC2 instance	AWS Identity and Access Management (IAM) role of the target EC2 instance
`instance_type`	Target EC2 instance	Instance type of the target EC2 instance
`AWS_account_ID`	Target EC2 instance	AWS account to host the target EC2 instance
`AWS_Region`	Target EC2 instance	AWS Region to host the target EC2 instance

Analyze the tools – Use the tool to create a target object and check for differences. Capture the tool-specific metadata as demonstrated in the following table, and remove any attributes from the previous table that the migration tool doesn't support. For example, you cannot customize the OS type and storage size for Application Migration Service because the rehost migration tool is like-for-like. Therefore, you would remove target OS and target disk size if these attributes were included in the previous table. In the previous example table, all attributes are supported by the tool, so no action is required.

The following table includes examples of metadata that you might need for the tools.

Attribute name	Object type	Description or purpose
`AWS_account_ID`	Tools (Application Migration Service)	AWS account ID for AWS Application Migration Service
`AWS_Region`	Tools (Application Migration Service)	AWS Region for Application Migration Service
`replication_server_subnet`	Tools (Application Migration Service)	Subnet for the Application Migration Service replication server
`replication_server_security_group`	Tools (Application Migration Service)	Security group for the Application Migration Service replication server
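As a rough sketch of the pruning described in this step, the following Python separates target attributes into those the tool supports and those to remove. The attribute names `target_OS` and `target_disk_size` are hypothetical stand-ins for the "target OS" and "target disk size" attributes mentioned in the text, and the set of unsupported attributes is an assumption that you would maintain yourself after testing the tool.

```python
# Attributes from the example target-object table, plus two hypothetical
# attributes that a like-for-like rehost tool would not let you customize.
target_attributes = [
    "target_subnet", "target_subnet_test",
    "target_security_group", "target_security_group_test",
    "IAM_role", "instance_type", "AWS_account_ID", "AWS_Region",
    "target_OS", "target_disk_size",
]

# Assumption: recorded manually after creating a target object with the tool.
UNSUPPORTED_BY_TOOL = {"target_OS", "target_disk_size"}


def split_by_tool_support(attributes, unsupported):
    """Return (kept, removed) while preserving the original order."""
    kept = [a for a in attributes if a not in unsupported]
    removed = [a for a in attributes if a in unsupported]
    return kept, removed


kept, removed = split_by_tool_support(target_attributes, UNSUPPORTED_BY_TOOL)
print("Removed:", removed)
```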

Analyze the source object – Determine the required metadata for the source object by assessing the actions as follows:

- To migrate servers, you need to know the source server name and fully qualified domain name (FQDN) in order to connect to the server.
- To migrate applications along with their servers, you need to know the application name, application environment, and application-to-server mapping.
- To perform a portfolio assessment, prioritize applications, or define a move group, you need to know the application-to-server mapping, application-to-database mapping, and application-to-application dependencies.
- To manage waves, you need to know the wave ID and the start and end times of the wave.

The following table includes examples of metadata that you might need for the source object.

Attribute name	Object type	Description or purpose
`wave_ID`	Source wave	ID of the wave (for example: wave 10)
`wave_start_date`	Source wave	Start date for the wave
`wave_cutover_date`	Source wave	Cutover date for the wave
`wave_owner`	Source wave	Owner of the wave
`app_name`	Source application	Source application name
`app_to_server_mapping`	Source application	Application-to-server relationship
`app_to_DB_mapping`	Source application	Application-to-database relationship
`app_to_app_dependencies`	Source application	External dependencies of the application
`server_name`	Source server	Source server name
`server_FQDN`	Source server	Fully qualified domain name of the source server
`server_OS_family`	Source server	Operating system (OS) family of the source server (for example: Windows or Linux)
`server_OS_version`	Source server	OS version of the source server (for example: Windows Server 2003)
`server_environment`	Source server	Environment of the source server (for example: development, production, or test)
`server_tier`	Source server	Tier of the source server (for example: web, database, or application)
`CPU`	Source server	Number of CPUs in the source server
`RAM`	Source server	RAM size of the source server
`disk_size`	Source server	Disk size of the source server

Consider other attributes – In addition to the primary action, consider other actions and attributes related to the target object or application. For the example pattern, Pattern 1: Rehost to Amazon EC2 using Application Migration Service or Cloud Migration Factory, the action is rehost, and the target object is Amazon EC2. Other related actions for this target object might include backing up the EC2 instance, monitoring the EC2 instance after the migration, and using tags to manage costs associated with the EC2 instance. You might also want to consider other application attributes that help you manage the migration, such as the application owner, whom you might need to contact for questions or cutover purposes.

The following table includes examples of additional metadata that are commonly used. This table includes tags for your target EC2 instance. For more information about tags and how to use them, see Tag your Amazon EC2 resources in the Amazon EC2 documentation.

Attribute name	Object type	Description or purpose
`Name`	Target EC2 instance (tag)	Tag to define the name of a target EC2 instance
`app_owner`	Source application	The owner of a source application
`business_unit`	Target EC2 instance (tag)	Tag to identify the business unit for a target EC2 instance (for example: HR, finance, or IT)
`cost_center`	Target EC2 instance (tag)	Tag to identify the cost center for a target EC2 instance
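To illustrate how the tag attributes in this table might be used, the following sketch converts a metadata record into the list of `Key`/`Value` pairs that Amazon EC2 tagging operations accept (for example, the `Tags` parameter of `create_tags` in the AWS SDK for Python). The record values and the `build_ec2_tags` helper are hypothetical, and the code only builds the list; it does not call AWS.

```python
# Attributes from the table that are applied as tags on the target instance.
TAG_ATTRIBUTES = ("Name", "business_unit", "cost_center")


def build_ec2_tags(record: dict) -> list[dict]:
    """Build a Key/Value tag list, skipping attributes that are missing or empty."""
    return [
        {"Key": attribute, "Value": str(record[attribute])}
        for attribute in TAG_ATTRIBUTES
        if record.get(attribute)
    ]


# Hypothetical metadata record for one server.
record = {
    "server_name": "web01",
    "Name": "web01-prod",
    "business_unit": "finance",
    "cost_center": 4100,
    "app_owner": "jane.doe@example.com",  # not a tag, so it is ignored
}
print(build_ec2_tags(record))
```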

Create a table – Combine all of the metadata identified in the previous steps into a single table.

Attribute name	Object type	Description or purpose
`wave_ID`	Source wave	ID of the wave (for example: wave 10)
`wave_start_date`	Source wave	Start date for the wave
`wave_cutover_date`	Source wave	Cutover date for the wave
`wave_owner`	Source wave	Owner of the wave
`app_name`	Source application	Source application name
`app_to_server_mapping`	Source application	Application-to-server relationship
`app_to_DB_mapping`	Source application	Application-to-database relationship
`app_to_app_dependencies`	Source application	External dependencies of the application
`AWS_account_ID`	Tools (Application Migration Service)	AWS account to host the target EC2 instance
`AWS_Region`	Tools (Application Migration Service)	AWS Region to host the target EC2 instance
`replication_server_subnet`	Tools (Application Migration Service)	Subnet for the Application Migration Service replication server
`replication_server_security_group`	Tools (Application Migration Service)	Security group for the Application Migration Service replication server
`server_name`	Source server	Source server name
`server_FQDN`	Source server	Fully qualified domain name of the source server
`server_OS_family`	Source server	Operating system (OS) family of the source server (for example: Windows or Linux)
`server_OS_version`	Source server	OS version of the source server (for example: Windows Server 2003)
`server_environment`	Source server	Environment of the source server (for example: development, production, or test)
`server_tier`	Source server	Tier of the source server (for example: web, database, or application)
`CPU`	Source server	Number of CPUs in the source server
`RAM`	Source server	RAM size of the source server
`disk_size`	Source server	Disk size of the source server
`target_subnet`	Target server	Subnet of the target EC2 instance
`target_subnet_test`	Target server	Test subnet of the target EC2 instance
`target_security_group`	Target server	Security group of the target EC2 instance
`target_security_group_test`	Target server	Test security group of the target EC2 instance
`instance_type`	Target server	Instance type of the target EC2 instance
`IAM_role`	Target server	AWS Identity and Access Management (IAM) role of the target EC2 instance
`Name`	Target server (tag)	Tag to define the name of a target EC2 instance
`app_owner`	Source application	The owner of a source application
`business_unit`	Target server (tag)	Tag to identify the business unit for a target EC2 instance (for example: HR, finance, or IT)
`cost_center`	Target server (tag)	Tag to identify the cost center for a target EC2 instance
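If you keep the per-step tables in a machine-readable form, combining them can be automated. The following is a minimal sketch that assumes each table is a list of (attribute name, object type, description) rows; the `combine_tables` helper and the abbreviated sample rows are illustrative only. Rows are keyed by attribute name and object type so that an attribute identified in more than one step appears once.

```python
def combine_tables(*tables):
    """Merge attribute tables, keeping the first description seen for each
    (attribute name, object type) pair and preserving the original order."""
    combined = {}
    for table in tables:
        for name, object_type, description in table:
            combined.setdefault((name, object_type), description)
    return [(name, obj, desc) for (name, obj), desc in combined.items()]


# Abbreviated sample rows taken from the example tables.
source_rows = [
    ("wave_ID", "Source wave", "ID of the wave"),
    ("app_owner", "Source application", "The owner of a source application"),
]
tool_rows = [
    ("replication_server_subnet", "Tools (Application Migration Service)",
     "Subnet for the Application Migration Service replication server"),
]
other_rows = [
    ("app_owner", "Source application", "The owner of a source application"),
    ("cost_center", "Target EC2 instance (tag)", "Tag to identify the cost center"),
]

for row in combine_tables(source_rows, tool_rows, other_rows):
    print("\t".join(row))
```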

Repeat – Repeat this process until you have documented the required metadata for each pattern.

Step 2: Build the metadata storage and collection processes

In the previous step, you defined the metadata required to support your migration. In this step, you build a process for collecting and storing the metadata. This step consists of two tasks:

- Analyze the required metadata from the previous step and identify the source.
- Define a process for efficiently storing and collecting the metadata.

Analyze the metadata sources

There are many common metadata sources. Usually, the first thing you can access is a high-level asset inventory, which is typically exported from a configuration management database (CMDB) or from another existing tool. However, you need to collect metadata from other sources as well, using both automated and manual processes.

The following table contains common sources, the standard collection process for that source, and the common metadata types that you can expect to find from that source.

Metadata source	Collection type	Metadata type
Discovery tools	Automated	Source server
CMDB	Automated	Source server
Inventory from other tools, such as RVTools for VMware vSphere	Automated	Source server
Application owner questionnaire	Manual	Source server, target server, wave
Application owner interview	Manual	Source server, target server, wave
Application design documentation	Manual	Target server
Landing zone design documentation	Manual	Target server, tools

After listing all the possible sources of your metadata, you analyze the metadata type and map each source to the metadata attributes that you identified in the previous step.

1. Get a complete list of metadata attributes from Step 1: Define the required metadata.
2. Analyze each metadata type and determine which types cannot be retrieved using an automated process. This is usually the target server metadata and wave metadata types because these require decisions from the application owners. For example, which subnet and security group will you use for the target EC2 instances?
3. Analyze each metadata attribute and map it to a metadata source in the previous table. It is common to have a combination of multiple sources. You can use discovery tools to collect some source server metadata. For information about using discovery tools to collect metadata, see Get started with automated portfolio discovery on the AWS Prescriptive Guidance website.

Create a table to map the metadata attribute to its type and source. The following table is an example.

Metadata attribute	Metadata type	Metadata sources
`app_name`	Source application	CMDB
`app_owner`	Source application	CMDB
`app_to_server_mapping`	Source application	CMDB, discovery tools, or application owner questionnaire
`app_to_DB_mapping`	Source application	CMDB, discovery tools, or application owner questionnaire
`app_to_app_dependencies`	Source application	CMDB, discovery tools, or application owner questionnaire
`server_name`	Source server	CMDB
`server_FQDN`	Source server	CMDB
`server_OS_family`	Source server	CMDB
`server_IP`	Source server	Discovery tools
`disk_size`	Source server	Discovery tools
`instance_type`	Target server	Discovery tools
`target_subnet`	Target server	Application owner questionnaire
`target_security_group`	Target server	Application owner questionnaire
`AWS_Region`	Target server	Application owner questionnaire
`AWS_account_ID`	Target server	Application owner questionnaire
`replication_server_subnet`	Tools (Application Migration Service)	Landing zone design documentation
`replication_server_security_group`	Tools (Application Migration Service)	Landing zone design documentation
`Name`	Target server (tag)	Application owner questionnaire
`business_unit`	Target server (tag)	Application owner questionnaire
`cost_center`	Target server (tag)	Application owner questionnaire
`wave_ID`	Wave planning	Application owner interview
`wave_start_date`	Wave planning	Application owner interview
`wave_cutover_date`	Wave planning	Application owner interview
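A mapping like this one can also be kept as data so that you can query it. The following sketch, which uses a subset of the rows above, lists the attributes that have no automated source and therefore depend on application owners or design documents. The dictionary names and the `manual_only` helper are hypothetical; the collection types come from the metadata sources table.

```python
# Collection type for each source, from the metadata sources table.
COLLECTION_TYPE = {
    "CMDB": "Automated",
    "Discovery tools": "Automated",
    "Application owner questionnaire": "Manual",
    "Application owner interview": "Manual",
    "Landing zone design documentation": "Manual",
}

# Subset of the attribute-to-source mapping table.
ATTRIBUTE_SOURCES = {
    "app_name": ["CMDB"],
    "app_to_server_mapping": ["CMDB", "Discovery tools", "Application owner questionnaire"],
    "server_IP": ["Discovery tools"],
    "target_subnet": ["Application owner questionnaire"],
    "replication_server_subnet": ["Landing zone design documentation"],
    "wave_ID": ["Application owner interview"],
}


def manual_only(attribute_sources):
    """Return attributes for which every mapped source is collected manually."""
    return sorted(
        attribute
        for attribute, sources in attribute_sources.items()
        if all(COLLECTION_TYPE[source] == "Manual" for source in sources)
    )


print(manual_only(ATTRIBUTE_SOURCES))
```

Attributes in this list are the ones to schedule early, because they cannot be refreshed by a script and usually require a decision from a person.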

Define a single metadata store

After mapping each metadata attribute to its source, you define where to store the metadata. Regardless of how and where you store the metadata, you need to choose only one repository. This ensures that you have a single source of truth. Storing metadata in multiple places is a common mistake in large migrations.

Option 1: Store metadata in a spreadsheet in a shared repository

Although this option might sound like a very manual process, it is the most common data store for large migrations. It is also common to store the spreadsheet in a shared repository, such as a Microsoft SharePoint site.

A Microsoft Excel spreadsheet is easy to customize and doesn’t take a long time to build. The disadvantages are that it will get very complex if you have a lot of metadata and that it can be difficult to manage the relationships between assets, such as between the server, application, and database. The other challenge is version management. You need to limit write access to only a few people, or you need to use an automated process to update the spreadsheet.

In the portfolio playbook templates, you can use the Dashboard template for wave planning and migration (Excel format) as a starting point for building your own data store spreadsheet.

Option 2: Store metadata in a purpose-built tool

You can use a prebuilt tool, such as TDS Transition Manager (TDS website), to store your data, or you can build your own tool. When you build your own tool, you need database tables that correspond to the Excel spreadsheet tabs in option 1. For example:

- Server table
- Application table
- Database table
- Application-to-server and application-to-database mapping table
- Wave-planning table
- Application owner questionnaire table
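If you build your own tool, a few of these tables might look like the following sketch, which uses the SQLite module from the Python standard library with an in-memory database. The table names, column names, and sample rows are assumptions for illustration, not a schema from the playbook; a real store would include the remaining tables and attributes.

```python
import sqlite3

# Hypothetical schema covering four of the tables listed above.
SCHEMA = """
CREATE TABLE wave (
    wave_id TEXT PRIMARY KEY,
    wave_start_date TEXT,
    wave_cutover_date TEXT
);
CREATE TABLE application (
    app_name TEXT PRIMARY KEY,
    app_owner TEXT,
    wave_id TEXT REFERENCES wave(wave_id)
);
CREATE TABLE server (
    server_name TEXT PRIMARY KEY,
    server_fqdn TEXT,
    server_os_family TEXT
);
CREATE TABLE app_to_server (
    app_name TEXT REFERENCES application(app_name),
    server_name TEXT REFERENCES server(server_name),
    PRIMARY KEY (app_name, server_name)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Sample rows (invented for this example).
conn.execute("INSERT INTO wave VALUES ('wave 10', '2024-03-01', '2024-03-15')")
conn.execute("INSERT INTO application VALUES ('payroll', 'jane.doe@example.com', 'wave 10')")
conn.executemany(
    "INSERT INTO server VALUES (?, ?, ?)",
    [("web01", "web01.example.com", "Linux"), ("db01", "db01.example.com", "Windows")],
)
conn.executemany(
    "INSERT INTO app_to_server VALUES (?, ?)",
    [("payroll", "web01"), ("payroll", "db01")],
)

# List the servers that belong to each application.
rows = conn.execute(
    """
    SELECT a.app_name, s.server_fqdn
    FROM app_to_server AS m
    JOIN application AS a ON a.app_name = m.app_name
    JOIN server AS s ON s.server_name = m.server_name
    ORDER BY s.server_name
    """
).fetchall()
print(rows)
```

A separate mapping table, rather than a server column on the application table, handles the relationships that the spreadsheet option struggles with: one application on many servers and one server shared by several applications.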

Define the metadata collection processes

In the previous steps, you mapped the metadata to its source and defined a data store for the metadata. In this step, you build processes to collect the metadata effectively. Minimize manual copy-and-paste work and use automation to collect the metadata from each source. There are three steps:

1. Build an extract, transform, and load (ETL) script for each metadata source based on the metadata mapping table.
2. Build a scheduled task that imports metadata from each source automatically on a regular basis.
3. Build an export process or provide application programming interface (API) access to the metadata stored in the repository.

The following table is an example of the metadata attributes collected by each ETL script. The metadata is stored in the location you defined in the previous section, such as a spreadsheet or purpose-built tool.

Metadata attribute	Metadata type	Metadata source	Collection process
`app_name`	Source application	CMDB	ETL script – CMDB
`app_owner`	Source application	CMDB	ETL script – CMDB
`app_to_server_mapping`	Source application	CMDB	ETL script – CMDB
`app_to_DB_mapping`	Source application	CMDB	ETL script – CMDB
`app_to_app_dependencies`	Source application	Discovery tool	ETL script – discovery tool
`server_name`	Source server	CMDB	ETL script – CMDB
`server_FQDN`	Source server	CMDB	ETL script – CMDB
`server_OS_family`	Source server	CMDB	ETL script – CMDB
`server_OS_version`	Source server	CMDB	ETL script – CMDB
`disk_size`	Source server	Discovery tool	ETL script – discovery tool
`instance_type`	Target server	Discovery tool	ETL script – discovery tool
`target_subnet`	Target server	Application owner questionnaire	ETL script – application owner
`target_security_group`	Target server	Application owner questionnaire	ETL script – application owner
`AWS_Region`	Target server	Application owner questionnaire	ETL script – application owner
`AWS_account_ID`	Target server	Application owner questionnaire	ETL script – application owner
`Name`	Target server (tag)	Application owner questionnaire	ETL script – application owner
`business_unit`	Target server (tag)	Application owner questionnaire	ETL script – application owner
`cost_center`	Target server (tag)	Application owner questionnaire	ETL script – application owner
`wave_ID`	Wave planning	Application owner questionnaire	ETL script – application owner
`wave_start_date`	Wave planning	Application owner questionnaire	ETL script – application owner
`wave_cutover_date`	Wave planning	Application owner questionnaire	ETL script – application owner
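The shape of an ETL script for one source can be sketched as follows. This example assumes that the CMDB can export a CSV file; the column headers in `CMDB_COLUMN_MAP`, the sample export, and the in-memory `store` dictionary are all hypothetical. In practice, the load step would write to the spreadsheet or tool that you chose as your single metadata store.

```python
import csv
import io

# Assumed mapping from CMDB export columns to metadata attribute names.
CMDB_COLUMN_MAP = {
    "Hostname": "server_name",
    "FQDN": "server_FQDN",
    "OS Family": "server_OS_family",
    "OS Version": "server_OS_version",
}

# Invented sample export; a real script would read a file or call an API.
SAMPLE_EXPORT = """\
Hostname,FQDN,OS Family,OS Version
web01,web01.example.com,Linux,RHEL 8
db01,db01.example.com,Windows,Windows Server 2019
"""


def extract(csv_text):
    """Read the raw export into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Rename columns to attribute names and trim surrounding whitespace."""
    return [
        {attribute: row[column].strip() for column, attribute in CMDB_COLUMN_MAP.items()}
        for row in rows
    ]


def load(store, records):
    """Upsert records into the store, keyed by server name."""
    for record in records:
        store.setdefault(record["server_name"], {}).update(record)


store = {}
load(store, transform(extract(SAMPLE_EXPORT)))
print(store["web01"])
```

Because the load step updates existing records instead of replacing them, a scheduled task can rerun this script alongside the scripts for other sources without discarding attributes that those sources collected.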

Step 3: Document metadata requirements and collection processes in a runbook

In this step, you document your decisions in a metadata management runbook. During the migration, your portfolio workstream follows this runbook as the standard procedure for collecting and storing metadata.

1. In the portfolio playbook templates, open the Runbook template for metadata management (Microsoft Word format). This serves as a starting point for building your own runbook.
2. In the Metadata attributes section, create a metadata attributes table for each migration pattern, and populate the tables with the metadata attributes identified in Step 1: Define the required metadata.
3. In the Source locations section, document the sources you identified in Analyze the metadata sources.
4. In the Source location access instructions section, document the steps a user would need to follow in order to access the metadata source locations.
5. In the Metadata store section, document the steps a user would need to follow in order to access the metadata store you created in Define a single metadata store.
6. In the Data collection types section, identify the data collection process that you will use for each metadata source. Ideally, you should automate all metadata collection by using automation scripts.
7. In the Data collection by metadata attribute section, for each metadata attribute, identify the following according to the instructions in Define the metadata collection processes:
   1. Metadata type
   2. Metadata source
   3. Metadata store
   4. Collection type
8. In the Collect metadata section, update the process as needed for your use case. This is the process the portfolio workstream follows in the implementation stage when they collect metadata for waves.
9. Verify that your runbook is complete and accurate. This runbook should be a source of truth during the migration.
10. Share your metadata management runbook with the team for review.

Task exit criteria

Continue to the next task when you have completed the following:

- You have prepared a single repository for storing the collected metadata.
- In your metadata management runbook, you have defined and documented the following:
  - The metadata attributes required for each migration pattern
  - Metadata sources and detailed instructions for how to access each source
  - The metadata store and detailed instructions for how to access it
  - The processes used to collect metadata
  - A mapping table that maps metadata attributes to the metadata sources and collection processes
