Platform architecture
Establish and maintain guidelines, principles, patterns, and guardrails for your cloud environment.
A well-architected cloud environment
Start
Define a multi-account strategy
A good multi-account strategy considers scale and operational efficiency concerns. This means isolating your workloads
Define preventative controls
Plan for a secure, multi-account environment with an embedded set of default controls (guardrails). Begin to understand and use a mechanism such as service control policies (SCPs) to manage service use across your organization, including the AWS Regions that are available for consumption within your cloud platform. Policies provide a centralized mechanism for controlling the maximum permissions available for all accounts and ensuring that they adhere to the organization's access control guidelines.
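As an illustration of the Region restriction described above, the following is a minimal sketch that builds an SCP-style policy document in Python. It relies on the `aws:RequestedRegion` condition key; the function name `build_region_scp`, the Region list, and the exempted global-service actions are assumptions chosen for the example, not values from this guidance.

```python
import json


def build_region_scp(allowed_regions, exempt_actions=()):
    """Return an SCP-style document that denies requests outside allowed_regions.

    exempt_actions lists actions (for example, global services) that the deny
    statement should not cover. Both arguments are illustrative assumptions.
    """
    statement = {
        "Sid": "DenyOutsideAllowedRegions",
        "Effect": "Deny",
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": sorted(allowed_regions)}
        },
    }
    if exempt_actions:
        # NotAction: deny everything except the exempted actions.
        statement["NotAction"] = sorted(exempt_actions)
    else:
        statement["Action"] = "*"
    return {"Version": "2012-10-17", "Statement": [statement]}


policy = build_region_scp(
    allowed_regions={"eu-west-1", "eu-central-1"},        # hypothetical Regions
    exempt_actions={"iam:*", "organizations:*", "sts:*"},  # hypothetical exemptions
)
print(json.dumps(policy, indent=2))
```

Because SCPs set the maximum permissions for an account, a deny statement like this one limits Region use regardless of what individual IAM policies in the account allow.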
Define organizational unit structure
Organizational units (OUs) serve as a practical way to manage and categorize accounts based on regulatory requirements and software development lifecycle (SDLC) environments. By using OUs, organizations streamline the process of applying appropriate policies and permissions across their cloud infrastructure. Workload OUs are specifically designed for accounts that support application infrastructure resources, and they help ensure that the right policies are enforced. Using OUs and SCPs together helps strengthen the security and compliance of your cloud infrastructure while supporting the smooth operation of your applications and services. This ultimately leads to a more efficient and robust cloud adoption process.
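To make the relationship between OUs and policies concrete, here is a small sketch that models an OU hierarchy and lists the policies in scope for accounts in a given OU. It assumes that a policy attached to a parent also applies to its children. The class `OrgUnit`, the OU names, and the policy names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OrgUnit:
    name: str
    parent: Optional["OrgUnit"] = None
    policies: List[str] = field(default_factory=list)

    def path(self) -> List["OrgUnit"]:
        """Return the chain of OUs from the root down to this OU."""
        node, chain = self, []
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def policies_in_scope(self) -> List[str]:
        """Collect policies attached anywhere on the path, root first."""
        seen: List[str] = []
        for ou in self.path():
            for policy_name in ou.policies:
                if policy_name not in seen:
                    seen.append(policy_name)
        return seen


# Hypothetical structure: Root -> Workloads -> Prod / Dev
root = OrgUnit("Root", policies=["deny-unapproved-regions"])
workloads = OrgUnit("Workloads", parent=root, policies=["deny-root-user"])
prod = OrgUnit("Prod", parent=workloads, policies=["deny-public-buckets"])
dev = OrgUnit("Dev", parent=workloads)

print(prod.policies_in_scope())
print(dev.policies_in_scope())
```

Grouping accounts this way means a control is attached once, at the level where it belongs, rather than repeated per account.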
Define network connectivity
Network connectivity
When you design your network architecture, consider if you have workloads that you want to retain on premises
- Connectivity to and from the internet. This aspect involves providing secure and reliable connections between your applications or workloads and the internet. This connectivity is essential for facilitating access to web-based resources, enabling communications between users and applications, and ensuring that your services are accessible to the public when needed.
- Connectivity across your cloud environments. This area focuses on establishing robust connections among various components and services within your cloud infrastructure. It ensures that data and resources are easily shared and accessed across different cloud services, promoting efficient collaboration and smoother operations. A key consideration here is your use of virtual private clouds (VPCs). To keep things simple, consider creating standards on how VPCs are created and tracked. Consider creating these standards programmatically, and plan to use an IP address management (IPAM) solution. Allocate enough IP space to allow for growth, and design subnet structures for easy troubleshooting when using multiple Availability Zones. Make sure to follow security best practices for VPCs when you design and implement network connectivity.
- Connectivity between your on-premises network and your cloud environments. This aspect deals with the integration of your on-premises infrastructure with your cloud-based environment. By creating secure and reliable connections between the two, organizations benefit from the advantages of hybrid architectures. For example, you can use on-premises resources and cloud services simultaneously for improved performance, scalability, and cost optimization.
By addressing these three key areas of network connectivity, you can build a robust cloud infrastructure that supports your applications and workloads effectively, so you can capitalize on the benefits of cloud adoption. Take note of networking requirements, and create a simple design that enables you to scale in accordance with your multi-account strategy.
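The advice above to create VPC standards programmatically and to leave IP space for growth can be sketched with Python's standard `ipaddress` module. The function `plan_subnets`, the CIDR range, the Availability Zone labels, and the tier names are assumptions for illustration, and a real IPAM solution would track the allocations.

```python
import ipaddress


def plan_subnets(vpc_cidr, availability_zones, tiers, new_prefix):
    """Assign one equally sized subnet per (tier, Availability Zone) pair.

    Subnets are handed out in order, so the layout is predictable and easy
    to read when troubleshooting. Unassigned blocks stay free for growth.
    """
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for az in availability_zones:
            try:
                plan[(tier, az)] = next(blocks)
            except StopIteration:
                raise ValueError("VPC range is too small for this layout") from None
    return plan


# Hypothetical inputs: a /16 VPC split into /20 subnets across three zones.
plan = plan_subnets(
    vpc_cidr="10.0.0.0/16",
    availability_zones=["az-a", "az-b", "az-c"],
    tiers=["public", "private"],
    new_prefix=20,
)
for (tier, az), subnet in plan.items():
    print(f"{tier:8} {az}  {subnet}")
```

In this layout, six of the sixteen available /20 blocks are used, which leaves ten blocks unallocated for future tiers or zones.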
Define DNS strategy
A well-planned DNS strategy helps you avoid complications as your cloud environments grow. If you maintain on-premises DNS capabilities, we recommend that you design hybrid DNS architectures that use on-premises DNS infrastructure along with cloud DNS for any cloud-based DNS requirements. Integrate DNS resolution with on-premises DNS environments by using resolver endpoints and forwarding rules. Use private hosted zones to hold information about how you want cloud DNS to respond to queries for a domain and its subdomains within one or more networks.
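One way to reason about forwarding rules in a hybrid DNS design is to think of them as a table keyed by domain suffix, where the most specific matching suffix decides where a query is sent. The sketch below models that selection. The function `pick_forwarding_rule`, the domain names, and the target IP addresses are hypothetical, and the matching logic is a simplified assumption rather than a description of any particular resolver's implementation.

```python
def pick_forwarding_rule(query_name, rules):
    """Return (suffix, targets) for the most specific matching rule, or None.

    rules maps a domain suffix to the on-premises resolver IPs that queries
    for that domain should be forwarded to.
    """
    labels = query_name.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in rules:
            return candidate, rules[candidate]
    return None  # No rule matched: resolve with cloud DNS instead.


# Hypothetical forwarding rules for on-premises domains.
rules = {
    "corp.example.com": ["10.10.0.2", "10.10.1.2"],
    "example.com": ["10.20.0.2"],
}

print(pick_forwarding_rule("db.corp.example.com.", rules))
print(pick_forwarding_rule("www.example.com", rules))
print(pick_forwarding_rule("service.internal.cloud", rules))
```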
Define tagging standards
Tagging resources is an essential practice to manage costs effectively and identify ownership of resources. Consider how your organization will further allow consumption in the cloud, including the use of specific services within the platform. Define a tagging strategy that tracks which resources are being deployed by which teams. Take inputs from the AWS CAF Operations perspective and use tags to automate tasks for your deployed infrastructure.
Additionally, by tagging resources with relevant metadata, you can group and track your spending based on the organizational requirements defined in the Cloud Financial Management (CFM) capability in the AWS CAF Governance perspective. Identify a reporting mechanism that supports your accounting and financial practices, including the actions to take when financial policies are violated.
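A tagging standard is easier to enforce when it is expressed as data. The following sketch checks resources against a set of required tag keys and groups spending by a tag. The tag keys, resource records, and cost figures are invented for the example, and a real implementation would read them from your inventory and billing data.

```python
from collections import defaultdict

# Hypothetical tagging standard: every resource must carry these keys.
REQUIRED_TAGS = {"owner", "cost-center", "environment"}


def missing_tags(resource):
    """Return the required tag keys that the resource does not have."""
    return sorted(REQUIRED_TAGS - set(resource.get("tags", {})))


def spend_by_tag(resources, tag_key):
    """Sum monthly cost per value of tag_key; untagged spend is kept visible."""
    totals = defaultdict(float)
    for resource in resources:
        value = resource.get("tags", {}).get(tag_key, "untagged")
        totals[value] += resource["monthly_cost"]
    return dict(totals)


resources = [
    {"id": "i-1", "monthly_cost": 120.0,
     "tags": {"owner": "team-a", "cost-center": "1001", "environment": "prod"}},
    {"id": "i-2", "monthly_cost": 80.0,
     "tags": {"owner": "team-b", "environment": "dev"}},
    {"id": "i-3", "monthly_cost": 40.0, "tags": {}},
]

for resource in resources:
    print(resource["id"], "missing:", missing_tags(resource))
print(spend_by_tag(resources, "owner"))
```

Reporting untagged spend as its own bucket gives you a simple measure of how well the standard is being followed.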
Define an observability strategy
Establishing an observability strategy is a critical step toward optimizing and securing your cloud architecture. This strategy revolves around transforming the metrics and logs produced by your cloud services into actionable insights for strategic decision-making. Prioritize monitoring key performance indicators and setting up alerts to preemptively address potential issues. To prevent tool proliferation, optimize costs, and focus on what matters most to your organization, incorporate this observability strategy across both your platform and applications. For further guidance, see our presentation on Developing an observability strategy.
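As a simple illustration of turning a metric into an alert, the sketch below flags a key performance indicator only when it stays above a threshold for several consecutive datapoints, which reduces noise from short spikes. The function name `breaches`, the threshold, and the sample values are assumptions for the example.

```python
def breaches(datapoints, threshold, periods):
    """Return True when the most recent `periods` datapoints all exceed threshold."""
    if len(datapoints) < periods:
        return False  # Not enough data to decide.
    return all(value > threshold for value in datapoints[-periods:])


# Hypothetical CPU utilization samples, oldest first.
cpu_percent = [41, 55, 83, 87, 91]

if breaches(cpu_percent, threshold=80, periods=3):
    print("ALERT: CPU above 80% for 3 consecutive periods")
```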
Advance
Define proactive and detective controls
To advance, your organization must identify the need for proactive and detective controls (guardrails) within the environment. Create policies that define the guardrails or limits that roles and users have in the accounts located within an organizational unit (OU). Review any default detective guardrails for the platform, and choose which guardrails to apply. Create additional preventive and detective controls as required, and group them by OUs to align them to your multi-account strategy. Consider which organizational tools and mechanisms you need to inspect non-compliant resources that are identified by detective controls.
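The idea of grouping detective controls by OU can be sketched as a mapping from OU name to a list of check functions, each of which inspects a resource and reports whether it is non-compliant. The control functions, OU names, and resource records below are hypothetical.

```python
def unencrypted_volume(resource):
    """Detect storage volumes that are not encrypted."""
    return resource["type"] == "volume" and not resource.get("encrypted", False)


def public_bucket(resource):
    """Detect buckets that allow public access."""
    return resource["type"] == "bucket" and resource.get("public", False)


# Hypothetical grouping: stricter controls for Prod than for Dev.
CONTROLS_BY_OU = {
    "Prod": [unencrypted_volume, public_bucket],
    "Dev": [unencrypted_volume],
}


def evaluate(resources):
    """Run each OU's controls against its resources and collect findings."""
    findings = []
    for resource in resources:
        for control in CONTROLS_BY_OU.get(resource["ou"], []):
            if control(resource):
                findings.append({
                    "resource": resource["id"],
                    "ou": resource["ou"],
                    "control": control.__name__,
                })
    return findings


resources = [
    {"id": "vol-1", "ou": "Prod", "type": "volume", "encrypted": False},
    {"id": "bkt-1", "ou": "Prod", "type": "bucket", "public": True},
    {"id": "bkt-2", "ou": "Dev", "type": "bucket", "public": True},
    {"id": "vol-2", "ou": "Dev", "type": "volume", "encrypted": True},
]

for finding in evaluate(resources):
    print(finding)
```

In this example, the public bucket in the Dev OU produces no finding because that control is applied only to Prod, which shows how OU grouping changes what is reported.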
Define standards for service onboarding
Create standards for the acceptable use of the platform, the patterns associated with service consumption, and how that consumption will be governed. Consider which initial services are allowed for use. Create a document that outlines these standards, and publish it to users and operators of the platform. Ensure that these standards adapt over time to meet the changing objectives of the organization and the evolving capabilities of cloud computing.
Define patterns and principles
Consider which architectural patterns will be allowed within your organization by using inputs from application owners, and begin to define blueprints for standardization. Standardization allows for greater governance and lower administrative burden as you scale in the cloud. Define patterns that will use infrastructure as code (IaC) and plan for a simplified deployment model by using a service catalog that's integrated into your change control processes and IT service management (ITSM) systems. Define how these blueprints will be used and the circumstances for allowing exceptions. Plan for those exceptions and their governance, with considerations for authentication, security monitoring, and guardrails.
Excel
Define remediation patterns
Consider how to annotate and prioritize your detective guardrail findings so they can be remediated in accordance with your security and compliance frameworks. Plan to use automation to detect out-of-policy provisioning of resources, including those that violate budgetary and tagging policies. Identify the capabilities needed to set and measure service-level objectives while updating your runbooks and playbooks. Set periodic reviews of these practices and a feedback mechanism to capture data related to platform evolution. Define mechanisms to create and update runbooks and playbooks accordingly.
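Prioritizing findings for remediation can start with a simple ordering rule. The sketch below sorts findings by severity first and, within the same severity, puts the oldest findings first. The severity scale, field names, and sample findings are assumptions, so adapt the ranking to your own security and compliance frameworks.

```python
# Hypothetical severity scale: lower rank is remediated first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(findings):
    """Order findings by severity, then by age (oldest first)."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], -f["age_days"]),
    )


findings = [
    {"id": "f1", "severity": "medium", "age_days": 30},
    {"id": "f2", "severity": "critical", "age_days": 2},
    {"id": "f3", "severity": "high", "age_days": 10},
    {"id": "f4", "severity": "high", "age_days": 21},
]

for finding in prioritize(findings):
    print(finding["id"], finding["severity"], finding["age_days"])
```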
Communicate and refine policies
Create a centralized content management system for all documentation and distribute it to the users and operators of the platform. Create a mechanism to capture feedback for future consideration on changes to the policy.
Understand financial management capabilities
Organizations thrive when they maintain a transparent and comprehensive understanding of their budget. A clear view of the budget supports informed decision-making, effective resource allocation, cost control, performance measurement, and the maintenance of accountability and compliance. This helps the organization accomplish its strategic objectives and ultimately become more efficient, financially stable, and prosperous.
When you have a successful tagging strategy, you can use cost filters in AWS Budgets to filter expenses based on resource tags. This helps you create a budget that's tailored to specific projects, departments, environments, or other criteria, further enhancing financial management capabilities. You can associate cost allocation tags and AWS Cost Categories