How to build a secure AWS infrastructure

August 28, 2020

Every day, more businesses migrate away from traditional IT infrastructure, and the pandemic has only accelerated the adoption of cloud technologies among remote workforces. Cloud services such as Amazon Web Services (AWS) are widely accepted as a cost-effective and secure way to run workloads and deliver software and applications to a global marketplace. However, cloud consumers often neglect their own responsibility for securing their cloud infrastructure.

Cloud service providers and consumers share the responsibility of ensuring a safe and secure experience on the cloud. While service providers are liable for the underlying infrastructure that enables cloud, users are responsible for the data that goes on the cloud and who has access to it.

AWS cloud service

The AWS Well-Architected Framework is a whitepaper issued by Amazon that covers key AWS concepts, design principles, and architectural best practices. Security is one of the five pillars this Framework is based on, which underlines how crucial protecting data and improving security are for AWS users. This blog summarizes the whitepaper on the security pillar and discusses:

Design principles for AWS,
A few use case scenarios, and
Recommended ways to implement a securely designed AWS infrastructure.

AWS provides a variety of cloud services for computation, storage, database management, and more. A good architecture commonly focuses on efficient methods for reaching peak performance, scalable design, and cost-saving techniques. Quite often, however, these design aspects are given more importance than security.

The security of the cloud infrastructure can be divided into five phases:

Identity verification and access management with respect to AWS resources.
Attack detection, identification of potential threats and misconfigurations.
Controlling access by defining trust boundaries and applying operational best practices.
Classifying all data and protecting it in both states: at rest and in transit.
Incident response: Pre-defined mechanisms to respond and mitigate any surfacing security incident.

The Shared Responsibility Model

As I mentioned earlier, it is the collective responsibility of the user and the AWS service provider to secure the cloud infrastructure. It is important to keep this in mind while we explore the different implementation details and design principles.

AWS provides plenty of monitoring, protection and threat identification tools to reduce the operational burden of its users, and it is very important to understand and choose an appropriate service to achieve a well secured environment.

AWS offers many services of differing nature and use cases, such as EC2 and Lambda. Each of these cloud services has a different level of abstraction that lets users focus on the problem to be solved instead of on operations. Each party’s share of responsibility varies with that level of abstraction: the higher the abstraction, the more of the responsibility for securing the cloud shifts to the service provider (with some exceptions).

AWS - Shared Responsibility Model

Management and Separation of User Accounts to Organise Workload

Workloads differ based on the nature of the processes run on AWS and the sensitivity of the data they process. They must be separated by logical boundaries and organised into multiple accounts so that different environments are isolated from one another. For instance, the production environment commonly has stricter policies and more compliance requirements, and must be isolated from the development and test environments.

It is important to note that the AWS root user must not be used for everyday operations. AWS Organizations simplifies this setup: multiple accounts can be created and managed centrally under the same organisation, each with its own access policies and roles. It is also advisable to enable Multi-Factor Authentication, especially on the root user.
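As a rough illustration of enforcing MFA, the sketch below builds an IAM policy document that denies most actions when a request was not authenticated with MFA. It uses the `aws:MultiFactorAuthPresent` condition key; the function name and the list of exempted actions are assumptions made for this example. Note that a policy like this applies to IAM users and roles only. The root user cannot be restricted by IAM policies, so MFA on root still has to be enabled directly.

```python
import json


def deny_without_mfa_policy(exempt_actions):
    """Build an IAM policy that denies every action except `exempt_actions`
    when the request was not authenticated with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptListedIfNoMFA",
                "Effect": "Deny",
                # NotAction: the deny applies to everything NOT listed here.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    # BoolIfExists also matches requests where the key is absent.
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


# Assumed exemption list: actions a user needs in order to enrol an MFA device.
mfa_policy = deny_without_mfa_policy(
    [
        "iam:CreateVirtualMFADevice",
        "iam:EnableMFADevice",
        "iam:ListMFADevices",
        "sts:GetSessionToken",
    ]
)
print(json.dumps(mfa_policy, indent=2))
```

The resulting JSON could be attached to a group of IAM users; the exemption list should be reviewed against what your users actually need before they have enrolled a device.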

Managing Identity and Permissions

AWS resources can be accessed by humans (such as developers or app users) or machines (such as EC2 instances or Lambda functions). Setting up and managing an access control mechanism based on the identity of the requester is very important, as those seeking access may be internal or external to the organization.

Each identity should be granted access to specific resources and actions through IAM (Identity and Access Management) roles, with policies defining the access control rules. Based on the identity of the requester and the policies attached, certain critical functionality can be disabled. For example, certain changes can be denied for all user accounts with an exception for the Admin, or all users can be prevented from deleting Amazon VPC flow logs.
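The flow-log example can be sketched as a policy document. The snippet below is a minimal sketch, not a definitive implementation: it builds a deny statement of the kind used in service control policies, blocking deletion of VPC flow logs and their CloudWatch log groups for every principal except an admin role. The function name and the role ARN pattern `arn:aws:iam::*:role/OrgAdmin` are hypothetical and should be replaced with your own.

```python
import json


def protect_flow_logs_policy(admin_role_arn_pattern):
    """Deny deletion of VPC flow logs for everyone except the admin role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyFlowLogDeletionExceptAdmin",
                "Effect": "Deny",
                "Action": [
                    "ec2:DeleteFlowLogs",
                    "logs:DeleteLogGroup",
                    "logs:DeleteLogStream",
                ],
                "Resource": "*",
                "Condition": {
                    # The deny applies only when the caller is NOT the admin role.
                    "ArnNotLike": {"aws:PrincipalARN": admin_role_arn_pattern}
                },
            }
        ],
    }


# Hypothetical admin role name, used for illustration only.
flow_log_policy = protect_flow_logs_policy("arn:aws:iam::*:role/OrgAdmin")
print(json.dumps(flow_log_policy, indent=2))
```

Using an explicit deny with an exception condition means the rule holds even if some other policy grants broad permissions, since an explicit deny overrides any allow.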

Each identity added to the organisation should be given access only to the set of functions necessary to fulfil its required tasks. This limits unintended access to functionality and ensures that unexpected behaviour from any single identity has only a small impact.
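One way to keep an eye on least privilege is to scan policy documents for statements that allow wildcard actions on all resources. The helper below is a small sketch under that assumption; the function names and the example policy are made up for illustration, and the check is deliberately simple (it ignores conditions and `NotAction`).

```python
def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def find_overly_broad_statements(policy):
    """Return the Sid (or a positional label) of every Allow statement that
    combines a wildcard action with Resource "*"."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt.get("Sid", f"statement-{index}"))
    return findings


# Hypothetical policy used only to exercise the check.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReports",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-reports/*",
        },
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}
print(find_overly_broad_statements(example_policy))  # ['TooBroad']
```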

Leveraging AWS Services to Monitor and Detect Security Issues

Regular collection and analysis of logs generated by each workload component is very important for detecting unexpected behaviour, misconfigurations, or potential threats. However, collecting and analysing logs is not enough on its own. The volume of incoming logs can be huge, so an alerting and reporting flow should be set up along with an integrated ticketing system. AWS provides the following services to automate and simplify these processes:

CloudTrail: Provides the event history of AWS account activity, including actions taken through the AWS Management Console, SDKs, command-line tools, and other AWS services.
Config: Enables automated assessment, auditing, and evaluation of the configuration of each AWS resource.
GuardDuty: Continuous security monitoring service that flags malicious activity within AWS environments by analysing log data and searching for patterns that may indicate privilege escalation, exposed credentials, or connections to malicious IPs or domains.
Security Hub: Presents a comprehensive view of the security status of the AWS infrastructure by aggregating, prioritizing, and deduplicating security alerts from multiple AWS services and even third-party products.
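To make the alerting idea concrete, here is a minimal sketch that scans CloudTrail-style event records and picks out activity performed by the root user, which should be rare. It relies on the `userIdentity.type`, `eventName`, `eventTime`, and `sourceIPAddress` fields of a CloudTrail record; the function name and the sample records are invented for illustration, and a real pipeline would read the records from the delivered log files and forward alerts to a ticketing system.

```python
def root_activity_alerts(records):
    """Return one alert per record whose caller was the account root user."""
    alerts = []
    for record in records:
        identity = record.get("userIdentity", {})
        if identity.get("type") != "Root":
            continue
        alerts.append(
            {
                "time": record.get("eventTime"),
                "event": record.get("eventName"),
                "source_ip": record.get("sourceIPAddress"),
            }
        )
    return alerts


# Made-up sample records, trimmed to the fields the check uses.
sample_records = [
    {
        "eventTime": "2020-08-28T10:15:00Z",
        "eventName": "DescribeInstances",
        "sourceIPAddress": "198.51.100.7",
        "userIdentity": {"type": "IAMUser", "userName": "dev-alice"},
    },
    {
        "eventTime": "2020-08-28T10:20:00Z",
        "eventName": "ConsoleLogin",
        "sourceIPAddress": "203.0.113.9",
        "userIdentity": {"type": "Root"},
    },
]

for alert in root_activity_alerts(sample_records):
    print(alert)
```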

Protecting the Infrastructure: Networks and Compute

Obsolete software and outdated dependencies are not unusual, so it is essential to patch all systems in the infrastructure. System administrators can do this manually, but it is better to use AWS Systems Manager Patch Manager, which automates the process of applying patches to the OS, applications and code dependencies.

It is crucial to set up AWS security groups correctly, especially while the infrastructure is growing quickly. Things often go wrong when unorganized, messy security groups are added to the infrastructure. Security groups should be created and assigned with caution, as even a slight oversight can expose critical assets and data stores to the internet. Each security group should clearly define its ingress and egress traffic rules, which are set under the inbound and outbound rules respectively.
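A simple audit can catch the most common oversight: an inbound rule that opens an administrative or database port to the whole internet. The sketch below assumes security groups in the dictionary shape returned by the EC2 `DescribeSecurityGroups` API (`IpPermissions`, `IpRanges`, `Ipv6Ranges`). The function name, the set of "sensitive" ports, and the sample group are assumptions for this example.

```python
# Assumed set of ports that should never be open to the world: SSH, RDP,
# MySQL, PostgreSQL. Adjust to your own environment.
SENSITIVE_PORTS = frozenset({22, 3389, 3306, 5432})


def open_sensitive_ingress(security_group, sensitive_ports=SENSITIVE_PORTS):
    """Return the sorted sensitive ports reachable from any address."""
    exposed = set()
    for rule in security_group.get("IpPermissions", []):
        open_to_world = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
        ) or any(r.get("CidrIpv6") == "::/0" for r in rule.get("Ipv6Ranges", []))
        if not open_to_world:
            continue
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            exposed.update(sensitive_ports)
            continue
        if protocol not in ("tcp", "udp"):
            continue  # e.g. ICMP, where the port fields carry type/code
        low, high = rule.get("FromPort"), rule.get("ToPort")
        if low is None or high is None:
            continue
        exposed.update(p for p in sensitive_ports if low <= p <= high)
    return sorted(exposed)


# Made-up security group for illustration.
web_group = {
    "GroupId": "sg-example",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
}
print(open_sensitive_ingress(web_group))  # [22]
```

Here HTTPS is intentionally public and PostgreSQL is limited to an internal range, so only the SSH rule is reported.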

If some assets must be exposed to the internet, make sure your network is protected against DDoS attacks. AWS services such as CloudFront, WAF, and Shield help enable DDoS protection at multiple layers.

Protecting the Data

Classifying all data stored at the various locations inside the infrastructure is essential. Unless it is clear which data is most critical and which can be exposed directly on the internet, setting up protection mechanisms is difficult. Data in every data store must be classified in terms of sensitivity and criticality. If the data is sensitive enough that users should not access it directly, policies and mechanisms for ‘action at a distance’ should be put in place.

AWS provides multiple data storage services, the most common being S3 and EBS volumes. Application data often resides in data stores self-hosted on EBS volumes. All sensitive data that goes into S3 buckets should be properly encrypted before it is stored. In fact, it is better to enable encryption by default on both.
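For S3, one common way to enforce encryption is a bucket policy that rejects uploads which do not request server-side encryption. The sketch below builds such a policy using the `s3:x-amz-server-side-encryption` condition key; the function name and the bucket name `example-sensitive-data` are placeholders. Treat it as an illustration: a policy like this requires every client to send the encryption header explicitly.

```python
import json


def require_sse_kms_policy(bucket_name):
    """Bucket policy that denies PutObject unless SSE-KMS is requested."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                # Object-level action, so the resource is every key in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


# Placeholder bucket name for illustration.
encryption_policy = require_sse_kms_policy("example-sensitive-data")
print(json.dumps(encryption_policy, indent=2))
```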

Protecting data in transit is equally important. This requires secure connections, which can be established using TLS encryption, so make sure that data is always transferred over secure channels. AWS Certificate Manager is a good tool for managing SSL/TLS certificates.
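On the client side, "transferring over secure channels" mostly comes down to verifying certificates and refusing outdated protocol versions. The following is a minimal sketch using Python's standard `ssl` module; the function name is made up, and the choice of TLS 1.2 as the floor is an assumption that should follow your own compliance requirements.

```python
import ssl


def strict_tls_client_context():
    """Client-side TLS context that verifies the server certificate and
    hostname, and refuses protocol versions older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed minimum
    return context


tls_context = strict_tls_client_context()
print(tls_context.minimum_version.name, tls_context.verify_mode.name)
```

The context can then be passed to standard-library clients, for example `urllib.request.urlopen(url, context=tls_context)`.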

Preparing and Responding to Security Incidents the Right Way

Once the automation has been set up and security controls are in place, designing incident response plans and playbooks becomes easier. A good plan must cover the response, communication, and recovery steps following any security incident. This is where logs, snapshots, backups, and GuardDuty findings play a critical role by making the response more efficient. Overall, the aim should be to prepare for an incident before it happens, and to iterate on the plan and train the entire team to follow it thoroughly.
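Part of a playbook can be encoded so that the first response step is consistent every time. The sketch below triages findings by severity into three hypothetical response tracks. The simplified finding shape (`id`, `type`, `severity`), the track names, and the thresholds are all assumptions for illustration; real GuardDuty findings carry many more fields, and the cut-offs should match your own plan.

```python
def triage(findings, page_at=7.0, ticket_at=4.0):
    """Group finding IDs into response tracks, highest severity first."""
    plan = {"page_on_call": [], "open_ticket": [], "log_only": []}
    for finding in sorted(findings, key=lambda f: f["severity"], reverse=True):
        if finding["severity"] >= page_at:
            plan["page_on_call"].append(finding["id"])
        elif finding["severity"] >= ticket_at:
            plan["open_ticket"].append(finding["id"])
        else:
            plan["log_only"].append(finding["id"])
    return plan


# Made-up findings in a simplified shape.
sample_findings = [
    {"id": "f-1", "type": "Recon:EC2/PortProbeUnprotectedPort", "severity": 2.0},
    {"id": "f-2", "type": "UnauthorizedAccess:EC2/SSHBruteForce", "severity": 5.0},
    {"id": "f-3", "type": "CryptoCurrency:EC2/BitcoinTool.B!DNS", "severity": 8.0},
]
print(triage(sample_findings))
```

Keeping the thresholds as parameters makes it easy to rehearse the plan with different settings during team training.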