Enterprises are rapidly adopting cloud-native architectures and design patterns to help deliver business value faster, improve user experience, maintain a faster pace of innovation, and ensure high availability and scalability of their products. Cloud-native applications leverage modern practices like microservices architecture, containerization, DevOps, infrastructure-as-code, and automated CI/CD processes.
Cloud-native application security is a cloud-first approach used to deploy applications securely at scale by embedding security into the software development lifecycle to detect vulnerabilities earlier. This article will walk through the critical challenges of cloud-native application security, demonstrate how to build security into the CI/CD pipeline, and introduce the core practices of cloud-native security.
Cloud-native architectures bring in challenges related to application and infrastructure security. Let us look at a few of the most prominent challenges organizations face related to cloud-native security.
Traditional security tooling is built for static environments and is ineffective in the dynamic and rapidly changing cloud-native landscape. Furthermore, with the advent of microservices, containers, service meshes, and multi-cloud environments, it has become increasingly difficult for organizations to track software vulnerabilities. As a result, there is an increased dependency on automation and continuous monitoring throughout the application lifecycle.
When development teams build products, their primary focus areas are functionality and usability. Faster release cycles make it difficult to inspect and resolve security vulnerabilities correctly. In addition, development teams do not always have the required skillset to identify security issues and, at the same time, do not want to be slowed down by unknown security concerns. As a result, security often takes a back seat. However, security should be treated as an integral part of the DevOps pipeline, precisely because cloud-native teams are under pressure to deliver high-quality software quickly.
By adding reusable external dependencies in the codebase, developers can leverage complex functionalities without developing and maintaining them. However, open-source libraries are susceptible to being compromised, causing security issues in your application. Therefore, you must do your due diligence to ensure that software dependencies are inspected for malware and vulnerabilities.
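One small part of that due diligence is checking that a dependency you download is byte-for-byte the artifact you reviewed. The sketch below illustrates the idea with a pinned SHA-256 digest. The artifact contents and the helper names are hypothetical, and a digest check only detects tampering; it does not replace vulnerability or malware scanning.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(artifact: bytes, pinned_hex: str) -> bool:
    """True only if the artifact hashes to the digest recorded at review time."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(artifact), pinned_hex.lower())


# Hypothetical artifact contents; in practice these are the bytes of a downloaded package.
reviewed = b"example-package-1.0.0 contents"
pinned = sha256_hex(reviewed)  # digest stored alongside the dependency list

tampered = reviewed + b" plus injected code"
print(matches_pinned_digest(reviewed, pinned))   # True
print(matches_pinned_digest(tampered, pinned))   # False
```

Storing the pinned digest in source control next to the dependency list means a changed artifact shows up as a failed check rather than a silent upgrade.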
Security in the cloud brings a new set of challenges that your organization might not be trained to handle. Hence it is imperative that you evaluate and finalize the right tools to secure your applications in a cloud-native world.
With containers spinning up and down within seconds, you need tools to provide real-time visibility into your containerized environments. The attack surface in the cloud is rapidly increasing, and there are numerous cases of data breaches, compliance issues, and compromised APIs. From a security standpoint, having complete observability of your workloads by leveraging the right tools for logging, metrics, traces, and alerting is critical.
Today, enterprises leverage third-party security tooling and managed services provided by their public cloud provider to build their cloud security posture. However, it is challenging to develop centralized policies and guardrails that apply across your cloud-native environments. This requires your development teams to work closely with the security team. As a best practice, you should have guardrails in place, which can disallow actions that lead to policy violations.
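A guardrail of this kind can be thought of as a rule evaluated before an action is allowed. The following is a minimal sketch under assumed names: the `Guardrail` class, the two rules, and the resource fields (`type`, `encrypted`, `public`) are all hypothetical and not tied to any cloud provider's policy engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Guardrail:
    name: str
    is_violated: Callable[[dict], bool]  # returns True when the resource breaks the rule


# Hypothetical organization-wide guardrails.
GUARDRAILS = [
    Guardrail(
        "storage-must-be-encrypted",
        lambda r: r.get("type") == "storage" and not r.get("encrypted", False),
    ),
    Guardrail("no-public-access", lambda r: bool(r.get("public", False))),
]


def violations(resource: dict) -> list[str]:
    """Names of all guardrails the proposed resource would violate."""
    return [g.name for g in GUARDRAILS if g.is_violated(resource)]


def admit(resource: dict) -> bool:
    """Disallow the action when any guardrail is violated."""
    return not violations(resource)


proposed = {"type": "storage", "encrypted": False, "public": True}
print(violations(proposed))  # ['storage-must-be-encrypted', 'no-public-access']
print(admit(proposed))       # False
```

Keeping the rules in one shared list is what makes them centralized: development teams call `admit` and the security team owns the rule definitions.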
Having DevOps processes in place results in improved efficiency, reduced failures, faster deployment cycles, enhanced application performance, and better customer experience. Taking a step further, DevSecOps can be defined as a practice to deliver secure software through a continuous delivery model. Therefore, security should be considered an integral part of your CI/CD pipeline, as seen in Figure 1, and you need to ensure that it is built into the application lifecycle phases in an iterative and automated manner.
The velocity and frequency of feature deployment have increased tremendously in a cloud-native world. A single security team in an organization cannot be entirely responsible for the security of all your applications in the cloud. Bridging the gap between development, operations, and security teams is critical to deploying secure applications. To shift security left, build security controls into all your pipeline stages. Fixing security issues in production is expensive, and hence incorporating security practices during the development phase is highly recommended. Shifting left requires collaboration and engagement between teams during the early stages of your development cycle.
Hardening security requirements during the initial design and development phases is essential. Encourage development teams to keep security in mind while writing unit, integration, and end-to-end tests. As a best practice, do not just focus on happy-path workflows but have effective coverage of negative workflows, boundary conditions, and edge cases. Always test error-handling scenarios and authentication workflows, and maintain extensive coverage for high-risk and frequently used code. When testing is built into the CI/CD process as a required step, code cannot be released to production without passing tests.
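As an illustration of covering more than the happy path, here is a minimal sketch using Python's `unittest`. The function under test, `parse_port`, is a hypothetical example of code that handles untrusted input; the test names and inputs are assumptions chosen to show negative, boundary, and malformed cases.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical function under test: parse a TCP port from untrusted text."""
    if not value.isascii() or not value.isdigit():
        raise ValueError("port must contain only ASCII digits")
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    return port


class ParsePortTests(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_boundaries(self):
        # Both ends of the valid range, then one step outside each end.
        self.assertEqual(parse_port("1"), 1)
        self.assertEqual(parse_port("65535"), 65535)
        for bad in ("0", "65536"):
            with self.assertRaises(ValueError):
                parse_port(bad)

    def test_malformed_input(self):
        # Empty, signed, injected, padded, and non-ASCII digit inputs.
        for bad in ("", "-1", "80; rm -rf /", "8080 ", "\u0668\u0660"):
            with self.assertRaises(ValueError):
                parse_port(bad)
```

Running `python -m unittest` in the pipeline makes a failure in any of these cases block the build.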
Static code analysis tools have many security-related rules that cover well-established security standards such as OWASP Top 10 and CWE. You can also add custom rules to identify security issues. Rules for flaws such as cross-site scripting, SQL injection, code injection, and denial of service indicate issues at the application level that need to be addressed by developers following secure coding standards.
Security hotspots are sensitive pieces of code that should be examined during the code review process. A confirmed security vulnerability, by contrast, might have a broader impact on your application and needs to be fixed immediately. As part of the CI/CD pipeline, every code change gets scanned against these security rules and flagged if there are violations. You can fail your quality gates, as seen in Figure 2, when the security standards are not met.
Figure 2: SonarQube Quality Gate
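The idea of failing a quality gate can be sketched independently of any particular tool. The example below is an illustration only: it does not use SonarQube's API, and the findings format, the severity names, and the `MAX_ALLOWED` thresholds are assumptions.

```python
from collections import Counter

# Hypothetical thresholds: maximum number of findings tolerated per severity.
MAX_ALLOWED = {"critical": 0, "high": 0, "medium": 5}


def gate_failures(findings: list[dict]) -> list[str]:
    """Describe every severity level whose finding count exceeds its threshold."""
    counts = Counter(f["severity"] for f in findings)
    return [
        f"{severity}: {counts[severity]} found, {limit} allowed"
        for severity, limit in MAX_ALLOWED.items()
        if counts[severity] > limit
    ]


def exit_code(findings: list[dict]) -> int:
    """Non-zero exit code so the CI job fails when the gate is not met."""
    return 1 if gate_failures(findings) else 0


# Hypothetical scanner output for one code change.
findings = [
    {"rule": "sql-injection", "severity": "critical"},
    {"rule": "weak-hash", "severity": "medium"},
]
print(gate_failures(findings))  # ['critical: 1 found, 0 allowed']
print(exit_code(findings))      # 1
```

A pipeline step would pass this exit code to the CI system, which stops the build on any non-zero value.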
Peer code reviews are a common practice among development teams. You can implement mandatory code reviews to promote secure code writing by catching common mistakes and vulnerabilities before they get merged into the main branch. When a pull request gets created for a particular functionality, ensure a security focus while reviewing the changes. Look out for secure practices like sanitizing outputs, proper secret management, no hardcoding of sensitive data, authentication workflows, session management, logging, and exception handling.
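To make one of these review items concrete, the sketch below shows output sanitization for HTML using the standard library's `html.escape`. The `render_comment` function and its markup are hypothetical; a real application would more likely rely on a templating engine that escapes by default.

```python
import html


def render_comment(author: str, text: str) -> str:
    """Build an HTML fragment, escaping every user-supplied value before output."""
    return f"<p><b>{html.escape(author)}</b>: {html.escape(text)}</p>"


print(render_comment("eve", "<script>alert(1)</script>"))
# <p><b>eve</b>: &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

In a review, the thing to look for is any user-controlled value that reaches the response without passing through an escaping step like this.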
Cloud-Native Security Patterns and Anti-Patterns
The cloud-native architecture enables organizations to build and run scalable applications in a dynamic environment. However, it does come with several challenges — security, cost, governance, observability, and more. Let us look at some of the best practices every development team working in the cloud-native space needs to embrace to secure their applications.
Zero trust is a strategic approach to rebuild and modernize security by enforcing strict access controls to protect data, applications, and networks. By inspecting and monitoring network traffic to catch any malicious activity, the zero-trust architecture helps reduce the blast radius in case of a compromise. In a cloud-native architecture that uses a combination of microservices and containers, a service mesh helps reduce the attack surface and implement the zero-trust security model.
|Pattern||Every entity must authenticate itself, and implicit trust in data and applications is denied even within a network perimeter.|
|Anti-Pattern(s)||Workloads are not monitored for misconfigurations and vulnerabilities. A least-privilege access strategy between components is not implemented.|
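To illustrate what "every entity must authenticate itself" can look like at the request level, here is a minimal sketch in which each service-to-service call carries an HMAC signature that the receiver verifies, regardless of where the call originates. The service names, demo keys, message layout, and `MAX_SKEW_SECONDS` are all assumptions. In practice a service mesh typically provides this through mutual TLS rather than hand-written signing.

```python
import hashlib
import hmac

# Hypothetical per-service keys; real keys would come from a secrets manager.
SERVICE_KEYS = {"orders": b"orders-demo-key", "billing": b"billing-demo-key"}
MAX_SKEW_SECONDS = 30  # reject stale requests to limit replay


def sign(service: str, body: bytes, timestamp: int) -> str:
    """Sign the caller identity, timestamp, and body with the caller's key."""
    message = f"{service}|{timestamp}|".encode() + body
    return hmac.new(SERVICE_KEYS[service], message, hashlib.sha256).hexdigest()


def is_authentic(service: str, body: bytes, timestamp: int, signature: str, now: int) -> bool:
    """Verify a request; nothing is trusted just for being inside the network."""
    if service not in SERVICE_KEYS:
        return False
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(service, body, timestamp)
    return hmac.compare_digest(expected.encode(), signature.encode())


sig = sign("orders", b'{"order": 42}', timestamp=1000)
print(is_authentic("orders", b'{"order": 42}', 1000, sig, now=1010))   # True
print(is_authentic("billing", b'{"order": 42}', 1000, sig, now=1010))  # False
```

The second call fails because the signature was produced with a different service's key, so a caller cannot impersonate another service even from inside the perimeter.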
Identity and access management (IAM) is a core component of the security management posture within an organization that enables the proper entities to access the right resources. IAM protects against compromised access, safeguards resources within the network, and helps defend against threats such as phishing and ransomware attacks.
|Pattern||Following the zero-trust model, each entity is authenticated and authorized when logging in or accessing resources.|
|Anti-Pattern(s)||Not treating IAM as a framework of policies and processes, such as single sign-on and multi-factor authentication, that help mitigate risk.|
The least-privilege policy grants permissions to only the resources required to perform the task; no other access gets assigned. Having overprivileged users and roles in an organization increases the risk factor. With an increasing number of security breaches caused by privileged credentials, it is best to always validate policies and adopt the least-privilege principle by default.
|Pattern||As a security best practice, when you create your IAM policies, start with a minimum set of permissions and grant additional permissions as required.|
|Anti-Pattern(s)||Granting overly broad permissions, which increases the blast radius and the risk of a breach.|
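Validating policies for over-broad grants can be partly automated. The sketch below flags wildcard permissions in a policy document. It assumes a JSON shape modeled on AWS-style IAM policies (`Statement`, `Effect`, `Action`, `Resource`); the example policy, bucket name, and function names are hypothetical.

```python
def as_list(value):
    """Policies may hold a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: dict) -> list[dict]:
    """Return the Allow statements that grant wildcard actions or resources."""
    flagged = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action or "*" in resources:
            flagged.append(statement)
    return flagged


# Hypothetical policy: one narrowly scoped statement and one that is too broad.
policy = {
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::reports/*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ]
}
print(len(overly_broad_statements(policy)))  # 1
```

Running a check like this in the pipeline keeps new policies starting from a minimal set of permissions, with broader grants requiring an explicit review.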
Cloud secrets management refers to tools and methods to securely manage secrets — passwords, certificates, SSH keys, encryption keys, and API tokens. You should have a strategy to rotate your passwords periodically. Public cloud providers provide managed services to handle secrets and their management.
|Pattern||It is essential to have policies and procedures for secret management established, documented, and communicated across the development teams in your organization.|
|Anti-Pattern(s)||Storing sensitive credentials in code repositories.|
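A lightweight way to catch this anti-pattern before code is committed is to scan changed files for secret-like strings. The sketch below is illustrative only: the three regular expressions are simplified assumptions, will produce false positives and false negatives, and are no substitute for a dedicated secret-scanning tool.

```python
import re

# Simplified, illustrative patterns; real scanners ship far larger rule sets.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded-credential": re.compile(
        r"(?i)\b(?:password|secret|api_key|token)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# Hypothetical file contents: line 2 hardcodes a credential, line 3 reads one at runtime.
sample = (
    'db_host = "db.internal"\n'
    'password = "hunter2hunter2"\n'
    'key = os.environ["API_KEY"]\n'
)
print(scan(sample))  # [(2, 'hardcoded-credential')]
```

Line 3 is not flagged because the value is looked up from the environment at runtime, which is the behavior a secrets manager integration would give you.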
Building incident response and triaging strategies is challenging when you have microservices running in a Kubernetes cluster in a cloud-native landscape. When you treat your workloads running in containers as cattle and not pets, performing post-mortem analysis and gathering audit trail events becomes difficult.
Containers spin up and down frequently, and so, responding to security threats in a transient environment requires a different strategy. Incident response is critical to resolving security issues efficiently and spreading awareness within your organization about operational duties.
|Pattern||As you start creating an incident response playbook, it is crucial to have access to proper observability tools, including logs, metrics, and traces.|
|Anti-Pattern(s)||There is no proper audit trail and monitoring to support troubleshooting activities.|
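Because containers are short-lived, an audit trail is only useful if each event is self-describing and shipped off the container. The following is a minimal sketch of structured audit events using the standard `logging` and `json` modules. The field names, the `audit` logger name, and the `pod` context key are assumptions, and forwarding the log lines to central storage is out of scope here.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")


def audit_event(action: str, actor: str, resource: str, outcome: str, **context) -> str:
    """Emit one audit event as a single JSON line and return that line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "actor": actor,
        "resource": resource,
        "outcome": outcome,
        # Workload identifiers let you trace the event after the container is gone.
        "context": context,
    }
    line = json.dumps(record, sort_keys=True)
    audit_log.info(line)
    return line


line = audit_event("login", "alice", "orders-api", "denied", pod="orders-7d9f")
print(json.loads(line)["context"])  # {'pod': 'orders-7d9f'}
```

One JSON object per line keeps the events easy to index and query in whatever log platform backs your incident response playbook.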
Cloud-native microservices support polyglot persistence, and therefore, development teams have flexibility in choosing the appropriate database technology, as seen in Figure 3, for developing their services. These datastores can store both structured and unstructured data to support a variety of functions like search, reporting, time-series, caching, transactional, etc.
|Pattern||You will need to support critical data management functions like backup and recovery, archival, data replication, and data encryption at rest and in motion. When it comes to data auditing, be aware of the regulations and compliance programs that apply to your data, such as HIPAA, GDPR, and FedRAMP.|
|Anti-Pattern(s)||Excluding data from your automated CI/CD pipeline.|
Figure 3: Polyglot persistence in cloud-native applications
Many organizations are running containerized workloads in production. Containers make it easy to package, deploy, and run your code, thereby increasing the speed and portability of your application. Securing containers starts with securing the container images they are built from.
|Pattern||Images in the popular container registries are not guaranteed to be free from vulnerabilities; hence, you should have a process for vulnerability scanning of your container images before deploying them to production.|
|Anti-Pattern(s)||No automated strategy to periodically scan the container images.|
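Periodic scanning needs a way to decide which images are due. The sketch below shows one possible selection rule: rescan any deployed image that has never been scanned or whose last scan is older than a maximum age. The image names, the `MAX_SCAN_AGE` value, and the record format are hypothetical, and the scan itself would be delegated to your scanner of choice.

```python
from datetime import datetime, timedelta, timezone

MAX_SCAN_AGE = timedelta(days=7)  # hypothetical rescan interval


def images_due_for_scan(
    deployed: list[str], last_scanned: dict[str, datetime], now: datetime
) -> list[str]:
    """Deployed images that were never scanned or whose scan is too old."""
    due = []
    for image in deployed:
        scanned_at = last_scanned.get(image)
        if scanned_at is None or now - scanned_at > MAX_SCAN_AGE:
            due.append(image)
    return due


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
last_scanned = {
    "api:1.4": now - timedelta(days=2),      # recent scan, still valid
    "worker:2.0": now - timedelta(days=10),  # stale scan
}
print(images_due_for_scan(["api:1.4", "worker:2.0", "cron:0.9"], last_scanned, now))
# ['worker:2.0', 'cron:0.9']
```

Rescanning matters even for unchanged images, since new vulnerabilities are disclosed against packages that were considered clean at build time.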
Developing services in the public cloud can trigger new security threats like malware and ransomware. You can leverage managed services provided by cloud providers or third-party vendors that use machine learning and artificial intelligence to identify security threats and vulnerabilities across your organization.
|Pattern||You need to continuously monitor your cloud resources, have unified visibility into security incidents, and develop a strategy to detect unauthorized activities.|
|Anti-Pattern(s)||No policies have been created to detect malicious activities like suspicious user actions, unsuccessful login attempts, network anomalies, and unusual activities that indicate credential compromise.|
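As a small example of such a detection policy, the sketch below flags a user once the number of failed logins within a sliding time window reaches a threshold. The `WINDOW_SECONDS` and `MAX_FAILURES` values and the class name are assumptions; managed threat-detection services apply far richer signals than this single rule.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # hypothetical look-back window
MAX_FAILURES = 5      # hypothetical alert threshold


class FailedLoginDetector:
    def __init__(self):
        # Per-user timestamps of recent failed logins, oldest first.
        self._failures = defaultdict(deque)

    def record_failure(self, user: str, timestamp: float) -> bool:
        """Record a failed login; return True when the user reaches the threshold."""
        window = self._failures[user]
        window.append(timestamp)
        # Drop failures that have aged out of the window.
        while window and timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) >= MAX_FAILURES


detector = FailedLoginDetector()
alerts = [detector.record_failure("alice", t) for t in (0, 10, 20, 30, 40)]
print(alerts)  # [False, False, False, False, True]
```

Five failures spread over a longer period would not trigger the alert, because the earliest ones fall out of the window before the threshold is reached.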
Cloud-native architectures leverage the principle of immutability to manage infrastructure resources. If you need to make any configuration changes, you don’t modify the server; instead, build a new server with the updated configuration. IaC ensures consistency between environments and enables better DevOps practices by deploying infrastructure code in an automated and repeatable manner.
|Pattern||Development and security teams can use IaC tools like Terraform, Chef, Puppet, and Ansible to create guardrails and policies, patch vulnerabilities, and fix configuration issues consistently across environments while keeping drift in check. With IaC, all your infrastructure changes are peer-reviewed and stored in source control for increased visibility.|
|Anti-Pattern(s)||Making infrastructure changes manually creates configuration drifts across environments.|
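Configuration drift is the difference between what the infrastructure code declares and what is actually running. The sketch below compares two nested dictionaries to surface that difference. The resource fields and values are hypothetical, and real IaC tools perform this comparison against provider APIs rather than plain dictionaries.

```python
def find_drift(desired: dict, actual: dict, path: str = "") -> list[str]:
    """List every setting where the running state differs from the declared state."""
    drift = []
    for key in sorted(set(desired) | set(actual)):
        here = f"{path}.{key}" if path else str(key)
        if key not in actual:
            drift.append(f"{here}: missing (expected {desired[key]!r})")
        elif key not in desired:
            drift.append(f"{here}: unmanaged value {actual[key]!r}")
        elif isinstance(desired[key], dict) and isinstance(actual[key], dict):
            # Recurse into nested settings and keep the dotted path.
            drift.extend(find_drift(desired[key], actual[key], here))
        elif desired[key] != actual[key]:
            drift.append(f"{here}: expected {desired[key]!r}, found {actual[key]!r}")
    return drift


# Hypothetical declared state versus a server that was changed by hand.
desired = {"instance_type": "small", "encrypted": True, "tags": {"env": "prod"}}
actual = {
    "instance_type": "small",
    "encrypted": True,
    "tags": {"env": "dev"},
    "public_ip": "203.0.113.7",
}
for item in find_drift(desired, actual):
    print(item)
# public_ip: unmanaged value '203.0.113.7'
# tags.env: expected 'prod', found 'dev'
```

With immutable infrastructure, the response to drift like this is to rebuild from the declared state rather than patch the running server.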
With enterprises growing their workloads rapidly and adopting multi-cluster/multi-cloud environments, it becomes crucial to have a centralized view of your systems. Furthermore, to have a sound observability strategy, you need to continuously profile your applications and collect a considerable volume of data around the clock.
|Pattern||Provide observability to the teams as a platform offering — and not something they have to build and maintain for individual services.|
|Anti-Pattern(s)||Lack of robust security tooling to make sense of the high volume of logs, metrics, and trace data produced by your applications.|
In the public cloud, security is a shared responsibility between the cloud service provider and its customers. The shared model helps to reduce the operational burden on customers, as the cloud provider protects the underlying infrastructure that hosts the service deployments.
At the same time, customers are responsible for securing the application code, data, identity and access, containers, and workloads running in the cloud that contain business logic. Once you have clarity on these shared responsibilities, development teams can focus on building business features and not worry about the day-to-day operational issues in the infrastructure layer.
To summarize, the cloud provider is responsible for the security “of” the cloud, whereas the customer is responsible for the security “in” the cloud.
Figure 4 below illustrates Microsoft’s shared responsibility model in the cloud and the various responsibilities between Microsoft and its customers.
Figure 4: Microsoft’s shared responsibility model
Image Source: “Shared responsibility in the cloud”
The OWASP Top 10 is an awareness document that ranks the most critical security risks to web applications. Together with companion OWASP guidance such as the Top 10 Proactive Controls, it helps developers improve their web applications’ security and enables teams to shift security earlier into the design and coding phases. This guidance encourages practices like integrating security into the CI/CD pipeline, parameterizing queries, validating all inputs, implementing error handling, improving logging strategy, leveraging the benefits of security frameworks, protecting data at rest and in transit with encryption, reducing sensitive data exposure, implementing secure access controls, and more.
Cloud-native architectures have seen rapid adoption in the recent past. However, there are numerous security challenges because of the complex and dynamic landscape. Users have faced multiple security risks like data breaches, data loss, denial of service, insecure APIs, account hijacking, unpatched vulnerabilities, and weak identity and access management. Enterprises need to continuously adopt security best practices to handle these issues, as explained in this article.
The core security concepts cannot be isolated and must be consistently integrated into the development lifecycle. Enterprises have been able to find ways to balance security and the speed of delivery by embracing automation, continuous delivery, and most importantly, building a DevOps culture.
Readers are also encouraged to study the CNCF Cloud Native Security Whitepaper, which focuses on key challenges in cloud-native application security and provides guidance to architects and developers.
Editor’s Note: This is one of my articles, originally published in DZone’s Refcard for Cloud-Native Application Security.