Best Practices For Data Security In The Cloud

What is data security?

Data security is the practice of safeguarding sensitive data and intellectual property from unauthorized access, tampering, and breaches across storage, transit, and processing. In modern environments, that work happens almost entirely in the cloud.

The challenge is that data no longer sits behind a single perimeter. It sprawls across providers, regions, and third-party services, often without centralized visibility.

In practice, data security comes down to four things:

Knowing where the data is
Controlling who can access it
Reducing the ways attackers can reach it
Proving what happened when something goes wrong

Mapping data stores, the identities that can access them, and the exposure conditions around them can help you make security decisions with full context.

Why data security matters

Data security is essential for reducing data breach costs, preventing data exposure, and maintaining compliance across multi-cloud environments. Without strong controls and visibility, organizations face increased risk of data exfiltration, costly downtime, regulatory penalties, and long-term revenue loss from customer churn.

Cloud data security benefit	Business impact
Reduced breach risk and blast radius	Limits access to sensitive data by tightening IAM policies and eliminating public exposure, lowering the likelihood and scope of data exfiltration.
Continuous regulatory compliance	Tracks where regulated data lives and who can access it, reducing the risk of fines, failed audits, and expensive remediation efforts.
Stronger business continuity	Protects against ransomware and data corruption with secure backups and access controls, minimizing downtime and speeding recovery.
Increased customer trust and retention	Prevents exposure of customer and proprietary data, protecting brand reputation and reducing revenue loss from churn.
Full visibility across cloud environments	Identifies shadow data, training datasets, and other hidden assets to reduce blind spots and limit unexpected attack paths.

Where data security breaks down

Data security often breaks down in predictable ways. In cloud environments, things change quickly, and access is controlled mostly by identity and APIs rather than network perimeters. You can be compliant on paper and still have a real path to sensitive data in production.

Common challenges include:

Data sprawl and visibility gaps: Data copies spread across buckets, snapshots, backups, and dev environments, and nobody owns the full map.
Misconfiguration risk: A single misstep—like a publicly accessible S3 bucket, overly permissive IAM policy, or disabled encryption setting—can expose entire datasets instantly.
Compliance across jurisdictions: Data often spans regions with conflicting regulatory requirements, such as GDPR data residency rules versus US-based processing needs.
Shared responsibility confusion: Cloud providers secure infrastructure, but customers are responsible for data, access, and configurations—a boundary frequently misunderstood.
Insider threats and overprivileged access: Human and machine identities accumulate broad access, often through inherited roles and shared policies.
AI and machine learning data exposure: AI pipelines introduce new data stores that are often less governed, potentially exposing sensitive data through model access or poorly secured pipelines.

The shared responsibility model

Cloud security operates under a shared responsibility model, where both the cloud provider and the customer are responsible for different layers of security. Understanding this split is critical to preventing cloud data breaches because it defines exactly where risk originates.

Layer	Responsibility	Examples
Cloud service provider (CSP)	Secures the underlying cloud infrastructure	Physical data centers, hardware, networking, host systems
Customer	Secures everything built and stored in the cloud	Data, IAM policies, encryption, applications, configurations

In practice, most cloud data breaches originate from customer misconfigurations, not provider failures. Mismanaged access, exposed storage, and weak encryption create direct paths to data exposure, making customer-side mistakes the primary driver of risk in cloud environments.

That responsibility also shifts depending on the cloud service model, which changes how much control and risk you own:

IaaS: Customers manage most security controls, including operating systems, applications, and data.
PaaS: Providers manage more of the infrastructure, but customers still control data and access.
SaaS: Providers handle most of the stack, but customers remain responsible for data security and user access.

13 key data security best practices

Most data breaches come from gaps in visibility, access control, and configuration. The following best practices close those gaps by progressing from foundational data hygiene through advanced threat detection, giving you a structured approach to protecting sensitive data wherever it lives.

1. Define and discover sensitive data

Before you can protect sensitive data, you need to find it. Modern environments scatter data across storage systems, databases, SaaS applications, development pipelines, and endpoints — often without a central inventory. The problem compounds in cloud environments, where new buckets and services spin up faster than any manual process can track.

Effective discovery requires two steps:

Establish what constitutes sensitive data for your organization: Definitions vary, so decide what counts as sensitive for your enterprise.
Use automated tools to discover data across all environments: Manual discovery leaves gaps that attackers exploit. Invest in DSPM tools to automate discovery and eliminate shadow data wherever it lives: on-premises, in the cloud, or in third-party services.

For example, a healthcare provider typically holds sensitive customer health and payment information. An insurance service provider might store medical histories, bank statements, and traffic offense histories. To determine whether a dataset is sensitive, consider what its disclosure to the public could do to your customers and enterprise.
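
To make the discovery step concrete, here is a minimal sketch of pattern-based scanning. It is not how a DSPM tool works internally. The pattern set, the `find_sensitive` and `build_inventory` names, and the sample store names are all assumptions made for illustration, and real definitions of "sensitive" depend on your organization.

```python
import re

# Hypothetical patterns for illustration; tune these to your own definition of sensitive data.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(number: str) -> bool:
    """Apply the Luhn checksum to cut down on false positives for card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


def find_sensitive(text: str) -> dict:
    """Return a count of matches per sensitive data type found in the text."""
    findings = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if name == "credit_card":
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            findings[name] = len(matches)
    return findings


def build_inventory(stores: dict) -> dict:
    """Map each store name to its findings, skipping stores with nothing sensitive."""
    inventory = {}
    for name, text in stores.items():
        findings = find_sensitive(text)
        if findings:
            inventory[name] = findings
    return inventory
```

Calling `build_inventory({"dev-export.csv": "jane@example.com", "readme.txt": "hello"})` would list only the first store. Regex matching is a starting point at best; it misses context that purpose-built discovery tools take into account.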

2. Classify and label data

Consistent access controls and compliance reporting depend on knowing what type of data you're protecting. Data classification assigns sensitivity levels to data assets, and labeling applies those classifications as metadata that downstream systems can enforce.

Implement automated data classification: The framework you use should clearly define data sensitivity levels using easy-to-understand categorization schemes. For example, "public," "internal," and "confidential" might be your way of classifying low, medium, and high-risk assets.
Use metadata and tagging for easy identification: Feed automated data classification tools with consistent labels. These include keyword tags and metadata such as timestamps (for example, "creation date"), access levels (for example, "admin access only"), and retention tags (for example, "retain for 3 years").
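
The classification and labeling steps above can be sketched as a small function that picks the highest sensitivity level among the data types detected in an asset and emits a metadata label. The `TYPE_TO_LEVEL` mapping, the default for unknown types, and the label field names are hypothetical choices for this example.

```python
from datetime import date

# Ordered from least to most sensitive, mirroring the example levels in the text.
LEVELS = ["public", "internal", "confidential"]

# Hypothetical mapping from detected data types to sensitivity levels.
TYPE_TO_LEVEL = {
    "marketing_copy": "public",
    "email": "internal",
    "us_ssn": "confidential",
    "health_record": "confidential",
}


def classify(detected_types) -> str:
    """Return the highest sensitivity level among the detected data types."""
    level = "public"
    for data_type in detected_types:
        # Assumption: unknown types default to "internal" rather than "public".
        candidate = TYPE_TO_LEVEL.get(data_type, "internal")
        if LEVELS.index(candidate) > LEVELS.index(level):
            level = candidate
    return level


def build_label(detected_types, created: date, retain_years: int) -> dict:
    """Build a metadata label that downstream systems could enforce."""
    level = classify(detected_types)
    return {
        "classification": level,
        "creation_date": created.isoformat(),
        "access_level": "admin access only" if level == "confidential" else "standard",
        "retention": f"retain for {retain_years} years",
    }
```

For instance, `build_label(["email", "us_ssn"], date(2024, 1, 15), 3)` would produce a label classified as "confidential" with a three-year retention tag.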

3. Encrypt data at rest and in transit

Encryption renders data unreadable to anyone without the correct keys, whether an external attacker or an insider with unauthorized access. This protection applies in two contexts:

At rest: Data stored in databases, object storage, or file systems. Use AES-256 or equivalent algorithms.
In transit: Data moving between services or to end users. Enforce TLS 1.2 or higher for all connections.

Key management is just as critical as encryption itself. Store cryptographic keys in hardware security modules (HSMs) or dedicated key management services rather than alongside the data they protect.
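
As one illustration of the in-transit requirement, the sketch below builds a client-side TLS context with Python's standard `ssl` module that refuses anything older than TLS 1.2. The `make_client_context` name is an assumption for this example. Encryption at rest is not shown, because AES is not part of the Python standard library and is usually handled by your storage service or a dedicated cryptography library.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client TLS context that rejects protocol versions below TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context like this can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_client_context())`.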

4. Implement strong access controls

Access controls fail in two ways. Granting too much access creates an attack surface, while granting too little pushes users to create workarounds. Tilting too far in either direction can create shadow access, where users bypass your controls entirely and often go unmonitored.

A zero-trust approach assumes no identity is inherently trusted. Every access request is verified based on identity, device posture, and context before granting the minimum permissions required for the task.

Key ways to enforce access controls include:

Enforcing the principle of least privilege (PoLP): Ensure users — and AI agents and machine identities — only have access to the sensitive data and systems they need to complete their tasks.
Using role-based access control (RBAC) and attribute-based access control (ABAC): RBAC limits permissions by job role, and ABAC adds fine-grained attributes. The gap these controls address is real: Wiz research found that 3% of service accounts with sensitive data are accessible by all users.
Implementing multi-factor authentication (MFA) for all access points: MFA prevents attackers from accessing sensitive data using stolen or compromised credentials.
Adopting an identity and access management (IAM) tool: Understand effective permissions and identify who can access what across your environment.
Adding cloud security posture management (CSPM) for cloud-hosted data: Cloud environments add a layer of identity complexity that on-prem IAM tools weren't built for. CSPM automatically identifies misconfigurations and excessive permissions specific to cloud resources.
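
To show how RBAC and ABAC can combine under a default-deny, least-privilege model, here is a minimal sketch. The role names, resources, and context attributes (`mfa_verified`, `device_compliant`) are hypothetical, and a production system would rely on your identity provider or cloud IAM rather than hand-rolled checks.

```python
# Hypothetical role definitions: each role maps to the (resource, action) pairs it may perform.
ROLE_PERMISSIONS = {
    "analyst": {("reports", "read")},
    "dba": {("customer_db", "read"), ("customer_db", "write")},
}


def is_allowed(user: dict, resource: str, action: str, context: dict) -> bool:
    """Default deny: grant access only if a role allows it and the context checks pass."""
    # RBAC step: does any of the user's roles grant this exact permission?
    granted = any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in user.get("roles", [])
    )
    if not granted:
        return False
    # ABAC step: verify request attributes, in the spirit of zero trust.
    if not context.get("mfa_verified", False):
        return False
    if not context.get("device_compliant", False):
        return False
    return True
```

With these definitions, a user holding the "dba" role on a compliant device with MFA can read `customer_db`, while the same request without MFA is denied.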

5. Monitor and audit data access

Visibility into who accesses sensitive data is essential for detecting breaches in progress. Continuous monitoring surfaces anomalies like bulk downloads or access from unusual locations and catches privilege escalation attempts before they succeed.

Monitoring strategies include:

SIEM integration: Route access logs to your SIEM platform for correlation with other signals.
Data loss prevention (DLP): Deploy DLP controls to detect and block unauthorized data movement — sensitive files uploaded to personal cloud storage, attached to outbound emails, or copied to removable media.
Detailed audit trails: Maintain logs of all data access and modifications to support incident investigation and compliance audits.
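
The anomalies mentioned above, bulk downloads and access from unusual locations, can be sketched as simple rules over access log events. The event fields (`identity`, `bytes`, `country`), the 500 MB threshold, and the `flag_suspicious` name are assumptions for illustration; in practice these rules would typically live in your SIEM.

```python
from collections import defaultdict


def flag_suspicious(events, max_bytes=500_000_000, usual_countries=None):
    """Return (identity, reason) alerts for bulk downloads and unusual locations."""
    usual_countries = usual_countries or {}
    totals = defaultdict(int)
    alerts = []
    for event in events:
        totals[event["identity"]] += event["bytes"]
        expected = usual_countries.get(event["identity"])
        # Only check location for identities with a known baseline.
        if expected and event["country"] not in expected:
            alerts.append((event["identity"], "unusual_location"))
    for identity, total in totals.items():
        if total > max_bytes:
            alerts.append((identity, "bulk_download"))
    return alerts
```

Fixed thresholds like this are easy to reason about but noisy; they work best as a first filter before correlation with other signals.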

6. Prioritize regular data backups and disaster recovery plans

Data loss can result from ransomware, accidental deletion, or infrastructure failure. Recovery depends on having both tested backups and a documented response plan.

3-2-1 backup strategy: Maintain three copies of critical data on two different storage types, with one copy stored off-site — whether that's a separate cloud region, an offline backup, or a geographically distant data center.
Incident response planning: Define roles, communication protocols, and escalation paths before a breach occurs. Tabletop exercises validate plan execution.
Regular testing: Periodically restore from backups to verify data integrity and measure recovery time objectives.
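
The 3-2-1 rule and restore testing described above can be expressed as two small checks. This is a sketch under assumed field names (`storage_type`, `offsite`); the `satisfies_3_2_1` and `restore_matches` helpers are hypothetical and would need to be wired to your actual backup catalog.

```python
import hashlib


def satisfies_3_2_1(copies) -> bool:
    """Check for at least three copies, two storage types, and one off-site copy."""
    storage_types = {c["storage_type"] for c in copies}
    has_offsite = any(c["offsite"] for c in copies)
    return len(copies) >= 3 and len(storage_types) >= 2 and has_offsite


def restore_matches(original: bytes, restored: bytes) -> bool:
    """Compare SHA-256 digests to confirm a restored backup is byte-identical."""
    return hashlib.sha256(original).digest() == hashlib.sha256(restored).digest()
```

Running `restore_matches` as part of a scheduled restore drill gives you evidence of integrity, not just evidence that a backup job completed.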

7. Ensure compliance with regulatory requirements

Regulatory requirements vary by industry, geography, and data type. For example, a healthcare organization in California that accepts card payments may need to satisfy HIPAA, CCPA, and PCI DSS simultaneously. Each framework has its own requirements in areas such as consent, retention, and breach notification.

Data residency adds another layer of complexity. Some regulations require certain data types to remain within specific geographic boundaries — and your architecture has to support that whether data sits in a cloud region, a private data center, or a third-party processor.

Understand the specific industry and regional regulations applicable to your business.
Create data access and usage policies that align with regulatory standards. Implement strict access controls for staff handling sensitive records.
Conduct regular compliance audits to ensure policies, configurations, and access controls in your environment meet regulatory requirements. For multi-cloud or hybrid environments, validate that cross-border transfers comply with applicable frameworks like GDPR's Standard Contractual Clauses.
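
Data residency rules lend themselves to an automated check. The sketch below compares where each data store lives against an allowed-region policy. The policy contents, the data type name, and the store fields are hypothetical examples, not legal guidance on what any regulation requires.

```python
# Hypothetical policy: data types mapped to the regions where they may be stored.
RESIDENCY_POLICY = {
    "eu_personal_data": {"eu-west-1", "eu-central-1"},
}


def residency_violations(stores) -> list:
    """Return the names of stores holding restricted data outside allowed regions."""
    violations = []
    for store in stores:
        allowed = RESIDENCY_POLICY.get(store["data_type"])
        # Data types without a policy entry are treated as unrestricted here.
        if allowed is not None and store["region"] not in allowed:
            violations.append(store["name"])
    return violations
```

A check like this is only as good as the inventory feeding it, which is why discovery and classification come first.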

8. Identify and remediate misconfigurations

Misconfigurations cause more data breaches than sophisticated attacks. A single publicly accessible storage bucket, an overly permissive IAM policy, or an unauthenticated database endpoint can expose sensitive data without any attacker action required.

Regularly scan for misconfigured storage and resources like public access controls, weak encryption settings, and exposed services.
Use CSPM tools in cloud environments to detect misconfigurations in real time and minimize human error. For on-premises and hybrid systems, configuration management databases (CMDBs) and infrastructure-as-code scanners play a similar role.
Establish baseline configurations and enforce them consistently across your IT stack—cloud, on-prem, and SaaS. Focus on areas like access control, encryption, firewall settings, and patching.
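
Enforcing a baseline comes down to comparing each resource's actual settings with the expected ones. Here is a minimal sketch of that drift check. The baseline keys and values are assumptions chosen for illustration and do not correspond to any specific cloud provider's configuration schema.

```python
# Hypothetical baseline; real baselines would follow your provider's configuration schema.
BASELINE = {
    "public_access": False,
    "encryption_enabled": True,
    "tls_min_version": "1.2",
}


def find_drift(resource_config: dict) -> dict:
    """Return each baseline setting that is missing or differs from the expected value."""
    drift = {}
    for key, expected in BASELINE.items():
        actual = resource_config.get(key)  # a missing setting counts as drift
        if actual != expected:
            drift[key] = {"expected": expected, "actual": actual}
    return drift
```

An empty result means the resource matches the baseline; anything else is a candidate for remediation.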

9. Address vulnerabilities promptly

Vulnerability management at scale requires automation and prioritization that traditional periodic scanning can't match, especially in cloud environments, where new resources spin up between scan cycles. Perform regular vulnerability assessments and penetration testing to identify and resolve risks before attackers can exploit them.

Keep all systems and applications up to date with patches.
Use agentless vulnerability scanners to automatically uncover vulnerabilities throughout your environment, leaving no blind spots.
Use intrusion detection and prevention systems (IDPS) to catch threat actors in real time before they can do damage.
Use risk-based prioritization over CVSS-only scoring to make informed decisions on what needs fixing first.
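
To illustrate why risk-based prioritization can rank findings differently from CVSS alone, the sketch below adjusts a CVSS base score with environmental context. The multipliers and field names are made-up assumptions for this example, not an industry-standard formula.

```python
def risk_score(finding: dict) -> float:
    """Scale the CVSS base score by context. The weights are illustrative only."""
    score = finding["cvss"]
    if finding["internet_exposed"]:
        score *= 1.5
    if finding["reaches_sensitive_data"]:
        score *= 1.5
    if finding["exploit_available"]:
        score *= 1.2
    return round(score, 2)


def prioritize(findings) -> list:
    """Sort findings so the highest contextual risk comes first."""
    return sorted(findings, key=risk_score, reverse=True)
```

Under these weights, an internet-exposed CVSS 7.5 flaw with a path to sensitive data outranks an isolated CVSS 9.8 flaw, which is the kind of reordering CVSS-only scoring cannot express.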

10. Secure data in development environments

Development environments often contain sensitive data and secrets, yet data security there is frequently an afterthought. According to Wiz's State of Code Security Report, 61% of organizations have secrets exposed in public repositories. Protecting data during software development is critical.

Scan code repositories for hard-coded secrets and sensitive data using code scanners and vulnerability management solutions.
Secure credentials like encryption keys, API keys, and tokens by storing them in secrets management solutions.
Scan CI pipelines for security risks like dependency vulnerabilities and pipeline poisoning issues using IaC security software.
Implement secure coding practices and code reviews to mitigate software vulnerabilities that could put data at risk before deployment.
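
As a rough illustration of secret scanning, the sketch below checks source text line by line against a few patterns. The pattern set is a small assumption-laden sample (an AWS access key ID prefix, PEM private key headers, and quoted assignments to credential-like names). Dedicated scanners cover far more formats and also check commit history.

```python
import re

# Illustrative patterns only; real scanners ship much larger, maintained rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_assignment": re.compile(
        r"\b(?:password|api_key|secret|token)\b\s*=\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_source(source: str) -> list:
    """Return (line_number, pattern_name) pairs for lines that look like secrets."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Note that a line such as `password = os.environ["DB_PASSWORD"]` is not flagged, because the value comes from the environment rather than a string literal. That is the pattern to aim for when moving credentials into a secrets manager.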

11. Gain full context around data risks

Context turns thousands of alerts into a prioritized action list. To detect attack paths, correlate data risks with the conditions that make them reachable — network exposures, insecure APIs, misconfigurations, identity sprawl, and lateral movement opportunities.

Integrate data security insights with broader security operations using a DSPM. In cloud environments, this works best as part of a cloud-native application protection platform (CNAPP) that ties data, identity, and runtime risk together.
Use graph-based context risk scoring to prioritize remediation efforts, ensuring the most critical risks are resolved first.
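
Graph-based context can be pictured as a reachability question: is there a path from an exposed entry point to a sensitive data store? The sketch below answers it with a breadth-first search over a toy graph. The node names and edges are invented for illustration, and this is not a description of how any particular product models its graph.

```python
from collections import deque


def find_attack_path(graph: dict, start: str, target: str):
    """Return the shortest path from start to target as a list of nodes, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical environment: edges mean "can reach" or "can assume".
EXAMPLE_GRAPH = {
    "internet": ["public_vm"],
    "public_vm": ["overprivileged_role"],
    "overprivileged_role": ["customer_db"],
    "internal_vm": ["customer_db"],
}
```

Here `find_attack_path(EXAMPLE_GRAPH, "internet", "customer_db")` walks through the public VM and the overprivileged role. Removing either edge breaks the path, which is why fixing one link can matter more than fixing many isolated findings.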

12. Detect unusual behavior and potential threats

Continuously monitor your environment for suspicious activities, regardless of where the data lives.

Employ user and entity behavior analytics (UEBA) to spot deviations that traditional monitoring misses, such as legitimate credentials used in anomalous patterns.
Set up anomaly detection for data access patterns so that deviations are flagged and alerted on immediately.
Respond swiftly to indicators of compromise (IoCs) to eliminate exploitable risks and safeguard sensitive data.
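
One simple way to picture behavior analytics is a per-identity baseline with a deviation test. The sketch below uses a z-score over an identity's historical daily record reads. The threshold of 3.0 and the `is_anomalous` name are assumptions, and real UEBA products use considerably richer models.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0) -> bool:
    """Flag an observation that deviates strongly from an identity's own baseline."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu  # any change from a perfectly flat baseline stands out
    return abs(observed - mu) / sigma > threshold
```

For an identity that usually reads around 100 records a day, a day with 5,000 reads would be flagged even though the credentials used are legitimate.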

13. Protect AI and machine learning data

Sensitive data is often used to train AI models to improve their output. But if it's not properly managed, the process can expose both the training data and the model, creating an entirely new risk category. To avoid this:

Identify where sensitive AI training data lives, gain visibility into what it contains, and remediate the associated risks.
Remove attack paths to training data to prevent exposure through inference attacks or corruption via data poisoning.
Implement output filtering, data anonymization, and differential privacy to prevent models from revealing sensitive training data. These techniques reduce the risk of leakage, though some, such as differential privacy, can trade away a degree of model accuracy.
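
To give a feel for differential privacy, here is a minimal sketch of the Laplace mechanism applied to a count query. It assumes a sensitivity of 1 (one record changes the count by at most 1), and the `private_count` name and epsilon values are illustrative. Training a model with differential privacy is a more involved process than this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count: int, epsilon: float, rng=None) -> float:
    """Return a noisy count; smaller epsilon means more noise and stronger privacy."""
    rng = rng or random.Random()
    sensitivity = 1  # assumption: adding or removing one record changes the count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

The noise averages out over many queries, which is why each released answer consumes part of a privacy budget in practice.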

Implementing data security best practices with Wiz DSPM

Figure 1: Wiz correlates and prioritizes data security risks

Protecting your data means closing the gaps between where data lives, who can access it, and how it can be reached. That’s why Wiz approaches data security as a connected problem, not a collection of point solutions, by correlating data locations with context. It also surfaces misconfigurations and attack paths that lead to sensitive data.

This correlation surfaces the risks that actually matter rather than generating thousands of undifferentiated alerts.

DDR capabilities: Wiz DSPM scans data with context to deliver detection and response capabilities that keep you ahead of threat actors.
Data access governance: Wiz DSPM identifies effective permissions and spots over-privileged entities. It also detects and removes identity risks like excessive permissions and weak authentication.
Automated compliance assessment: Wiz DSPM assesses your compliance with hundreds of frameworks. View your compliance posture at a glance using the Wiz heatmap and scores.
Multi-cloud support: Wiz integrates easily with various cloud providers and unifies multi-cloud data security into a single dashboard for easier management.

Figure 2: Wiz compliance heatmap

For organizations deploying AI workloads, Wiz AI Security extends this visibility to training data, model pipelines, and inference endpoints. The same graph-based approach that identifies attack paths to databases also identifies attack paths to AI training data.

The best time to start implementing data security best practices is now. Start with a free Wiz demo.
