Data security findings: A technical deep dive

Greg Wiseman

Published: July 31, 2025

We recently shared a high-level look at our new Sysdig Secure data security findings feature, highlighting how it helps security leaders reveal data exposure and contextualize threats. That post covered the “what” and “why” of the feature. In this post, we’ll take a closer look at the “how.”

Building deep data security without adding overhead

Delivering meaningful data security insights often comes at the cost of additional tools, agents, or operational complexity. We wanted to avoid that trap. Rather than requiring customers to deploy a separate, standalone scanning product that introduces new agents or moves sensitive data out of their cloud environment for analysis, we've extended the Sysdig Secure platform with sensitive data discovery, powered by an integration with Bedrock Security. This approach treats data as a first-class security signal, alongside vulnerabilities, misconfigurations, and runtime events, creating a unified model of risk that practitioners can use to prioritize and remediate with confidence.

The technical challenge: From data to findings

In our previous post announcing data security findings, we touched on the challenge of data sprawl. For engineers, this is more than a buzzword: It’s a persistent technical challenge. Cloud storage is often unstructured, constantly changing, and distributed across multiple services. Traditional scanning solutions can be slow and resource-intensive, and require exfiltrating data to a separate system for analysis. That not only increases cost but also introduces additional security and compliance risks.

To solve this, we’re using a scanning mechanism that operates in place, directly within the customer’s cloud environment. By partnering with Bedrock Security, we leverage AI-powered classification and detection to analyze data where it lives — for example, in S3 buckets — without moving sensitive content outside of the account. Only classification results and metadata are returned to Sysdig, where they are ingested as structured findings and correlated with other elements of your cloud security posture.

This serverless architecture, powered by Adaptive Sampling, efficiently scales to analyze massive and constantly changing datasets. It’s cost-effective, accurate, and avoids the pitfalls of lift-and-shift data analysis. Most importantly, it keeps sensitive data in place while enriching the Sysdig risk model with data-driven intelligence.

How it works

Understanding the architecture behind new capabilities is key for practitioners. We've designed Sysdig data security findings to operate with minimal friction and maximum security by keeping sensitive data in place while enhancing visibility and risk context across your cloud environment. The diagram below illustrates the high-level flow, from data discovery within your cloud environment to actionable insights in the Sysdig Secure platform.

As the diagram shows, the process begins directly within your cloud, leveraging serverless technology to efficiently scan and classify sensitive data without requiring new agents. The in-environment outpost analyzer generates classification results and metadata, and only these are transmitted to the Bedrock SaaS Platform for further processing before being securely sent as structured findings to the Sysdig Secure platform. This "metadata only" approach is a critical design choice, ensuring that the sensitive data itself never leaves your environment. Once integrated into Sysdig Secure, these findings immediately enrich your existing cloud security posture, enabling powerful new use cases that we'll explore next.
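To make the "metadata only" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the actual Bedrock or Sysdig implementation: the DataFinding fields, the classify_in_place function, and the toy 16-digit pattern are all hypothetical. The point it shows is that the record leaving the environment carries labels and counts, never the content that was scanned.

```python
import json
import re
from dataclasses import dataclass, asdict

# Hypothetical shape of a metadata-only finding. Field names are
# illustrative assumptions, not the real Bedrock or Sysdig schema.
@dataclass(frozen=True)
class DataFinding:
    resource_id: str       # e.g. an S3 bucket ARN
    data_class: str        # classification label
    matching_objects: int  # objects in which the classifier found a match
    scanned_objects: int   # objects inspected in total

# Toy stand-in for a real classifier: any run of exactly 16 digits.
CARD_LIKE = re.compile(r"\b\d{16}\b")

def classify_in_place(resource_id: str, objects: dict[str, str]) -> DataFinding:
    """Scan object bodies locally and return only labels and counts."""
    matching = sum(1 for body in objects.values() if CARD_LIKE.search(body))
    return DataFinding(resource_id, "Credit Card PAN", matching, len(objects))

# Example data that would stay inside the customer's account.
objects = {"a.csv": "card=4111111111111111", "b.txt": "hello"}
finding = classify_in_place("arn:aws:s3:::example-bucket", objects)

# Only this serialized metadata would be transmitted onward.
payload = json.dumps(asdict(finding))
```

In this sketch, payload contains the bucket identifier, the data class, and two counts. The card-like string from a.csv does not appear in it.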

Integrating data into the graph

The real value of data security findings lies in how they connect to Sysdig’s broader context. A single data finding may indicate a risk, but combining it with other signals, like a misconfiguration or anomalous runtime activity, turns it into an actionable, prioritized security event. This directly addresses the problem of security teams lacking context to effectively prioritize risk or enforce policies for business-critical data.

Enriching the Risk Engine

Attack paths in Sysdig Secure map the routes an attacker could take across assets, configurations, and identities. With data security findings, sensitive data is now a critical node in this graph. This provides crucial information to help apply the right security controls and prioritize fixes that reduce the most risk. For example, if a publicly accessible S3 bucket contains PII, the graph can now highlight a direct path from a misconfiguration to sensitive data exposure.
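The idea of sensitive data as a graph node can be sketched with a small, generic example. This is not how Sysdig's risk engine is implemented; the node names, the edges, and the find_path helper are hypothetical, and a plain breadth-first search stands in for whatever traversal the product uses.

```python
from collections import deque

# Hypothetical graph: each key can reach the nodes in its list.
edges = {
    "internet": ["s3:public-bucket"],
    "s3:public-bucket": ["data:PII"],
    "ec2:web": ["iam:role-app"],
    "iam:role-app": ["s3:private-bucket"],
    "s3:private-bucket": [],
    "data:PII": [],
}

def find_path(graph, start, goal):
    """Breadth-first search; returns a shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# A path exists from the internet to PII through the public bucket.
exposed_path = find_path(edges, "internet", "data:PII")

# No path exists from the web instance, whose role only reaches a
# bucket with no sensitive data attached in this toy graph.
internal_path = find_path(edges, "ec2:web", "data:PII")
```

Without the data:PII node, the public bucket is just one misconfiguration among many. With it, the path from exposure to sensitive data becomes explicit and can be ranked accordingly.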

Actionable SysQL Queries

For teams that prefer automation or need custom alerts, all data findings are fully queryable via SysQL. This enables rapid identification of high-risk resources without the need to manually correlate findings across tools.

By querying for sensitive data in exposed or misconfigured environments, security teams can quickly elevate the threats that matter most, accelerating triage and policy enforcement. This aligns with Data Detection and Response (DDR) principles and allows risks in Sysdig Secure to be precisely tailored to your business priorities.

For example, to find exposed cloud resources containing credit card data in a specific zone (here, one named pci-dev), you could run:

MATCH CloudResource IN Zone WHERE CloudResource.isExposed = true AND Zone.name IN ['pci-dev']
MATCH CloudResource CONTAINS SensitiveData WHERE SensitiveData.dataClass IN ['Credit Card PAN']
RETURN DISTINCT CloudResource, Zone, SensitiveData LIMIT 50;

Or, if you’re concerned about AI-related workloads with access to sensitive data (for example, because of training concerns or potential leaks via retrieval-augmented generation), you could try:

MATCH EC2Instance PACKAGE_INSTALLED_ON Package OVER RuntimeMetadata WHERE Package.isAI = true
MATCH EC2Instance ASSOCIATED_WITH IAMInstanceProfile CONTAINS IAMRole HAS_ACCESS S3Bucket CONTAINS AS CONTAINS2 SensitiveData
RETURN DISTINCT EC2Instance, Package, IAMInstanceProfile, IAMRole, S3Bucket, SensitiveData LIMIT 50;
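For teams automating checks like these, the first query above can be parameterized by zone and data class. The Python sketch below only builds the query text. The helper names sysql_list and exposed_data_query are hypothetical, and how the resulting string is submitted to Sysdig Secure is deliberately left out, since that interface is not described here. The clauses are copied from the example query, not derived from a SysQL grammar.

```python
def sysql_list(values):
    """Render strings as a list literal in the style of the example query.

    Rejects values containing a single quote rather than guessing at
    escaping rules, which this sketch does not assume.
    """
    for v in values:
        if "'" in v:
            raise ValueError(f"unsupported quote in value: {v!r}")
    return "[" + ", ".join(f"'{v}'" for v in values) + "]"

def exposed_data_query(zones, data_classes, limit=50):
    """Build the exposed-resource query for the given zones and data classes."""
    return (
        "MATCH CloudResource IN Zone "
        f"WHERE CloudResource.isExposed = true AND Zone.name IN {sysql_list(zones)} "
        "MATCH CloudResource CONTAINS SensitiveData "
        f"WHERE SensitiveData.dataClass IN {sysql_list(data_classes)} "
        f"RETURN DISTINCT CloudResource, Zone, SensitiveData LIMIT {int(limit)};"
    )

# Reproduces the example query for the pci-dev zone.
query = exposed_data_query(["pci-dev"], ["Credit Card PAN"])
```

Refusing quoted values instead of escaping them is a conservative choice: it keeps untrusted input from altering the query without assuming anything about how SysQL handles escapes.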

A Unified User Experience

Data security findings integrate directly into the Sysdig Secure UI. Resource detail views now include a dedicated data findings section, alongside existing context like vulnerabilities, misconfigurations, and other findings. Teams can see everything they need in one place without breaking out of workflows, leading to faster, more confident decisions during incident response.

Making CNAPP more context-aware

Data security findings don’t just add another signal; they strengthen the overall intelligence of our CNAPP. By connecting the where (cloud resources), the how (runtime activity and vulnerabilities), and now the what (sensitive data), Sysdig Secure provides a more accurate view of business risk and helps satisfy several key compliance requirements.

Our goal is to reduce noise, highlight the issues that matter most, and help teams protect sensitive data with better speed and confidence. This continuous feedback loop directly supports reducing Mean Time to Remediate (MTTR) for risks with sensitive data findings. We encourage you to explore this new capability and connect with your account team to enable data discovery in your environment.

Want to see Sysdig Secure in action? Request a demo today!

