Retail tech company triples threat remediation speed with zero downtime and 680% ROI

Holiday Season Breach Attempt. No Impact. No Downtime.

3x faster threat remediation

Zero downtime

680% return on investment

“We cannot afford a single misstep, especially during our peak season. When we lost visibility, it wasn’t just about downtime. It meant putting customer trust on the line. That's where Sysdig made the difference.”

SVP Technology, E-commerce Leader

Company Overview

A leading platform for return management, this company serves top-tier online retailers and is scaling rapidly. The first quarter is its most critical period, with intense transaction volumes following the holiday season. Customers expect speed and reliability; merchants demand ironclad service-level agreements for uptime, performance, and data integrity, especially during the peak post-holiday return season.

The stakes are high. Even seconds of downtime during peak season can damage customer trust and disrupt revenue.

The company's Kubernetes-based architecture enables agility but also amplifies risk when visibility gaps occur. For the senior vice president (SVP) of technology, visibility and control are top priorities during high-stakes periods. That became urgent when a serious security incident unfolded at the worst possible time.

Business Challenges

Visibility gaps from misconfigurations and offline agents exposed critical workloads

Operational shortcuts increased risk during peak revenue periods

Traditional tools failed to detect advanced in-memory and kernel-level threats

High pressure to protect customer trust with no tolerance for downtime

Retail Tech Company

A stealth cloud attack became a catalyst for faster containment, automated defenses, and measurable business impact.

Industry: E-commerce

Infrastructure: Amazon Web Services

Solution: Sysdig Secure

Every Second Counts

According to the 2023 Global Cloud Threat Report, cloud attackers are quick and opportunistic, spending only 10 minutes staging an attack. That compares with 16 days in on-premises environments.

The Gap: Misconfigurations and Missing Agents

A Perfect Storm of Vulnerability

It began with what seemed like a routine change. During a maintenance window, a Kubernetes service was taken offline and the Sysdig agent responsible for runtime security was disabled. Perimeter protections were intact, but with the environment now blind to runtime activity, attackers found their window.

Unbeknownst to the team, malicious reconnaissance tools were continuously sweeping the internet for just this kind of oversight. The exposed workload, running PHP-FPM, was a known target for remote code execution. In today’s high-speed threat landscape, even minor misconfigurations can become a siren call for opportunistic adversaries scanning billions of endpoints for vulnerable openings.

Initial cryptomining attempts silently failed, likely blocked by default container constraints such as restricted write access or limited privileges. But those failures didn’t deter the attackers. They escalated, impersonating trusted agents and executing lateral movement. Ultimately, they deployed Perfctl, a stealthy rootkit designed to siphon computing resources for cryptomining while evading detection across scanners, logs, and traditional monitoring tools.

The Incident: Advanced Lateral Movement and Perfctl

A Midnight Alert from Sysdig Threat Research

While the company’s engineers had missed the initial after-hours alert, Sysdig’s Threat Research team did not. Around midnight, they raised a high-severity alert: attackers were masquerading as Datadog agents.

The threat escalated quickly. The attackers exploited the impersonation to pivot into a private internal namespace that should never have been exposed. Containers in that namespace were running with root privileges, and no runtime policies were in place to detect or block the intrusion. Without visibility or enforcement controls, the team initially couldn’t assess the blast radius or contain the spread.

The only viable path forward was a full rebuild. Within approximately 20 minutes, the team wiped and redeployed every pod and container, quickly removing the attackers' immediate foothold.

Post-Restoration Discovery – A Stealth Rootkit

Once Sysdig agents were restored, the full scope of the attack became clear. Telemetry revealed that the attacker had returned with Perfctl, a cryptomining rootkit engineered to hide its presence. Build-time scanners and cloud security posture management tools never saw it. Network-based intrusion detection and intrusion prevention systems missed it. And standard host and application logs – especially with containers running as root in shared namespaces – offered no insight into in-memory or kernel-level exploits. Only real-time, in-container telemetry could expose and stop this threat.

As they traced the attacker’s movements, the SVP of technology was left confronting difficult questions: Was any customer data accessed? Were multiple services compromised? Could they contain the threat without taking the platform offline?

These were high-stakes questions during the most critical revenue period of the year, when even a small misstep could carry major consequences.

The Response: Real-Time Forensics and Rule-Based Defense

Runtime Signals – Visibility in Minutes

Within seconds of the agents being restored, alerts began streaming in that were precise, context-rich, and immediately actionable. The rules surfaced Perfctl’s behavior patterns, including specific process spawns, system calls, and lateral movement attempts. From there, the investigation unfolded with surgical precision:

System call telemetry revealed the attacker’s techniques, ranging from pivot attempts to binary execution.
Drift analysis identified containers that no longer matched their original baseline images.
Forensic snapshots helped the team retrace the attacker’s path in detail.
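
To make the drift analysis step concrete, here is a minimal Python sketch of the general idea. It assumes drift means executables that appear in a running container but not in its baseline image, or whose digest no longer matches. The function names, data shapes, and sample paths are hypothetical illustrations; this is not Sysdig's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    added: frozenset     # executables present at runtime but absent from the image
    modified: frozenset  # executables whose digest differs from the image


def detect_drift(baseline: dict[str, str], runtime: dict[str, str]) -> DriftReport:
    """Compare path -> digest maps for the baseline image and the running container."""
    added = frozenset(path for path in runtime if path not in baseline)
    modified = frozenset(
        path
        for path, digest in runtime.items()
        if path in baseline and baseline[path] != digest
    )
    return DriftReport(added=added, modified=modified)


def has_drifted(report: DriftReport) -> bool:
    """A container has drifted if anything was added or modified."""
    return bool(report.added or report.modified)


# Hypothetical sample data, not taken from the incident.
baseline = {"/usr/sbin/php-fpm": "sha256:aaa", "/bin/sh": "sha256:bbb"}
runtime = {
    "/usr/sbin/php-fpm": "sha256:aaa",
    "/bin/sh": "sha256:ccc",
    "/tmp/.hidden/miner": "sha256:ddd",
}
report = detect_drift(baseline, runtime)
```

In this sketch, a container whose report is non-empty would be a candidate for the wipe-and-redeploy response described in this section.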

With guidance from Sysdig’s incident response experts, the engineering team acted swiftly, wiping and redeploying pods and containers to halt the attack.

The Outcome: A Win for Secure Velocity

Operational Control. Strategic Maturity. No Customer Impact.

Despite contending with a stealthy, evasive adversary during the most critical revenue window of the year, the company emerged unscathed. There was no downtime. No impact to customers. No evidence of data exfiltration.

But the real victory wasn’t just operational; it was strategic. The incident became a vivid proof point that a small, fast-moving engineering team could detect, contain, and recover from a complex cloud-native attack without sacrificing agility or business continuity.

More importantly, it sparked lasting change.

In the days that followed, the SVP of technology and his team translated the lessons of the incident into these systemic improvements:

Full agent coverage was restored across all critical workloads, eliminating the blind spots that had enabled the intrusion.
Policy‑as‑code guardrails were embedded into continuous integration/continuous delivery pipelines, ensuring that Kubernetes misconfigurations and web application firewall gaps were caught before reaching production.
Runtime protection was reinforced by tuning Falco rules to detect Perfctl‑style behaviors early.
Drift and malware response policies were activated with automated “kill container” actions to prevent future threats.
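
As an illustration of what a policy-as-code guardrail can look like, the Python sketch below flags containers in a Kubernetes pod spec that may run as root, the condition that widened the blast radius in this incident. It assumes the manifest has already been parsed into a dictionary. The field names (`securityContext`, `runAsNonRoot`, `runAsUser`, `privileged`) are standard Kubernetes fields, but the function, the rule set, and the sample spec are hypothetical and are not the company's actual pipeline checks.

```python
def find_root_violations(pod_spec: dict) -> list[str]:
    """Return readable violations for containers that may run as root."""
    violations = []
    pod_ctx = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        # Container-level settings take precedence over pod-level ones.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        run_as_user = ctx.get("runAsUser", pod_ctx.get("runAsUser"))
        if ctx.get("privileged", False):
            violations.append(f"{name}: privileged container")
        if run_as_user == 0:
            violations.append(f"{name}: runAsUser is 0 (root)")
        elif not run_as_non_root and run_as_user is None:
            violations.append(f"{name}: runAsNonRoot not enforced")
    return violations


# Hypothetical pod spec used only to exercise the check.
spec = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [
        {"name": "api"},
        {"name": "worker", "securityContext": {"runAsUser": 0}},
        {"name": "debug", "securityContext": {"privileged": True, "runAsNonRoot": False}},
    ],
}
violations = find_root_violations(spec)
```

A pipeline step built on a check like this could fail the build whenever the returned list is non-empty, so a root-capable workload never reaches production.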

“This incident could have undermined customer trust and our peak-season performance,” said the SVP of technology. “Instead, with Sysdig, we contained the threat and improved our security posture without missing a beat.”

This wasn’t just a lucky escape. It was cloud security executed the right way.

