Techniques | 25-Feb-26 | 11 min read

Steganographic Payload Extraction Methods

Steganographic Extraction Techniques Used by APT Groups and Commodity RATs

Definition:

Steganographic payload extraction is the process by which malware retrieves hidden data from carrier files, typically images. Each extraction technique corresponds to a specific embedding method, from reading bytes after an end-of-file marker to reconstructing payloads from individual pixel colour values. Understanding the extraction code that malware executes reveals the embedding method used, the post-extraction processing required, and the detection opportunities available to defenders.

"Steganography" is often discussed as though it were a single technique. It is not. It is a family of distinct methods, each with its own extraction logic, capacity constraints, and detection characteristics. The code a malware sample runs to recover its payload from an image varies enormously between campaigns. An XWorm variant scanning for a Base64 flag marker after a JPEG's FFD9 byte has almost nothing in common with OceanLotus reconstructing AES-encrypted payload bits from the least significant bits of PNG pixel values. The embedding differs. The extraction code differs. The detection approach differs. And critically, what the defender recovers after successful extraction differs.

This article examines five distinct extraction methods from the attacker's perspective: what the malware code actually does to get the payload out. A companion article covers detection techniques from the defender's perspective, including statistical analysis, entropy measurement, and practical workflow. Here, the focus is on the extraction logic itself and what it tells defenders about the threat they are facing.

Five extraction methods

Steganographic extraction techniques cluster into five categories based on where the payload is stored and how the malware locates and reassembles it. Each category has distinct code patterns, capacity limits, and forensic signatures. The order below roughly follows increasing implementation complexity, though not necessarily increasing detection difficulty.

Appended data extraction

The simplest extraction method. The attacker appends payload data after the image's end-of-file marker: FFD9 for JPEG, IEND for PNG, 0x3B for GIF. The image renders normally in any viewer because parsers stop reading at the termination marker. The malware reads the file as raw bytes, seeks past the marker, and extracts everything that follows.

The XWorm StegoCampaign (2025) used this approach with an explicit flag marker. The extraction routine scanned for the string <<BASE64_START>> appended after the image data, then extracted everything following it. The payload was a Base64-encoded MZ executable (identifiable by the characteristic TVq prefix in its Base64 form). The entire extraction logic was fewer than ten lines of code: open the file, find the marker, decode Base64, write to disk, execute.

Variants of this technique use different signalling mechanisms. Some calculate a fixed byte offset from the end of the file. Others use custom delimiter sequences. The GhostPoster campaign used triple equals signs or four greater-than symbols as delimiters, varying across extension variants. Regardless of the signalling method, the underlying principle is identical: payload bytes sit outside the image format boundary, invisible to parsers but trivially accessible to code.
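The marker-based routine described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the campaign's actual code: the function name `extract_after_marker` and the synthetic carrier are hypothetical, and only the `<<BASE64_START>>` flag and the Base64 encoding come from the description above. The sketch stops at decoding and deliberately omits the write-to-disk and execute steps.

```python
import base64
from typing import Optional

MARKER = b"<<BASE64_START>>"  # start flag described for the XWorm campaign

def extract_after_marker(blob: bytes, marker: bytes = MARKER) -> Optional[bytes]:
    """Return the Base64-decoded bytes following `marker`, or None if it is absent."""
    idx = blob.find(marker)
    if idx == -1:
        return None
    encoded = blob[idx + len(marker):].strip()
    return base64.b64decode(encoded)

# Synthetic carrier: JPEG start/end markers around filler, then a harmless fake payload.
fake_jpeg = b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9"
fake_payload = b"MZ\x90\x00" + b"harmless demo bytes"
carrier = fake_jpeg + MARKER + base64.b64encode(fake_payload)

recovered = extract_after_marker(carrier)
print(recovered == fake_payload)
print(base64.b64encode(fake_payload)[:3])  # Base64 form of the leading MZ bytes
```

Because the routine searches the raw bytes rather than parsing the image, it works identically on JPEG, PNG, or GIF carriers, which is part of why the technique is so cheap to implement.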

Detection difficulty: Low. The file is larger than its image dimensions require. Entropy analysis reveals a sharp transition at the end-of-file marker: structured image data on one side, high-entropy payload on the other. Format-aware scanners that compare actual file size against expected size catch this reliably. The technique persists because many security tools still treat image files as opaque blobs and never perform this comparison.
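A format-aware check for trailing data can be sketched for PNG by walking the chunk list and reporting whatever follows the IEND chunk. The helper names `png_trailing_bytes` and `make_chunk` are hypothetical; `make_chunk` exists only to build a tiny synthetic PNG for the demonstration. The sketch assumes a structurally well-formed file and does not verify chunk CRCs.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_trailing_bytes(blob: bytes) -> bytes:
    """Walk the PNG chunk list and return whatever follows the IEND chunk."""
    if not blob.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(blob):
        length, chunk_type = struct.unpack(">I4s", blob[pos:pos + 8])
        pos += 8 + length + 4  # 8-byte chunk header, data, 4-byte CRC
        if chunk_type == b"IEND":
            return blob[pos:]
    raise ValueError("IEND chunk not found")

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

# Synthetic 1x1 greyscale PNG used only for the demonstration.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
clean_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND", b"")
)
tampered_png = clean_png + b"APPENDED-PAYLOAD"

print(len(png_trailing_bytes(clean_png)), len(png_trailing_bytes(tampered_png)))
```

Walking the chunk structure is more reliable than searching for the last occurrence of an end marker, since an appended payload may itself contain bytes that look like one.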

Least Significant Bit pixel encoding

The classic steganographic method. The payload is encoded into the lowest bits of pixel colour values. Each pixel in a 24-bit image has three colour channels (red, green, blue), each represented by eight bits. Replacing the least significant bit of each channel encodes three bits of payload data per pixel while producing colour changes imperceptible to the human eye. At one bit per channel, a standard 1920x1080 image carries roughly 0.75 megabytes of hidden data; at two bits per channel, approximately 1.5 megabytes.

OceanLotus (APT32) deployed a bespoke LSB implementation first observed in the wild in September 2018 and published by BlackBerry Cylance in April 2019. Their extraction routine read pixel values from specific PNG images, reassembled the least significant bits in scan order, then passed the result through AES-128 decryption followed by XOR deobfuscation. The dual encryption layers were not afterthoughts. They were integral to the design, ensuring that even successful LSB extraction produced only ciphertext without the correct keys.

Worok (2022), documented by ESET, used LSB encoding in PNG files to deliver PowerShell payloads. Witchetty (2022), reported by Symantec, embedded an XOR-encrypted backdoor in a Windows logo bitmap image hosted on GitHub. In each case, the extraction code followed the same fundamental pattern: iterate through pixels, extract low bits, reassemble into a byte stream, then apply whatever decryption the campaign required.
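The iterate, extract, reassemble pattern can be sketched without any image library, assuming the pixel data has already been decoded into a flat sequence of 8-bit channel values in scan order. The function names are hypothetical, and the most-significant-bit-first packing is an assumption: real implementations choose their own bit order, channel order, and payload length signalling. `embed_lsb` is included only to build a synthetic carrier for the round trip.

```python
def lsb_capacity_bytes(width: int, height: int, channels: int = 3, bits: int = 1) -> int:
    """Payload capacity in bytes when `bits` low bits of every channel are used."""
    return width * height * channels * bits // 8

def embed_lsb(channel_values: bytes, payload: bytes, bits: int = 1) -> bytes:
    """Demo helper: write payload bits (MSB first) into the low bits of each value."""
    if bits not in (1, 2, 4):
        raise ValueError("this sketch supports 1, 2 or 4 bits per channel")
    mask = (1 << bits) - 1
    groups = [
        (byte >> shift) & mask
        for byte in payload
        for shift in range(8 - bits, -1, -bits)
    ]
    if len(groups) > len(channel_values):
        raise ValueError("payload too large for carrier")
    out = bytearray(channel_values)
    for i, group in enumerate(groups):
        out[i] = (out[i] & ~mask & 0xFF) | group
    return bytes(out)

def extract_lsb(channel_values: bytes, n_bytes: int, bits: int = 1) -> bytes:
    """Collect the low `bits` of each value in order and repack them into bytes."""
    if bits not in (1, 2, 4):
        raise ValueError("this sketch supports 1, 2 or 4 bits per channel")
    mask = (1 << bits) - 1
    out = bytearray()
    acc = 0
    acc_bits = 0
    for value in channel_values:
        acc = (acc << bits) | (value & mask)
        acc_bits += bits
        if acc_bits == 8:
            out.append(acc)
            acc = 0
            acc_bits = 0
            if len(out) == n_bytes:
                break
    return bytes(out)

cover = bytes(range(256)) * 2           # stand-in for decoded channel values
stego = embed_lsb(cover, b"stego", bits=1)
print(extract_lsb(stego, 5, bits=1))
print(lsb_capacity_bytes(1920, 1080, bits=2))
```

What comes out of `extract_lsb` in a real campaign is usually ciphertext, so this step is followed by whatever decryption the sample applies.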

BlackBerry Cylance's 2019 research described the OceanLotus implementation as bespoke and therefore not easily prone to detection by standard analysis tools. By layering AES-128 encryption and XOR obfuscation on top of LSB encoding, the group ensured that statistical tests targeting raw LSB patterns would find only the uniform distribution characteristic of encrypted data, not the structured patterns those tests are designed to detect.

Detection difficulty: High when encrypted. Statistical tests such as chi-square analysis and RS analysis are effective at detecting unencrypted LSB embedding because the encoding process creates measurable distortions in pixel value pair distributions. However, when the payload is encrypted before embedding, as OceanLotus and Witchetty both did, the embedded bits are statistically indistinguishable from random noise. The encrypted data produces a uniform distribution that defeats the very statistical anomalies these tests rely on. For a detailed treatment of chi-square and RS analysis methods, see Steganography Detection Techniques.

Chunk and metadata injection

Image formats include metadata fields and extensible container structures that carry arbitrary data without affecting how the image displays. PNG supports tEXt, zTXt, and iTXt chunks for textual metadata. JPEG supports EXIF, COM (comment), and APP marker segments. These fields are designed to store camera settings, copyright notices, and application-specific data. Attackers repurpose them to store executable payloads.

PHP webshells embedded in JPEG EXIF headers represent one of the oldest and most persistent variants of this technique. The attacker places PHP code in the EXIF Comment or ImageDescription field. When a web application processes the uploaded image and the server interprets the file as PHP (through misconfiguration or a local file inclusion vulnerability), the embedded code executes. The image remains valid and displays normally throughout.

SteamHide (2021) used a more sophisticated approach. The malware embedded encrypted executable code within Steam user profile images, specifically in the PropertyTagICCProfile metadata field. This field normally stores colour management data and can hold arbitrary binary content. The extraction routine fetched the profile image from Steam's CDN, parsed the ICC profile field, decrypted the content, and executed the result. Because Steam profile images are served from Valve's infrastructure, network-level blocking was impractical without disrupting legitimate Steam traffic.

Detection difficulty: Low to Medium. Metadata inspection tools exist and are widely available. The challenge is that most security products do not examine image metadata for executable content. A JPEG with PHP code in its EXIF Comment field passes through email gateways, web application firewalls, and antivirus scanners without triggering alerts. Detection requires purpose-built rules that scan metadata fields for code patterns: eval, base64_decode, exec, script syntax, or high-entropy binary data in fields that should contain human-readable text.
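A purpose-built metadata rule of the kind described above can be approximated with a regular expression over field values. This sketch assumes the metadata has already been parsed into a name-to-bytes mapping by whatever EXIF or PNG library is in use; the function name `flag_metadata`, the field names, and the exact pattern list are illustrative assumptions rather than a complete rule set.

```python
import re

# Code-like patterns that should not appear in descriptive metadata fields.
SUSPICIOUS = re.compile(
    rb"<\?php|\beval\s*\(|\bbase64_decode\s*\(|\bexec\s*\(|<script\b",
    re.IGNORECASE,
)

def flag_metadata(fields: dict) -> list:
    """Return the names of metadata fields whose values contain code-like patterns."""
    return [name for name, value in fields.items() if SUSPICIOUS.search(value)]

# Hypothetical parsed metadata from two uploads.
benign = {"ImageDescription": b"Sunset over the bay", "Comment": b"Shot on a phone"}
webshell = {
    "ImageDescription": b"Sunset over the bay",
    "Comment": b"<?php eval(base64_decode($_POST['x'])); ?>",
}

print(flag_metadata(benign), flag_metadata(webshell))
```

Pattern matching of this kind catches plaintext code but not encrypted blobs such as the SteamHide ICC profile content, which is where an entropy check on the field value becomes the more useful signal.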

IDAT pixel stream injection

PNG images store their pixel data in IDAT chunks, where it is zlib-compressed. Rather than appending data after the image or hiding it in metadata, this technique embeds the payload within the compressed pixel data stream itself. The payload is not simply concatenated; it is interwoven with or positioned alongside legitimate image data using format-specific containers.

IcedID (also known as BokBot) is the most prominent user of this method. Its loader fetches a PNG image from attacker infrastructure, parses the IDAT chunks, locates the encrypted payload using embedded markers, and decrypts it with an RC4 key stored within the same image file. The self-contained nature of this design is significant: the image carries both the encrypted payload and its decryption key, meaning no external key exchange is required. The resulting images appear grey and noisy to human observers because the pseudo-random encrypted data produces high-entropy pixel values, but they remain structurally valid PNG files.

GHOSTPULSE v1 (2023), documented by Elastic Security Labs, used a variation of this approach. Its extraction routine scanned PNG files for an "IDAT" string marker followed by a four-byte validation tag. Chunks matching this pattern were concatenated, XOR-decrypted using a key derived from the file, and decompressed with LZNT1. The multi-step extraction chain (locate markers, concatenate, decrypt, decompress) meant that intercepting any single step produced only intermediate data that resisted analysis without knowledge of subsequent steps.
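The locate, concatenate, decrypt portion of a GHOSTPULSE-style chain can be sketched as follows. The tag value, the key, the function names, and the synthetic file are all hypothetical; the real samples derive these from the carrier in their own way. The final LZNT1 decompression step is omitted because the Python standard library has no LZNT1 routine.

```python
import struct
import zlib
from itertools import cycle

def collect_tagged_idat(blob: bytes, tag: bytes) -> bytes:
    """Concatenate the data of every IDAT chunk whose data begins with `tag`."""
    out = bytearray()
    pos = blob.find(b"IDAT")
    while pos != -1:
        if pos >= 4 and blob[pos + 4:pos + 8] == tag:
            (length,) = struct.unpack(">I", blob[pos - 4:pos])  # chunk length field
            out += blob[pos + 8:pos + 4 + length]               # data after the tag
        pos = blob.find(b"IDAT", pos + 4)
    return bytes(out)

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Demo helper: build a PNG chunk with a valid CRC."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

# Synthetic carrier with one ordinary IDAT chunk and two tagged ones.
TAG = b"\xde\xad\xbe\xef"   # hypothetical validation tag
KEY = b"k3y"                # hypothetical XOR key
secret = b"second stage"
encrypted = xor_bytes(secret, KEY)
carrier = (
    b"\x89PNG\r\n\x1a\n"
    + make_chunk(b"IDAT", b"\x00" * 8)
    + make_chunk(b"IDAT", TAG + encrypted[:5])
    + make_chunk(b"IDAT", TAG + encrypted[5:])
    + make_chunk(b"IEND", b"")
)

print(xor_bytes(collect_tagged_idat(carrier, TAG), KEY))
```

Scanning for the literal `IDAT` string instead of walking the chunk list mirrors the marker-scanning behaviour described above, and it also explains why such loaders tolerate otherwise malformed PNG files.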

Detection difficulty: Medium. IDAT-embedded payloads produce measurable anomalies. The entropy of individual IDAT chunks is higher than expected for typical image content, often approaching 8.0 bits per byte. The number of IDAT chunks may be unusual. Marker scanning patterns can be detected through YARA rules targeting specific byte sequences within the compressed stream. However, distinguishing a payload-bearing IDAT from a legitimately high-entropy photographic image requires contextual analysis rather than simple thresholds.
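The entropy figure quoted above is Shannon entropy over byte frequencies, which can be computed as in the sketch below. The function name is hypothetical, and no threshold is suggested here: as noted, compressed photographic data can also score close to 8.0, so the value is one input to contextual analysis rather than a verdict.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for constant data, 8.0 for uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )

constant = b"\x00" * 1024          # no variation at all
uniform = bytes(range(256)) * 4    # every byte value equally frequent
text = b"copyright notice " * 64   # repetitive human-readable content

for label, sample in (("constant", constant), ("text", text), ("uniform", uniform)):
    print(label, round(shannon_entropy(sample), 3))
```

Applied per chunk, or over a sliding window across the file, the same function exposes the sharp entropy transitions that appended and injected payloads tend to produce.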

Offset-marker based extraction

This method uses a structured header or marker at a known position to direct the extraction routine to the payload's location. Rather than scanning the entire file or parsing image structures, the malware reads a fixed offset, interprets the bytes there as a descriptor (payload offset, length, encryption parameters), and jumps directly to the payload location.

Turla's LightNeuron implant (2019), documented by ESET, applied this technique to PDF email attachments. The malware placed a custom header at the start of PDF files specifying the byte offset of AES-256 encrypted commands embedded deeper in the document. The extraction routine read the header, calculated the offset, extracted the specified number of bytes, and decrypted them. The PDF rendered normally because the header occupied space before the PDF's own header marker, and most PDF viewers are tolerant of leading bytes.

EdgeSpot documented a related technique in PDF exploit campaigns active from 2017 to 2019. These exploits used the JavaScript getIcon and iconStreamFromIcon APIs in PDF readers to extract data from embedded image streams within the document. The extraction was invisible to users because it operated through the document's internal scripting capabilities rather than external network connections. The payload was encoded within an image stream that the PDF legitimately contained, making structural analysis of the PDF appear clean.
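The read-descriptor-then-jump logic can be illustrated with a fixed-size header parsed by `struct`. The layout used here (a four-byte magic value followed by a little-endian payload offset and length) is entirely hypothetical and is not the LightNeuron format; it only shows the shape of the technique. Decryption of the returned bytes is out of scope for the sketch.

```python
import struct
from typing import Optional

# Hypothetical descriptor: 4-byte magic, uint32 payload offset, uint32 payload length.
DESCRIPTOR = struct.Struct("<4sII")
MAGIC = b"STG1"  # illustrative value only

def read_payload(blob: bytes, magic: bytes = MAGIC) -> Optional[bytes]:
    """Parse the leading descriptor and return the payload region it points to."""
    if len(blob) < DESCRIPTOR.size:
        return None
    found, offset, length = DESCRIPTOR.unpack_from(blob, 0)
    if found != magic or offset + length > len(blob):
        return None
    return blob[offset:offset + length]

# Synthetic carrier: descriptor, then an ordinary-looking PDF body, then ciphertext.
pdf_body = b"%PDF-1.7\n" + b"A" * 32
ciphertext = b"ciphertext-bytes"
payload_offset = DESCRIPTOR.size + len(pdf_body)
carrier = DESCRIPTOR.pack(MAGIC, payload_offset, len(ciphertext)) + pdf_body + ciphertext

print(read_payload(carrier))
```

The bounds check matters for analysts as much as for attackers: a descriptor pointing outside the file is a quick indication that the wrong layout is being assumed.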

Detection difficulty: Medium to High. Offset-marker extraction leaves minimal forensic traces. The marker itself is a short sequence of bytes at a known position, indistinguishable from random data without knowledge of the format. The payload occupies a region that may overlap with or replace legitimate document content. Detection requires either prior knowledge of the specific marker format used by a given campaign, or anomaly detection at the document structure level: unexpected leading bytes in PDFs, oversized embedded resources, or JavaScript accessing image stream APIs in ways that do not correspond to visible document elements.
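One of the structural checks named above, unexpected leading bytes in a PDF, reduces to locating the `%PDF-` header and reporting how far into the file it sits. The function name and the 1024-byte search window are assumptions made for this sketch; the window reflects the leniency commonly attributed to PDF readers and should be tuned to the parser being protected.

```python
from typing import Optional

def pdf_leading_bytes(blob: bytes, search_window: int = 1024) -> Optional[int]:
    """Return how many bytes precede the %PDF- header, or None if it is not found."""
    idx = blob.find(b"%PDF-", 0, search_window)
    return None if idx == -1 else idx

clean = b"%PDF-1.7\n1 0 obj\n<< >>\nendobj\n"
prefixed = b"\x00" * 12 + clean   # 12 bytes of foreign data before the header

print(pdf_leading_bytes(clean), pdf_leading_bytes(prefixed), pdf_leading_bytes(b"GIF89a"))
```

A non-zero result is not proof of compromise, but it is cheap to compute and rare enough in legitimate documents to justify closer inspection.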

The encryption layer problem

Successful extraction is rarely the end of the analysis. In the majority of documented campaigns, the extracted payload is encrypted, encoded, or both. Steganography conceals the data's existence. Encryption protects its contents. The two layers serve complementary purposes, and defeating one does not automatically defeat the other.

OceanLotus layered AES-128 and XOR on top of LSB encoding. Turla's LightNeuron used AES-256 to encrypt commands embedded via offset markers. XWorm's StegoCampaign applied AES-ECB encryption to its appended payloads. IcedID uses RC4 with a key embedded in the carrier image itself. GHOSTPULSE v1 combined XOR decryption with LZNT1 decompression. In every case, a defender who successfully identifies and extracts the steganographic payload is left holding ciphertext that requires a second, entirely separate analytical pass.
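For the self-keyed RC4 case, the second analytical pass can be sketched once the embedded blob has been extracted. The `rc4` function below is the standard algorithm; `decrypt_self_keyed` and its layout (key in the first eight bytes, ciphertext after it) are hypothetical stand-ins, since the actual key position depends on the sample.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling, then keystream XOR (symmetric)."""
    if not key:
        raise ValueError("key must not be empty")
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

def decrypt_self_keyed(blob: bytes, key_len: int = 8) -> bytes:
    """Hypothetical layout: first `key_len` bytes are the key, the rest is ciphertext."""
    key, ciphertext = blob[:key_len], blob[key_len:]
    return rc4(key, ciphertext)

# Synthetic blob standing in for data already extracted from a carrier image.
demo_key = b"8bytekey"
extracted_blob = demo_key + rc4(demo_key, b"stage two")

print(decrypt_self_keyed(extracted_blob))
```

The point of the exercise is the workflow rather than the cipher: when the key travels with the payload, recovering plaintext is a matter of identifying the layout, not of breaking the encryption.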

This is where the distinction between detection and analysis becomes critical. A steganographic detector can identify that hidden data exists. It may even extract the raw bytes. But if those bytes are AES-encrypted, the detector has produced an artefact that reveals nothing about the threat without the corresponding decryption key. Key recovery, cipher identification, and decryption are separate disciplines from steganographic extraction. For a deeper treatment of how encryption keys function as threat intelligence, see Malware Encryption Keys as Threat Intelligence.

Beyond encryption, extracted payloads frequently contain additional obfuscation layers. A decrypted payload might be Base64-encoded PowerShell that, when decoded, reveals a GZip-compressed stream containing obfuscated JavaScript with string concatenation and character code arithmetic. The steganographic image was just the outermost container. The full analysis chain runs: image analysis, extraction, decryption, deobfuscation (often through multiple iterations), and finally IOC generation from the recovered plaintext.
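The iterative nature of deobfuscation can be illustrated with a small loop that peels Base64 and GZip layers until neither applies. This is a heuristic sketch with a hypothetical function name: it handles only these two encodings, and strict Base64 validation can still misfire on short plaintext that happens to use only Base64 characters, so real tooling needs stronger stopping conditions.

```python
import base64
import binascii
import gzip

def peel_layers(data: bytes, max_layers: int = 8):
    """Repeatedly strip GZip and Base64 layers; return (layer names, remaining bytes)."""
    steps = []
    for _ in range(max_layers):
        if data[:2] == b"\x1f\x8b":          # GZip magic bytes
            data = gzip.decompress(data)
            steps.append("gzip")
            continue
        try:
            decoded = base64.b64decode(data, validate=True)
        except (binascii.Error, ValueError):
            break                            # not Base64: treat as final plaintext
        if not decoded:
            break
        data = decoded
        steps.append("base64")
    return steps, data

# Synthetic two-layer sample: a harmless script, GZip-compressed, then Base64-encoded.
script = b"Write-Output 'hello'"
wrapped = base64.b64encode(gzip.compress(script))

print(peel_layers(wrapped))
```

Recording the order of layers, as the `steps` list does, is useful in its own right because the layering sequence is often consistent across samples from the same campaign.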

A steganographic detector alone is insufficient for threat analysis. The full chain requires: carrier identification, extraction using the correct technique, decryption (often requiring key recovery from the same file or related infrastructure), multi-layer deobfuscation of the decrypted payload, and IOC extraction from the final plaintext. Stopping at any intermediate step produces incomplete intelligence.

Extraction techniques compared

Key Examples

Appended data
Malware Family: XWorm StegoCampaign, GhostPoster
Carrier Format: JPEG, PNG, GIF
Encryption Layer: AES-ECB (XWorm), XOR + Base64 (GhostPoster)
Detection Difficulty: Low

LSB pixel encoding
Malware Family: OceanLotus, Worok, Witchetty
Carrier Format: PNG, BMP
Encryption Layer: AES-128 + XOR (OceanLotus), XOR (Witchetty)
Detection Difficulty: High (when encrypted)

Chunk/metadata injection
Malware Family: PHP webshells, SteamHide
Carrier Format: JPEG, PNG
Encryption Layer: Custom encryption (SteamHide), none (PHP)
Detection Difficulty: Low-Medium

IDAT stream injection
Malware Family: IcedID, GHOSTPULSE v1
Carrier Format: PNG
Encryption Layer: RC4 (IcedID), XOR + LZNT1 (GHOSTPULSE)
Detection Difficulty: Medium

Offset-marker extraction
Malware Family: Turla LightNeuron, EdgeSpot PDF
Carrier Format: PDF
Encryption Layer: AES-256 (Turla), none (EdgeSpot)
Detection Difficulty: Medium-High

Several patterns emerge from this comparison. First, PNG is the dominant carrier format, appearing in four of five categories. Its lossless compression preserves embedded data exactly, and its chunk-based structure provides multiple hiding locations. Second, encryption is the norm rather than the exception. Only the simplest techniques (basic appended data, PHP EXIF injection) sometimes ship without an encryption layer. Third, detection difficulty correlates more strongly with the presence of encryption than with the complexity of the embedding method. LSB encoding is well-understood and has mature detection tools, but those tools fail when the embedded data is encrypted first.

What extraction code reveals to defenders

Analysing the extraction logic in a malware sample provides intelligence beyond just recovering the payload. The choice of technique fingerprints the threat actor. IcedID's RC4-over-IDAT pattern is distinctive. OceanLotus's bespoke LSB with AES-128 is attributable. GhostPoster's delimiter-plus-character-swap chain is recognisable across campaign variants. When a new sample uses a known extraction pattern, attribution narrows immediately.

The extraction code also reveals the detection opportunities. Appended data techniques are vulnerable to file size comparison. IDAT injection produces entropy anomalies. Metadata injection is catchable with field-level content inspection. Offset-marker techniques leave structural anomalies in the carrier document. Each extraction method implies a corresponding detection method, and defenders who understand the extraction logic can write targeted YARA rules, configure format-aware scanning, or instrument monitoring for the specific file access patterns each technique requires.

Finally, the extraction code dictates the post-extraction workflow. If the extraction routine includes an RC4 decryption step with a key read from the same file, the defender knows to look for the key in the carrier. If the routine applies XOR followed by decompression, the defender knows the decryption order. If the routine passes output to eval or Invoke-Expression, the defender knows the decrypted payload is executable code that likely requires further deobfuscation. The extraction code is, in effect, a recipe for reversing the entire concealment chain.

For the broader taxonomy of stegomalware types and the decade of campaigns that shaped current technique adoption, see What is Stegomalware?. When the carrier is a PDF document rather than a standalone image, additional analysis layers come into play, as described in Analysing Steganographic Document Attacks.

Frequently Asked Questions

What is the most common steganographic technique used by malware?

Appended data after end-of-file markers is the most common technique due to its simplicity. The malware reads past the image termination marker (the FFD9 bytes for JPEG, the IEND chunk for PNG) and extracts everything that follows. Campaigns like XWorm and GhostPoster use this method because the extraction code is trivial to implement and the image renders normally in any viewer. More sophisticated groups like OceanLotus and IcedID use pixel-level and IDAT chunk methods that are harder to detect but require more complex extraction logic.

Can steganographic malware hide its own decryption key in the same image?

Yes. IcedID is the most prominent example: it embeds both the RC4-encrypted payload and the decryption key within the same PNG image. The key is stored in a separate region from the payload, and the extraction routine knows the offset of each. This self-contained design eliminates the need for external key exchange, making the image a complete attack package that requires no additional network communication to activate.

How do APT groups make steganography harder to detect?

The primary method is encrypting the payload before embedding it. Statistical detection tools like chi-square and RS analysis work by identifying the patterns that steganographic encoding creates in pixel value distributions. When the payload is encrypted first, the embedded bits are uniformly random, producing no detectable statistical pattern. OceanLotus used AES-128 plus XOR before LSB embedding specifically to defeat these automated detection tools. Turla used AES-256 encryption for its offset-marker payloads.

What is the difference between steganography and obfuscation?

Steganography hides the existence of data by embedding it within an innocuous carrier file such as an image. The goal is invisibility: the carrier looks normal to human observers and security tools. Obfuscation transforms data to make it unreadable while remaining visible. An obfuscated PowerShell script is clearly present in a file but its purpose is obscured through encoding, encryption, and string manipulation. In practice, malware frequently combines both: steganography to conceal the payload's existence, followed by obfuscation layers within the payload itself.

Which image formats support steganographic embedding?

All common image formats support some form of steganographic embedding. PNG is the most versatile due to lossless compression (which preserves data exactly) and its extensible chunk structure (tEXt, zTXt, iTXt, IDAT, private chunks). JPEG supports appended data after EOI and metadata injection via EXIF and COM segments, though its lossy compression makes pixel-level encoding unreliable. BMP supports LSB and appended data. GIF supports appended data after the 0x3B trailer. PDF documents support embedded image streams and offset-marker techniques. Even ICO files have been used in targeted campaigns.
