Detecting and Extracting Steganographic Content

Earlier I wrote about using outguess to embed hidden text into JPEG images. This is not a new concept; people have been using some form of steganography for many, many years.

While it is easy enough to hide data in images, detecting those hidden messages does not appear to be that simple unless you know about the methods in advance. If you do find data hidden within an image, extracting that data can be difficult if you don’t know the passphrase (assuming the passphrase is strong), at least with the compute power and tools available to me.

If the method used to hide the data is known then it is just a matter of determining what the passphrase is. If you have the compute power and a good dictionary then finding the passphrase using brute force can be possible. That is only one part of the picture though.
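To make the brute force idea concrete, here is a rough sketch of what a dictionary attack amounts to: try each line of a wordlist against some check and stop at the first hit. The function name try_passphrases and the check command are made up for illustration. A real cracking tool does the checking internally and far more efficiently.

```shell
# try_passphrases CHECK WORDLIST
# CHECK is any command or function (hypothetical here) that exits 0 when
# the candidate passed as its first argument is the right passphrase.
try_passphrases() {
    check=$1
    wordlist=$2
    # Read the wordlist on file descriptor 3 so CHECK can still use stdin.
    while IFS= read -r candidate <&3; do
        if "$check" "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done 3< "$wordlist"
    # Wordlist exhausted without a match.
    return 1
}
```

The whole approach only works if the passphrase, or something a rule can turn into it, is actually in the wordlist.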

Different techniques used to hide data may be hard to detect, or they may be identified incorrectly.

For example, I once placed a string of text inside of a JPEG picture using nothing but a text editor. (I did this by placing the text directly after the JFIF marker and prior to anything else.) This method is extremely weak and hokey. However, if someone did not think to look at the file with strings, a hex editor or just plain old notepad, they could be misled into thinking data was hidden with outguess, jphide, Invisible Secrets, etc. None of those tools would properly extract the data. Using something as simple as notepad would work fine. The point is, identifying the method is a rather important first step.
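A quick way to check for this kind of plain-text stuffing is to pull the readable runs of text out of the file. The sketch below is a rough stand-in for the strings utility, built from tr and awk; the function name printable_runs and the default minimum length of 8 are my own choices, not anything standard.

```shell
# printable_runs FILE [MIN]
# Print every run of at least MIN printable ASCII characters in FILE.
# An approximation of `strings -n MIN FILE`, not a replacement for it.
printable_runs() {
    file=$1
    min=${2:-8}
    # Turn every non-printable byte into a newline, then keep the long lines.
    LC_ALL=C tr -c '[:print:]' '\n' < "$file" |
        awk -v min="$min" 'length($0) >= min'
}

# Usage (the file name is only an example):
#   printable_runs suspect.jpg 8
```

Compressed image data will throw up the odd run of random-looking characters, so expect some noise. A readable sentence sitting near the top of the file stands out immediately.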

In addition to some commercially marketed tools, there are a handful of open source tools available. I listed a couple of those tools above.
In the examples that follow I’ll stick with stegdetect and stegbreak. stegdetect can be used to look for steganographic content and stegbreak can be used to try and determine what the passphrase is. Both tools were written by Niels Provos, the author of outguess.

To illustrate I’ll use the example image from my previous post and a new example.

Example 1:

The image above contains data that was hidden with outguess version 2. The stegdetect tool does not report anything peculiar about the image. In other words, using stegdetect on files with data hidden using outguess version 2 is not going to tell us anything.

# stegdetect steged-coyote-a-leeched.jpg
steged-coyote-a.jpg : negative

Using the -s (sensitivity) switch does not help.

# stegdetect -s1000 steged-coyote-a-leeched.jpg
steged-coyote-a.jpg : negative

Telling stegdetect to look for outguess type patterns fails too.

# stegdetect -to -s1000 steged-coyote-a-leeched.jpg
steged-coyote-a.jpg : negative

Just to clarify, this image does contain hidden content. The line that says ‘cat out.txt’ just prints the contents of the file called out.txt.

# outguess -k 'super secret passphrase' -r steged-coyote-a-leeched.jpg out.txt
Reading steged-coyote-a.jpg....
Extracting usable bits: 83084 bits
Steg retrieve: seed: 125, len: 32
# cat out.txt
This is a super secret message.

Example 2:

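The picture above contains data that was hidden with jphide. At the default sensitivity stegdetect reports nothing, but with the sensitivity raised (-s10) we can see there is potentially a hidden message to look for. We also see that the data is likely hidden with jphide. The asterisks indicate stegdetect's confidence level (one to three), so with two we can’t be certain of this.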

# stegdetect steged-coyote-b.jpg
steged-coyote-b.jpg : negative
# stegdetect -s10 steged-coyote-b.jpg
steged-coyote-b.jpg : jphide(**)
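When checking more than one image it helps to throw away the negatives. The sketch below assumes the "file : result" output format shown in the runs above; the function name flag_positives is made up. Anything that is not plain "negative" gets printed, which includes error messages, so treat the output as a list of files to look at rather than a verdict.

```shell
# Read stegdetect-style lines ("file : result") on stdin and print, as
# tab-separated file and result, those whose result is not "negative".
flag_positives() {
    awk -F' : ' 'NF >= 2 && $2 != "negative" { print $1 "\t" $2 }'
}

# Usage (assumes stegdetect is installed and on the PATH):
#   stegdetect -s10 *.jpg | flag_positives
```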

Because stegdetect reports the file is likely to contain data hidden with jphide we can attempt to recover a passphrase using stegbreak. In this case the passphrase is in our dictionary. In fact, it is the only word in our dictionary for this test; otherwise this would take much longer.

# stegbreak -f /usr/share/dict/words2 steged-coyote-b.jpg
Loaded 1 files...
steged-coyote-b.jpg : jphide[v3](passpass)
Processed 1 files, found 1 embeddings.
Time: 1 seconds: Cracks: 5, 5.0 c/s

Okay then, looks like it is positive for jphide version 3 with a highly intuitive passphrase of ‘pass’. Had this been a real world scenario the passphrase would be much more complex. Brute force against a strong passphrase could take forever – unless you’re located near Ft. George Meade in Maryland and have access to some real compute power.

So then, shall we extract the data?

# jpseek steged-coyote-b.jpg jphide-output.txt
jpseek, version 0.3 (c) 1998 Allan Latham
This is licenced software but no charge is made for its use.
NO WARRANTY whatsoever is offered with this product.
NO LIABILITY whatsoever is accepted for its use.
You are using this entirely at your OWN RISK.
See the GNU Public Licence for full details.
# cat jphide-output.txt
This is a super secret message.

We can now see our super secret message. Hidden with jphide, discovered with stegdetect, passphrase determined with stegbreak and extracted with jpseek.

Different detection tools and methods may have varied results. In this case we were able to use stegdetect and stegbreak to detect and crack a file created with jphide, and jpseek to extract the data. Using the same tools we were NOT able to detect anything in a file created with outguess version 2.
Additionally, it is very unlikely that you will come across hidden data protected by such a weak passphrase. With enough compute power and a very large dictionary with a proper rule set you can, in principle, uncover just about any passphrase. It all boils down to time and compute power.
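To put a rough number on "time", the worst case for a dictionary attack is simply words times rule variations divided by guesses per second. The sketch below does that arithmetic. The figures in the usage line are invented for illustration, and the 5 guesses per second is borrowed from the one-second stegbreak run above, which is far too short a run to be a reliable rate.

```shell
# estimate_seconds WORDS RULES RATE
# Worst-case seconds to try every candidate: WORDS * RULES / RATE.
# Integer arithmetic only, so the result is rounded down.
estimate_seconds() {
    words=$1
    rules=$2
    rate=$3
    echo $(( words * rules / rate ))
}

# Hypothetical example: 100000 words, 50 variations each, 5 guesses/second.
#   estimate_seconds 100000 50 5    # prints 1000000 (about 11.5 days)
```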

Enough drivel for one day….

Note: Outguess v2 has been around for quite some time now. While out of the scope of this simple article, there are a couple of documents floating around which cover studies on breaking outguess.

