New Scientist recently ran an article about steganography. If you don’t already know, steganography is essentially encryption with a difference: ordinary encryption is usually obvious. The data Alice sends to Bob by email using public key cryptography may be entirely secure from eavesdropping by Eve (pun intended, sorry), but Eve will still know that data she might be interested in is being sent. Steganography, by contrast, seeks to hide the encrypted data so that Eve is not even aware of its existence.
It’s a very old idea, stretching back to ancient Greece. In modern times a common way to perform the trick is to hide data in an image. One of my more gifted undergraduate students did a final year project on this with me. We used a known password as the seed for a pseudo-random number generator, which determined the pixels of the image in which we would embed the data. By altering the least significant bit of one colour in randomly spaced pixels, you can hide data very effectively.
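To make the idea concrete, here is a minimal sketch of that kind of scheme. It is not the code from the student project. It assumes the image is already available as a flat list of `(r, g, b)` tuples, that the blue channel carries the data, and that the recipient knows the payload length; the names `embed`, `extract` and `_positions` are made up for illustration.

```python
import random


def _positions(password, n_pixels, n_bits):
    # Seed the PRNG with the shared password so that sender and
    # recipient derive the same pseudo-random pixel positions.
    rng = random.Random(password)
    return rng.sample(range(n_pixels), n_bits)


def embed(pixels, payload, password):
    """Hide the bytes of `payload` in the blue LSBs of chosen pixels."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this image")
    out = list(pixels)
    for pos, bit in zip(_positions(password, len(pixels), len(bits)), bits):
        r, g, b = out[pos]
        # Clear the least significant bit of blue, then set it to the data bit.
        out[pos] = (r, g, (b & 0xFE) | bit)
    return out


def extract(pixels, n_bytes, password):
    """Recover `n_bytes` bytes hidden by `embed` with the same password."""
    positions = _positions(password, len(pixels), n_bytes * 8)
    bits = [pixels[pos][2] & 1 for pos in positions]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
```

Each altered pixel changes by at most one level in a single colour channel, which is why the result is so hard to spot by eye. In practice you would also encrypt the payload before embedding it.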
The New Scientist article suggests that if you detect the steganography, and if you obtain the suspect’s computer, and if they have carelessly wiped the software, there might be traces that tell you this was done. But let’s remember that the whole point of steganography is that the first step is improbable: you most likely won’t detect it.
The issue is, in today’s geopolitical situation, reasonably serious. It has been suggested (see the Wikipedia article I linked above) that such techniques were used to exchange data on sites like eBay to plan major terrorist attacks. With much analysis software only checking for known algorithms, or relying on comparing modified images with the original (which may not be available), what can such a major website do to prevent this kind of abuse? Well, I thought one approach would be to employ essentially the same technique with random data. That is, randomly poking data into bits in pixels here and there will, up to a certain point, not affect image clarity to the naked eye, but unless the encrypted data is loaded with huge amounts of error-correcting code, it will destroy the payload. You could easily run such a filter automatically over uploaded images. I’m sure similar approaches would work for digital sound.
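A rough sketch of such a filter, under the same assumption that an image is a flat list of `(r, g, b)` tuples: the function name `scrub` and the `fraction` parameter are hypothetical, and how large a fraction is needed to reliably wreck a payload is something you would have to find out by experiment.

```python
import random


def scrub(pixels, fraction=0.5, rng=None):
    """Overwrite the least significant bits of a random subset of pixels."""
    # SystemRandom draws from the operating system's entropy source, so
    # an uploader cannot predict which bits the filter will touch.
    rng = rng or random.SystemRandom()
    out = []
    for pixel in pixels:
        if rng.random() < fraction:
            # Replace the LSB of every colour channel with a random bit.
            pixel = tuple((c & 0xFE) | rng.getrandbits(1) for c in pixel)
        out.append(pixel)
    return out
```

No channel moves by more than one level, so the picture should look the same, but any hidden bit that lands on a scrubbed pixel has an even chance of being flipped.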