
People complain that governments or hackers are reading our messages for nefarious purposes. Of course, this "reading" is done automatically and in large volume by machines running NLP (natural language processing) algorithms that look, among other things, at keywords.

What about the following cave-man approach?

Insert one space between every two consecutive characters, and mark each original space with a period. For instance, the text

Hello John, How are you? How was your trip?

becomes

H e l l o . J o h n , . H o w . a r e . y o u ? . H o w . w a s . y o u r . t r i p ?
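For anyone who wants to try it, here is a minimal Python sketch of the transformation. It assumes, as the example suggests, that each original space is shown as a period; the function name `space_out` and the `word_gap` parameter are made up for illustration.

```python
def space_out(text: str, word_gap: str = ".") -> str:
    """Put one space between consecutive characters.

    Original spaces are replaced by `word_gap` (a period by default) so that
    word boundaries stay visible to a human reader.
    """
    return " ".join(word_gap if ch == " " else ch for ch in text)


if __name__ == "__main__":
    print(space_out("How are you?"))  # H o w . a r e . y o u ?
```

One caveat with this particular choice of marker: a period that is already in the message becomes indistinguishable from a word boundary, so a different `word_gap` character may be preferable for text that contains periods.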

The advantage is that the text remains very easy for a human being to read, yet it looks like gibberish to automated algorithms unless a specific rule is introduced in NLP algorithms to detect such patterns (which would presumably happen only if the trick were used by many people). Such rudimentary encoding is also straightforward to implement, and it is more compact than encoding text as an image with a screenshot capture tool. Another advantage is that you can still copy and paste content encoded and received that way, remove the spaces, and get normal text back, for instance if you need to put content emailed to you into a business presentation. This is not possible, or at least much more difficult and error-prone, if the text was emailed embedded in a picture.
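To make both sides of the argument concrete, here is a rough Python sketch. It shows a naive keyword scan missing the spaced-out text, and the kind of "specific rule" that would catch it and undo it. All names (`collapse`, `contains_keyword`, `looks_spaced_out`) and the 0.8 threshold are assumptions chosen for illustration, not anything a real scanner is known to use.

```python
import re


def collapse(spaced: str, word_gap: str = ".") -> str:
    """Undo the spacing: keep every other character, then turn the
    word-gap marker back into a space. Assumes exactly one space was
    inserted between consecutive characters."""
    return "".join(" " if ch == word_gap else ch for ch in spaced[::2])


def contains_keyword(text: str, keyword: str) -> bool:
    """Naive whole-word keyword match, case-insensitive."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def looks_spaced_out(text: str, threshold: float = 0.8) -> bool:
    """Heuristic rule: flag text in which most whitespace-separated
    tokens are a single character long."""
    tokens = text.split()
    if not tokens:
        return False
    single = sum(1 for tok in tokens if len(tok) == 1)
    return single / len(tokens) >= threshold


if __name__ == "__main__":
    message = "H o w . w a s . y o u r . t r i p ?"
    print(contains_keyword(message, "trip"))            # False
    print(looks_spaced_out(message))                    # True
    print(collapse(message))                            # How was your trip?
    print(contains_keyword(collapse(message), "trip"))  # True
```

In other words, the trick only works as long as nobody bothers to add a few lines like these to the scanner.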

What do you think?

Related article: Steganography (the art of hiding messages in images and other media)



Replies to This Discussion

Adam Bouscaren posted the following comment on a version of this question on Facebook:

Not sure what popular opinion is here, but text as an image probably doesn't work against anyone other than those employing low-level techniques (i.e. extraction from packet sniffing). I've been training a Haar classifier so I can feed images with some additional preprocessing to a neural network. Not sure how it's going to work out, but based on what I'm seeing in the process, I think most images can be read by the types of enterprises that would scan email.

I replied:

I agree with you, Adam: you would think that modern algorithms can easily "read" text in images. Yet you would be surprised to see that even some Facebook algorithms are unable to recognize whether an image contains text or not.

Increasing the spaces among the words may make it more difficult for NLP algorithms to contextualise the meaning of the sentence, but because the sentence still runs along a horizontal line, it may be deciphered after successive attempts.

