Posted by: OcRam June 19, 2008
Cool_old_stuff

Explanation of no. 2

Notepad uses the IsTextUnicode WinAPI function to guess the text encoding of a file. When the file lacks a special marker defining the encoding (a byte order mark), this function runs a statistical analysis on the raw bytes and guesses from that. A plain text file can be misrecognized as Unicode as follows:

  1. The user types four words of 4, 3, 3 and 5 characters, in that order, separated by spaces. The words can be random and meaningless. (For example, type one of these: "This app can break", "bush hid the facts", "well its the blast" or some Linux reference.)
  2. The file must be saved and Notepad closed.
  3. When the file is re-opened, the saved bytes are interpreted as Unicode (UTF-16) sequences, two bytes per character, so the text shows up garbled as hieroglyphs or rectangles (depending on the font).
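
To see why step 3 produces hieroglyphs, here is a minimal Python sketch of the misreading itself. It does not call IsTextUnicode; it only assumes that the wrong guess is little-endian UTF-16, and the function name misread_as_utf16 is made up for this illustration.

```python
def misread_as_utf16(text: str) -> str:
    """Take the bytes Notepad would save for plain ASCII text and
    decode them as UTF-16LE (assumed here to be the wrong guess)."""
    raw = text.encode("ascii")      # one byte per typed character
    # Every two bytes become one 16-bit code unit. An odd byte count
    # would make this strict decode raise; 4-3-3-5 phrases are 18 bytes.
    return raw.decode("utf-16-le")


phrase = "this app can break"
garbled = misread_as_utf16(phrase)

print(len(phrase), "typed characters become", len(garbled), "characters")
print([hex(ord(ch)) for ch in garbled])
```

Each pair of ASCII bytes ends up as a single code point in the CJK ideograph range, which is why the result looks like Chinese characters, or like rectangles if the font has no glyphs for them.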
Last edited: 19-Jun-08 12:53 PM