Convert characters to character entities

I have written a script to convert some (or all) characters in a string to character entities. There are many reasons to do that:

  • The character is not representable in the encoding of the document
  • The character is not to be parsed as markup, but would be if it were not escaped as an entity
  • Documentation: If the character may not have a glyph in the font used by the UA, or if the user may not recognize that glyph, the character entity documents which character you are looking at. E.g. ᚱ is coded as the character entity &#5809; and the character name can be found at IBM.
  • The string doesn't allow spaces, so they must be encoded as &#32;
  • Two consecutive hyphens are not allowed in HTML comments, but they are legal if at least one of them is written as a character entity.
  • "Encryption ultra lite"
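
The general idea behind the reasons above can be sketched in a few lines of Python. This is not the script itself, only a minimal illustration: the function name to_char_refs, its parameters, and the default choice of which characters to escape are all assumptions made up for this example.

```python
def to_char_refs(text, should_escape=None, hexadecimal=False):
    """Replace selected characters with numeric character references.

    should_escape is a hypothetical predicate deciding, per character,
    whether it gets converted; it is not taken from the original script.
    """
    if should_escape is None:
        # Assumed default: escape markup-significant characters and
        # everything outside printable ASCII.
        should_escape = lambda ch: ch in '&<>"' or ord(ch) > 126

    out = []
    for ch in text:
        if should_escape(ch):
            code = ord(ch)  # Unicode code point of the character
            out.append(f"&#x{code:X};" if hexadecimal else f"&#{code};")
        else:
            out.append(ch)
    return "".join(out)


# Convert only what the default predicate selects.
print(to_char_refs("a<b"))                           # a&#60;b
print(to_char_refs("ᚱ", hexadecimal=True))           # &#x16B1;

# Convert only spaces, or only hyphens.
print(to_char_refs("a b", lambda ch: ch == " "))     # a&#32;b
print(to_char_refs("--", lambda ch: ch == "-"))      # &#45;&#45;

# Convert every character ("encryption ultra lite").
print(to_char_refs("<b>", lambda ch: True))          # &#60;&#98;&#62;
```

Passing the predicate as a parameter is one way to cover both the "some" and the "all" case with the same loop; a real script might instead take a list of characters or a target encoding.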

Posted by: Aage Utnes

Anne van Kesteren also comments upon his choice of character set at

18.01.2005 @ 20:45

