Description

This page is vulnerable to several Unicode transformation issues, such as best-fit mappings, overlong byte sequences, and ill-formed sequences.

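A best-fit mapping occurs when a character X is transformed into an entirely different character Y. In general, best-fit mappings occur when characters are transcoded between Unicode and another encoding. If input is validated before the transcoding step, a character that the filter considered harmless can be turned into a dangerous one afterwards.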
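
The effect of a best-fit mapping on input filtering can be sketched in Python as follows. This is only an illustration: the BEST_FIT table and both function names are hypothetical, and real best-fit tables are specific to each code page and much larger.

```python
# Hypothetical best-fit table: a tiny illustrative subset, not a real
# code page's mapping table.
BEST_FIT = {
    "\uFF1C": "<",   # FULLWIDTH LESS-THAN SIGN
    "\uFF1E": ">",   # FULLWIDTH GREATER-THAN SIGN
}

def best_fit_transcode(text: str) -> str:
    """Mimic a lossy transcoder that substitutes look-alike ASCII characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def strict_transcode(text: str, encoding: str = "ascii") -> bytes:
    """Refuse to transcode text that the target encoding cannot represent exactly."""
    return text.encode(encoding, errors="strict")

payload = "\uFF1Cscript\uFF1E"

# A filter that runs before transcoding finds no ASCII angle brackets.
assert "<" not in payload and ">" not in payload

# After the lossy transcoding step the angle brackets appear.
print(best_fit_transcode(payload))
```

With strict_transcode, the same payload raises UnicodeEncodeError instead of being silently rewritten, which is one way to keep the filter and the transcoder in agreement.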

Overlong byte sequences (non-shortest form) - The UTF-8 bit pattern makes it possible to encode a character in more bytes than its shortest form requires. For security reasons, a UTF-8 decoder must not accept sequences that are longer than necessary to encode a character. For example, the character U+000A (line feed) must be accepted from a UTF-8 stream only in the form 0x0A, and not in any of the following five possible overlong forms:

  • 0xC0 0x8A
  • 0xE0 0x80 0x8A
  • 0xF0 0x80 0x80 0x8A
  • 0xF8 0x80 0x80 0x80 0x8A
  • 0xFC 0x80 0x80 0x80 0x80 0x8A
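
One way to check that a decoder rejects these forms is sketched below in Python, whose built-in UTF-8 codec is strict by default. The helper name is_well_formed_utf8 is an assumption made for this example.

```python
# The five overlong encodings of U+000A listed above.
OVERLONG_LF = [
    bytes.fromhex(h)
    for h in ("C08A", "E0808A", "F080808A", "F88080808A", "FC808080808A")
]

def is_well_formed_utf8(data: bytes) -> bool:
    """Return True only if data decodes as strict, shortest-form UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

# The shortest form is accepted; every overlong form is rejected.
print(is_well_formed_utf8(b"\x0A"))
for seq in OVERLONG_LF:
    print(seq.hex(), is_well_formed_utf8(seq))
```

Input that fails this check should be rejected outright rather than repaired, so that later components never see a line feed that an earlier filter missed.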

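Ill-formed subsequences - As required by Unicode 3.0 and noted in Unicode Technical Report #36, if a leading byte is followed by an invalid successor byte, the decoder should not consume the successor byte. A decoder that does consume it can swallow a character that follows the malformed byte, such as a quotation mark or angle bracket that delimits markup.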
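
The expected behavior can be sketched with Python's built-in UTF-8 codec in replacement mode. The function name decode_replacing and the sample bytes are assumptions made for this example.

```python
def decode_replacing(data: bytes) -> str:
    """Decode UTF-8, replacing each ill-formed subsequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")

# 0xC2 announces a two-byte sequence, but 0x22 (a double quote) is not a
# valid continuation byte.
sample = b'\xC2">'
decoded = decode_replacing(sample)

# Only the stray lead byte is replaced; the quote and the angle bracket
# that follow it are preserved.
assert decoded == '\ufffd">'
print(repr(decoded))
```

A decoder that instead dropped the quote along with the stray lead byte would leave an attribute value unterminated, letting the text after it be interpreted as markup.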

Remediation

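Identify where these Unicode transformations occur, for example in a UTF-8 decoder or in a step that transcodes between Unicode and another encoding, and fix them. Reject overlong and ill-formed UTF-8 sequences during decoding, and validate input after any transcoding step rather than before it. Consult the web references below for more information.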

References
