Charset detection
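- Charset detection (character encoding detection guesses the encoding of a piece of text by statistically analysing its byte patterns; detecting UTF-8 is comparatively reliable. The method is imperfect, however, and misdetections do occur, for example ASCII-encoded text being misread as Chinese text in UTF-16LE. Poorly designed detectors that do not test for UTF-8 first can misidentify UTF-8 as some other encoding.)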
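The ordering point above (check for UTF-8 before falling back to anything else) can be illustrated with a minimal Python sketch. The function name `guess_encoding` and the `windows-1252` fallback are assumptions made for this illustration only; the sketch does no statistical analysis and is not how any particular detector is implemented.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "windows-1252") -> str:
    """Return a best-guess encoding name for `data` (hypothetical helper)."""
    # A byte order mark, when present, is the strongest hint available.
    # UTF-32 is checked first because its little-endian BOM begins with
    # the same two bytes as the UTF-16LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Pure 7-bit input is reported as ASCII rather than being handed to a
    # multi-byte guesser that might "recognise" it as UTF-16.
    if data.isascii():
        return "ascii"
    # Test UTF-8 before any other candidate: arbitrary non-UTF-8 bytes
    # rarely happen to form a valid UTF-8 sequence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback; a real detector would apply statistics here.
        return fallback
```

The ASCII check matters because any even-length run of ASCII bytes also decodes without error as UTF-16LE, yielding mostly CJK characters, which is the kind of misdetection described above. A call such as `guess_encoding("字符集".encode("utf-8"))` takes the UTF-8 branch, while bytes that are neither ASCII nor valid UTF-8 fall through to the assumed fallback.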
abstract:
Character encoding detection, charset detection, or code page detection is the process of heuristically guessing the character encoding of a series of bytes that represent text. The technique is recognised to be unreliable and is used only when specific metadata, such as an HTTP Content-Type header, is either unavailable or assumed to be untrustworthy.
Source of the above:
WordNet