If it is not present, read the encoding from the encoding attribute of the XML declaration.

Just a note: instead of using the often recommended (rather complex) regular expression by the W3C ( [...]

Have you also considered the security issues that may arise from converting an escaped UTF-8 code?

It can read the text from a file or a given string and detect different types of UTF character encoding. Currently it can distinguish UTF-8, UTF-16 and UTF-32, in little-endian or big-endian byte order.

You first have to detect what encoding has been used. Figure out what encoding your source file is in, then convert it to UTF-8 and you should be good to go.

After adding that to the top of the PHP file, all of the funky characters went away and it rendered as it should.
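The detect-then-convert workflow described above can be sketched roughly as follows. This is an illustration only, written in Python rather than PHP, and it assumes the input carries a byte order mark (BOM). Input without a BOM is decoded with a caller-supplied fallback encoding, which is a guess and not real detection. The names detect_bom and to_utf8 are hypothetical and do not come from the original discussion.

```python
import codecs

# BOM signatures. The UTF-32 entries come first because the UTF-32 LE BOM
# (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def to_utf8(data: bytes, fallback: str = "utf-8") -> bytes:
    """Decode data according to its BOM and re-encode it as UTF-8 without a BOM.

    Assumption: when there is no BOM, the caller's fallback encoding is used.
    A UnicodeDecodeError is raised if the bytes do not match the encoding.
    """
    encoding, skip = detect_bom(data)
    text = data[skip:].decode(encoding or fallback)
    return text.encode("utf-8")
```

A caller might read a file in binary mode and pass its bytes to to_utf8. One known ambiguity of this approach: UTF-16 LE text whose first character after the BOM is U+0000 is indistinguishable from UTF-32 LE by the BOM alone.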
A much simpler UTF-8-ness checker uses a regular expression created by the W3C. It returns true if $string is valid UTF-8 and false otherwise.

Files generally indicate their encoding with a file header. For example, a file whose first three bytes are 0xEF, 0xBB, 0xBF is probably a UTF-8 encoded file.

I ask because I think you may have a design issue here. What do you expect from an IDE? Debugging?

If a 10xxxxxx byte occurs alone, i.e. [...]

A simple way to detect whether a file is UTF-8, UTF-16 or UTF-32 is to look at its BOM (this does not work with strings, or with files that have no BOM). The Unicode BOM is U+FEFF, but once encoded it appears as a different byte sequence in each encoding: EF BB BF in UTF-8, FE FF in UTF-16 big endian, FF FE in UTF-16 little endian, 00 00 FE FF in UTF-32 big endian, and FF FE 00 00 in UTF-32 little endian.

I did have a problem with the following iconv conversion.

Posted by: admin, December 15, 2017.

Sometimes mb_detect_encoding is not what you need. When using pdflib, for example, you want to VERIFY the correctness of UTF-8.

The string being detected. encoding_list. strict specifies whether or not to use strict encoding detection.

A function to detect UTF-8 may be useful when mb_detect_encoding is not available, that is, if the function "mb_detect_encoding" does not exist ...

I seriously underestimated the importance of setlocale...

I figure that I should be using the utf8_decode() function when reading the files, but I don't know how to tell which ones need decoding.
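For comparison, a validity check of the kind discussed above (one that verifies UTF-8 rather than guessing an encoding) can be sketched without any regular expression. This is a Python illustration, not the W3C expression or a PHP function from the thread. The name is_valid_utf8 is hypothetical, and the check relies on Python's built-in strict UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8, False otherwise."""
    try:
        # Strict decoding raises on the first malformed sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

With this sketch, a continuation byte (10xxxxxx) that occurs on its own, such as the single byte 0x80, is reported as invalid. The same holds for overlong forms such as 0xC0 0xAF and for Latin-1 text containing bytes like 0xE9 that are not followed by the continuation bytes UTF-8 requires.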