java.io.UTFDataFormatException: 5-byte UTF8 encoding not supported.

I ran into this interesting exception while parsing XML content. On further analysis, similar errors started to show up as well.

Another similar exception is:

java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence

Here, I was parsing an XML string by converting it to a byte array and passing that array to the XML reader as an input stream. I used java.lang.String.getBytes() for the conversion.

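Unfortunately, the value of one node in the XML contained Chinese characters (that is, characters outside the Western character set), and that is what led to the error above. Later, I found that the no-argument getBytes() encodes the string with the platform's default charset, which is not necessarily UTF-8 (on my machine it was a Western encoding), so the bytes handed to the parser can differ from the UTF-8 the parser expects. Using java.lang.String.getBytes("UTF-8") instead solved the issue.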
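
To make the fix concrete, here is a minimal sketch of parsing an XML string through an explicitly UTF-8 encoded byte array. The class name Utf8XmlParseExample, the helper parseRootText, and the sample <name> document are made up for illustration, and the use of the JDK's DocumentBuilder is an assumption, since the post does not say which XML reader was involved. It uses StandardCharsets.UTF_8 (Java 7 and later), which is equivalent to getBytes("UTF-8") but does not throw the checked UnsupportedEncodingException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8XmlParseExample {

    // Hypothetical helper: encodes the XML string explicitly as UTF-8,
    // parses it, and returns the text content of the root element.
    static String parseRootText(String xml) throws Exception {
        // Explicit charset: the result no longer depends on the platform default.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(in);
            return doc.getDocumentElement().getTextContent();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption); \u4e2d\u6587 are two Chinese characters.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>\u4e2d\u6587</name>";
        System.out.println(parseRootText(xml));
    }
}
```

The important part is that the charset used to produce the bytes matches the encoding the parser assumes, which here is the one named in the XML declaration.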

Worth keeping in mind when doing XML programming 🙂

One thought on “java.io.UTFDataFormatException: 5-byte UTF8 encoding not supported.”

  1. Nice post, I didn’t know getBytes behaves like that. This applies to any string, not only XMLs.
