If you go to the documentation for that package that you referred to, you will see it says what I just said. Likewise, the data you get from a socket connection is a stream of bytes; if it is text, then it can be converted into String data using some encoding. You're correct that the default encoding used by Readers and Writers comes from the file.encoding property, but you can use different encodings by using an InputStreamReader or an OutputStreamWriter and specifying the encoding. So a Reader converts those bytes into chars, and a Writer converts chars into bytes. You may not have realized that a file is also an array of bytes. Sometimes people are sloppy and start talking about "UTF-8 strings" when they really have an array of bytes that was encoded using UTF-8, or perhaps a String that was decoded from an array of bytes using UTF-8. The String.getBytes(encoding) method maps from chars to bytes, and the new String(bytes, encoding) constructor maps from bytes to chars. (Before Unicode 4.0 it was simpler: a char was just a Unicode character.)

An encoding is a method of converting between a Java String (which consists of chars) and an array of bytes. Peter is right: all Java Strings are sequences of chars, and all Java chars are UTF-16 code units.

I assumed the format is UTF-X, not implying by this that it is always the case. Another option for converting a String from one encoding to another is to use the Encoder and Decoder classes of the package.

Sorry, but this is all totally incorrect.

The example that I wrote is a way to convert a String from whatever format it is in into ASCII format. So, Peter, how come you say all Strings in Java are UTF-16? The String class provides methods for such purposes, as does the package. Strings are encoded according to each particular environment, and you can just as easily convert a string from one encoding to another. But that does not have anything to do with your application.
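To make the String.getBytes(encoding) and new String(bytes, encoding) mapping discussed in this thread concrete, here is a minimal sketch. The class name EncodingRoundTrip, the helper names encode and decode, and the sample text are made up for illustration; only the standard String and StandardCharsets calls come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingRoundTrip {

    // Encoding: maps the chars of a String to bytes using the given charset.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decoding: maps bytes back to chars using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // four chars; the last one is U+00E9

        // The same String yields different byte arrays under different encodings.
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        System.out.println("chars: " + text.length());              // 4
        System.out.println("UTF-8 bytes: " + utf8.length);          // 5
        System.out.println("ISO-8859-1 bytes: " + latin1.length);   // 4

        // Decoding with the charset that produced the bytes restores the String.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(text)); // true

        // Decoding with a different charset gives a different String:
        // each of the five UTF-8 bytes becomes one char under ISO-8859-1.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length()); // 5
    }
}
```

The String itself never "is" UTF-8 or ISO-8859-1 here; only the byte arrays are. "Converting a String from one encoding to another" therefore really means decoding bytes with one charset and encoding the resulting chars with another.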
The encoding of the Java Strings is determined by the default encoding used by the JVM, declared in the file.encoding property.

A very different matter is the encoding of the Java source files (*.java), which might be UTF. You continue to say that Java Strings are always in UTF-16.
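For the point about Readers and Writers, the sketch below passes an explicit charset to OutputStreamWriter and InputStreamReader, so the result does not depend on the JVM default. The class ExplicitCharsetStreams and its write and read helpers are hypothetical names, and in-memory byte streams stand in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetStreams {

    // Writer side: converts chars into bytes using the given charset.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for a file or socket
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reader side: converts bytes into chars using the given charset.
    static String read(byte[] bytes, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";

        // The byte count depends on the charset given to the Writer.
        System.out.println(write(text, StandardCharsets.UTF_8).length);      // 5
        System.out.println(write(text, StandardCharsets.ISO_8859_1).length); // 4

        // Reading with the same charset restores the original String.
        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        System.out.println(read(utf8, StandardCharsets.UTF_8).equals(text)); // true

        // The default charset only matters when no charset is specified.
        System.out.println("default charset: " + Charset.defaultCharset());
    }
}
```

The String passed to write and returned by read is the same sequence of chars whatever the default charset is; the default only affects conversions where no charset is given.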