What is UTF-8?
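UTF-8 is a variable-length Unicode encoding that is backward compatible with ASCII: every ASCII character (U+0000 to U+007F) is encoded as a single byte with the same value, so old programs that work on bytes (text searching, etc.) keep working with the new format. Characters from U+0080 to U+07FF, which include Greek, Hebrew and Arabic, are encoded in two bytes, and the rest of the Basic Multilingual Plane (U+0800 to U+FFFF) is encoded in three bytes. Standard UTF-8 uses four bytes for characters above U+FFFF. Java has a modified UTF-8 format, used in class files and by DataInput/DataOutput (readUTF and writeUTF). It differs from the standard in two ways. First, it never uses sequences longer than three bytes: a character above U+FFFF is written as a surrogate pair, with each surrogate encoded in three bytes. Second, '\u0000' is encoded in two bytes (0xC0 0x80) instead of one, so an encoded string never contains a zero byte.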
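
The difference between the two formats can be seen by encoding the same strings both ways. The sketch below is only an illustration: the class name ModifiedUtf8Demo and its helper methods are made up for this answer. It assumes that String.getBytes with StandardCharsets.UTF_8 yields standard UTF-8 and that DataOutputStream.writeUTF yields the modified format preceded by a two-byte length, which is how the JDK documents them.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
class ModifiedUtf8Demo {

    // Standard UTF-8, as produced by String.getBytes.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8, as written by DataOutputStream.writeUTF.
    // writeUTF puts a 2-byte length in front of the data; it is dropped here.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        byte[] withLength = buffer.toByteArray();
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    // Formats bytes as space-separated upper-case hex, e.g. "C0 80".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII, NUL, Greek alpha, Hebrew alef, euro sign, and one
        // character above U+FFFF (U+1F600, written as a surrogate pair).
        String[] samples = { "A", "\u0000", "\u03B1", "\u05D0", "\u20AC", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.println(hex(standardUtf8(s)) + "  |  " + hex(modifiedUtf8(s)));
        }
    }
}
```

Only two of the samples should differ between the columns: '\u0000' (00 against C0 80) and the character above U+FFFF (four bytes against two three-byte sequences).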

 
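For example, take the first example from the UTF-8 RFC (see links): the sequence U+0041 U+2262 U+0391 U+002E ("A≢Α.") is encoded as the bytes 41 E2 89 A2 CE 91 2E.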
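
To show where the bytes come from, here is a minimal hand-written encoder for the one-, two- and three-byte forms, applied to the RFC's sample string "A≢Α." (U+0041 U+2262 U+0391 U+002E). It is a sketch under stated assumptions, not how the JDK does it: the class name RfcUtf8Example and the method encodeBmp are invented for this answer, and the method ignores surrogate pairs, so it is only valid for characters up to U+FFFF.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical example class, not part of any library.
class RfcUtf8Example {

    // Encodes each char as standard UTF-8 using the 1-, 2- or 3-byte form.
    // Assumption: the input contains no surrogate pairs.
    static byte[] encodeBmp(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c <= 0x7F) {
                // 0xxxxxxx
                out.write(c);
            } else if (c <= 0x7FF) {
                // 110xxxxx 10xxxxxx
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // 1110xxxx 10xxxxxx 10xxxxxx
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "A\u2262\u0391.";
        byte[] byHand = encodeBmp(sample);
        byte[] byLibrary = sample.getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(byHand));                       // 41 E2 89 A2 CE 91 2E
        System.out.println(Arrays.equals(byHand, byLibrary));  // true
    }
}
```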

 
Further Information
Author of answer: Joris Van den Bogaert


Copyright © 2000-2003 Esus.com - All Rights Reserved 
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. Esus.com is independent of Sun Microsystems, Inc. All other trademarks are the sole property of their respective owners.