I wrote this article after running into a “broken surrogate pair” issue in Java. After investigating, I found that an improper use of substring was the root cause.
- Java encodes strings as sequences of 16-bit units (UTF-16), and a char is one such unit.
- A single Unicode character may need two 16-bit units, known as a surrogate pair.
- How can we cut a string based on character index rather than 16-bit unit index?
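One possible way to approach the last question is sketched below, assuming that "character" means a Unicode code point. The class name `CodePointSubstring` and the helper `substringByCodePoints` are hypothetical names introduced for illustration; `String.offsetByCodePoints`, `String.codePointCount`, and `String.substring` are standard JDK methods. The idea is to translate code point indexes into 16-bit unit indexes before cutting, so a surrogate pair is never split.

```java
public class CodePointSubstring {

    // Hypothetical helper: cuts s using code point indexes instead of char indexes.
    // beginCp is inclusive, endCp is exclusive, both counted in code points.
    static String substringByCodePoints(String s, int beginCp, int endCp) {
        // Convert code point positions to UTF-16 unit positions.
        int begin = s.offsetByCodePoints(0, beginCp);
        int end = s.offsetByCodePoints(begin, endCp - beginCp);
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as the surrogate pair D83D DE00), 'b'
        String s = "a\uD83D\uDE00b";

        // length() counts 16-bit units; codePointCount counts characters.
        System.out.println(s.length());
        System.out.println(s.codePointCount(0, s.length()));

        // Plain substring(0, 2) ends in the middle of the pair,
        // leaving a lone high surrogate at the end of the result.
        String broken = s.substring(0, 2);
        System.out.println(Character.isHighSurrogate(broken.charAt(1)));

        // The code point based cut keeps the pair intact.
        String safe = substringByCodePoints(s, 0, 2);
        System.out.println(safe.equals("a\uD83D\uDE00"));
    }
}
```

Note that this sketch only protects surrogate pairs. Sequences made of several code points, such as a base letter followed by a combining mark, can still be split and would need a different approach.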