The Unicode Consortium announced the release of Unicode 6.1.0 yesterday. The new version adds characters for additional languages in China, other Asian countries and Africa. This version of the standard introduces 732 new characters.
The standard also adds “labels” for character properties, which are meant to help implementers write regular expressions that are easier to read and easier to validate. I admit I know little about these labels at the moment, but I will research and report on them in the future if time allows.
One of the oddities of the new version is the inclusion of 200 emoji variants. This is perhaps the only part of the standard that I just don’t understand. Back in the day when I was more involved in Unicode development, we had a huge effort to unify variants of Chinese characters. We preached that Unicode characters were abstract entities whose glyph renderings were determined by fonts and by the style preferences of developers and apps. Now it appears that the Unicode Consortium has changed its position on this. Or maybe only partially? The addition of 200 emoji “variants” just seems unnecessary, but that’s just my opinion, and I admit that I may not know all the issues that informed the consortium’s decision.
Here are some examples, straight from the announcement, showing just 4 of the 200 new emoji variants:
As the image shows, the “TENT” emoji has two variants: a text style and a more colorful, graphical emoji style. The standard defends these variants by saying that they allow implementations to distinguish preferred display styles. I think that is what fonts are for. Personally, I just don’t think variants are needed, and I think they make things more difficult for applications.
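To make the mechanics a little more concrete, here is a minimal Python sketch of how I understand the variants to be encoded: the base character (TENT is U+26FA) is followed by a variation selector, U+FE0E for text style or U+FE0F for emoji style. The helper names `with_style`, `describe` and `strip_selectors` are my own inventions for illustration, not part of any standard API, and whether you actually see two different glyphs depends entirely on your fonts and renderer.

```python
import unicodedata

TENT = "\u26FA"   # base character: TENT
VS15 = "\uFE0E"   # VARIATION SELECTOR-15, requests text style
VS16 = "\uFE0F"   # VARIATION SELECTOR-16, requests emoji style


def with_style(base, style):
    """Append a variation selector to a base character (hypothetical helper)."""
    selector = {"text": VS15, "emoji": VS16}[style]
    return base + selector


def describe(s):
    """List the code points and character names that make up a string."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


def strip_selectors(s):
    """Drop VS15/VS16 so that styled and unstyled strings compare equal."""
    return "".join(ch for ch in s if ch not in (VS15, VS16))


text_tent = with_style(TENT, "text")
emoji_tent = with_style(TENT, "emoji")

print(describe(text_tent))
print(describe(emoji_tent))

# The two variants are different strings until the selectors are removed.
print(text_tent == emoji_tent)
print(strip_selectors(text_tent) == strip_selectors(emoji_tent))
```

The last two lines hint at why I think variants complicate things for applications: anything that compares, searches or counts characters now has to decide whether a trailing selector matters.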
What do you think about variants in general? And what about emoji variants specifically?