“Mom, Dad, where do emoji come from?” The Unicode Consortium, son

xeni · October 20, 2015, 2:35pm

OtherMichael · October 20, 2015, 2:56pm

Which gives us the following sports:

runner
walking
dancer
rowboat
swimmer
surfer
bath
snowboarder
ski
snowman
bicyclist
mountain_bicyclist
horse_racing
tent
fishing_pole_and_fish
soccer
basketball
football
baseball
tennis
rugby_football
golf
trophy
running_shirt_with_sash
checkered_flag
musical_keyboard
guitar
violin
saxophone
trumpet
musical_note
notes
musical_score
headphones
microphone
performing_arts
ticket
tophat
circus_tent
clapper
art
dart
8ball
bowling
slot_machine
game_die
video_game
flower_playing_cards
black_joker
mahjong
carousel_horse
ferris_wheel
roller_coaster

No rifles, no hand-guns, no target-pistols.

SHOCKED, SHOCKED I AM!

although not as shocked as I will be if this thread manages to remain civil, sane, and on the topic of emojis in general

fuzzyfungus · October 20, 2015, 3:14pm

It’s entirely possible that I’m just a grumpy old man who is busy defending his lawn from kids these days; but it has been really depressing to watch the Unicode Consortium somehow get dragged into the business of being fairly close to the leading edge of the process of spewing out new emoji.

It all started innocently enough: Unicode has always balanced a desire for technical sanity and actually-being-implementable-in-finite-time-by-finite-entities with a desire to get adopted, which requires a certain amount of…tolerance…of various legacy encodings.

ASCII was incorporated as a proper subset for that reason, as were various other encodings in common use, even if they introduced duplicate characters, or involved choices contrary to the preferred Unicode way of doing things (e.g. ligatures and digraphs are supposed to be handled by using the appropriate combination of discrete glyphs, not given their own codepoints; but various legacy encodings had ligatures and digraphs implemented that way, and backward compatibility was needed, so they got codepoints; lesser of two evils).
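For anyone who wants to poke at this, here is a minimal Python sketch using only the standard `unicodedata` module. It assumes U+FB01 (the "fi" ligature) as a representative compatibility character; the variable names are just for illustration.

```python
import unicodedata

# U+FB01 is one of the compatibility ligatures carried over from legacy encodings.
ligature = "\ufb01"
ligature_name = unicodedata.name(ligature)

# NFKC normalization replaces a compatibility character with its plain equivalent,
# here the two separate letters "f" and "i".
decomposed = unicodedata.normalize("NFKC", ligature)

# The 128 ASCII byte values decode to the same text as ASCII and as UTF-8.
ascii_bytes = bytes(range(128))
ascii_is_subset = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

print(ligature_name)
print(list(decomposed))
print(ascii_is_subset)
```

The ligature keeps its own codepoint for round-tripping old data, while normalization gives software a way to treat it as the ordinary letter sequence.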

In the case of emoji, the Japanese handset market was unbelievably dysfunctional. A bunch of emoji were floating around; but encoding could differ between carriers, between handset vendors, possibly even between different combinations of the two. Implementing translation layers so that messages could pass between users on different handsets or different networks was bad enough; and potential foreign entrants to the market were loath to touch such a quagmire.

So, the Unicode Consortium was called in and, as with other legacy encoding messes, just did what had to be done. All the emoji were lined up, any duplicates culled, and the remainder assigned code points. Ugly, completely idiosyncratic, and based on nothing except the inertia of certain twee little pictures in the Japanese text messaging market; but it was the closest thing to a clean break that could be arranged, and at least cauterized the oozing pustule that was the prior encoding arrangement.

Then things started to go bad: In Apple’s default system font, ‘smiley face’ was yellow. Accusations of racism arose. Apple (cynically and dishonestly) claimed that Unicode was at fault when, in fact, all Unicode did was specify that a given code point was ‘smiley face’ and offered no further clarification or specificity. Now it seems like everyone with some awful bit of clip-art wants their own codepoint; at the same time as a number of actual natural languages remain unincorporated or ill supported.
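To illustrate the "code point plus a name, nothing more" point, here is a small Python sketch. The two code points are my own picks for illustration (U+263A and U+1F600), not necessarily the ones at issue in the Apple dispute.

```python
import unicodedata

# Two face characters chosen purely as examples.
faces = ["\u263a", "\U0001F600"]

# The character database records a code point and a formal name;
# how the character is drawn, including its color, is left to the font.
names = {f"U+{ord(ch):04X}": unicodedata.name(ch) for ch in faces}

for code_point, name in names.items():
    print(code_point, name)
```

Nothing in that lookup says anything about rendering, which is the point being made above.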

The Unicode Consortium has always had to deal with idiosyncratic and historically contingent situations (they are attempting to tackle natural language, after all); but mayhem was, somewhat, mitigated by the fact that they were in the business of either absorbing legacy encodings that had already been market-tested and become entrenched, where those existed, or designing encodings in consultation with the relevant experts in the case of languages without active IT markets and legacy standards (whether because they are dead, alive but spoken by people without much IT in use, or whatever).

Now, they appear to be the place where every last idiotic proposal gets made first, without first undergoing proof and refinement by real world use. As best I can tell, they aren’t equipped for that. The task of describing the world’s characters is vast enough; but at least you can approach it empirically. The task of incorporating random images thrown at you is unbounded; and largely without any criteria for guiding inclusion and exclusion. When you simply describe the world, ‘Well, do people use it?’ is all you need to know. Once you abandon that criterion, how do you distinguish between emoji that just have to get into the next revision and ones that are pointless?

anon33466019 · October 20, 2015, 3:26pm

As an engineer, there are two types of problems that give me night sweats: time and character encoding. Both appear deceptively easy, but are the product of millions of man-hours gettin’ it wrong.

Boundegar · October 20, 2015, 3:30pm

Wow. As a civilian, I think I can say this doesn’t effect me in the slightest.

anon33466019 · October 20, 2015, 3:39pm

Consider yourself lucky. Character encoding sniffing is a darker art than anything Voldemort ever practiced. Writing Cyrillic to English web scraping bots for hostile websites still makes me want to pop a few Xanax.
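A rough Python sketch of why sniffing is so miserable, assuming a page that was actually saved as Windows-1251 (`cp1251`). The sample word and the `try_decodings` helper are made up for illustration; real detection needs statistics, not just a try/except.

```python
def try_decodings(raw, encodings=("utf-8", "koi8-r", "cp1251")):
    """Decode the same bytes under several candidate encodings.

    Returns a dict mapping each encoding to the decoded text,
    or to None when the bytes are not valid in that encoding.
    """
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


original = "Привет"
raw = original.encode("cp1251")  # what the server actually sent
results = try_decodings(raw)

for enc, text in results.items():
    print(enc, repr(text))
```

UTF-8 rejects the bytes outright, which is the easy case. KOI8-R decodes them without complaint into different Cyrillic letters, so "it decoded without an error" tells you nothing about whether the guess was right.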

fuzzyfungus · October 20, 2015, 3:44pm

effective.
Power
لُلُصّبُلُلصّبُررً ॣ ॣh ॣ ॣ
冗

anon33466019 · October 20, 2015, 3:45pm

snort
My phone gets the encoding wrong.

OtherMichael · October 20, 2015, 9:07pm

Castle Doctrine FTW

McGreens · October 21, 2015, 1:04pm

Taking a bath is a sport?!

Also, I wasn’t the only one to read the headline as “The Unicorn Consortium” was I?

OtherMichael · October 21, 2015, 1:23pm

So is wearing/listening-to a pair of headphones. YMMV

xeni · October 25, 2015, 2:35pm

This topic was automatically closed after 5 days. New replies are no longer allowed.
