Unicode and the Emoji Gender Gap

In the past week, news spread quickly[1] that Google recently proposed a new set of female emojis.

Astute observers of Unicode may have noticed that this is not the first proposal to look at addressing the gender imbalances in emoji.

Just as 2015 saw the introduction of emoji modifiers for skin tone, 2016 may be the year we start seeing more active roles for women in emoji.


Above: Skin tone support was added in 2015 and gender is being addressed in 2016.

Much of the detail of how this may happen is scattered across multiple documents, so this is an attempt to summarize what's been proposed to date.

Unicode 9.0

The first step toward a more inclusive emoji set came what seems like an age ago now: mid-2015.

Some of the emoji candidates for Unicode 9.0 included gender pairs for explicitly male or female characters.

This resulted in the following new emojis:


Images: Apple iOS 9.3 + Emojipedia Sample Image Collection.

All of these are included in Unicode 9.0, which is due for release in June 2016.

Another emoji that somewhat fits the inclusiveness 'theme' is the pregnant woman. She is also due in June 2016.

This gender-pair matching was one of the first efforts to ensure that emojis with an explicit gender (either by name, or by common implementation) had versions available for male and female appearances.

Tag Sequences

Fast forward to February 2016, and a draft specification proposed a method of allowing emojis to be turned male, female, or gender-neutral.

This document, referred to as TR-52, was created by Mark Davis of Google and Peter Edberg of Apple and addressed more than just gender, but that's the part we'll look at here.

From TR-52 itself:

This document provides a new way of representing customizations of Unicode emoji characters. The first specified customizations provide for...gender variants (such as female runners or males raising a hand)

This proposal built on the example of the female runner first proposed by US athlete Molly Huddle, and expanded it into a method of modifying other existing emojis to be male or female too.

The functionality is similar[2] to how modifiers or ZWJ sequences work, and looks something like this:

Base Emoji (eg 🏃) + Gender Tag + Female Attribute = Female Runner Emoji

These Tag Sequences, it was noted, could also apply to hair color (Hair Color Tag + Color Attribute) and direction (Direction Tag + Left or Right attribute)[3].

In short: an extensible way to allow a range of customisations in future, should additional tags be approved.
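As a rough illustration of the tag + attribute idea, the sketch below models a base emoji followed by any number of (tag, attribute) pairs. It is only a conceptual model: the class name `TagSequence`, the `with_tag` method, and the tag and attribute labels are hypothetical, and no real TR-52 code points are used, since the draft's encoding details aren't covered here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TagSequence:
    """Hypothetical model: a base emoji plus (tag, attribute) customisations."""
    base: str
    customizations: Tuple[Tuple[str, str], ...] = ()

    def with_tag(self, tag: str, attribute: str) -> "TagSequence":
        # Return a new sequence with one more customisation appended.
        return TagSequence(self.base, self.customizations + ((tag, attribute),))

    def describe(self) -> str:
        # Human-readable form only; this is not an encoded Unicode string.
        parts = [self.base] + [f"{tag}={attr}" for tag, attr in self.customizations]
        return " + ".join(parts)


runner = TagSequence("\U0001F3C3")  # 🏃 RUNNER
female_runner = runner.with_tag("gender", "female")
# Labels below are placeholders for the hair color and direction tags.
customised = female_runner.with_tag("hair-color", "red").with_tag("direction", "left")

print(female_runner.describe())
print(customised.describe())
```

The point of the model is the extensibility: adding a new kind of customisation means adding a new tag, not a new emoji character.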

And then…

Expanding Emoji Professions

At the Unicode Technical Committee Meeting #147 last week, a brand new proposal came forward from four Google employees: Rachel Been, Nicole Bleuel, Agustin Fonts[4], and Mark Davis.

Expanding Emoji Professions: Reducing Gender Inequality is the name of the proposal, and the goal is stated:

"to create a new set of emoji that represent a wide range of professions for women and men with a goal of highlighting the diversity of women’s careers and empowering girls everywhere"

Amongst the proposed characters from Google are new roles and professions, with a focus on those employing women: nurses, teachers, scientists, and more.


Above: Sample images proposed by Google.

The previous TR-52 proposal, while making the current emoji set more representative, didn't cover cases where there is no existing male (or female) emoji to modify.

This proposal seeks to resolve that.

Multiple ways to implement these characters are presented, but the proposal ultimately recommends using ZWJ sequences to build these emojis, in the same way family variations are created.


Above: Zero Width Joiner Sequences are the proposed method for creating new emoji professions.

Using this approach, a woman (or man) emoji is combined with an object emoji to create a new emoji for each profession:

👩 Woman + 🔧 Wrench = Female Mechanic

👨 Man + 🏫 School = Male Teacher

👩 Woman + 🏥 Hospital = Female Nurse
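The pairings above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sequences are formed exactly as the examples suggest (person, then U+200D ZERO WIDTH JOINER, then object). The helper names `zwj_sequence` and `describe` and the dictionary labels are made up for this sketch, and whether a sequence renders as a single glyph depends entirely on font and platform support.

```python
import unicodedata

ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER


def zwj_sequence(*parts: str) -> str:
    """Join emoji characters with ZWJ; supporting fonts may draw one glyph."""
    return ZWJ.join(parts)


def describe(sequence: str) -> list:
    """List every code point in the sequence as 'U+XXXX NAME'."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in sequence
    ]


# Pairings copied from the examples above; the labels are illustrative only.
professions = {
    "female mechanic": zwj_sequence("\U0001F469", "\U0001F527"),  # woman + wrench
    "male teacher": zwj_sequence("\U0001F468", "\U0001F3EB"),     # man + school
    "female nurse": zwj_sequence("\U0001F469", "\U0001F3E5"),     # woman + hospital
}

for label, seq in professions.items():
    print(label, seq, describe(seq))
```

On a platform without support for a given sequence, the two component emojis simply appear side by side, which is part of why this approach degrades gracefully.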

Now hang on a minute. Why not just create new characters for each profession, and then use the aforementioned tag sequences from TR-52?

Speed.

Path to availability

If new emojis were agreed upon today for mechanic, teacher, or nurse, the earliest they would be included in the Unicode Standard is mid-2017, as part of Unicode 10.

Further to this, to ensure both male and female versions of any new emoji were available, TR-52 (emoji tag sequences) would also need to be approved in a similar timeframe.

Yet TR-52 is still considered a draft, and Unicode 10 remains off in the distance.

Now what?

To recap:

  • Four gender-pair emojis coming in Unicode 9 are here to stay. In retrospect, they may have been better served by a later proposal[5] rather than being encoded with their own codepoints

Two methods for increasing emoji gender diversity remain:

  • Emoji Tag Sequences (TR-52), which can apply to more than just gender but do not address new professions
  • A set of ZWJ sequences to address a lack of representative professions (Expanding Emoji Professions)

Generally speaking, any approach which requires fewer new codepoints or less new functionality is simpler to implement. Here's what has been achieved with few new codepoints in the past year:

  • Skin tones: addressed by five modifier characters
  • Inclusive families: addressed by a single ZWJ character (which already existed!)

Total new emoji images: 299
Total new codepoints: 5


Above: 299 new emoji images have been introduced with only five new codepoints.
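To see why so few codepoints go so far, here is a small sketch of how the five skin tone modifiers (U+1F3FB to U+1F3FF) multiply the available images: each supported base emoji gains five variants without any new base characters. The two base emojis and the helper name `with_skin_tones` are arbitrary choices for illustration, and the sketch does not attempt to reproduce the 299 figure, whose breakdown isn't given here.

```python
# The five emoji modifier characters, U+1F3FB through U+1F3FF.
SKIN_TONE_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]


def with_skin_tones(base: str) -> list:
    """Return the base emoji followed by each of the five modifiers."""
    return [base + modifier for modifier in SKIN_TONE_MODIFIERS]


# Two example bases, chosen arbitrarily: waving hand and thumbs up.
bases = ["\U0001F44B", "\U0001F44D"]
variants = [v for base in bases for v in with_skin_tones(base)]

# Five shared codepoints yield five new images per supported base emoji.
print(len(SKIN_TONE_MODIFIERS), "modifiers ->", len(variants), "variants")
for v in variants:
    print(v, [f"U+{ord(ch):04X}" for ch in v])
```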

Given this reality, it seems that Zero Width Joiners may be more pragmatic if the goal is to see women represented on keyboards sooner than 18 months from now.

Google explicitly mentions in their Expanding Emoji Professions proposal that ZWJ sequences were chosen as their preferred technique due to:

  • No need to encode new characters
  • Faster path to public availability

Culture v Implementation

Part of what can be frustrating about emoji progress stems from the way that cultural concerns and technical implementation details couldn’t be further from one another.

Agreeing in principle that emojis should be more inclusive[6] doesn’t magically get us over the line on how that idea should be implemented.

It's clear that both proposals on gender have been well received globally.

Now it's time to get the details sorted on how these two methods can work together, or which is to be chosen as the path forward.



  1. Stephen Colbert loves his Unicode. ↩︎

  2. Similar, but not the same. New functionality would still be required. ↩︎

  3. Bundled into the same technical report was a method of encoding subregion flags, which didn’t fit the same tag + attribute model. ↩︎

  4. Yes, that's his real surname. ↩︎

  5. And in retrospect I also wouldn’t have had that cheesecake just before bed the other night, but we all live and learn. ↩︎

  6. Which I completely agree with. We do need better representations of women in emoji. ↩︎