Capturing images of diverse groups of people can be about more than who’s in front of the camera—certain skin color–based distortions have shaped the technology of photography throughout its entire history.
That’s the legacy that Dominique Mungin wrestles with in her role as Google’s Senior PgM of Product Inclusion and Equity. She’s worked on projects like the Monk Skin Tone Scale, a 10-shade spectrum developed in partnership with Harvard professor and sociologist Ellis Monk, and Google’s Real Tone tech, which aims to ensure cameras and image editing in Google products better capture darker skin.
Now, as Google and other companies push further into image generation and recognition AI, Mungin is trying to make sure societal biases aren’t similarly baked into the next generations of visual technology. Her team collaborated with stock photography company Tonl to supply more diverse imagery for training machine learning models. And the project more recently expanded to partner with Chronicon and Rampd to “source custom images featuring and centering individuals with chronic conditions and disabilities.”
Still, challenges persist. An entire subfield of AI is devoted to understanding and rectifying the ways the biases of AI’s creators and society at large play out in the technology, and Google has had its own problems in that area.
We talked to Mungin about her goals at Google, how biases have defined image technology, and the challenges of making AI inclusive and representative.
This conversation has been edited for length and clarity.
You’ve been at Google for 13 years in a number of different positions across the company. How did you get into your current role?
My journey at Google started in HR. One of the roles I held for a number of years was in our learning and development organization, where I started in diversity and inclusion-related training that evolved over the years into a focus on racial equity training. My passion, interest, and expertise in equity as a topic and as a measurable, achievable goal grew from there. I worked very regularly with the Product Inclusion folks to quantify and talk about what equity means when it comes to product, and I was able to move over to this team about a year and a half ago to work full-time on product equity.
Would you say image equity is something that the tech industry has been overlooking for a long time? What kinds of opportunities for reaching and including more people are there as these issues are addressed?
The development of commonly used technologies goes back further than the last couple generations of a product. There are long, long histories of development that impact the tech that we hold in our hands today and the tech that we use on a regular basis. So, for example, when we're using the camera on a phone, the history and the product development of the camera began in the 1800s. And so acknowledging and understanding how these technologies have evolved, and who and what use cases they evolved for over the majority of their history, is really crucial to understanding the current limitations, as well as the opportunities, of those technologies as we know them today.
Real Tone is a great example of how identifying historical limitations, and how those limitations have regularly shown up as technological failures, has allowed us to create a new technology that really progresses image equity. And so I think there's a lot more landscape to be covered. But it's really exciting to see that historical arc acknowledged and see people energized by the opportunity it presents.
How has the Monk Skin Tone scale evolved since it was first announced last year? What kind of influence has it had inside and outside of Google?
When talking about the Monk Skin Tone scale, I like to take a step back and talk about the core issue that it sought, and still seeks, to address. There really isn't yet an industry standard for how to identify and capture skin tone when it comes to computer vision and machine learning models. The most commonly used scale is a six-point dermatological scale developed in the '70s, and that scale was really focused on the UV spectrum and UV sensitivity. As a result, it skews lighter. So, acknowledging that, there was a search for a more representative scale, and one that translated well and could be used in machine learning models.
This 10-point Monk Skin Tone scale is really designed to represent a broader range of communities. And Google Research actually looked at and found that this particular scale was more useful for machine learning purposes, with participants finding it both more representative than the six-point scale and more inclusive. And so I bring it up because I think it’s a very exciting move forward. And I think it really begins to change not only how we think about skin tone and quantifying fairness, but also how we share with the industry.
One of the newer things that we’ve done [is] the Monk Skin Tone examples dataset. And that dataset actually was created in partnership with a stock photography company called Tonl, and it really seeks to share with the community and with other organizations ways to help train annotators, so that AI models more broadly can be more representative and understand the diversity of skin tone that’s out there.
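Mungin doesn't get into the mechanics in the interview, but the basic idea behind using a skin tone scale to quantify fairness can be sketched in a few lines: annotate each evaluation image with a Monk Skin Tone (MST) value from 1 to 10, then break a model's metrics out by tone rather than reporting a single aggregate number, so that gaps across tones become visible. The Python below is an illustrative sketch only, not Google's methodology; the data format and field names (mst, prediction, label) are hypothetical.

```python
from collections import defaultdict

def per_tone_accuracy(examples):
    """Group evaluation examples by their annotated MST value (1-10) and
    report accuracy within each bucket, instead of one aggregate score."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        tone = ex["mst"]  # annotator-assigned Monk Skin Tone value, 1..10
        total[tone] += 1
        if ex["prediction"] == ex["label"]:
            correct[tone] += 1
    return {tone: correct[tone] / total[tone] for tone in sorted(total)}

# A toy example: a model that looks fine overall can still underperform
# on darker skin tones once results are broken out by MST bucket.
examples = [
    {"mst": 2, "prediction": "face", "label": "face"},
    {"mst": 2, "prediction": "face", "label": "face"},
    {"mst": 9, "prediction": "no_face", "label": "face"},
    {"mst": 9, "prediction": "face", "label": "face"},
]
print(per_tone_accuracy(examples))  # {2: 1.0, 9: 0.5}
```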
What kinds of challenges have there been in equity in training data as Google moves further into image AI?
There are definitely challenges in creating diverse datasets, in finding imagery and videos, and in making sure that they're globally representative. The world is pretty big. Finding a large representation of not only tone but style, hair texture, hair type, what folks may use as jewelry or other accessories, and doing so in a way that is culturally sensitive, requires intention. I think it's something that we're paying attention to and approaching with intention. But as technology evolves, the types of datasets that you need will also continually evolve. So from basic photos and images, to video, to whatever unknown, unseen technology comes next, the datasets will have to continue to evolve with them.