This is the first piece in our Demystifying Algorithms series.
Five years ago, the world of artificial intelligence—and the algorithms it runs on—looked very different.
Asking your Google Home to play Adele’s chart-topping single wasn’t possible yet. IBM Watson was still widely considered a beacon for AI advancement, and DeepMind's AI victory over a human at Go was still fresh. Machine learning engineers were facing earlier versions of today’s image classification and speech recognition challenges. And though most tech giants hadn’t earmarked corporate funding for ethical AI, the conversation was becoming more mainstream as the impact of algorithms on human lives became clearer.
Behind each of these algorithmic advancements and challenges is a series of human judgment calls. Though that fact sometimes gets lost in the language we use, "the algorithm" is more an extension of human thinking than it is some sci-fi invention. That’s why we asked leading experts across research, business, and academia for their takes on what the field will face in the near future.
The question we posed: "What's the single biggest challenge that those building and working on algorithms will need to grapple with in the next five years?"
Read on for their predictions.
These interviews have been edited for length and clarity.
Having the hard conversations. People started using algorithms to avoid difficult conversations. In the last 10 or 15 years, big data, algorithms, and AI—which are all the same thing for me: automated decisions and predictions—have been useful mechanisms to do that. In the next five years, we’ll have to start having those difficult conversations, and there will be friction between those who want to and those who don’t.
Who deserves a loan? Who deserves a spot in a homeless shelter? When is a teacher a “good” teacher? Right now, all of those questions are being answered automatically by algorithms, but not for any particularly good reason—and not according to any ethics board’s value system.
As automatic decision-making mechanisms, algorithms are relatively amenable to scrutiny: You can test them for how they behave under certain circumstances (whether they favor the rich, for example). It’ll be difficult for people to claim that their algorithms behave according to some principle if they can be shown not to by a simple test.
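To make that concrete, here is a minimal sketch in Python of the kind of behavioral test described above. The scoring function, thresholds, and applicant profiles are entirely hypothetical stand-ins for whatever black-box system is being audited.

```python
# Minimal sketch of a behavioral test for an opaque scoring algorithm.
# `score_applicant` is a hypothetical stand-in for the black-box model under
# scrutiny; profiles and thresholds are illustrative, not a real lending system.

def score_applicant(income, credit_history_years, debt_ratio):
    # Placeholder model: in practice, this is the system being audited.
    return (0.4 * min(income / 100_000, 1)
            + 0.4 * min(credit_history_years / 10, 1)
            - 0.2 * debt_ratio)

def approval_rate(applicants, threshold=0.5):
    scores = [score_applicant(**a) for a in applicants]
    return sum(s >= threshold for s in scores) / len(scores)

# Matched pairs: identical credit profiles, only income differs.
base_profiles = [{"credit_history_years": y, "debt_ratio": d}
                 for y in (2, 5, 8) for d in (0.1, 0.3, 0.5)]
low_income = [{"income": 30_000, **p} for p in base_profiles]
high_income = [{"income": 150_000, **p} for p in base_profiles]

gap = approval_rate(high_income) - approval_rate(low_income)
print(f"Approval-rate gap (high vs. low income): {gap:.2%}")
# A large gap on otherwise-identical profiles suggests the algorithm favors the rich.
```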
People who own algorithms have held the advantage in this argument, since they can shield their algorithms behind IP laws and other kinds of ownership rights. But it’s going to be harder for them to do that as people start learning how to ask the questions that need to be asked.
Bridging the gap between proof-of-concept and deployment. AI is creating tremendous value in the consumer internet, but in other industries, the number of valuable use cases is still in an earlier stage of growth. Despite many “demos” and proof-of-concept implementations, getting an AI system deployed in a factory, hospital, or farm today is still a labor of love.
Just as software engineering has evolved from wonderful creations built through the heroic efforts of a handful of individuals into reliable and valuable artifacts built by teams, AI also needs to build out the tools, processes, and mindsets that turn the creation and deployment of AI systems into a systematic and repeatable process.
With this shift, building and deploying AI will become something that many teams can reliably succeed at, while also making sure that AI systems across the board are safe, set up to perform well, and reasonably free from bias.
Representation in algorithm development. A critical (and overdue) challenge will be broadening participation in this area. Machine learning, computer science, statistics, and related areas have been foundational to algorithm development—which now touches on nearly every field, from finance to healthcare. Algorithm developers must meaningfully work with experts from a wide range of disciplines, including non-technical disciplines, throughout the development pipeline.
While algorithmic technologies are developed rapidly and are perpetually new, the problems they are applied to often aren’t. Consulting non-technical experts at the end of the development process—if at all—ignores input that we need to develop algorithms ethically.
Scholars in fields like history and critical race studies have poignantly raised issues with emerging technologies that echo past societal impacts of technological successes and failures. Social scientists offer expertise in the social systems that interact with technical systems, shedding light on the suitability of algorithmic technologies and their underlying data for different contexts. Stakeholders, too, are experts. They have firsthand experience with how systems operate in their communities.
As a field, we must work alongside those with complementary expertise in order to develop ethical technologies. In essence, problems that are not purely technical require solutions that are not purely technical.
Trust. Whenever we imbue machines with decision-making capabilities, we need to understand how we can trust those decisions and guard against unintended consequences.
There are at least four major considerations that go into trusting an algorithm. First, the decisions must be fair and unbiased. Second, algorithmic decisions should be transparent and explainable. Third, the algorithm must be robust—both against unexpected naturally-occurring inputs and against inputs that have been specifically designed to fool the system. Finally, we should only trust in algorithms when humans are willing to take ultimate responsibility for their impact.
While not inherent to the algorithm itself, the notion of responsibility and accountability is critical to whether we should trust its outputs. It’s far too easy to blame “the algorithm” when something goes wrong, but ultimately, humans must bear the responsibility for these decisions, and they must understand where the potential pitfalls lie.
Fairness. That’s what keeps me up at night. All humans have biases, which can easily creep into AI systems. These biases are hard to detect, and they shift as society evolves, making it impossible to guarantee that any AI system is bias-free. So we need to focus on mitigating fairness issues as much as possible.
Empowering teams diverse in backgrounds and perspectives, and including stakeholders from outside the tech industry, are critical first steps. We also need to invest in data collection efforts so that data sets—and the systems they train—reflect the diversity of our society. Testing needs to be done in the real world, in the context in which an AI system will operate and on an ongoing basis.
Finally, because fairness is deeply contextual—and because there is no single definition of it that applies equally well to all AI systems—we will often need to make hard decisions based on competing priorities, including decisions to not build or deploy a system for certain purposes.
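One way to see how contextual fairness is: two widely used statistical definitions can disagree about the very same model. The small Python sketch below uses made-up predictions and labels purely for illustration; demographic parity flags a gap while equal opportunity does not.

```python
# Sketch: two common fairness metrics can disagree on the same predictions.
# All data below is made up purely for illustration.

def demographic_parity_gap(preds_a, preds_b):
    # Difference in positive-prediction rates between groups A and B.
    return abs(sum(preds_a) / len(preds_a) - sum(preds_b) / len(preds_b))

def equal_opportunity_gap(preds_a, labels_a, preds_b, labels_b):
    # Difference in true-positive rates (approval rate among qualified individuals).
    def tpr(preds, labels):
        positives = [p for p, y in zip(preds, labels) if y == 1]
        return sum(positives) / len(positives)
    return abs(tpr(preds_a, labels_a) - tpr(preds_b, labels_b))

# Group A: 4 of 8 predicted positive; all 4 qualified members are approved.
preds_a, labels_a = [1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0]
# Group B: 2 of 8 predicted positive; all 2 qualified members are approved.
preds_b, labels_b = [1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0]

print("Demographic parity gap:", demographic_parity_gap(preds_a, preds_b))  # 0.25 -> looks unfair
print("Equal opportunity gap:",
      equal_opportunity_gap(preds_a, labels_a, preds_b, labels_b))          # 0.0  -> looks fair
```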
Simulation for scale. There’s never been a more exciting time to be in autonomous driving, but over the next five years, the biggest challenge will be to continue evolving these systems to handle an ever-broader range of scenarios—which is a prerequisite for deployment at scale. This includes the ability to robustly handle a range of challenging weather conditions, as well as different driving patterns across multiple cities and countries.
Much of our machine learning work is geared to meeting that goal of scaling. To achieve that, machine learning systems need to robustly model the range of situations that occur in real-world driving—including unlikely ones—and make effective use of the significant driving experience we’ve gathered and simulated over the last decade.
In order to help ensure our system will act safely and effectively in the messiness of the real world, evaluation methods must be rigorous, and simulation is a key pillar. Being able to simulate driving scenes and agents in a very realistic way is essential to quick and scalable progress—and it’s another area where machine learning is vital.
Systemic racism. It’s in our training data and our models, and it must be taken into account in how the resulting AI systems are applied across society. Besides the myriad problems with facial recognition and predictive policing, we find racial bias in AI-informed medical care decisions, hiring recommendations, access to housing and social programs, visa application approvals, school exam results, hate speech detection, dynamic pricing algorithms for ride hailing services, and even dating apps.
So how do you “solve for” automated racism? It requires both societal and technical efforts. For one, we need meaningful regulation to identify occurrences of racial bias, offer recourse and remediation to its victims, and make changes. Researchers and engineers have to understand bias and fairness measures and how to apply them.
But applications of algorithmic fairness or bias definitions do not work equally across all groups, cultures, or countries and can even make things worse. This is an emerging field with no agreed-upon standards or methods. Like racism in our society, racism in AI will not be a quick, clear-cut, or easy fix.
Large-scale understanding of individual users. Machine learning relies on scale, and its commercial applications today are still too often limited by the cost and complexity of building real-time data systems that can also be personalized to individual users. All in all, ML inherently benefits from platform dynamics: One infrastructure business powering a large user base can solve these problems for every user.
Stripe's Radar ML technologies, for instance, have previously seen 89% of all credit cards, which helps us gauge the differences between legitimate and fraudulent activity—such as how many credit cards we’ve seen from a given IP address. That, in turn, helps reveal which IP addresses are currently being used by fraudsters.
With the right network economics, these scale dynamics could help tackle any problem where progress is defined by systems' understanding of individual users.
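As a rough illustration of how such a scale-derived signal might be computed, here is a simplified Python sketch (not Stripe's actual Radar implementation) that aggregates transaction logs into a per-IP count of distinct cards and flags outliers:

```python
# Simplified sketch of a scale-derived fraud signal: how many distinct cards
# has a given IP address used recently? Illustrative only; not Stripe's
# actual Radar implementation, and the threshold below is hypothetical.
from collections import defaultdict

def distinct_cards_per_ip(transactions):
    """Map each IP address to the number of distinct card fingerprints seen from it."""
    cards_by_ip = defaultdict(set)
    for tx in transactions:
        cards_by_ip[tx["ip"]].add(tx["card_fingerprint"])
    return {ip: len(cards) for ip, cards in cards_by_ip.items()}

def suspicious_ips(transactions, max_distinct_cards=5):
    """Flag IPs whose distinct-card count exceeds a chosen threshold."""
    counts = distinct_cards_per_ip(transactions)
    return {ip for ip, n in counts.items() if n > max_distinct_cards}

# Toy usage: one IP cycling through many cards stands out immediately.
transactions = (
    [{"ip": "203.0.113.7", "card_fingerprint": f"card_{i}"} for i in range(12)]
    + [{"ip": "198.51.100.2", "card_fingerprint": "card_x"}] * 3
)
print(suspicious_ips(transactions))  # {'203.0.113.7'}
```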
Taking responsibility and systematic solutions. As computing impacts every aspect of society, our single biggest challenge will be accepting the responsibility that comes with that impact. We already see how easy it is to create algorithms that discriminate in hiring, lending, criminal justice, and more.
To fix this problem, we must change as a field. First, we must build systems that account for all those who are affected by our work. This accounting is a technical problem as well as a social one: we cannot treat minority groups as exceptions and edge cases; rather, our algorithms must be flexible enough to respond appropriately. Second, to facilitate these changes, we must bring more diverse voices into computing. Our invisibility problem is not just that we do not see those who are standing in front of us; it is also that we do not notice those who are not standing in front of us. That blindness leads us to be, well, blind to our problems.
Finally, we need to move away from discussions about where specific products or algorithms go wrong, and start discussing systematic solutions that involve every point in the development pipeline. The work is hard, but we must get it right.