In northeast England, halfway between Norfolk and Yorkshire, an AI-powered robot spends its days looking at strawberries. It’s not as easy as it sounds.
A human farmer can gauge a strawberry’s ripeness level by sight and weight, but the process involves putting each strawberry on a scale, which can be destructive and time-consuming. The robot can do the same job for up to 4 million strawberries a day by simply scanning the fruit and leaving it undisturbed.
FruitCast, the agricultural AI startup behind the robots, taught its bots how to do their jobs with data from V7 Labs, a London-based startup that helps AI companies automate the training-data process for models. Training can be one of the most labor-intensive parts of getting an AI system off the ground, since it often calls for not only time and resources, but also vetted and relevant data.
“The robots are kind of stupid until you put the intelligence on them,” Raymond Tunstill, CTO of FruitCast, which was spun off from the University of Lincoln’s food-tech institute, told Emerging Tech Brew. He added, “It’s all about taking examples from the real world—is it a ripe strawberry, or is it unripe—and showing that to our neural networks so that the neural networks can, essentially, learn. And without V7, we never would’ve been able to classify [them].”
Since its 2018 debut, V7 has used its computer vision platform to train AI models to identify everything from lame cows to grapevine bunches, depending on the client’s needs. In 2020, V7 raised a seed round totaling $10 million. So far, its clients include more than 300 AI companies, as well as academic institutions like Stanford, MIT, and Harvard.
“The secret behind V7 is this system that we call AutoAnnotate,” the startup’s CEO Alberto Rizzoli told us. He and his cofounder, Simon Edwardsson, thought it up based on obstacles encountered in their previous business venture: Aipoly, a computer-vision startup that allowed blind users to identify objects using their phone cameras. Though the software worked “decently well,” Rizzoli recalled, “training data was the really difficult part to create.”
So they created AutoAnnotate, a general-purpose AI model for computer vision. When a client comes to V7 with training data (images or videos they’d like an AI model to learn from), V7 detects the boundaries of each object in each frame, such as individual strawberries, and then uses AutoAnnotate to label them. According to V7’s internal measurements, labeling a high-quality piece of training data could take a human up to 2 minutes, said Rizzoli, compared to about 2.5 seconds for AutoAnnotate.
To create that training data, V7’s model starts off with a “continual learning” approach. That could begin with subject-matter experts in, say, horticulture, drawing boxes around the fruit in each image and classifying each one by ripeness level (e.g., a “level-3 strawberry”). The experts then either accept or correct each of the model’s attempts to do the same.
After about 100 human-guided examples, a model is able to make relatively confident classifications, so it transitions into what Rizzoli calls a “co-pilot approach”—for any given choice, the AI provides its confidence score and the human makes corrections.
“Because it’s training data, we always have a human verify it, but it becomes a faster process,” Rizzoli said. Later, he added, “When they find something that is low-confidence, they fix it, otherwise it can go into the knowledge of the model—of the training set.”
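To make the co-pilot idea concrete, here is a minimal Python sketch of how predictions might be split by confidence so a human reviewer can quickly confirm the confident ones and fix the rest. It is an illustration only, not V7’s code: the `Prediction` class, the `split_for_review` function, and the 0.8 cutoff are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff: predictions below it are flagged for correction.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    image_id: str
    label: str         # e.g. "level-3 strawberry"
    confidence: float  # the model's confidence score, from 0.0 to 1.0


def split_for_review(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Split predictions into ones to confirm quickly and ones likely to need a fix."""
    quick_check = [p for p in predictions if p.confidence >= threshold]
    needs_fix = [p for p in predictions if p.confidence < threshold]
    # Put the least confident predictions at the front of the reviewer's queue.
    needs_fix.sort(key=lambda p: p.confidence)
    return quick_check, needs_fix


predictions = [
    Prediction("img-001", "level-3 strawberry", 0.95),
    Prediction("img-002", "level-1 strawberry", 0.42),
    Prediction("img-003", "level-2 strawberry", 0.71),
]
quick_check, needs_fix = split_for_review(predictions)
print([p.image_id for p in quick_check])  # ['img-001']
print([p.image_id for p in needs_fix])    # ['img-002', 'img-003']
```

In this sketch a human still sees every item, in keeping with Rizzoli’s point that training data is always verified. The confidence score only decides which items are likely to need more than a quick confirmation.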
The company finds human experts via a network of business process outsourcing companies, agencies, and consultants, which Rizzoli claims can find a group of labelers on most topics within 48 hours.
Think of it like sending your pup to dog training camp and still having responsibilities upon its return. When a customer develops their fully trained model through V7, they’ll still need to keep an eye on it and correct any glaring mistakes, but it should, in theory, be much more capable than before. For example, a newly trained model may be well equipped to detect strawberry ripeness levels, but if it’s somehow presented with a photo of a strawberry keychain, it won’t know how to proceed.
Even if a model does become an expert in its domain, it’s risky to use it for tasks other than the ones it was specifically trained for, since results could be unpredictable.
“If you have a car that is trained on data from the United States, it’s able to have certain weather conditions, it's not able to do certain road signs, and to figure out whether it can actually drive on snow or desert, you need to test it—you need to run it on a data set of desert-driving footage and check the accuracy,” Rizzoli said. “Believe it or not, this sounds pretty straightforward, but there are almost no tools for doing this. And very few people are actually doing benchmarking on training data, because it’s a new thing.”
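The kind of check Rizzoli describes, running a model on a data set from a new setting and measuring accuracy, can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `toy_model` is a lookup table rather than a real driving model, and the data set names and examples are invented for illustration.

```python
def accuracy(model, dataset):
    """Return the fraction of (example, expected_label) pairs the model gets right."""
    if not dataset:
        raise ValueError("dataset is empty")
    correct = sum(1 for example, expected in dataset if model(example) == expected)
    return correct / len(dataset)


def benchmark(model, datasets):
    """Score the same model separately on each named data set."""
    return {name: accuracy(model, data) for name, data in datasets.items()}


# Hypothetical stand-in for a trained model: it only knows signs it saw in training.
KNOWN_SIGNS = {"us_stop": "stop", "us_yield": "yield"}


def toy_model(example):
    return KNOWN_SIGNS.get(example, "unknown")


# Invented examples: one data set resembles the training data, the other does not.
datasets = {
    "us_roads": [("us_stop", "stop"), ("us_yield", "yield")],
    "desert_roads": [
        ("us_stop", "stop"),
        ("camel_crossing", "caution"),
        ("sand_drift", "caution"),
        ("us_yield", "yield"),
    ],
}
results = benchmark(toy_model, datasets)
print(results)  # {'us_roads': 1.0, 'desert_roads': 0.5}
```

Scoring each setting separately is the point of the exercise: a single blended accuracy figure could hide the drop on the unfamiliar data set.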