The new wave of AI that has been captivating Silicon Valley is finally finding its voice.
Recent announcements from OpenAI and Amazon promise to bring spoken, conversational capabilities to generative AI at mass scale for the first time. Meanwhile, Microsoft is focused on turning its AI into a more all-purpose digital assistant woven through all of its products.
The moves all add up to a wider push to make AI more versatile in its ability to interact with inputs beyond simple text, like voice and imagery. Here’s a quick rundown of the flurry of news out of the AI world this week:
- OpenAI announced a significant update to ChatGPT with new voice and image capabilities on Monday. The new-and-improved chatbot lets users do things like snap a photo of a pantry and have a voice conversation with the AI about what to cook for dinner, OpenAI claims.
- Last week, Amazon said it plans to outfit its Alexa digital assistant with a large language model (LLM) customized for voice interactions. The e-commerce giant also announced an investment of up to $4 billion for a minority stake in Anthropic, an OpenAI rival, signaling that it’s getting more serious about the AI race.
- Microsoft is rolling out a new all-encompassing Copilot feature with generative AI capabilities in an effort to create an “everyday AI companion.”
A new era of AI: Gartner analyst Arun Chandrasekaran said it’s been clear for a while that multimodal AI chatbots that can flit between visual, audio, and text data with ease would be the next stage of the generative AI arms race. Google is also expected to accelerate this shift with Gemini, its forthcoming multimodal model, due by the end of the year.
“We do believe that for us to get to this next level of intelligence, the models have to incorporate multiple modalities very similar to the way human beings learn the world, and we’re starting to see that,” Chandrasekaran told Tech Brew. “We do believe multimodality will be much more of a common occurrence in the future.”—PK