Microsoft Makes a New Push Into Smaller AI Systems

SEATTLE — In the dizzying race to build generative artificial intelligence systems, the tech industry’s mantra has been bigger is better, no matter the price tag.
Posted April 23, 2024 | Updated April 25, 2024
FILE — Sébastien Bubeck, an artificial intelligence researcher with Microsoft, in Seattle, on May 8, 2023. On Tuesday, April 23, 2024, Microsoft introduced three smaller AI models that are part of a technology family the company has named Phi-3. The company said even the smallest of the three performed almost as well as GPT-3.5, the much larger system that underpinned OpenAI’s ChatGPT chatbot when it stunned the world upon its release in late 2022. (Meron Tekie Menghistab/The New York Times)

Now tech companies are starting to embrace smaller AI technologies that are not as powerful but cost a lot less. And for many customers, that may be a good trade-off.

On Tuesday, Microsoft introduced three smaller AI models that are part of a technology family the company has named Phi-3. The company said even the smallest of the three performed almost as well as GPT-3.5, the much larger system that underpinned OpenAI’s ChatGPT chatbot when it stunned the world upon its release in late 2022.

The smallest Phi-3 model can fit on a smartphone, so it can be used even if it’s not connected to the internet. And it can run on the kinds of chips that power regular computers, rather than more expensive processors made by Nvidia.
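The arithmetic behind "fits on a smartphone" is simple to sketch. The parameter count and precision below are illustrative assumptions for a small model, not figures Microsoft has published:

```python
# Back-of-the-envelope memory math for an on-device AI model.
# Both numbers are assumptions for illustration only.
params = 3.8e9          # assumed parameter count, typical of a "small" model
bytes_per_param = 0.5   # 4-bit quantization: half a byte per parameter

gigabytes = params * bytes_per_param / 1e9
print(f"{gigabytes:.1f} GB")  # a footprint within a modern phone's memory
```

At that size, the model's weights occupy roughly 2 gigabytes, which is why such systems can run locally on phone or laptop chips rather than on data-center processors.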

Because the smaller models require less processing, big tech providers can charge customers less to use them. They hope that means more customers can apply AI in places where the bigger, more advanced models have been too expensive to use. Though Microsoft said using the new models would be “substantially cheaper” than using larger models like GPT-4, it did not offer specifics.

The smaller systems are less powerful, which means they can be less accurate or sound more awkward. But Microsoft and other tech companies are betting that customers will be willing to forgo some performance if it means they can finally afford AI.

Customers imagine many ways to use AI, but with the biggest systems “they’re like, ‘Oh, but you know, they can get kind of expensive,’” said Eric Boyd, a Microsoft executive. Smaller models, almost by definition, are cheaper to deploy, he said.

Boyd said some customers, like doctors or tax preparers, could justify the costs of the larger, more precise AI systems because their time was so valuable. But many tasks may not need the same level of accuracy. Online advertisers, for example, believe they can better target ads with AI, but they need lower costs to be able to use the systems regularly.

“I want my doctor to get things right,” Boyd said. “Other situations, where I am summarizing online user reviews, if it’s a little bit off, it’s not the end of the world.”

Chatbots are driven by large language models, or LLMs, mathematical systems that spend weeks analyzing digital books, Wikipedia articles, news articles, chat logs and other text culled from across the internet. By pinpointing patterns in all that text, they learn to generate text on their own.
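As a rough illustration only (real LLMs use neural networks, not word counts), the idea of learning patterns from text and then generating new text from them can be sketched in a few lines of Python; the tiny corpus here is invented:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text real LLMs analyze.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows each word, a crude stand-in for the
# statistical patterns a large language model learns during training.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

# "Generate" text by repeatedly picking the most common continuation.
word = "the"
generated = [word]
for _ in range(4):
    word = following[word].most_common(1)[0][0]
    generated.append(word)

print(" ".join(generated))  # prints "the cat sat on the"
```

The trade-off the article describes follows directly: the more patterns a system stores, the more computing power it takes to consult them for every response.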

But LLMs store so much information that retrieving what is needed for each chat requires considerable computing power. And that is expensive.

While tech giants and startups like OpenAI and Anthropic have been focused on improving the largest AI systems, they are also competing to develop smaller models that offer lower prices. Meta and Google, for instance, have released smaller models over the past year.

Meta and Google have also “open sourced” these models, meaning anyone can use and modify them free of charge. This is a common way for companies to get outside help improving their software and to encourage the larger industry to use their technologies. Microsoft is open sourcing its new Phi-3 models, too.

After OpenAI released ChatGPT, Sam Altman, the company’s CEO, said the cost of each chat was “single-digit cents,” an enormous expense compared with popular web services like Wikipedia, which serve each page for a tiny fraction of a cent.

Now, researchers say their smaller models can at least approach the performance of leading chatbots like ChatGPT and Google Gemini. Essentially, the systems can still analyze large amounts of data but store the patterns they identify in a smaller package that can be served with less processing power.

Building these models is a trade-off between power and size. Sébastien Bubeck, a researcher and vice president at Microsoft, said the company built its new smaller models by refining the data that was pumped into them, working to ensure that the models learned from higher-quality text.

Part of this text was generated by the AI itself — what is known as “synthetic data.” Then human curators worked to separate the sharpest text from the rest.

Microsoft has built three different small models: Phi-3-mini, Phi-3-small and Phi-3-medium. Phi-3-mini, which will be available Tuesday, is the smallest (and cheapest) but the least powerful. Phi-3-medium, which is not yet available, is the most powerful but the largest and most expensive.

Making systems small enough to go directly on a phone or personal computer “will make them a lot faster and orders of magnitude less expensive,” said Gil Luria, an analyst at the investment bank D.A. Davidson.

This article originally appeared in The New York Times.