A montage of the logos of Google, Microsoft and Meta © FT montage

Artificial intelligence companies that have spent billions of dollars building so-called large language models to power generative AI products are now banking on a new way to drive revenues: small language models.

Apple, Microsoft, Meta and Google have all recently released new AI models with fewer “parameters” — the number of variables used to train an AI system and shape its output — but still with powerful capabilities.

The moves are an effort by technology groups to encourage the adoption of AI by businesses that have concerns about the costs and computing power needed to run large language models, the type of technology underpinning popular chatbots such as OpenAI’s ChatGPT.

Generally, the higher the number of parameters, the better the AI software’s performance and the more complex and nuanced the tasks it can handle. OpenAI’s latest model GPT-4o, announced this week, and Google’s Gemini 1.5 Pro are estimated to have more than 1tn parameters. Meta is training a 400bn-parameter version of its open-source Llama model.

Tech groups have struggled to convince some enterprise customers to pay the large sums needed to run generative AI products, while concerns over data and copyright liability have also held back adoption.

Bar chart: AI model pricing, showing cost in $ per 1mn tokens (equivalent to about 1mn words inputted or generated)

That has led tech groups such as Meta and Google to pitch small language models with just a few billion parameters as cheaper, energy-efficient and customisable alternatives that require less power to train and run and can also ringfence sensitive data.

“By having this much quality at a lower cost point, you actually enable so many more applications for customers to go in and do things that prohibitively there wasn’t enough return on that investment for them to justify really doing it,” said Eric Boyd, corporate vice-president of Microsoft’s Azure AI Platform, which sells AI models to businesses.

Google, Meta, Microsoft and French start-up Mistral have also released small language models that show advancing capabilities and can be better focused on specific applications.

Nick Clegg, Meta’s president of global affairs, said Llama 3’s new 8bn-parameter model was comparable to GPT-4. “I think on pretty much every measurement you could think of, you see superior performance,” he said. Microsoft said its Phi-3-small model, with 7bn parameters, outperformed GPT-3.5, an earlier version of OpenAI’s model.

The small models can process tasks locally on a device, rather than sending information to the cloud, which could appeal to privacy-conscious customers who want to ensure information is kept within internal networks.

Charlotte Marshall, a managing associate at Addleshaw Goddard, a law firm that advises banks, said that “one of the challenges I think a lot of our clients have had” in adopting generative AI products was adhering to regulatory requirements over handling and transferring data. She said smaller models provided “an opportunity for businesses to overcome” legal and cost concerns.

Smaller models also allow AI features to run on devices such as mobile phones. Google’s “Gemini Nano” model is embedded inside its latest Pixel phone and Samsung’s latest S24 smartphone.

Apple has hinted that it is also developing AI models to run on its bestselling iPhone. Last month, the Silicon Valley giant released OpenELM, a small model designed to perform text-based tasks.

Microsoft’s Boyd said smaller models would lead to “interesting applications, all the way down into phones and into laptops”.

OpenAI chief Sam Altman said in November that the San Francisco-based start-up offered different-sized AI models to customers that “serve separate purposes”, and that it would continue to build and sell these options.

“There are some things where smaller models will work really well,” he added. “I’m excited for that.”

However, Altman added that OpenAI would remain focused on building larger AI models with scaled-up capabilities, including the ability to reason, plan and execute tasks and eventually achieve human-level intelligence.

“There are a lot of times where I think people just want the best model,” he said. “I think that’s what people mostly want.”

Additional reporting by George Hammond in San Francisco

Copyright The Financial Times Limited 2024. All rights reserved.