Microsoft is about to announce a suite of Office 365 tools built on GPT-4 artificial intelligence (AI) software, but the company is facing an internal shortage of the AI server hardware needed to run them, according to sources familiar with the matter.
The shortage has forced Microsoft to ration hardware and restrict the access of other internal AI development teams in order to reserve enough capacity for the new GPT-4-based Bing chatbot and the upcoming Office tools. It is also affecting Microsoft customers: at least one reported long waits to use OpenAI software already available through Microsoft's Azure cloud service.
Microsoft says it is racing to add hardware to expand its AI computing power, but if it cannot do so fast enough, the shortage could limit the appeal of its Azure OpenAI service to new customers, who would use the service to add AI capabilities to their own applications. Microsoft is ahead of Google in commercializing this new class of AI tools, but that advantage will be hard to capitalize on if the hardware shortage drags on.
Google stole a march on Microsoft this week by releasing its own AI-assisted writing tools, including software that can automatically compose text in Google Docs and Sheets based on short prompts.
The server shortage comes as OpenAI and Microsoft try to rapidly expand the reach of their AI software, which customers can fine-tune with their own data to build customized tools such as image generators, document summarizers, search engines, and chatbots.
Microsoft has invested billions of dollars in OpenAI, giving it the right to sell the startup's software through its Azure OpenAI service. OpenAI also licenses its software directly to customers, and as part of its partnership with Microsoft, all OpenAI machine-learning models must run on Azure servers. Both services are priced the same, at less than a penny per request.
The services launched by the two companies over the past three months have attracted widespread interest from enterprise users who want to incorporate the underlying technology from OpenAI's ChatGPT chatbot into their products. But new customers of Microsoft's Azure OpenAI service face a long wait.
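For context, "incorporating the technology" into a product typically amounts to sending requests to a model deployment hosted on Azure. The lines below are a minimal sketch using the pre-1.0 openai Python library against an Azure endpoint; the resource name, deployment name, key, and API version string are placeholders assumed for illustration, not details reported here.

    # Minimal sketch of calling a model through the Azure OpenAI service
    # (resource name, deployment name, key, and API version are placeholders)
    import openai

    openai.api_type = "azure"
    openai.api_base = "https://<your-resource>.openai.azure.com/"
    openai.api_version = "2023-03-15-preview"   # illustrative Azure OpenAI API version
    openai.api_key = "<your-azure-openai-key>"

    response = openai.ChatCompletion.create(
        engine="<your-gpt-deployment>",          # name of the model deployment in Azure
        messages=[{"role": "user", "content": "Summarize this product manual: ..."}],
    )
    print(response["choices"][0]["message"]["content"])

Each such request is billed individually, which is what the per-request pricing above refers to.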
At the heart of the hardware shortage is the fact that large new AI models such as GPT-4 need to run on GPUs, server chips that can process large amounts of data in parallel. To handle the model's massive computing demands, Microsoft has consolidated tens of thousands of GPUs into clusters distributed across its data centers.
Until it is clear how much demand users will generate, Microsoft has to reserve much of its current GPU capacity for the new GPT-4-powered Bing chatbot and the upcoming GPT-4 Office tools.
Microsoft launched the Bing chat feature in February and is still trying to gauge how many people will use the service on a typical day, which makes it difficult for engineers to predict the computing resources the feature will need, according to people familiar with the matter.
The upcoming Office GPT-4 tools could also drive a surge in customer demand for OpenAI's chatbot technology. The new features include AI document summaries, personalized writing suggestions, and editing suggestions.
Other Microsoft AI teams have had to give way to Bing and Office, including those working on machine-learning models such as Microsoft's Turing Natural Language Generation model, which understands text and previously provided the underlying technology for email and search tools in Office apps.
When those teams want to use GPUs to develop new AI tools or test existing AI software, they have to submit a special request to a corporate vice president and get approval before they can use the hardware, according to people familiar with the matter.
Some requests have had to wait days or even weeks for approval, those people said. Microsoft has been rationing GPU resources internally since late 2022, but the waits have grown longer since January.
A Microsoft spokesperson said in a statement that the company is adding more AI resources to services like Azure and follows a "process that prioritizes customer needs and adjusts accordingly." The spokesperson added that Microsoft is not worried about AI resource constraints.
At least one Microsoft customer said it has run into access delays. "It's almost impossible to use an app right away," said Edo Segal, founder and CEO of marketing software startup TouchCast, which is developing GPT-based interactive user manuals for a number of auto companies using technology licensed through Microsoft Azure.
Existing Azure AI customers appear largely unaffected. Spokespeople for Cruise, a driverless-car developer, and AI search startup Perplexity both said they had not had trouble getting Azure GPU resources.
In November, Microsoft and GPU maker Nvidia jointly announced that Microsoft would add tens of thousands of new processors to expand its AI computing power, though it is unclear how far that purchase will go toward easing the shortage.
The two companies also announced on Monday that Azure customers will soon get early access to Nvidia's new H100 GPU, which has yet to be widely released. A small number of internal Microsoft teams already use the H100, but most have yet to gain access, according to people familiar with the matter.