NVIDIA’s new generative AI Foundry service offers custom large language models, and SAP is one of its first customers.
One of the often-overlooked issues with artificial intelligence (AI) is just how expensive it is to build and train a large language model (LLM). With rumors earlier this year that OpenAI was burning through $700,000 a day to run ChatGPT, few companies are equipped to create their own models from the ground up.
That’s why one of the most influential players in the AI space, for both hardware and software, has come up with a different approach. This week at Microsoft Ignite, NVIDIA announced its new AI Foundry service, combining its own foundation models, the NeMo framework and tools, and the company’s DGX Cloud AI supercomputing services into one package. Together, these enable AI Foundry users to build their own custom generative AI models for enterprise applications via Microsoft’s Azure platform.
“Enterprises need custom models to perform specialized skills trained on the proprietary DNA of their company—their data,” said Jensen Huang, founder and CEO of NVIDIA, in a press release. “NVIDIA’s AI foundry service combines our generative AI model technologies, LLM training expertise and giant-scale AI factory. We built this in Microsoft Azure so enterprises worldwide can connect their custom model with Microsoft’s world-leading cloud services.”
Satya Nadella, chairman and CEO of Microsoft, echoed Huang’s sentiments in the same press release. “Our partnership with NVIDIA spans every layer of the Copilot stack—from silicon to software—as we innovate together for this new age of AI,” he said. “With NVIDIA’s generative AI foundry service on Microsoft Azure, we’re providing new capabilities for enterprises and startups to build and deploy AI applications on our cloud.”
At this point, anyone familiar with LLMs may be wondering how NVIDIA intends to mitigate the risk of AI hallucinations, which would undermine any trust in AI-powered applications. The answer is retrieval-augmented generation (RAG), which grounds a model’s responses in relevant enterprise data retrieved at query time, improving their accuracy and reliability.
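To make the RAG idea concrete, here is a minimal sketch of the pattern in Python. The document store, the word-overlap retriever, and the prompt template are toy placeholders for illustration only, not NVIDIA’s implementation; a production system would use a vector database and an embedding model for retrieval, and pass the prompt to an LLM.

```python
import re

# A toy "enterprise" document store; in practice this would be a
# vector database indexed with embeddings of company documents.
DOCUMENTS = [
    "Q3 revenue grew 12% year over year, driven by cloud subscriptions.",
    "The travel expense policy caps per-diem meals at $75.",
    "Joule is SAP's natural-language generative AI copilot.",
]

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, stripped of punctuation."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query.

    This stands in for a semantic (embedding) search.
    """
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model's answer in retrieved context, not parametric memory."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

query = "What is Joule?"
prompt = build_prompt(query, retrieve(query, DOCUMENTS))
print(prompt)
```

Because the answer is constrained to the retrieved passages, the model is far less likely to invent facts, which is the property that matters for enterprise deployments.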
SAP, one of AI Foundry’s inaugural customers, is using the RAG workflow to deploy Joule, a new natural-language generative AI copilot. According to SAP CEO and executive board member Christian Klein, Joule will help SAP customers automate time-consuming tasks, such as data analysis.
AI Foundry will offer several foundation models, both NVIDIA’s and third-party models such as Meta’s Llama 2, all hosted in the Azure AI model catalog. Among the NVIDIA offerings is the new Nemotron-3 8B family, 8-billion-parameter models designed for enterprise generative AI applications.
Taken together with OpenAI’s recent announcements regarding customizable GPTs—especially the internal-only versions for enterprise customers—NVIDIA’s AI Foundry points to customized, specialty models as the future of generative AI.