There's no doubt that companies are eager to bring more generative AI capabilities into their organizations. However, it's also becoming increasingly clear that doing so is proving more challenging than many initially realized.
A recent research study on generative AI (GenAI) use in US businesses by TECHnalysis Research highlighted this dichotomy in stark terms. Vendors building solutions designed to help businesses realize the impressive benefits that GenAI can offer are recognizing these challenges.
The most recent example is Nvidia, which made several AI-focused announcements at the SIGGRAPH trade show, including a new partnership with open-source AI model provider Hugging Face, a new tool for creating custom models across multiple environments, and some new enhancements to its suite of enterprise AI offerings. While each tackles a different aspect of the solution, collectively they're geared toward making the process of working with GenAI easier.
With the Hugging Face partnership, Nvidia wants to make customizing and deploying foundation models more straightforward. Hugging Face has established itself as the preeminent marketplace for open-source versions of these models, and many companies have started looking there for offerings they believe will be well-suited to their specific needs.
Finding the right models is only the first step in the journey, however, as companies typically want to customize these GenAI models with their own data before using them within their organizations.
To do that, each company needs a process that allows it to safely and securely connect its data to those AI models and provision the appropriate computing infrastructure to further train them.
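To make that step concrete, the sketch below shows roughly what this kind of customization can look like today using the open-source Hugging Face transformers and datasets libraries: pull down a base model, tokenize company text, and run additional training passes over it. The model name, placeholder documents, and training settings are illustrative assumptions, not details from Nvidia's or Hugging Face's announcements.

```python
# Minimal fine-tuning sketch (assumes a small open-source causal LM and a toy
# in-memory dataset; real jobs run on multi-GPU infrastructure such as DGX Cloud).
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "gpt2"  # stand-in for any open-source foundation model on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# A company's proprietary documents would go here; these strings are placeholders.
corpus = ["Internal support ticket: steps for resetting a customer password ...",
          "Product FAQ: configuring the enterprise dashboard for single sign-on ..."]
dataset = Dataset.from_dict({"text": corpus})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="custom-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()                      # further trains the base model on company data
trainer.save_model("custom-model")   # customized checkpoint, ready for evaluation
```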
That's where the new partnership comes in. Nvidia is building a direct connection from the models on Hugging Face to a DGX Cloud service powered by Nvidia GPUs, where that training can be done.
Called "Training Cluster as a Service," this fee-based offering – expected to launch later this fall – will streamline the process of configuring all the various software components necessary to make the training process work and give organizations an easier way to get the critical model customization efforts done. Each DGX Cloud instance includes eight of the company's H100 or A100 80GB Tensor Core GPUs to do the training work.
Of course, foundation model development, as well as applications and services that leverage those models, is an ongoing process. In light of that, Nvidia also debuted a new tool that's designed to let AI developers work on those models in various environments.
Called Nvidia AI Workbench, the new software application lets developers build and test GenAI-focused projects on Nvidia GPU-powered PCs and workstations. That work can then be easily transferred and scaled to various public and private cloud environments (including Amazon's AWS, Microsoft's Azure, and Google's GCP), as well as Nvidia's own DGX Cloud.
It turns out that without something like AI Workbench, these kinds of migrations are difficult because each environment has to be configured separately for the model. With Nvidia's new AI Workbench tool, however, the process becomes simpler not only for individual programmers but also for teams of developers working across different geographies or in different parts of the organization. AI Workbench takes care of finding and linking the various open-source libraries and frameworks necessary to make the transfer more seamless. Once again, the goal is to make the process of building and deploying these GenAI models easier than it has been in the past.
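As a rough illustration of the kind of per-environment plumbing this is meant to absorb, the snippet below shows the sort of device-agnostic loading code a developer would otherwise maintain by hand when moving a project from an RTX workstation to a cloud GPU instance. It's a generic sketch built on PyTorch and transformers, with a hypothetical checkpoint name, and is not a description of how AI Workbench itself works.

```python
# Generic portability sketch (illustrative only; the checkpoint name is hypothetical
# and this is not AI Workbench's actual mechanism).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model(checkpoint: str = "custom-model"):
    # Use whatever accelerator the current environment actually provides.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(
        checkpoint,
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    ).to(device)
    return tokenizer, model, device

tokenizer, model, device = load_model()
prompt = tokenizer("Summarize this quarter's support tickets:",
                   return_tensors="pt").to(device)
print(tokenizer.decode(model.generate(**prompt, max_new_tokens=40)[0]))
```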
Forthcoming RTX-equipped systems from companies like Dell, HP, Lenovo, HPE, and Supermicro are expected to support Nvidia AI Workbench in both Windows and Linux environments.
The final piece of the puzzle builds on what Nvidia had previously announced at its GTC event last spring. Nvidia AI Enterprise 4.0 incorporates additional enhancements to the company's NeMo offering that Nvidia claims now make it a "cloud-native framework" for building custom large language models (LLMs).
The company also unveiled the Nvidia Triton Management Service for automating the deployment of multiple Triton Inference Servers, as well as Nvidia Base Command Manager Essentials, which is intended to manage AI computing clusters across multiple environments. The latter two capabilities reflect Nvidia's growing reach into the overall automation and management of AI workloads in different environments. The Triton service specifically handles efficient orchestration of Kubernetes-based AI inferencing workloads across containers, while Base Command Manager moves up a level to entire computing clusters in larger hybrid and multi-cloud environments.
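For a sense of what those Triton Inference Servers actually do once they're running, here's a minimal client-side request using the open-source tritonclient Python library. The model and tensor names are placeholders, and this shows only a single inference call; the new Triton Management Service operates a level above this, orchestrating the servers that answer requests like this one.

```python
# Minimal Triton inference request (illustrative; the model name and tensor
# names are placeholders and depend entirely on the deployed model's config).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a request for a hypothetical model served by a Triton instance.
input_tensor = httpclient.InferInput("INPUT__0", [1, 16], "FP32")
input_tensor.set_data_from_numpy(np.random.rand(1, 16).astype(np.float32))
requested_output = httpclient.InferRequestedOutput("OUTPUT__0")

response = client.infer(model_name="demo_model",
                        inputs=[input_tensor],
                        outputs=[requested_output])
print(response.as_numpy("OUTPUT__0").shape)
```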
What's clear from all these announcements is that Nvidia – like other vendors – recognizes the increasing complexity of making GenAI an important part of any business's IT efforts. The new tools look to be an important step toward simplifying at least some of the processes involved, but it's going to be a while before many organizations are comfortable doing this kind of work on their own.
This whole field is unexplored territory for most organizations, and there needs to be a lot more education to bring all the interested parties up to speed. At the same time, there's a very real sense that organizations need to jump into these GenAI initiatives quickly, lest they fall behind their competition. The resulting disconnect is going to be difficult to navigate for a while, but efforts like what Nvidia just unveiled are steps in the right direction.
Bob O'Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.