The rise of agentic AI is intensifying the demand for computing power, with Nvidia’s CEO predicting a 100-fold increase in the need for accelerated computing. While many companies initially turned to cloud solutions for their AI projects, the long-term costs are prompting a shift back to on-premises setups. Experts, including H2O.ai’s CEO Sri Ambati, argue that on-premises GPUs significantly cut costs compared to cloud alternatives. Similarly, companies like Cloudera are observing clients gravitate toward on-premises solutions for agentic AI. As firms aim to balance efficiency and expenses, reevaluating cloud services is becoming essential to avoid waste and manage budgets effectively. Alternative cloud providers like Vultr are also emerging with more cost-effective computing options.
The Rise of On-Prem Solutions in the Age of Agentic AI
As the demand for AI solutions continues to soar, the need for powerful computing infrastructure is becoming critical. According to Nvidia’s CEO Jensen Huang, the requirement for accelerated computing power could increase by a staggering 100 times due to the emergence of agentic AI. This raises a crucial question: Where will companies find the GPUs and server power to handle these extensive workloads? While the cloud might seem like the obvious choice, many industry experts warn that it could come with excessive costs.
When tools like ChatGPT hit the market in late 2022, businesses poured resources into AI exploration, primarily on cloud platforms. Initially, these cloud services were a cost-effective fit for sporadic workloads. As companies move toward long-term AI deployments, however, particularly with agentic AI, the cloud looks less appealing.
One company making strides in this space is H2O.ai, based in San Francisco. They are assisting businesses in transitioning from AI proof-of-concept to full-scale deployment. H2O.ai’s CEO, Sri Ambati, notes that their partnership with Dell to establish on-site AI factories is gaining traction. He mentions, “Initially, budgets were unlimited for exploratory projects. However, as companies transition from demos to production, the cost implications become clearer.”
Understanding the financial dynamics of AI is essential. Serving a large language model (LLM) means converting text into tokens and running each one through the model, so at production scale the per-token compute bill adds up quickly. Ambati argues that on-premises GPUs are the more economical choice, cutting expenses by about one-third compared to cloud solutions.
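Ambati's roughly one-third figure can be made concrete with a back-of-envelope model. The sketch below compares per-token cloud billing against amortized on-prem hardware; every number in it is a hypothetical placeholder for illustration, not a vendor quote or a figure from the article.

```python
# Back-of-envelope comparison of cloud vs. on-prem inference cost.
# All figures below are hypothetical placeholders, not real prices.

def monthly_cloud_cost(tokens_per_month: float, price_per_1k_tokens: float) -> float:
    """Cloud inference billed per token processed."""
    return tokens_per_month / 1_000 * price_per_1k_tokens

def monthly_onprem_cost(gpu_capex: float, amortization_months: int,
                        monthly_power_and_ops: float) -> float:
    """On-prem cost: hardware amortized over its useful life, plus running costs."""
    return gpu_capex / amortization_months + monthly_power_and_ops

# Hypothetical workload: 3 billion tokens/month at $0.01 per 1K tokens.
cloud = monthly_cloud_cost(3e9, 0.01)
# Hypothetical server: $540K amortized over 36 months, $5K/month power and ops.
onprem = monthly_onprem_cost(540_000, 36, 5_000)

print(f"cloud:   ${cloud:,.0f}/month")    # $30,000/month
print(f"on-prem: ${onprem:,.0f}/month")   # $20,000/month
```

With these illustrative inputs, on-prem lands at roughly two-thirds of the cloud bill, matching the scale of savings Ambati describes; real results depend entirely on utilization, hardware life, and negotiated rates.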
Cloudera, another key player, is seeing renewed interest in on-premises AI processing. Corporate VP Priyank Patel points to customers like Mastercard and OCBC Bank that are embracing agentic AI, recognizing its value beyond mere infrastructure considerations. He emphasizes, “The TCO [total cost of ownership] argument favors on-prem solutions, particularly for training and inference.”
Despite the growing cloud sector, there are indicators that enterprises are reconsidering where they run their workloads. As Patel observes, “The cloud has been the focus for the past decade; now we are reassessing that approach.”
Public cloud giants like AWS, Microsoft Azure, and Google Cloud have enjoyed rapid growth, but they also face a perception of overcharging, with estimates of wasted cloud spend ranging from 20% to 40%. Deloitte’s Akash Tayal points out that many enterprises squander considerable resources in the cloud for lack of careful monitoring and adaptation.
Alternative cloud providers also offer viable options. Companies like Vultr are capitalizing on this trend, providing cost-effective compute to those hesitant about the public cloud model. As Vultr’s CMO Kevin Cochrane explains, they deliver savings of up to 90% while enabling companies to invest in the AI infrastructure they need.
In summary, as businesses navigate the complexities of agentic AI, both financial considerations and technological demands are guiding them back to on-premises solutions. The shift away from cloud-first architecture reflects a broader reevaluation of how companies deploy AI technologies effectively and sustainably.
Related Items:
Nvidia Preps for 100x Surge in Inference Workloads, Thanks to Reasoning AI Agents
Nvidia Touts Next Generation GPU Superchip and New Photonic Switches
AI Lessons Learned from DeepSeek’s Meteoric Rise
Tags: AI, on-prem, Nvidia, cloud computing, H2O.ai, Cloudera, Vultr, agentic AI, inference workloads
What is Agentic AI in the Cloud?
Agentic AI in the cloud refers to advanced artificial intelligence systems that can take action on their own, hosted on remote servers or cloud platforms. This allows businesses to access powerful AI tools without needing to maintain complex infrastructure.
How much does it cost to run Agentic AI in the Cloud?
The cost can vary widely depending on factors like the type of AI service, the amount of data processed, and the number of users. On average, you might spend from a few hundred to several thousand dollars a month. It’s important to compare different providers to find the best fit for your budget.
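To see how a bill in the "few hundred dollars" range might arise, here is a rough pay-as-you-go estimate. The request volume, token counts, and per-token price are all illustrative assumptions, not any real provider's rates.

```python
# Rough pay-as-you-go estimate for a small agentic AI deployment.
# All inputs are illustrative assumptions, not real provider rates.

def estimate_monthly_bill(requests_per_day: int, tokens_per_request: int,
                          price_per_1k_tokens: float) -> float:
    """Monthly cost for usage-based, per-token billing (30-day month)."""
    tokens_per_month = requests_per_day * tokens_per_request * 30
    return tokens_per_month / 1_000 * price_per_1k_tokens

# Hypothetical small business: 1,000 requests/day, ~2,000 tokens per request.
bill = estimate_monthly_bill(1_000, 2_000, 0.01)
print(f"${bill:,.2f}/month")  # $600.00/month
```

Heavier usage or larger models scale this figure up quickly, which is why comparing providers and plans matters.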
What are the benefits of using Agentic AI in the Cloud?
Some benefits include:
– Reduced infrastructure costs, since you don’t need your own servers.
– Flexibility to scale up or down based on your needs.
– Access to the latest AI technologies without the need to update hardware.
Can small businesses afford Agentic AI in the Cloud?
Yes! Many cloud providers offer plans that suit small businesses. By choosing a pay-as-you-go option, small businesses can use AI without a huge upfront investment. It’s all about finding a plan that matches your needs and budget.
How can I determine if Agentic AI in the Cloud is right for my business?
Consider your current needs and future plans. If you need to automate tasks, improve decision-making, or analyze data efficiently, Agentic AI could be very helpful. Also, think about your budget and ensure you understand the costs involved before making a decision.