Yesterday, Foundry, an emerging cloud service provider founded by Google DeepMind alumni, announced the Foundry Cloud Platform: the world’s first real-time market and orchestration engine for GPU compute. The platform makes the infrastructure behind AI development and deployment far easier to access, reducing operational complexity and improving compute cost efficiency by up to 6x, with the goal of making AI development more accessible and accelerating innovation worldwide.

With the AI boom, GPU servers have emerged as a strategic commodity.

Insatiable demand has surged past public cloud capacity, compelling major tech giants and AI startups alike to invest billions in the hardware that underpins AI development. In the early days, long-term contracts and fragile infrastructure encouraged overprovisioning to guarantee capacity, which in turn restricted broader access. The result is a GPU compute ownership race that misses the main point: given the bursty, unique compute demands of AI development, much of the GPU capacity that exists today sits significantly under-utilized.

“The current GPU compute market is one of the most inefficient commodity markets in history, directly limiting critical AI innovations that could benefit society,”

says Jared Quincy Davis, founder and CEO of Foundry. “Most AI research and development teams don’t have access to affordable and reliable compute, and well-funded orgs are buying long-term GPU reservations that they rarely use to full capacity. Foundry Cloud Platform solves this by aggregating and redistributing idle compute capacity so as to realize faster breakthroughs and a better return from GPU investments.”


The Foundry Cloud Platform replaces the costly and hard-to-scale ways AI teams access GPU compute today, optimizing for performance, cost efficiency, and reliability.

The platform aggregates compute into a single, dynamically priced pool, from which it offers two types of GPU capacity that can be adapted to different AI workload needs.

Resellable Reserved Instances

AI teams using the Foundry platform can reserve short-term capacity from the Foundry GPU virtual machine pools. Teams can reserve multi-node connected clusters for as little as three hours to run predictable workloads, without long-term commitments or fixed contract terms. Want to make it even more efficient? Resell the idle capacity on your reservations. If a customer reserves 128 NVIDIA H100s and sets aside 16 in advance as “healing buffer” nodes, the customer can relist those buffer nodes temporarily on the market and earn credits until they are needed or the reservation ends. This option is best for pre-planned work such as training runs, as well as other day-to-day critical tasks such as verification and debugging.
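As a rough illustration of the relisting flow described above, the sketch below uses a hypothetical Python client; the class and method names (Reservation, SpotMarket, relist_idle_buffer, and so on) are invented for this example and are not Foundry’s published SDK.

```python
# Illustrative sketch only: the names below are hypothetical and are not
# Foundry's actual API. They model the "reserve 128, hold 16 as healing
# buffer, relist the buffer until needed" flow described in the article.
from dataclasses import dataclass, field


class SpotMarket:
    """Toy stand-in for the platform's spot market."""
    def __init__(self):
        self.listings = {}

    def list_node(self, node_id, gpu_type):
        self.listings[node_id] = gpu_type

    def delist_node(self, node_id):
        self.listings.pop(node_id, None)


@dataclass
class Reservation:
    """A short-term GPU reservation with a resellable healing buffer."""
    gpu_type: str
    total_nodes: int
    buffer_nodes: int
    relisted: set = field(default_factory=set)

    def relist_idle_buffer(self, market):
        """List unused buffer nodes on the spot market to earn credits."""
        for node_id in range(self.total_nodes - self.buffer_nodes, self.total_nodes):
            if node_id not in self.relisted:
                market.list_node(node_id, gpu_type=self.gpu_type)
                self.relisted.add(node_id)

    def reclaim(self, market, node_id):
        """Pull a relisted node back when it is needed for healing."""
        if node_id in self.relisted:
            market.delist_node(node_id)
            self.relisted.remove(node_id)


# Reserve 128 H100 nodes, keep 16 as a healing buffer, and relist the buffer.
market = SpotMarket()
reservation = Reservation(gpu_type="H100", total_nodes=128, buffer_nodes=16)
reservation.relist_idle_buffer(market)
print(f"{len(market.listings)} buffer nodes relisted")  # -> 16
```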

Spot Instances

Alternatively, users can access unreserved and relisted compute on the platform through biddable spot instances, which suit interrupt-tolerant workloads such as model inference, hyperparameter tuning, and fine-tuning.
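“Interrupt-tolerant” generally means the job checkpoints its progress so it can resume after a spot instance is reclaimed. The sketch below shows that generic pattern in Python; the checkpoint path, step counts, and SIGTERM handling are illustrative assumptions, not part of any Foundry API.

```python
# Generic checkpoint/resume pattern for interrupt-tolerant spot workloads.
# Paths, step counts, and the preemption signal below are illustrative only.
import json
import os
import signal
import sys

CKPT = "checkpoint.json"
TOTAL_STEPS = 10_000
preempted = False


def _on_preempt(signum, frame):
    # Spot preemption is commonly delivered as SIGTERM shortly before shutdown.
    global preempted
    preempted = True


signal.signal(signal.SIGTERM, _on_preempt)

# Resume from the last checkpoint if one exists.
step = 0
if os.path.exists(CKPT):
    with open(CKPT) as f:
        step = json.load(f)["step"]

while step < TOTAL_STEPS:
    # ... run one unit of work here (e.g., a fine-tuning step) ...
    step += 1
    if step % 100 == 0 or preempted:
        with open(CKPT, "w") as f:
            json.dump({"step": step}, f)  # persist progress to durable storage
    if preempted:
        sys.exit(0)  # exit cleanly; the job resumes from CKPT on the next instance
```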

The Foundry Cloud Platform applies auction theory to set market-driven prices for both reserved and spot compute based on real-time supply and demand, while the growing pool of GPU capacity on the platform keeps prices from spiking and stabilizes the market.
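The article does not specify which auction mechanism Foundry uses. As a rough illustration of market-driven pricing from real-time supply and demand, the sketch below clears a textbook uniform-price auction in which the highest bids win the available GPU-hours and every winner pays the lowest accepted bid.

```python
# Minimal uniform-price auction sketch: a standard textbook mechanism used to
# illustrate market-driven pricing, not Foundry's actual pricing algorithm.
def clear_spot_auction(bids, supply):
    """
    bids: list of (bidder, price_per_gpu_hour, quantity) tuples
    supply: GPU-hours available in this auction round
    Returns (clearing_price, allocations); every winner pays the same
    clearing price, set by the lowest accepted bid.
    """
    allocations = {}
    remaining = supply
    clearing_price = 0.0
    # Serve the highest bids first until supply runs out.
    for bidder, price, qty in sorted(bids, key=lambda b: b[1], reverse=True):
        if remaining <= 0:
            break
        granted = min(qty, remaining)
        allocations[bidder] = allocations.get(bidder, 0) + granted
        remaining -= granted
        clearing_price = price  # last accepted bid sets the uniform price
    return clearing_price, allocations


bids = [("team-a", 2.10, 64), ("team-b", 1.85, 128), ("team-c", 1.40, 32)]
price, allocs = clear_spot_auction(bids, supply=160)
print(price, allocs)  # -> 1.85 {'team-a': 64, 'team-b': 96}
```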

The platform layers Kubernetes workload orchestration on top of this capacity, so no manual scheduling is required.

Reserved and spot instances are automatically ingested into a managed Kubernetes cluster, which lets AI development teams optimize price-performance and, by scaling capacity horizontally, keep inference latency low during sudden traffic spikes.
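For context, horizontal scaling of a serving tier typically follows the standard Kubernetes autoscaling rule, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The sketch below applies that rule to an inference traffic spike; the metric names and replica bounds are illustrative assumptions, not Foundry-specific settings.

```python
# Standard Kubernetes horizontal-autoscaling rule, shown to illustrate how
# serving capacity can track inference traffic spikes. The QPS metric and
# replica bounds are illustrative, not Foundry-specific.
import math


def desired_replicas(current_replicas, current_qps_per_replica,
                     target_qps_per_replica, min_replicas=2, max_replicas=64):
    desired = math.ceil(
        current_replicas * current_qps_per_replica / target_qps_per_replica
    )
    return max(min_replicas, min(max_replicas, desired))


# A traffic spike triples per-replica load, so the serving tier scales out.
print(desired_replicas(current_replicas=4,
                       current_qps_per_replica=300,
                       target_qps_per_replica=100))  # -> 12
```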

Infinite Monkey, an AI startup designing AGI architectures, draws on the Foundry Cloud Platform for seamless access to a diverse set of GPU technologies without overprovisioning. Matt Wheeler, research engineer at Infinite Monkey, added, “With Foundry Cloud Platform, we’re delivering actionable discoveries in hours, not weeks. The ability to easily dial up more compute when we need it, then dial it back again when we pause to study results and design the next experiment, means we can try out different GPUs to discover the sweet spot of price versus performance without being bound by long contracts.”

“Foundry Cloud Platform has accelerated science at Arc,”

explained Patrick Hsu, Co-Founder and Core Investigator of the Arc Institute, a non-profit research foundation dedicated to the study of complex diseases such as cancer, neurodegeneration, and immune dysfunction. “Our machine learning work requires demanding infrastructure, and Foundry comes through. We can ensure our researchers have the compute they need when they need it — without procurement friction.”

Foundry received its SOC 2 Type II certification this year, reflecting its commitment to high standards of security and compliance and to the protection of customer data.
