Google Cloud Next is Google’s annual product conference for its cloud platform, used to announce new services, showcase enterprise deployments, and set the direction for Google Cloud for the year ahead. The 2026 edition was held in Las Vegas from April 22 to 24, drew more than 32,000 attendees, and produced 260 product announcements across infrastructure, security, data, and developer tools. For hosting providers, infrastructure buyers, and IT decision-makers, the announcements worth studying are not the AI demos. They are the hardware specs, network architecture changes, and storage throughput numbers that determine what workloads will cost to run on Google Cloud over the next two years.
Two New Chips, Two Different Problems
The headline hardware announcement was the eighth generation of Google’s Tensor Processing Units, introduced in two distinct configurations for the first time.
TPU 8t is built for training large models. A single superpod holds 9,600 chips and 2 petabytes of shared high-bandwidth memory, delivering 121 exaflops of total compute, nearly 3x the performance of the previous generation. The architecture scales near-linearly to 1 million TPU chips across multiple data center sites, which means training timelines for very large models shrink roughly in proportion to the number of chips added.
TPU 8i is built for inference and real-time serving. It carries 384 MB of on-chip SRAM, three times the previous generation, and 288 GB of high-bandwidth memory. Its Collectives Acceleration Engine reduces on-chip communication latency by up to 5x. Google states the chip delivers 80% better performance per dollar for inference compared to its prior generation. For hosting providers running managed inference or offering AI-powered services to customers, the cost-per-query reduction this enables is the number that directly affects margin.
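To make the margin claim concrete, here is a back-of-envelope sketch of what "80% better performance per dollar" means for cost per query. The hourly rate and query volumes below are hypothetical placeholders, not Google pricing; the only input taken from the announcement is the 80% figure, which implies the same spend serves 1.8x the queries.

```python
# Illustrative margin math: 80% better performance per dollar means the
# same budget serves ~1.8x the queries. All rates and volumes below are
# hypothetical, not Google pricing.

def cost_per_query(hourly_rate: float, queries_per_hour: float) -> float:
    """Dollars spent per inference query."""
    return hourly_rate / queries_per_hour

baseline = cost_per_query(hourly_rate=10.0, queries_per_hour=100_000)
# 1.8x throughput for the same hourly spend
improved = cost_per_query(hourly_rate=10.0, queries_per_hour=180_000)

savings_pct = (1 - improved / baseline) * 100
print(f"baseline ${baseline:.6f}/query, improved ${improved:.6f}/query")
print(f"cost per query falls ~{savings_pct:.0f}%")  # ~44%
```

The takeaway: a 1.8x price-performance gain does not halve cost per query, but a roughly 44% reduction flows straight to margin on any service billed per request.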
Google also confirmed it will be among the first cloud providers to offer instances based on the NVIDIA Vera Rubin NVL72 platform. The A5X instances will support up to 80,000 GPUs in a single data center, with availability expected later in 2026.
The Network Upgrade That Makes the Hardware Usable
Hardware performance numbers only translate to real workload gains if the network can move data at the same speed. Google introduced Virgo, a new data center fabric architecture that delivers 4x the bandwidth of previous generations and supports 134,000 TPUs within a single data center. The collapsed fabric design eliminates what Google calls the “scaling tax,” the efficiency degradation that typically occurs as clusters grow larger. The result is near-linear scaling to 1 million TPUs across multiple sites.
On external connectivity, Cloud Interconnect now supports 400 Gbps per connection, scaling to 3.2 Tbps in a single logical connection. For operators running hybrid or multi-cloud environments, this substantially reduces the cost and latency of moving data between on-premises infrastructure and Google Cloud.
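A quick transfer-time estimate shows what those link speeds mean in practice for a hybrid-cloud data move. The 500 TB dataset size is an assumed example, and the calculation ignores protocol overhead, so real transfers will run somewhat slower.

```python
# Back-of-envelope transfer times over Cloud Interconnect-class links.
# Link speeds are in bits per second, dataset sizes in decimal terabytes;
# wire-protocol overhead is ignored, so treat these as lower bounds.

def transfer_hours(dataset_tb: float, link_gbps: float) -> float:
    """Hours to move dataset_tb terabytes over a link_gbps link."""
    bits = dataset_tb * 1e12 * 8          # TB -> bits
    seconds = bits / (link_gbps * 1e9)    # Gbps -> bits per second
    return seconds / 3600

dataset = 500  # TB, hypothetical migration
print(f"400 Gbps:  {transfer_hours(dataset, 400):.1f} h")   # ~2.8 h
print(f"3.2 Tbps:  {transfer_hours(dataset, 3200):.1f} h")  # ~0.3 h
```

At 3.2 Tbps, a half-petabyte migration that previously consumed a maintenance window fits into about twenty minutes, which changes what kinds of hybrid workflows are practical to run daily.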
Storage Numbers That Require a Reread
Managed Lustre, Google’s high-performance parallel file system, now delivers 10 TB per second of throughput with up to 80 petabytes of capacity and sub-millisecond latency through direct TPU and RDMA support. Google claims this is up to 20x faster than competing offerings for AI checkpoint workloads. That figure comes from Google’s own materials and has not been independently verified; it applies to training scenarios specifically, not general-purpose storage.
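The reason checkpoint throughput matters is that training typically stalls while a checkpoint is flushed, so write time is pure lost accelerator time. The sketch below uses an assumed 30 TB checkpoint and an assumed 0.5 TB/s baseline for comparison; only the 10 TB/s figure comes from the announcement.

```python
# Training stalls while a checkpoint is written, so write time is lost
# accelerator time. Checkpoint size and the 0.5 TB/s comparison figure
# are illustrative assumptions; 10 TB/s is the announced throughput.

def checkpoint_seconds(checkpoint_tb: float, throughput_tb_s: float) -> float:
    """Seconds a training job pauses to flush one checkpoint."""
    return checkpoint_tb / throughput_tb_s

ckpt = 30.0  # TB of model and optimizer state (assumed)
print(f"at 10 TB/s:  {checkpoint_seconds(ckpt, 10.0):.0f} s")  # 3 s
print(f"at 0.5 TB/s: {checkpoint_seconds(ckpt, 0.5):.0f} s")   # 60 s
```

Multiplied across hundreds of checkpoints in a long training run, the difference between a 3-second and a 60-second pause is hours of reclaimed compute on a cluster billed by the chip-hour.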
Rapid Buckets, Google’s low-latency object storage tier, delivers sub-millisecond access at 20 million operations per second. For hosting environments that require fast retrieval at scale, including CDN origin stores, media platforms, and high-frequency transactional systems, this is a meaningful step change over standard object storage.
On block storage, Hyperdisk ML reaches 2 TiB per second of aggregate throughput. The new Z4M instances, entering preview in Q3 2026, offer 168 TiB of local SSD per instance with 400 Gbps bandwidth and RDMA support, designed for workloads that run custom parallel file systems.
Lower Cost Per Workload for Standard Infrastructure
Not every workload runs on TPUs, and many hosting providers manage fleets of general-purpose virtual machines for web servers, databases, and application runtimes. For these environments, Google Axion N4A, the company’s custom Arm-based processor, is now generally available and delivers up to 2x better price-performance than comparable x86 virtual machines.
Google Kubernetes Engine also received a set of performance improvements relevant to managed hosting operations: node startup is now up to 4x faster, pod startup is up to 80% faster, and model loading via run:AI Model Streamer with Rapid Cache is 5x faster. The Inference Gateway reduces time-to-first-token by more than 70% without requiring manual tuning. For providers whose billing rests on response times or SLA commitments, these improvements translate directly into service quality and support costs.
Security Infrastructure: Wiz Inside and a New Fraud Layer
Google completed its acquisition of cloud security company Wiz, and the integration is already shipping. The combined platform adds threat hunting agents, detection engineering agents, and inline security scanning for AI-generated code. Integration with the Lovable platform is scheduled for general availability in May 2026.
Model Armor, Google’s runtime protection layer against prompt injection, tool poisoning, and data leakage, is now integrated with the Agent Gateway, Agent Runtime, Firebase, and major third-party orchestration frameworks. For providers offering managed AI infrastructure to business customers, this is the security control layer that sits between customer data and the model itself.
Google Cloud Fraud Defense, the successor to reCAPTCHA, is now generally available. It extends beyond bot detection to assess the legitimacy of interactions from human users, automated processes, and AI agents acting on behalf of users. This matters for shared hosting and SaaS platform operators managing abuse at scale, where legacy captcha approaches no longer cover the full range of automated activity.
Cloud Armor received updated managed rules for Layer 7 attack detection and ML-based DDoS protection at Layers 3 and 4. A visual rule builder, currently in preview, reduces configuration overhead for custom protection rules on exposed public endpoints.
$750 Million for Partners and What the Bet Implies
Google announced a $750 million innovation fund to support partners building applications on Google Cloud, with dedicated Google engineering resources deployed alongside Accenture, Deloitte, and McKinsey. An Agent Gallery with more than 70 pre-built agents from Atlassian, Salesforce, ServiceNow, Workday, Oracle, and Adobe is now available through the Google Cloud Marketplace.
For hosting providers and managed service operators evaluating where to invest in platform partnerships, the fund is a signal about where Google expects demand to concentrate. Partners building on the Agent Platform gain access to Google’s forward-deployed engineers, a resource that has historically been available only to the company’s largest direct enterprise accounts.
Production Results: What Named Customers Are Reporting
Google documented 1,302 customer AI deployments at the event, with 300 new cases added in 2026 alone. Several examples are relevant to infrastructure buyers comparing platforms on real-world outcomes rather than benchmark conditions.
Citadel Securities reported 4x faster AI processing on TPUs with a 30% reduction in compute costs compared to its previous setup. Deutsche Telekom deployed a multi-agent system called MINDR and reduced event management time by 95%. Highmark Health attributed $27.9 million in measurable value during 2025 to its AI assistant. GE Appliances has more than 800 agents running across its operations. Tata Steel reached 300 agents deployed within nine months.
These are production figures from named customers operating at industrial scale. For C-level decision-makers building a case for infrastructure investment, this is the kind of evidence that moves a vendor evaluation from a technical shortlist to a budget conversation.
Natalia Nowak
Hosting specialist with e-commerce experience and a background in copywriting. I focus on content that is clear, technical, and to the point.
Sources
- Google Cloud Next 2026 Wrap Up - Google Cloud Blog
- AI Infrastructure at Next '26 - Google Cloud Blog
- 7 Highlights and Announcements from Google Cloud Next '26 - Google Blog
- Sundar Pichai Shares News from Google Cloud Next 2026 - Google Blog
- Google Cloud Next '26: Gemini Enterprise Agent Platform Leads AI-Centric News - Virtualization Review