GPU Rental Prices Compared: H100 vs A100 vs Cloud in 2026

Tags: GPU pricing, cloud computing, H100, A100, machine learning, computing costs

GPU costs dominate machine learning budgets. A single H100 can cost $2-4/hour in the cloud—$48-96/day or $1,440-2,880/month. Choose the wrong provider and you’ll hemorrhage money. Choose right and you’ll cut costs by 60%.

This guide compares real 2026 GPU rental prices across all major cloud providers and specialized GPU marketplaces, then shows when renting beats buying.

GPU Rental Market Overview: 2026

The GPU rental market has bifurcated into two tiers:

Tier 1: Enterprise Cloud (AWS, Google Cloud, Azure)

Premium pricing ($0.76-$3.50/hour), enterprise SLAs, 99.99% uptime guarantees, managed infrastructure, global availability.

Tier 2: Specialist Providers (RunPod, Lambda Labs, CoreWeave, Modal)

Discount pricing ($0.40-$2.50/hour), variable uptime (95-99%), community support, limited geographic availability.

Most ML engineers use Tier 2 for R&D and Tier 1 for production. This hybrid approach costs 40% less than pure enterprise.

H100 Pricing Across Providers

The H100 is the gold standard for large language model training. Prices in April 2026:

Provider         | On-Demand | Spot/Discount      | 1-Year Reserved
AWS (us-east-1)  | $2.88/hr  | $0.86/hr (70% off) | $1.97/hr
Google Cloud     | $3.12/hr  | $1.09/hr (65% off) | $2.34/hr
Lambda Labs      | $3.98/hr  | N/A                | $3.18/hr
RunPod           | $2.49/hr  | $1.50/hr (40% off) | N/A
CoreWeave        | $2.70/hr  | $1.62/hr (40% off) | $2.16/hr

For continuous training, AWS spot instances at $0.86/hour are unbeatable. However, AWS can preempt your job with 2-5 minutes notice, risking hours of lost progress.

For reliability, RunPod’s $2.49/hour on-demand avoids preemption risk entirely while remaining substantially cheaper than Lambda Labs.

H200 Pricing (The Emerging Leader)

NVIDIA’s H200 pairs the H100’s compute with 76% more memory (141 GB vs 80 GB) and higher memory bandwidth, making it better for long-context models and larger batch sizes.

Provider    | On-Demand         | Spot/Discount
Lambda Labs | $3.98/hr          | N/A
RunPod      | $3.19/hr          | $1.91/hr
CoreWeave   | $3.49/hr          | $2.09/hr
AWS         | Not yet available | Not yet available

H200 will likely dominate by year-end as AWS and Google Cloud add inventory.

A100 Pricing (The Value Play)

The A100, released in 2020, remains competitive for inference and smaller model training. It costs 40-60% less than the H100 but delivers roughly a third of the compute (about 20 TFLOPS vs 55 TFLOPS in the comparison below).

Provider     | On-Demand | Spot/Discount
AWS          | $1.21/hr  | $0.36/hr
Google Cloud | $1.02/hr  | $0.31/hr
RunPod       | $0.44/hr  | $0.26/hr
CoreWeave    | $0.62/hr  | $0.37/hr

RunPod’s $0.44/hour A100 is shockingly cheap. For inference workloads (serving models), A100 often suffices and costs 1/6th of H100 pricing.

L40S and Consumer GPU Pricing

The L40S is a data center GPU built for professional graphics and inference rather than heavy training. Prices:

  • AWS: $0.70/hour on-demand, $0.21/hour spot
  • RunPod: $0.44/hour on-demand
  • CoreWeave: $0.37/hour on-demand

L40S is useful for:

  • Rendering (3D graphics inference)
  • Image generation (SDXL, Flux inference)
  • Small model training (<10B parameters)
  • Development and testing

For a startup building a product with Stable Diffusion, L40S at CoreWeave ($0.37/hour) is ideal. It costs $266/month for continuous operation, but most inference workloads need far less.

Consumer GPU Rentals (RTX 4090 and 5090)

Consumer GPUs offer value for specific niches:

  • RTX 4090: $0.30-0.50/hour (sufficient for SDXL, Llama 13B inference)
  • RTX 5090: $0.80-1.20/hour (emerging, limited availability)

These are rarely worth it for professional ML. An A100 80GB rents for a similar hourly rate (RunPod: $0.44/hour) yet offers more than 3x the VRAM (80 GB vs 24 GB) and far better multi-GPU scaling.

On-Premise vs Cloud Break-Even Analysis

Should you buy a GPU instead of renting? The math depends on usage hours.

H100 Purchase Cost: $20,000 (retail)

Cost per hour at different cloud providers:

  • AWS on-demand: $2.88/hour × 8,760 hours/year = $25,229/year
  • AWS spot: $0.86/hour × 8,760 hours/year = $7,534/year
  • RunPod on-demand: $2.49/hour × 8,760 hours/year = $21,812/year

Break-even for H100 purchase:

  • At AWS on-demand ($2.88/hr): 6,944 hours (~9.5 months of continuous use)
  • At AWS spot ($0.86/hr): 23,256 hours (2.7 years of continuous use)
  • At RunPod on-demand ($2.49/hr): 8,032 hours (~11 months of continuous use)

Practical usage scenarios:

  • 50 hours/month (600 hours/year): Rent from RunPod. Yearly: ~$1,494. Buying costs $20,000 upfront plus electricity.
  • 200 hours/month (2,400 hours/year): Rent from AWS spot. Yearly: $2,064. Break-even never occurs.
  • 500 hours/month (6,000 hours/year): Rent from RunPod. Yearly: $14,940. Buying costs $20,000 plus electricity (~$2,000/year), so break-even arrives around month 16-19.
  • 1,000 hours/month (12,000 hours/year, which means more than one GPU running continuously): Buy. Yearly ownership: $20,000 + $4,000 electricity = $24,000 in year one. The same hours on RunPod cost roughly $29,900. Buying wins.
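The break-even arithmetic above can be sketched in a few lines of Python. Prices and the ~$0.33/hour electricity figure are the article's example numbers, not quotes:

```python
# Hypothetical break-even sketch: hours of use at which buying a
# $20,000 H100 becomes cheaper than renting it. Electricity assumed
# at ~$0.33/hour (roughly $2,000 per 6,000 hours, as above).
PURCHASE_PRICE = 20_000
ELECTRICITY_PER_HOUR = 0.33

def break_even_hours(rental_rate_per_hour: float) -> float:
    """Hours at which cumulative rental cost equals purchase price
    plus electricity for the same hours of on-prem use."""
    return PURCHASE_PRICE / (rental_rate_per_hour - ELECTRICITY_PER_HOUR)

for name, rate in [("AWS on-demand", 2.88), ("AWS spot", 0.86),
                   ("RunPod on-demand", 2.49)]:
    h = break_even_hours(rate)
    print(f"{name}: ~{h:,.0f} hours (~{h / 730:.1f} months continuous)")
```

Note that accounting for electricity pushes break-even slightly later than the headline figures above, which compare rental cost against purchase price alone.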

Use the GPU Cost Calculator to find your break-even point with your monthly usage.

Cost Per FLOP Comparison

Different GPUs deliver different value per unit of compute. Here’s cost per TFLOP-hour (one TFLOPS is one trillion floating point operations per second):

GPU                  | TFLOPS | Cost/Hour | Cost per TFLOP-Hour
H100 (AWS on-demand) | 55     | $2.88     | $0.052
H100 (AWS spot)      | 55     | $0.86     | $0.016
A100 (AWS on-demand) | 20     | $1.21     | $0.061
A100 (RunPod)        | 20     | $0.44     | $0.022
L40S (CoreWeave)     | 7.4    | $0.37     | $0.050

AWS spot H100 wins on cost-per-FLOP, but RunPod A100 offers reliability at nearly the same efficiency. For production systems, RunPod A100 at $0.022/TFLOP is the sweet spot.
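The table's last column reduces to one division; a quick sketch using the article's prices and TFLOPS figures as given:

```python
# Recompute the table's cost-per-TFLOP-hour column from its inputs.
gpus = {
    "H100 (AWS on-demand)": (55, 2.88),
    "H100 (AWS spot)":      (55, 0.86),
    "A100 (AWS on-demand)": (20, 1.21),
    "A100 (RunPod)":        (20, 0.44),
    "L40S (CoreWeave)":     (7.4, 0.37),
}
for name, (tflops, dollars_per_hour) in gpus.items():
    print(f"{name}: ${dollars_per_hour / tflops:.3f} per TFLOP-hour")
```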

Network and Storage Costs (The Hidden Killer)

GPU rental hourly rates hide additional costs:

Data egress: AWS charges $0.12/GB to move data out (ingress is free). Pulling a 100GB model checkpoint out twice costs $24. RunPod doesn’t charge for local transfers, which adds up fast for data-heavy teams.

Storage: AWS charges $0.023/GB/month for EBS. A 500GB dataset costs $11.50/month. RunPod offers free NVMe storage, making it superior for data-intensive workflows.

Network bandwidth: Inter-GPU communication within AWS is free. Across regions costs $0.02/GB. Large distributed training on multiple GPUs across regions becomes expensive quickly.

Total ownership cost for cloud GPU work can be 30% higher than the GPU hourly rate suggests.
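To make the hidden costs concrete, here's a rough monthly total-cost sketch using the rates quoted above (the egress and storage rates are the AWS figures; they vary by provider and region):

```python
# Rough monthly total cost of ownership: compute + egress + storage.
def monthly_tco(gpu_hours, gpu_rate, egress_gb=0, egress_rate=0.12,
                storage_gb=0, storage_rate=0.023):
    return (gpu_hours * gpu_rate          # GPU rental
            + egress_gb * egress_rate     # data moved out of the cloud
            + storage_gb * storage_rate)  # persistent volume, per month

# Example: 200 H100-hours on AWS, 500 GB of checkpoints pulled out,
# and a 500 GB EBS volume.
total = monthly_tco(200, 2.88, egress_gb=500, storage_gb=500)
print(f"${total:,.2f}/month")  # $576 compute + $60 egress + $11.50 storage
```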

Spot vs On-Demand: When to Use Each

Use spot instances ($0.30-0.86/hour) for:

  • Model training (checkpoints every 30 minutes protect against preemption)
  • Data preprocessing
  • Batch inference jobs
  • Research and experimentation
  • Development and testing

Use on-demand instances ($2.49-3.98/hour) for:

  • Real-time inference (serving models to users)
  • Fine-tuning with short training windows
  • Time-sensitive production jobs
  • Enterprise SLA requirements

A hybrid approach uses spot for research and on-demand for production, reducing average costs by 40-50%.

Recommendations by Use Case

Scenario 1: Individual researcher training a model

Use RunPod spot H100 at $1.50/hour. Cost for 1,000 hours training: $1,500. If premature termination causes restarts (+20% overhead), cost is $1,800. Acceptable for research.

Scenario 2: Startup running inference at scale

Use AWS A100 on-demand at $1.21/hour in auto-scaling group. Peak load 8 hours/day requires 2-3 GPUs. Monthly: 8 hours × 30 days × 2.5 GPUs × $1.21 = $726. Alternative: CoreWeave A100 at $0.62/hour = $372/month. CoreWeave saves 50%.

Scenario 3: Company doing continuous training

500 hours/month with reliability requirements. AWS reserved H100 at $1.97/hour × 500 = $985/month. Or buy an H100 ($20,000) and run 500 hours/month. Payback in 20 months plus electricity and maintenance. Renting is safer.

Scenario 4: Large-scale research (Kaggle competition)

1,000+ hours needed urgently. Use AWS spot H100 at $0.86/hour × 1,000 = $860. Risk of preemption is worth the 70% savings vs on-demand ($2,880).

Pro Tips for Minimizing GPU Costs

1. Use mixed precision (FP16) - Cuts memory usage for weights and activations roughly in half, letting the same model train on a smaller GPU. Training on a RunPod A100 ($0.44/hour) instead of an H100 ($2.49/hour) saves about $2/hour.
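A back-of-envelope memory estimate shows why precision matters. This is a simplification: real mixed-precision training keeps FP32 master weights and optimizer state, so actual savings are smaller than a clean halving.

```python
# Rough training memory: weights + gradients + 2 Adam moment buffers,
# all counted at the same precision (activations excluded).
def training_memory_gb(params_billion: float, bytes_per_param: int) -> float:
    return params_billion * 1e9 * bytes_per_param * 4 / 1e9

print(f"7B model, FP32: {training_memory_gb(7, 4):.0f} GB")  # 112 GB
print(f"7B model, FP16: {training_memory_gb(7, 2):.0f} GB")  # 56 GB, fits an 80 GB A100
```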

2. Profile before scaling - Verify your job actually uses the GPU. Many ML workloads bottleneck on data loading, not GPU compute. Profiling saves unnecessary rental costs.
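One simple way to spot a data-loading bottleneck is to time loading against compute per batch. The loader and step functions below are placeholders standing in for a real pipeline:

```python
import time

# If load time dominates, the fix is cheaper storage or more dataloader
# workers -- not a faster (more expensive) GPU.
def profile_step(load_batch, run_batch, n=20):
    load_t = compute_t = 0.0
    for _ in range(n):
        t0 = time.perf_counter()
        batch = load_batch()
        t1 = time.perf_counter()
        run_batch(batch)
        t2 = time.perf_counter()
        load_t += t1 - t0
        compute_t += t2 - t1
    return load_t / n, compute_t / n

# Stand-ins: here loading is ~10x slower than compute, so renting a
# faster GPU would not help this job.
load_ms, compute_ms = profile_step(lambda: time.sleep(0.01),
                                   lambda b: time.sleep(0.001))
print(f"load {load_ms*1000:.1f} ms vs compute {compute_ms*1000:.1f} ms per batch")
```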

3. Batch inference requests - Serve 100 inference requests in one GPU call instead of 100 separate calls. Amortizing launch and memory-transfer overhead this way can cut GPU-hours per request by an order of magnitude.
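The idea can be sketched with a stand-in model; real speedups depend on the model and hardware, and the order-of-magnitude figure assumes per-call overhead dominates:

```python
# Batched vs one-at-a-time inference. `model` is a stand-in callable
# that processes a whole list of inputs in one "GPU call".
def infer_one_at_a_time(model, requests):
    return [model([r])[0] for r in requests]           # N calls for N requests

def infer_batched(model, requests, batch_size=32):
    out = []
    for i in range(0, len(requests), batch_size):
        out.extend(model(requests[i:i + batch_size]))  # ceil(N/32) calls
    return out

model = lambda batch: [x * 2 for x in batch]  # placeholder "model"
reqs = list(range(100))
assert infer_batched(model, reqs) == infer_one_at_a_time(model, reqs)
```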

4. Use spot instances with checkpoints - On AWS, spot saves about $2/hour versus on-demand ($0.86 vs $2.88 for an H100). Even if a preemption costs you a full hour of redone work, one preemption every 10 hours still leaves you roughly $19 ahead per stretch. Checkpoint every 30 minutes and spot almost always wins.
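A minimal checkpoint/resume loop makes this concrete. The file name and interval are illustrative; a real job would persist model and optimizer state, not just a step counter:

```python
import json
import os

CKPT = "checkpoint.json"   # illustrative path
CHECKPOINT_EVERY = 100     # steps between saves

def load_checkpoint() -> int:
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)["step"]
    return 0

def train(total_steps: int) -> int:
    step = load_checkpoint()   # a restarted spot job resumes here
    while step < total_steps:
        step += 1              # one real training step would go here
        if step % CHECKPOINT_EVERY == 0 or step == total_steps:
            with open(CKPT, "w") as f:
                json.dump({"step": step}, f)
    return step

train(250)                     # a preemption now loses <100 steps of work
assert load_checkpoint() == 250
os.remove(CKPT)                # clean up the demo file
```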

5. Compare total cost of ownership - H100 rental looks expensive until you compare with storage, network, and personnel costs. Use the GPU Cost Calculator for full accounting.

Future Trends

NVIDIA’s Blackwell architecture (2026-2027) will offer 2x performance of H100. Expect current H100 prices to drop 40% as inventory shifts to Blackwell. Planning a 6-month project? Wait for Blackwell to avoid overpaying for aging hardware.

AMD’s MI300X offers 30% cost savings over H100 but has lower adoption. By 2027, expect AMD to capture 30-40% of the GPU market, driving competitive pricing.

Conclusion

GPU rental costs range from about $0.30/hour (consumer RTX 4090) to $3.98/hour (H200 at Lambda Labs) depending on provider, model, and discounts. Smart teams use RunPod for research (save 40%), AWS for production (reliability), and A100 instead of H100 for inference (save 60%). The difference between optimal and naive GPU spending can be 60-70% of total ML infrastructure cost.

Ready to calculate?

Try our free GPU Cost Calculator to get accurate estimates for AI training and inference costs instantly.

Try the Calculator

Frequently Asked Questions

What is the cheapest GPU for renting in 2026?
RunPod is typically cheapest: H100 at $2.49/hour on-demand ($1.50/hour spot) and A100 at $0.44/hour ($0.26/hour spot). For light workloads, CoreWeave's L40S at $0.37/hour is the lowest entry point. Always use the GPU Cost Calculator for your specific workload.
Should I use AWS, Google Cloud, or specialized providers like RunPod?
AWS and GCP offer reliability and enterprise support at roughly 30% higher cost. RunPod and CoreWeave cost 40-60% less but have occasional outages. For production workloads, AWS at $2.88/hour for an H100 buys that reliability. For research, RunPod's $2.49/hour on-demand (or $1.50/hour spot) is usually the better deal. Use the GPU Cost Calculator to compare scenarios.
When should I buy a GPU instead of renting?
An H100 GPU ($20,000) pays for itself after roughly 7,000-8,000 hours of rental at on-demand cloud prices, about 10-11 months of continuous use. If your project runs under 200 hours/month, rent. Above roughly 700 hours/month, buying starts to win. See the Cloud Cost Calculator for break-even analysis.
What is spot pricing and is it reliable?
Spot pricing (spare cloud capacity) costs 60-80% less but can terminate with 2-5 minute notice. Use for training, preprocessing, and batch jobs. Never use spot for serving models or real-time inference. GPU Cost Calculator includes spot vs on-demand scenarios.
How do I compare performance per dollar?
FLOPS (floating point operations per second) per dollar is key. An H100 offers 55 TFLOPS for $2.88/hour, about 19 TFLOPS per dollar-hour. An A100 on RunPod offers 20 TFLOPS for $0.44/hour, about 45 TFLOPS per dollar-hour. The A100 wins on efficiency. Use the GPU Cost Calculator for your specific model.
Are there monthly or annual discounts for GPU rental?
AWS offers 1-year reservations at 30% off on-demand. Google Cloud offers 1-year commitments at 25% off. RunPod discounts only for large teams. Most savings come from spot instances (70% off). Compare in the GPU Cost Calculator.


James Whitfield

Lead Editor & Calculator Architect

James Whitfield is the lead editor and calculator architect at CalcCenter. With a background in applied mathematics and financial analysis, he oversees the development and accuracy of every calculator and guide on the site. James is committed to making complex calculations accessible and ensuring every tool is backed by verified, industry-standard formulas from authoritative sources like the IRS, Federal Reserve, WHO, and CDC.


Disclaimer: This article is for informational purposes only and should not be considered financial, tax, legal, or professional advice. Always consult with a qualified professional before making important financial decisions. CalcCenter calculators are tools for estimation and should not be relied upon as definitive sources for tax, financial, or legal matters.