GPU Rental Prices Compared: H100 vs A100 vs Cloud in 2026

Tags: GPU pricing, cloud computing, H100, A100, machine learning, computing costs

GPU costs dominate machine learning budgets. A single H100 can cost $2-4/hour in the cloud—$48-96/day or $1,440-2,880/month. Choose the wrong provider and you’ll hemorrhage money. Choose right and you’ll cut costs by 60%.

This guide compares real 2026 GPU rental prices across all major cloud providers and specialized GPU marketplaces, then shows when renting beats buying.

GPU Rental Market Overview: 2026

The GPU rental market has bifurcated into two tiers:

Tier 1: Enterprise Cloud (AWS, Google Cloud, Azure)

Premium pricing ($0.76-$3.50/hour), enterprise SLAs, 99.99% uptime guarantees, managed infrastructure, global availability.

Tier 2: Specialist Providers (RunPod, Lambda Labs, CoreWeave, Modal)

Discount pricing ($0.40-$2.50/hour), variable uptime (95-99%), community support, limited geographic availability.

Most ML engineers use Tier 2 for R&D and Tier 1 for production. This hybrid approach costs 40% less than pure enterprise.

H100 Pricing Across Providers

The H100 is the gold standard for large language model training. Prices in April 2026:

Provider | On-Demand | Spot/Discount | 1-Year Reserved
AWS (us-east-1) | $2.88/hr | $0.86/hr (70% off) | $1.97/hr
Google Cloud | $3.12/hr | $1.09/hr (65% off) | $2.34/hr
Lambda Labs | $3.98/hr | N/A | $3.18/hr
RunPod | $2.49/hr | $1.50/hr (40% off) | N/A
CoreWeave | $2.70/hr | $1.62/hr (40% off) | $2.16/hr

For continuous training, AWS spot instances at $0.86/hour are the cheapest H100 option in the table. However, AWS can preempt your job with only 2-5 minutes’ notice, risking hours of lost progress if you are not checkpointing regularly.

For reliability, RunPod’s $2.49/hour on-demand is safer than AWS spot, cheaper than AWS on-demand ($2.88/hour), and substantially cheaper than Lambda Labs ($3.98/hour).

H200 Pricing (The Emerging Leader)

NVIDIA’s H200 offers about 76% more memory than the H100 (141 GB vs 80 GB) with essentially the same compute, making it better for long-context models and larger batch sizes.

Provider | On-Demand | Spot/Discount
Lambda Labs | $3.98/hr | N/A
RunPod | $3.19/hr | $1.91/hr
CoreWeave | $3.49/hr | $2.09/hr
AWS | N/A yet | N/A yet

H200 will likely dominate by year-end as AWS and Google Cloud add inventory.

A100 Pricing (The Value Play)

The A100, released in 2020, remains competitive for inference and smaller model training. On AWS on-demand it costs about 58% less than the H100 ($1.21 vs $2.88/hour), while offering about 64% less compute by the figures used in this guide (20 TFLOPS vs 55 TFLOPS).

Provider | On-Demand | Spot/Discount
AWS | $1.21/hr | $0.36/hr
Google Cloud | $1.02/hr | $0.31/hr
RunPod | $0.44/hr | $0.26/hr
CoreWeave | $0.62/hr | $0.37/hr

RunPod’s $0.44/hour A100 is shockingly cheap. For inference workloads (serving models), A100 often suffices and costs about 1/6th of RunPod’s H100 on-demand price ($2.49/hour).

L40S and Consumer GPU Pricing

The L40S is a workstation GPU designed for professional graphics and light ML, not training. Prices:

  • AWS: $0.70/hour on-demand, $0.21/hour spot
  • RunPod: $0.44/hour on-demand
  • CoreWeave: $0.37/hour on-demand

L40S is useful for:

  • Rendering (3D graphics inference)
  • Image generation (SDXL, Flux inference)
  • Small model training (<10B parameters)
  • Development and testing

For a startup building a product with Stable Diffusion, L40S at CoreWeave ($0.37/hour) is ideal. It costs $266/month for continuous operation, but most inference workloads need far less.
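The monthly figure above follows from a 720-hour month (30 days × 24 hours). A minimal sketch of that arithmetic, with a utilization factor added to show how the bill shrinks when the GPU is not running around the clock; the function name and the 8-hours-per-day example are illustrative assumptions, not figures from any provider:

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, matching the monthly figures in this guide


def monthly_cost(hourly_rate: float, utilization: float = 1.0) -> float:
    """Monthly rental cost for one GPU at a given fraction of always-on usage."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * HOURS_PER_MONTH * utilization


# CoreWeave L40S at $0.37/hour, running continuously
always_on = monthly_cost(0.37)
# Hypothetical workload that only needs the GPU 8 hours a day
eight_hours_a_day = monthly_cost(0.37, utilization=8 / 24)

print(f"24/7: ${always_on:.2f}/month, 8 h/day: ${eight_hours_a_day:.2f}/month")
```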

Consumer GPU Rentals (RTX 4090 and 5090)

Consumer GPUs offer value for specific niches:

  • RTX 4090: $0.30-0.50/hour (sufficient for SDXL, Llama 13B inference)
  • RTX 5090: $0.80-1.20/hour (emerging, limited availability)

These are rarely worth it for professional ML. An A100 costs roughly 2x as much but offers up to 3.3x the VRAM (80 GB vs 24 GB on the RTX 4090) and better multi-GPU scaling.

On-Premise vs Cloud Break-Even Analysis

Should you buy a GPU instead of renting? The math depends on usage hours.

H100 Purchase Cost: $20,000 (retail)

Cost per hour at different cloud providers:

  • AWS on-demand: $2.88/hour × 8,760 hours/year = $25,229/year
  • AWS spot: $0.86/hour × 8,760 hours/year = $7,534/year
  • RunPod on-demand: $2.49/hour × 8,760 hours/year = $21,812/year

Break-even for H100 purchase:

  • At AWS on-demand ($2.88/hr): 6,944 hours (~9.5 months of continuous use)
  • At AWS spot ($0.86/hr): 23,256 hours (2.7 years of continuous use)
  • At RunPod on-demand ($2.49/hr): 8,032 hours (~11 months of continuous use)
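The break-even hours are simply the purchase price divided by the hourly rental rate. A short sketch that reproduces the list, assuming the $20,000 purchase price and ignoring electricity and maintenance (the function name is illustrative):

```python
HOURS_PER_YEAR = 8_760
PURCHASE_PRICE = 20_000  # H100 retail price assumed in this guide


def break_even_hours(purchase_price: float, hourly_rate: float) -> float:
    """Rental hours after which cumulative rent equals the purchase price."""
    return purchase_price / hourly_rate


rates = {"AWS on-demand": 2.88, "AWS spot": 0.86, "RunPod on-demand": 2.49}

for name, rate in rates.items():
    hours = break_even_hours(PURCHASE_PRICE, rate)
    months = hours / HOURS_PER_YEAR * 12  # months of continuous, 24/7 use
    print(f"{name}: {hours:,.0f} hours (~{months:.1f} months of continuous use)")
```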

Practical usage scenarios:

  • 50 hours/month (600 hours/year): Rent from RunPod. Yearly: $1,494. Buying costs $20,000 upfront + electricity, so break-even is more than 13 years away.
  • 200 hours/month (2,400 hours/year): Rent from AWS spot. Yearly: $2,064. Break-even would take almost 10 years, so in practice it never occurs.
  • 500 hours/month (6,000 hours/year): Rent from RunPod. Yearly: $14,940. Buying costs $20,000 + electricity ($2,000/year). Break-even after about 18-19 months.
  • 1,000 hours/month (12,000 GPU-hours/year): This exceeds the 8,760 hours one GPU can run in a year, so it takes two cards. Buying costs $40,000 + $4,000/year electricity. Running on cloud costs $29,880/year at RunPod on-demand. Buying wins from the second year onward.
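Once electricity is included, the break-even point depends on monthly usage rather than on continuous operation. The sketch below is one way to estimate it; the electricity rate is derived from this guide's $2,000 per 6,000 hours figure, and the function is a hypothetical helper that ignores maintenance, resale value, and financing:

```python
from typing import Optional

ELECTRICITY_PER_HOUR = 2_000 / 6_000  # about $0.33/hour, from the figures above


def break_even_months(
    purchase_price: float,
    rental_rate: float,
    hours_per_month: float,
    electricity_per_hour: float = 0.0,
) -> Optional[float]:
    """Months until owning is cheaper than renting, or None if it never is."""
    # Each hour of owned use avoids the rental rate but still costs electricity.
    monthly_saving = (rental_rate - electricity_per_hour) * hours_per_month
    if monthly_saving <= 0:
        return None
    return purchase_price / monthly_saving


# (hours per month, rental rate in $/hour) for the single-GPU scenarios
scenarios = [(50, 2.49), (200, 0.86), (500, 2.49)]

for hours, rate in scenarios:
    months = break_even_months(20_000, rate, hours, ELECTRICITY_PER_HOUR)
    label = "never" if months is None else f"{months:.0f} months"
    print(f"{hours} h/month at ${rate:.2f}/h: break-even in {label}")
```

If the rental rate is below the per-hour electricity cost, the helper returns None, which is the "break-even never occurs" case.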

Use the GPU Cost Calculator to find your break-even point with your monthly usage.

Cost Per FLOP Comparison

Different models have different efficiency. Here’s cost per TFLOPS (trillion floating point operations per second):

GPU | TFLOPS | Cost/Hour | Cost per TFLOPS-Hour
H100 (AWS on-demand) | 55 | $2.88 | $0.052
H100 (AWS spot) | 55 | $0.86 | $0.016
A100 (AWS on-demand) | 20 | $1.21 | $0.061
A100 (RunPod) | 20 | $0.44 | $0.022
L40S (CoreWeave) | 7.4 | $0.37 | $0.050

AWS spot H100 wins on cost-per-FLOP, but RunPod A100 offers reliability at nearly the same efficiency. For production systems, RunPod A100 at $0.022/TFLOP is the sweet spot.
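The table's last column is the hourly price divided by the TFLOPS figure. A small sketch that recomputes and ranks the offers, using the TFLOPS values quoted in this guide as given (they are the article's numbers, not independently benchmarked):

```python
# (label, TFLOPS as quoted in this guide, hourly price in dollars)
offers = [
    ("H100 (AWS on-demand)", 55, 2.88),
    ("H100 (AWS spot)", 55, 0.86),
    ("A100 (AWS on-demand)", 20, 1.21),
    ("A100 (RunPod)", 20, 0.44),
    ("L40S (CoreWeave)", 7.4, 0.37),
]


def cost_per_tflops_hour(tflops: float, hourly_rate: float) -> float:
    """Dollars paid per TFLOPS of capacity for one hour of rental."""
    return hourly_rate / tflops


# Cheapest compute first
ranked = sorted(offers, key=lambda o: cost_per_tflops_hour(o[1], o[2]))

for label, tflops, rate in ranked:
    print(f"{label}: ${cost_per_tflops_hour(tflops, rate):.3f} per TFLOPS-hour")
```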

Network and Storage Costs (The Hidden Killer)

GPU rental hourly rates hide additional costs:

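Data ingress/egress: AWS charges $0.12/GB egress. Moving a 100GB model out once costs $12 (inbound transfer to AWS is free), and the bill grows with every checkpoint or output you pull out. RunPod doesn’t charge for local transfers, which can save thousands monthly on transfer-heavy workloads.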

Storage: AWS charges $0.023/GB/month for S3 Standard storage. A 500GB dataset costs $11.50/month. RunPod offers free NVMe storage, making it superior for data-intensive workflows.

Network bandwidth: Inter-GPU communication within AWS is free. Across regions costs $0.02/GB. Large distributed training on multiple GPUs across regions becomes expensive quickly.

Total ownership cost for cloud GPU work can be 30% higher than the GPU hourly rate suggests.
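To see how these extras stack on top of the hourly rate, the following sketch adds egress and storage to a GPU bill. The default rates are the ones quoted above; the example workload (200 spot hours, 300 GB of egress, 500 GB stored) is a made-up assumption chosen only for illustration:

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    gpu_hours: float
    hourly_rate: float
    egress_gb: float = 0.0
    storage_gb: float = 0.0


def total_monthly_cost(
    usage: MonthlyUsage,
    egress_per_gb: float = 0.12,          # egress rate quoted in this guide
    storage_per_gb_month: float = 0.023,  # storage rate quoted in this guide
) -> dict:
    """Break a monthly bill into GPU, egress, and storage components."""
    gpu = usage.gpu_hours * usage.hourly_rate
    egress = usage.egress_gb * egress_per_gb
    storage = usage.storage_gb * storage_per_gb_month
    total = gpu + egress + storage
    return {
        "gpu": gpu,
        "egress": egress,
        "storage": storage,
        "total": total,
        "overhead_vs_gpu": (total - gpu) / gpu if gpu else 0.0,
    }


# Hypothetical month: 200 hours of AWS H100 spot plus data movement and storage
bill = total_monthly_cost(
    MonthlyUsage(gpu_hours=200, hourly_rate=0.86, egress_gb=300, storage_gb=500)
)
print(f"total ${bill['total']:.2f}, overhead {bill['overhead_vs_gpu']:.0%}")
```

With cheap spot compute, fixed data costs become a larger share of the bill, which is why the overhead matters most for the lowest hourly rates.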

Spot vs On-Demand: When to Use Each

Use spot instances ($0.30-0.86/hour) for:

  • Model training (checkpoints every 30 minutes protect against preemption)
  • Data preprocessing
  • Batch inference jobs
  • Research and experimentation
  • Development and testing
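Checkpointing is what makes spot instances safe for training. The sketch below is a framework-free toy, not a real training script: the "work" is a stand-in sum, state is saved as JSON, and the `stop_at` parameter is a hypothetical hook that simulates a preemption. The pattern it shows (resume if a checkpoint exists, save periodically, write atomically) carries over to real training loops:

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def train(total_steps: int, ckpt_path: Path, checkpoint_every: int = 100,
          stop_at: Optional[int] = None) -> dict:
    """Toy training loop that resumes from ckpt_path if it exists."""
    state = {"step": 0, "loss_sum": 0.0}
    if ckpt_path.exists():
        state = json.loads(ckpt_path.read_text())  # resume from last checkpoint

    while state["step"] < total_steps:
        if stop_at is not None and state["step"] == stop_at:
            return state  # simulated preemption: unsaved progress is lost
        state["step"] += 1
        state["loss_sum"] += 1.0 / state["step"]  # stand-in for real work
        if state["step"] % checkpoint_every == 0:
            tmp = ckpt_path.with_suffix(".tmp")
            tmp.write_text(json.dumps(state))
            os.replace(tmp, ckpt_path)  # atomic swap: a kill mid-write cannot corrupt it
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = Path(workdir) / "ckpt.json"
    train(1_000, ckpt, stop_at=250)   # preempted after step 250
    final = train(1_000, ckpt)        # resumes from the step-200 checkpoint
    print(f"finished at step {final['step']}")
```

Only the 50 steps between the last checkpoint and the preemption are repeated, which is the cost the checkpoint interval controls.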

Use on-demand instances ($2.49-3.98/hour) for:

  • Real-time inference (serving models to users)
  • Fine-tuning with short training windows
  • Time-sensitive production jobs
  • Enterprise SLA requirements

A hybrid approach uses spot for research and on-demand for production, reducing average costs by 40-50%.
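The size of the hybrid saving depends on what share of hours can run on spot. A rough sketch using the AWS H100 prices from the table; the function names are illustrative, and the spot shares are assumptions to vary for your own mix:

```python
def blended_rate(on_demand: float, spot: float, spot_share: float) -> float:
    """Average hourly rate when spot_share of hours run on spot instances."""
    if not 0.0 <= spot_share <= 1.0:
        raise ValueError("spot_share must be between 0 and 1")
    return spot_share * spot + (1.0 - spot_share) * on_demand


def savings_vs_on_demand(on_demand: float, spot: float, spot_share: float) -> float:
    """Fractional saving of the blended rate relative to pure on-demand."""
    return 1.0 - blended_rate(on_demand, spot, spot_share) / on_demand


# AWS H100: $2.88 on-demand, $0.86 spot
for share in (0.5, 0.6, 0.7):
    saving = savings_vs_on_demand(2.88, 0.86, share)
    print(f"{share:.0%} of hours on spot: {saving:.1%} cheaper than on-demand")
```

At these prices, a 40-50% saving corresponds to running roughly 60-70% of GPU hours on spot.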

Recommendations by Use Case

Scenario 1: Individual researcher training a model

Use RunPod spot H100 at $1.50/hour. Cost for 1,000 hours training: $1,500. If premature termination causes restarts (+20% overhead), cost is $1,800. Acceptable for research.

Scenario 2: Startup running inference at scale

Use AWS A100 on-demand at $1.21/hour in an auto-scaling group. Peak load 8 hours/day requires 2-3 GPUs. Monthly: 8 hours × 30 days × 2.5 GPUs × $1.21 = $726. Alternative: CoreWeave A100 at $0.62/hour = $372/month. CoreWeave saves about 49%.

Scenario 3: Company doing continuous training

500 hours/month with reliability requirements. AWS reserved H100 at $1.97/hour × 500 = $985/month. Or buy an H100 ($20,000) and run 500 hours/month. Payback in 20 months plus electricity and maintenance. Renting is safer.

Scenario 4: Large-scale research (Kaggle competition)

1,000+ hours needed urgently. Use AWS spot H100 at $0.86/hour × 1,000 = $860. Risk of preemption is worth the 70% savings vs on-demand ($2,880).

Pro Tips for Minimizing GPU Costs

1. Use mixed precision (FP16) - Can cut memory usage by up to 50%, which may let the job fit on a smaller GPU. Training on an A100 instead of an H100 saves about $2/hour at RunPod’s on-demand rates ($0.44 vs $2.49).
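The memory effect can be estimated from bytes per parameter. This is a back-of-the-envelope sketch for weights only; real training also holds gradients, optimizer state, and activations, and mixed-precision setups often keep an FP32 copy of the weights, so actual savings are usually smaller. The 7B-parameter model is a hypothetical example:

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory needed to store the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


params = 7e9  # hypothetical 7B-parameter model
for dtype in ("fp32", "fp16"):
    print(f"{dtype}: {weight_memory_gb(params, dtype):.0f} GB for weights")
```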

2. Profile before scaling - Verify your job actually uses the GPU. Many ML workloads bottleneck on data loading, not GPU compute. Profiling saves unnecessary rental costs.
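A simple first check is to time data loading and the training step separately. The sketch below uses `time.sleep` stand-ins for both, so `load_batch` and `train_step` are placeholders for your own functions. One caveat: GPU frameworks often launch work asynchronously, so a real `train_step` should block until the GPU has finished before the timer is read.

```python
import time
from typing import Any, Callable


def profile_loop(load_batch: Callable[[], Any],
                 train_step: Callable[[Any], None],
                 num_batches: int) -> dict:
    """Measure wall-clock time spent loading data vs running the training step."""
    load_s = step_s = 0.0
    for _ in range(num_batches):
        t0 = time.perf_counter()
        batch = load_batch()
        t1 = time.perf_counter()
        train_step(batch)  # must block until the GPU work is actually done
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
    return {
        "load_s": load_s,
        "step_s": step_s,
        "load_fraction": load_s / (load_s + step_s),
    }


def slow_loader() -> list:
    time.sleep(0.02)   # stand-in for slow disk or network reads
    return [0] * 8


def fast_step(batch: list) -> None:
    time.sleep(0.005)  # stand-in for a quick GPU step


result = profile_loop(slow_loader, fast_step, num_batches=10)
print(f"{result['load_fraction']:.0%} of time spent waiting on data")
```

If most of the time goes to loading, a faster GPU will sit idle, and the money is better spent on the input pipeline.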

3. Batch inference requests - Serve 100 inference requests in one GPU call instead of 100 separate calls. Where latency budgets allow, this can reduce GPU rental hours by as much as 95%.

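4. Use spot instances with checkpoints - Suppose preemption costs you $1.50 of lost work for every 10 hours of spot use. AWS H100 spot saves $2.02/hour over on-demand ($2.88 vs $0.86), or $20.20 over those 10 hours, so spot stays ahead unless preemption losses grow to more than 13 times that level.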
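The trade-off between spot savings and preemption losses can be written as a net saving per hour. A minimal sketch using the AWS H100 prices from the table and the $1.50-per-preemption loss mentioned above; the preemption rate is an assumption you would replace with your own observed figure:

```python
def net_spot_saving_per_hour(on_demand: float, spot: float,
                             preemptions_per_hour: float,
                             loss_per_preemption: float) -> float:
    """Hourly saving from spot after subtracting the expected cost of lost work."""
    return (on_demand - spot) - preemptions_per_hour * loss_per_preemption


def break_even_preemptions_per_hour(on_demand: float, spot: float,
                                    loss_per_preemption: float) -> float:
    """Preemption rate at which spot and on-demand cost the same."""
    return (on_demand - spot) / loss_per_preemption


# One preemption per 10 hours, each wasting $1.50 of compute
net = net_spot_saving_per_hour(2.88, 0.86, preemptions_per_hour=0.1,
                               loss_per_preemption=1.50)
limit = break_even_preemptions_per_hour(2.88, 0.86, loss_per_preemption=1.50)
print(f"net saving ${net:.2f}/hour; spot stops paying off above {limit:.2f} preemptions/hour")
```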

5. Compare total cost of ownership - H100 rental looks expensive until you compare with storage, network, and personnel costs. Use the GPU Cost Calculator for full accounting.

Future Trends

NVIDIA’s Blackwell architecture (2026-2027) will offer 2x performance of H100. Expect current H100 prices to drop 40% as inventory shifts to Blackwell. Planning a 6-month project? Wait for Blackwell to avoid overpaying for aging hardware.

AMD’s MI300X offers 30% cost savings over H100 but has lower adoption. By 2027, expect AMD to capture 30-40% of the GPU market, driving competitive pricing.

Conclusion

On-demand GPU rental costs range from about $0.30/hour (consumer RTX 4090) to $3.98/hour (Lambda Labs H100 or H200) depending on provider, model, and discounts. Smart teams use RunPod for research (save 40%), AWS for production (reliability), and A100 instead of H100 for inference (save 60%). The difference between optimal and naive GPU spending is 60-70% of total ML infrastructure cost.

Frequently Asked Questions

What is the cheapest GPU for renting in 2026?
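Among the prices listed in this guide, the cheapest options are AWS L40S spot at $0.21/hour and RunPod A100 spot at $0.26/hour. For on-demand, CoreWeave’s L40S at $0.37/hour is the lowest. For H100, RunPod’s $2.49/hour is the lowest on-demand rate and AWS spot at $0.86/hour is the lowest overall. Always use the GPU Cost Calculator for your specific workload.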
Should I use AWS, Google Cloud, or specialized providers like RunPod?
AWS and GCP offer reliability and enterprise support (+30% cost). RunPod and CoreWeave cost 40-60% less but have occasional outages. For production workloads, AWS costs $2.88/hour for H100. For research, RunPod’s $2.49/hour is cheaper. Use the GPU Cost Calculator to compare scenarios.
When should I buy a GPU instead of renting?
An H100 GPU ($20,000) pays for itself in roughly 7,000-8,000 hours of rental at on-demand cloud prices. That’s about 9.5-11 months of continuous use. If your project runs <100 hours/month, rent. If >500 hours/month, buy. See the Cloud Cost Calculator for break-even analysis.
What is spot pricing and is it reliable?
Spot pricing (spare cloud capacity) costs 60-80% less but can terminate with 2-5 minute notice. Use for training, preprocessing, and batch jobs. Never use spot for serving models or real-time inference. GPU Cost Calculator includes spot vs on-demand scenarios.
How do I compare performance per dollar?
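FLOPS (floating point operations per second) per dollar is key. H100 offers 55 TFLOPS for $2.88/hour on AWS = 19.1 TFLOPS per dollar-hour. A100 offers 20 TFLOPS for $0.44/hour on RunPod = 45.5 TFLOPS per dollar-hour. At those prices the A100 wins on efficiency, although at AWS on-demand ($1.21/hour, 16.5 TFLOPS per dollar-hour) it does not. Use the GPU Cost Calculator for your specific model.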
Are there monthly or annual discounts for GPU rental?
AWS offers 1-year reservations at 30% off on-demand. Google Cloud offers 1-year commitments at 25% off. RunPod discounts only for large teams. Most savings come from spot instances (70% off). Compare in the GPU Cost Calculator.

Brandon Sorensen

Founder & Editor

Brandon Sorensen is the founder and editor of CalcCenter.io. He is not a licensed financial advisor, tax professional, or medical practitioner — every calculator on the site uses formulas drawn from primary authoritative sources (IRS publications, Federal Reserve data, WHO and CDC standards, peer-reviewed journals), and the formula plus a worked example is published on each calculator page so users can verify the methodology themselves and consult a licensed professional for case-specific decisions.

Disclaimer: This article is for informational purposes only and should not be considered financial, tax, legal, or professional advice. Always consult with a qualified professional before making important financial decisions. CalcCenter calculators are tools for estimation and should not be relied upon as definitive sources for tax, financial, or legal matters.