infrabenchmark.com

See what's working in cold email right now

Mar 26–Apr 8 · 501,083 emails · 11 campaigns · 4 segments
Azure 201,538 · Google 176,275 · M365 123,270

Across 501,083 emails, 11 campaigns, and 4 segments, Azure gets the most replies with a score of 110. Google is second (103), followed by M365 (88).

Weekly Scorecard

Leader
#1

Azure

110

+1 WoW

Normalized Index

#2

Google

103

-7 WoW

Normalized Index

#3

M365

88

-2 WoW

Normalized Index

Scores adjusted for infrastructure cost. Azure is significantly cheaper per inbox than Google and M365, which cost about the same.

11 campaigns · 501,083 emails sent over a 2-week period · Best: Azure

14-Day Trendline

[Chart: 14-day normalized index trend for Azure, Google, and M365]

Google +68 · M365 +250
Extra emails per reply. If Azure is the best this week and Google shows +68, you would need to send 68 more emails on Google to get one additional reply compared to Azure.

Key Takeaways

  • Google outperforms on SMB and Personal segments but shows significantly more day-to-day variance.
  • Azure is the most consistent provider with the lowest volatility across all segments.
  • Microsoft 365 trails overall but shows the smallest gap in Enterprise, where it occasionally edges out Google.
  • E-Commerce is the most competitive segment where Azure and Google frequently trade positions.
  • The AI Infrastructure Consulting campaign (SMB) produces the highest reply rates regardless of provider.

Methodology & FAQ

What is the Normalized Index?

It's a score where 100 is average. A provider scoring 110 is performing 10% better than the average across all providers; one scoring 90 is 10% worse. We calculate it by dividing each provider's reply rate by the overall average reply rate, then multiplying by 100.
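As a sketch, the calculation described above looks like this (the reply rates below are illustrative values chosen to reproduce this week's 110/103/88 scores, not the benchmark's actual data):

```python
def normalized_index(reply_rates):
    """Divide each provider's reply rate by the overall
    average reply rate, then multiply by 100."""
    avg = sum(reply_rates.values()) / len(reply_rates)
    return {p: round(r / avg * 100) for p, r in reply_rates.items()}

# Illustrative reply rates (fractions of emails that got a reply)
scores = normalized_index({"Azure": 0.0110, "Google": 0.0103, "M365": 0.0088})
# → {"Azure": 110, "Google": 103, "M365": 88}
```

A score of 100 always means "exactly average for this week," so the index is comparable across weeks even when overall reply rates drift.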

Who's behind this?

We're two cold email agency owners. We've been in the space for about a year and a half, do a collective $150k+ in MRR, manage 30+ clients, and send roughly a million emails per month. We built this because we wanted to know which infrastructure actually works best, and figured everyone else does too.

How often is the data updated?

Every Sunday we sit down, pull the numbers from our campaigns, normalize everything, and publish it. The data is lagged by 7 days so all replies have time to come in before we count them.

Why show a normalized score instead of raw reply rates?

Because reply rates depend heavily on the offer, the list, and the copy, not just the infrastructure. If we showed raw rates, people would draw the wrong conclusions. The normalized score strips out everything except the infrastructure difference, which is what actually matters here.

How do I request a test?

Email testing@infrabenchmark.com. We read every request and prioritize based on what the community wants to see.

What does the +X number on the trendline mean?

It's simple: if Google is the leader and Azure shows +125, you'd need to send 125 more emails on Azure than on Google to get one additional reply. The lower the number, the closer the performance.
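One plausible way to compute that figure is the difference in emails-per-reply between the two providers (the site doesn't publish its exact formula, and the rates below are hypothetical):

```python
def extra_emails_per_reply(leader_rate, other_rate):
    """Extra emails the trailing provider needs, relative to the
    leader, to produce one additional reply. Rates are reply
    fractions, e.g. 0.01 = 1% of emails get a reply."""
    return round(1 / other_rate - 1 / leader_rate)

# Hypothetical: leader replies at 1.0%, the other at 0.6%
extra_emails_per_reply(0.01, 0.006)  # → 67
```

A value of 0 would mean the two providers need the same volume per reply; larger values mean a wider gap.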

Which campaigns are included in the benchmark?

We run 30+ campaigns across our agencies. For this benchmark, we picked the ones from our longest-running, highest-volume clients: accounts that have been active for 6+ months. That way the data is meaningful, not based on one-off tests.

Testing Queue

What we're testing next

1. SMTP+ providers · Multiple vendors under evaluation
2. Warmup duration: 14-day vs 30-day ramp · Comparing inbox reputation build rates
3. IP location impact · US-based vs international sending IPs
4. Branded vs non-branded domains · Does using your brand name in the domain affect reply rates?

Internal Tools Used

Sequencer: Email Bison, Smartlead
Lead Lists: Blitz API, Apollo, LinkedIn
Personalization: Clay
List Cleaning: MillionVerifier
Inbox Providers: Multiple