infrabenchmark.com

See what's working in cold email right now

Mar 26–Apr 8 · 501,083 emails · 11 campaigns · 4 segments
Azure 201,538 · Google 176,275 · M365 123,270

Across 501,083 emails, 11 campaigns, and 4 segments, Azure gets the most replies with a score of 110. Google is second (103), followed by M365 (88).

Weekly Scorecard

Leader
#1

Azure

110

+1 WoW

Normalized Index

#2

Google

103

-7 WoW

Normalized Index

#3

M365

88

-2 WoW

Normalized Index

Scores adjusted for infrastructure cost. Azure is significantly cheaper per inbox than Google and M365, which cost about the same.

11 campaigns · 501,083 emails sent over a 2-week period · Best: Azure

14-Day Trendline

[Chart: 14-day normalized index trend for Azure, Google, and M365]

Google +68 · M365 +250
Extra emails per reply. If Azure is the best this week and Google shows +68, you would need to send 68 more emails on Google to get one additional reply compared to Azure.

Key Takeaways

  • Google outperforms on SMB and Personal segments but shows significantly more day-to-day variance.
  • Azure is the most consistent provider with the lowest volatility across all segments.
  • Microsoft 365 trails overall but shows the smallest gap in Enterprise, where it occasionally edges out Google.
  • E-Commerce is the most competitive segment where Azure and Google frequently trade positions.
  • The AI Infrastructure Consulting campaign (SMB) produces the highest reply rates regardless of provider.

Methodology & FAQ

What is the Normalized Index?

It's a score where 100 is average. A provider scoring 110 is performing 10% better than the average across all providers; one scoring 90 is 10% worse. We calculate it by dividing each provider's reply rate by the overall average reply rate, then multiplying by 100.
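As a sketch, the calculation described above looks like this (the reply rates below are illustrative values chosen to reproduce this week's 110/103/88 scores, not the benchmark's actual data):

```python
def normalized_index(reply_rates):
    """Divide each provider's reply rate by the overall
    average reply rate, then multiply by 100."""
    avg = sum(reply_rates.values()) / len(reply_rates)
    return {p: round(r / avg * 100) for p, r in reply_rates.items()}

# Illustrative reply rates (fractions of emails that got a reply)
scores = normalized_index({"Azure": 0.0110, "Google": 0.0103, "M365": 0.0088})
# → {"Azure": 110, "Google": 103, "M365": 88}
```

A score of 100 always means "exactly average for this week," so the index is comparable across weeks even when overall reply rates drift.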

Who's behind this?

We're two cold email agency owners. We've been in the space for about a year and a half, do a collective $150k+ in MRR, manage 30+ clients, and send roughly a million emails per month. We built this because we wanted to know which infrastructure actually works best, and figured everyone else does too.

How often is the data updated?

Every Sunday we sit down, pull the numbers from our campaigns, normalize everything, and publish it. The data is lagged by 7 days so all replies have time to come in before we count them.

Why show a normalized score instead of raw reply rates?

Because reply rates depend heavily on the offer, the list, and the copy, not just the infrastructure. If we showed raw rates, people would draw the wrong conclusions. The normalized score strips out everything except the infrastructure difference, which is what actually matters here.

How do I request a test?

Email testing@infrabenchmark.com. We read every request and prioritize based on what the community wants to see.

What does the +X number on the trendline mean?

It's simple: if Google is the leader and Azure shows +125, you'd need to send 125 more emails on Azure than on Google to get one additional reply. The lower the number, the closer the performance.
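One plausible way to compute that figure is the difference in emails-per-reply between the two providers (the site doesn't publish its exact formula, and the rates below are hypothetical):

```python
def extra_emails_per_reply(leader_rate, other_rate):
    """Extra emails the trailing provider needs, relative to the
    leader, to produce one additional reply. Rates are reply
    fractions, e.g. 0.01 = 1% of emails get a reply."""
    return round(1 / other_rate - 1 / leader_rate)

# Hypothetical: leader replies at 1.0%, the other at 0.6%
extra_emails_per_reply(0.01, 0.006)  # → 67
```

A value of 0 would mean the two providers need the same volume per reply; larger values mean a wider gap.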

Which campaigns are included in the benchmark?

We run 30+ campaigns across our agencies. For this benchmark, we picked the ones from our longest-running, highest-volume clients: accounts that have been active for 6+ months. That way the data is meaningful, not based on one-off tests.

Testing Queue

What we're testing next

1. SMTP+ providers · Multiple vendors under evaluation
2. Warmup duration: 14-day vs 30-day ramp · Comparing inbox reputation build rates
3. IP location impact · US-based vs international sending IPs
4. Branded vs non-branded domains · Does using your brand name in the domain affect reply rates?

Internal Tools Used

Sequencer: Email Bison, Smartlead
Lead Lists: Blitz API, Apollo, LinkedIn
Personalization: Clay
List Cleaning: MillionVerifier
Inbox Providers: Multiple