What an MDR renewal conversation actually sounds like
I've sat in on seven Managed Detection and Response (MDR) renewal calls in the last two years, brought in as the detection engineer to grade twelve months of service against twelve months of ticket history.
My job in that room is to represent what happened on the floor, at 2 AM, when an ambiguous escalation hit our queue and someone on my team had to decide: wake the on-call lead, or let it ride.
The pattern is consistent. The vendor arrives with a slide deck, I arrive with the escalation log. Their claim that the service stopped thousands of threats collides with my log of how often we re-investigated their escalations because the verdicts weren't trustworthy.
Those two narratives don't survive contact with each other. The renewal call is the one meeting where they're forced into the same room.
In brief:
- The quarterly business review (QBR) deck mirrors the metrics customers already track, which is precisely why those metrics don't tell you whether the service is working.
- A high ambiguous-escalation rate after a year of tuning points to an architectural limitation. If the rate hasn't declined, the provider can't encode your environment into detection logic.
- Closed-alert quality is the actual product you're paying for, and most operators never audit a single closed verdict.
- The night-shift context gap is structural: the information needed to resolve ambiguous alerts lives with your team, and the Tier 1 analyst on rotation has no access to it.
The renewal call is where the service finally gets graded
Most of the year, the MDR relationship runs on autopilot. Escalations arrive, your team triages them, and the quarterly business reviews run as vendor-controlled presentations, not as two-way performance reviews.
The renewal is different. It's the one moment where you have contractual power, which makes it the one moment where hard operational questions carry weight. Let that window close without grading the service and you re-sign for another year on the vendor's terms, same escalations, same queue, same 2 AM.
I treat that meeting as a twelve-month service audit compressed into an hour. I pull our escalation data by category, by shift window, by verdict accuracy. I flag every case where my team re-investigated a closed verdict, or where an escalation arrived without enough context to act on.
On one contract the deck looked healthy until I sorted re-investigations by category. A third of the identity escalations had been closed with no authentication-log review at all. The vendor brings aggregates, I bring specifics, and that gap is the part that actually counts.
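The tally I pull before the call can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's export format: the `Escalation` record, its field names, and the 08:00 to 18:00 business-hours split are all assumptions you would replace with your own ticket data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Escalation:
    # Hypothetical record shape; map these from your own ticket export.
    category: str         # e.g. "identity", "endpoint"
    hour_received: int    # 0-23, local time
    reinvestigated: bool  # my team had to redo the vendor's verdict
    actionable: bool      # arrived with enough context to act on


def shift_window(hour: int) -> str:
    """Assumed split: 08:00-17:59 counts as business hours."""
    return "business" if 8 <= hour < 18 else "off_hours"


def audit_summary(escalations):
    """Count totals, re-investigations, and context gaps per (category, shift)."""
    summary = defaultdict(lambda: {"total": 0, "reinvestigated": 0, "no_context": 0})
    for e in escalations:
        bucket = summary[(e.category, shift_window(e.hour_received))]
        bucket["total"] += 1
        bucket["reinvestigated"] += int(e.reinvestigated)
        bucket["no_context"] += int(not e.actionable)
    return dict(summary)


# Made-up sample data, for illustration only.
sample = [
    Escalation("identity", 2, True, False),
    Escalation("identity", 3, True, True),
    Escalation("identity", 14, False, True),
    Escalation("endpoint", 23, False, True),
]

for key, counts in sorted(audit_summary(sample).items()):
    print(key, counts)
```

Grouping by category and shift together is a deliberate choice here: it is the combination, not either dimension alone, that shows where re-investigations cluster.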
The QBR deck measures the wrong things
The vendor's QBR usually leads with total alerts investigated, plus timing metrics like mean time to detect (MTTD) and mean time to respond (MTTR). These dominate decks because they mirror what customers already measure. The SANS 2025 Detection and Response Survey shows most teams still benchmark on MTTR and incident counts.
MTTD is easy to game: alert fast on noise and you post a low number. Alert counts work the same way: if most of those cases are false positives, the headline still reads as a win. The count is accurate; it just doesn't tell you whether the service works.
I care about the false positive rate of escalations sent to my team, broken down by detection category, not rolled into an aggregate. Alert fatigue comes from volume without signal.
Most customers walk into the renewal with no counter-metric to challenge what the vendor presents. Build that baseline before the call, or you're grading the service with the vendor's own rubric.
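One way to build that baseline is to compute the false positive rate per detection category next to the rolled-up figure. The sketch below assumes you can export each escalation as a (category, is_false_positive) pair; the function name and the sample numbers are invented for illustration.

```python
from collections import Counter


def false_positive_rates(escalations):
    """escalations: iterable of (category, is_false_positive) pairs."""
    totals, false_positives = Counter(), Counter()
    for category, is_fp in escalations:
        totals[category] += 1
        if is_fp:
            false_positives[category] += 1
    if not totals:
        return {}, 0.0
    per_category = {c: false_positives[c] / totals[c] for c in totals}
    aggregate = sum(false_positives.values()) / sum(totals.values())
    return per_category, aggregate


# Made-up numbers: 90 endpoint escalations (9 false positives)
# and 10 identity escalations (8 false positives).
sample = (
    [("endpoint", True)] * 9 + [("endpoint", False)] * 81
    + [("identity", True)] * 8 + [("identity", False)] * 2
)

per_category, aggregate = false_positive_rates(sample)
print(f"aggregate: {aggregate:.0%}")
for category, rate in per_category.items():
    print(f"{category}: {rate:.0%}")
```

In this invented sample the aggregate reads 17% while identity sits at 80%, which is exactly the kind of gap a rolled-up number hides.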
Escalation volume exposes the architecture
I've watched escalation rates across two different MDR contracts. In the first year, ambiguous escalations are expected. The provider is still learning the recurring behavior in your environment, the service accounts and the legitimate admin activity that looks anomalous to generic rules.
By month twelve, that rate should be dropping materially. A flat rate that late points to an architecture that can't turn your context into durable detection logic. I've watched a provider hold that rate flat for eighteen months and still call it tuning on every QBR.
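A rough way to test "dropping materially" against your own data is to compare the first few months of the contract with the most recent few. This is only a sketch: the three-month window and the 30% threshold are my assumptions, not an industry standard, and the monthly rates below are made up.

```python
def relative_change(monthly_rates, window=3):
    """Compare the mean of the last `window` months with the first `window`.

    Returns a fraction: -0.5 means the rate halved, 0.0 means flat.
    """
    if len(monthly_rates) < 2 * window:
        raise ValueError("need at least two non-overlapping windows")
    early = sum(monthly_rates[:window]) / window
    late = sum(monthly_rates[-window:]) / window
    return (late - early) / early


def is_flat(monthly_rates, threshold=0.30, window=3):
    """Assumed rule of thumb: less than a 30% drop counts as flat."""
    return relative_change(monthly_rates, window) > -threshold


# Made-up ambiguous-escalation rates (share of all escalations), months 1-12.
improving = [0.40, 0.42, 0.38, 0.35, 0.33, 0.30, 0.28, 0.25, 0.24, 0.22, 0.20, 0.18]
stalled = [0.40, 0.42, 0.38, 0.41, 0.39, 0.40, 0.42, 0.38, 0.40, 0.41, 0.39, 0.40]

print(is_flat(improving))
print(is_flat(stalled))
```

Averaging over a window rather than comparing single months keeps one noisy month from deciding the answer.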
Detection-as-code is what separates a provider that can scale from one that can't, and it shows up in how Forrester ranks MDR leaders. Without that methodology, a provider can suppress noisy alerts or move thresholds, but it can't encode a fact like "service account X runs behavior Y every Tuesday at 3 AM, and that is expected here" into versioned detection logic.
The providers that improve true positive rates year over year are building those detections as code, not tuning generic rules. The tell is whether they can show you a versioned detection with your environment's name on it, not a screenshot of a suppressed alert.
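To make "a versioned detection with your environment's name on it" concrete, here is a minimal sketch of the Tuesday 3 AM case expressed as code. It is not any provider's rule format: the `ExpectedBehavior` class, the rule ID, and the account and action names are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExpectedBehavior:
    """One environment-specific fact, kept in version control like any code."""
    rule_id: str
    version: str
    account: str
    action: str
    weekday: int    # datetime.weekday() convention: Monday is 0
    hour: int       # 0-23
    rationale: str  # why this is expected, so the next reviewer can check it

    def matches(self, account: str, action: str, when: datetime) -> bool:
        return (
            account == self.account
            and action == self.action
            and when.weekday() == self.weekday
            and when.hour == self.hour
        )


# Hypothetical rule: names and schedule are placeholders.
TUESDAY_JOB = ExpectedBehavior(
    rule_id="env-0001",
    version="1.0.0",
    account="svc-account-x",
    action="behavior-y",
    weekday=1,
    hour=3,
    rationale="Scheduled weekly job; confirmed with the owning team.",
)


def triage(account: str, action: str, when: datetime) -> str:
    """Return 'expected' when a known rule explains the event, else 'escalate'."""
    if TUESDAY_JOB.matches(account, action, when):
        return f"expected ({TUESDAY_JOB.rule_id} v{TUESDAY_JOB.version})"
    return "escalate"


print(triage("svc-account-x", "behavior-y", datetime(2025, 1, 7, 3, 15)))  # a Tuesday
print(triage("svc-account-x", "behavior-y", datetime(2025, 1, 8, 3, 15)))  # a Wednesday
```

The point of the rule ID, version, and rationale fields is that the same behavior on a Wednesday still escalates, and anyone can see why the Tuesday run did not.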
If your escalation load is flat into year two, the real question is whether their architecture can learn your environment at all.
Closed-alert quality is what you're paying for
The word "investigated" on a closed MDR ticket covers an enormous range. At one end, an analyst pulls query results from your security information and event management (SIEM) platform, then cross-references the finding against endpoint telemetry and authentication history.
At the other end, a provider unquarantines an endpoint and closes the ticket with a tag and a generic true-positive note, no artifact citations, no behavioral analysis. That makes the verdict a disposition label, not an investigation.
I've closed plenty of tickets myself. The difference between those two ends is whether anyone could reconstruct my reasoning six months later.
When a closed verdict carries no auditable evidence chain, you have no basis to challenge it, and the same low-quality dispositions recur quarter after quarter. So this is the question I now bring to every renewal: pull three closed verdicts from last quarter and walk me through the evidence chain.
The AT&T MDR Evaluator's Guide makes the same point. It lists whether analysts leave an audit trail among the questions a buyer should ask, and the principle is general: a vendor's claims have to be specific enough to verify.
If a detection engineering team can't inspect the reasoning behind a closed verdict, they can't improve their own coverage based on what the MDR found or missed.
Night-shift Tier 1 is a structural ceiling
A large share of alerts fire outside business hours. That puts the heaviest detection load exactly where the context gap is widest.
The night-shift Tier 1 analyst sees a service account behave in a way that looks anomalous but is expected in your environment. That leaves two options: escalate and wake your team at 2 AM, or suppress and risk a real miss. Both mean the coverage model failed at its core job.
I've been the person making that call, weighing a vague alert against the cost of waking a staff engineer who ships a release in the morning. The context that would resolve it, a new VPN, a maintenance window, the normal rhythm of a service account, lives with your team and changes constantly.
Tiered models make this structural. They hand context across shifts through manual, lossy handoffs, prioritizing cost over continuity.
AI-native MDR approaches from vendors like Exaforce and Daylight try to close the gap. They keep customer-specific context available at investigation time, through preserved institutional knowledge and cross-domain correlation.
It's a real architectural difference from the traditional model. But pattern ingestion only raises the context ceiling; it doesn't remove it, since business intent is hard to read from patterns. So for any provider you're evaluating, the question is simple: how much of what your 2 PM analyst knows about my environment does your 2 AM analyst know?
Seven questions I bring to the renewal table
These are the seven I put on the table, and a provider worth renewing answers each one with specifics from your environment, not aggregates from their platform.
- Show me the suppression rules you created for our environment in the last 90 days, what triggered each one, and what the decision logic was.
- What percentage of alerts were auto-closed or suppressed without human review in the last quarter, broken down by alert class?
- Pull three closed verdicts and walk me through the evidence chain from alert to close decision.
- Give us our false positive rate broken down by detection category: endpoint behavioral, identity, network, and cloud.
- Walk me through your escalation decision tree for a confirmed lateral movement alert at 2 AM. Who makes the call, what threshold triggers it, and what actions do your analysts take before reaching out?
- How many custom detection rules has your team written for our environment in the last 12 months, and can you show the before-and-after signal quality for each one?
- Provide your MITRE ATT&CK coverage map for our specific environment; your generic coverage matrix won't do. Which sub-techniques and procedures does each detection cover given our actual telemetry sources?
A vendor that answers all seven with specifics has earned the renewal. I've worked through every one of these failures in real incidents, on real shifts, where the escalation landed without context and I had to choose: spend forty minutes rebuilding the investigation, or just trust the label.
Global median dwell time rose to 14 days in 2025, up from 11 the year before, the second straight year M-Trends 2026 showed the metric moving the wrong way after a decade of improvement. Against that backdrop, a flat QBR trend isn't holding steady, it's falling behind.
I've learned to treat the renewal as the one meeting where ticket history speaks louder than the deck. If your provider's 2 AM analyst knows nothing about your environment, the ticket history already told you that.
Frequently asked questions about MDR renewals
What metrics should I pull before an MDR renewal?
Build your own false positive rate by detection category before the call. Pull your escalation log, flag cases where your team re-investigated closed verdicts or needed follow-up context to act, and calculate the ratio of raw alerts ingested to escalations delivered. Most operators enter renewals without a counter-metric to challenge the vendor's QBR numbers, and that gap lets vanity metrics go unchallenged.
How do I tell whether escalation volume is a tuning problem or an architecture problem?
Look at the trajectory. Ambiguous escalation rates should decline materially between months six and twelve as the provider encodes your environment into detection logic. If the rate is flat or worsening into the second contract year, the provider likely can't build durable, customer-specific detections. Ask whether they use detection-as-code methodology rather than only applying threshold adjustments to generic rule sets.
How do I audit a closed MDR verdict?
A good closed verdict includes what fired, what telemetry the analyst examined, what hypotheses were considered and ruled out, and the documented basis for the close decision. If the only thing in the ticket is a disposition label like "Benign" or "True Positive" with no artifact citations or behavioral analysis, you're paying for disposition labels instead of actual investigations.
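The four elements above can be turned into a quick completeness check for sampled tickets. This is a sketch under stated assumptions: the field names are hypothetical, real ticket exports will use different keys, and a filled-in field says nothing about whether the reasoning in it is any good.

```python
# Hypothetical field names for the four elements of a good closed verdict.
REQUIRED_FIELDS = (
    "what_fired",
    "telemetry_examined",
    "hypotheses_ruled_out",
    "basis_for_close",
)


def missing_evidence(ticket: dict) -> list:
    """Return the required fields that are absent or empty on a closed ticket."""
    return [field for field in REQUIRED_FIELDS if not ticket.get(field)]


# Made-up tickets, for illustration only.
label_only = {"disposition": "Benign"}
documented = {
    "disposition": "Benign",
    "what_fired": "Impossible-travel rule on a user login",
    "telemetry_examined": ["authentication logs", "endpoint telemetry"],
    "hypotheses_ruled_out": ["credential theft"],
    "basis_for_close": "Both logins trace to the corporate VPN egress.",
}

print(missing_evidence(label_only))
print(missing_evidence(documented))
```

A ticket that fails this check is a disposition label; a ticket that passes it still needs a human to read the evidence chain.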
How do I check MDR detection coverage with MITRE ATT&CK?
Request an ATT&CK Navigator layer showing detection coverage specific to your monitored environment and telemetry sources. Ask which sub-techniques and procedures each detection covers. A provider claiming coverage of a technique through one procedure while missing three others has partial coverage at best. Check whether your provider participated in MITRE Engenuity's ATT&CK Evaluations, and review per-technique Detection Coverage scores if available.
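The "one covered, three missed" arithmetic is easy to run yourself once the provider hands over a coverage list. The sketch below uses sub-technique IDs as the unit of coverage, which is a simplification of procedure-level coverage, and the relevant and covered sets are invented for illustration rather than taken from any real provider.

```python
def coverage_report(relevant, covered):
    """relevant / covered: dicts mapping technique ID -> set of sub-technique IDs."""
    report = {}
    for technique, subs in relevant.items():
        if not subs:
            continue  # nothing to measure for this technique
        hit = subs & covered.get(technique, set())
        report[technique] = {
            "fraction": len(hit) / len(subs),
            "missing": sorted(subs - hit),
        }
    return report


# Illustrative input: four OS Credential Dumping sub-techniques assumed
# relevant to the environment, with only one claimed as covered.
relevant = {"T1003": {"T1003.001", "T1003.002", "T1003.003", "T1003.006"}}
covered = {"T1003": {"T1003.001"}}

for technique, result in coverage_report(relevant, covered).items():
    print(technique, f"{result['fraction']:.0%}", "missing:", result["missing"])
```

A technique that shows as covered on a generic matrix comes out at 25% here, with the three gaps listed by ID.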
Should I run unannounced simulations against my MDR provider?
Yes. Closed-ticket auditing only tells you about alerts the system generated. Injecting known-bad signals validates the full incident response verdict pipeline, from detection through investigation to escalation. If the provider objects, that reaction is itself informative.