Methodology

How we verify every deal

Every deal published on Discoperi passes through a 9-stage AI and human verification pipeline. No deal is published unless its composite confidence score, weighted across six dimensions, reaches at least 0.40. Here is exactly how it works.

12+ signal channels monitored continuously
<2h average time from close to published
39 structured fields per deal record
0.40 minimum confidence score to publish

The 9-stage pipeline

From signal to verified record

Each stage adds structured data, removes noise, and increases the confidence that what we publish is accurate. Deals can fail at any stage and are discarded rather than published with uncertain data.

Stage 1: Signal ingestion (12 channels)

Continuous monitoring of 12 distinct source channels: SEC EDGAR (8-K filings, merger agreements), Bloomberg wire, Reuters wire, PE firm press release pages (top 200 firms), M&A trade publications (Mergermarket, PEI), regional financial press, company investor relations pages, HSR antitrust filings, UK CMA public registers, EU DG COMP decisions, and court records for contested transactions. Polling intervals depend on the channel and range from every 15 minutes for SEC EDGAR to daily for the regulatory registers.

SEC EDGAR · Bloomberg · Reuters · 200 PE firm pages · HSR filings · EU CMA

Stage 2: AI extraction (20 fields per signal)

Claude (Anthropic) extracts 20 structured fields from each raw signal using a purpose-built M&A extraction prompt. The fields include acquirer name, target name, deal type, enterprise value, equity value, debt component, announcement date, close date, acquirer jurisdiction, target jurisdiction, sector, sub-sector, deal rationale, advisor names (buy-side financial, sell-side financial, legal), regulatory status, and deal conditions. Each field is extracted with an individual confidence score (0–1). Fields with confidence below 0.30 are left blank rather than guessed.
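
The blank-rather-than-guess rule can be illustrated with a minimal sketch. The function name, the field names, and the (value, confidence) input shape are assumptions made for illustration; only the 0.30 floor comes from the text.

```python
from typing import Optional

# From the text: fields with confidence below 0.30 are left blank.
FIELD_CONFIDENCE_FLOOR = 0.30


def blank_low_confidence(
    extracted: dict[str, tuple[str, float]],
) -> dict[str, Optional[str]]:
    """Keep a field's value only if its confidence reaches the floor.

    `extracted` maps a field name to (value, confidence); this input
    shape is hypothetical, not the pipeline's actual format.
    """
    return {
        name: (value if confidence >= FIELD_CONFIDENCE_FLOOR else None)
        for name, (value, confidence) in extracted.items()
    }


# Illustrative signal with made-up values.
signal = {
    "acquirer_name": ("Example Holdings", 0.97),
    "enterprise_value": ("18.5bn USD", 0.62),
    "close_date": ("2025-03-31", 0.21),
}
cleaned = blank_low_confidence(signal)
```

Here `close_date` falls below the floor and comes back as `None`, while the other two fields keep their extracted values.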

Claude API · Structured extraction · 20 fields · Per-field confidence

Stage 3: Deduplication (fuzzy matching at 82%)

The same deal will typically appear across 3–8 sources within hours of announcement. Our deduplication engine uses fuzzy string matching on acquirer + target + value combinations (threshold: 82% similarity) to group signals referring to the same transaction. The highest-quality source for each field is selected as the canonical value. Conflicting values across sources trigger a contradiction check in Stage 4.
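
As a rough sketch of this grouping step, the example below uses Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure. The text does not say which fuzzy-matching algorithm is used, so the measure, the key format, the greedy grouping strategy, and the company names are all assumptions; only the 82% threshold comes from the text.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.82  # from the text


def match_key(signal: dict) -> str:
    """Build a normalized acquirer|target|value key (hypothetical format)."""
    def norm(text: str) -> str:
        return " ".join(text.lower().split())
    return "|".join(
        [norm(signal["acquirer"]), norm(signal["target"]), f"{signal['value_bn']:.1f}"]
    )


def similarity(a: dict, b: dict) -> float:
    """Similarity in [0, 1] between two signals' keys."""
    return SequenceMatcher(None, match_key(a), match_key(b)).ratio()


def group_signals(signals: list[dict]) -> list[list[dict]]:
    """Greedily attach each signal to the first group whose seed it matches."""
    groups: list[list[dict]] = []
    for signal in signals:
        for group in groups:
            if similarity(group[0], signal) >= SIMILARITY_THRESHOLD:
                group.append(signal)
                break
        else:
            groups.append([signal])
    return groups


# Fictional signals: the first two describe the same transaction.
signals = [
    {"acquirer": "Northwind Capital Inc", "target": "Acme Software", "value_bn": 18.5},
    {"acquirer": "Northwind Capital", "target": "ACME Software", "value_bn": 18.5},
    {"acquirer": "Harbor Partners", "target": "Zeta Foods", "value_bn": 2.1},
]
groups = group_signals(signals)
```

Comparing each signal only against a group's first member keeps the sketch short; a production system would more likely compare against a canonical record or use blocking to avoid quadratic comparisons.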

Fuzzy match 82% · Source deduplication · Canonical selection

Stage 4: Contradiction checking

Where multiple sources provide different values for the same field (e.g. $18.5bn vs $18.2bn for enterprise value), the contradiction engine applies a source hierarchy: SEC filings > official press releases > major wire services > trade press > regional press. Value conflicts within 5% are resolved by averaging or selecting the SEC-filed figure. Value conflicts exceeding 5% lower the field confidence score and are flagged for the analyst review queue in Stage 8.
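
One possible reading of these rules is sketched below. The text does not define how the 5% gap is measured or which value is kept when a conflict exceeds it, so this sketch assumes the gap is the spread relative to the largest reported value, that an SEC figure wins whenever one exists, and that a flagged conflict provisionally keeps the highest-ranked source's value. The source labels are hypothetical.

```python
# Lower rank = more authoritative, following the hierarchy in the text.
SOURCE_RANK = {
    "sec_filing": 0,
    "press_release": 1,
    "wire": 2,
    "trade_press": 3,
    "regional_press": 4,
}
TOLERANCE = 0.05  # from the text


def resolve_value(candidates: list[tuple[str, float]]) -> tuple[float, bool]:
    """Return (canonical_value, flagged_for_review).

    `candidates` holds (source_type, value) pairs with positive values.
    """
    values = [value for _, value in candidates]
    # Assumption: relative spread measured against the largest value.
    spread = (max(values) - min(values)) / max(values)
    best_source, best_value = min(candidates, key=lambda c: SOURCE_RANK[c[0]])

    if spread <= TOLERANCE:
        if best_source == "sec_filing":
            return best_value, False  # SEC-filed figure takes precedence
        return sum(values) / len(values), False  # otherwise average
    # Assumption: keep the top-ranked value, but flag for analyst review.
    return best_value, True


# The example from the text: $18.5bn vs $18.2bn is a gap of about 1.6%.
value, flagged = resolve_value([("wire", 18.5), ("trade_press", 18.2)])
```

With two non-SEC sources inside the tolerance, the sketch averages them to roughly 18.35 and does not flag the field.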

Source hierarchy · 5% tolerance · SEC precedence

Stage 5: Quality scoring (6 weighted dimensions)

Each deal record is scored across six dimensions, producing a composite quality score from 0 to 1. Records scoring below 0.40 are placed in a review queue and not published.

Source count (weight: 0.25)
Field completeness (weight: 0.20)
AI extraction confidence (weight: 0.20)
Date plausibility (weight: 0.15)
Value plausibility (weight: 0.10)
Entity quality (weight: 0.10)
Min score: 0.40 · Auto-approve: ≥0.75 · 6 dimensions
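
The weights above sum to 1.00, so one natural reading is a plain weighted sum of per-dimension scores. The sketch below assumes exactly that, and assumes each dimension is itself scored from 0 to 1. The text gives the weights but not the formula, and the dimension keys and example scores are illustrative.

```python
# Weights from the text; the dictionary keys are hypothetical identifiers.
WEIGHTS = {
    "source_count": 0.25,
    "field_completeness": 0.20,
    "ai_extraction_confidence": 0.20,
    "date_plausibility": 0.15,
    "value_plausibility": 0.10,
    "entity_quality": 0.10,
}


def composite_score(dimension_scores: dict[str, float]) -> float:
    """Weighted sum of six dimension scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


# Made-up dimension scores for a single deal record.
example = {
    "source_count": 0.8,
    "field_completeness": 0.7,
    "ai_extraction_confidence": 0.9,
    "date_plausibility": 1.0,
    "value_plausibility": 0.5,
    "entity_quality": 0.6,
}
score = composite_score(example)
```

For these inputs the terms are 0.20 + 0.14 + 0.18 + 0.15 + 0.05 + 0.06, giving a composite of about 0.78, which would clear the 0.75 auto-approve threshold.
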
Stage 6: AI article generation

For deals that pass the quality threshold, Claude generates a structured article: transaction overview, deal structure analysis, strategic context, and regulatory path. The article is generated from the structured fields — not from raw source text — preventing hallucination. All factual claims in the article are traceable to the verified field values. Articles are clearly labelled "AI-generated analysis" on the deal page.

Claude authorship · Field-grounded · Labelled AI content

Stage 7: Google Sheets staging (39 fields)

Verified deal records are written to a structured Google Sheets database with 39 fields across 5 tabs: deal overview (12 fields), financial details (8 fields), parties and advisors (9 fields), regulatory and legal (6 fields), and metadata (4 fields). This staging layer allows for human review, batch corrections, and manual enrichment before publication.

39 fields · Google Sheets staging · 5 structured tabs

Stage 8: Human analyst review (score <0.75)

Deals scoring at least 0.40 but below 0.75 are flagged for analyst review before publication. A human reviewer checks value plausibility against sector benchmarks, entity name accuracy (a common failure point for international entities), advisor attribution accuracy, and date consistency. Analysts can approve, reject, or modify any field. Deals scoring ≥0.75 are auto-approved. Approximately 35% of published deals are auto-approved; the remaining 65% pass through human review.
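
The two thresholds split records into three bands, sketched below. The function name and the returned labels are hypothetical; the 0.40 and 0.75 cut-offs come from the text.

```python
MIN_PUBLISH_SCORE = 0.40   # below this, a record is not published
AUTO_APPROVE_SCORE = 0.75  # at or above this, a record is auto-approved


def route(score: float) -> str:
    """Map a composite quality score to a routing decision (labels are illustrative)."""
    if score < MIN_PUBLISH_SCORE:
        return "not_published"
    if score < AUTO_APPROVE_SCORE:
        return "analyst_review"
    return "auto_approve"
```

Using half-open bands means a score such as 0.745 is routed to analyst review rather than falling into a gap between 0.74 and 0.75.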

Manual review queue · 35% auto-approved · Score 0.40–0.74

Stage 9: Publication (WordPress & deal alerts)

Approved deals are published to discoperi.com via the WordPress REST API. Deal alert subscribers with matching sector, size, and geography filters receive an email notification within 30 minutes of publication. The complete 39-field record is also made available for export in CSV and JSON formats for paid subscribers.
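
Matching a published deal against a subscriber's alert filters might look like the sketch below. The `AlertFilter` class, its field names, and the deal record keys are all hypothetical; the text only says that filters cover sector, size, and geography.

```python
from dataclasses import dataclass


@dataclass
class AlertFilter:
    """A subscriber's alert preferences (hypothetical shape)."""
    sectors: set[str]
    geographies: set[str]
    min_value_usd_m: float
    max_value_usd_m: float


def matches(deal: dict, alert: AlertFilter) -> bool:
    """True when the deal satisfies the sector, size, and geography filters."""
    return (
        deal["primary_sector"] in alert.sectors
        and deal["target_geography"] in alert.geographies
        and alert.min_value_usd_m <= deal["enterprise_value_usd_m"] <= alert.max_value_usd_m
    )


# Illustrative subscriber and deal with made-up values.
subscriber = AlertFilter(
    sectors={"Software", "Healthcare"},
    geographies={"US", "UK"},
    min_value_usd_m=500.0,
    max_value_usd_m=25_000.0,
)
deal = {
    "primary_sector": "Software",
    "target_geography": "US",
    "enterprise_value_usd_m": 18_500.0,
}
should_notify = matches(deal, subscriber)
```

All three conditions must hold for a notification, so a deal in a matching sector but outside the subscriber's size range would not trigger an alert.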

WordPress REST API · Alert delivery ≤30 min · CSV + JSON export

Signal sources

What we monitor

| Source | Type | Coverage | Update frequency |
|---|---|---|---|
| SEC EDGAR | Regulatory | All US public company filings (8-K, S-4, merger agreements) | Every 15 min |
| Bloomberg wire | Wire | Global M&A coverage, deal announcements, regulatory decisions | Every 20 min |
| Reuters wire | Wire | Global M&A, PE deals, restructurings | Every 20 min |
| PE firm press releases | Official | Top 200 PE firms, direct from source with no wire latency | Every 30 min |
| HSR antitrust filings | Regulatory | US transactions above $119.5m filing threshold | Daily |
| EU DG COMP decisions | Regulatory | European Commission merger control decisions | Daily |
| UK CMA register | Regulatory | UK merger reviews, phase 1 and phase 2 decisions | Daily |
| Mergermarket / PEI | Trade press | Specialist M&A and private equity trade coverage | Every 30 min |
| Company IR pages | Official | Investor relations pages for FTSE 100, S&P 500, and select mid-caps | Every 30 min |
| Regional financial press | Press | Major regional financial publications (40+ outlets) | Hourly |
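
The update frequencies in the table can be expressed as a simple polling schedule. The sketch below is illustrative only: the channel identifiers and the `due_channels` helper are hypothetical, and the intervals are copied from the table.

```python
from datetime import datetime, timedelta

# Polling intervals taken from the table; the keys are hypothetical identifiers.
POLL_INTERVALS = {
    "sec_edgar": timedelta(minutes=15),
    "bloomberg_wire": timedelta(minutes=20),
    "reuters_wire": timedelta(minutes=20),
    "pe_firm_press": timedelta(minutes=30),
    "hsr_filings": timedelta(days=1),
    "eu_dg_comp": timedelta(days=1),
    "uk_cma": timedelta(days=1),
    "mergermarket_pei": timedelta(minutes=30),
    "company_ir": timedelta(minutes=30),
    "regional_press": timedelta(hours=1),
}


def due_channels(last_polled: dict[str, datetime], now: datetime) -> list[str]:
    """Return the channels whose polling interval has elapsed.

    A channel that has never been polled is always due.
    """
    return [
        channel
        for channel, interval in POLL_INTERVALS.items()
        if now - last_polled.get(channel, datetime.min) >= interval
    ]


# Every channel was last polled 16 minutes ago, so only the 15-minute one is due.
now = datetime(2025, 1, 1, 12, 0)
last = {channel: now - timedelta(minutes=16) for channel in POLL_INTERVALS}
due = due_channels(last, now)
```

In this example only `sec_edgar` is returned, since every other channel has an interval of 20 minutes or longer.
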
Structured data

All 39 fields per deal

Every published deal record contains up to 39 structured fields. Fields that cannot be verified above the confidence threshold are left blank rather than estimated.

Acquirer name
Target name
Seller name
Deal type
Enterprise value
Equity value
Debt component
Currency
Announcement date
Close date
Expected close date
Deal duration (days)
Primary sector
Sub-sector
Acquirer geography
Target geography
Buy-side financial advisor(s)
Sell-side financial advisor(s)
Buy-side legal counsel
Sell-side legal counsel
Regulatory bodies
Regulatory outcome
Conditions / remedies
EV/EBITDA (if available)
EV/Revenue (if available)
Implied premium (%)
Deal rationale (AI)
Strategic category
Financing type
Retained stake (%)
Source count
Quality score (0–1)
Confidence per field
Auto-approved flag
Analyst reviewed flag
First published
Last updated
Correction history
Discoperi deal ID
Corrections policy: Any party with first-hand knowledge of an error — including the companies, advisors, or lawyers involved in a transaction — can submit a correction at corrections@discoperi.com. Corrections are verified against primary sources and published within 24 hours. The correction history is permanently recorded on the deal page.
AI transparency: All AI-generated content on Discoperi is clearly labelled. The article body on each deal page is generated by Claude (Anthropic). The structured field data is extracted by Claude but verified against primary sources — it is not generated from Claude's training knowledge. We do not publish AI-generated data without source grounding.