
Intelligence Confidence Levels

Intelligence Confidence Audit Engine (ICAE)

Operational confidence posture for every DNS intelligence protocol. ICAE assessments are derived from deterministic pass thresholds plus time-in-service, with no manual overrides or self-grading. Every result is cryptographically hashed and retained — a tamper-evident, historically verifiable audit trail. View the accountability log.

9/9 Protocols
0 Regressions

Intelligence Confidence Matrix

Verified: 9/9
Development: <100 passes
Verified: 100+ passes
Consistent: 500+ passes & 30 days
Gold: 1,000+ passes & 90 days
Gold Master: 5,000+ passes & 180 days
SPF Verified 4783 passes · 4783 runs
Collection: Verified (4260 runs)
Analysis: Verified (4783 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4783/500 passes · 19/30 days · 63%
Collection → Consistent: 4260/500 passes · 17/30 days · 56%
DKIM Verified 4602 passes · 4602 runs
Collection: Verified (4242 runs)
Analysis: Verified (4602 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4602/500 passes · 18/30 days · 60%
Collection → Consistent: 4242/500 passes · 17/30 days · 56%
DMARC Verified 4766 passes · 4766 runs
Collection: Verified (4152 runs)
Analysis: Verified (4766 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4766/500 passes · 19/30 days · 63%
Collection → Consistent: 4152/500 passes · 17/30 days · 56%
DANE/TLSA Verified 4589 passes · 4589 runs
Collection: Verified (4233 runs)
Analysis: Verified (4589 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4589/500 passes · 18/30 days · 60%
Collection → Consistent: 4233/500 passes · 17/30 days · 56%
DNSSEC Verified 4764 passes · 4764 runs
Collection: Verified (4235 runs)
Analysis: Verified (4764 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4764/500 passes · 19/30 days · 63%
Collection → Consistent: 4235/500 passes · 17/30 days · 56%
BIMI Verified 4601 passes · 4601 runs
Collection: Verified (4236 runs)
Analysis: Verified (4601 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4601/500 passes · 18/30 days · 60%
Collection → Consistent: 4236/500 passes · 17/30 days · 56%
MTA-STS Verified 4603 passes · 4603 runs
Collection: Verified (4250 runs)
Analysis: Verified (4603 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4603/500 passes · 18/30 days · 60%
Collection → Consistent: 4250/500 passes · 17/30 days · 56%
TLS-RPT Verified 4606 passes · 4606 runs
Collection: Verified (4148 runs)
Analysis: Verified (4606 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4606/500 passes · 18/30 days · 60%
Collection → Consistent: 4148/500 passes · 17/30 days · 56%
CAA Verified 4599 passes · 4599 runs
Collection: Verified (4238 runs)
Analysis: Verified (4599 runs)
First pass: 2026-02-19 · Last evaluated: 2026-03-10
Analysis → Consistent: 4599/500 passes · 18/30 days · 60%
Collection → Consistent: 4238/500 passes · 17/30 days · 56%
Dual Threshold (passes + time): Development <100 · Verified 100+ · Consistent 500+ & 30d · Gold 1K+ & 90d · Gold Master 5K+ & 180d
Effective Maturity = min(Analysis, Collection). A protocol’s assessed level reflects the lower of its two layers — both collection and analysis must independently reach each threshold. No layer is exempted.
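The min rule is simple enough to sketch in Go (the engine's implementation language, per the file paths cited elsewhere on this page); the tier constants and function names here are illustrative, not the engine's actual code:

```go
package main

import "fmt"

// Tiers in ascending order of maturity, mirroring the dual-threshold ladder.
const (
	Development = iota
	Verified
	Consistent
	Gold
	GoldMaster
)

var tierNames = [...]string{"Development", "Verified", "Consistent", "Gold", "Gold Master"}

// effectiveMaturity returns the lower of the two layer tiers: both
// Collection and Analysis must independently reach a level to hold it.
func effectiveMaturity(analysis, collection int) int {
	if collection < analysis {
		return collection
	}
	return analysis
}

func main() {
	// Analysis at Consistent but Collection still at Verified → effective Verified.
	fmt.Println(tierNames[effectiveMaturity(Consistent, Verified)])
}
```

A protocol therefore cannot advertise a level its weaker layer has not earned.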

Hash Integrity Audit

Every analysis result is SHA-3-512 (Keccak, NIST FIPS 202) hashed at creation using a canonical, deterministic serialization of all protocol findings. This audit recomputes hashes from stored results and compares against the retained posture hash to verify no data has been altered post-analysis.
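A minimal sketch of the sealing step, assuming Go 1.24+ for the standard-library crypto/sha3 package; the field names below are hypothetical, not the engine's actual posture schema:

```go
package main

import (
	"crypto/sha3"
	"encoding/hex"
	"fmt"
	"strings"
)

// canonicalPosture joins posture fields in a fixed, deterministic order
// with a pipe delimiter: identical findings always serialize identically.
func canonicalPosture(fields []string) string {
	return strings.Join(fields, "|")
}

// postureHash seals the canonical string with SHA-3-512 (NIST FIPS 202).
func postureHash(fields []string) string {
	sum := sha3.Sum512([]byte(canonicalPosture(fields)))
	return hex.EncodeToString(sum[:])
}

func main() {
	fields := []string{"spf:pass", "dmarc:reject", "dnssec:signed"}
	fmt.Println(len(postureHash(fields))) // 128 hex characters (64-byte digest)
}
```

Because the serialization is canonical, anyone holding the same posture fields can recompute the digest and compare it against the retained hash.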

Audit window: 100 of 6223 hashed analyses (most recent 100, ordered by creation date)

100 Audited
100 Verified
0 Mismatched
100% Integrity

All 100 of 100 audited results verified — no posture hash mismatches detected. Tamper-evident audit trail intact.

Calibration Validation

Empirical accuracy of the confidence scoring system, measured by running 129 golden test cases across 5 resolver agreement scenarios (645 total predictions). This answers: “when we state a confidence level, how often are we actually correct?”

Brier Score excellent
0.0018
Excellent — near-perfect probabilistic accuracy
Scale: 0.0 (perfect) — 0.25 (no skill) — 1.0 (worst). Reference: Brier (1950)
Expected Calibration Error good
0.0310
Good — minor calibration gap, operationally reliable
Mean |predicted − observed| across bins, weighted by population. Reference: Naeini et al. (2015)

Reliability Diagram

Each bin shows how well stated confidence matches observed accuracy. A perfectly calibrated system would have predicted = observed in every bin.

Confidence Bin | Predictions | Predicted | Observed | Gap
80–90% | 14 | 88.0% | 100.0% | 0.1200
90–100% | 631 | 97.1% | 100.0% | 0.0290

Per-Protocol Calibration

Calibration gap by protocol — how far each protocol’s stated confidence deviates from observed accuracy. Sorted from best to worst calibrated.

Protocol | Cases | Mean Confidence | Pass Rate | Brier | Gap | Rating
DMARC | 120 | 98.8% | 100% | 0.0002 | 0.0120 | excellent
CAA | 50 | 98.0% | 100% | 0.0006 | 0.0200 | excellent
SPF | 100 | 98.0% | 100% | 0.0006 | 0.0200 | good
TLS-RPT | 25 | 97.2% | 100% | 0.0012 | 0.0280 | good
DNSSEC | 125 | 96.8% | 100% | 0.0015 | 0.0320 | good
DKIM | 40 | 96.0% | 100% | 0.0024 | 0.0400 | good
MTA-STS | 60 | 96.0% | 100% | 0.0024 | 0.0400 | good
BIMI | 55 | 95.2% | 100% | 0.0035 | 0.0480 | good
DANE/TLSA | 70 | 94.0% | 100% | 0.0054 | 0.0600 | adequate
Methodology: Predictions use fixed rawConfidence=1.0 (engine predicts “correct”) with 5 resolver agreement levels (5/5 through 1/5) per test case. No label leakage — ground-truth outcomes are never used as prediction inputs. The shrinkage estimator w·C_raw + (1−w)·α/(α+β) is what produces the varying confidence levels as measurement quality degrades.

Confidence Degradation Log

No degradation events recorded. All protocols have maintained continuous passing status since first evaluation.

Protocol Test Dossier

Each protocol is audited against deterministic test cases grounded in specific RFC sections. Below is what ICAE tests for each protocol — the signals, the standards, and the methodology.

SPF — 20 total (19 analysis + 1 collection) test cases

Validates qualifier classification (~all, -all, +all, ?all), DNS lookup counting (RFC 7208 §4.6.4), 10-lookup limit enforcement, no-mail intent detection, multiple-record error handling (§3.2), record classification (valid vs. spf-like), and cross-protocol RFC 7489 §10.1 premature rejection warnings.

Methodology: Deterministic input → expected output. No live DNS. Pure logic validation.

DMARC — 24 total (21 analysis + 3 collection) test cases

Validates policy enforcement logic (reject/quarantine/none per RFC 7489 §6.3), partial percentage coverage, SPF-only exposure, DMARC-without-SPF gaps, null MX no-mail domains (RFC 7505), and structured color/severity mapping for spoofability assessment.

Methodology: Combinatorial policy matrix → expected verdict + severity color.

DKIM — 8 total (7 analysis + 1 collection) test cases

Validates RSA key strength classification (1024-bit weak, 2048-bit adequate per RFC 8301), Ed25519 key type parsing (RFC 8463), revoked key detection (empty p= per RFC 6376 §3.6.1), test mode flag detection (t=y), and provider fingerprinting (Google, Microsoft 365).

Methodology: Synthetic DKIM records → expected key analysis + provider classification.

DNSSEC — 25 total (18 analysis + 7 collection) test cases

Validates chain-of-trust verdicts (signed/unsigned/broken per RFC 4033 §2), tampering exposure assessment, DS digest classification (SHA-256 per RFC 8624 §3.3), enterprise DNS provider detection (Cloudflare, AWS Route 53, Google Cloud DNS, Azure, Akamai, NS1 per RFC 1035), and inheritance chain classification.

Methodology: Simulated NS/DS inputs → expected verdict + provider fingerprint.

DANE/TLSA — 14 total (10 analysis + 4 collection) test cases

Validates TLSA usage type parsing (DANE-EE usage 3 per RFC 7672 §3.1), deprecated usage recommendations (usage 0 triggers RFC 7672 advisory), MX host extraction (RFC 5321 §5), full-coverage verdict logic, and no-TLSA informational classification.

Methodology: Synthetic TLSA + MX inputs → expected verdict + coverage status.

MTA-STS — 12 total (9 analysis + 3 collection) test cases

Validates mode enforcement logic (enforce=success, testing=warning per RFC 8461 §5), policy line parsing (version, mode, max_age, mx per §3.2), STS record filtering (§3.1), and policy ID extraction.

Methodology: Synthetic DNS records + policy bodies → expected parsed fields.

CAA — 10 total (6 analysis + 4 collection) test cases

Validates CA issuer identification (Let’s Encrypt, DigiCert per RFC 8659 §4), issuewild detection (§4.3), iodef record detection (§4.4), and human-readable message construction.

Methodology: Synthetic CAA records → expected parsed issuers + flags.

BIMI — 11 total (9 analysis + 2 collection) test cases

Validates record filtering (v=BIMI1 per RFC 9495 §3), logo URL extraction, VMC (Verified Mark Certificate) URL extraction, and absent-VMC null handling.

Methodology: Synthetic BIMI records → expected parsed URLs + null checks.

TLS-RPT — 5 total (3 analysis + 2 collection) test cases

Validates TLS-RPT URI extraction from rua fields (RFC 8460 §3), plus cross-protocol cryptographic strength classification: DKIM key strength (2048-bit RSA adequate per RFC 8301, Ed25519 strong) and DS digest type classification (SHA-256 per RFC 8624 §3.3).

Methodology: Synthetic TLS-RPT records → expected URI extraction; known algorithm + key size inputs → expected strength label.

Maturity Levels

Modeled after the Capability Maturity Model (CMM) developed at Carnegie Mellon’s Software Engineering Institute. Five tiers of sustained correctness — earned through deterministic test runs, never self-assigned.

The dual-threshold system (consecutive passes AND elapsed time) prevents maturity inflation. A burst of 5,000 runs in one day cannot achieve Gold Master — the 180-day time requirement ensures tests have run across multiple code versions, infrastructure changes, and resolver conditions. This mirrors how real-world confidence is earned: through sustained performance, not a single marathon session.

Development

Fewer than 100 consecutive passing audit runs. The engine is learning.

Verified

100+ consecutive passes. Results are reliable but still maturing.

Consistent

500+ passes over 30+ days with no regressions. Production-grade correctness.

Gold

1,000+ passes over 90+ days. Battle-tested across diverse domains.

Gold Master

5,000+ passes over 180+ days. The highest confidence tier — reference-grade intelligence.
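The dual threshold above reduces to a single ladder check; a sketch, not the engine's actual code:

```go
package main

import "fmt"

// tierFor applies the dual threshold: consecutive passes AND elapsed days
// must both clear a level before it is awarded.
func tierFor(passes, days int) string {
	switch {
	case passes >= 5000 && days >= 180:
		return "Gold Master"
	case passes >= 1000 && days >= 90:
		return "Gold"
	case passes >= 500 && days >= 30:
		return "Consistent"
	case passes >= 100:
		return "Verified"
	default:
		return "Development"
	}
}

func main() {
	fmt.Println(tierFor(5000, 1))  // Verified: a one-day burst cannot reach Gold Master
	fmt.Println(tierFor(4783, 19)) // Verified: the 30-day gate still blocks Consistent
}
```

The second call mirrors the current SPF state in the matrix above: 4,783 passes comfortably clear the Consistent pass count, but only 19 of the required 30 days have elapsed.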

Two-Layer Auditing

Each protocol is audited at two independent layers:

Collection Layer

Validates that DNS records are queried, retrieved, parsed, and filtered correctly. Tests cover multi-resolver consensus algorithms (5-resolver agreement checks), record type extraction (MX hosts, CAA issuers), record filtering (identifying valid BIMI, MTA-STS, SPF records from TXT noise), TLSA parsing, NS provider classification, and DKIM key parsing. Currently 27 test cases.

Analysis Layer

Validates that collected data is interpreted correctly against RFC standards. Tests cover SPF qualifier classification, DMARC policy enforcement logic, DKIM key strength assessment, DNSSEC chain-of-trust verdicts, brand impersonation verdicts, CAA issuer identification, DANE coverage assessment, and regression guards from every past correctness bug. Currently 102 test cases.

Timeout & Efficiency Strategy

DNS Tool manages timeout budgets across multiple concurrent lookups to maximize intelligence coverage while respecting external service constraints.

Parallel Execution

DNS, SPF, DMARC, DKIM, DNSSEC, CT logs, MTA-STS, TLS-RPT, BIMI, CAA, infrastructure, and security.txt are dispatched concurrently. DANE and SMTP run sequentially after MX resolution. Total analysis targets 10–30 seconds for most domains.

Per-Section Budgets

Each section has independent timeout budgets: DNS lookups (5s per resolver), CT log queries (15s with cooldown), SMTP transport probes (10s per host per port), and HTTP fetches for MTA-STS/BIMI/security.txt (3–5s each). No single section can block the entire analysis.

Graceful Degradation

When a section times out or errors, the analysis continues with remaining sections. Timed-out sections are flagged with a partial-failure banner so you know exactly which data may be incomplete. Re-analysis retries failed sections.

Remote Probe Failover

SMTP probing uses dedicated remote infrastructure (US region) for reliable port 25/465/587 access. If the remote probe is unavailable (network, auth, rate limit), the system falls back to local direct probing. Rate limiting: 30 requests per 60 seconds per client.

Efficiency Tracking

The ICAE Collection layer audits timeout handling as part of protocol correctness. Proper timeout behavior (returning informational status vs. crashing, logging appropriately, enabling re-analysis) is tested alongside data correctness. Timeout patterns feed into maturity progression.

Scanning Philosophy

DNS Tool is designed to be a responsible participant in every system it touches. Our approach: gather the intelligence we need while leaving the smallest possible footprint.

Minimal Footprint

Analysis uses standard DNS protocol queries and lightweight HTTP HEAD/GET requests with per-section timeout budgets. No brute-force enumeration, no credential stuffing, no port scanning beyond mail transport (25/465/587). Exposure checks use 200ms inter-request delays to avoid overwhelming target infrastructure.

Adaptive Rate Awareness

Third-party services (certificate transparency logs, RDAP registries) are monitored with telemetry-based exponential backoff — automatic cooldown from 5 seconds to 5 minutes when services signal degradation. When a source is unavailable or rate-limited, DNS Tool says so honestly rather than hiding the gap.

Symbiotic Interfacing

Every external data source is documented on the Sources page with its rate limits, methodology, and verification commands. SecurityTrails is user-key-only and never called automatically. Community services like Team Cymru are queried via standard DNS protocol with no API keys required.

Honest Reporting

When a section times out, gets rate-limited, or encounters an error, the report says exactly that — never “no issues found” when the data simply could not be checked. Four clear states: success, rate-limited, error, and partial. Transparency is non-negotiable.

Why This Matters

A security grade without a disclosed confidence level is an assertion, not an analysis. The ICAE provides full transparency into analytical correctness — because the score means nothing if you can’t see how certain we are of our own results.

Every protocol’s confidence level is backed by a verifiable count of consecutive audit passes. No black boxes. No hand-waving. Every claim backed by deterministic test cases.

Intelligence Currency Levels

Intelligence Currency Audit Engine (ICuAE)

Companion to ICAE. While ICAE measures correctness (did we interpret the data right?), ICuAE measures currency (is the data still valid?). Five standards-grounded dimensions evaluate data freshness, TTL compliance, completeness, source credibility, and TTL relevance for every scan.

Runtime Performance

ICuAE measures how close each scan’s data comes to the theoretical ideal — a perfectly tuned, machine-locked collector that requests every DNS record at exactly the right cadence, receives responses within authoritative TTL windows, achieves complete multi-resolver consensus, and returns a full record set. A score of 100 means the collected data is indistinguishable from ground truth. Because real-world DNS inherently fluctuates (caches age, resolvers disagree, optional records vary by domain), we track statistical stability across scans rather than pass/fail maturity.

5105 Scans Evaluated
Average Score: 74.7 (Adequate)
Stability: Good (σ=5.2)

Grade Distribution

How often each currency grade appears across all evaluated scans. A healthy system clusters toward Excellent and Good.

Adequate: 2878 scans (56%)
Good: 2226 scans (44%)
Degraded: 1 scan (0%)

Per-Dimension Averages

Each of the five currency dimensions, averaged across all scans. Low-scoring dimensions indicate systemic patterns; tuning hints suggest how to improve collection fidelity.

Dimension | Standard | Avg Score | Grade | Samples
Completeness | NIST SP 800-53 SI-7 | 35.2 | degraded | 5105
  Tuning Advisory: Multiple expected record types are consistently missing. Expanding the query set or adding retry logic for failed lookups would improve coverage.
Currentness | ISO/IEC 25012 | 100.0 | excellent | 5105
Source Credibility | ISO/IEC 25012 + SPJ | 97.6 | excellent | 5105
TTL Compliance | RFC 8767 | 97.7 | excellent | 5105
TTL Relevance | NIST SP 800-53 SI-7 | 43.0 | degraded | 5105
  Tuning Advisory: Observed TTLs deviate significantly from expected ranges for their record types. This often indicates domain-side misconfiguration rather than collection issues.
Last evaluated: 2026-03-10 15:08 UTC

Why Track Currency Separately?

ICAE — Correctness

“Did we read the data right?” ICAE runs deterministic test vectors against our analysis engine. If SPF says ~all, does the tool correctly identify it as a softfail? This is pass/fail, so we track consecutive passes and maturity tiers.

ICuAE — Currency

“How close is the collected data to ground truth?” ICuAE scores each scan against a theoretical ideal — a machine-locked collector with perfect TTL compliance, complete records, and full resolver consensus. Real-world DNS fluctuates, so instead of pass/fail we track statistical stability — rolling averages and variance across scans.

Per ICD 203, confidence requires both: an accurate interpretation of data that is also current. One without the other is incomplete intelligence.

Excellence Benchmarks

What does “near-ideal” DNS collection look like in the real world? These targets are derived from large-scale passive DNS observation networks and authoritative resolver operations that approach the theoretical ideal.

Dimension | Excellence Target | Real-World Reference
TTL Compliance | ≥95% | Farsight DNSDB and OpenINTEL passive sensors collect at TTL-aligned intervals. RFC 8767 defines serve-stale as an explicit protocol extension, making non-compliant caching measurably detectable.
Completeness | ≥98% | Large-scale collectors (RiskIQ, Censys) query all standard record types per zone. ≥98% coverage of the core set (A, AAAA, MX, TXT, NS, SOA, CAA, DMARC, SPF) is achievable for any domain that publishes them.
Source Credibility | ≥90% | Google Public DNS, Cloudflare 1.1.1.1, and Quad9 operate at global scale with near-identical authoritative views. ≥90% multi-resolver agreement is standard; unanimity is expected for NS and SOA records.
Currentness | <0.5× TTL | DNSPerf tests from 200+ locations every 60 seconds. Median data age below half the authoritative TTL indicates the collector is querying well within the freshness window.
TTL Relevance | Within Range | NIST SP 800-53 SI-7 treats information integrity as a measurable property. TTLs within the typical range for their record type (3600s for TXT, 86400s for NS) indicate well-configured authoritative zones.

Where these numbers come from: Farsight Security’s DNSDB processes billions of DNS observations daily from sensor networks worldwide. OpenINTEL (University of Twente) performs daily active measurements across all .com, .net, and .org zones. These systems represent the closest real-world approximation to the theoretical machine-locked ideal. Our scoring model uses their operational characteristics as the upper boundary of what is achievable.

Self-Tuning Intelligence Pipeline

ICuAE is not just a measurement engine — it is the diagnostic instrument for the collection pipeline itself. By tracking per-dimension statistics across scans, ICuAE identifies exactly which stage of the analysis chain needs attention.

Phase 1: Advisory

Dimension-level tuning hints surfaced in the Per-Dimension Averages table. When a dimension scores below 90, ICuAE explains what’s happening and suggests specific improvements. Status: Live

Phase 2: Suggested Config

Generate recommended scanner profiles from rolling statistics — resolver set, retry thresholds, record type priorities — requiring explicit approval before applying. Generation: Live · Approval: On the Roadmap

Phase 3: Adaptive Tuning

Fully automatic, non-destructive adjustments (timing jitter, retries, resolver weighting) with rollback if stability decreases. Gated by minimum sample count and confidence thresholds. Status: On the Roadmap

The vision: With enough scans and enough science, the confidence engine tunes TTLs, resolver weighting, query cadence, and retry logic until the system achieves the highest possible fidelity against the theoretical ideal — automatically, measurably, and with full provenance.

Standards Foundation

ICuAE is grounded in five authoritative standards from the intelligence community, information quality, and journalism ethics.

ICD 203 CIA Timeliness

Intelligence Community Directive 203 identifies timeliness as one of five core analytic standards. Data that was accurate yesterday may be misleading today.

NIST SP 800-53 SI-7

NIST SI-7 addresses information integrity — ensuring data has not been improperly modified and remains complete. ICuAE operationalizes completeness and TTL relevance as integrity dimensions for DNS data.

ISO 25012 Currentness

ISO/IEC 25012 defines “Currentness” — data of the right age for its context. DNS records have inherent validity windows defined by TTL values.

RFC 8767 TTL

RFC 8767 defines TTL-based cache expiration and serve-stale behavior. ICuAE detects when resolver TTLs exceed authoritative values — which may indicate serve-stale behavior, timing skew, or cache misconfiguration.

SPJ Source Ethics

SPJ Code of Ethics requires multiple independent sources for verification. ICuAE measures multi-resolver agreement as a credibility indicator.

Five Measurement Dimensions

Dimension | Standard | What It Measures
Currentness | ISO/IEC 25012 | Data age relative to its TTL-derived validity window. Are the DNS records still within their expected freshness period?
TTL Compliance | RFC 8767 | Whether resolver TTLs respect authoritative limits. Exceedances may indicate RFC 8767 serve-stale behavior, timing skew, or cache misconfiguration.
Completeness | NIST SI-7 | Percentage of expected record types with authoritative TTL data. Gaps reduce overall intelligence quality.
Source Credibility | ISO + SPJ | Multi-resolver agreement scoring. When all five resolvers return identical data, source credibility is highest.
TTL Relevance | NIST SI-7 | Observed TTL versus typical range for each record type. Extreme deviations may indicate misconfiguration.

Deterministic Test Matrix

29 test cases verify ICuAE scoring logic across all five dimensions. Every grade boundary, edge case, and nil-input path is tested deterministically — no randomness, no approximation.

Score-to-Grade Boundaries: 1 case (All Standards)
Currentness: 6 cases (ISO/IEC 25012)
TTL Compliance: 5 cases (RFC 8767)
Completeness: 4 cases (NIST SP 800-53 SI-7)
Source Credibility: 3 cases (ISO/IEC 25012 + SPJ)
TTL Relevance: 6 cases (NIST SP 800-53 SI-7)
Integration & Constants: 4 cases (All Standards)

Currency Grading Scale

The 0–100 score measures proximity to a theoretical ideal: a perfectly tuned collection system that requests every record type at exactly the right cadence, receives responses within authoritative TTL windows, achieves complete multi-resolver consensus, and returns a full record set with zero gaps. A score of 100 means the data is indistinguishable from what an ideally configured, machine-locked collector would produce. Each dimension is scored independently; the overall grade is their average.

Grade | Range | What It Means / Signal
Excellent | 90–100 | Data was collected within authoritative TTL windows, all resolvers agree, and the record set is complete. Near-ideal collection fidelity. The system is performing at or near the theoretical machine-locked ideal. Minimal drift from ground truth.
Good | 75–89 | Minor deviations from ideal: perhaps one resolver returned a slightly stale cache, or a non-critical record type was absent. Data remains operationally reliable. Healthy collection with small imperfections. Acceptable for production intelligence.
Adequate | 50–74 | Measurable gaps: some resolvers served cached data beyond authoritative TTL, optional record types are missing, or source agreement is partial. Data is usable but not pristine. The domain’s DNS configuration has real-world imperfections common in production environments. Worth investigating but not alarming.
Degraded | 25–49 | Significant staleness or incompleteness: resolver caches substantially exceed authoritative TTLs, multiple record types are absent, or resolvers disagree on fundamental records. Data collection is meaningfully distant from the ideal. Results should be interpreted with caution; re-scan recommended after cache expiry.
Stale | 0–24 | Severe currency failure: data is likely cached well beyond TTL, critical record types are absent, or resolvers returned fundamentally conflicting answers. The collected data does not reflect current ground truth. Per ICD 203, stale data should not be used for confidence assessments without explicit caveats.

Why 0–100? ISO/IEC 25012 defines timeliness as a quantitative data quality dimension — it exists on a continuum, not as a binary. A 0–100 normalized score allows statistical tracking (rolling averages, standard deviation, trend analysis) that binary pass/fail cannot. NIST SP 800-53 SI-7 (Information Integrity) similarly treats data completeness and validity as measurable properties requiring periodic verification. The five-tier grading scale maps the continuous score to actionable categories, paralleling how ICD 203 maps analytic confidence to five levels (almost no confidence through high confidence).
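The score-to-grade mapping is a plain threshold ladder; a sketch, not the engine's actual code:

```go
package main

import "fmt"

// gradeFor maps a 0–100 currency score onto the five-tier scale above.
func gradeFor(score float64) string {
	switch {
	case score >= 90:
		return "Excellent"
	case score >= 75:
		return "Good"
	case score >= 50:
		return "Adequate"
	case score >= 25:
		return "Degraded"
	default:
		return "Stale"
	}
}

func main() {
	fmt.Println(gradeFor(74.7)) // Adequate, the fleet average reported above
}
```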

Mathematical Foundations

Every confidence score is derived from deterministic, standards-grounded mathematics — not heuristics or machine learning. The formulas below are the actual computations running in the engine.

EWMA Drift Detection

The Exponentially Weighted Moving Average tracks currency score stability over time. Each new scan updates the statistic, giving recent observations more weight than historical ones.

$$ Z_t = \lambda \cdot X_t + (1 - \lambda) \cdot Z_{t-1} $$

Control limits detect statistically significant drift — not just any change, but changes that exceed normal process variation:

$$ \text{UCL/LCL} = \mu_0 \pm L \cdot \sigma \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1-\lambda)^{2t}\right]} $$

Where \(\lambda\) is the smoothing factor (0.2), \(L\) is the control limit multiplier (3σ), and \(t\) is the observation period. Based on NIST/SEMATECH Engineering Statistics Handbook §6.3.2.4.

Implementation: icuae/ewma.go → EWMAControlChart.Add(), EWMAControlChart.IsOutOfControl() · Parameters: NewEWMAControlChart(λ=0.2, μ0=50, σ=10, L=3.0)

Bootstrap note: The initial parameters (μ0=50, σ=10, L=3.0) are heuristic defaults that allow monitoring to begin immediately without a Phase I calibration dataset. σ is refined adaptively from observed data after 10+ observations (see Add() method). These are operational starting points per NIST/SEMATECH §6.3.2.4, not values fitted from historical in-control DNS data.

Reliability-Weighted Shrinkage Calibration

Each protocol carries an empirical prior — a Beta distribution encoding historical detection reliability. Measurement quality (resolver agreement) determines how much the raw observation is trusted versus the prior anchor — a Bayesian-inspired shrinkage estimator:

$$ C_{\text{calibrated}} = w \cdot C_{\text{raw}} + (1 - w) \cdot \frac{\alpha}{\alpha + \beta} $$

Where \(w = \frac{\text{agreeing resolvers}}{\text{total resolvers}}\) is measurement quality, and \(\frac{\alpha}{\alpha+\beta}\) is the prior mean from a \(\text{Beta}(\alpha, \beta)\) distribution for the protocol category. When resolver agreement is low, the prior mean anchors the estimate; as agreement increases, the raw observation dominates. This is a convex shrinkage estimator — structurally similar to, but distinct from, the true Beta-Bernoulli posterior mean \(E[\theta|D] = \frac{\alpha+s}{\alpha+\beta+n}\), where the weight on data is derived from observation count rather than set independently. Prior parameters evolve via conjugate updating: each passing ICAE test increments \(\alpha\), each failure increments \(\beta\).

Implementation: icae/priors.go → CalibrationEngine.CalibratedConfidence() · Per-protocol Beta priors defined in CalibrationEngine.priors map
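The estimator itself is one line of arithmetic; a sketch (the α=9, β=1 prior is illustrative, not a real protocol prior):

```go
package main

import "fmt"

// calibratedConfidence blends the raw observation with the Beta prior
// mean, weighted by measurement quality w = agreeing/total resolvers.
func calibratedConfidence(raw float64, agreeing, total int, alpha, beta float64) float64 {
	w := float64(agreeing) / float64(total)
	return w*raw + (1-w)*alpha/(alpha+beta)
}

func main() {
	// Full resolver agreement: the raw observation dominates.
	fmt.Printf("%.3f\n", calibratedConfidence(1.0, 5, 5, 9, 1)) // 1.000
	// One agreeing resolver of five: the prior mean (0.9) anchors the estimate.
	fmt.Printf("%.3f\n", calibratedConfidence(1.0, 1, 5, 9, 1)) // 0.920
}
```

This reproduces the calibration-validation behavior described earlier: with raw confidence fixed at 1.0, degrading resolver agreement alone pulls the calibrated value down toward the prior mean.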

Currency Score Normalization

Each ICuAE dimension is scored on a continuous 0–100 scale. The overall currency score is the weighted mean across all dimensions:

$$ S_{\text{currency}} = \sum_{i=1}^{n} w_i \cdot s_i \quad \text{where} \quad \sum_{i=1}^{n} w_i = 1 $$

Dimension weights are equal by default (each \(w_i = \frac{1}{n}\)). Per ISO/IEC 25012, timeliness is a quantitative data quality dimension — the continuous score enables statistical tracking (rolling averages, standard deviation, trend analysis) that binary pass/fail cannot.

Implementation: icuae/icuae.go → BuildCurrencyReport() · Five dimensions scored independently via score* functions, averaged into composite grade
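With equal weights the composite is just the mean of the five dimension averages; a sketch, using the per-dimension values reported earlier on this page:

```go
package main

import "fmt"

// currencyScore is the weighted mean of dimension scores; with equal
// weights it reduces to a simple average.
func currencyScore(scores, weights []float64) float64 {
	var s float64
	for i, v := range scores {
		s += weights[i] * v
	}
	return s
}

func main() {
	// Completeness, Currentness, Source Credibility, TTL Compliance,
	// TTL Relevance: the five per-dimension averages, equally weighted.
	scores := []float64{35.2, 100.0, 97.6, 97.7, 43.0}
	w := []float64{0.2, 0.2, 0.2, 0.2, 0.2}
	fmt.Printf("%.1f\n", currencyScore(scores, w)) // 74.7, the composite reported above
}
```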

Cryptographic Integrity

Every analysis result is sealed with a SHA-3-512 digest over a canonical pipe-delimited representation of posture fields. The hash function is the NIST FIPS 202 standard (Keccak sponge construction):

$$ H = \text{SHA-3-512}\left(\text{Canonical}(R)\right) $$

Where \(R\) is the canonical posture representation — protocol statuses, records, policies, and posture labels joined in deterministic field order. The digest is independently verifiable — anyone with the same posture fields can recompute and confirm integrity.

Implementation: analyzer/posture_hash.go → CanonicalPostureHash() · Pipe-delimited canonical string with deterministic field ordering, verified by icae/hash_audit.go

Dual Engine Architecture

DNS Tool employs two companion engines that measure scientifically distinct properties of intelligence quality. ICAE (correctness) and ICuAE (currency) are never conflated — accuracy and timeliness are independent dimensions per ICD 203 and NIST SP 800-53. These engines are one of five analytic perspectives that together form our Symbiotic Security model.

ICAE — Correctness

“Did we interpret the DNS data correctly?” Deterministic golden-rule tests with per-protocol maturity tracking and cryptographic hash integrity.

ICuAE — Currency

“Is the DNS data still valid/current?” Five standards-grounded dimensions evaluated per-scan with TTL-aware validity windows and multi-resolver credibility.

Straight talk about your data.

We use two cookies, both essential:

  • _csrf — Prevents cross-site request forgery. Required for form submissions. Security-only.
  • _dns_session — Only exists if you choose to sign in. No account required to use DNS Tool.

We log your IP address for two reasons: rate limiting (so nobody abuses the service) and security (identifying malicious actors and complying with legal obligations). We check source geography for analysis accuracy — DNS responses vary by region, and knowing which resolver answered from where makes the science better.

No tracking cookies. No analytics cookies. No ad networks. No data brokers. Our code is open-core — the application framework is publicly available under BUSL-1.1 with timed Apache-2.0 conversion. Verify it yourself.

If you create an account and want out, account deletion removes your login and scan history. Public domain analyses remain available because they contain only public DNS records, already hashed. Full details: Privacy Pledge.