Methodology
The Sovereignty Plane measures the gap between what a state has declared about AI sovereignty and what it can actually do. Every country in the sample is scored on two axes. Formal sovereignty captures the declared architecture: strategies, laws, institutions, partnerships. Substantive sovereignty captures the operational capacity: enforcement, procurement, assurance, delivery, negotiation.
The same distinction applies to AI readiness. Formal readiness is the architecture of declared preparedness for AI adoption: strategies, partnership announcements, and adoption roadmaps. Substantive readiness is the operational capacity to absorb capability as it diffuses, on terms the state can govern. The framework treats sovereignty and readiness as paired concepts because in the AI domain they are inseparable. The readiness to absorb capability without losing agency is itself the substantive form of sovereignty, and the architecture that signals readiness without the capacity beneath it is the same architecture that produces sovereignty theatre.
The framework rests on the wager that the formal and the substantive often diverge. A country can publish a sophisticated AI strategy and still be unable to enforce a single rule against a hyperscaler. A country can run a national AI deployment at scale and still lack the formal architecture that the deployment demands. Both patterns matter. The Plane makes both visible.
v2.2 update (April 2026). The S1 (Enforcement Capacity) indicator set has been redefined to separate latent capacity from observable enforcement. S1.1 now measures statutory enforcement authority, S1.2 demonstrated enforcement capacity, S1.3 compliance monitoring infrastructure, and S1.4 documented enforcement actions. This change recovers credit for countries with statutory architecture that has not yet been exercised. See /changelog for the cell-by-cell record.
v2.3 update (April 2026). Conceptual framing pass. The framework now explicitly pairs AI sovereignty with AI readiness. The formal/substantive distinction applies to both. A new opening section on the two-speed AI economy frames the strategic context. No data changes. All cohort scores and 836 indicator points are unchanged from v2.2. See /changelog for the full record.
The two-speed AI economy
The strategic landscape is shaped by two trends moving in opposite directions. The framework reads against both.
The first trend is concentration at the frontier. The cost of training a state-of-the-art model has risen from roughly 10 million dollars in 2020 to between 100 million and 1 billion dollars in 2024. Epoch AI projects that frontier training runs will reach 10 to 100 billion dollars by 2030, with single runs consuming electricity comparable to a mid-sized city. Compute is the binding input among compute, data, and algorithms because it cannot be substituted by talent or effort. At the projected scale, frontier operators will be concentrated in a small number of well-capitalised firms in the United States and China.
The second trend is diffusion. The cost of operating a model with a given capability falls by a factor of roughly three to five each year as efficient training methods, distilled smaller models, and open-weights releases reach the market. A capability that costs hundreds of millions of dollars to develop today reaches commodity inference prices within three to five years.
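To get a feel for how quickly that decline compounds, the short Python sketch below multiplies the annual factor over the stated horizon. It assumes a constant annual rate, which is a simplification for illustration rather than a claim made by the cited sources, and the function name is hypothetical.

```python
# Illustrative arithmetic only: compound a constant annual cost decline.
# Assumption: the annual factor stays fixed over the whole horizon.
def cost_reduction_factor(annual_factor, years):
    """Total factor by which operating cost falls after `years` years."""
    return annual_factor ** years


slow = cost_reduction_factor(3, 3)  # threefold a year for three years
fast = cost_reduction_factor(5, 5)  # fivefold a year for five years
print(f"Operating cost falls by a factor of {slow} to {fast}")
```

Under that assumption the cumulative reduction ranges from about 27-fold to about 3,125-fold, which is the order of magnitude needed to take a frontier capability to commodity prices.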
The two trends together reframe the question for African states. The question is not whether to match frontier compute spending. It is whether the country will be ready to govern, procure, and deliver the capability that diffuses from the frontier. Readiness is substantive: procurement contracts that embed portability and audit rights, data infrastructure that allows integration, regulatory expertise that supports oversight, public-sector talent that converts deployment into delivery, and assurance institutions that confirm fitness for purpose. The Sovereignty Plane measures all five.
Sources: Buchanan (CSET 2020), Cottier et al. (Epoch AI 2024), Epoch AI (2024), Hobbhahn et al. (Epoch AI 2024).
Sample design
The core sample is fifteen African states stratified across the African Union's five geographic regions, plus four reference countries from the Tony Blair Institute's State of Compute Access framework. Total sample: nineteen.
| Region | Countries |
|---|---|
| North Africa | Egypt, Morocco, Tunisia |
| West Africa | Nigeria, Ghana, Senegal, Sierra Leone |
| East Africa | Kenya, Rwanda, Ethiopia |
| Central Africa | Cameroon, Gabon |
| Southern Africa | South Africa, Botswana, Malawi |
| Reference | France, Japan, UAE, Brazil |
The selection logic prioritises capacity diversity within each region. North Africa contributes its strategy publishers. West Africa is over-sampled at four because Nigeria and Ghana both anchor the paper's argument. East Africa includes the enforcement breakthrough case (Kenya) and a regional digital leader (Rwanda) alongside a low-evidence comparator (Ethiopia). Central Africa has two countries because the evidence base is thinner. Southern Africa includes the strongest substantive case in the corpus (South Africa) plus a documented deployment outlier (Malawi).
The four reference countries serve as calibration anchors. They are scored using the same rubric as the African states. They show what the top of the rubric looks like in practice, so that reviewers can see that the rubric discriminates rather than caps.
§02 Two axes, eleven dimensions, forty-four indicators
Formal sovereignty is the y-axis. Six dimensions, four indicators each, twenty-four indicators in total.
| Code | Dimension | What it measures |
|---|---|---|
| F1 | National AI Strategy | Strategy publication, scope, budget, implementation plan |
| F2 | Data Protection Law | Statute, DPA, cross-border regime, enforcement framework |
| F3 | Regulatory Institutions | AI body, digital ministry, standards, ethics guidance |
| F4 | International Commitments | Continental, multilateral, bilateral, norm alignment |
| F5 | Digital Infrastructure Policy | Broadband, cloud, localisation, cybersecurity |
| F6 | Partnership Architecture | Strategic, donor, bilateral, PPP frameworks |
Substantive sovereignty is the x-axis. Five dimensions, four indicators each, twenty indicators in total.
| Code | Dimension | Test question |
|---|---|---|
| S1 | Enforcement Capacity | Can the state enforce its rules against a major AI provider? |
| S2 | Procurement Power | Can the state walk away from a vendor without losing operational continuity? |
| S3 | Assurance Infrastructure | Can the state assess whether a deployed AI system is fit for purpose? |
| S4 | Delivery Capability | Is AI generating measurable service or productivity improvements? |
| S5 | Negotiation Capability | Can the state renegotiate terms when conditions change? |
Each indicator is scored on a 0-3 ordinal scale. The composite score for each axis is normalised to a 0-100 percentage.
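The structure above can be sketched as a small Python data model that reproduces the indicator counts. The dimension codes come from the two tables; the "F1.1"-style indicator codes extend the S1.1 to S1.4 naming used for Enforcement Capacity and are otherwise an assumption, as are the variable and function names.

```python
# Dimension codes are taken from the tables above. Indicator codes such as
# "F1.1" are an assumed extension of the S1.1 to S1.4 naming.
FORMAL_DIMENSIONS = ["F1", "F2", "F3", "F4", "F5", "F6"]
SUBSTANTIVE_DIMENSIONS = ["S1", "S2", "S3", "S4", "S5"]
INDICATORS_PER_DIMENSION = 4
COUNTRIES_IN_SAMPLE = 19  # fifteen African states plus four reference countries


def indicator_codes(dimensions):
    """Expand dimension codes into indicator codes, e.g. 'S1.1' to 'S1.4'."""
    return [
        f"{dim}.{i}"
        for dim in dimensions
        for i in range(1, INDICATORS_PER_DIMENSION + 1)
    ]


formal = indicator_codes(FORMAL_DIMENSIONS)
substantive = indicator_codes(SUBSTANTIVE_DIMENSIONS)
total_points = COUNTRIES_IN_SAMPLE * (len(formal) + len(substantive))
print(len(formal), len(substantive), total_points)
```

This yields 24 formal indicators, 20 substantive indicators, and 836 indicator points across the nineteen-country sample.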
§03 Scoring rubric
Every indicator is scored against a four-point ordinal scale. The rubric is the same for formal and substantive indicators.
| Score | Label | Definition | Evidence standard |
|---|---|---|---|
| 0 | Absent | No evidence of activity, policy, or capability | No documents, no institutional presence, no press references |
| 1 | Nascent | Draft, announced, or limited pilot | Published draft, announced commitment, single pilot project |
| 2 | Enacted / Partial | Formally adopted with partial operation | Enacted law or formal launch, observable activity, evidence of implementation gaps |
| 3 | Operational / Comprehensive | Fully operational with documented practice | Operating institution with track record, documented enforcement, scaled deployment |
The rubric was kept deliberately compact. Four steps create enough discrimination to separate Sierra Leone from Kenya without inviting endless ordinal arguments between intermediate scores.
§04 Aggregation and the sovereignty gap
For each axis, the country's composite score is the sum of indicator scores divided by the maximum possible, expressed as a percentage. The formal axis has a maximum raw score of 24 indicators × 3 = 72. The substantive axis has a maximum raw score of 20 indicators × 3 = 60.
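A minimal Python sketch of this aggregation follows. The function name and the example country are hypothetical; the only logic taken from the text is the sum of 0-3 scores divided by three times the number of indicators.

```python
VALID_SCORES = {0, 1, 2, 3}


def composite_pct(scores):
    """Sum of 0-3 indicator scores as a percentage of the maximum possible."""
    if not scores:
        raise ValueError("at least one indicator score is required")
    if any(s not in VALID_SCORES for s in scores):
        raise ValueError("indicator scores must be 0, 1, 2, or 3")
    return 100 * sum(scores) / (3 * len(scores))


# Hypothetical country: every formal indicator at 2, every substantive at 1.
formal_pct = composite_pct([2] * 24)       # raw score 48 of 72
substantive_pct = composite_pct([1] * 20)  # raw score 20 of 60
print(round(formal_pct, 1), round(substantive_pct, 1))
```

Because each axis is divided by its own maximum, the two percentages are directly comparable even though the axes have different numbers of indicators.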
The sovereignty gap is the signed perpendicular distance from the country's point on the Plane to the diagonal where Formal % equals Substantive %. It is computed as (Formal % - Substantive %) / sqrt(2) and reported in percentage points.
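The gap formula translates directly into code. The sketch below is illustrative, with a hypothetical function name and made-up input values; it keeps the sign so that the direction of the divergence is preserved.

```python
import math


def sovereignty_gap(formal_pct, substantive_pct):
    """Signed distance from (substantive, formal) to the Formal = Substantive diagonal."""
    return (formal_pct - substantive_pct) / math.sqrt(2)


# Hypothetical scores, not taken from the cohort data.
print(round(sovereignty_gap(80.0, 40.0), 2))  # formal ahead of substantive
print(round(sovereignty_gap(40.0, 80.0), 2))  # substantive ahead of formal
```

One consequence of dividing by sqrt(2) is that the gap is bounded: a country at 100% on one axis and 0% on the other sits about 70.7 percentage points from the diagonal, not 100.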
A positive gap indicates formal architecture outrunning operational capacity. We call this sovereignty theatre. A negative gap indicates operational capacity outrunning formal architecture. We call this ad-hoc capability. A near-zero gap indicates coherence: either negotiated interdependence if both axes are high, or dependency by default if both are low.
Because sovereignty and readiness are paired in the AI domain, the same four labels apply to readiness with the same meaning. A state in sovereignty theatre is also in readiness theatre, declaring preparedness without absorptive capacity. A state in negotiated interdependence is both sovereignty-coherent and readiness-coherent. The labels are kept singular for parsimony, but each carries the dual reading throughout the framework.
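The four labels can be expressed as a simple classifier. This is a sketch under stated assumptions: the text does not define how close to zero a gap must be to count as coherent, nor where "high" ends and "low" begins, so the `tolerance` and `high_cutoff` parameters and their default values are hypothetical placeholders, not part of the methodology.

```python
import math


def classify(formal_pct, substantive_pct, tolerance=5.0, high_cutoff=50.0):
    """Map a point on the Plane to one of the four labels.

    `tolerance` (gap size treated as near zero) and `high_cutoff` (level
    treated as high) are illustrative assumptions, not published thresholds.
    """
    gap = (formal_pct - substantive_pct) / math.sqrt(2)
    if gap > tolerance:
        return "sovereignty theatre"
    if gap < -tolerance:
        return "ad-hoc capability"
    # Near the diagonal both axes are similar, so their mean decides high vs low.
    midpoint = (formal_pct + substantive_pct) / 2
    if midpoint >= high_cutoff:
        return "negotiated interdependence"
    return "dependency by default"


# Hypothetical points, not cohort scores.
for point in [(80, 40), (30, 60), (75, 72), (20, 18)]:
    print(point, classify(*point))
```

Any real use would need thresholds justified against the cohort data; the sketch only shows how the sign of the gap and the level of the scores jointly determine the label.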
§05 Confidence flags
Every score carries a confidence flag.
- High: at least two independent primary sources verify the score.
- Medium: one primary source, or AI-assisted inference from authoritative secondary material.
- Low: no primary source located; score based on triangulation and judgement.
Low-confidence indicators are visually marked in the tracker and disclosed in the paper's limitations section. They are not removed from the composite. Excluding them would distort the regional comparison; flagging them transparently lets reviewers weight them as they choose.
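The flag rules above can be sketched as a small function. The parameter names are hypothetical, and the caller is assumed to have already judged which primary sources are independent; the code only encodes the three thresholds stated in the list.

```python
def confidence_flag(independent_primary_sources, secondary_inference=False):
    """Return 'High', 'Medium', or 'Low' for one indicator score.

    independent_primary_sources: count of independent primary sources that
        verify the score (independence is judged by the scorer).
    secondary_inference: True if the score rests on AI-assisted inference
        from authoritative secondary material.
    """
    if independent_primary_sources >= 2:
        return "High"
    if independent_primary_sources == 1 or secondary_inference:
        return "Medium"
    return "Low"


print(confidence_flag(2))
print(confidence_flag(0, secondary_inference=True))
print(confidence_flag(0))
```

The flag travels alongside the score rather than altering it, which matches the decision to keep low-confidence indicators in the composite.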
§06 Validation states
Every score progresses through three validation states.
- AI-assisted: initial score generated by an AI research workflow with cited evidence. This is the default state at first scoring.
- Human-reviewed: the author has reviewed the AI-generated score and accepted, adjusted, or rejected it. The review action and date are recorded.
- Expert-validated: a third-party domain expert (regional researcher, regulator, or programme reviewer) has signed off on the score.
Submission to Data & Policy requires every score in the cohort to be at least Human-reviewed. Expert-validated is the gold standard and is sought for the highest-stakes country scores.
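Because the three states are ordered, the submission requirement reduces to a minimum-state check. The sketch below models this with an ordered enumeration; the class, member, and function names are hypothetical, and treating an empty cohort as not ready is an added assumption.

```python
from enum import IntEnum


class ValidationState(IntEnum):
    """Validation states in ascending order of assurance."""
    AI_ASSISTED = 1
    HUMAN_REVIEWED = 2
    EXPERT_VALIDATED = 3


def ready_for_submission(states):
    """True if every score in the cohort is at least Human-reviewed."""
    states = list(states)
    # Assumption: an empty cohort is not submittable.
    return bool(states) and all(
        s >= ValidationState.HUMAN_REVIEWED for s in states
    )


cohort = [ValidationState.HUMAN_REVIEWED, ValidationState.EXPERT_VALIDATED]
print(ready_for_submission(cohort))
print(ready_for_submission(cohort + [ValidationState.AI_ASSISTED]))
```

A single score still in the AI-assisted state is enough to block submission, which is the behaviour the requirement describes.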
§07 Data sources hierarchy
Sources are prioritised in the following order, from most to least authoritative:
- Primary government sources: legislation, gazette publications, regulator websites, ministry portals, official strategy documents.
- Authoritative regional and international trackers: African Union AI Strategy tracker, UNESCO AI Policy Observatory, the African Observatory on Responsible AI's Africa AI Policy Tool, OECD AI Policy Observatory, Carnegie AfTech.
- Peer-reviewed academic sources: CIGI policy briefs, Global Studies Quarterly, Data & Policy.
- Reputable secondary sources: Reuters, Bloomberg, Financial Times, sector-specific outlets with named bylines.
- AI-assisted retrieval: explicitly flagged when AI tools assisted in source discovery.
Where sources disagree, the most conservative score is used and the discrepancy is documented in the indicator's evidence note.
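One way to read the disagreement rule is sketched below. It assumes that "most conservative" means the lowest score any source supports, which the text does not state explicitly; the function name, the source labels, and the wording of the evidence note are all hypothetical.

```python
def resolve_disagreement(scores_by_source):
    """Return (score, note) for one indicator.

    scores_by_source maps a source label to the 0-3 score it supports.
    Assumption: the most conservative score is the lowest one.
    """
    if not scores_by_source:
        raise ValueError("at least one source is required")
    chosen = min(scores_by_source.values())
    if len(set(scores_by_source.values())) == 1:
        return chosen, None  # sources agree, nothing to document
    detail = ", ".join(
        f"{source}: {score}" for source, score in sorted(scores_by_source.items())
    )
    return chosen, f"Sources disagree ({detail}); conservative score {chosen} used."


# Hypothetical source labels and scores.
score, note = resolve_disagreement({"gazette": 2, "regional tracker": 3})
print(score)
print(note)
```

Returning the note together with the score keeps the discrepancy attached to the indicator's evidence record instead of being resolved silently.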
§08 Cutoff
The current cohort scoring reflects evidence available as of 2026-04-30. Evidence published after the cutoff is recorded in subsequent tracker versions but does not retroactively change scores in this version.
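In a scoring workflow, the cutoff amounts to a date filter on evidence items. The sketch below assumes the cutoff date is inclusive and that each evidence item carries a publication date; the record layout and names are hypothetical.

```python
from datetime import date

CUTOFF = date(2026, 4, 30)


def in_scope(published, cutoff=CUTOFF):
    """True if evidence published on `published` counts toward this version.

    Assumption: evidence dated on the cutoff day itself is included.
    """
    return published <= cutoff


# Hypothetical evidence items as (label, publication date) pairs.
evidence = [
    ("enacted statute", date(2025, 11, 3)),
    ("regulator notice", date(2026, 4, 30)),
    ("new strategy draft", date(2026, 5, 12)),
]
current = [label for label, published in evidence if in_scope(published)]
deferred = [label for label, published in evidence if not in_scope(published)]
print(current, deferred)
```

Items in the deferred list are retained for the next tracker version and leave the current scores unchanged.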
§09 Limitations
The framework has four limitations the paper discloses explicitly.
First, single-scorer bias. One author made the bulk of scoring decisions. Inter-rater reliability is partial; expert-validation rounds with the ILINA Program partially mitigate this but do not eliminate it.
Second, variable evidence base across countries. Some primary sources are not in English. The evidence bases for Cameroon and Gabon are markedly thinner than those for Nigeria or Kenya. Confidence flags transparently reflect this.
Third, snapshot nature. The AI policy landscape is moving rapidly. The 2026-04-30 cutoff captures one moment; the framework is designed to be re-applied at future cutoffs.
Fourth, judgement in qualitative scoring. The rubric does not eliminate judgement, particularly on indicators like "ethics guidance" and "negotiation capability" where contracts are rarely public. The methodology lock means definitions are stable, but interpretation always carries some scorer voice.
§10 Optional sixth signal
The methodology includes an optional sixth signal — epistemic alignment — which captures whether deployed AI systems reflect local knowledge, languages, and contexts rather than imported defaults. Indicators include African-language AI activity, indigenous knowledge representation in training data, and local institutional ownership of model artefacts.
This signal is reported as supplementary analysis rather than folded into the substantive composite. It is conceptually important to the paper's argument about negotiated intelligence but has not yet been calibrated to the same standard as the five core substantive dimensions (S1 to S5).
§11 What this framework is not
The Sovereignty Plane is a diagnostic tool, not a definitive index. It is designed to make a comparative argument about formal-versus-substantive divergence visible and quantifiable. It is not designed to rank countries on AI capability writ large, nor to predict policy outcomes. It is not a substitute for country-level case studies; it is a complement that surfaces patterns case studies can then test.
Reviewers should treat the country scores as indicative positions on a two-dimensional surface rather than as point estimates of a latent capability. The value lies in the shape of the cohort — the spread of points, the regional clusters, the diagonal — more than in any single country's exact coordinates.