The Northern Brittonic Toponymic Reconstruction Framework (NBTRF) is an interdisciplinary methodology for the analysis and structured recovery of poorly attested Brittonic linguistic material from northern Britain (specifically Cumbria, Strathclyde, and the Pennine uplands) through systematic evaluation of historical toponymy, comparative Brittonic linguistics, and documented historical phonology.
The framework integrates three established scholarly traditions — the Parry–Lord oral-formulaic model [1, 2], Vansina's criteria for oral historical reliability [3], and Foley's theory of traditional referentiality [4] — within a historical-linguistic methodology adapted to the specific evidentiary conditions of the northern Brittonic record: the near-total absence of continuous textual sources, the survival of linguistic material primarily as toponymic fossils, and the close structural relationship between Cumbric and Old Welsh.
The framework applies a three-stage transformation pipeline (B1 → B2 → B3) to each entry: a Lexical Fossil stage (B1) drawing on place-name evidence; a Phonological Realisation stage (B2) drawing on documented northern Brittonic sound changes; and an Orthographic Attestation stage (B3) modelling scribal conventions. The results are assigned graded confidence levels (A, B, C).
Applied to 307 entries across 24 lexical and grammatical domains, the framework produces 19 non-identity xcb forms (exclusively cardinal numbers 1–4 and 6–20, using the yan-tan-tethera system as primary evidence, Grade A or B) and 288 entries with no divergence from Old Welsh (282 identity forms, xcb = Old Welsh, plus 6 entries with no form in either column), reflecting the conservative evidentiary standard adopted throughout. The framework is designed to accommodate future contributions as new academic evidence becomes available.
The NBTRF is the terminal framework of the Brittonic Convergent Diachronic Revitalisation System (BCDRS) — the Brittonic-specific instantiation of the abstract Convergent Diachronic Revitalisation System (CDRS). Its primary input — Revitalised Old Welsh — is itself the output of the Old Welsh Revitalisation Framework (OWRF), which in turn receives its input from the Middle Welsh Revitalisation Framework (MWRF). The complete pipeline runs: Modern Welsh → MWRF → Revitalised Middle Welsh → OWRF → Revitalised Old Welsh → NBTRF → Revitalised Cumbric.
Cumbric (from the Brittonic Combrogi, meaning "fellow countrymen") was the Brythonic Celtic language spoken in northern England and southern Scotland from approximately the fifth to the twelfth century AD. Its geographic range encompassed the modern county of Cumbria (the historic counties of Cumberland and Westmorland together with Lancashire north of the Sands), parts of Northumberland, and the Scottish regions of Strathclyde, Galloway, and the Lothians [5].
Politically, the Cumbric-speaking zone was associated with a succession of Brittonic kingdoms of Yr Hen Ogledd (the Old North): Rheged, Gododdin, and ultimately Strathclyde, the last great Brittonic polity of the north, which persisted until it was absorbed into the Scottish kingdom in the eleventh century. The county of Cumberland preserves in its very name the memory of these Brittonic polities: Cymry (the Welsh word for "Welsh people / Britons") shares its root with Cumbria and Cumberland [6].
The scholarly consensus is that Cumbric was not a categorically distinct language but rather a northern dialect of Late Common Brittonic, sharing its grammatical architecture, mutation system, and pronoun paradigms with Old Welsh. The primary features distinguishing Cumbric from its southern Brittonic cousin were regional phonological characteristics and some lexical variation — features visible in the place-name record and in the yan-tan-tethera counting system but not recoverable from any extant continuous text, as virtually no Cumbric literary corpus survives [7].
Evidence for Cumbric exists in three forms:
The NBTRF is designed to work systematically within this evidentiary landscape — deriving the maximum information from these three sources while maintaining explicit, graded transparency about what is attested, what is probable, and what is speculative.
NBTRF draws on three established scholarly traditions, each adapted from its original domain to the specific problem of reconstructing unattested northern Brittonic linguistic material from place-name evidence.
| Framework | Original domain | Core principle adopted |
|---|---|---|
| Parry–Lord | Oral epic poetry (1930s–60s) | Stable, recurrent formulaic units in a shared tradition. Redefined as recurrent toponymic morphological units. |
| Vansina | Oral historical transmission (1961/1985) | Multi-source corroboration without requiring direct transmission chains. Applied to geographic plausibility and convergent toponymic attestation. |
| Foley | Traditional referentiality / immanent art (1991) | "The part summons the whole." Place-names as compressed semantic units encoding inherited landscape-perception systems. |
The foundational work of Milman Parry and Albert Lord [1, 2] established that oral epic poets compose not word-by-word but through the deployment of stable, recurrent formulaic phrases — metrical units that carry consistent meaning across multiple performances. Within NBTRF, this framework is not applied in its original poetic form but generalised as a theory of formulaic linguistic production in naming environments.
A formula within NBTRF is defined as:
A recurrent morphological or lexical unit that appears in structurally consistent naming environments across multiple toponymic and comparative Brittonic contexts, demonstrating recurrence across geographically distinct locations, semantic stability across attestations, and productive combinatorial behaviour in compound formations.
The formulaic elements identified in the northern Brittonic record — pen-, aber-, caer-, glas-, cul- — satisfy all three criteria. They appear in multiple geographically independent place-names; their semantic function (summit, river mouth, fort, blue-green, narrow) is consistent across all attestations; and they combine productively with other elements to form compound toponyms. The analogy to Parry–Lord formulas is direct: just as swift-footed Achilles is a stable unit in Homeric composition, pen- is a stable unit in Brittonic naming practice.
Jan Vansina's work on oral tradition [3] developed criteria for assessing the historical reliability of information preserved without continuous written documentation — criteria based on the principle that independent convergence across multiple unrelated sources provides probabilistic evidence of genuine continuity, even without direct transmission chains.
Because Cumbric lacks continuous textual or oral documentation, Vansina's criteria are operationalised at the level of toponymic evidence rather than oral narrative. Specifically:
Rather than evaluating direct chains of oral transmission, NBTRF evaluates structural persistence within stable inhabited landscapes. A Brittonic element preserved in ten independent Cumbrian place-names across a historically Brittonic-speaking region has survived precisely because the people who used it persisted in the landscape long enough to transmit the name — even as the language itself ceased to be spoken.
John Miles Foley's theory of traditional referentiality [4] proposes that traditional phrases in oral composition carry what he terms immanent meaning — meaning that exceeds the literal content of the phrase and summons the whole tradition through the part. This is not mere metaphor; it is a cognitive claim about how traditional units function in meaning-making systems built from accumulated cultural use.
NBTRF extends this principle to onomastic systems. A toponym is, in Foley's terms, a traditional unit that has accumulated meaning through persistent use within a cultural-linguistic system. Penrith does not merely label a Cumbrian market town; it encodes within its first element (pen, summit or headland) an inherited classificatory system for landscape perception that was productive in Brittonic naming practice over centuries. The semantic weight carried by that element — as formulated by Foley's framework — is not exhausted by its literal meaning: it indexes an entire cultural system of landscape relation.
Within NBTRF, this means that each toponymic morpheme is treated not merely as evidence of a specific Brittonic word, but as evidence of a Brittonic conceptual and lexical system — a compressed cultural memory that points beyond itself to the full naming tradition from which it derives.
The key theoretical move NBTRF makes is the shift from oral performance as the object of analysis to linguistic fossilisation in toponymic systems. This is the framework's most original contribution. All three adapted traditions — Parry–Lord, Vansina, Foley — were developed to address questions about living oral traditions or recently lapsed ones. Cumbric presents a fundamentally different object: a language that ceased to be spoken approximately a millennium ago, leaving traces only in the landscape and in one category of counting rhymes.
The domain adaptation proceeds as follows:
This shift enables the application of oral-traditional theory to a non-oral, non-continuous corpus. The result is a methodology that can draw principled inferences from sparse evidence without either over-claiming certainty or abandoning the attempt to recover what is recoverable.
All reconstructions generated under NBTRF are evaluated cumulatively across multiple independent criteria. No single criterion is considered sufficient for validation. A valid reconstruction requires convergence across the following seven domains:
| Criterion | Description |
|---|---|
| A1 | Recurrence of morphological elements across multiple toponyms |
| A2 | Semantic stability across attestations and regions |
| A3 | Structural consistency in compound formation and morphological positioning |
| A4 | Cross-Brittonic comparability — Welsh, Cornish, Breton parallels |
| A5 | Geographic clustering within historically Brittonic-speaking regions of northern Britain |
| A6 | Archaeological or historical settlement plausibility supporting linguistic continuity |
| A7 | Independent corroboration from non-linguistic datasets (geography, hydrology, archaeology) |
Reconstructions are therefore treated as probabilistic models of linguistic persistence, not deterministic recoveries of attested Cumbric forms. The framework produces graded outputs rather than binary accepted/rejected decisions, reflecting the inherently probabilistic nature of inference from this category of evidence.
In practice, the great majority of grammatical forms (pronouns, verb paradigms, mutation markers, conjunctions, prepositions) do not have specific place-name evidence associated with them. For these, the evidence integration model produces an identity result (xcb = Old Welsh) by default. This is the correct scholarly outcome: the absence of specific northern evidence does not license a departure from the Old Welsh baseline.
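The identity default can be pictured as a small decision function. The following is a minimal sketch, not the framework's actual procedure: the `Evidence` record, the `integrate` function, and the reading of "convergence" as "all seven criteria A1 to A7 satisfied" are assumptions made for illustration, and the form strings are placeholders rather than attested data.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Criterion labels as listed in the validation table (A1 to A7).
CRITERIA = ("A1", "A2", "A3", "A4", "A5", "A6", "A7")


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record for a single entry."""
    candidate: Optional[str] = None            # proposed non-identity form, if any
    satisfied: FrozenSet[str] = frozenset()    # criteria judged to be met


def integrate(old_welsh: str, evidence: Optional[Evidence] = None) -> str:
    """Return the xcb form: identity unless the evidence converges.

    Assumption: convergence is read here as all seven criteria being met.
    """
    if evidence is None or evidence.candidate is None:
        return old_welsh                       # no northern evidence: identity
    if all(c in evidence.satisfied for c in CRITERIA):
        return evidence.candidate              # convergent evidence: divergence allowed
    return old_welsh                           # partial evidence: identity remains the default


# Placeholder forms, not attested data.
print(integrate("ow_form"))                                            # identity
print(integrate("ow_form", Evidence("north_form", frozenset({"A1"}))))  # still identity
```

A single satisfied criterion never changes the output in this sketch, which mirrors the rule that no single criterion is sufficient for validation.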
The NBTRF transformation pipeline processes each entry in sequence through three stages. The output of each stage becomes the input to the next. The final output of B3 is the Revitalised Cumbric (xcb) cell value.
Source: Old Welsh cell value only.
B1 asks: does attested northern Brittonic place-name evidence establish a lexical form for this word that differs from Old Welsh? This stage is governed by two hard rules:
In the current dataset, B1 is identity for all 307 entries. No specific place-name evidence was identified that would require a different lexical root from Old Welsh for any entry. This is the expected outcome: the toponymic record documents the Cumbric naming system but does not provide a general lexicon of Cumbric content words.
Source: B1 output only.
B2 models plausible historical northern Brittonic phonology based on:
B2 may only vary from Old Welsh where structural lexical identity remains unchanged and variation is historically plausible. No lexical change and no grammatical change may be introduced at this stage.
In the current dataset, B2 produces non-identity results for exactly 19 entries: cardinal numbers 1–20, excluding 5 (pimp, which is identity). These non-identity results are grounded in the yan-tan-tethera primary evidence (see §8).
Source: B2 output only.
B3 models how the phonologically realised form may appear in written historical records, drawing on:
B3 may only vary from B2 as orthographic representation — never as phonological or structural change. Where B2 is identity with Old Welsh, B3 is also identity.
For the 19 numeral entries, B3 uses the exact Borrowdale orthographic forms as recorded in the yan-tan-tethera literature. For all other entries, B3 is identity with Old Welsh.
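The chaining of the three stages can be sketched as ordinary function composition. This is an illustrative sketch under stated assumptions: the function names are invented here, `PRN_EXAMPLE` is a hypothetical row identifier, the sketch does not model a separate phonological representation (B2 returns the Borrowdale spelling as a stand-in), and only three numeral rows are included.

```python
from typing import Callable, Dict, Tuple

# Excerpt of the Borrowdale forms for NUM_1 to NUM_3 (see the numeral table).
BORROWDALE_EXCERPT: Dict[str, str] = {
    "NUM_1": "yan",
    "NUM_2": "tyan",
    "NUM_3": "tethera",
}

Stage = Callable[[str, str], str]  # (row_id, input form) -> output form


def b1_lexical_fossil(row_id: str, form: str) -> str:
    # Identity for every entry in the current dataset.
    return form


def b2_phonological(row_id: str, form: str) -> str:
    # Non-identity only where yan-tan-tethera evidence exists.
    return BORROWDALE_EXCERPT.get(row_id, form)


def b3_orthographic(row_id: str, form: str) -> str:
    # Orthographic stage: here the B2 stand-in is already the recorded spelling.
    return form


STAGES: Tuple[Stage, ...] = (b1_lexical_fossil, b2_phonological, b3_orthographic)


def run_pipeline(row_id: str, old_welsh: str) -> str:
    """Feed the Old Welsh cell value through B1, B2 and B3 in order."""
    form = old_welsh
    for stage in STAGES:
        form = stage(row_id, form)
    return form


print(run_pipeline("NUM_1", "un"))                 # yan
print(run_pipeline("PRN_EXAMPLE", "placeholder"))  # placeholder (identity)
```

Each stage receives only the previous stage's output, which matches the "Source: ... output only" constraint stated for B2 and B3.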
The following categories are structurally frozen: for all entries in these categories, B1 = B2 = B3 = Old Welsh (identity). These categories cannot be the subject of non-identity contributions under any circumstances:
| Row prefix | Category | Basis for immutability |
|---|---|---|
| PRN_* | All pronouns | Cumbric pronoun system structurally identical to Old Welsh; no northern divergence attested |
| BE_*, HAVE_*, MOD_* | Verb paradigms | B1 Uncertainty Rule: no NBTRF verb evidence exists; identity is the only defensible position |
| GO_*, COME_*, DO_*, etc. | Action verb paradigms | Same basis as verb paradigms above |
| DAY_* | Days of the week | Calendrical borrowings from Latin; stable across all Brittonic languages |
| MON_* | Months | Largely Latin/Romance borrowings; stable |
| SEA_* | Seasons | Stable across Brittonic; no northern divergence attested |
| TIM_* | Telling the time | Compound formulae; clock-time is a post-Cumbric concept; no NBTRF evidence |
| TMP_* | Temporal words | Common words with no attested northern divergence |
| GRT_*, INT_*, POL_* | Greetings / introductions / politeness | Phrasebook material; no NBTRF evidence; already lowest-confidence category |
| ORD_* | Ordinal numbers | Explicit NBTRF immutable rule; no ordinal YTT evidence |
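A contribution check against the frozen categories could be sketched as a prefix test. This is a hedged illustration: `is_frozen` and `check_contribution` are hypothetical helper names, the row identifier `ORD_1` is invented for the example, and the prefix tuple is incomplete because the action-verb row in the table ends in "etc.".

```python
# Prefixes taken from the frozen-category table. The action-verb list is
# open-ended in the table ("etc."), so this tuple is knowingly incomplete.
FROZEN_PREFIXES = (
    "PRN_", "BE_", "HAVE_", "MOD_", "GO_", "COME_", "DO_",
    "DAY_", "MON_", "SEA_", "TIM_", "TMP_",
    "GRT_", "INT_", "POL_", "ORD_",
)


def is_frozen(row_id: str) -> bool:
    """True if the row belongs to a structurally frozen category."""
    return row_id.startswith(FROZEN_PREFIXES)


def check_contribution(row_id: str, old_welsh: str, proposed: str) -> str:
    """Reject any non-identity proposal for a frozen row."""
    if is_frozen(row_id) and proposed != old_welsh:
        raise ValueError(f"{row_id} is frozen: xcb must equal Old Welsh")
    return proposed


print(is_frozen("ORD_1"))  # True
print(is_frozen("NUM_1"))  # False
```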
The following categories may — subject to evidentiary demonstration — produce non-identity xcb values. Even within these categories, identity remains the correct default unless evidence to the contrary is produced:
| Row prefix | Category | Possible divergence class | Basis |
|---|---|---|---|
| NUM_1–NUM_20 | Cardinals 1–20 | CONTROLLED_PHONOLOGICAL | Yan-tan-tethera primary attestation (Grade A/B) |
| NUM_0, NUM_21+ | Zero and cardinals above 20 | Identity only | No YTT evidence for zero or above 20 |
| ADJ_COL_* | Colour adjectives | CONTROLLED_PHONOLOGICAL | Some northern phonological analogues for inherited Brittonic colour terms; place-name evidence for glas (Glasgow) and cul (Culgaith) confirms identity with OW; no non-identity established |
| ADJ_SIZE_* | Size adjectives | CONTROLLED_PHONOLOGICAL | Same basis as colours; northern toponymic forms stable |
| ADJ_FEEL_* | Feeling adjectives | CONTROLLED_PHONOLOGICAL | Case-by-case; no toponymic corpus for this domain |
| PREP_* | Prepositions | CONTROLLED_ORTHOGRAPHIC at most | Simple prepositions well attested; inflected forms less so |
| CONJ_* | Conjunctions | CONTROLLED_ORTHOGRAPHIC at most | Well documented; no northern divergence found |
All NBTRF outputs are assigned confidence grades reflecting the quality and quantity of evidentiary support:
| Grade | Definition | Typical basis |
|---|---|---|
| A | Strong convergence across Brittonic comparanda and toponymic recurrence | Multiple independent place-name attestations; cross-Brittonic confirmation; or direct documentation (e.g. YTT 1–5, 15, 20) |
| B | Probable reconstruction from partial comparative and geographic evidence | Single place-name attestation with cross-Brittonic support; or YTT 6–14, 16–19 |
| C | Speculative reconstruction with limited or indirect evidential basis | Theoretically possible from known sound changes, but lacking specific attestation; used sparingly |
Identity results (xcb = owl) carry the confidence grade of the Old Welsh source column, which is separately documented in the Polyglot™ dataset. For grammatical categories (pronouns, verb paradigms, prepositions, conjunctions), the identity result is itself a high-confidence claim — Grade A or B — because the structural identity of Cumbric and Old Welsh in these domains is well established in the scholarly literature.
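For the numeral rows, the grade assignment stated in the table reduces to a simple lookup. The sketch below encodes only what the table says about YTT forms (Grade A for 1–5, 15 and 20; Grade B for 6–14 and 16–19); the function name `ytt_grade` is an assumption introduced for illustration.

```python
# Numerals graded A according to the confidence table: 1-5, 15 and 20.
GRADE_A_NUMERALS = frozenset({1, 2, 3, 4, 5, 15, 20})


def ytt_grade(n: int) -> str:
    """Return the confidence grade for cardinal n in the range 1 to 20."""
    if not 1 <= n <= 20:
        raise ValueError("YTT evidence covers cardinals 1 to 20 only")
    return "A" if n in GRADE_A_NUMERALS else "B"


grades = {n: ytt_grade(n) for n in range(1, 21)}
print(sum(g == "A" for g in grades.values()))  # 7
print(sum(g == "B" for g in grades.values()))  # 13
```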
The yan-tan-tethera counting system is the most widely attested survival of Cumbric (northern Brittonic) speech in everyday use. It was used by shepherds and textile workers across Cumbria, Yorkshire, Northumberland, and parts of southern Scotland into the nineteenth and twentieth centuries for counting sheep and stitches, and has been systematically documented by dialect collectors [10].
The authoritative forms used in this dataset are taken from the Borrowdale column of the Cumberland and Westmorland table documented in the Wikipedia article on Yan tan tethera [10]. Borrowdale — a valley in the Lake District — is the geographic heart of the historical Cumbric-speaking region, and its attestations represent the most geographically appropriate surviving form of the system for a Cumbric column.
| # | Old Welsh (owl) | Borrowdale (xcb) | Notes | Grade |
|---|---|---|---|---|
| 1 | un | yan | OW un → northern rounding + final nasal drop | A |
| 2 | dou | tyan | Borrowdale uses tyan not tan — note regional variation | A |
| 3 | tri | tethera | teth- reflects older Brittonic *ter- | A |
| 4 | petguar | methera | meth-: documented consonant shift in YTT corpus | A |
| 5 | pimp | pimp | Identity — OW form already matches Borrowdale | A |
| 6 | chwech | sethera | Initial ch-/hw- lenition → seth- | B |
| 7 | saith | lethera | s- → l- shift (possibly dialectal) | B |
| 8 | oith | hovera | Aspirate + vowel shift | B |
| 9 | naw | dovera | d- prothesis | B |
| 10 | dec | dick | Dental stop retention, vowel reduction | B |
| 11 | un ar dec | yan-a-dick | Compound: yan + a + dick | B |
| 12 | doudec | tyan-a-dick | Compound: tyan + a + dick | B |
| 13 | tri ar dec | tethera-dick | Compound: tethera + dick | B |
| 14 | petguar ar dec | methera-dick | Compound: methera + dick | B |
| 15 | pimdec | bumfit | Widely attested primary term; cognate with Welsh pymtheg | A |
| 16 | un ar pimdec | yan-a-bumfit | Compound: yan + a + bumfit | B |
| 17 | dou ar pimdec | tyan-a-bumfit | Compound: tyan + a + bumfit | B |
| 18 | dounau | tethera bumfit | Space not hyphen in Borrowdale spelling | B |
| 19 | petguar ar pimdec | methera bumfit | Space not hyphen in Borrowdale spelling | B |
| 20 | uceint | giggot | Vigesimal unit; cf. jiggit elsewhere | A |
Caution: The yan-tan-tethera system shows considerable regional variation and was recorded long after Cumbric ceased as a spoken language. Some forms may represent convergence with Norse or Anglian phonology rather than pure Brittonic survival. The Borrowdale column is used exclusively here because it is the most geographically appropriate attestation, but it should not be treated as the only valid form of the system.
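The table above can be transcribed into two lookup maps to confirm its internal consistency. The sketch below copies the forms exactly as tabulated; the helper `compound_parts` is a hypothetical name, and the check that compounds are built on dick (10) or bumfit (15) reflects only the "Compound" notes in the table.

```python
OLD_WELSH = {
    1: "un", 2: "dou", 3: "tri", 4: "petguar", 5: "pimp",
    6: "chwech", 7: "saith", 8: "oith", 9: "naw", 10: "dec",
    11: "un ar dec", 12: "doudec", 13: "tri ar dec", 14: "petguar ar dec",
    15: "pimdec", 16: "un ar pimdec", 17: "dou ar pimdec", 18: "dounau",
    19: "petguar ar pimdec", 20: "uceint",
}

BORROWDALE = {
    1: "yan", 2: "tyan", 3: "tethera", 4: "methera", 5: "pimp",
    6: "sethera", 7: "lethera", 8: "hovera", 9: "dovera", 10: "dick",
    11: "yan-a-dick", 12: "tyan-a-dick", 13: "tethera-dick", 14: "methera-dick",
    15: "bumfit", 16: "yan-a-bumfit", 17: "tyan-a-bumfit", 18: "tethera bumfit",
    19: "methera bumfit", 20: "giggot",
}

# Rows where the Borrowdale form differs from the Old Welsh form.
non_identity = [n for n in range(1, 21) if OLD_WELSH[n] != BORROWDALE[n]]
print(len(non_identity))  # 19 (only 5, pimp, is identity)


def compound_parts(n: int) -> tuple:
    """Split a compound numeral (11-14, 16-19) into its unit word and base word."""
    base = 10 if n < 15 else 15
    return BORROWDALE[n - base], BORROWDALE[base]


for n in (11, 12, 13, 14, 16, 17, 18, 19):
    unit, base = compound_parts(n)
    # Each compound begins with the unit word and ends with the base word.
    assert BORROWDALE[n].startswith(unit) and BORROWDALE[n].endswith(base)
```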
The NBTRF pipeline has been applied to all 307 entries in the Revitalised Cumbric dataset, drawn from three source tables: Verbs & Sentence Elements (206 entries), Numbers & Dates (75 entries), and Greetings & Introductions (26 entries).
| Outcome | Count | Notes |
|---|---|---|
| xcb = owl (identity) | 282 | B1 = B2 = B3 = Old Welsh across all three pipeline stages |
| owl = —, xcb = — (no form) | 6 | Clock-time expressions (absent from Old Welsh and Cumbric), orange (colour absent from OW), one million |
| xcb ≠ owl (CONTROLLED_PHONOLOGICAL) | 19 | Cardinals 1–20 only (excluding NUM_5 which is identity) |
| Total | 307 | |
The 19 non-identity forms are exclusively the cardinal numbers 1–4 and 6–20, using the Borrowdale yan-tan-tethera forms. NUM_5 (pimp) is identity because the Old Welsh form already matches the Borrowdale attestation.
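The coverage figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in this section; the dictionary keys are labels chosen for illustration.

```python
# Entries per source table, as stated in the coverage summary.
SOURCE_TABLES = {
    "Verbs & Sentence Elements": 206,
    "Numbers & Dates": 75,
    "Greetings & Introductions": 26,
}

# Entries per pipeline outcome, as stated in the outcome table.
OUTCOMES = {
    "identity": 282,
    "no_form": 6,
    "controlled_phonological": 19,
}

total_by_source = sum(SOURCE_TABLES.values())
total_by_outcome = sum(OUTCOMES.values())

print(total_by_source)   # 307
print(total_by_outcome)  # 307
# Identity plus no-form rows give the 288 entries that show no divergence.
print(OUTCOMES["identity"] + OUTCOMES["no_form"])  # 288
```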
The full per-entry derivation chain — B1 result, B2 result, B3 result, divergence class, validation tags — is documented in the NBTRF row trace log maintained internally by Penrith Beacon Communications | PBC.
| Category | xcb confidence | Basis |
|---|---|---|
| Subject pronouns (9 rows) | High 85% | Identity with OW; pronouns immutable per NBTRF |
| Cardinal numbers 1–20 (20 rows) | High 80% | YTT attestation; Grade A: 1–5, 15, 20; Grade B: 6–14, 16–19 |
| Ordinal numbers (7 rows) | High 80% | Identity with OW; well-attested OW forms |
| Days of week (7 rows) | Medium-High 70% | Identity with OW; calendrical terms stable |
| Prepositions (13 rows) | Medium 65% | Identity with OW; OW forms partially reconstructed |
| Adjectives (27 rows) | Medium 60% | Identity with OW; inherits OW confidence |
| To Be — present (6 rows) | Medium-High 75% | Identity with OW; present forms well attested in OW |
| Verb paradigms (all other) | Low–Medium 25–60% | Identity with OW; OW verb forms partially reconstructed |
| Greetings / introductions / polite | Low 25–40% | Identity with OW; OW greeting forms entirely reconstructed |
The following limitations apply to all NBTRF outputs and should be understood by any user of the Revitalised Cumbric dataset:
The recovery of Revitalised Cumbric is not an unprecedented project. The most directly applicable precedent is the revitalisation of Cornish, which died out as a native language in the eighteenth century; Dolly Pentreath of Mousehole, generally identified as the last fluent native speaker, died in 1777 [11].
The Cornish revival was initiated by Henry Jenner's A Handbook of the Cornish Language (1904) [12] and developed through Morton Nance's Unified Cornish standard (1929). The methodology employed — comparison with related Brittonic languages, place-name analysis, manuscript glosses, and historical phonology — is precisely the methodology of the NBTRF. Cornish is now taught in schools, spoken by an active community of learners, and officially recognised under the European Charter for Regional or Minority Languages.
Revitalised Cumbric occupies a more defensible position than revived Cornish in one important respect. Cornish was a southwestern Brittonic language that had diverged significantly in structure from Welsh; Cumbric was a northern dialect of the same continuum as Welsh, sharing its grammar, verb system, and pronoun paradigms. The recovery of Cumbric does not require recovering a fully distinct language; it requires recovering the northern phonological colouring of something that still exists in living Welsh. The Cornish revival demonstrates that thin evidence and a committed community are sufficient; the NBTRF demonstrates that Cumbric's evidential position, though sparse, is not weaker than Cornish's was in 1904.
The NBTRF is explicitly designed as a living framework. As new peer-reviewed academic research into northern Brittonic emerges (new analyses of manuscript glosses, revised toponymic studies, advances in historical phonology), the framework is positioned to incorporate that evidence and refine the Revitalised Cumbric dataset accordingly.
External contributions from qualified specialists are welcomed under the formal contribution protocol documented at the Contributors page. The protocol requires dissertation-format submissions evaluated against the NBTRF specification, with accepted forms entered into the Polyglot™ master dataset and flowing through to this site.
The conservative standard is permanent. Each refinement to the dataset will carry explicit evidence documentation and confidence grading. The goal is not the maximum number of non-identity xcb forms, but the maximum accuracy and intellectual honesty of every form that appears, whether identity or not.
The NBTRF does not operate as a standalone methodology. It is the terminal framework of a formally structured, multi-stratal revitalisation architecture: the Brittonic Convergent Diachronic Revitalisation System (BCDRS), which is itself the Brittonic-language instantiation of the abstract Convergent Diachronic Revitalisation System (CDRS).
The CDRS defines a universal formal model class for diachronic revitalisation: a language-independent architecture in which diachronically ordered linguistic systems are represented as acyclic, deterministic transformation networks, each producing an operational revitalised language state. The BCDRS applies this model to the Brittonic linguistic continuum, specifying three ordered transformation frameworks:
| Framework | Input | Output |
|---|---|---|
| MWRF | Modern Welsh | Revitalised Middle Welsh |
| OWRF | Revitalised Middle Welsh | Revitalised Old Welsh |
| NBTRF (this document) | Revitalised Old Welsh | Revitalised Cumbric |
The NBTRF receives Revitalised Old Welsh as its primary input — the output of the OWRF — and applies northern Brittonic divergence modelling, toponymic continuity preservation, and phonological differentiation to produce Revitalised Cumbric. The full end-to-end pipeline is therefore:
Modern Welsh → MWRF → Revitalised Middle Welsh → OWRF → Revitalised Old Welsh → NBTRF → Revitalised Cumbric
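The ordering constraint on the three frameworks can be checked mechanically: each framework's input must be the previous framework's output. The sketch below encodes only the framework table above; `pipeline_path` is a hypothetical helper name, and the sketch says nothing about what each framework does internally.

```python
from typing import List, Tuple

# (framework, input layer, output layer), copied from the framework table.
FRAMEWORKS: List[Tuple[str, str, str]] = [
    ("MWRF", "Modern Welsh", "Revitalised Middle Welsh"),
    ("OWRF", "Revitalised Middle Welsh", "Revitalised Old Welsh"),
    ("NBTRF", "Revitalised Old Welsh", "Revitalised Cumbric"),
]


def pipeline_path(frameworks: List[Tuple[str, str, str]]) -> str:
    """Render the end-to-end chain, failing if any link does not connect."""
    path = [frameworks[0][1]]
    for name, source, target in frameworks:
        if source != path[-1]:
            raise ValueError(f"{name} expects {source!r} but receives {path[-1]!r}")
        path.extend([name, target])
    return " → ".join(path)


print(pipeline_path(FRAMEWORKS))
```

Because every layer appears exactly once and each link points forward, the chain is acyclic by construction.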
Each layer in this pipeline is an independently operational language system. A speaker competent in multiple layers may conduct full discourse in any single layer, or switch between layers within the same conversation — reflecting the structured diachronic continuity that the BCDRS makes explicit.
The CDRS framework also defines the conditions under which any layer may be revised. Because the system is formally governed and academically grounded, peer-reviewed evidence bearing on any part of the pipeline — any individual word form, any transformation rule, or the architectural assumptions of the CDRS or BCDRS themselves — may be submitted to Penrith Beacon Communications | PBC for evaluation. Changes are made only where peer-reviewed evidence supports them and the governing group deems them both reasonable and necessary. This ensures that every output of the NBTRF — every Revitalised Cumbric form in the Polyglot™ dataset — is academically credible at the time it is published, and remains responsive to legitimate scholarly advance.
For the full specification of the abstract model, see the CDRS page. For the Brittonic implementation architecture, see the BCDRS page. For the upstream frameworks that produce the NBTRF's Old Welsh input, see the MWRF and OWRF pages.