The Brittonic Convergent Diachronic Revitalisation System (BCDRS) is the domain-specific implementation of the abstract Convergent Diachronic Revitalisation System (CDRS) applied to the Brittonic linguistic continuum. It is the first and currently the only Model Class Instance of the CDRS in operation.
The BCDRS transforms Modern Welsh — a living, fully documented Celtic language — through two intermediate revitalised language stages to produce Revitalised Cumbric: an operational language system derived from the extinct northern Brittonic dialect of Cumbria, Strathclyde, and the Pennine uplands. The pipeline is realised through three deterministic transformation frameworks: the Middle Welsh Revitalisation Framework (MWRF), the Old Welsh Revitalisation Framework (OWRF), and the Northern Brittonic Toponymic Revitalisation Framework (NBTRF).
Each output layer of the BCDRS is independently operational: it constitutes a fully functional language system capable of supporting human communication. This property, inherited from the CDRS operational output requirement, distinguishes the BCDRS from purely reconstructive approaches and enables its distinctive multilingual switching capability — the ability of speakers competent in multiple layers to conduct discourse in any single layer or to switch between layers within a discourse.
The BCDRS is governed by a formal peer-review pathway administered by Penrith Beacon Communications (PBC). All proposed modifications — whether to the system architecture or to any of the three frameworks — require peer-reviewed academic documentation and evaluation by PBC before incorporation into the Polyglot™ dataset.
The Brittonic (or Brythonic) branch of the Celtic language family comprises those languages descended from Common Brittonic, the language spoken across most of Britain south of the Forth–Clyde line during the Iron Age and Roman period. Three Brittonic languages survive into the present: Welsh (spoken throughout Wales and in diaspora communities globally, with approximately 900,000 speakers), Cornish (revitalised since the early twentieth century, now with several hundred active speakers), and Breton (spoken in Brittany, France, with a significantly declining community of native speakers). Two further members of the family — Cumbric and Pictish — are extinct, leaving no surviving native speakers and no continuous textual record.
The diachronic relationship between these languages is established by historical linguistics. Modern Welsh descends from Middle Welsh (broadly c. 1150–1500 CE), which descends from Old Welsh (c. 800–1150 CE), which descends from Late Common Brittonic (c. 400–800 CE), which itself developed from the Common Brittonic of the Roman and pre-Roman periods. Cornish and Breton represent parallel southwest developments from the same Common Brittonic ancestor; Cumbric represents the northern branch, spoken in what are now Cumbria, Lancashire north of the Sands, Northumberland, and the southern Scottish lowlands from approximately the fifth to the eleventh or twelfth century CE [5, 7].
Cumbric is documented, to the extent that it can be called documented, through three categories of evidence: place-names preserving Brittonic morphological elements across the historical Cumbric-speaking zone (including pen- in Penrith, caer- in Carlisle, aber- in Aballava, glas- in Glasgow, and cul- in Culgaith); personal names in neighbouring sources such as Y Gododdin and the Annals of Ulster; and the yan-tan-tethera (YTT) counting tradition preserved by shepherds and textile workers across Cumbria, Yorkshire, and southern Scotland into the nineteenth and twentieth centuries [6].
The rationale for choosing the Brittonic continuum as the first CDRS instantiation is fourfold. The surviving Brittonic languages — Welsh in particular — are exceptionally well documented, providing a robust starting point for the pipeline. The diachronic relationships within the family are well established by two centuries of Celtic scholarship. The surviving languages provide living input material of known quality. And there is substantial scholarly precedent for Brittonic revitalisation in the Cornish revival, which demonstrates that operational revitalisation from historical evidence is achievable for the Brittonic family specifically [4].
The BCDRS instantiates the abstract Convergent Diachronic Revitalisation System (CDRS). The relationship between CDRS and BCDRS is one of abstract model to concrete implementation: the CDRS defines the formal structure — the directed acyclic graph architecture, the five system properties, the transformation function specification, the governance protocols — while the BCDRS provides the linguistic content that fills that structure.
The BCDRS inherits all five CDRS system properties without modification or weakening:
| Property | In the abstract CDRS | In the BCDRS |
|---|---|---|
| Acyclic | No directed cycles in the transformation graph | No Cumbric form may influence its Old Welsh input; no Old Welsh form may influence its Middle Welsh input |
| Diachronic | All transformations ordered with respect to historical time | Pipeline runs strictly forward: Modern Welsh → Middle Welsh → Old Welsh → Cumbric |
| Deterministic | F(x; P) = y under fixed parameterisation | Each of the three frameworks produces a unique output for each input under its fixed rules |
| Convergent | Iterative refinement produces progressively stabilised outputs | New evidence refines the dataset without disrupting the overall operational system |
| Operational | All output layers are fully functional language systems | Revitalised Middle Welsh, Old Welsh, and Cumbric are all independently learnable and speakable |
The BCDRS also adds domain-specific content that the abstract CDRS does not determine: the specific language states (Modern Welsh, Middle Welsh, Old Welsh, Cumbric); the specific transformation frameworks (MWRF, OWRF, NBTRF) with their particular linguistic rules and evidential bases; the column key system (cy, wlm, owl, xcb); and the 307 row IDs that constitute the shared dataset across all three frameworks.
Changes to the CDRS propagate to all its instantiations, including the BCDRS. Changes to the BCDRS do not affect the abstract CDRS unless they reveal a structural deficiency in the model itself. This asymmetry is a feature, not a limitation: it ensures that the abstract model remains stable while domain-specific implementations can be refined independently in response to new evidence.
The BCDRS defines four strata within its pipeline. Each stratum corresponds to one node in the CDRS directed acyclic graph and one column in the Polyglot™ dataset. The strata are defined by their historical period, their attestation status, and their governing framework:
| Stratum | Language state | Column key | Historical period | Attestation | Framework |
|---|---|---|---|---|---|
| Source | Modern Welsh | cy | Ongoing | Living language, fully documented | None — primary input |
| Layer 1 | Revitalised Middle Welsh | wlm | c. 1150–1500 CE | Extensively attested | MWRF |
| Layer 2 | Revitalised Old Welsh | owl | c. 800–1150 CE | Partially attested (glosses, computus) | OWRF |
| Layer 3 | Revitalised Cumbric | xcb | c. 500–1100 CE | Near-absent (toponymic + YTT only) | NBTRF |
Each stratum is a fully operational language system. Strata are not merely rungs on a ladder to be discarded as the pipeline descends; each is an independently significant linguistic system with its own scholarly and communicative value. The pipeline structure connects them, but their independence as operational systems is guaranteed by the CDRS operational output requirement.
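The stratum table can be held as a small ordered data structure. The following is a minimal sketch in Python, not the Polyglot™ implementation: the class name `Stratum`, its field names, and the helper `framework_for` are hypothetical, while the column keys, periods, and framework names are copied from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stratum:
    layer: int                  # 0 = source, 3 = final output (hypothetical numbering field)
    language_state: str
    column_key: str
    period: str
    framework: Optional[str]    # None for the primary input, which has no governing framework


# Ordered exactly as in the stratum table.
STRATA = (
    Stratum(0, "Modern Welsh", "cy", "Ongoing", None),
    Stratum(1, "Revitalised Middle Welsh", "wlm", "c. 1150–1500 CE", "MWRF"),
    Stratum(2, "Revitalised Old Welsh", "owl", "c. 800–1150 CE", "OWRF"),
    Stratum(3, "Revitalised Cumbric", "xcb", "c. 500–1100 CE", "NBTRF"),
)


def framework_for(column_key: str) -> Optional[str]:
    """Return the framework that produces the given column, or None for the source."""
    for stratum in STRATA:
        if stratum.column_key == column_key:
            return stratum.framework
    raise KeyError(f"unknown column key: {column_key}")
```

Keeping the strata in a tuple makes their pipeline order explicit, which is the only ordering the rest of the system relies on.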
The attestation gradient across the four strata is a defining feature of the BCDRS. As the pipeline moves from Modern Welsh to Cumbric, the evidential basis becomes progressively sparser. Modern Welsh is a living language with comprehensive documentation; Middle Welsh is extensively attested in manuscript form; Old Welsh is sparsely attested in glosses and computus fragments; Cumbric is effectively unattested as a continuous language. The BCDRS is designed to handle this gradient explicitly, through the confidence grade system and the graduated rules of each framework. The sparser the evidence, the more conservative the framework — and the more explicitly the outputs are graded for their epistemic status.
The formal dependency function of the BCDRS is expressed as:
Lₙ = Fₙ(Lₙ₋₁)
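This states that each language state in the BCDRS pipeline is a deterministic function of the immediately preceding state, mediated by the governing transformation framework Fₙ. In the BCDRS, L₀ is Modern Welsh (cy); L₁ = F₁(L₀) is Revitalised Middle Welsh (wlm), with F₁ the MWRF; L₂ = F₂(L₁) is Revitalised Old Welsh (owl), with F₂ the OWRF; and L₃ = F₃(L₂) is Revitalised Cumbric (xcb), with F₃ the NBTRF.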
The dependency function has two critical formal properties that govern the operation of the entire BCDRS pipeline.
Reproducibility. If the input Lₙ₋₁ is unchanged and the transformation framework Fₙ is unchanged — that is, if the parameterisation Pₙ (the complete set of rules governing the transformation) is fixed — then the output Lₙ is unchanged. Two researchers independently applying the same framework rules to the same source data must reach the same output. This property makes BCDRS outputs verifiable: any claimed output can be independently checked by applying the documented rules to the documented inputs.
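The dependency function and the reproducibility property can be illustrated as plain function composition. This is a sketch under stated assumptions: the three functions below are placeholders that only tag their input so the data flow is visible, and they contain no actual MWRF, OWRF, or NBTRF rules. The names `run_pipeline` and `PIPELINE` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Framework = Callable[[str], str]


# Placeholder transformations (assumption): real frameworks apply documented
# linguistic rules; these only wrap the input in a label.
def mwrf(cy_form: str) -> str:
    return f"MWRF({cy_form})"


def owrf(wlm_form: str) -> str:
    return f"OWRF({wlm_form})"


def nbtrf(owl_form: str) -> str:
    return f"NBTRF({owl_form})"


# Each output column paired with the framework that produces it, in pipeline order.
PIPELINE: List[Tuple[str, Framework]] = [("wlm", mwrf), ("owl", owrf), ("xcb", nbtrf)]


def run_pipeline(cy_form: str) -> Dict[str, str]:
    """Derive every layer from the source form: each layer is F_n of the previous one."""
    layers = {"cy": cy_form}
    current = cy_form
    for column, framework in PIPELINE:
        current = framework(current)
        layers[column] = current
    return layers


# Reproducibility: same input and same rules give the same output on every run.
first = run_pipeline("source-form")
second = run_pipeline("source-form")
print(first == second)  # True
```

Because each framework is a pure function of its input, two independent runs can be compared cell by cell, which is what makes a claimed output checkable.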
Cascade sensitivity. If Lₙ₋₁ changes because an entry in the preceding layer has been revised, then the corresponding entry in Lₙ must be re-evaluated. This is the formal basis for the BCDRS cascade protocol: a change in the wlm column for a given row ID may require a change in the owl column for that row, which may in turn require a change in the xcb column. Cascade obligations always run forward in pipeline order (from upstream to downstream strata, that is, from cy towards xcb); no change in xcb can create an obligation to revise owl or wlm.
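The forward-only nature of cascade obligations can be expressed as a lookup over the column order. This is a minimal sketch; the function name `downstream_columns` is hypothetical and the column keys are those used in the dataset.

```python
from typing import Tuple

# Column keys in pipeline order, from source to final output.
COLUMN_ORDER: Tuple[str, ...] = ("cy", "wlm", "owl", "xcb")


def downstream_columns(changed: str) -> Tuple[str, ...]:
    """Return the columns that must be re-evaluated after `changed` is revised.

    Only columns later in the pipeline are returned, so a change in xcb
    yields an empty tuple: nothing upstream is ever obliged to change.
    """
    index = COLUMN_ORDER.index(changed)  # raises ValueError for an unknown key
    return COLUMN_ORDER[index + 1:]


print(downstream_columns("wlm"))  # ('owl', 'xcb')
print(downstream_columns("xcb"))  # ()
```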
Together, reproducibility and cascade sensitivity ensure that the BCDRS pipeline is fully auditable. Every output can be traced to its source; every change triggers an explicit assessment of downstream implications; every revision is documented and versioned. The dataset is not a static artefact but a living system governed by explicit rules that make its evolution transparent.
The BCDRS contains three transformation frameworks, each governing one transition in the pipeline. The frameworks operate sequentially: the output of the MWRF is the input to the OWRF; the output of the OWRF is the input to the NBTRF. Each framework is independently documented and peer-reviewed, but all three operate within the structural constraints of the BCDRS. The specifications below summarise each framework; full documentation is maintained in the framework directories linked at the end of each subsection.
The Middle Welsh Revitalisation Framework (MWRF) governs the transformation from Modern
Welsh (L₀, column cy) to Revitalised Middle Welsh (L₁, column wlm).
It operates in the most evidentially rich environment of the three frameworks: Modern Welsh
is a living, fully documented language, and the Middle Welsh corpus — comprising the
Mabinogion, the poetry of the Gogynfeirdd, legal texts of the Cyfraith Hywel, and numerous
further prose and verse works — is extensive.
The MWRF takes as inputs the Modern Welsh source entry and the corresponding Middle Welsh attestations where they exist. Its function is to apply documented phonological and morphological transformations from Modern Welsh to the Middle Welsh stage, recovering the forms and paradigms that characterised that stage as documented in the attested corpus and described in the scholarly grammar literature, principally Evans (1964) [4]. Its output is a Revitalised Middle Welsh form for each of the 307 dataset rows, assigned a confidence grade reflecting the quality of attestation.
| Input | Function | Output |
|---|---|---|
| Modern Welsh (cy) + Middle Welsh corpus | Documented phonological and morphological transformation; recovery of MW paradigms | Revitalised Middle Welsh (wlm), confidence-graded |
Full MWRF documentation: mwrf.html
The Old Welsh Revitalisation Framework (OWRF) governs the transformation from Revitalised
Middle Welsh (L₁, column wlm) to Revitalised Old Welsh (L₂, column
owl). It operates under significantly greater evidential constraint than the
MWRF: the Old Welsh corpus is sparse, consisting primarily of the Martianus Capella glosses,
the Juvencus glosses, inscriptions from the period, and the computus fragments (the Computus
Fragment, Cambridge MS Add. 4543, being the most substantial attested Old Welsh text).
The OWRF takes as input the Revitalised Middle Welsh entry and the corresponding Old Welsh attestations where they exist. Its function is to apply documented phonological and morphological transformations from Middle Welsh to Old Welsh — reversing the developments that occurred between the two stages, recovering the earlier forms. Because the Old Welsh corpus is sparse, the OWRF relies substantially on comparative Brittonic reconstruction alongside direct attestation, and many entries carry Grade B or Grade C confidence.
| Input | Function | Output |
|---|---|---|
| Revitalised Middle Welsh (wlm) + Old Welsh corpus | Documented diachronic transformation; comparative Brittonic reconstruction where attested forms are absent | Revitalised Old Welsh (owl), confidence-graded |
Full OWRF documentation: owrf.html
The Northern Brittonic Toponymic Revitalisation Framework (NBTRF) governs the transformation
from Revitalised Old Welsh (L₂, column owl) to Revitalised Cumbric (L₃, column
xcb). It operates under near-total corpus absence: no continuous Cumbric text
survives. The NBTRF's evidence base consists of place-name fossils across the historical
Cumbric-speaking zone and the yan-tan-tethera counting tradition, the only directly attested
Cumbric phonological material in everyday use.
The NBTRF applies a three-stage pipeline — B1 (Lexical Fossil Model), B2 (Phonological Realisation), B3 (Orthographic Attestation) — to each entry. Applied to the 307 dataset rows, it currently produces 19 non-identity xcb forms (cardinal numbers 1–20, excluding NUM_5 which is identity), 288 identity forms (xcb = owl), and 6 null forms where the entry is absent from both Old Welsh and Cumbric.
| Input | Function | Output |
|---|---|---|
| Revitalised Old Welsh (owl) + northern Brittonic place-name record + YTT numeral tradition | Three-stage B1 → B2 → B3 transformation pipeline; conservative identity default; YTT attestation for numerals | Revitalised Cumbric (xcb), confidence-graded |
Full NBTRF documentation: nbtrf.html
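The three output categories described for the NBTRF (non-identity, identity, and null) can be computed by comparing the owl and xcb cells of each row. The sketch below uses invented placeholder row IDs and forms, not dataset values; the function name `divergence_class` is hypothetical, and treating null rows as a category separate from identity rows is an assumption about how the counts are kept.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def divergence_class(owl: Optional[str], xcb: Optional[str]) -> str:
    """Classify one row by comparing its owl and xcb forms."""
    if owl is None and xcb is None:
        return "null"          # entry absent from both layers
    if owl == xcb:
        return "identity"      # conservative identity default: xcb = owl
    return "non-identity"      # xcb diverges from owl


# Placeholder rows (assumption): (owl form, xcb form) keyed by an invented row ID.
rows: Dict[str, Tuple[Optional[str], Optional[str]]] = {
    "ROW_A": ("form-a", "form-a"),
    "ROW_B": ("form-b", "form-b2"),
    "ROW_C": (None, None),
}

counts = Counter(divergence_class(owl, xcb) for owl, xcb in rows.values())
print(counts["identity"], counts["non-identity"], counts["null"])  # 1 1 1
```

Run over the full dataset, a tally of this kind would give a direct check on the published category counts.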
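The BCDRS pipeline architecture is represented below. Each node is a language state; each arrow is a deterministic transformation framework. The column keys in parentheses are the data identifiers used in the Polyglot™ dataset.
Modern Welsh (cy) → MWRF → Revitalised Middle Welsh (wlm) → OWRF → Revitalised Old Welsh (owl) → NBTRF → Revitalised Cumbric (xcb)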
Each of the four pipeline nodes (cy, wlm, owl, xcb) corresponds to one column of the Polyglot™ dataset. All four columns share the same 307 row IDs, making it possible to trace any individual entry (for example, the forms for BE_PRES_1SG, "I am") from its Modern Welsh source through both intermediate stages to the final Revitalised Cumbric output.
The architecture's key property is the one-way direction of all transformations. Information flows strictly from source to output — from cy to wlm, from wlm to owl, from owl to xcb. No reverse flow is permitted: a discovery about xcb cannot modify owl directly; it can only do so indirectly by prompting a reconsideration of the NBTRF rules, which, if revised through the peer-review process, would then be reapplied to derive a new xcb from the existing owl. The forward-only architecture is the BCDRS's implementation of the CDRS acyclicity property, and it is the structural guarantee against circular reasoning in the pipeline.
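The one-way constraint can be stated as a check on any proposed dependency between columns. The sketch below assumes, following the dependency function Lₙ = Fₙ(Lₙ₋₁), that each stratum may depend only on the stratum immediately before it; the name `is_permitted_dependency` is hypothetical.

```python
# Column keys in pipeline order; the rank of a column is its position in this order.
COLUMN_ORDER = ("cy", "wlm", "owl", "xcb")
RANK = {key: position for position, key in enumerate(COLUMN_ORDER)}


def is_permitted_dependency(source: str, target: str) -> bool:
    """True only when `target` is the stratum immediately after `source`.

    Reverse edges (for example xcb -> owl) and skipped edges (cy -> owl)
    are both rejected, so the set of permitted edges can never form a cycle.
    """
    return RANK[target] - RANK[source] == 1


print(is_permitted_dependency("owl", "xcb"))  # True
print(is_permitted_dependency("xcb", "owl"))  # False
```

Since every permitted edge strictly increases the rank, acyclicity follows directly from the check rather than needing a separate graph traversal.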
Each of the three BCDRS frameworks is rule-governed, reproducible, and invariant under identical inputs. This is the BCDRS's implementation of the CDRS determinism property, and it has significant consequences at the system level for how the pipeline is governed and how changes propagate through it.
At the system level, determinism means that a change to any single cell in any column must
be traceable through the pipeline. If the wlm value for a given row ID changes
— because the MWRF has been revised in response to new evidence about Middle Welsh grammar
— then the OWRF must be applied to the new wlm value to determine whether the
owl value also changes. If owl changes, the NBTRF must be applied
to the new owl value to determine whether xcb changes. This cascade
is not optional; it is a logical consequence of determinism. If the output is determined by
the input and the rules, then a change in the input entails a re-evaluation of the output.
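The row-level cascade described above can be sketched as a function that re-applies each downstream framework in turn and stops as soon as a re-evaluated value turns out to be unchanged. The frameworks here are toy placeholders with no linguistic content, and the names `cascade`, `toy_owrf`, and `toy_nbtrf` are hypothetical.

```python
from typing import Callable, Dict, List

Framework = Callable[[str], str]


def toy_owrf(wlm_form: str) -> str:
    return wlm_form[:3]   # placeholder rule (assumption): keep the first three characters


def toy_nbtrf(owl_form: str) -> str:
    return owl_form       # placeholder standing in for the identity default


def cascade(row: Dict[str, str], new_wlm: str,
            owrf: Framework, nbtrf: Framework) -> List[str]:
    """Apply a wlm revision to `row` in place and return the columns that changed."""
    changed: List[str] = []
    if new_wlm == row["wlm"]:
        return changed
    row["wlm"] = new_wlm
    changed.append("wlm")
    new_owl = owrf(new_wlm)            # re-evaluation is mandatory once wlm changes
    if new_owl != row["owl"]:
        row["owl"] = new_owl
        changed.append("owl")
        new_xcb = nbtrf(new_owl)       # only needed because owl actually changed
        if new_xcb != row["xcb"]:
            row["xcb"] = new_xcb
            changed.append("xcb")
    return changed


row = {"cy": "src", "wlm": "abcd", "owl": "abc", "xcb": "abc"}
print(cascade(row, "abce", toy_owrf, toy_nbtrf))  # ['wlm']
print(cascade(row, "xyzw", toy_owrf, toy_nbtrf))  # ['wlm', 'owl', 'xcb']
```

The first call shows a revision that is re-evaluated downstream but produces no downstream change; the second shows a revision that propagates through every layer.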
Determinism also defines the scope of any proposed change. A change to a framework rule is not a change to a single cell; it is a change to a function that maps inputs to outputs across all entries governed by that rule. The peer-review process for a rule change must therefore assess its implications not just for the single entry that motivated the change but for all entries in the framework's scope. This is a demanding requirement, but it is the requirement that gives BCDRS outputs their claim to scholarly rigour.
Non-determinism in the pipeline — ambiguous rules that could produce different outputs for the same input — is treated as a system error, not an acceptable level of uncertainty. Uncertainty about historical forms is handled by the confidence grade system (A, B, C), not by branching in the transformation function. The transformation function always produces a single output; the confidence grade records how certain that output is.
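The separation between a single output and its confidence grade can be sketched as follows. This is an illustration under assumptions, not the frameworks' actual rule engine: the types `GradedForm` and `AmbiguousRuleError`, the function `apply_rules`, and both toy rules are hypothetical. A rule returns `None` when it does not apply; two applicable rules that disagree are reported as an error rather than resolved by branching.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class GradedForm:
    form: str
    grade: str  # "A", "B", or "C"


Rule = Callable[[str], Optional[GradedForm]]


class AmbiguousRuleError(Exception):
    """Raised when the rules yield more than one distinct outcome for one input."""


def apply_rules(source: str, rules: List[Rule]) -> GradedForm:
    """Return the single graded output for `source`, or fail loudly."""
    outcomes = {out for rule in rules if (out := rule(source)) is not None}
    if not outcomes:
        raise LookupError(f"no rule applies to {source!r}")
    if len(outcomes) > 1:
        raise AmbiguousRuleError(f"{len(outcomes)} distinct outcomes for {source!r}")
    return outcomes.pop()


# Toy rules with no linguistic content (assumption).
def rule_a(source: str) -> Optional[GradedForm]:
    return GradedForm(source + "-x", "B") if source.startswith("a") else None


def rule_b(source: str) -> Optional[GradedForm]:
    return GradedForm(source + "-y", "C") if source.endswith("z") else None


print(apply_rules("abc", [rule_a, rule_b]))  # GradedForm(form='abc-x', grade='B')
```

With these toy rules the input "abz" satisfies both rules with different results, so `apply_rules` raises `AmbiguousRuleError` instead of returning either candidate.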
Each stratum of the BCDRS is a fully operational language system in its own right. A learner who acquires Revitalised Middle Welsh without proceeding to Revitalised Old Welsh or Revitalised Cumbric possesses a complete, usable language system. A learner who acquires Revitalised Cumbric without prior competence in the intermediate layers likewise possesses a complete operational system — though the structural relatedness to Welsh and the intermediate strata provides a natural acquisition pathway.
Independent language completeness is guaranteed by the CDRS operational output requirement, which the BCDRS inherits. It is operationalised through the 307-entry dataset, which covers all basic verb paradigms, all basic pronouns, the core adjective and function-word vocabulary, numbers, calendar terms, and a full set of communicative formulae. This is not a full literary register, but it is a sufficient communicative basis — enough to conduct meaningful everyday interaction in each layer. Each layer is an entry point, not a stepping stone.
The BCDRS is designed to support a distinctive form of structured diachronic multilingualism. A speaker competent in two or more BCDRS layers possesses the ability to operate in structurally related but historically distinct language systems, and to switch between them within a discourse. This is not mere translation; it is a genuine communicative capability analogous to the code-switching behaviour documented in multilingual human communities.
The structural relatedness of BCDRS layers facilitates this switching in ways that unrelated language pairs do not permit. A speaker competent in Revitalised Old Welsh will find Revitalised Cumbric largely transparent: the systems share their grammar, mutation system, and pronoun paradigms, differing primarily in the 19 YTT numeral forms and in potential phonological colouring. A speaker of Modern Welsh moving through the BCDRS pipeline will encounter a controlled and documentable set of changes at each layer transition — not arbitrary lexical and grammatical differences, but the specific, historically motivated developments documented by Brittonic scholarship.
This property gives the BCDRS a distinctive character among revitalisation projects. Most revitalisation efforts target a single language; the BCDRS targets a family of related operational systems, each recoverable from the others by known diachronic rules. The result is a unique resource for anyone engaged in the scholarly or practical exploration of the Brittonic linguistic tradition.
The BCDRS supports three communicative configurations, each reflecting a different mode of engagement with its multi-layer structure:
Single-layer discourse. A group of participants fluent in the same BCDRS layer conducts all discourse in that layer. This is the simplest configuration and is fully supported by each layer's operational completeness. A group wishing to conduct discourse entirely in Revitalised Cumbric, without reference to the other layers, can do so using the 307-entry dataset as their linguistic foundation.
Transition discourse. A group begins discourse in one BCDRS layer and systematically transitions to another. This configuration is particularly appropriate for pedagogical settings, where learners progress through layers over time and the community's discourse register shifts accordingly. A learner who begins with Modern Welsh and proceeds through the pipeline is tracing the history of the language in reverse, from its living present to its reconstructed past.
Alternating discourse. Participants fluent in multiple layers alternate between them within a single discourse event, deploying different layers for different purposes — perhaps using Modern Welsh for contemporary references, Revitalised Middle Welsh for formal registers, and Revitalised Cumbric for ceremonial or identity-marking moments. This mirrors documented multilingual practice in Welsh-speaking communities and in historically multilingual contexts more broadly. It represents the most sophisticated engagement with the BCDRS and the fullest exploitation of its multi-layer design.
The lexicographic structure of a BCDRS entry encompasses the full set of information maintained for each row in the Polyglot™ dataset. For each of the 307 row IDs, the following information is maintained:
| Field | Content |
|---|---|
| Row ID | The stable string identifier shared across all four columns (e.g., BE_PRES_1SG) |
| English gloss | The English meaning of the entry (e.g., "I am") |
| cy (Modern Welsh) | The living Modern Welsh form, sourced from GPC or standard dictionaries |
| wlm (Revitalised Middle Welsh) | The MWRF-derived Middle Welsh form, with confidence grade |
| owl (Revitalised Old Welsh) | The OWRF-derived Old Welsh form, with confidence grade |
| xcb (Revitalised Cumbric) | The NBTRF-derived Cumbric form, with divergence class and confidence grade |
| Attestation status | Whether the form is directly attested, reconstructed, or identity-mapped |
| Transformation path | Which pipeline stages produced the form and by what mechanism |
| Confidence level | Grade A (strongly evidenced), Grade B (probable), or Grade C (speculative) |
| Cross-Brittonic cognates | Corresponding forms in Welsh, Cornish, and Breton where relevant |
This structure makes each BCDRS entry not merely a form but a scholarly record: a documented derivation with explicit provenance, confidence grading, and cross-linguistic context. The full dataset is maintained in the Polyglot™ system and is accessible at polyglot.kingarthursroundtable.com.
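The fields listed in the table can be pictured as a typed record. The sketch below is one possible shape, not the Polyglot™ schema: the class and attribute names are hypothetical, and every form value in the example is a placeholder rather than a real dataset entry. Only the row ID, the gloss, and the three grade letters come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

VALID_GRADES = {"A", "B", "C"}  # strongly evidenced, probable, speculative


@dataclass
class LayerForm:
    form: Optional[str]   # None for a null entry
    confidence: str       # one of VALID_GRADES


@dataclass
class Entry:
    row_id: str
    english_gloss: str
    cy: str
    wlm: LayerForm
    owl: LayerForm
    xcb: LayerForm
    xcb_divergence_class: str
    attestation_status: str      # directly attested, reconstructed, or identity-mapped
    transformation_path: str
    cognates: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject any grade outside the documented A/B/C scale.
        for layer in (self.wlm, self.owl, self.xcb):
            if layer.confidence not in VALID_GRADES:
                raise ValueError(f"invalid confidence grade: {layer.confidence!r}")


example = Entry(
    row_id="BE_PRES_1SG",
    english_gloss="I am",
    cy="<cy form>",                          # placeholder, not the dataset value
    wlm=LayerForm("<wlm form>", "A"),        # placeholder form and grade
    owl=LayerForm("<owl form>", "B"),        # placeholder form and grade
    xcb=LayerForm("<xcb form>", "B"),        # placeholder form and grade
    xcb_divergence_class="identity",         # placeholder class
    attestation_status="identity-mapped",    # placeholder status
    transformation_path="MWRF > OWRF > NBTRF",
)
```

Validating the grade at construction time means a malformed record is rejected when it is created rather than discovered later during review.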
The complete list of all 307 row IDs, organised by grammatical category, is documented in the companion data keys file and replicated at the BCDRS level, the NBTRF level, and each framework level. The shared row ID structure is the fundamental mechanism by which the pipeline maintains coherence: every entry in every column can be unambiguously matched to its counterparts in all other columns.
The BCDRS is maintained by Penrith Beacon Communications (PBC). Governance operates at two levels, each with distinct scope and procedures:
BCDRS-level governance addresses changes to the system architecture — the pipeline structure, the stratum definitions, the dependency function — and changes with cross-framework implications. A discovery that affects the interface between two frameworks (for example, a finding that requires the OWRF output to be restructured in a way that affects all NBTRF inputs) is a BCDRS-level matter, not a framework-level matter. BCDRS-level proposals are submitted via www.penrithbeacon.com and evaluated against both the BCDRS specification and the abstract CDRS constraints.
Framework-level governance addresses changes within a single framework's transformation rules, new evidence incorporated into the framework's parameterisation, revised confidence grades, and corrections to individual dataset entries. Framework-level proposals are also submitted via PBC, evaluated against the relevant framework specification, and assessed for cascade implications in downstream frameworks.
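The two-level split can be summarised as a routing rule for incoming proposals. The sketch below is a simplification and not PBC's actual procedure: the function name `governance_level` and its parameters are hypothetical, and "affected" is taken to mean frameworks whose own rules or interfaces would have to change, not frameworks that merely need downstream re-evaluation.

```python
from typing import Iterable

FRAMEWORKS = {"MWRF", "OWRF", "NBTRF"}


def governance_level(affected_frameworks: Iterable[str],
                     touches_architecture: bool = False) -> str:
    """Route a proposal to BCDRS-level or framework-level governance."""
    affected = set(affected_frameworks)
    unknown = affected - FRAMEWORKS
    if unknown:
        raise ValueError(f"unknown framework(s): {sorted(unknown)}")
    if not affected and not touches_architecture:
        raise ValueError("a proposal must affect a framework or the architecture")
    # Architecture changes and cross-framework changes go to the system level.
    if touches_architecture or len(affected) > 1:
        return "BCDRS-level"
    return "framework-level"


print(governance_level({"OWRF"}))           # framework-level
print(governance_level({"OWRF", "NBTRF"}))  # BCDRS-level
```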
The cascade protocol is the formal mechanism by which changes propagate through the pipeline. Propagation is strictly forward: a change at any stage must trigger a re-evaluation of all downstream stages. A change in the wlm column for a given row must trigger OWRF re-evaluation to determine whether owl changes, and if so, NBTRF re-evaluation to determine whether xcb changes. No change in a downstream stage can create an obligation to revise an upstream stage, because the acyclicity constraint prohibits upstream cascades.
All changes — BCDRS-level or framework-level — require peer-reviewed academic documentation. The peer-review requirement is the foundation of the BCDRS's claim to scholarly credibility. Appropriate reviewing bodies include Celtic Studies departments at Aberystwyth, Bangor, Cardiff, Edinburgh, and Glasgow, as well as any specialist in Brittonic historical linguistics, Welsh or Cornish philology, or allied fields.
The following constraints apply to all BCDRS outputs and should be understood by any user of the Revitalised Cumbric dataset or the intermediate pipeline columns:
The BCDRS is designed as a living framework, and several directions for future development are envisaged:
Deeper input layers. The current pipeline begins at Modern Welsh. A future extension could add Proto-Celtic or Common Brittonic as a deeper input layer (L₋₁), extending the pipeline backwards in time. This would require a new framework governing the Proto-Celtic → Common Brittonic transformation and would correspondingly deepen the scholarly basis of all subsequent layers.
Extension to other Brittonic dialects. The NBTRF currently targets the Cumbric dialect of northern Brittonic. An extension could add a Pictish framework, applying the same formal structure to the poorly documented linguistic tradition of northeastern Scotland. The evidential challenges for Pictish are even more severe than for Cumbric, but the BCDRS architecture is explicitly designed to accommodate varying evidential densities.
Computational implementation. The deterministic, rule-governed structure of the BCDRS makes it well suited to computational implementation: a software system that applies framework rules automatically, flags cascade obligations when inputs change, and maintains version history across the pipeline. PBC is exploring this as a longer-term development.
Formal verification. The consistency of the BCDRS pipeline — the guarantee that each output is a valid function of its input under the stated rules — could in principle be formally verified using tools from formal methods and computer science. Formal verification would provide the strongest possible guarantee of pipeline consistency and would support the scholarly credibility of the system.
Invitation to scholarly engagement. As with the CDRS, the BCDRS actively invites scholarly engagement. Contributions from qualified specialists — new evidence about Middle Welsh grammar, revised analyses of Old Welsh manuscript forms, improved interpretations of northern Brittonic place-name elements — are welcomed through the formal contribution protocol at www.penrithbeacon.com and summarised at contributors.html.