feat(Computability): add Unlimited Register Machines #299

jessealama · 2026-01-28T14:20:52Z

Summary

This PR introduces Unlimited Register Machines (URMs), a model of computation with instructions, programs, and registers. URMs are a standard topic in computability theory, introduced by Shepherdson and Sturgis and popularized by Cutland's textbook.

What's included

Core definitions (Defs.lean):

Instr: The four instruction types (Z, S, T, J)
Program: Lists of instructions, with shift_jumps, shift_registers, max_register
State: Register contents as ℕ → ℕ
Config: Machine configuration (program counter + state)

Basic lemmas (Basic.lean):

State.write_read_self, State.write_read_of_ne: State read/write lemmas
Config.is_halted_iff, Config.ext: Configuration lemmas
isJump, IsJump: Jump instruction predicates
jumps_bounded_by, JumpsBoundedBy: Bounded jump targets
cap_jump: Cap jump targets to a given length

Execution semantics (Execution.lean):

Step: Single-step execution relation
Steps: Multi-step execution (reflexive-transitive closure)
Halts, Diverges, HaltsWithResult: Halting predicates
eval: Evaluation as a partial function
ProgramEquiv: Program equivalence with Setoid instance (≈ notation)
Standard notation: p ↓ inputs, p ↑ inputs, p ↓ inputs ≫ result
Key theorems: Step.deterministic, Steps.eq_of_halts

Computability (Computable.lean):

Computes: A program computes a given partial function
Computable: A partial function is URM-computable (∃ program that computes it)
Basic computable functions: successor, projection, identity

Straight-line programs (StraightLine.lean):

IsStraightLine: Programs without jumps
straight_line_halts: Straight-line programs always halt

Standard form (StandardForm.lean):

IsStandardForm: Programs with bounded jump targets
to_standard_form: Normalization function
to_standard_form_equiv: Behavioral equivalence (p.to_standard_form ≈ p)
to_standard_form_idempotent: Idempotence of normalization

Design decisions

0-indexed registers: Aligns with Lean/Mathlib conventions (differs from Cutland's 1-indexed presentation)
Functional state representation: State := ℕ → ℕ works well with grind and simp
Relational semantics: Step as an inductive relation rather than a function, following the style of Mathlib's Turing machine formalization

Feedback welcome

I'm happy to adjust naming conventions, reorganize the module structure, or refine the API based on reviewer preferences. This is my first contribution to CSLib, so guidance on project conventions is appreciated.

Introduce Unlimited Register Machines (URMs), a model of computation with instructions, programs, and registers. URMs are a standard topic in computability theory, introduced by Shepherdson and Sturgis (1963) and popularized by Cutland's textbook. Core definitions: - Instr: The four instruction types (Z, S, T, J) - Program: Lists of instructions - State: Register contents as ℕ → ℕ - Config: Machine configuration (program counter + state) Execution semantics: - Step/Steps: Single and multi-step execution relations - Halts, Diverges, HaltsWithResult: Halting predicates - eval: Evaluation as a partial function - ProgramEquiv: Program equivalence with Setoid instance (≈ notation) - Standard notation: p ↓ inputs, p ↑ inputs, p ↓ inputs ≫ result Computability: - Computes: A program computes a given partial function - Computable: Existence of a program that computes a function - Basic computable functions: successor, projection, identity Additional modules: - StraightLine: Jump-free programs that always halt - StandardForm: Programs with bounded jump targets, with normalization

Cslib/Computability/Urm/Basic.lean

Cslib/Computability/Urm/Execution.lean

BoltonBailey · 2026-01-28T18:29:16Z

Cslib/Computability/Urm/State.lean

+
+Uses the functional representation `ℕ → ℕ` for efficiency with rewrites,
+following the advice from the `grind` tactic documentation. -/
+abbrev State := ℕ → ℕ


Representing states as functions from Nat to Nat is convenient here, but I worry that it would lead to problems down the road. The other choice I have in mind is to let the registers of a UTM be indexed by an arbitrary (or perhaps [Fintype]) type, and then have State be defined State (m : UTM) := u.RegisterIndex -> Nat as machine-dependent type. You would have to include this extra variable. But the benefit would be that if you want to define a composition of the functionalities of different UTMs (like I do for TMs in #269), you can do so by taking a direct product of the individual register indices, which would hopefully make functions on that register type easier to define. It would also have the benefit, if the register type were finite, of having the register state be printable data.

Perhaps this change is not necessary since the textbook defines it this way, but it's something to think about.

If this takes too long to figure out, I think it's fine to proceed with the current version. In general, we might wanna have a general 'RegisterMachine' concept and then apply it here, but it might be good to accumulate some experience first.

I love the idea of a general register machine concept! What I might suggest is implementing another register machine model, say RASP machines (but that's just one possibility!).

One challenge with the use of an arbitrary index type for the register indices, or [Fintype], would be to ensure the coherence of programs. Using natural numbers, one has the luxury of knowing that any program is coherent in the sense that any register index appearing in it refers to a register that actually exists. Programs and machines (or states) are sort of independent. But perhaps I'm overthinking this; making the definition of machine states and programs share an index type could work (and might have some nice theoretical benefits such as enabling a notion of product of machines).

Cslib/Computability/Urm/StraightLine.lean

chenson2018

This looks pretty good to me, and thank you for documenting things well. Here are some overarching thoughts:

I think the duplication of API between Bool and Prop that happens in several places is unnecessary
I made a lot of comments about how you can use grind. These are not all mandatory (I have a personal inclination to use it a lot) but I think worth it overall. Definitions/theorems I passed directly to grind often should have annotations for these.
I wasn't completely comfortable with the use of choice in some type signatures, but I need to think more carefully about this

Cslib/Computability/Urm/Instr.lean

Cslib/Computability/Urm/Program.lean

Cslib/Computability/Urm/StandardForm.lean

Cslib/Computability/URM/StandardForm.lean

ctchou · 2026-01-29T05:31:56Z

Cslib/Computability/Urm/Instr.lean

+- `J m n q`: If registers m and n have equal contents, jump to instruction q;
+             otherwise proceed to the next instruction


If the jump instruction's semantics is changed to a relative jump (namely, the jump target is specified via an offset relative to the jump instruction's own position, rather than an absolute position), would shift_jumps and all related stuff still be needed? Of course this would require either q be an integer or the jump instruction be split into a jump_forward and a jump_backward instructions.

Interesting idea! I think relative offsets would be fine (and shouldn't affect the class of computable functions). This change would indeed obviate the need for shift_jumps and similar bureaucratic details, but some details would just get shifted to other places. On balance, though, the net gain of a relative approach might be worth it. I'm not sure. I'd be happy to explore this idea.

I thought about this some more, and I suggest exploring this in a follow-on PR. A syntactic conversion between programs is straightforward and an equivalence proof should also be pretty easy. In my opinion, this kind of exercise belongs in CSLib. What do you think?

It's up to you, for you will have to do the proofs. PC-relative jump is in fact an old idea. Every modern instruction set architecture has it, precisely for the purpose of supporting position-independent code.

A related idea is to add a base register and/or literal to each instruction, so that each address is relative to that base, rather than an absolute address. Here you can probably get away with using only Nat for address.

There's some spectrum of tradeoff here from staying minimal and abstract to mirroring a real instruction set. I'm fine if this stays simple at first, especially if that will make your future plans for connecting to SKI simpler.

Come to think of it: you can have a special register (say, registter 0) which contains a "base PC". The real target of a jump instruction is the target + the value of the base PC. This will also give you position-independent code at the expense of changing the base PC value when switching between different blocks of code.

I agree with Chris. Adding more "realistic" instruction features is easy. The hard part is to show that the fancier instructions can be implemented by a more primitive formalism like SKI.

My first instinct here is to investigate the relative jump idea. That ought to simplify the proofs of closure of URM-computable functions under composition, primitive recursion, and minimization. And it should help with the SKI equivalence proof, too.

- Define IsJump, JumpsBoundedBy, IsStraightLine directly as Prop - Remove redundant Bool versions (isJump, jumps_bounded_by, etc.) - Rename theorems: not_IsJump → nonJump

Co-authored-by: Chris Henson <46805207+chenson2018@users.noreply.github.com>

Swap argument order so Program comes before offset, enabling nicer dot notation like `p.shift_jumps offset`. Also fix redundant `rename_i` in StandardForm.lean.

Add `evalConfig : Program → List ℕ → Part Config` as the primary definition, with `eval` as a wrapper via `Part.map`. This removes `Classical.choose` from the type signatures of `toHaltsWithResult` and `evalConfig_to_standard_form_state`.

Introduce `straightLine_finalConfig` to wrap the choice, mirroring the pattern used for `evalConfig`. Updates `straightLine_finalConfig_spec` to use the wrapper instead of exposing `Classical.choose` in its type.

jessealama · 2026-01-29T10:56:07Z

I wasn't completely comfortable with the use of choice in some type signatures, but I need to think more carefully about this

I've refactored the theorems that use choice to use the Part API, in particular Part.get. That pushes choice out of the URM API. (I think for this formalization we can't totally avoid choice.)

fmontesi

Really nice work! I've added some comments and questions myself.

Cslib/Computability/Urm/Basic.lean

Cslib/Computability/URM/Basic.lean

Cslib/Computability/URM/Defs.lean

fmontesi · 2026-01-29T11:09:12Z

Cslib/Computability/URM/Execution.lean

+
+/-- The step relation is deterministic: each configuration has at most one successor. -/
+theorem deterministic {c c' c'' : Config} (h1 : Step p c c') (h2 : Step p c c'') : c' = c'' := by
+  cases h1 <;> cases h2 <;> grind


@chenson2018, in LTS we have the class Deterministic to cover exactly this. Would it make sense to introduce a similar class for ReductionSystem and use it here?

fmontesi · 2026-01-29T11:09:39Z

Cslib/Computability/URM/Execution.lean

+  Relation.ReflTransGen.trans h1 h2
+
+/-- One step implies multi-step. -/
+theorem single {c c' : Config} (h : Step p c c') : Steps p c c' :=


As Chris mentioned, some of these things can come for free from ReductionSystem.

fmontesi · 2026-01-29T11:11:09Z

Cslib/Computability/URM/Execution.lean

+
+/-- A program halts on given inputs if there exists a halted configuration reachable from
+the initial configuration. -/
+def Halts (inputs : List ℕ) : Prop :=


@chenson2018: if we don't/cannot generalise this to ReductionSystem now, I'd like a concurrent issue to be opened so that we remember.

In general, concepts like divergence and termination should probably be unified under general definitions for ReductionSystem and LTS.

I really like that idea. I'll investigate using Deterministic. I tried building a bit on top of LTSs but I couldn't think of a good match with URM semantics, but it looks like I need to investigate LTSs some more.

I think ReuctionSystem is pretty straightforward in replacing Step/Steps. I'd say give it a try and if we hit any significant difficulties we open an issue as Fabrizio has suggested.

I'll give that a try; I'd really like to align with existing infrastructure. Stay tuned!

Per textbook convention: State is full machine state (pc + registers), Regs is just register contents.

jessealama requested review from chenson2018 and fmontesi as code owners January 28, 2026 14:20

jessealama force-pushed the urm-kickstarter branch 4 times, most recently from 388bd04 to 6f272d5 Compare January 28, 2026 14:46

jessealama force-pushed the urm-kickstarter branch from 6f272d5 to 09b7076 Compare January 28, 2026 15:02

BoltonBailey reviewed Jan 28, 2026

View reviewed changes

chenson2018 requested changes Jan 29, 2026

View reviewed changes

ctchou reviewed Jan 29, 2026

View reviewed changes

jessealama and others added 14 commits January 29, 2026 09:33

refactor(Urm): consolidate core definitions into Defs.lean

0f205b4

refactor(Urm): split helper lemmas into Basic.lean

3bff171

refactor(Urm): move theorems from Defs.lean to Basic.lean

2fc2fb3

refactor(Urm): rename Step.trans to Step.transfer

51d7e8c

refactor(Urm): use Prop-first predicates, remove Bool/Prop duplication

3f4cda6

- Define IsJump, JumpsBoundedBy, IsStraightLine directly as Prop - Remove redundant Bool versions (isJump, jumps_bounded_by, etc.) - Rename theorems: not_IsJump → nonJump

Update Cslib/Computability/Urm/StandardForm.lean

e66b43f

Co-authored-by: Chris Henson <46805207+chenson2018@users.noreply.github.com>

Use grind more liberally

e66651a

Update Cslib/Computability/Urm/StandardForm.lean

fc3e5ac

Co-authored-by: Chris Henson <46805207+chenson2018@users.noreply.github.com>

Update Cslib/Computability/Urm/StandardForm.lean

602bad6

Co-authored-by: Chris Henson <46805207+chenson2018@users.noreply.github.com>

Golf a bit with grind

4f84890

refactor(Urm): put Program first in shift_jumps/shift_registers

db5f2ae

Swap argument order so Program comes before offset, enabling nicer dot notation like `p.shift_jumps offset`. Also fix redundant `rename_i` in StandardForm.lean.

style(Urm): split packed tactic lines for readability

6458977

refactor(Urm): hide Classical.choose in straightLine_finalConfig

854eeb3

Introduce `straightLine_finalConfig` to wrap the choice, mirroring the pattern used for `evalConfig`. Updates `straightLine_finalConfig_spec` to use the wrapper instead of exposing `Classical.choose` in its type.

fmontesi requested changes Jan 29, 2026

View reviewed changes

jessealama added 3 commits January 29, 2026 13:32

rename Urm to URM per Mathlib conventions

1f43494

style(URM): use camelCase for theorem names

e0c0353

style(URM): cap_jump → capJump

78f8488

jessealama added 2 commits January 29, 2026 13:38

style(URM): use camelCase for definitions

1794048

refactor(URM): rename State → Regs, Config → State

d9d1acb

Per textbook convention: State is full machine state (pc + registers), Regs is just register contents.

		- `J m n q`: If registers m and n have equal contents, jump to instruction q;
		otherwise proceed to the next instruction

feat(Computability): add Unlimited Register Machines #299

Are you sure you want to change the base?

feat(Computability): add Unlimited Register Machines #299

Uh oh!

Conversation

jessealama commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's included

Design decisions

Feedback welcome

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jessealama Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chenson2018 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ctchou Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jessealama commented Jan 29, 2026

Uh oh!

fmontesi left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jessealama commented Jan 28, 2026 •

edited

Loading

jessealama Jan 30, 2026 •

edited

Loading

ctchou Jan 29, 2026 •

edited

Loading