- Added TinyGPU.load_kernel(program, labels, grid=(blocks,tpb), args=None, shared_size=...), which configures the grid, copies kernel args into R0..Rk, allocates shared memory, and loads the program
- Added TinyGPU.run_kernel() convenience wrapper
- Added example runner demonstrating kernel launch (examples/run_vector_add_kernel.py)
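For readers who want to see the launch sequence spelled out, here is a minimal sketch of what the `load_kernel` steps listed above could look like. `ToyGPU` is a hypothetical stand-in, not the real `TinyGPU` class: the attribute names (`registers`, `shared`, `grid`), the per-thread register layout, and the default argument values are all assumptions made for illustration. Only the parameter names come from the commit message.

```python
class ToyGPU:
    """Hypothetical stand-in for TinyGPU; internals are assumed, not the real ones."""

    def __init__(self, num_registers=8):
        self.num_registers = num_registers
        self.program = []
        self.labels = {}
        self.grid = (1, 1)
        self.registers = []  # one register file per thread
        self.shared = []     # one shared-memory bank per block

    def load_kernel(self, program, labels, grid=(1, 1), args=None, shared_size=0):
        blocks, tpb = grid
        args = list(args or [])
        if len(args) > self.num_registers:
            raise ValueError("more kernel args than registers")

        # 1. Configure the grid (number of blocks, threads per block).
        self.grid = (blocks, tpb)

        # 2. Give every thread a fresh register file and copy the
        #    kernel args into R0..Rk.
        num_threads = blocks * tpb
        self.registers = [[0] * self.num_registers for _ in range(num_threads)]
        for regs in self.registers:
            regs[: len(args)] = args

        # 3. Allocate one shared-memory bank per block.
        self.shared = [[0] * shared_size for _ in range(blocks)]

        # 4. Load the program and its label table.
        self.program = list(program)
        self.labels = dict(labels)


gpu = ToyGPU()
gpu.load_kernel([("NOP",)], {}, grid=(2, 4), args=[10, 20], shared_size=16)
```

After the call, each of the 8 threads starts with the same arguments in R0 and R1, and each of the 2 blocks has its own 16-slot shared bank.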
- Added step_single(), snapshot(), and rewind() to TinyGPU
- Added examples/debug_repl.py: a tiny command-line REPL to step through kernels
- Snapshot returns PC, flags, registers, memory slice, and shared memory for quick debugging
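A rough sketch of how stepping, snapshotting, and rewinding can fit together, assuming a history stack of saved states. `ToyMachine` and its toy instruction format are hypothetical, the snapshot here carries only the PC and registers (the real one also reports flags, a memory slice, and shared memory), and the `rewind(steps)` signature is an assumption.

```python
class ToyMachine:
    """Hypothetical single-thread machine; each instruction is (dst_reg, increment)."""

    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.registers = [0, 0, 0, 0]
        self.history = []  # snapshots taken before each executed step

    def snapshot(self):
        # Copy the register list so later steps cannot mutate the snapshot.
        return {"pc": self.pc, "registers": list(self.registers)}

    def step_single(self):
        """Execute one instruction; return False once the program has ended."""
        if self.pc >= len(self.program):
            return False
        self.history.append(self.snapshot())
        dst, increment = self.program[self.pc]
        self.registers[dst] += increment
        self.pc += 1
        return True

    def rewind(self, steps=1):
        """Undo up to `steps` steps; return how many were actually undone."""
        undone = 0
        while undone < steps and self.history:
            state = self.history.pop()
            self.pc = state["pc"]
            self.registers = list(state["registers"])
            undone += 1
        return undone


machine = ToyMachine([(0, 5), (1, 7)])
machine.step_single()
machine.step_single()
state_after_two_steps = machine.snapshot()
machine.rewind()
state_after_rewind = machine.snapshot()
```

Saving a full copy per step is simple and fine for small teaching programs; a larger simulator would more likely store deltas.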
…r examples
- Make SHLD/SHST robust: compute block_id from tid // threads_per_block (fall back to register if needed) and add defensive bounds checks to prevent IndexError when kernels overwrite registers.
- Refactor TinyGPU step internals and restore semantics so consecutive non-control instructions execute in the same cycle when expected.
- Normalize example imports by adding local `src` to sys.path when running scripts and switch to the `tinygpu` package imports.
- Add GIF saving for examples: each example now attempts to write a timestamped GIF to `src/outputs/<script_name>/` (wrapped in try/except to be CI/headless-friendly).
- Fix various lint/format issues (ruff/black): wrap long docstrings/lines, add noqa where appropriate, and remove tab indentation in tests.
- Update tests & verify: ruff/black/pytest passed; examples generated GIFs under `src/outputs/`.
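To illustrate the SHLD/SHST hardening described above, here is a small sketch of deriving the block index from the thread id and guarding every shared-memory access. The function names, the list-of-lists layout for shared memory, and the out-of-range policy (loads return 0, stores are ignored) are assumptions for illustration; the commit only states that `block_id` comes from `tid // threads_per_block` and that bounds checks prevent `IndexError`.

```python
def block_id_for(tid, threads_per_block):
    """Derive the block index from the thread id instead of trusting a register."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    return tid // threads_per_block


def shared_load(shared, tid, threads_per_block, addr):
    """SHLD-style read; returns 0 (assumed policy) when out of range."""
    block = block_id_for(tid, threads_per_block)
    if not 0 <= block < len(shared):
        return 0
    bank = shared[block]
    if not 0 <= addr < len(bank):
        return 0
    return bank[addr]


def shared_store(shared, tid, threads_per_block, addr, value):
    """SHST-style write; returns False (assumed policy) when out of range."""
    block = block_id_for(tid, threads_per_block)
    if not 0 <= block < len(shared):
        return False
    bank = shared[block]
    if not 0 <= addr < len(bank):
        return False
    bank[addr] = value
    return True


# Two blocks of four threads, four shared slots per block.
shared = [[0] * 4 for _ in range(2)]
stored = shared_store(shared, tid=5, threads_per_block=4, addr=2, value=99)
same_block = shared_load(shared, tid=6, threads_per_block=4, addr=2)
other_block = shared_load(shared, tid=1, threads_per_block=4, addr=2)
rejected = shared_store(shared, tid=5, threads_per_block=4, addr=10, value=1)
```

Threads 5 and 6 both map to block 1 and so see the same slot, while thread 1 in block 0 reads its own untouched bank. Explicit range checks also stop a negative address from silently wrapping around through Python's negative indexing.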
deaneeth added a commit that referenced this pull request Dec 16, 2025
Introduces the v2.0.0 release of TinyGPU
deaneeth added a commit that referenced this pull request Jan 18, 2026
Resolves CodeQL alert #1 - 'Workflow does not contain permissions'
Added minimal permissions block (contents: read) following the principle of least privilege as recommended by GitHub security best practices.
Ref: CWE-275
This pull request introduces the v2.0.0 release of TinyGPU, featuring significant enhancements to the instruction set, visualization, continuous integration, and documentation. The release adds new shared memory instructions, improves synchronization semantics, expands the visualizer, and updates the project structure and examples to better support educational use and extensibility.
Major new features and improvements:

Instruction Set and Core Functionality:
- …`SHLD` and `SHST` for robust per-block shared memory operations, and improved `SYNC`/`SYNCB` semantics for better thread and block coordination. [1] [2] [3]

Visualizer and Example Scripts:
- …`run_odd_even_sort.py`, `run_reduce_sum.py`, `run_sync_test.py`, new `run_block_shared_sum.py`) to output GIFs to `src/outputs/<script_name>/` and included new example programs, such as block shared memory sum and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]

Continuous Integration and Code Quality:
- …`dev` branch and Python 3.13, and added badges for linting, code style, and tests. Integrated `ruff` and `black` for linting and formatting. [1] [2] [3]

Documentation and Project Structure:
- …`README.md` and added `docs/index.md` for v2.0.0, with detailed changelogs, updated examples, project layout, and instruction set reference. Updated all documentation and image paths to reflect the new `src/outputs/` organization. [1] [2] [3] [4] [5]

New and Improved Examples:
- …`block_shared_sum.tgpu` and runner), and an interactive REPL debugger (`debug_repl.py`). [1] [2]

These changes collectively make TinyGPU more powerful, extensible, and user-friendly for both educational and development purposes.
- …`SHLD` and `SHST` instructions for shared memory, improved `SYNC` semantics, and refactored core execution for extensibility and performance. [1] [2] [3]
- …`src/outputs/`, and added new examples for block shared memory and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]
- …`dev` branch, and added badges for linting and tests. Integrated `ruff` and `black` for code quality. [1] [2] [3]
- …`README.md` and new `docs/index.md` with v2.0.0 changelog, usage instructions, new project layout, and instruction set reference. [1] [2] [3] [4] [5]
- …`block_shared_sum.tgpu`, `run_block_shared_sum.py`, and `debug_repl.py` for demonstrating block-level operations and interactive debugging. [1] [2]