A from-scratch implementation of a decoder-only, transformer-based language model. You can find my articles about this project here:
https://davids.onl/blog/mathematical-foundations-of-self-attention-mechanisms/
https://davids.onl/blog/mathematical-and-architectural-analysis-of-decoder-only-transformers/
This project gave me insight and perspective on language models, and made me realize how much data is required and how compute-intensive language modeling is as a task: language must not only be semantically and grammatically correct, but also extremely concise and informative.
Some stats about one of the completed training runs:

Visual representation of how attention mechanisms work:
Attention mechanisms applied to images; the white blur shows where the model attends.
Image credits: https://arxiv.org/abs/1502.03044
Attention mechanisms applied to next-word prediction.
Image credits: https://arxiv.org/abs/1706.03762
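For readers who want the mechanics behind those figures, here is a minimal sketch of causal (masked) scaled dot-product attention in NumPy. The weight names `w_q`, `w_k`, `w_v` and the tensor sizes are illustrative assumptions, not the exact shapes or framework used in this repo.

```python
# Minimal sketch of scaled dot-product attention with a causal mask
# (illustrative only; not the project's actual implementation).
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projection matrices (hypothetical shapes)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v                      # project inputs to queries, keys, values
    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)                        # similarity of every query to every key
    future = np.triu(np.ones_like(scores), k=1).astype(bool)  # strictly upper triangle = future positions
    scores = np.where(future, -np.inf, scores)                # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # row-wise softmax
    return weights @ v                                        # weighted sum of values per position

# Example with hypothetical sizes: 5 tokens, model width 8, head width 4
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)  # (5, 4)
```

The causal mask is what makes this suitable for next-word prediction: each position can only attend to itself and earlier tokens, which is the behavior the second figure above depicts.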