
Conversation

@Ayush10 commented on Feb 1, 2026

Summary

  • Reworded "run a 100B BitNet b1.58 model" to "run a BitNet b1.58 model with 100B training tokens" in the README
  • The original phrasing is frequently misinterpreted as referring to a 100-billion-parameter model, whereas the model actually has 8B parameters and was trained on 100B tokens

Fixes #391

The phrase "run a 100B BitNet b1.58 model" is frequently misinterpreted as
referring to a 100 billion parameter model. The model is actually 8B
parameters trained on 100B tokens. Reworded to make this distinction clear.

Fixes microsoft#391

Development

Successfully merging this pull request may close these issues.

  • Please consider rewording section about 100B training tokens (#391)