
Conversation

@Ayush10 commented on Feb 1, 2026

Summary

  • Reworded "run a 100B BitNet b1.58 model" to "run a BitNet b1.58 model with 100B training tokens" in the README
  • The original phrasing is frequently misinterpreted as referring to a 100-billion-parameter model, whereas the model actually has 8B parameters and was trained on 100B tokens

Fixes #391

The phrase "run a 100B BitNet b1.58 model" is frequently misinterpreted as
referring to a 100 billion parameter model. The model is actually 8B
parameters trained on 100B tokens. Reworded to make this distinction clear.

Fixes microsoft#391

Development

Successfully merging this pull request may close these issues.

  • Please consider rewording section about 100B training tokens (#391)