September 10, 2025
Release
Tri Series Intermediate Checkpoints Release



As part of our Open Source Month, we’re releasing intermediate checkpoints from the Tri family—0.5B, 1.9B, 7B, and 70B—and, to our knowledge (as of Sep 10, 2025), this is the first release of intermediate checkpoints for large language models trained from scratch in Korea.
Checkpoints are released at fixed step intervals—approximately 20B tokens for 0.5B, 40B for 1.9B, and 160B for 7B and 70B—so training dynamics can be analyzed consistently. We look forward to seeing how the research community leverages these.
The smaller models (Tri-0.5B and Tri-1.9B) are test-run checkpoints produced during system bring-up. While not polished finals, they’re valuable for studying scaling behavior, convergence, and phase transitions across training steps.
As part of our Open Source Month, we’re releasing intermediate checkpoints from the Tri family—0.5B, 1.9B, 7B, and 70B—and, to our knowledge (as of Sep 10, 2025), this is the first release of intermediate checkpoints for large language models trained from scratch in Korea.
Checkpoints are released at fixed step intervals—approximately 20B tokens for 0.5B, 40B for 1.9B, and 160B for 7B and 70B—so training dynamics can be analyzed consistently. We look forward to seeing how the research community leverages these.
The smaller models (Tri-0.5B and Tri-1.9B) are test-run checkpoints produced during system bring-up. While not polished finals, they’re valuable for studying scaling behavior, convergence, and phase transitions across training steps.
Training Details
| 0.5B | 1.9B | 7B | 70B | |
|---|---|---|---|---|
| batch size (tokens) | 1M | 2M | 2M | 8M | 
| learning rate | 6e-3 | 3e-3 | 2e-4 | 1.5e-4 | 
| optimizer | AdamW | AdamW | AdamW | AdamW | 
| beta1 | 0.9 | 0.9 | 0.9 | 0.9 | 
| beta2 | 0.95 | 0.95 | 0.95 | 0.95 | 
| learning rate scheduler | WSD | WSD | WSD | WSD | 
| total tokens seen | 1.26T | 1.88T | 2T | 1.5T | 
| 0.5B | 1.9B | 7B | 70B | |
|---|---|---|---|---|
| batch size (tokens) | 1M | 2M | 2M | 8M | 
| learning rate | 6e-3 | 3e-3 | 2e-4 | 1.5e-4 | 
| optimizer | AdamW | AdamW | AdamW | AdamW | 
| beta1 | 0.9 | 0.9 | 0.9 | 0.9 | 
| beta2 | 0.95 | 0.95 | 0.95 | 0.95 | 
| learning rate scheduler | WSD | WSD | WSD | WSD | 
| total tokens seen | 1.26T | 1.88T | 2T | 1.5T | 
Training Loss


You can browse the checkpoints here:
Tri-0.5B : https://huggingface.co/trillionlabs/0.5B-Intermediate-Checkpoints
Tri-1.9B : https://huggingface.co/trillionlabs/1.9B-Intermediate-Checkpoints
Tri-7B : https://huggingface.co/trillionlabs/Tri-7B-Intermediate-Checkpoints
Tri-70B : https://huggingface.co/trillionlabs/Tri-70B-Intermediate-Checkpoints
Feel free to check out the full Tri-series collection here: https://huggingface.co/collections/trillionlabs/tri-series-687fa9ff7eb23e8ba847ef93
You can browse the checkpoints here:
Tri-0.5B : https://huggingface.co/trillionlabs/0.5B-Intermediate-Checkpoints
Tri-1.9B : https://huggingface.co/trillionlabs/1.9B-Intermediate-Checkpoints
Tri-7B : https://huggingface.co/trillionlabs/Tri-7B-Intermediate-Checkpoints
Tri-70B : https://huggingface.co/trillionlabs/Tri-70B-Intermediate-Checkpoints
Feel free to check out the full Tri-series collection here: https://huggingface.co/collections/trillionlabs/tri-series-687fa9ff7eb23e8ba847ef93

