September 10, 2025

Release

Tri Series Intermediate Checkpoints Release

As part of our Open Source Month, we're releasing intermediate checkpoints from the Tri family (0.5B, 1.9B, 7B, and 70B). To our knowledge (as of September 10, 2025), this is the first release of intermediate checkpoints for large language models trained from scratch in Korea.

Checkpoints are saved at fixed intervals of training progress: roughly every 20B tokens for 0.5B, every 40B for 1.9B, and every 160B for 7B and 70B, so training dynamics can be analyzed on a consistent schedule. We look forward to seeing what the research community builds with them.
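Combined with the batch sizes listed under Training Details, these token intervals pin down the checkpoint spacing in optimizer steps. A quick back-of-the-envelope sketch (values taken from this post; actual saved step numbers may differ slightly):

```python
# Step interval between checkpoints = token interval / batch size.
# Token intervals come from the text above; batch sizes from the
# table under Training Details.
configs = {
    "0.5B": (20e9, 1e6),   # (tokens between checkpoints, batch tokens)
    "1.9B": (40e9, 2e6),
    "7B":   (160e9, 2e6),
    "70B":  (160e9, 8e6),
}
for name, (ckpt_tokens, batch_tokens) in configs.items():
    print(f"{name}: a checkpoint every ~{int(ckpt_tokens / batch_tokens):,} steps")
# 0.5B, 1.9B, and 70B land on ~20,000-step spacing; 7B on ~80,000.
```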

The smaller models (Tri-0.5B and Tri-1.9B) are test-run checkpoints produced during system bring-up. They are not polished final models, but they're valuable for studying scaling behavior, convergence, and phase transitions across training steps, as the sketch below illustrates.
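If the checkpoints are published as revisions of a Hugging Face repository, sweeping them for a loss-over-training curve looks roughly like this. The repo id and branch names here are assumptions for illustration; consult the actual model cards for the real ones.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumptions: the repo id and revision (branch) names below are
# hypothetical -- check the released model cards for the real ones.
REPO = "trillionlabs/Tri-1.9B"
REVISIONS = ["step-20000", "step-40000", "step-60000"]

text = "The quick brown fox jumps over the lazy dog."

for rev in REVISIONS:
    tok = AutoTokenizer.from_pretrained(REPO, revision=rev)
    model = AutoModelForCausalLM.from_pretrained(REPO, revision=rev)
    model.eval()
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        # labels=input_ids yields the mean next-token cross-entropy
        loss = model(**ids, labels=ids["input_ids"]).loss
    print(f"{rev}: loss {loss.item():.3f}")
```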

Training Details


|                         | 0.5B  | 1.9B  | 7B    | 70B    |
|-------------------------|-------|-------|-------|--------|
| batch size (tokens)     | 1M    | 2M    | 2M    | 8M     |
| learning rate           | 6e-3  | 3e-3  | 2e-4  | 1.5e-4 |
| optimizer               | AdamW | AdamW | AdamW | AdamW  |
| beta1                   | 0.9   | 0.9   | 0.9   | 0.9    |
| beta2                   | 0.95  | 0.95  | 0.95  | 0.95   |
| learning rate scheduler | WSD   | WSD   | WSD   | WSD    |
| total tokens seen       | 1.26T | 1.88T | 2T    | 1.5T   |
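All four models use AdamW with betas (0.9, 0.95) and a WSD (warmup-stable-decay) learning-rate schedule. As a minimal sketch of what WSD means in code, using PyTorch (the warmup and decay lengths below are illustrative assumptions; the table only states that WSD was used):

```python
import torch

# A toy module stands in for the model; lr and betas from the table
# (7B column shown).
model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4, betas=(0.9, 0.95))

def wsd_factor(step, total_steps=100_000, warmup=2_000, decay=10_000):
    """Warmup-Stable-Decay multiplier in [0, 1]: linear warmup, flat
    plateau at the peak learning rate, then a linear decay at the end.
    Phase lengths are assumptions, not published values."""
    if step < warmup:
        return step / warmup
    if step < total_steps - decay:
        return 1.0
    return max(0.0, (total_steps - step) / decay)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, wsd_factor)
# In a training loop: call optimizer.step() then scheduler.step() per batch.
```

The appeal of WSD for a checkpoint release like this is that the plateau phase keeps the learning rate constant for most of training, so intermediate checkpoints are directly comparable without schedule effects confounding the analysis.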

