Building world-class multilingual models

We're building Korea's superintelligence through world-class multilingual models.

Trillion-7B: Korean-Centric LLM

Trillion-7B is a highly efficient multilingual LLM leveraging Cross-lingual Document Attention (XLDA) for knowledge transfer, achieving competitive performance with minimal multilingual training data.

Revolutionary Token Efficiency

Trillion-7B stands as the most token-efficient Korean-centric multilingual large language model available. Unlike conventional models, it achieves exceptional multilingual performance while dedicating only 10% of its training data to multilingual content.

Cross-lingual Document Attention (XLDA)

Our breakthrough XLDA mechanism revolutionizes knowledge transfer from English to target languages including Korean and Japanese. This innovation enables world-class multilingual understanding capabilities with unprecedented resource efficiency.
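As an illustrative sketch only (the exact XLDA formulation is not spelled out here): one way cross-lingual document attention can be realized is by relaxing the usual per-document attention mask during packed pretraining, so tokens of a target-language document may also attend to a parallel English document packed into the same sequence. The `doc_ids`/`pair_ids` layout below is invented for the example.

```python
import numpy as np

def xlda_mask(doc_ids, pair_ids):
    """Causal attention mask for a packed sequence.

    Standard document masking only lets a token attend to earlier tokens
    of its own document. The XLDA-style relaxation sketched here also
    lets tokens attend to earlier tokens of any document sharing the
    same pair id (e.g. an English document and its Korean counterpart).
    """
    doc = np.asarray(doc_ids)
    pair = np.asarray(pair_ids)
    n = len(doc)
    causal = np.tril(np.ones((n, n), dtype=bool))   # token j must precede i
    same_doc = doc[:, None] == doc[None, :]         # ordinary document mask
    same_pair = pair[:, None] == pair[None, :]      # cross-lingual link
    return causal & (same_doc | same_pair)

# Packed sequence: tokens 0-2 = English doc A, 3-5 = its Korean parallel
# doc A', 6-7 = an unrelated doc B.
doc_ids  = [0, 0, 0, 1, 1, 1, 2, 2]
pair_ids = [0, 0, 0, 0, 0, 0, 1, 1]   # A and A' share a pair id
m = xlda_mask(doc_ids, pair_ids)
assert m[4, 1]        # Korean token can attend to the English parallel doc
assert not m[6, 0]    # unrelated doc B still cannot see doc A
```

With a plain document mask the `same_pair` term would be absent and the Korean tokens would never see the English text; the relaxed mask is what allows English knowledge to flow into the target language during pretraining.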

Optimized Training Strategy

Efficient Data Composition

Only 10% of 2T training tokens allocated to multilingual data.
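In concrete terms, the stated split works out as follows (only the multilingual-vs-remainder split is given; any finer per-language breakdown is not stated here):

```python
total_tokens = 2_000_000_000_000   # 2T pretraining tokens
multilingual_share = 0.10          # 10% multilingual (Korean, Japanese, ...)

multilingual_tokens = total_tokens * multilingual_share
remaining_tokens = total_tokens - multilingual_tokens

print(f"multilingual: {multilingual_tokens / 1e9:.0f}B tokens")   # 200B
print(f"remaining:    {remaining_tokens / 1e12:.1f}T tokens")     # 1.8T
```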

Customized Tokenizer

Optimized for Korean language processing.
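To illustrate why a Korean-customized vocabulary matters (this is not the actual Trillion-7B tokenizer, just a toy comparison): a tokenizer that falls back to raw UTF-8 bytes spends three tokens per Hangul syllable, whereas a vocabulary with syllable-level (or larger) entries can spend one or fewer.

```python
# Toy comparison: byte-level fallback vs a hypothetical syllable-level vocab.
text = "안녕하세요 세계"   # "Hello, world" in Korean

byte_tokens = len(text.encode("utf-8"))  # 3 bytes per Hangul syllable
char_tokens = len(text)                  # one token per syllable or space

print(byte_tokens, char_tokens)          # 22 vs 8 — ~2.75x fewer tokens
```

Fewer tokens per sentence means more Korean text fits into the same training budget and context window, which compounds with the 10% multilingual data allocation above.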

Cost Efficiency

Full training completed in just 59.4K H100 GPU hours ($148K).
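The two figures imply an average compute rate (the actual hourly price paid is not stated; this just divides one reported number by the other):

```python
gpu_hours = 59_400     # reported H100 GPU hours
total_cost = 148_000   # reported cost in USD

per_gpu_hour = total_cost / gpu_hours
print(f"${per_gpu_hour:.2f} per H100 hour")   # about $2.49
```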
