Welcome to XTuner V1 English Documentation#
LLM One-Stop Toolbox
XTuner V1 is a next-generation training engine for large models, designed specifically for ultra-large-scale MoE models. Compared with traditional 3D parallel training architectures, it is deeply optimized for the MoE training scenarios that are currently mainstream in academia.
Speed Benchmark#
Core Features#
Dropless Training
Flexible Scaling, No Complex Configuration: 200B-scale MoE models train without expert parallelism; 600B MoE models require only intra-node expert parallelism
Optimized Parallel Strategy: Compared with traditional 3D parallel solutions, smaller expert-parallel dimensions make Dropless training more efficient
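To make "dropless" concrete, the sketch below contrasts capacity-limited routing, where tokens beyond an expert's capacity are dropped, with dropless routing, where every token reaches its chosen expert. It is a plain-Python illustration only: the function names `route_with_capacity` and `route_dropless` and the example values are hypothetical, and this is not XTuner V1's implementation.

```python
from collections import defaultdict


def route_with_capacity(expert_ids, num_experts, capacity):
    """Capacity-limited routing: each expert accepts at most `capacity` tokens."""
    buckets = defaultdict(list)
    dropped = []
    for token_idx, expert in enumerate(expert_ids):
        if len(buckets[expert]) < capacity:
            buckets[expert].append(token_idx)
        else:
            dropped.append(token_idx)  # this token skips the expert layer
    return {e: buckets[e] for e in range(num_experts)}, dropped


def route_dropless(expert_ids, num_experts):
    """Dropless routing: every token is processed by its chosen expert."""
    buckets = {e: [] for e in range(num_experts)}
    for token_idx, expert in enumerate(expert_ids):
        buckets[expert].append(token_idx)
    return buckets


# Top-1 expert choice for 8 tokens, deliberately imbalanced toward expert 0.
expert_ids = [0, 0, 0, 0, 0, 1, 2, 0]

capped, dropped = route_with_capacity(expert_ids, num_experts=4, capacity=2)
dropless = route_dropless(expert_ids, num_experts=4)

print("dropped with capacity=2:", dropped)
print("dropless token counts:", {e: len(t) for e, t in dropless.items()})
```

In the dropless case the per-expert token counts are uneven, which is why the efficiency of dropless training depends on how experts are distributed across devices.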
Long Sequence Support
Memory-Efficient Design: By combining several memory optimization techniques, 200B MoE models can train at a 64k sequence length without sequence parallelism
Flexible Extension Capability: Full support for DeepSpeed Ulysses sequence parallelism, so the maximum sequence length can be extended linearly
Stable and Reliable: Insensitive to expert load imbalance during long-sequence training, so performance stays stable
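The layout change behind Ulysses-style sequence parallelism can be sketched in a single process. Each rank starts with a slice of the sequence and all attention heads; an all-to-all exchange leaves each rank with the full sequence but only its share of the heads. The helper `ulysses_all_to_all` and all sizes below are hypothetical, and no real communication takes place. The sketch also assumes that "linearly extended" means the total sequence length grows in proportion to the sequence-parallel size when the per-rank token count is fixed.

```python
def ulysses_all_to_all(seq_shards, sp_size):
    """Turn sequence-sharded activations into head-sharded ones.

    seq_shards[r][t][h] is the entry for local token t and head h on rank r.
    The result head_shards[r][t][h] covers ALL tokens, but only the
    num_heads // sp_size heads owned by rank r.
    """
    num_heads = len(seq_shards[0][0])
    assert num_heads % sp_size == 0, "heads must divide evenly across ranks"
    heads_per_rank = num_heads // sp_size
    head_shards = []
    for rank in range(sp_size):
        lo, hi = rank * heads_per_rank, (rank + 1) * heads_per_rank
        # Concatenate every rank's tokens in order, keeping only this rank's heads.
        full_seq = [token[lo:hi] for shard in seq_shards for token in shard]
        head_shards.append(full_seq)
    return head_shards


sp_size, tokens_per_rank, num_heads = 4, 2, 8
# Label each entry as (global_token_index, head_index) so the layout is checkable.
seq_shards = [
    [[(r * tokens_per_rank + t, h) for h in range(num_heads)]
     for t in range(tokens_per_rank)]
    for r in range(sp_size)
]
head_shards = ulysses_all_to_all(seq_shards, sp_size)

total_tokens = len(head_shards[0])
print("tokens held per rank before the exchange:", tokens_per_rank)
print("tokens visible to attention on each rank:", total_tokens)
```

With the per-rank token count held constant, doubling `sp_size` doubles `total_tokens`, as long as the number of heads remains divisible by `sp_size`.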
Excellent Efficiency
Ultra-Large Scale Support: Supports MoE model training up to 1T parameters
Breaking the Performance Bottleneck: For the first time, FSDP training throughput surpasses traditional 3D parallel solutions on MoE models above the 200B scale
Hardware Optimization: On Ascend A3 NPU supernodes, training efficiency surpasses that of NVIDIA H800
Roadmap#
XTuner V1 is committed to continuously improving the pretraining, instruction fine-tuning, and reinforcement learning training efficiency of ultra-large-scale MoE models, with a focus on optimizing Ascend NPU support.
Training Engine#
Our vision is to build XTuner V1 into a universal training backend that seamlessly integrates into a broader open-source ecosystem.
| Model | GPU(FP8) | GPU(BF16) | NPU(BF16) |
|---|---|---|---|
| Intern S1 | β | β | β |
| Intern VL | β | β | β |
| Qwen3 Dense | β | β | β |
| Qwen3 MoE | β | β | β |
| GPT OSS | β | β | β |
| Deepseek V3 | β | β | β |
| KIMI K2 | β | β | β |
Algorithm Suite#
Algorithm components are iterating rapidly. Community contributions are welcome: use XTuner V1 to take your algorithms to unprecedented scale!
Implemented
Multimodal Pretraining - Full support for vision-language model training
Multimodal Supervised Fine-tuning - Optimized for instruction following
GRPO - Group Relative Policy Optimization
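As a rough illustration of the "group relative" idea in GRPO, the sketch below normalizes the rewards of several responses sampled for the same prompt against the group's own mean and standard deviation. The function name `group_relative_advantages`, the `eps` term, the choice of population standard deviation, and the reward values are all assumptions for illustration; this is not XTuner V1's API.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four sampled responses to a single prompt (made-up values).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group average get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.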
Coming Soon
Inference Engine Integration#
Seamless integration with mainstream inference frameworks
β LMDeploy
β vLLM
β SGLang
Contribution Guidelines#
We thank all contributors for their efforts to improve and enhance XTuner. Please read the Contribution Guidelines before participating in the project.
Acknowledgments#
The development of XTuner V1 is deeply inspired and supported by excellent projects in the open-source community. We extend our sincere gratitude to the following pioneering projects:
Training Engines:
[Torchtitan](pytorch/torchtitan) - PyTorch native distributed training framework
[Deepspeed](deepspeedai/DeepSpeed) - Microsoft deep learning optimization library
[MindSpeed](https://gitee.com/ascend/MindSpeed) - Ascend high-performance training acceleration library
[Megatron](NVIDIA/Megatron-LM) - NVIDIA large-scale Transformer training framework
Reinforcement Learning:
XTuner V1's reinforcement learning capabilities draw on the excellent practices and experience of the following projects:
[veRL](volcengine/verl) - Volcano Engine Reinforcement Learning for LLMs
[SLIME](THUDM/slime) - THU's scalable RLHF implementation
[AReal](inclusionAI/AReaL) - Ant Reasoning Reinforcement Learning for LLMs
[OpenRLHF](OpenRLHF/OpenRLHF) - An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray
We sincerely thank all contributors and maintainers of these projects for their continuous advancement of the large-scale model training field.
Citation#
@misc{2023xtuner,
title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
author={XTuner Contributors},
howpublished = {\url{https://github.com/InternLM/xtuner}},
year={2023}
}
Open Source License#
This project is released under the Apache License 2.0. Please also comply with the licenses of the models and datasets you use.