Publications
* Equal Contribution † Project Lead
♠ Innovator
♠ Data Synthesis
Self-Foveate: Enhancing Diversity and Difficulty of Synthesized Instructions from Unsupervised Text via Multi-Level Foveation, Mingzhe Li†, Xin Lu, Yanyan Zhao.
- Proposes Self-Foveate, an automated LLM-driven framework for synthesizing instructions from unsupervised text.
- Introduces a “Micro-Scatter-Macro” multi-level foveation methodology guiding LLMs to extract fine-grained and diverse information.
- Demonstrates superior performance across multiple unsupervised corpora and model architectures.
♠ Multimodal
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm, Jingqi Tong*, Yurong Mou*, Hangcheng Li*, Mingzhe Li*, Yongzhuo Yang*, Ming Zhang, Qiguang Chen, Tianyi Liang, Xiaomeng Hu, Yining Zheng, Xinchi Chen, Jun Zhao, Xuanjing Huang, Xipeng Qiu.
- Introduces “Thinking with Video”, a new paradigm unifying visual and textual reasoning through video generation models.
- Develops VideoThinkBench, a reasoning benchmark for video generation models covering both vision-centric and text-centric tasks.
- Demonstrates that Sora-2 surpasses state-of-the-art vision-language models (VLMs) on several tasks.
♠ LLM Safety
STAR-S: Improving Safety Alignment through Self-Taught Reasoning on Safety Rules, Di Wu, Yanyan Zhao, Xin Lu, Mingzhe Li, Bing Qin.
- Proposes STAR-S (Self-TAught Reasoning based on Safety rules), a framework that integrates safety rule reasoning learning into a self-taught loop.
- Introduces a synergistic cycle where models reason and reflect on safety rules, then use fine-tuning to enhance safety reasoning capabilities.
- Demonstrates superior defense against jailbreak attacks compared to baseline models through iterative improvement of safety rule understanding.