Welcome to Yichuan Mo’s Homepage! 👏

I am a fourth-year PhD candidate at the ZERO Lab in the School of Intelligence Science and Technology at Peking University, advised by Prof. Yisen Wang. I received my bachelor's degree from Shanghai Jiao Tong University in 2022, where I was co-advised by Prof. Shilin Wang and Prof. Junchi Yan.

My research focuses on the safety alignment of deep learning models, particularly Large Language Models and diffusion-based image/text generation models, and my work has received over 900 citations. More broadly, I am interested in exploring next-generation language model paradigms, including model architectures, post-training, and decoding strategies. Recently, I have been studying language diffusion models and investigating their potential integration with conventional auto-regressive pipelines.

I am an optimistic and kind person who enjoys finding happiness in everyday life. I am particularly passionate about sports, especially badminton, swimming, and running. My long-term aspiration is to build safe Artificial General Intelligence (AGI) systems that can continuously and reliably benefit all of humanity. In the face of rapidly evolving AI technologies, I stay humble, learn from fellow researchers, and enjoy embracing new challenges.

I expect to graduate in June 2027 and am currently seeking positions starting in summer 2027, including large language model training, quantitative research, and academic roles. Feel free to contact me at mo666666@stu.pku.edu.cn!

🎓 Education

  • Peking University
    Sep. 2022 – Present
    Ph.D. Candidate
    School of Intelligence Science and Technology
  • Shanghai Jiao Tong University
    Sep. 2018 – Jun. 2022
    B.Eng.
    School of Computer Science

💯 Academic Performance

Undergraduate

  • GPA: 90.93/100 (or 3.94/4.3), Rank: 2/128 (Top 1.6%)
  • Courses: 55.81% above A, 24.42% above A+

Graduate

  • GPA: 3.88/4.0
  • Courses: 71.4% above A

🏆 Selected Awards

Undergraduate

  • 2019.10 Merit Student of Shanghai Jiao Tong University
  • 2019.12 National Scholarship (Top 1% in CE Dept.)
  • 2020.12 National Scholarship (Top 1% in CE Dept.)
  • 2021.12 Weichai Power Scholarship
  • 2022.05 Outstanding Dormitory
  • 2022.06 Outstanding Graduate of Shanghai (Top 3%)
  • 2022.06 Outstanding Undergraduate Thesis, Shanghai Jiao Tong University (Top 1%, 1 out of 128 students in CE Dept.)

Graduate

  • 2022.12 Sunshine Dormitory
  • 2023.09 Xiaomi Scholarship (First-Class)
  • 2023.09 Merit Student of Peking University
  • 2024.09 Yuehua Luo Scholarship
  • 2024.09 Outstanding Research Award at Peking University
  • 2025.04 Nomination for Academic Star (Five Graduate students in AI Dept.)
  • 2025.11 Taotian Scholarship (Eight Graduate students in AI Dept.)
  • 2026.04 Optiver AI PhD Scholarship (Six PhD students in China)

📝 Papers

(* Equal Contribution and # Student First Author)

Accepted

  • TrustLDM: Benchmarking Trustworthiness in Language Diffusion Model

    ICLR 2026 Trustworthy Workshop (First benchmark for evaluating trustworthiness of language diffusion models)

  • Decoding Large Language Diffusion Models with Foreseeing Movement

    ICLR 2026 DeLTa Workshop

  • Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations

    TPAMI 2026 (Adopted at scale by Anthropic)

  • On the Adversarial Transferability of Generalized “Skip Connections”

    TPAMI 2026 (Journal extension of SGM, original paper cited 400+ times on Google Scholar)

  • Fight Back Against Jailbreaking via Prompt Adversarial Tuning

    NeurIPS 2024

  • TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors

    ICML 2024 (First backdoor input detection method for diffusion models)

  • PID: Prompt-Independent Data Protection Against Latent Diffusion Models

    ICML 2024

  • When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture

    NeurIPS 2022 (Spotlight, Top 5%) (First work to improve adversarial robustness of ViTs)

  • Improving Generative Adversarial Networks via Adversarial Learning in Latent Space

    NeurIPS 2022 (Spotlight, Top 5%)

  • DICE: Domain-attack Invariant Causal Learning for Improved Data Privacy Protection and Adversarial Robustness

    SIGKDD 2022

  • Multi-Task Learning Improves Synthetic Speech Detection

    ICASSP 2022

Preprint

  • SelfCAD: Protecting Your Efficient Reasoning Capabilities via Self-Cautious Insertion

    Preprint 2026

  • Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training

    arXiv 2025

  • Are Smarter LLMs Safer? Exploring Safety-Reasoning Trade-offs in Prompting and Fine-Tuning

    arXiv 2025 (First to reveal the safety–reasoning capability trade-off)

🤖 Open-source Models

🛠️ Academic Service

  • Reviewer: NeurIPS 2023/2024/2025; ICLR 2024/2025/2026; ICML 2024/2025/2026; CVPR 2025/2026; ICCV 2025; IJCAI 2024; AAAI 2025/2026; AISTATS 2025; ECCV 2026
  • Top Reviewer of NeurIPS 2023 (Top 10.49%)
  • Top Reviewer of NeurIPS 2024 (Top 8.60%)
  • Top Reviewer of NeurIPS 2025 (Top 8.02%)
  • Notable Reviewer of ICLR 2025 (Top 3%)

🔗 Links

(Alphabetical Order)