Jiahao’s Homepage

Hi! I am currently a Postdoctoral Researcher jointly supervised by the Institute of Automation, Chinese Academy of Sciences (CASIA) and Wenge AI Lab. I am affiliated with the State Key Laboratory of Multimodal Artificial Intelligence Systems. I received my PhD from CASIA in 2024 and my bachelor’s degree from Southwest University in 2019.

My research interests include:

Alignment of LLM/VLM (RL, interpretability)
Safety Assessment of LLM/VLM (Jailbreak, red-teaming)
Adversarial Robustness (mainly during my PhD)

News

Oct. 2025: Serving as an Area Chair for ACL Rolling Review (ARR).
Aug. 2025: The helpful-only model released on Hugging Face reached 65k downloads.
July 2025: Awarded 2nd place in the ICML 2025 AI4MATH Challenge (Track 2: SeePhys).

Publications

Zhaoyu Ma, Yuan Shan, Jiahao Zhao, Nan Xu, Lei Wang. Meow: End-to-End Outline Writing for Automatic Academic Survey. arXiv, 2025.
Jiahao Zhao, Liwei Dong. Jinx: Unlimited LLMs for Probing Alignment Failures. Technical report, 2025.
Jiahao Zhao, Wenji Mao, Daniel Zeng. Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024.
Yilin Cao, Jiahao Zhao, Rui Zhang, Hanyi Zou, Wenji Mao. TARA: Token-level Attribute Relation Adaptation for Multi-Attribute Controllable Text Generation. Findings of the Association for Computational Linguistics: EMNLP 2024.
Xingjin Wang, Jiahao Zhao, Jiahui Shi, Linjing Li, Daniel Zeng. A Novel Visual-Enhanced Dual Stream Long-Term Decision Framework for Large Language Model Agents. International Conference on Neural Information Processing, ICONIP 2024.
Minzheng Wang, Nan Xu, Jiahao Zhao, Yin Luo, Wenji Mao. PromISe: Releasing the Capabilities of LLMs with Prompt Introspective Search. Proceedings of the 2024 Joint International Conference on Computational Linguistics, LREC-COLING 2024.
Jiahao Zhao, Minzheng Wang, Nan Xu, Yin Luo, Wenji Mao. Enhancing Adversarial Robustness of LLMs with Analytic Hierarchy Process. First Conference on Language Modeling, COLM 2024.
Jiahao Zhao, Wenji Mao. Generative Adversarial Training with Perturbed Token Detection for Model Robustness. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023.
Jiahao Zhao, Penghui Wei, Wenji Mao. Robust Neural Text Classification and Entailment via Mixup Regularized Adversarial Training. Proceedings of the 44th International ACM SIGIR Conference, SIGIR 2021.
Penghui Wei, Jiahao Zhao, Wenji Mao. A Graph-to-Sequence Learning Framework for Summarizing Opinionated Texts. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021.
Penghui Wei, Jiahao Zhao, Wenji Mao. Effective Inter-Clause Modeling for End-to-End Emotion-Cause Pair Extraction. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020.

Experience

Jan. 2025 – Present. Postdoc, CASIA & Wenge.
Mainly working on AI4S LLM/Agent development.
Nov. 2023 – Dec. 2024. Intern, Luminlp @ miHoYo Inc.
Had a happy time here. Learned a bitter lesson as well.
Keep your word, speak plainly, recognize only results, strive for excellence.
June 2023 – Sep. 2023. Intern, AI Lab @ Wenge.
Witnessed the first wave of LLMs.

Awards

2024: “Climbing” Second-Class Scholarship, Institute of Automation, CASIA
2023: Outstanding Student Award, Institute of Automation, CASIA
2019: Outstanding Graduate, Southwest University
2018: National Scholarship, Ministry of Education

Jiahao Zhao
