My name is Chenghao Yang. I am currently a first-year Ph.D. student at the University of Chicago. I am fortunate to be advised by Prof. Allyson Ettinger and Prof. Chenhao Tan. My research is generously supported by the Eckhardt Scholarship.

My research interests are in natural language processing (NLP) and machine learning (ML). My goal is to design practical NLP systems and to understand the human intelligence underlying natural language. Recently, I have worked on pragmatics, word-level semantics, robustness, question answering, and continuous-time event stream modeling.

I am fortunate to have been guided and mentored by many great researchers, and I deeply appreciate their help: Prof. Jason Eisner at JHU, Prof. He He at NYU, Prof. Kai-Wei Chang at UCLA, Prof. Xuezhe Ma at USC, Prof. Smaranda Muresan at Columbia University, Prof. Zhiyuan Liu at Tsinghua University, and Dr. Mo Yu at WeChat Research.

Before joining UChicago, I was an applied scientist at AWS AI, working under Andrew O. Arnold. My full-time work at AWS AI was mostly about building and evaluating large-scale language models for code generation. I obtained my M.S. degree from the Computer Science Department at Columbia University (ML track) and my bachelor's degree from the Software College at Beihang University.

In my spare time, I enjoy listening to music, playing the guitar, and watching movies and anime. I have recently become fascinated by cooking Chinese dishes.

I believe in the famous quote "The best way to learn is to teach." I often make slides or give chalk talks to explain novel concepts or technical ideas. See the collection of my slides here on Google Drive (recent additions: Emergent Ability of LLM, Unified View of Pattern Efficient Tuning, Open-Retrieval QA / ICT Retriever Pretraining: Explained, Tutorial on Certified Robustness in NLP).

Feel free to send me any comments or feedback! I am happy to chat about various topics in ML and NLP. I am also open to various forms of research collaboration.

Personal News

  • [Nov, 2022] Gave a talk on Predicting and Explaining Message-Level Disclosures of Opioid Use Disorder at the University of Maryland, College Park.
  • [Oct, 2022] Excited to share that my work in collaboration with Prof. Xuezhe (Max) Ma at USC has been accepted to EMNLP 2022! Shout-out to my great mentor Max, and thanks to Marius Mosbach for the help!
  • [Sept, 2022] Officially started my Ph.D. at UChicago!
  • [Aug, 2022] Last days at AWS AI. Thanks to my great mentors, colleagues, managers, and leaders. I learned a lot. See my official goodbye tweet. Looking forward to more exciting news from this excellent team!
  • [July, 2022] Attending NAACL'22 to present my TACL'21 paper on NarrativeQA. Say hi if you are in Seattle as well!
  • [June, 2022] Visited USC and UCLA in LA. Thanks to Robin, Xuezhe, Muhao, Swabha, and Kai-Wei for hosting me and chatting with me in LA!
  • [April, 2022] Officially accepted the UChicago CS Ph.D. offer. I will work with my great advisors Prof. Allyson Ettinger and Prof. Chenhao Tan. My research will also be generously supported by the UChicago Eckhardt Scholarship. Thanks, UChicago!
  • [Jan, 2022] Excited to share that my work in collaboration with Prof. Jason Eisner and Prof. Hongyuan Mei has been accepted to ICLR 2022! It focuses on building a neural-symbolic hybrid based on the Transformer architecture that can be used for event stream modeling. Please take a look at our full paper and our codebase!
  • [Oct, 2021] Officially invited to serve as a reviewer for ACL Rolling Review. I will review for September (as an emergency reviewer), October, and November.
  • [June, 2021] Officially joined AWS AI as an applied scientist intern. Looking forward to exploring a Robustness + QA project! Feel free to reach out if you are also at AWS!
  • [May, 2021] Three pieces of important news:
    • My paper on NarrativeQA has been officially accepted by TACL (work done during my internship at IBM). Thanks to my great mentor Mo Yu and co-authors!
    • My paper on suicide risk assessment has been accepted to ACL 2021 as a short paper! Thanks to my great advisor Smara and my supportive co-author Yudong!
    • Also, our collaborative work with the Columbia School of Social Work on COVID-19 social media analysis has been officially accepted by the Journal of Addiction Medicine. Thanks to all my great collaborators! Very excited to contribute to COVID-19-related research.
  • [April, 2021] Officially graduated with a Master's degree in Computer Science from Columbia. Thanks to my great research advisor Smaranda Muresan, to my awesome and patient lecturers and TAs, and to my classmates for their company!
  • [Oct, 2020] I will start as a visiting research assistant at JHU CLSP in Spring 2021. I will work remotely with Prof. Jason Eisner and his Ph.D. advisee Hongyuan Mei.
  • [Jun, 2020] I started my internship at IBM Watson as a Sr. Cognitive Software Developer. I will work with Dr. Mo Yu on NarrativeQA projects. Feel free to connect if you are also at IBM!
  • [Jan, 2020] I started working as a Research Assistant at Columbia University with Prof. Smaranda Muresan on NLP for health and social good.
  • [Dec, 2019] I finished my visit at Tsinghua University as a Visiting Student Research Assistant. Many thanks to my advisor Prof. Zhiyuan Liu and my great collaborators Hao Zhu, Ruobin Xie, Fanchao Qi, Yuan Zang, and Junjie Huang.

Publications

(“*” indicates equal contribution)

Journal Papers

  1. Chenghao Yang*, Xiangyang Mou*, Mo Yu*, Bingsheng Yao, Xiaoxiao Guo, Saloni Potdar, Hui Su., Narrative Question Answering with Cutting-Edge Open-Domain QA Techniques: A Comprehensive Study, TACL 2021 [paper]
  2. Nabila El-Bassel, Karli R Hochstatter, Melissa Slavin, Chenghao Yang*, Yudong Zhang*, Smaranda Muresan., Harnessing the Power of Social Media to Understand the Impact of COVID-19 on People Who Use Drugs During Lockdown and Social Distancing, Journal of Addiction Medicine

Conference Papers

  1. Chenghao Yang, Xuezhe Ma., Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping, EMNLP 2022 [paper] [codebase]
  2. Chenghao Yang, Hongyuan Mei, Jason Eisner., Transformer Embeddings of Irregularly Spaced Events and Their Participants, ICLR 2022 [full paper] [codebase]
  3. Chenghao Yang, Yudong Zhang, Smaranda Muresan., Weakly-Supervised Methods for Suicide Risk Assessment: Role of Related Domains, ACL 2021 (Short) [paper] [codebase]
  4. Chenghao Yang*, Yuan Zang*, Fanchao Qi*, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun., Word-level Textual Adversarial Attacking as Combinatorial Optimization, ACL 2020 (Long) [paper] [codebase]
  5. Fanchao Qi*, Junjie Huang*, Chenghao Yang, Zhiyuan Liu et al., Modeling Semantic Compositionality with Sememe Knowledge, ACL 2019 (Long & Oral) [paper] [codebase]

Workshop Papers

  1. Chenghao Yang*, Yuhui Zhang*, Zhengping Zhou*, Zhiyuan Liu., Enhancing Transformer with Sememe Knowledge, RepL4NLP@ACL 2020 [paper]
  2. Xiangyang Mou, Mo Yu, Bingsheng Yao, Chenghao Yang, Xiaoxiao Guo, Saloni Potdar, Hui Su., Frustratingly Hard Evidence Retrieval for QA Over Books, NUSE@ACL 2020 [paper]

Service

  • ARR Reviewer: {Sept, Oct, Nov} 2021, {March} 2022
  • Conference Reviewer: EMNLP {2021, 2022}, ACL {2020, 2021}, NAACL 2021, COLING {2020, 2022}, NLPCC 2020
  • Workshop Reviewer: TL4NLP @ NeurIPS 2022

Misc

  • I deeply understand how difficult it is for international students to get an NLP Ph.D. offer in the US. I have released my Statement of Purpose at cs-sop.org.
  • In 2021, I also created a target school list together with friends who were applying to CS Ph.D. programs. It includes deadlines and requirements for application materials. Although it is outdated, universities usually do not change their policies much, and we have included links to the detailed policies.
  • I have mentored, collaborated with, and advised many great students applying to ML/NLP Ph.D. programs. It turns out they all went to better places than I did :-). Proud of them!
  • If you are an undergraduate or master's student at UChicago and are interested in working with me, please contact my advisors.