About Me

Hi there! I am a CS PhD student at UC San Diego, where I am fortunate to be advised by Prof. Julian McAuley. My research interests are in vision & language, with a current focus on building and understanding multimodal LLMs. On the application side, my work spans conditioned text generation, autonomous agents, personalization & recommendation, and machine learning for healthcare.

Publications:

arXiv Preprints:

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
• An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, Jianfeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang

GPT-4V as a Generalist Evaluator for Vision-Language Tasks
• Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold

CLIP also Understands Text: Prompting CLIP for Phrase Understanding
• An Yan, Jiacheng Li, Wanrong Zhu, Yujie Lu, William Wang, Julian McAuley

Refereed Publications:

Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving
• Jessica Echterhoff, An Yan, Kyungtae Han, Amr Abdelraouf, Rohit Gupta, Julian McAuley
• Winter Conference on Applications of Computer Vision (WACV 2024)

Mitigating Spurious Correlations for Medical Image Classification via Natural Language Concepts
• An Yan, Yu Wang, Petros Karypis, Zexue He, Amilcare Gentili, Chun-Nan Hsu, Julian McAuley
• Conference on Neural Information Processing Systems (NeurIPS 2023) Medical Imaging Workshop

M4: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation
• Zexue He, Yu Wang, An Yan, Yao Liu, Eric Y Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• Empirical Methods in Natural Language Processing (EMNLP 2023)

Learning Concise and Descriptive Attributes for Visual Recognition
• An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley
• International Conference on Computer Vision (ICCV 2023)

Personalized Showcases: Generating Multi-Modal Explanations for Recommendations
• An Yan, Zhankui He, Jiacheng Li, Tianyang Zhang, Julian McAuley
• The International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023)

Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
• Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2023)

ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
• Wanrong Zhu, Xin Wang, An Yan, Miguel Eckstein, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2023)

Disambiguating Medical Reports via Contrastive Knowledge Infusion
• Zexue He, An Yan, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• AAAI Conference on Artificial Intelligence (AAAI 2023)

Robust multi-view fracture detection in the presence of other abnormalities using HAMIL-Net
• Xing Lu, Eric Chang, Jiang Du, An Yan, Julian McAuley, Amilcare Gentili, Chun-Nan Hsu
• Military Medicine, 2023

RadBERT: Adapting Language Models to Radiology
• An Yan, Chun-Nan Hsu, Amilcare Gentili, Julian McAuley
• Journal of Radiology: Artificial Intelligence, 2022

Personalized Complementary Product Recommendation
• An Yan, Yan Gao, Chaosheng Dong, Jinmiao Fu, Tong Zhao, Yi Sun, Julian McAuley
• The ACM Web Conference (WWW 2022)

Semi-supervised Multi-Label Classification with 3D CBAM Resnet for Tuberculosis Cavern Report
• Xing Lu, An Yan, Eric Y Chang, Chun-Nan Hsu, Julian McAuley, Jiang Du, Amilcare Gentili
• Conference and Labs of the Evaluation Forum (CLEF 2022)

Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation
• An Yan, Zexue He, Xing Lu, Jiang Du, Eric Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• Empirical Methods in Natural Language Processing (EMNLP 2021)

Describing Visual Differences Needs Semantic Understanding of Individuals
• An Yan, Xin Wang, Tsu-Jui Fu, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2021)

Multimodal Style Transfer Learning for Outdoor Vision-and-Language Navigation
• Wanrong Zhu, Xin Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2021)

2D Convolutional Neural Networks for Sequential Recommendation
• An Yan, Shuo Cheng, Wang-Cheng Kang, Mengting Wan, Julian McAuley
• ACM International Conference on Information and Knowledge Management (CIKM 2019)

PA3D: Pose-Action 3D Machine for Video Recognition
• An Yan, Yali Wang, Zhifeng Li, Yu Qiao
• IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019)

Work Experience

Research Intern at Microsoft, Redmond, WA.
Hosts: Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Zicheng Liu, Lijuan Wang.
Working on new applications with GPT-4V.
Sep 2023 - Present.

Research Intern at Adobe, San Jose, CA.
Hosts: Raghav Addanki, David Arbour, Zhao Song, Tong Yu.
Gradient-based constrained sampling from LMs.
Jun 2023 - Sep 2023.

Research Intern at Meta, Menlo Park, CA.
Hosts: Cem Akkaya, Licheng Yu, Charlie Zhu, Yang Bai.
Multi-modal pre-training for ads understanding and generation.
Jun 2022 - Sep 2022.

Applied Scientist Intern at Amazon, Seattle, WA.
Hosts: Chaosheng Dong, Yan Gao, Jinmiao Fu, Tong Zhao.
Personalized complementary recommendation. The resulting paper was among the top 10 most viewed publications of 2022 at Amazon Science.
Jun 2021 - Sep 2021.

Applied Scientist Intern at Amazon, Santa Barbara, CA.
Hosts: Craig Bennett, Nic Jedema.
Alexa QA quality evaluation.
Jun 2020 - Sep 2020.

Education

University of California San Diego
Ph.D. & M.S. in Computer Science
Sep 2018 - Present.

University of Science and Technology of China
B.E. in Electronic Engineering & Information Science
Sep 2014 - Jun 2018.