About Me
Hi there! I am a CS PhD student at UC San Diego, where I am fortunate to be advised by Prof. Julian McAuley. My research interests are in vision & language, with a current focus on building and understanding multimodal LLMs. On the application side, my work spans conditioned text generation, autonomous agents, personalization & recommendation, and machine learning for healthcare.
Publications:
arXiv Preprints:
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
• An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, Jianfeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang
GPT-4V as a Generalist Evaluator for Vision-Language Tasks
• Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold
CLIP also Understands Text: Prompting CLIP for Phrase Understanding
• An Yan, Jiacheng Li, Wanrong Zhu, Yujie Lu, William Wang, Julian McAuley
Refereed Publications:
Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving
• Jessica Echterhoff, An Yan, Kyungtae Han, Amr Abdelraouf, Rohit Gupta, Julian McAuley
• Winter Conference on Applications of Computer Vision (WACV 2024)
Mitigating Spurious Correlations for Medical Image Classification via Natural Language Concepts
• An Yan, Yu Wang, Petros Karypis, Zexue He, Amilcare Gentili, Chun-Nan Hsu, Julian McAuley
• Conference on Neural Information Processing Systems (NeurIPS 2023) Medical Imaging Workshop
M4: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation
• Zexue He, Yu Wang, An Yan, Yao Liu, Eric Y Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• Empirical Methods in Natural Language Processing (EMNLP 2023)
Learning Concise and Descriptive Attributes for Visual Recognition
• An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley
• International Conference on Computer Vision (ICCV 2023)
Personalized Showcases: Generating Multi-Modal Explanations for Recommendations
• An Yan, Zhankui He, Jiacheng Li, Tianyang Zhang, Julian McAuley
• The International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023)
Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
• Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2023)
ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
• Wanrong Zhu, Xin Wang, An Yan, Miguel Eckstein, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2023)
Disambiguating Medical Reports via Contrastive Knowledge Infusion
• Zexue He, An Yan, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• AAAI Conference on Artificial Intelligence (AAAI 2023)
Robust multi-view fracture detection in the presence of other abnormalities using HAMIL-Net
• Xing Lu, Eric Chang, Jiang Du, An Yan, Julian McAuley, Amilcare Gentili, Chun-Nan Hsu
• Military Medicine, 2023
RadBERT: Adapting Language Models to Radiology
• An Yan, Chun-Nan Hsu, Amilcare Gentili, Julian McAuley
• Journal of Radiology: Artificial Intelligence, 2022
Personalized Complementary Product Recommendation
• An Yan, Yan Gao, Chaosheng Dong, Jinmiao Fu, Tong Zhao, Yi Sun, Julian McAuley
• The ACM Web Conference (WWW 2022)
Semi-supervised Multi-Label Classification with 3D CBAM Resnet for Tuberculosis Cavern Report
• Xing Lu, An Yan, Eric Y Chang, Chun-Nan Hsu, Julian McAuley, Jiang Du, Amilcare Gentili
• Conference and Labs of the Evaluation Forum (CLEF-2022)
Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation
• An Yan, Zexue He, Xing Lu, Jiang Du, Eric Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
• Empirical Methods in Natural Language Processing (EMNLP 2021)
Describing Visual Differences Needs Semantic Understanding of Individuals
• An Yan, Xin Wang, Tsu-Jui Fu, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2021)
Multimodal Style Transfer Learning for Outdoor Vision-and-Language Navigation
• Wanrong Zhu, Xin Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Wang
• European Chapter of the Association for Computational Linguistics (EACL 2021)
2D Convolutional Neural Networks for Sequential Recommendation
• An Yan, Shuo Cheng, Wang-Cheng Kang, Mengting Wan, Julian McAuley
• ACM International Conference on Information and Knowledge Management (CIKM 2019)
PA3D: Pose-Action 3D Machine for Video Recognition
• An Yan, Yali Wang, Zhifeng Li, Yu Qiao
• IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019)
Work Experience
Research Intern at Microsoft, Redmond, WA.
Hosts: Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Zicheng Liu, Lijuan Wang.
Working on new applications with GPT-4V.
Sep 2023 - Present.
Research Intern at Adobe, San Jose, CA.
Hosts: Raghav Addanki, David Arbour, Zhao Song, Tong Yu.
Gradient-based constrained sampling from LMs.
Jun 2023 - Sep 2023.
Research Intern at Meta, Menlo Park, CA.
Hosts: Cem Akkaya, Licheng Yu, Charlie Zhu, Yang Bai.
Multi-modal pre-training for ads understanding and generation.
Jun 2022 - Sep 2022.
Applied Scientist Intern at Amazon, Seattle, WA.
Hosts: Chaosheng Dong, Yan Gao, Jinmiao Fu, Tong Zhao.
Personalized complementary recommendation. The resulting paper was among the top 10 most viewed publications of 2022 at Amazon Science.
Jun 2021 - Sep 2021.
Applied Scientist Intern at Amazon, Santa Barbara, CA.
Hosts: Craig Bennett, Nic Jedema.
Alexa QA quality evaluation.
Jun 2020 - Sep 2020.
Education
University of California San Diego
Ph.D. & M.S. in Computer Science
Sep 2018 - Present.
University of Science and Technology of China
B.E. in Electronic Engineering & Information Science
Sep 2014 - Jun 2018.