Search results (11)
A benchmark of expert-level academic questions to assess AI capabilities.
Journal article
Center for AI Safety. et al, (2026), Nature, 649, 1139 - 1146
MEDS: Building Models and Tools in a Reproducible Health AI Ecosystem
Conference paper
McDermott MBA. et al, (2025), Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2, 6243 - 6244
ACES: Automatic Cohort Extraction System for Event-Stream Datasets
Conference paper
Xu J. et al, (2025), 13th International Conference on Learning Representations (ICLR 2025), 66701 - 66716
CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback
Conference paper
Hein D. et al, (2025), Proceedings of the Annual Meeting of the Association for Computational Linguistics, 1, 27679 - 27702
Automated Structured Radiology Report Generation
Conference paper
Delbrouck JB. et al, (2025), Proceedings of the Annual Meeting of the Association for Computational Linguistics, 1, 26813 - 26829
RadEval: A framework for radiology text evaluation
Conference paper
Xu J. et al, (2025), Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 546 - 557
Tree-of-Quote Prompting Improves Factuality and Attribution in Multi-Hop and Medical Reasoning
Conference paper
Xu J. et al, (2025), Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 5605 - 5622
Overview of the First Shared Task on Clinical Text Generation: RRG24 and “Discharge Me!”
Conference paper
Xu J. et al, (2024), BioNLP 2024 - 23rd Meeting of the ACL Special Interest Group on Biomedical Natural Language Processing, Proceedings of the Workshop and Shared Tasks, 85 - 98
GREEN: Generative Radiology Report Evaluation and Error Notation
Conference paper
Ostmeier S. et al, (2024), EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024, 374 - 390
AnnoDash, a clinical terminology annotation dashboard.
Journal article
Xu J. et al, (2023), JAMIA Open, 6
QuizBot: A Dialogue-based Adaptive Learning System for Factual Knowledge
Conference paper
Ruan S. et al, (2019), Conference on Human Factors in Computing Systems Proceedings