DALPHIN

Background

Artificial Intelligence in the form of chatbots is an emerging reality. Equipped with vision-language capabilities, models like GPT-4o or PathChat can process both images and text, opening up the possibility of multimodal virtual assistants in healthcare. While there is an urgent need and a growing expectation to adopt digital assistants to support clinical diagnostics, clinicians and researchers must critically examine chatbots’ ability to answer diagnostic questions.

Aim

The aim of DALPHIN (DigitAL PatHology assIstant beNchmark) is to create a multicentric open benchmark for virtual assistants applied to diagnostic problems in digital pathology. Pathologists from multiple clinical centers will provide cases across various pathology subspecialties, each consisting of histopathology regions of interest (ROIs), questions, and answers. We will assess the performance of both general-purpose and pathology-specific chatbots on our benchmark and compare it with that of pathologists with different levels of expertise. Ultimately, we plan to publicly release the benchmark on the Grand-Challenge platform, where submissions will be evaluated automatically and ranked on a leaderboard.

People

Francesco Ciompi

Associate Professor

Nadieh Khalili

Workgroup Lead

Carlijn Lems

PhD Candidate

Frédérique Meeuwsen

Pathologist and Postdoctoral Researcher

Milda Pocevičiūtė

Postdoctoral Researcher

Natalie Klubickova

Visiting Researcher

Alon Vigdorovits

Visiting Researcher