Vision-language Model for Medical Images and Reports
vision-language model, report generation, medical foundation models integrated with LLM
 
By combining the power of language and vision, large-scale vision-language models could unlock exciting possibilities for future applications.
For instance, we could learn more powerful image representations by leveraging the rich information in free-text reports.
Furthermore, these models can be trained to generate descriptive captions for medical images, facilitating automated radiology report generation.