Beyond generalist LLMs: building and validating domain-specific models with the SpAMCQA benchmark

Figure 4. Statistical Validation of the Private Clinical Dataset. This figure provides a statistical analysis of the clinical records curated for domain-specific fine-tuning, supporting the dataset’s clinical validity and representativeness. (A) The age distribution of SpA patients peaks in the 30-39 age group, aligning with the known epidemiology of disease onset[2]; (B) The gender ratio indicates a male predominance (62% vs. 38%), reflecting the established higher prevalence of ankylosing spondylitis in males[2]; (C) The disease composition of the fine-tuning dataset is intentionally diverse, dominated by SpA subtypes but also including key differential diagnoses such as rheumatoid arthritis, a case mix that mirrors the diagnostic challenge of a typical rheumatology clinic; (D) The positivity rates for key diagnostic biomarkers, Human Leukocyte Antigen B27 (HLA-B27) at 88% and elevated C-Reactive Protein (CRP) at 65%, are consistent with reference values for SpA patient cohorts[28]. Collectively, these analyses indicate that our private dataset is a high-fidelity representation of the real-world clinical scenarios SpAD-LLM is designed to address. SpA: Spondyloarthritis; SpAD-LLM: Spondyloarthritis Diagnosis Large Language Model.

Artificial Intelligence Surgery
ISSN 2771-0408 (Online)
Portico

All published articles will be preserved here permanently:

https://www.portico.org/publishers/oae/
