Radiology is essential to accurately diagnosing diseases and assessing responses to treatment. There is, however, a global shortage of radiologists, and a number of Artificial Intelligence (AI) solutions are being developed in response. These AI radiological solutions face challenges of their own: there is no benchmarking and evaluation standard, and it is difficult to collect data diverse enough to truly assess how well such systems generalise and handle edge cases. We propose a radiograph-agnostic platform and framework that would allow any AI radiological solution to be assessed on its ability to generalise across diverse geographical locations, genders and age groups.