Guidelines for the development, evaluation, and implementation of AI-based prediction models in healthcare

Medicine is developing continuously, and clinical practice generates huge amounts of data, so there is an urgent need to make better use of clinicians' working time. One way to do this is to analyze clinical data with predictive models based on machine learning and artificial intelligence (AI). These models predict the presence or future occurrence of a specific outcome (e.g., a condition or disease) from specific inputs (e.g., patient characteristics or medical images). While the results of this approach are promising, an AI-based prediction model (AIPM) requires a careful quality assessment before it can be applied in daily practice.

To protect patient safety and ensure high-quality analysis, it is necessary to adhere to the guidelines and quality criteria for the development, evaluation, and implementation of AIPMs. To facilitate access to these guidelines, a group of scientists conducted a literature review of 72 selected papers and published the results in the journal npj Digital Medicine (DOI: 10.1038/s41746-021-00549-7). Based on their analysis, Anne A.H. de Hond et al. distinguished six stages in the AIPM development, evaluation, and implementation cycle, which provide a framework for the responsible adoption of AI-based predictive models in healthcare.

The first phase of this cycle is data preparation. The article considers this stage in several areas (medical context, patient privacy, sample size, representativeness, data quality, data pre-processing, and data coding standards), which are summarized below.

Medical context

According to recommendations in the literature, before developing an AIPM, you should define the medical problem and the context in which the product is to be used. You should also investigate the current standard of care thoroughly and demonstrate that there is a legitimate purpose for using an AIPM in your area. The patients' needs should be analyzed, as should the health actions (therapies and interventions) that will be taken based on AIPM predictions. In addition, the criteria for clinical success should be defined, including an analysis of the potential risks of prediction errors. At this stage, the developers should also weigh the benefits of the AIPM against the costs of its development and maintenance and the consequences of its misuse.

Patient privacy

Data should be stored in compliance with the applicable privacy legislation, such as GDPR, PIPEDA, or HIPAA, and in accordance with local regulations. It was also underlined that, where necessary, data protection specialists should be consulted. The authors of the publication paid special attention to existing data that was originally collected for a purpose other than AIPM development and that a team would like to reuse to develop its own product.

Sample size and representativeness

To protect privacy, some recommendations suggest minimizing the amount of data held, encrypting it, or applying pseudonymization or anonymization methods. At the same time, the amount of data collected should be reported and should be large enough to achieve the intended purpose, which is specific to each AIPM.
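
As an illustration of the pseudonymization and data minimization mentioned above, here is a minimal Python sketch. It is not taken from the reviewed article: the field names, the example records, and the key are hypothetical, and the keyed hash (HMAC-SHA256) is just one possible pseudonymization technique. Whether such a scheme is sufficient in a given legal setting is a question for a data protection specialist.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption for this sketch); in practice it
# would be stored separately from the data and managed securely.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for a patient identifier using HMAC-SHA256."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Shortened for readability; keep the full digest if collisions matter.
    return digest[:16]


# Hypothetical example records.
records = [
    {"patient_id": "PAT-0001", "name": "A. Smith", "age": 64, "heart_rate": 72},
    {"patient_id": "PAT-0002", "name": "B. Jones", "age": 51, "heart_rate": 88},
    {"patient_id": "PAT-0001", "name": "A. Smith", "age": 64, "heart_rate": 75},
]

# Data minimization: keep only the fields the model needs, plus the pseudonym.
pseudonymized = [
    {"pid": pseudonymize(r["patient_id"]), "age": r["age"], "heart_rate": r["heart_rate"]}
    for r in records
]
```

The same patient always maps to the same pseudonym, so records can still be linked, while the original identifier and the name are no longer stored with the measurements.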

Moreover, representativeness is reported to have a great influence on the evaluation and prevention of algorithmic bias and poor calibration. Therefore, it is important to provide data that are representative of the target population, with appropriate heterogeneity and diversity in terms of data collection time, place and setting, gender, age, ethnicity, medical history, and inclusion and exclusion criteria.
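
One simple way to look at representativeness is to compare how often each subgroup appears in the collected data with its share in the target population. The sketch below is only an illustration under assumptions: the age bands, the target shares, and the 10 percentage point tolerance are made up, and the function names are hypothetical.

```python
from collections import Counter


def subgroup_shares(values):
    """Share of each subgroup among the collected records."""
    counts = Counter(values)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gaps(sample_values, target_shares, tolerance=0.10):
    """Return subgroups whose observed share differs from the target share
    by more than `tolerance` (an arbitrary threshold chosen for this sketch)."""
    sample = subgroup_shares(sample_values)
    gaps = {}
    for group, expected in target_shares.items():
        observed = sample.get(group, 0.0)
        if abs(observed - expected) > tolerance:
            gaps[group] = (observed, expected)
    return gaps


# Hypothetical data set of 100 patients and hypothetical target population shares.
sample_age_bands = ["<40"] * 10 + ["40-64"] * 60 + ["65+"] * 30
target_age_shares = {"<40": 0.15, "40-64": 0.40, "65+": 0.45}

gaps = representation_gaps(sample_age_bands, target_age_shares)
for group, (observed, expected) in gaps.items():
    print(f"{group}: observed {observed:.2f}, expected {expected:.2f}")
```

In this made-up example, the 40-64 band is over-represented and the 65+ band is under-represented. The same check can be repeated for the other dimensions listed above, such as gender or place of data collection.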

Data quality

The quality of the collected data should be verified by inspecting and describing missing data. Potential measurement errors should be considered, including whether they arise randomly or systematically. Moreover, a clear definition and measurement method should be provided for each variable, including a specification of the devices used to carry out the measurements. It is also recommended that the data be checked further by spot-checking random samples for errors. The authors also reported the benefits of establishing an error correction process to be used while the model is created and implemented. Particular attention is paid to data quality when the data have been labeled manually. In such a situation, it is recommended that the experience of the annotators be carefully checked and that difficult cases be discussed. It is best to work with independent experts who are not involved in the AIPM assessment when doing this.
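
Some of the checks described above can be sketched in a few lines of Python. The example below is an illustration under assumptions, not a procedure from the reviewed article: the field names and records are hypothetical, and Cohen's kappa is used here as one common way to summarize agreement between two annotators of manually labeled data.

```python
import random
from collections import Counter


def missing_report(rows, fields):
    """Fraction of rows in which each field is missing (None)."""
    n = len(rows)
    return {f: sum(1 for r in rows if r.get(f) is None) / n for f in fields}


def audit_sample(rows, k, seed=0):
    """Random subset of rows to be checked by hand for errors."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def cohen_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:  # both annotators used one and the same label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical records with missing measurements.
rows = [
    {"id": 1, "systolic_bp": 120, "weight_kg": 80.5},
    {"id": 2, "systolic_bp": None, "weight_kg": 72.0},
    {"id": 3, "systolic_bp": 135, "weight_kg": None},
    {"id": 4, "systolic_bp": None, "weight_kg": 90.1},
]
report = missing_report(rows, ["systolic_bp", "weight_kg"])
to_review = audit_sample(rows, k=2)

# Hypothetical labels from two annotators ("N" = normal, "AF" = atrial fibrillation).
annotator_a = ["N", "N", "AF", "AF", "N", "AF", "N", "N"]
annotator_b = ["N", "N", "AF", "N", "N", "AF", "N", "AF"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

A low kappa does not say which annotator is right; it only signals that the labeling instructions or the difficult cases deserve a discussion.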

Data pre-processing and data coding standards

Data pre-processing steps should be used to prepare the data for subsequent phases, which may include dividing the collected data into subsets such as training, tuning, and test sets. Moreover, pre-processing may include data augmentation, the removal of outliers, the re-coding or transformation of variables, and the imputation of missing data. All such procedures should be described in detail in the procedural documentation.
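
The splitting and imputation steps mentioned above could look roughly like the following sketch. The details are assumptions made for illustration: the 70/15/15 split, the median imputation, the field names, and the choice to split by patient so that one patient's records never end up in more than one subset. The imputation value is estimated on the training set only, so that information from the tuning and test sets does not leak into model development.

```python
import random
import statistics


def split_by_patient(patient_ids, fractions=(0.7, 0.15, 0.15), seed=42):
    """Assign each unique patient to the training, tuning, or test set."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_train = int(len(unique_ids) * fractions[0])
    n_tune = int(len(unique_ids) * fractions[1])
    train = set(unique_ids[:n_train])
    tune = set(unique_ids[n_train:n_train + n_tune])
    test = set(unique_ids[n_train + n_tune:])  # everything that is left
    return train, tune, test


def fit_median(values):
    """Median of the observed (non-missing) values."""
    return statistics.median([v for v in values if v is not None])


# Hypothetical records: 20 patients, every fifth heart rate is missing.
records = [
    {"patient_id": f"P{i:02d}", "heart_rate": None if i % 5 == 0 else 60 + i}
    for i in range(20)
]

train_ids, tune_ids, test_ids = split_by_patient([r["patient_id"] for r in records])

# Estimate the imputation value on the training set only.
train_values = [r["heart_rate"] for r in records if r["patient_id"] in train_ids]
median_hr = fit_median(train_values)

# Apply the same value to all subsets.
imputed = [
    dict(r, heart_rate=median_hr if r["heart_rate"] is None else r["heart_rate"])
    for r in records
]
```

Whatever choices are made here (split ratios, random seed, imputation method), they belong in the procedural documentation together with the other pre-processing steps.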

To facilitate interoperability and the adoption of AIPM in healthcare facilities, it was recommended that data management be aligned with appropriate coding standards and widely accepted protocols.

The entire data preparation process is laborious and demanding. In addition to the data preparation phase, the scientists also identified five subsequent phases: AIPM development, AIPM validation, software application development, AIPM impact assessment, and AIPM implementation into daily health practice. Given the interest of the health service in new technologies and the undeniable possibilities of AIPMs, such studies are a great help for those involved in the development, evaluation, and implementation of AIPMs.

Details of data preparation as well as the next stages of AIPM implementation can be found in the article: Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review.

Data at Cardiomatics

Cardiomatics is a certified medical device (class IIa), systematically verified in accordance with the standard EN 60601-2-47:2015. Being compliant with the General Data Protection Regulation (GDPR), Cardiomatics ensures the security of personal data processing. Due to the dynamic development of the product, AI models are assessed on different test sets. The amount of data, including training data, is constantly growing. Thanks to this, inferences about algorithm performance are made on data that directly reflect the diversity not only of patients but also of Holter recorders. Precisely annotated signals, manually labeled and then checked and verified by two independent medical specialists, are collected in large sets, which are used to train the algorithm. In accordance with best practice in data-centric AI, we attach great importance to data development. In addition to building new data sets, we are constantly improving existing ones.

Join the digital revolution in cardiology

The opportunities for machine learning and AI in healthcare are promising, but we at Cardiomatics know very well that developing complex data-driven prediction models successfully requires careful quality and applicability assessments before those models are disseminated to professionals and applied in daily practice. We follow the latest guidelines and quality criteria related to the development, assessment, and implementation of AI on an ongoing basis.

In this text, we have shared some important tips for those closely involved in the development, evaluation, and implementation of AI-based prediction models (AIPM), including software engineers, data scientists, and healthcare professionals.

If you’ve read this text to the end and find it useful in your daily work, maybe Cardiomatics is a great place for you! Our team is constantly growing! Check out our current job offers!