The use of AI algorithms in health services has been widely explored to improve the diagnosis and treatment of diseases, but the success of these systems depends directly on the quality and privacy of the data used [1]. Medical images are particularly challenging because the data is highly complex and contains sensitive patient information. It is therefore crucial to take responsible data acquisition into account to ensure patient safety and the quality of the data used in AI systems. In this text, we address the three main aspects that must be considered for the responsible acquisition of medical imaging data [1,2].
1. Patient consent and privacy
The first aspect to be considered is the patient’s consent and privacy. It is critical to ensure that patients have given their informed consent to the use of their data and that their information is protected in accordance with applicable laws and regulations. To this end, the data must be anonymized and encrypted before being shared, and adequate security measures must be implemented to prevent unauthorized access to the data [3].
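As a concrete (and deliberately simplified) illustration, the sketch below pseudonymizes a patient record by replacing the direct identifier with a salted hash before sharing. The record layout and field names are hypothetical; a real medical imaging pipeline would additionally de-identify DICOM headers with dedicated tooling and apply proper encryption in transit and at rest.

```python
import hashlib
import secrets

# Assumed record layout, for illustration only; real medical images carry
# identifiers inside DICOM headers and need dedicated de-identification tools.
record = {"patient_id": "P-12345", "name": "Jane Doe", "scan": "chest_xray_001"}

def pseudonymize(record, salt):
    """Replace direct identifiers with a salted, irreversible hash.

    The salt must be kept secret and stored separately from the data;
    without it, the token cannot be linked back to the patient by a
    dictionary attack on known identifiers.
    """
    token = hashlib.sha256((salt + record["patient_id"]).encode()).hexdigest()[:16]
    return {"pseudo_id": token, "scan": record["scan"]}  # name is dropped entirely

salt = secrets.token_hex(16)   # generated once, kept apart from the shared data
shared = pseudonymize(record, salt)
print(shared)  # only an unlinkable token and the scan reference remain
```

Note that pseudonymization of this kind is weaker than full anonymization: whoever holds the salt can re-link records, which is sometimes desirable (e.g. to honor later consent withdrawal) but must itself be governed by access controls.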
In addition to data anonymization and encryption, there are other challenges that can make it difficult to obtain informed consent from patients for the use of their data in AI systems in healthcare. One of the main challenges is patients’ lack of understanding of the use of their data, which can lead to hesitation in providing the necessary consent. Additionally, some patients may not want to share their data due to privacy and confidentiality concerns [4].
Another challenge is the lack of uniformity in data privacy laws and regulations across countries and regions, which can lead to confusion about the legal obligations of organizations that collect patient data. For example, in the European Union, the General Data Protection Regulation (GDPR) establishes strict requirements for the protection of personal data, while in the United States, health data is covered by federal rules such as HIPAA and broader privacy legislation varies from state to state [5].
In addition, it is important to emphasize that obtaining informed consent is not a one-time process, but an ongoing one. Patients should have the option to withdraw their consent at any time, which could lead to problems with the continued use of their data in AI systems. Therefore, it is necessary to establish effective mechanisms to manage patient consent on an ongoing and transparent basis.
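One way to manage consent on an ongoing, transparent basis is to treat each grant or withdrawal as a timestamped event and let the most recent event decide whether a patient's data may be used. The minimal registry below is a hypothetical sketch (the class and method names are invented for illustration); a production system would persist these events in an auditable store and propagate withdrawals to downstream training sets.

```python
from datetime import datetime, timezone

class ConsentRegistry:
    """Track each patient's consent status over time.

    Hypothetical sketch: real systems need durable, auditable storage
    and a process for removing withdrawn data from trained models.
    """
    def __init__(self):
        self._events = {}  # patient_id -> list of (timestamp, granted?)

    def record(self, patient_id, granted):
        ts = datetime.now(timezone.utc)
        self._events.setdefault(patient_id, []).append((ts, granted))

    def has_consent(self, patient_id):
        events = self._events.get(patient_id)
        return bool(events) and events[-1][1]  # the latest event wins

registry = ConsentRegistry()
registry.record("P-001", granted=True)    # informed consent given
registry.record("P-001", granted=False)   # later withdrawn
print(registry.has_consent("P-001"))      # False: data must be excluded
```

Keeping the full event history, rather than a single boolean, is what makes the process transparent: it documents when consent was given and when it was withdrawn.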
2. Data quality
Data quality is a crucial aspect of the responsible acquisition of medical imaging data, as it directly influences the accuracy and reliability of predictions made by AI systems in healthcare. It is essential that the data used is of high quality and representative of the population in question for the AI to make accurate and reliable predictions [5-7].
To ensure that the data used is of high quality, rigorous data validation must be performed, which includes checking the quality of images and removing images with artifacts or technical issues. In addition, it is important to assess the diversity of the data in relation to factors such as age, gender, and ethnicity [5-7].
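A basic quality gate can already be expressed in a few lines. The sketch below flags images whose pixel variability is too low to contain useful anatomy, such as blank or failed acquisitions; the toy pixel lists and the threshold are purely illustrative assumptions, and production validation pipelines combine many more checks (artifact detectors, truncation, orientation, metadata consistency).

```python
from statistics import pstdev

# Toy grayscale "images" as flat pixel lists; real QC would operate on
# DICOM pixel arrays rather than hand-written values.
images = {
    "scan_a": [10, 200, 35, 180, 90, 140],     # normal contrast
    "scan_b": [128, 128, 128, 128, 128, 128],  # blank / failed acquisition
    "scan_c": [0, 0, 0, 0, 5, 0],              # nearly empty frame
}

def passes_qc(pixels, min_stdev=20.0):
    """Reject images whose pixel spread is too low to carry anatomy.

    A single threshold like this is only illustrative: it catches blank
    frames but not artifacts, which need dedicated detectors.
    """
    return pstdev(pixels) >= min_stdev

accepted = [name for name, px in images.items() if passes_qc(px)]
print(accepted)  # only scan_a survives the check
```

The point of automating even crude checks like this is that they run on every image, so obviously unusable acquisitions never reach the training set.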
Data diversity is critical to ensure that the AI system is able to generalize to the population in question, avoiding biases and prejudices that could affect the accuracy of predictions. For example, if the data used to train the AI system is predominantly from individuals of a certain age or gender, the system may struggle to make accurate predictions for other age groups or genders [5-7].
Assessing data diversity can also help identify health inequalities and disparities that can be addressed through specific interventions. For example, identifying inequalities in access to health services can lead to policies and programs that improve access for underrepresented groups. It is important to emphasize that data quality is not an aspect to be considered only at acquisition; it must be continuously evaluated throughout the training and deployment of the AI system [5-7].
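Diversity assessment can start with simple share-of-cohort statistics. The sketch below counts demographic groups in a training cohort and flags those falling below a chosen minimum share; the attribute schema and the 20% threshold are assumptions made for illustration, not a standard.

```python
from collections import Counter

# Hypothetical cohort metadata; attribute names are illustrative,
# not a standard clinical schema.
cohort = [
    {"age_group": "18-40", "sex": "F"}, {"age_group": "18-40", "sex": "M"},
    {"age_group": "41-65", "sex": "F"}, {"age_group": "41-65", "sex": "M"},
    {"age_group": "41-65", "sex": "M"}, {"age_group": "65+",   "sex": "M"},
    {"age_group": "18-40", "sex": "M"}, {"age_group": "41-65", "sex": "M"},
]

def underrepresented(records, attribute, min_share=0.2):
    """Return groups whose share of the cohort falls below min_share."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return sorted(g for g, n in counts.items() if n / total < min_share)

print(underrepresented(cohort, "age_group"))  # the 65+ group is flagged
print(underrepresented(cohort, "sex"))        # no group below the threshold
```

Re-running a check like this after every data refresh is one way to make diversity evaluation continuous rather than a one-time acquisition step.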
Therefore, rigorous data validation and assessment of data diversity are essential to ensure the quality of data used in AI systems in healthcare and to avoid biases and prejudices that could affect the accuracy of predictions. Ensuring data quality is an important responsibility of everyone involved in the acquisition and use of medical imaging data in AI systems in healthcare.
3. Ethics and social responsibility
In addition to the need to follow ethical guidelines for acquiring medical imaging data, there are other ethical limitations that need to be considered when implementing artificial intelligence systems in healthcare [8,9].
The first ethical limitation is the concern about the potentially discriminatory use of AI systems. If the data used to train the AI system is not diverse enough, the system can reproduce existing biases in society, resulting in prejudice or discrimination against certain groups. For example, an AI system that has been trained with data predominantly from patients of a certain ethnicity may struggle to make accurate predictions for patients of other ethnicities [8,9].
The second ethical limitation is the concern about the privacy and confidentiality of patient data. Collecting and using patients’ personal data without their informed consent violates their privacy and may be considered unethical. Furthermore, if the data is not properly protected, it can be accessed by unauthorized persons, leading to privacy violations and consequent harm to the patient’s reputation [8,9].
It is therefore essential that AI systems in healthcare are developed in a responsible and ethical manner, taking into account the diversity of data used, patient privacy, and the need to protect against prejudice and discrimination.
References:
1. Liao, C., Gao, X., & Tang, J. (2020). Framework for protecting patient data in the era of artificial intelligence. Journal of medical systems, 44(11), 1-11.
2. Li, J., Li, Y., Li, X., & Sun, Y. (2021). A survey of medical image datasets for machine learning applications. Journal of Healthcare Engineering, 2021.
3. Van Horn, J. D., Catanese, M., & Samuels, E. R. (2020). Ethics of artificial intelligence in radiology: summary of the joint European and North American multisociety statement. Radiology, 295(2), 318-319.
4. Cho, H., & Lee, J. (2017). Consent for the use of personal information for medical research in the age of big data research. Journal of Korean Medical Science, 32(9), 1353-1360.
5. Fernández-Alemán, J. L., Señor, I. C., Lozoya, P. Á. O., & Toval, A. (2013). Security and privacy in electronic health records: A systematic literature review. Journal of biomedical informatics, 46(3), 541-562.
6. Savage, N. (2019). Data privacy laws: the global landscape in 2019. Nature, 569(7755), 481-483.
7. Mittelstadt, B. D., & Floridi, L. (2016). The ethics of big data: current and foreseeable issues in biomedical contexts. Science and engineering ethics, 22(2), 303-341.
8. Greenberg, R. B. (2018). Privacy, autonomy, and dignity in the age of big data and AI. AMA Journal of Ethics, 20(10), E947-E953.
9. Wiens, J., Saria, S., Sendak, M., Ghassemi, M., Liu, V. X., Doshi-Velez, F., & Jung, K. (2019). Do no harm: a roadmap for responsible machine learning for health care. Nature Medicine, 25(9), 1337-1340.