FAIR data and the impact of AI on research: a workshop in Milan discussed why it is important to link the two
Why is FAIR data essential in research? Which European Open Science Cloud (EOSC) services can improve FAIR data production? How is artificial intelligence (AI) impacting scientific fields and how does FAIR data help train AI? These and other questions were discussed by representatives of European research infrastructures and members of the ESFRI-EOSC working group at the workshop “FAIR Data Productivity and Advanced Digitisation” organised by the European Strategy Forum on Research Infrastructures (ESFRI), held on 23-24 January at the University of Milan.
The purpose of the workshop was to gather the views of experts from research infrastructures on working with FAIR data and the advanced digitalisation of research, along with their ideas on how to implement the EOSC and Open Science initiatives successfully. Experts from European research infrastructures (RIs) such as EPOS-ERIC, ESRF, CESSDA ERIC, CERN, ESCAPE, BBMRI-ERIC, CSIC, EGI Foundation, MARIS, CNRS LAPP, OSCARS, SoBigData and the Computational Biology Research Centre of Human Technopole, together with an e-IRG delegate and representatives of EOSC and the European Commission, discussed the implications of linking AI and FAIR data.
“The trustworthiness of data in research is absolutely crucial and AI could become a tool to test the quality and originality of scientific data,” said Giorgio Rossi, Professor of Physics at the University of Milan and former ESFRI Chair, who also represents Italy on the EOSC Steering Committee.
EOSC for science or science for EOSC?
During the workshop, speakers shared good practices in working with the data and services provided by EOSC. Representatives of European research infrastructures named three things as essential: serving their scientific user communities, advanced metadata, and maintaining collaboration within domain-specific research communities that follow the FAIR data principles. Conversely, the bottlenecks limiting data productivity include a lack of training, staffing problems, data that are not open to all users, the FAIRness of new data, and hidden data that are available only on request, which prevents users from combining data and extracting new information from FAIR datasets.
Services provided by EOSC that could improve FAIR data include the introduction of data interoperability standards and support for their adoption and use, dedicated training programmes, and EU-level facilities such as the EOSC data space.
“In general, already 50 per cent of the data in the production or post-production phase carries the FAIR data label,” Rossi stressed.
FAIR research objects are also of interest to industry as marketable products (one example is the images from the European Synchrotron Radiation Facility in Grenoble).
Different approaches
Speaking for the Consortium of European Social Science Data Archives (CESSDA) was Jindřich Krejčí from the Institute of Sociology of the CAS, who emphasised the tradition of data sharing in sociological research and the bottom-up approach to comparative research and data reuse. “Building the concept of a shared data culture is an inherent part of research,” said Krejčí.
Petr Holub from the Biobanking and Biomolecular Resources Research Infrastructure (BBMRI) and BioMedAI approached the issue from a different angle. He stressed that artificial intelligence is becoming ever more important in healthcare research. AI plays an indispensable role in digital pathology, where it facilitates cancer diagnosis and treatment. It also helps with anonymising and synthesising data for publication, although dealing with sensitive data remains a challenge. Holub’s talk emphasised the need for EOSC to play a broader role than just that of a funding source.
Jan Hrušák from the J. Heyrovsky Institute of Physical Chemistry of the CAS (HIPC), who moderated the final panel discussion on AI, said: “By leveraging the capabilities of AI in research infrastructures for data processing, scientists can unlock new research opportunities, streamline workflows, and accelerate discoveries in various scientific fields. The integration of AI in research infrastructures and data processing offers numerous benefits, but it also brings with it several challenges. There is a shortage of professionals with the expertise to develop, implement, and maintain AI systems, and training and recruiting personnel skilled in AI technologies for research purposes remains a challenge.”