Reliable ICT Infrastructure a condition for research data sharing – African NRENs to play an important role
Whether it is called a guideline, a roadmap or a framework, all participants at the AOSP ICT Infrastructure meeting held on 14 May 2018 in Pretoria, South Africa agreed that a document guiding African countries in preparing ICT infrastructures in support of research data sharing will benefit all. The one-day meeting brought together key stakeholders. Attendees from African regional NRENs (National Research and Education Networks) included Dr Pascal Hoba (Chief Executive Officer, UbuntuNet Alliance), Dr Ousmane Moussa Tessa (Chief Executive Officer, NigerREN, and member of the WACREN Board, attending on behalf of Dr Boubakar Barry, Executive Director, WACREN), Dr Yousef Torman (Managing Director, ASREN) and Dr Leon Staphorst (Executive Director, SANReN).
The objective of this meeting was to help NRENs better understand the needs of collaborative, data-intensive research projects, and for NRENs to consider future service delivery in support of research data. The three projects represented were H3ABioNet (Prof Nicky Mulder, Head: Computational Biology, UCT & Lead: H3ABioNet), GBIF (Dr Mélianie Raymond, Senior Programme Officer for Node Development, GBIF Secretariat) and the Square Kilometre Array (Dr Jasper Horrell, representing the Square Kilometre Array Organisation, SA).
The GBIF Integrated Publishing Toolkit (IPT) is a free, open source software tool used to publish and share biodiversity datasets through the GBIF network. The IPT can also be configured with either a DataCite or EZID account to assign DOIs to datasets, which turns it into a data repository. In her presentation, Dr Raymond indicated that more portals, laptops/workstations and IPT installations for selected nodes are required to enhance the sharing and visibility of biodiversity data. Capacity building needs include training researchers, students, lecturers and others in digitization, data cleaning, data publishing, and data analysis, towards more relevant and sustainable data use in support of decision making on biodiversity conservation.
H3ABioNet provides support for the Human Heredity and Health in Africa (H3Africa) Consortium, which focuses on the study of genomics and environmental determinants of common diseases, with the goal of improving the health of African populations. Prof Mulder shared the limitations that apply when sharing human data, and stressed the importance of protecting the rights and privacy of human subjects who participate in research studies. The project follows a well-established workflow using open source software tools, with policies built into the various stages of working with the data. As with biodiversity, skills need to be constantly developed, and infrastructure needs to be maintained and upgraded. A challenge faced by funded projects is that collected data still needs to be curated once a project comes to an end. It is for governments to discuss whether such data is regarded as a national asset, and who will fund its long-term curation.
According to Dr Jasper Horrell from the Inter-University Institute for Data Intensive Astronomy (IDiA), key science on the SKA will be achieved through large-scale survey programs executed by globally distributed teams of researchers, which will generate massive amounts of data. IDiA has established a cloud computing system that uses the OpenStack Infrastructure as a Service framework. OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface. This is ideal for the large amounts of data that the telescopes are expected to collect.
The regional NRENs represented indicated that they fully support working with AOSP on developing and populating a framework as part of service delivery to their research communities, and that they will also invite national NRENs in their respective regions to explore opportunities. Important elements to be included in such a document have been identified, and the group will continue as a working group, building on what is already in place through the SADC Cyberinfrastructure Framework, of which Prof Colin Wright provided an overview. This framework was approved by SADC ministers in June 2016. The next step would be to revisit it and adapt it, where needed, for the whole of Africa, with input from key stakeholders across the continent. It was also clear that, through possible partnerships and lessons learned from KENET, Ilifu, DIRISA, Sci-GaIA and others, the design, development and implementation of ICT infrastructures in support of data sharing and curation can become a reality sooner rather than later.
The AOSP ICT Infrastructure Framework will be tested at various stages and across different domains before it is finalized and shared with African countries interested in advancing the sharing and responsible management of data.
Also view the following presentations:
The African Open Science Platform/Susan Veldsman
Introduction to GBIF for the African Open Science Platform/Mélianie Raymond
Research Infrastructures H3ABioNet Case Study/Nicky Mulder
Data Infrastructure Development for the SKA/Jasper Horrell
Reflections on the SADC Infrastructure Framework/Colin Wright