Creating and using representative Big Open datasets: global challenges and promises

WHEN: 15:15-16:30 (GMT+9), June 24 (Monday)

LOCATION: Hall D 2


During this symposium, the Open Science SIG will explore the theme of “global challenges and promises of Big Open datasets” through insights shared by four speakers with extensive expertise within the open science community. Big neuroimaging datasets are increasingly being used to make rapid progress in discovering links between brain structure, brain function, and behaviour. These large datasets offer considerable statistical power for testing hypotheses and allow the use of more sophisticated modelling techniques. At the same time, the widespread use of relatively few large datasets, often Western and male, risks over-representing the characteristics of those particular participants and overlooking more diverse populations. As a consequence, it is unclear just how generalisable some of the identified principles of brain structure and function may be.

Through this series of talks, key speakers involved in large neuroimaging datasets will discuss some of the critical choice points in their design, current best practices for harmonising imaging sequences and other phenotypic data between datasets, guidelines for appropriate interpretations, and approaches to overcome existing sex and gender inequalities. Attendees will find out about the challenges facing large dataset initiatives from across the world (including Europe, Africa, North America, and Asia), and discover the variety of data now becoming available.

The recorded talks will be released on our YouTube and DouYu channels after the conference!

Global FAIR Brain Data; collaborations across high-, medium- and low-income countries.

Authors: Filima Patrick and Ebere Wogu (University of Port-Harcourt, Nigeria), Damian Eke (University of Nottingham, UK), and Franco Pestilli (speaker; University of Texas, USA)


Neuroimaging research is largely confined to high-income countries. A lack of training and infrastructure, together with sociocultural barriers, has limited data collection, analysis, and sharing in low- and medium-income countries. The FAIR principles for data stewardship have had a profound influence on research (Wilkinson et al., 2016), but remain effectively a privileged concept for high-income countries. Shared neuroimaging datasets still capture only a limited part of the world population and its heterogeneity and diversity. Brain datasets from low- and middle-income countries, such as those on the African continent, are still missing from the global research ecosystem. Global brain research outputs and neurotechnologies are largely informed only by datasets collected from populations in the global north. The lack of datasets from the global south has scientific and translational implications: it can hinder the development of therapies, limit innovation, and restrict the generalization of findings to the global population.


We will describe the Nigerian Brain Dataset: the first neuroimaging dataset publicly shared from Nigeria. We will describe some of the characteristics of this clinical-quality dataset from a low-income country, as well as the barriers to collecting, organizing, and sharing it. We will propose possible ways of mitigating these challenges in an attempt to contribute to advancing FAIR brain data in Africa. We will discuss the mitigation proposal in the context of the recently started Brain Research International Data Governance Exchange (BRIDGE) project (bridge.incf.org). Funded by the Wellcome Trust, BRIDGE aims to study the legal, ethical, and technical infrastructure challenges and to develop tools for data governance that facilitate data sharing across low-, medium- and high-income countries. The project also aims to establish training, education, and research collaborations between African, Latin American, European, and North American countries.

Releasing the 3R-BRAIN resource: A decade of journey to harness psychometrics for neuroscience.

Speaker: Xi-Nian Zuo (Beijing Normal University)

Reproducibility, replicability and reliability (3R) remain challenging for cognitive neuroscience, even as psychometric theory is increasingly appreciated by the community. However, systematic psychometric assessments are sparse due to the lack of a well-designed, large-scale neuroimaging resource. I will introduce a big open dataset, 3R-BRAIN, that fills this gap. This open resource contains three parts, each richly sampled at the individual level, that account for measurement variability across scanners, time points, magnetic field strengths, and task designs. I will officially announce the release of 3R-BRAIN.

UK Biobank: A Big Open dataset with global challenges and enormous promise

Speaker: Ollie Gray (UK Biobank)


In 2014, UK Biobank started the world’s largest multi-modal imaging study, with the aim of acquiring brain, cardiac and abdominal magnetic resonance imaging, dual-energy X-ray absorptiometry and carotid ultrasound data from 100,000 participants. To further enhance the phenotypic characterisation of the cohort, we are now in the process of inviting 60,000 participants back for a longitudinal repeat of the imaging assessment.


The availability of exquisitely detailed imaging data at scale has enabled an increasingly global research community to develop a growing range of image processing algorithms and pipelines. This community interacts with the dataset to generate a growing number of imaging-derived phenotypes, which are subsequently integrated into the UK Biobank resource and released regularly back to the community.


We will share the challenges of both conducting a longitudinal study at scale and analysing the acquired data, and describe our solutions. We will describe how the UK Biobank’s cloud research environment, the Research Analysis Platform, provides computational power, funding, and training potential to drive scientific progress and enable global collaborations. Finally, we will provide examples of significant recent developments to the resource and highlight gains in scientific insight from cross-modal investigations into the plethora of UK Biobank imaging, genetic, and linked health data.

Bridging Gaps in Women’s Health Research: The ENIGMA Neuroendocrinology Working Group

Speaker: Carina Heller (ENIGMA Neuroendocrinology Working Group)


The persistent neglect of women’s health in research poses a significant barrier to effective diagnostic and treatment strategies [1]. Despite efforts to include sex as a biological variable (SABV) and integrate sex- and gender-based analysis (SGBA) into study designs, inequalities persist in both research and medical practice [2]. Notably, female patients are more likely to experience adverse drug effects than their male counterparts [3]. Moreover, only 5% of neuroscience and psychiatry studies in 2019 statistically examined the influence of sex and gender, emphasizing the ongoing gap in understanding these critical factors [4]. Additionally, a funding disparity in research exists between conditions that predominantly affect women, such as premenstrual dysphoric disorder, and those that affect both men and women (NIH RePORTER search).


Sex hormones such as estrogens, androgens, and progesterone play a crucial role in shaping the female brain throughout the lifespan. Important organizational effects occur during perinatal stages and transition phases such as puberty and pregnancy [5]. Hormonal fluctuations exert both short-term and long-lasting effects on brain structure and function, influencing mental health and contributing to mood disturbances, especially during those transition periods [6].


Research is at a pivotal crossroads, with advancements in technology and methods enabling researchers to investigate the biopsychological effects of sex hormones across the female lifespan. Recognizing the limitations of the small, cross-sectional datasets of the past, the ENIGMA Neuroendocrinology Working Group pools data from around the world to investigate the effects of hormones on the female brain in large datasets, particularly in under-studied conditions. The Lancet noted ENIGMA as an innovative model where “Crowdsourcing meets Neuroscience” [7]. By bridging historical data gaps and fostering collaboration, the scientific field moves closer to a deeper understanding of the biopsychological effects of hormones, a crucial step in promoting holistic healthcare for women across the lifespan.

References:
  1. Mauvais-Jarvis, F. et al. Sex and gender: modifiers of health, disease, and medicine. Lancet 396, 565–582 (2020).
  2. White, J., Tannenbaum, C., Klinge, I., Schiebinger, L. & Clayton, J. The Integration of Sex and Gender Considerations Into Biomedical Research: Lessons From International Funding Agencies. J. Clin. Endocrinol. Metab. 106, 3034 (2021).
  3. Karlsson Lind, L., Rydberg, D. M. & Schenck-Gustafsson, K. Sex and gender differences in drug treatment: experiences from the knowledge database Janusmed Sex and Gender. Biol. Sex Differ. 14, 1–4 (2023).
  4. Rechlin, R. K., Splinter, T. F. L., Hodges, T. E., Albert, A. Y. & Galea, L. A. M. An analysis of neuroscience and psychiatry papers published from 2009 and 2019 outlines opportunities for increasing discovery of sex differences. Nat. Commun. 13, 1–14 (2022).
  5. Rehbein, E., Hornung, J., Sundström Poromaa, I. & Derntl, B. Shaping of the Female Human Brain by Sex Hormones: A Review. Neuroendocrinology 111, 183–206 (2021).
  6. Barth, C., Crestol, A., Lange, A.-M. G. de & Galea, L. A. M. Sex steroids and the female brain across the lifespan: insights into risk of depression and Alzheimer’s disease. Lancet Diabetes Endocrinol. 0, (2023).
  7. Mohammadi, D. ENIGMA: crowdsourcing meets neuroscience. Lancet Neurol. 14, 462–463 (2015).


Learning Objectives:

  • Provide a state-of-the-art overview of how large open datasets were created, offering insight to those who plan to build their own large open neuroimaging datasets.
  • Shine a light on the need for more representative samples from which to draw general conclusions about the mechanisms underlying human brain structure and function. To this end, the symposium will highlight the various open datasets that are already available globally.
  • Provide practical guidance for ECRs and more established researchers on how to start using these databases, with examples of how their use can supplement and enhance smaller, more targeted research studies.