"Improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses – including cancer, heart diseases, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia."
UK Biobank is a long-term prospective biobank study in the United Kingdom (UK)[1] that houses de-identified[2] biological samples and health-related data[3] on half a million people. Volunteer participants aged 40–69 were recruited between 2006 and 2010[1] from across Great Britain and consented to share their health data and to be followed for at least 30 years thereafter, with the aim of enabling scientific discoveries in the prevention, diagnosis, and treatment[3] of disease.
UK Biobank holds more than 10,000 variables of data on many of its 500,000 participants to inform research, including biological samples, physical measurements, body and brain imaging data, bone density data, activity tracking and lifestyle questionnaire data.[9] Participants continue to provide more data over time. It has over 15 million biological samples stored, which researchers can request for use, and its online database holds over 30 petabytes of data.[10][11][12] Its human genome sequencing database, proteomic database, and human imaging project are the largest in the world.[13][14] The project is enabling scientists to study the onset of diseases such as cancers, heart disease, and age-related conditions in the early stages of their development.[15][16][17] Nature has referred to UK Biobank as an "unprecedented open access database."[18]
Since 2012,[19] 30,000 researchers from over 90 countries[20] have registered to use UK Biobank. As of November 2023 there have been over 9,000[21] peer-reviewed publications using UK Biobank data, including over 3,000 in 2023.[22]
UK Biobank was conceived in the early 2000s,[24] with Professor Sir Rory Collins appointed as the Principal Investigator and Chief Executive of UK Biobank in 2005.[25] An incremental approach was adopted to developing the study procedures and technology, using systems designed and developed by the Clinical Trial Service Unit. This consisted of a series of pilot studies of increasing complexity and sophistication with interludes for assessment of results and additional scientific input. In-house trials were conducted during 2005, and a fully integrated clinic was run at Altrincham, Greater Manchester throughout Spring 2006 where 3,800 individuals were assessed. On 22 August 2006, it was announced that the main programme would recruit men and women aged between 40 and 69 based from up to 35 regional centres.[26]
Following the initial pilot stage in the 2005–06 period,[27] the main study began in April 2007 and by the end of that year 50,000 people had taken part. Recruitment reached 100,000 in April 2008, 200,000 in October 2008, 300,000 in May 2009, 400,000 in November 2009 and passed the 500,000 target in July 2010. The volunteers were largely healthy, wealthy and white European. Rather than recruiting more participants into the biobank, the organisation is helping other institutions establish and run similar initiatives.[28] Participant enrolment was declared complete in August 2010.[29] Recruitment proved more efficient than hoped: only 22 centres, rather than the up to 35 originally announced, had been opened when the recruitment target of 500,000 was reached in 2010.
In August 2022, UK Biobank celebrated its 20th anniversary.[30]
The study is following about 500,000 volunteers in the UK, enrolled at ages from 40 to 69. Initial enrolment took place over four years from 2006, and the volunteers will be followed for at least 30 years thereafter.[34]
Prospective participants were invited to visit an assessment centre, at which they completed an automated questionnaire and were interviewed about lifestyle, medical history and nutritional habits; basic variables such as weight, height and blood pressure were measured; and blood and urine samples were taken. These samples were preserved so that DNA could later be extracted and other biologically important substances measured. For the whole duration of the study, it was intended that all disease events, drug prescriptions and deaths of the participants would be recorded in a database, taking advantage of the centralised UK National Health Service.[35][36]
During the initial physical examination, basic feedback was provided to the participant regarding their weight, height, BMI, blood pressure, lung vital capacity, bone density and intra-ocular pressure; however if any other medical problems were detected, neither the participant nor their physician would be notified. Problems detected later, such as genetic risk factors, were not conveyed to either participant or physician ("to ensure that volunteers are not penalised by insurance companies, for example, which may require customers to disclose the results of any genetic tests.").[37]
From 2012, researchers were able to apply to use the database (though they are not given access to the volunteers, who will remain strictly anonymous). A typical study using the database might compare a sample of participants who developed a particular disease, such as cancer, heart disease, diabetes or Alzheimer's disease, with a sample of those that did not, in an attempt to measure the benefits, risk contribution and interaction of specific genes, lifestyles, and medications.
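The comparison described above is, in its simplest form, a case-control analysis. As a minimal sketch, the following Python computes an odds ratio with an approximate 95% confidence interval (Woolf's log method) from a 2×2 table of exposure by disease status. The function name `odds_ratio` and all counts are hypothetical and chosen only for illustration; they are not UK Biobank data, and real analyses would normally adjust for confounders with regression models.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Return (odds ratio, lower 95% bound, upper 95% bound) for a 2x2 table.

    Uses Woolf's method: the standard error of log(OR) is the square root
    of the sum of the reciprocals of the four cell counts.
    """
    ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log = math.sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    z = 1.96  # normal quantile for a two-sided 95% interval
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: participants who did (cases) or did not (controls)
# develop the disease, split by whether they carry some exposure.
ratio, lower, upper = odds_ratio(
    exposed_cases=120,
    unexposed_cases=380,
    exposed_controls=200,
    unexposed_controls=1300,
)
print(f"OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 whose interval excludes 1 would suggest that the exposure is associated with higher odds of the disease in this illustrative table; it does not by itself establish causation.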
In 2017, researchers were able to access the database including genetic information.[38][39] By 2017, Biobank participants had accumulated approximately 1.3 million hospitalisations, 40,000 incident cancers and 14,000 deaths.[40]
Since the completion of recruitment several new types of data have been added:
During 2011–12, participants who supplied an email address were asked to assist by completing web-based dietary questionnaires, with the aim of combining a series of daily 'snapshots' to form a picture of overall nutrition. In total, 176,012 of the participants responded at least once and 27,535 completed four questionnaires over a 16-month period.[40]
During 2012–13, 25,000 participants at the Stockport centre were asked to attend the assessment centre to repeat the initial measurements. It was intended to repeat these assessments every few years.[40]
From 2013 to 2015, Axivity AX3 wrist-worn tri-axial physical activity monitors were distributed to 100,000 participants; the devices recorded week-long tri-axial acceleration at 100 Hz.[41][42] These data were centrally processed and listed on the Data Showcase.[43][44]
In 2014 and 2015, 120,000 participants completed a questionnaire on cognitive functions. Four of the tests were repeats of the initial assessment and two tests (symbol digit substitution and trail making) were new.[40]
A new type of assessment centre opened in 2014 to collect imaging data. The visits extended the initial dataset to include magnetic resonance imaging (MRI) scans of the brain,[45][46][47][48] heart and abdomen, as well as neck-to-knee volumetric MRI scans, whole-body dual-energy X-ray absorptiometry (DXA) scans of bones and joints, ultrasound measurements of the carotid arteries and a resting 12-lead electrocardiogram (ECG). Initial data on 4,000 participants were released at the end of 2015 and by mid-2018 over 25,000 participants had been scanned. The plan was to scan 100,000 participants by 2022, and to do additional repeat scans on 10,000 of these 2–3 years later.[40]
In 2015 and 2016, 117,500 participants completed questionnaires on occupational history and related medical information.[40]
In 2016 and 2017, 137,400 participants completed questionnaires on mental health events including subjective well-being estimates, psychotic experiences, self-harm behaviours, traumatic events and cannabis and alcohol use.[40]
A set of additional assays on the blood and urine samples was conducted in 2016 and 2017,[40] with blood results expected to be released in Q4/2018.
A genomic assay of 820,967 SNPs was conducted on the participants' blood samples. Data from an initial 150,000 participants were released in 2015, the remainder in July 2017,[49][39] and the first results in October 2018.[50][51]
Information from UK registries of death (from 2006) and cancer (Scotland from 1957, England and Wales from 1995) was linked to the main Biobank dataset on an ongoing basis.[40]
Data from NHS hospital inpatient records (England from 1996, Scotland from 1997 and Wales from 1998) were linked to the main dataset on an ongoing basis.[40]
In 2019 exome sequence data from 50,000 persons was released, with 470,000 being available in 2023.[4]
In 2020, 20,000 volunteers agreed to collect and send a monthly blood sample for analysis of SARS-CoV-2 antibodies. They included existing Biobank participants and their children and adult grandchildren living in separate households.[52]
In 2021 NMR metabolomic data on approximately 121,000 persons was released.[53]
In June 2021, a subset of volunteers who had acknowledged that they had already received at least their first COVID-19 vaccine dose were asked to participate in a study to determine whether their COVID-19 antibodies resulted from vaccination or from a prior infection.
In October 2023, measurements of circulating proteins assayed by Olink's Proximity extension assay DNA-linked antibody technology were published[54] for a subset of ~53,000 UK Biobank participants. The UK Biobank Pharma Proteomics Project was funded by a consortium of pharmaceutical companies.[55]
In 2023, UK Biobank released the whole genome sequencing data of all 500,000 participants, the largest number of whole genome sequences ever released for medical research.[5][6] The release was supported by UKRI, Wellcome, industry partners including Amgen, AstraZeneca, GSK, and Johnson & Johnson, with sequencing conducted by deCODE Genetics and the Wellcome Sanger Institute.[7]
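To give a sense of scale for the accelerometer item in the list above (week-long tri-axial recordings at 100 Hz), the following back-of-the-envelope Python sketch counts the raw acceleration values implied per participant. It assumes an uninterrupted seven-day recording on all three axes, which is an idealisation; the constant and function names are illustrative, and nothing here reflects how UK Biobank actually stores or processes the data.

```python
SAMPLE_RATE_HZ = 100        # sampling rate stated in the text
AXES = 3                    # tri-axial device: x, y and z
DAYS = 7                    # "week-long" recording (assumed uninterrupted)
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_axis(rate_hz=SAMPLE_RATE_HZ, days=DAYS):
    """Number of readings on a single axis over the recording period."""
    return rate_hz * days * SECONDS_PER_DAY


per_axis = samples_per_axis()
per_participant = per_axis * AXES

print(f"{per_axis:,} samples per axis")            # 60,480,000
print(f"{per_participant:,} values per participant")  # 181,440,000
```

Under these assumptions each participant contributes roughly 181 million raw values, which helps explain why the data were processed centrally rather than by individual researchers.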
The UK Biobank dataset was opened to applications from researchers in March 2012.[58] The resource is available to scientists from the UK and outside, whether they work in the public or private sector, for industry, academia or a charity, subject to verification that the research is health-related and in the public interest. Researchers must register to be approved to use UK Biobank data.[59] Researchers are required to publish their results on an open-access publication site or in an academic journal and to return their findings to UK Biobank.[60]
In 2021, UK Biobank launched its cloud-based Research Analysis Platform (UKB-RAP), providing information technology infrastructure to store and analyse UK Biobank's large dataset regardless of the researcher's own technological capabilities.[61] By 2023 the platform had over 5,000 users.[62] The platform is hosted by Amazon Web Services, which also pledged $1.5 million in research credits for early career researchers and researchers from low and low-middle income countries to reduce limitations when collating, storing and securely accessing data.[63]
By 2023, 30,000 researchers had registered to use the resource and over 9,000 peer-reviewed articles based on UK Biobank data had been published.[64]
A 2023 review found that participants with a sense of meaning and purpose in life have a decreased risk of dementia.[66] Two other studies of participants in the UK Biobank study found that dementia risk was higher for those who were more socially isolated.[67]
A study from the UK Biobank showed a reduction in grey matter thickness, an overall reduction in brain size and greater cognitive decline in patients after COVID-19 compared with control groups.[68][69] A UK Biobank study also reported an increased risk of hospitalisation for people with obesity who contracted COVID-19.[70]
Reviews of UK Biobank data have found that pescatarians and vegetarians have a lower risk of colorectal and prostate cancer compared to red meat eaters.[71] Consumption of processed meat increases risk of breast cancer.[72] They have also found that men with higher total and central adiposity have an increased risk of prostate cancer death.[73]
The UK Biobank project operates within the terms of an Ethics and Governance Framework.[74][75][76] The Framework describes a series of standards to which UK Biobank will operate during the creation, maintenance and use of the resource and it elaborates on the commitments that are involved to those participating in the project, researchers and the public more broadly. The independent UK Biobank Ethics and Governance Council provides advice to the project and monitors its conformity with the Framework.[77] The Council also advises more generally on the interests of research participants and the general public in relation to the project.[citation needed]
The project has been generally praised for its ambitious scope and unique potential. A scientific review panel concluded that "UK Biobank has the potential, in ways that are not currently available elsewhere, to support a wide range of research".[56] Colin Blakemore, chief executive of the MRC, predicted it "will provide scientists with extraordinary information"[79] and "grow into a unique resource for future generations."[56] There was some early criticism, however. GeneWatch UK, a pressure group that claims to promote the responsible use of genetic information, asserted that the complexity of the programme could result in the finding of "false links between genes and disease",[56] and expressed concern that the genetic information from patients could be patented for commercial purposes. Biobank's chief executive described such a risk as "extremely low, if it exists at all."[79]
Some literature has raised concerns that the UK Biobank is not representative of the diversity of the UK population or is not applicable to diverse populations.[80][81]
Questions and concerns have been raised about the nature of some of the approved studies that have made use of UK Biobank data. Some of these studies are less obviously related to UK Biobank's self-stated mission of "Improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses". In 2016, a study on "genetic contributions to social deprivation and household income" was published using UK Biobank data.[82] In 2019, researchers published a study on the genetics of same-sex behaviour, which made use of data from 23andMe and UK Biobank.[83] Following the publication, researchers and bioethicists voiced concerns about the approval and nature of the study, including how the findings were being exploited commercially, similar to a former UK Biobank-based study on educational attainment.[84]
In November 2023, The Observer reported that UK Biobank had approved access for companies in the insurance sector between 2020 and 2023 and that this ran counter to some of the public claims that insurance companies and law enforcement would not access the data and records donated by volunteers.[85] In response, UK Biobank released a statement saying that the quoted passages had been made only in relation to identifiable data, not the de-identified data that was ultimately shared with insurance companies.[86]
In 2023 the UK's Secretary of State for Science, Innovation and Technology, Michelle Donelan, described UK Biobank as "mak[ing] an unparalleled contribution to science across the whole world, by putting invaluable information at researchers' fingertips. It is without question a jewel in the crown of UK science, and an envy of the world".[87]
In October 2024, The Guardian published a story detailing that the far-right "Human Diversity Foundation" had gained access to UK Biobank data to use it in pseudo-scientific race studies, raising questions about the governance and processes for approving and controlling access to the data.[88] Following the initial publication in The Guardian, UK Biobank released a statement that criticised the report and dismissed the main findings, claiming to have conducted a full investigation that had "found no evidence of misuse of UK Biobank data" as only publicly accessible summary statistics were used.[89] However, in a follow-up report published after UK Biobank's statement, The Guardian reported on correspondence between a senior medic and UK Biobank chief executive, Prof Sir Rory Collins, in which Collins said that the inquiries were continuing.[90] In the same article, The Guardian also reported on the US-based startup Heliospect Genomics, which claims to use UK Biobank data to predict traits such as IQ, sex and height, as well as risk of obesity or mental illness, in human embryos for IVF treatment.
The UK Biobank is funded by the UK Department of Health, the Medical Research Council, the Scottish Executive, and the Wellcome Trust medical research charity. The cost of the initial participant recruitment and assessment phase was 62 million GBP.[91]
Alfaro Almagro, F. (25 April 2017). "Image Processing and Quality Control for the first 10,000 Brain Imaging Datasets from UK Biobank". bioRxiv. doi:10.1101/130385.