Largest-ever cohort of U.S. twins fuels new BD Spoke study


Studying the causes of disease is essential to medical research. However, the discussion is sometimes framed, misleadingly, as ‘nature vs. nurture’: is your condition the result of your genetics or your environment? Generally, the answer is both. But to what degree? A new study in Nature Genetics explores this question for hundreds of diseases using the largest data set of its kind.

Led by members of the Northeast Big Data Hub community and supported by an NSF BD Spoke grant, the study analyzed de-identified insurance claims data for 56,000 twin pairs. Its goal was to assess the extent to which heritability (the amount of disease variation that can be attributed to genetic factors) and local environment contributed to the diseases and other conditions found among this group. While many other study designs consider a single disease or environmental factor at a time, this study examined 560 in total, providing a wealth of information for future medical research. The researchers, who hail from Harvard Medical School and the University of Queensland, integrated individual-level genetic and diagnostic data with zip code-level information, which indicates local environmental influences such as socioeconomic status, air quality, and weather and climate.

Nature and/or nurture: a complex picture

The researchers found that 40% of the diseases studied had some genetic component, while factors stemming from a shared living environment influenced 25% to some degree. Cognitive disorders demonstrated the greatest genetic contribution on average, while eye disorders and respiratory diseases demonstrated the greatest degree of environmental influence.

However, the average genetic contribution across all conditions was about 31%, while the contribution of shared environment averaged about 10%. In other words, more than half of the variation seen could not be explained by either shared environment or genetics. Alternative explanations might include environmental exposure that is not shared, such as different diets, or other factors that only one twin in a pair experiences.
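The "more than half" figure follows from simple subtraction. The short Python sketch below works it through using the two averages quoted above; it assumes, purely for illustration, that the shares are additive and sum to 100%, and the variable names are hypothetical rather than taken from the study.

```python
# Average shares of disease variation quoted in the article (percent).
genetic_pct = 31             # average genetic contribution
shared_environment_pct = 10  # average shared-environment contribution

# Illustrative assumption: the shares are additive and sum to 100%.
unexplained_pct = 100 - genetic_pct - shared_environment_pct

print(f"Not explained by genetics or shared environment: about {unexplained_pct}%")
```

Under that assumption, roughly 59% of the variation is left to non-shared exposures and other factors.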

The researchers also examined the effects of specific environmental risk factors, such as socioeconomic status (SES) as determined by zip code. Morbid obesity demonstrated the strongest link to SES, though the condition has both genetic and environmental influences. The researchers see their work as a first step toward further untangling this interplay between nature and nurture. “This finding opens up a whole slew of questions, including whether and how a change in socioeconomic status and lifestyle might compare against genetic predisposition to obesity,” noted senior study author Chirag Patel.

With 56,000+ twin pairs, the cohort compiled by the researchers is the largest known twin cohort in the U.S.

Data on twins is particularly useful for health studies, given their shared genetic inheritance. Notably, this study created the biggest twin cohort in the United States—double the size of the next-largest-known U.S. cohort, the Mid-Atlantic Twin Registry, and comparable in size to the largest international twin registries. “Our findings can provide signposts that inform subsequent research efforts and help scientists narrowly focus their pursuits,” said study first author Chirag Lakhani.

Link to study

More information on our Health priority area

Learn more about the team’s work in a talk given by lead author Chirag Lakhani at our 2018 Annual Summit, below!