Nature. 2022 Dec;612(7941):720-724.
doi: 10.1038/s41586-022-05477-4. Epub 2022 Dec 7.

Genetic diversity fuels gene discovery for tobacco and alcohol use

Gretchen R B Saunders #, Xingyan Wang #, Fang Chen #, Seon-Kyeong Jang #, Mengzhen Liu #, Chen Wang #, Shuang Gao, Yu Jiang, Chachrit Khunsriraksakul, Jacqueline M Otto, Clifton Addison, Masato Akiyama, Christine M Albert, Fazil Aliev, Alvaro Alonso, Donna K Arnett, Allison E Ashley-Koch, Aneel A Ashrani, Kathleen C Barnes, R Graham Barr, Traci M Bartz, Diane M Becker, Lawrence F Bielak, Emelia J Benjamin, Joshua C Bis, Gyda Bjornsdottir, John Blangero, Eugene R Bleecker, Jason D Boardman, Eric Boerwinkle, Dorret I Boomsma, Meher Preethi Boorgula, Donald W Bowden, Jennifer A Brody, Brian E Cade, Daniel I Chasman, Sameer Chavan, Yii-Der Ida Chen, Zhengming Chen, Iona Cheng, Michael H Cho, Hélène Choquet, John W Cole, Marilyn C Cornelis, Francesco Cucca, Joanne E Curran, Mariza de Andrade, Danielle M Dick, Anna R Docherty, Ravindranath Duggirala, Charles B Eaton, Marissa A Ehringer, Tõnu Esko, Jessica D Faul, Lilian Fernandes Silva, Edoardo Fiorillo, Myriam Fornage, Barry I Freedman, Maiken E Gabrielsen, Melanie E Garrett, Sina A Gharib, Christian Gieger, Nathan Gillespie, David C Glahn, Scott D Gordon, Charles C Gu, Dongfeng Gu, Daniel F Gudbjartsson, Xiuqing Guo, Jeffrey Haessler, Michael E Hall, Toomas Haller, Kathleen Mullan Harris, Jiang He, Pamela Herd, John K Hewitt, Ian Hickie, Bertha Hidalgo, John E Hokanson, Christian Hopfer, JoukeJan Hottenga, Lifang Hou, Hongyan Huang, Yi-Jen Hung, David J Hunter, Kristian Hveem, Shih-Jen Hwang, Chii-Min Hwu, William Iacono, Marguerite R Irvin, Yon Ho Jee, Eric O Johnson, Yoonjung Y Joo, Eric Jorgenson, Anne E Justice, Yoichiro Kamatani, Robert C Kaplan, Jaakko Kaprio, Sharon L R Kardia, Matthew C Keller, Tanika N Kelly, Charles Kooperberg, Tellervo Korhonen, Peter Kraft, Kenneth Krauter, Johanna Kuusisto, Markku Laakso, Jessica Lasky-Su, Wen-Jane Lee, James J Lee, Daniel Levy, Liming Li, Kevin Li, Yuqing Li, Kuang Lin, Penelope A Lind, Chunyu Liu, Donald M Lloyd-Jones, Sharon M Lutz, Jiantao Ma, Reedik Mägi, Ani Manichaikul, Nicholas G Martin, Ravi Mathur, Nana Matoba, Patrick F McArdle, Matt McGue, Matthew B McQueen, Sarah E Medland, Andres Metspalu, Deborah A Meyers, Iona Y Millwood, Braxton D Mitchell, Karen L Mohlke, Matthew Moll, May E Montasser, Alanna C Morrison, Antonella Mulas, Jonas B Nielsen, Kari E North, Elizabeth C Oelsner, Yukinori Okada, Valeria Orrù, Nicholette D Palmer, Teemu Palviainen, Anita Pandit, S Lani Park, Ulrike Peters, Annette Peters, Patricia A Peyser, Tinca J C Polderman, Nicholas Rafaels, Susan Redline, Robert M Reed, Alex P Reiner, John P Rice, Stephen S Rich, Nicole E Richmond, Carol Roan, Jerome I Rotter, Michael N Rueschman, Valgerdur Runarsdottir, Nancy L Saccone, David A Schwartz, Aladdin H Shadyab, Jingchunzi Shi, Suyash S Shringarpure, Kamil Sicinski, Anne Heidi Skogholt, Jennifer A Smith, Nicholas L Smith, Nona Sotoodehnia, Michael C Stallings, Hreinn Stefansson, Kari Stefansson, Jerry A Stitzel, Xiao Sun, Moin Syed, Ruth Tal-Singer, Amy E Taylor, Kent D Taylor, Marilyn J Telen, Khanh K Thai, Hemant Tiwari, Constance Turman, Thorarinn Tyrfingsson, Tamara L Wall, Robin G Walters, David R Weir, Scott T Weiss, Wendy B White, John B Whitfield, Kerri L Wiggins, Gonneke Willemsen, Cristen J Willer, Bendik S Winsvold, Huichun Xu, Lisa R Yanek, Jie Yin, Kristin L Young, Kendra A Young, Bing Yu, Wei Zhao, Wei Zhou, Sebastian Zöllner, Luisa Zuccolo, 23andMe Research Team, Biobank Japan Project, Chiara Batini, Andrew W Bergen, Laura J Bierut, Sean P David, Sarah A Gagliano Taliun, Dana B Hancock, Bibo Jiang, Marcus R Munafò, Thorgeir E Thorgeirsson, Dajiang J Liu, Scott Vrieze

Gretchen R B Saunders et al. Nature. 2022 Dec.

Abstract

Tobacco and alcohol use are heritable behaviours associated with 15% and 5.3% of worldwide deaths, respectively, due largely to broad increased risk for disease and injury [1-4]. These substances are used across the globe, yet genome-wide association studies have focused largely on individuals of European ancestries [5]. Here we leveraged global genetic diversity across 3.4 million individuals from four major clines of global ancestry (approximately 21% non-European) to power the discovery and fine-mapping of genomic loci associated with tobacco and alcohol use, to inform function of these loci via ancestry-aware transcriptome-wide association studies, and to evaluate the genetic architecture and predictive power of polygenic risk within and across populations. We found that increases in sample size and genetic diversity improved locus identification and fine-mapping resolution, and that a large majority of the 3,823 associated variants (from 2,143 loci) showed consistent effect sizes across ancestry dimensions. However, polygenic risk scores developed in one ancestry performed poorly in others, highlighting the continued need to increase sample sizes of diverse ancestries to realize any potential benefit of polygenic prediction.


Conflict of interest statement

The spouse of N.L. Saccone is listed as an inventor on issued U.S. patent 8080371 ‘Markers for addiction’, covering the use of certain single-nucleotide polymorphisms in determining the diagnosis, prognosis and treatment of addiction. M.H.C. has received grant funding from GSK and Bayer, and speaking or consulting fees from AstraZeneca, Illumina and Genentech. R.T.-S. is a former employee and current shareholder of GSK and is currently a non-executive member of the ENA Respiratory board of directors. She reports personal fees from Teva, Immunomet, Vocalis Health and ENA Respiratory (until January 2021). D.A.S. is the founder and chief scientific officer of Eleven P15, a company focused on the early diagnosis and treatment of pulmonary fibrosis. J.B.N. and E.J. are employed by Regeneron Pharmaceuticals, Inc. The spouse of C.J.W. is employed by Regeneron Pharmaceuticals, Inc. L.J.B. is listed as an inventor on issued U.S. patent 8080371 ‘Markers for addiction’, covering the use of certain single-nucleotide polymorphisms in determining the diagnosis, prognosis and treatment of addiction. The 23andMe Research Team, including J.S. and S.S.S., are employees of 23andMe, Inc., and hold stock and/or stock options in 23andMe. T.E.T., D.F.G., H.S., G.B. and K. Stefansson are employees of deCODE genetics/AMGEN. M. Moll received grant support from Bayer. A.W.B. is listed as a co-inventor on a U.S. patent application ‘Biosignature discovery for substance use disorder using statistical learning’ assigned to BioRealm, LLC, and serves as a scientific advisor and consultant to BioRealm, LLC. All other authors declare no competing interests.

Figures

Fig. 1. Ancestry composition and effect size moderation.
a, Ancestry compositions of contributing studies (each point is a study). Colours are coded by primary ancestry of individuals in the cohort. Studies with fewer than 90% of individuals assignable to a single ancestry group are shown in grey. Ancestry component 3 was a north–south EUR cline, which was omitted here as we did not conduct meta-analyses stratified by northern versus southern Europe. TOPMed, Trans-Omics for Precision Medicine. b, Extent of effect size moderation as a function of the same ancestry dimensions as shown in a. The full moderation results are in Supplementary Table 2. Each point in b represents an independent variant with the standardized MDS component coefficient from our trans-ancestry models (that is, γ) along the x axes, and the corresponding mean difference in effect sizes (β) for the ancestry-stratified meta-analysis of the given ancestry versus all other ancestries along the y axes. The grey circles indicate variants showing little to no evidence of effect size heterogeneity across ancestry, whereas the coloured circles represent variants with adequate evidence of effect size heterogeneity. The plots highlight that the majority of variants have similar effect sizes across all ancestry clines, with some potentially interesting exceptions in which the variant effect sizes differ substantially between ancestry clines.
Fig. 2. Within-ancestry and across-ancestry performance of polygenic scores in an independent target sample (Add Health).
a, Incremental variance explained for each target ancestry group. The colour of the stacked bars indicates the ancestry from which the polygenic score was derived; the total height of each set of stacked bars (with 95% confidence intervals) corresponds to the total variance explained by all four ancestry-stratified scores combined. For example, in the target EUR subsample, non-EUR polygenic scores add little over and above the EUR score. Note that some comparisons are underpowered to detect differences in predictive accuracy across ancestry (see Supplementary Note). Heritabilities, estimated by LD score regression, of each phenotype–ancestry combination are depicted by the grey dashed bar (with 95% confidence intervals) and corresponding sample sizes; these represent the maximum expected accuracy of the polygenic risk score (PRS). b, The manner in which the phenotype mean in the target sample changes as a function of the EUR PRS deciles. c, Results from an interaction model, in which each phenotype was modelled as a function of an interaction between the EUR-based PRS and target ancestry (coded as a factor with EUR ancestry as the reference and scores scaled within ancestry). The bands around each line denote the 95% confidence intervals. Significant interactions are noted with text. Using SmkInit as an example, the purple line represents the predicted proportion of regular smokers as a function of the EUR PRS in the EUR subsample of Add Health, the blue line shows the predicted proportion of regular smokers by standard deviation of the EUR PRS in the EAS subsample, and so on. In this case, the magnitude of the association between the EUR-based PRS and SmkInit (that is, the slope) was significantly greater in the EUR target ancestry than in all other ancestries. Full PRS results are in Supplementary Table 12.
Extended Data Fig. 1. Ancestry space of studies contributing to meta-analysis (panel a), versus individuals from TOPMed and 1000 Genomes (panel b).
The meta-regression within the MEMO model requires specification of ancestry clines. To ensure consistency in the meaning of ancestry clines across all five MEMO analyses (one for each phenotype) we created a single multidimensional scaling solution based on allele frequencies from all phenotypes in all participating cohorts. These solutions are plotted in panel a (circles correspond to TOPMed cohorts, squares are all other cohorts which used imputed microarray genotypes, and triangles are 1000 Genomes ancestry groups). Colors of points correspond to the primary assigned ancestry of each cohort (studies with 
Extended Data Fig. 2. Multi-ancestry meta-analysis Manhattan plots.
The black horizontal line corresponds to P = 5 × 10^−9, the GWAS significance threshold used for all analyses. Note that some y-axis scales are discontinuous to better illustrate variants with very small P-values (e.g., the Drinks per Week y-axis is cut at 30 with a maximum value of 307.7, denoting a P-value of 1.9 × 10^−308). All P-values are from two-sided statistical tests.
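As a quick check on the axis arithmetic in this caption, the sketch below converts a P-value written as mantissa × 10^exponent into the −log10(P) value that a Manhattan plot shows on its y axis. The helper name `neg_log10_p` is hypothetical and does not come from the paper. Passing the mantissa and exponent separately is an assumption made here so that extremely small P-values never need to be stored directly as floating-point numbers.

```python
import math


def neg_log10_p(mantissa: float, exponent: int) -> float:
    """Return -log10(P) for P = mantissa * 10**exponent.

    Hypothetical helper for illustration only; not from the paper.
    """
    if mantissa <= 0:
        raise ValueError("mantissa must be positive")
    # log10(m * 10**e) = log10(m) + e, so negate the sum.
    return -(math.log10(mantissa) + exponent)


# Smallest P-value quoted in the caption: 1.9 x 10^-308.
print(round(neg_log10_p(1.9, -308), 1))  # 307.7

# Significance threshold quoted in the caption: 5 x 10^-9.
print(round(neg_log10_p(5.0, -9), 2))  # 8.3
```

On this scale the significance threshold line sits at roughly 8.3, and the reported axis maximum of 307.7 matches the quoted P-value of 1.9 × 10^−308.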
Extended Data Fig. 3. Tissue expression and brain cell type enrichment in high priority genes.
Panel a shows tissue expression enrichment in ‘high priority’ genes. We define high priority genes here as those located nearest to the variants in fine-mapped credible intervals containing less than five variants. These genes were compared to ‘control’ genes identified in the same way, but from variants in credible intervals with PIP 

References

    1. World Health Organization. Tobacco. WHO https://www.who.int/news-room/fact-sheets/detail/tobacco (2022).
    2. World Health Organization. Alcohol. WHO https://www.who.int/news-room/fact-sheets/detail/alcohol (2022).
    3. World Health Organization. The top 10 causes of death. WHO https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death (2020).
    4. Griswold MG, et al. Alcohol use and burden for 195 countries and territories, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2018;392:1015–1035.
    5. Liu M, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 2019;51:237–244.
