National Survey on Drug Use and Health (NSDUH-2016)

Parent Series Details:


The National Survey on Drug Use and Health (NSDUH) series, formerly titled National Household Survey on Drug Abuse, is a major source of statistical information on the use of illicit drugs, alcohol, and tobacco and on mental health issues among members of the U.S. civilian, non-institutional population aged 12 or older. The survey tracks trends in specific substance use and mental illness measures and assesses the consequences of these conditions by examining mental and/or substance use disorders and treatment for these disorders.

Examples of uses of NSDUH data include the identification of groups at high risk for initiation of substance use and issues among those with co-occurring substance use disorders and mental illness.

NSDUH public-use data files are available for download in SAS, SPSS, Stata, and ASCII formats, and for online analysis with SDA. NSDUH restricted-use data files are available for online analysis with the R-DAS.

The NSDUH is sponsored by the Center for Behavioral Health Statistics and Quality (formerly Office of Applied Studies), Substance Abuse and Mental Health Services Administration. For more information, visit the NSDUH website.

NSDUH State and Substate Estimates

Due to disclosure limitations for respondent confidentiality, a state variable is not included in public-release datasets. The data tables on SAMHSA’s site and the interactive NSDUH State and Substate Estimates tools use the full non-restricted analytic file that includes the state variable. 

NSDUH restricted-use data availability information is here:

SAMHSA’s restricted-use data analysis system (RDAS) is an online crosstab tool that includes restricted data, including certain geographic identifiers, like state:

NSDUH state and substate estimates are located here:

1999-2019 NSDUH Small Area Estimation


NSDUH Variable Crosswalk Charts

PUFVariableCrosswalkChart_2019 and Prior.xlsx
PUFVariableCrosswalkChart_2018 and Prior.xlsx

NSDUH Reports and Detailed Tables

NSDUH Questionnaire Details

The population of the NSDUH series is the general civilian population aged 12 and older in the United States. Questions include age at first use, as well as lifetime, annual, and past-month usage for the following drugs: alcohol, marijuana, cocaine (including crack), hallucinogens, heroin, inhalants, tobacco, pain relievers, tranquilizers, stimulants, and sedatives. The survey covers substance abuse treatment history and perceived need for treatment, and includes questions from the Diagnostic and Statistical Manual (DSM) of Mental Disorders that allow diagnostic criteria to be applied.

Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, perceptions of risks, and needle-sharing. Demographic data include gender, race, age, ethnicity, educational level, job status, income level, veteran status, household composition, and population density.

The questionnaire was significantly redesigned in 1994. The 1994 survey included for the first time a rural population supplement to allow separate estimates to be calculated for this population. Other modules have been added each year and retained in subsequent years: mental health and access to care (1994-B); risk/availability of drugs (1996); cigar smoking and new questions on marijuana and cocaine use (1997); question series asked only of respondents aged 12 to 17 (1997); questions on tobacco brand (1999); marijuana purchase questions (2001); prior marijuana and cigarette use, additional questions on drug treatment, adult mental health services, and social environment (2003); and adult and adolescent depression questions derived from the National Comorbidity Survey, Replication (NCS-R) and National Comorbidity Survey, Adolescent (NCS-A) (2004).

Survey administration and sample design were improved with the implementation of the 1999 survey, and additional improvements were made in 2002. Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. At that time, the collection mode of the survey changed from personal interviews and self-enumerated answer sheets to computer-assisted personal interviews and audio computer-assisted self-interviews. In 2002, the survey’s title was officially changed to the National Survey on Drug Use and Health (NSDUH).

Since 2002, participants have been given $30 for participating in the study, which increased participation rates relative to the years before 2002. Also, in 2002 and 2011, the new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in the sample weighting procedures. For these reasons, data gathered for 2002 and beyond cannot validly be compared to data prior to 2002.

NSDUH underwent a partial redesign in 2015, so there are several measures that “broke trends” in 2015, meaning that estimates from 2015 and later are no longer comparable to their 2014 and earlier counterparts. This also means that you cannot pool data across incomparable years. For affected measures, you will likely only be able to look at the 2002-2014 timeframe to pool enough years of comparable data to get a sufficient sample size at the county level. Measures that were not affected can be pooled through 2015. More information on the partial 2015 redesign and its effects on estimates is available here:


Study Details:

The target population for the 2016 survey was the same as has been defined since the 1991 survey: the civilian, noninstitutionalized population of the United States (including civilians living on military bases) who were 12 years of age or older at the time of the survey. Before 1991, the sample was drawn from the household population of the contiguous 48 states. Residents of Alaska and Hawaii were added to the sample population in 1991, as were residents of noninstitutional group quarters (e.g., college dormitories, group homes, civilians dwelling on military installations) and persons with no permanent residence (homeless people in shelters and residents of single rooms in hotels). In addition, six special-interest metropolitan statistical areas (MSAs) were oversampled in 1991. The 1992 and 1993 surveys retained the oversampling of the six MSAs and also were designed to provide quarterly as well as annual estimates. Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. For the 1999 through 2013 surveys, the 8 states with the largest populations (which together account for 48 percent of the total U.S. population aged 12 or older) were designated as large sample states (California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas) with a target sample size of 3,600, and the remaining 42 states and the District of Columbia had target sample sizes of 900. The 2014 through 2017 sample design allows for a more cost-efficient sample allocation to the largest states, while maintaining sufficient sample sizes in the smaller states to support small area estimation at the state and substate levels. 
The 2014 through 2017 NSDUHs were designed to yield 4,560 completed interviews in California; 3,300 completed interviews each in Florida, New York, and Texas; 2,400 completed interviews each in Illinois, Michigan, Ohio, and Pennsylvania; 1,500 completed interviews each in Georgia, New Jersey, North Carolina, and Virginia; 967 completed interviews in Hawaii; and 960 completed interviews in each of the remaining 37 states and the District of Columbia. Consistent with previous designs, the 2014 through 2017 design also oversamples youths aged 12 to 17 and young adults aged 18 to 25. However, the 2014 through 2017 design places more sample in the 26 or older age groups to more accurately estimate drug use and related mental health measures among the aging drug-using population. The 2016 sample was allocated to age groups as follows: 25 percent for youths aged 12 to 17, 25 percent for young adults aged 18 to 25, 15 percent for adults aged 26 to 34, 20 percent for adults aged 35 to 49, and 15 percent for adults aged 50 or older.
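As a quick arithmetic check on the age allocation above, the short Python sketch below confirms that the five shares sum to 100 percent and that the two oversampled groups (ages 12 to 25) together account for half of the sample. Applying the shares to the national target of 67,507 interviews (the figure given in the Study Methodology section) is an assumption made for illustration only; the source does not publish per-age-group interview targets.

```python
# Percent of the 2016 sample allocated to each age group, from the text above.
age_allocation_pct = {
    "12 to 17": 25,
    "18 to 25": 25,
    "26 to 34": 15,
    "35 to 49": 20,
    "50 or older": 15,
}

total_pct = sum(age_allocation_pct.values())
under_26_pct = age_allocation_pct["12 to 17"] + age_allocation_pct["18 to 25"]

# Illustrative only: apply the shares to the national target sample size.
TARGET_TOTAL = 67507
approx_interviews = {
    group: round(TARGET_TOTAL * pct / 100)
    for group, pct in age_allocation_pct.items()
}

print(total_pct)     # 100
print(under_26_pct)  # 50
for group, n in approx_interviews.items():
    print(f"{group}: about {n:,} interviews")
```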


Study Redelivery

FILEDATE: 02/28/2018

Number of Variables: 2,664

Number of Observations: 56,897


Study Scope

Time period: 
Collection date: 
Geographic coverage : 
United States
Unit of observation: 
Data types: 
survey data
The civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters such as college dormitories, group homes, shelters, rooming houses, and civilians dwelling on military installations.
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina.
Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia.
Prior to the 2002 survey, this series was titled National Household Surveys on Drug Abuse.
Although the design of the 2016 survey is similar to the design of the 1999 through 2001 surveys, there are important methodological differences since 2002 that affect the estimates. Each NSDUH respondent since 2002 has been given an incentive payment of $30. This change resulted in an improvement in the survey response rate. In addition, in 2002 and 2011 new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in NSDUH sample weighting procedures. Therefore the data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
For selected variables, statistical imputation was performed following logical inference to replace missing responses. These variables are identified in the codebook as "...LOGICALLY ASSIGNED" for the logical procedure, or by the designation "IMPUTATION-REVISED" in the variable label when the statistical procedure was also performed. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable, a corresponding imputation indicator variable indicates whether a case's value on the variable resulted from an interview response or was imputed. Missing values for some demographic variables were imputed by the unweighted hot-deck technique used in previous surveys. Beginning in 1999, imputation of missing values for most variables was accomplished using predictive mean neighborhoods (PMN), a new procedure developed specifically for this survey. Both the hot-deck and PMN imputation procedures are described in the codebook.
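To make the idea of an unweighted hot deck concrete, the following Python sketch fills each missing value with a value drawn at random from respondents in the same imputation class and records an indicator of whether the value was reported or imputed. It is a generic illustration, not the procedure documented in the codebook: the function name `hot_deck_impute`, the class variable `age_group`, the variable `educ`, and the output names `IR_educ` and `imputed` are all hypothetical, and the PMN procedure is not shown.

```python
import random


def hot_deck_impute(rows, var, class_var, seed=0):
    """Generic unweighted hot deck (illustrative, not the NSDUH procedure).

    Missing values (None) of `var` are replaced by a value drawn at random
    from donors in the same `class_var` category. Each output row gets an
    "IR_<var>" value and an "imputed" flag.
    """
    rng = random.Random(seed)

    # Collect reported (non-missing) values by imputation class.
    donors = {}
    for row in rows:
        if row[var] is not None:
            donors.setdefault(row[class_var], []).append(row[var])

    result = []
    for row in rows:
        new_row = dict(row)
        if row[var] is not None:
            new_row["IR_" + var] = row[var]
            new_row["imputed"] = False
        else:
            pool = donors.get(row[class_var], [])
            # If a class has no donors, the value stays missing.
            new_row["IR_" + var] = rng.choice(pool) if pool else None
            new_row["imputed"] = bool(pool)
        result.append(new_row)
    return result


# Hypothetical records; None marks a missing response.
rows = [
    {"age_group": "12-17", "educ": "some high school"},
    {"age_group": "12-17", "educ": None},
    {"age_group": "26+", "educ": "college graduate"},
    {"age_group": "26+", "educ": "high school graduate"},
    {"age_group": "26+", "educ": None},
]

for row in hot_deck_impute(rows, "educ", "age_group"):
    print(row["age_group"], row["IR_educ"], row["imputed"])
```

Keeping the indicator alongside the imputation-revised value lets an analyst rerun estimates on reported cases only, which is the purpose of the indicator variables described above.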
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. To further ensure respondent confidentiality, the data producer used data substitution and deletion of state identifiers and a subsample of records in the creation of the public use file.
Previously published estimates may not be exactly reproducible from the variables in the public use file due to the disclosure protection procedures that were implemented.
The setup and dictionary files for Stata are designed to be compatible with StataSE, Version 8 and later. This is a large data file requiring that approximately 400 megabytes of Random Access Memory be allocated to Stata. Operations within Stata, including conversion of the ASCII data to Stata format, are likely to be slow. Analysts may wish to download subsets of data from the SAMHDA Survey Documentation and Analysis (SDA) system for use with Stata.
In the income section, which was interviewer-administered, a split-sample study had been embedded within the 2006 and 2007 surveys to compare a shorter version of the income questions with a longer set of questions that had been used in previous surveys. This shorter version was adopted for the 2008 NSDUH and will be used for future NSDUHs.
Subject Terms: 
  • addiction
  • alcohol
  • alcohol abuse
  • alcohol consumption
  • amphetamines
  • barbiturates
  • cocaine
  • controlled drugs
  • crack cocaine
  • demographic characteristics
  • depression (psychology)
  • drinking behavior
  • drug abuse
  • drug dependence
  • drug treatment
  • drug use
  • drugs
  • employment
  • hallucinogens
  • health care
  • heroin
  • households
  • income
  • inhalants
  • marijuana
  • mental health
  • mental health services
  • methamphetamine
  • pregnancy
  • prescription drugs
  • sedatives
  • smoking
  • stimulants
  • substance abuse
  • substance abuse treatment
  • tobacco use
  • tranquilizers
  • youths

Study Methodology

Mode of data collection: 
Like the 1999 to 2015 surveys, the 2016 survey was conducted using computer-assisted interviewing (CAI) methods. This survey also allows for improved state estimates based on minimum sample sizes per state. The target sample size of 67,507 allows SAMHSA to continue reporting adequately precise demographic subgroup estimates at the national level without needing to oversample specially targeted demographics, as was required in the past. The achieved sample size for the 2016 survey was 67,942 individuals. A coordinated sample design was developed for the 2014 through 2017 NSDUHs. The coordinated design facilitated a 50 percent overlap in third-stage units (area segments [see below]) between each 2 successive years from 2014 through 2017. This design was intended to increase the precision of estimates in year-to-year trend analyses because of the expected positive correlation resulting from the overlapping sample between successive survey years. The 2016 design allows for computation of estimates by state in all 50 states plus the District of Columbia. States may therefore be viewed as the first level of stratification and as a reporting variable. Compared with previous sample designs, the 2014 through 2017 sample design moves from two to essentially five state sample size groups (lumping Hawaii with the remaining states and the District of Columbia). The 2014 through 2017 surveys have a sample designed to yield 4,560 completed interviews in California; 3,300 completed interviews each in Florida, New York, and Texas; 2,400 completed interviews each in Illinois, Michigan, Ohio, and Pennsylvania; 1,500 completed interviews each in Georgia, New Jersey, North Carolina, and Virginia; 967 completed interviews in Hawaii; and 960 completed interviews in each of the remaining 37 states and the District of Columbia, for a total national target sample size of 67,507. The sample is selected from 6,000 area segments that vary in size according to state.
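The national target of 67,507 can be reproduced from the state-level targets listed above. The Python sketch below simply multiplies each group's per-area target by the number of areas in the group and sums the results; the list name and layout are illustrative.

```python
# (group, number of areas in group, target completed interviews per area),
# taken from the state allocation described in the text above.
state_allocation = [
    ("California", 1, 4560),
    ("Florida, New York, Texas", 3, 3300),
    ("Illinois, Michigan, Ohio, Pennsylvania", 4, 2400),
    ("Georgia, New Jersey, North Carolina, Virginia", 4, 1500),
    ("Hawaii", 1, 967),
    ("Remaining 37 states and the District of Columbia", 38, 960),
]

total_areas = sum(n_areas for _, n_areas, _ in state_allocation)
total_target = sum(n_areas * target for _, n_areas, target in state_allocation)

print(total_areas)   # 51 (50 states plus the District of Columbia)
print(total_target)  # 67507
```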
The change in the state sample allocation was driven by the need to increase the sample in the original 43 small states (to improve the precision of state and substate estimates in these states) while moving closer to a proportional allocation in the larger states.
The estimates yielded by NSDUH are based on sample survey data rather than on complete data for the entire population. This means that the data must be weighted to obtain unbiased estimates for survey outcomes in the population represented by the 2016 NSDUH. The "final analysis weight" of the ith respondent, say $w_i$, can be interpreted as the number of sampling units in the NSDUH target population represented by the ith respondent. The sum of the weights over all respondents is used to estimate the size of the total target population: $\sum_i w_i$ = estimated size of target population, where the summation is over all respondents in the 2016 NSDUH. Similar to the 2015 NSDUH, three sets of analysis weights at the person level, questionnaire dwelling unit (QDU) level, and person pair level were developed for the 2016 NSDUH. The person-level, QDU-level, and person pair-level analysis weights shared the same first 11 weight components at the screening dwelling unit (SDU) level. In addition to the 11 common weight components, QDU-level and person pair-level analysis weights had several specific weight components, and the final weights are the product of all of the weight components. As in the 2015 NSDUH, all of the adults in the 2016 NSDUH sample received the WHODAS questions. Therefore, there was no need to have a separate adult mental health weight in the 2016 NSDUH because the person-level analysis weight could be used to produce the adult mental health estimates. The person-level analysis weights (ANALWT_C) are the product of 16 weight components. Each weight component accounts for either a selection probability at a selection stage or an adjustment factor adjusting for nonresponse, coverage, or extreme weights. The sum of the weight over all respondents on the data file represents an estimate of the total number of individuals in the target population.
In view of the use of weights as expansion factors in forming estimates, the weight can be interpreted as the total number of target population individuals each record on the file represents. For variance estimation, suitable software, such as SUDAAN®, should be used to take the sample design into account. Similar to the 2015 NSDUH, the 2016 NSDUH used 2010 census-based population estimates in the poststratification adjustment. Details of the weight components and the sample weighting procedures appear in the 2016 NSDUH Methodological Resource Book, which will be available in 2018.
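The use of the weights as expansion factors can be sketched in a few lines of Python. The four records below are made up for illustration, and `used` is a hypothetical 0/1 indicator rather than a variable on the file; only the weight name ANALWT_C comes from the text above. The sketch produces point estimates only. It does not account for the sample design, so variance estimates still require design-aware software as noted above.

```python
# Made-up records for illustration; real analyses would read the data file.
records = [
    {"ANALWT_C": 1200.5, "used": 1},
    {"ANALWT_C": 850.0, "used": 0},
    {"ANALWT_C": 2300.25, "used": 0},
    {"ANALWT_C": 640.75, "used": 1},
]

# Sum of weights: estimated size of the target population.
est_population = sum(r["ANALWT_C"] for r in records)

# Weighted total: estimated number of people with the characteristic.
est_users = sum(r["ANALWT_C"] * r["used"] for r in records)

# Weighted proportion: estimated prevalence in the target population.
prevalence = est_users / est_population

print(est_population)  # 4991.5
print(est_users)       # 1841.25
print(f"{prevalence:.1%}")
```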
Response rates: 
  • Strategies for ensuring high rates of participation resulted in a weighted screening response rate of 81.94 percent and a weighted interview response rate for the CAI of 71.20 percent. (Note that these response rates reflect the original sample, not the subsampled data file referenced in this document.)
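Because an interview can only be completed after a successful screening, the two rates above are often combined into a single overall rate by multiplying them. The sketch below assumes that convention; the source does not itself report an overall figure, so the result should be read as illustrative.

```python
# Weighted response rates reported above, as proportions.
screening_rate = 0.8194
interview_rate = 0.7120

# Assumed convention: overall rate is the product of the two stage rates.
overall_rate = screening_rate * interview_rate

print(f"{overall_rate:.2%}")  # 58.34%
```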

Extent of processing: 
  • Data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. Ready-to-go data files are also routinely created along with setups in the major statistical software formats as well as standard codebooks to

Study Bibliography