National Study of Youth and Religion, Wave 3 (2007-2008)

Data Archive > U.S. Surveys > General Population > National > National Studies of Youth and Religion > Summary

In Wave 3 every attempt was made to re-interview all English-speaking Wave 1 youth survey respondents. At the time of this third survey the respondents were between the ages of 18 and 24. The survey was conducted from September 24, 2007 through April 21, 2008 using a Computer Assisted Telephone Interviewing (CATI) system programmed using Blaise software. The Howard W. Odum Institute for Research in Social Sciences at the University of North Carolina at Chapel Hill (Odum Institute) was hired to field the Wave 3 survey. Telephone calls were spread out over varying days and times, including nights and weekends. Every effort was made to re-contact and re-survey all original NSYR respondents (whether or not they completed the Wave 2 telephone survey), including those out of the country, in the military, and on religious missions. Contacting and surveying respondents who were in the military during Wave 3 was more difficult because some were serving on active duty and could not be reached; even their families were often unaware of their specific locations and had no phone numbers or addresses where they could be reached. The Wave 3 survey instrument replicated many of the questions asked in Waves 1 and 2, with some changes made to better capture the respondents’ lives as they grew older. For example, there were fewer questions on parental monitoring and more on post-high school educational aspirations.

Many variable names have been truncated to allow for downloading of the data set as an SPSS portable file. Original variable names are shown in parentheses at the beginning of each variable description.

Data File
Cases: 2,532
Variables: 484
Weight Variable: RWEIGHT, PANEL
A longitudinal weight, RWEIGHT (rweight2_w3), has been calculated for use when analyzing data from Wave 3 with either Wave 1 or Wave 2 of the NSYR survey data (excluding data from the Jewish oversample). To develop the new raw weight, RWEIGHT (rweight2_w3), a simple correction factor was applied within each region-income stratum (defined by the four census regions and five income levels at Wave 1) to adjust the weight for each individual. This accounts for the change in the distribution of the respondents of the NSYR by census regions and income groups resulting from Wave 3 sample attrition. A new weight variable, PANEL (panel_weight), is included for analyses involving all three waves of data. It is similar to rweight2_w3, except that it makes the correction based on the individuals who participated in all three waves. Again, we recommend the use of raw weights when using software developed for analysis of survey data, e.g., Stata or SAS, especially when using commands designed for survey analysis such as "svymean" or "svyregress" in Stata. The only exception to this is when software documentation specifically requests that users normalize the weights before estimation. It is the user’s responsibility to determine whether raw or normalized weights should be used in an analysis.
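The difference between raw and normalized weights can be sketched as follows. This is an illustrative pandas fragment with invented toy values, not NSYR data; only the column name RWEIGHT is taken from the dataset. Normalizing rescales the raw weight so that it averages 1.0 over the analysis sample, leaving relative weights unchanged.

```python
import pandas as pd

# Toy illustration (invented values, not NSYR data): normalizing the raw
# weight RWEIGHT so that it averages 1.0 across the analysis sample.
df = pd.DataFrame({"RWEIGHT": [0.8, 1.5, 0.6, 1.1]})
df["RWEIGHT_NORM"] = df["RWEIGHT"] / df["RWEIGHT"].mean()

print(df["RWEIGHT_NORM"].mean())  # 1.0
```

Because survey packages such as Stata's svy commands handle the scaling internally, the raw weight is usually the safer input; normalize only when the software's documentation asks for it.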
Data Collection
Date Collected: September 24, 2007 through April 21, 2008
Funded By
The Lilly Endowment, Inc.
Collection Procedures
As in previous waves, prior to beginning the telephone survey each respondent’s verbal consent was obtained. As all respondents were over the age of 18 at the time of the third wave survey, parental consent was no longer required. In Wave 3, respondent identity was confirmed using name, date of birth, and the name of the city and state where he or she had completed the first wave survey.
Sampling Procedures
An RDD telephone survey sampling method was chosen for this study because of the advantages it offers compared to alternative survey sampling methods. Unlike school-based sampling, for example, our RDD telephone method was able to survey not only school-attending youth, but also school dropouts, home-schooled youth, and students frequently absent from school. Using RDD, we were also able to ask numerous religion questions which many school principals and school boards often disallow on surveys administered in school.

Principal Investigators
Dr. Christian Smith
Department of Sociology
University of Notre Dame
Related Publications
Smith, Christian and Melinda Lundquist Denton. 2003. “Methodological Design and Procedures for the National Survey of Youth and Religion (NSYR).” Chapel Hill, NC: The National Study of Youth and Religion.

Smith, Christian and Melinda Lundquist Denton. 2005. Soul Searching: The Religious and Spiritual Lives of American Teenagers. Oxford: Oxford University Press.

Smith, Christian. 2009. Souls in Transition: The Religious and Spiritual Lives of Emerging Adults. Oxford: Oxford University Press.

All publications using NSYR data must contain the following acknowledgement:

“The National Study of Youth and Religion, whose data were used by permission here, was generously funded by Lilly Endowment Inc., under the direction of Christian Smith, of the Department of Sociology at the University of Notre Dame.”
Retention and Response Rates
Of the 3,328 original respondents whom we attempted to re-survey in Wave 3 (the 3,364 original sample minus those found to be date-of-birth outliers in Wave 2 and those discovered to have passed away before the fielding of the Wave 3 survey), 36 were found to be ineligible to participate.

The following are the Wave 3 survey rates that data users will find important to note:

1) W3 OVERALL RETENTION RATE - Of the remaining eligible 3,282 Wave 1 respondents, 2,532 participated in the Wave 3 survey (including 13 partial cases), for a Wave 3 completion rate of 77.1%.

2) W2/W3 RETENTION RATE – Of the respondents who completed the Wave 2 survey, 86.3% completed the Wave 3 survey.

3) W1/W2/W3 RETENTION RATE – Of the original eligible respondents, 68.4% completed both the Wave 2 and the Wave 3 survey. Note that 274 respondents who did not complete the W2 survey did complete the W3 survey.

4) W1/W3 COMBINED RESPONSE RATE – Calculated by multiplying the W1 and W3 response rates, the combined response rate is 43.9%.

5) W3 ATTRITION RATE – Of the total original eligible respondents, 22.9% did not complete the Wave 3 survey.

6) W3 REFUSAL RATE – Of the total original eligible respondents, 6% refused to participate in the Wave 3 survey.

7) W3 CONTACT RATE – Of the total original eligible respondents, 87.7% were successfully contacted (whether they completed the survey or not).
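The headline retention and attrition figures above can be reproduced directly from the counts stated in this summary; a short Python check:

```python
# Reproduce the Wave 3 rates from the counts reported above.
eligible_w3 = 3282   # eligible Wave 1 respondents remaining at Wave 3
completed_w3 = 2532  # Wave 3 completions, including 13 partial cases

retention = completed_w3 / eligible_w3
print(f"W3 overall retention: {retention:.1%}")      # 77.1%
print(f"W3 attrition:         {1 - retention:.1%}")  # 22.9%
```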

Analyses of key behavioral variables (never drinking alcohol, never using marijuana, and regularly smoking cigarettes) reveal very minor differences between W1 and W3 responders and even smaller differences between W2 and W3 responders. Similar to what was found in Wave 2, when analyses are run comparing W3 responders to non-responders we see that non-responders are slightly more likely to never attend religious services. This finding is consistent with other social science research. On key demographic variables, again, only very minor differences were found between W1 and W3 responders and even smaller differences between the W2 and W3 responders.
Missing Data
With the exception of a few created variables, the standard “.” indicator of missing data has not been used. In the actual dataset (but not the codebook survey instrument), for all variables, DON’T KNOW=777, REFUSED=888, and NOT ASKED=999. All missing values are reported as 3-digit numbers, except for those that had more digits to start with (year, for example). The 999–NOT ASKED response indicates a valid skip of the question. In other words, a respondent does not have a response for that question because they were intentionally skipped past the question. In a few cases, there is a value of 666, which indicates an INVALID SKIP. These are cases where a respondent was incorrectly skipped out of a question due to computer or programmer error, or partial cases. The use of these codes instead of traditional missing data indicators means that analysts must be very careful to be aware of these cases in their analyses. Stata, for example, will not recognize 777, 888 or 999 as missing data. Therefore, unless you tell it otherwise, the stats package will include 777, 888, and 999 as actual values in your analysis. Always pay attention to the value of skip code indicators.
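Before analysis, these sentinel codes should therefore be converted to true missing values. A minimal pandas sketch follows; the variable name `attend` is invented for illustration, and the recode should only be applied to variables whose valid range excludes these codes (not, e.g., a year variable that could legitimately contain 999).

```python
import numpy as np
import pandas as pd

# NSYR sentinel codes: 666 = invalid skip, 777 = don't know,
# 888 = refused, 999 = not asked (valid skip).
SENTINELS = [666, 777, 888, 999]

# Toy data; "attend" is a hypothetical variable name.
df = pd.DataFrame({"attend": [1, 5, 777, 999, 3, 888]})
df["attend"] = df["attend"].replace(SENTINELS, np.nan)

print(df["attend"].isna().sum())  # 3
```

Analysts who need to distinguish valid skips (999) from refusals or don't-knows should recode those values separately rather than collapsing all sentinels to a single missing code.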
Religion Variables
The religion questions in the NSYR survey are complex. We have worked hard to create interpretable religion variables to be used in analysis. All of the original variables have been left in the dataset. However, many of these are incomplete because they were asked of only a subset of the respondents or because they do not include open-ended verbatim responses. For consistency across analyses, we ask that all analysts use the standard integrated religion variables created by NSYR as the starting point for their analysis. These variables include the following:

W3 Teen Re-Created Integrated Religion Variables:

RELIGION (religion_w3)
RELIG (relig1_w3)
RELIG (relig2_w3)
ATHEIST (atheist_w3)
RELIG3 (relig3_w3)
MORRL1_1 (morerel1_1_w3)
MORRL1_2 (morerel1_2_w3)
MORRL2_1 (morerel2_1_w3)
MORRL2_2 (morerel2_2_w3)
LESRL1_1 (lessrel1_1_w3)
LESRL1_2 (lessrel1_2_w3)
LESRL2_1 (lessrel2_1_w3)
LESRL2_2 (lessrel2_2_w3)

W3 New Teen Integrated Religion Variables:

RELATT (relatt_w3)
GODVIEW2 (godview2_w3)
TWOREL (tworel_w3)
LRNREL (lrnrel_w3)
RELATT2 (relatt2_w3)
WHOHEAV2 (whoheaven2_w3)
RELATCON (relatconv_w3)
TRADREL (tradrel_w3)
SPIRIT (spirituality_w3)
PARTCONV (partrelatconv_w3)

Descriptions of Key Integrated Teen Religion Variables:

RELIG1 (relig1_w3) is the religious identification variable. This is the variable that indicates what the respondent considers him/herself to be regardless of attendance.

RELATT (relatt_w3) is the pure attendance variable in that it is coded using only the attendance variables. We only consider first named attendance. The “Indeterminate” group on this variable includes anyone who said they attended (in ATTEND (attend_w3) and CHURTYPE (churtype_w3)) and were “Christian/Just Christian,” “Orthodox,” or “Independent.” It also includes those who could not be classified (i.e., they were “Other”) and responded with “Just Christian,” “Tabernaculo De Alcanza,” or “Bahai.”

TRADREL (tradrel_w3) is the variable categorizing respondents into major religious types. It is the combination of RELIG1 (relig1_w3) and RELATT (relatt_w3), but it always prioritizes attendance. So in essence this variable starts as an exact replication of RELATT (relatt_w3), but then uses RELIG1 (relig1_w3) to classify all those who could not be classified based on attendance (i.e., non-attenders and indeterminate). TRADREL (tradrel_w3) is similar to RELAFF (relaff_w2) in Wave 2 (and therefore to the reltrad variable from Wave 1), with slight differences. First, TRADREL (tradrel_w3) always prioritizes named attendance. Next, it splits the “no affiliation” group into 3 separate categories: not religious (respondents who do not attend and do not self-identify with a religion); indeterminate attends (respondents who attend a denomination, but one that cannot be categorized (e.g., Just Christian)); and indeterminate non-attends (respondents who do not attend but do identify with a group that cannot be categorized). Additionally, we were able to use prprot1_w3 to help categorize indeterminate individuals who answered this question. The variables used to determine TRADREL (tradrel_w3) included: ATTREG (attreg_w3), CHURTYPE (churtype_w3), BAPTIST (baptistgrp_w3), METHST (methstgrp_w3), PRSBIAN (prsbiangrp_w3), LUTHAN (luthangrp_w3), REFMED (refmedgrp_w3), CONGAL (congalgrp_w3), CHCHST (chchstgrp_w3), RELIG0 (relig0_w3), RELIG0A (relig0a_w3), RELIG0B (relig0b_w3), ATHEIST1 (atheist1_w3), HALF1 (half1_w3), OTHREL1 (anothrel1_w3), prport1_w3, and teenrace.
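The priority rule just described can be sketched as a small function. The category labels below are illustrative stand-ins, not the dataset's actual codes, and the real construction draws on the full list of denomination variables; this sketch captures only the stated ordering of the logic.

```python
# Illustrative sketch of the TRADREL priority rule: named attendance wins;
# self-identification classifies the rest; residual cases split into the
# three categories described above. Labels are invented for illustration.
def trad_rel(relatt: str, relig1: str) -> str:
    if relatt not in ("Non-attender", "Indeterminate"):
        return relatt                      # classifiable attendance wins
    if relig1 not in ("None", "Unclassifiable"):
        return relig1                      # fall back to self-identification
    if relatt == "Indeterminate":
        return "Indeterminate attends"     # attends an uncategorizable group
    if relig1 == "None":
        return "Not religious"             # neither attends nor identifies
    return "Indeterminate non-attends"     # identifies, but group uncategorizable

print(trad_rel("Catholic", "None"))        # Catholic
print(trad_rel("Non-attender", "Jewish"))  # Jewish
print(trad_rel("Non-attender", "None"))    # Not religious
```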
Wave 3 Gender
For those data users working with a combination of W1, W2, and W3 survey data sets, please note that there are 3 cases where the respondents reported themselves to be of a different gender in W3 than they did in W1, but the W3 gender is the same as reported at W2. And there is one case whose reported W3 gender is different from their Wave 1 and Wave 2 gender. Thus, there is a total of 4 cases that have a different gender on GENDER (gender_w3) than on teensex. When discrepancies arose during the W3 survey, interviewers took care to confirm that we had correctly recorded gender in the W3 dataset; thus we recommend relying on GENDER (gender_w3) as the gender variable in longitudinal analyses.
Wave 3 Age
In W3 of the survey, respondents were asked to confirm the date of birth they reported in W1 and/or W2. In a few cases the W3 date of birth was different. In almost all cases, the date was off by one day or one year, likely the result of a W1 or W2 interviewer keying error. In W3 the survey interviewers were instructed to carefully confirm the date of birth, particularly when there was a discrepancy between the W1 (or W2) and W3 date. For this reason, NSYR researchers advise that analysts use AGECATS (agecats_w3), the variable for the age the respondent gave at the time of the W3 survey, for respondent age. Further, in confirming dates of birth for W3 we discovered one case whose true birth date was outside the age range of the original sample. We have therefore removed this case from the W3 dataset as “ineligible” and advise that analysts remove it from W1 and W2 analyses.
School Variables
The Wave 3 survey took place from September of 2007 through April 2008. Therefore, unlike in Wave 2, all education questions are in relation to the school they are “currently” attending. Although some respondents completed the survey during a school vacation (e.g., Thanksgiving or Winter break), interviewers instructed respondents to answer based on the school they attended when not on vacation. Also, unlike in past waves, the employment question refers to number of hours worked in a “typical” week, reducing the potential for upward bias among respondents answering this question while on a break from full-time school.
Longitudinal Variables
Included in the W3 data set are a series of created variables that indicate whether a particular behavior or characteristic was reported at particular waves. All of these variables are designated with the prefix “cu_”. Also, they each end in a suffix that represents the span of waves that were considered in determining the variable. For example, the variable CU_DRINK (cu_drink_13) indicates whether a respondent reported any level of alcohol consumption at Wave 1, 2, or 3, whereas the variable CU_COHAB (cu_cohab_23) indicates whether the respondent reported ever having cohabited with a romantic partner at either Wave 2 or Wave 3 (because the cohabitation questions were not asked at Wave 1). It must be noted that a person only has to answer in the affirmative to the pertinent question at any wave to be coded as “1” on these variables. For example, if a person reported drinking at Wave 1 but was missing from both Waves 2 and 3, he/she would still be coded as “1” on the variable CU_DRINK (cu_drink_13). For the variables ending in "_23", however, someone who did not respond at either Wave 2 or Wave 3 is coded as missing (i.e., they never had an opportunity to report this behavior).
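The coding rule can be sketched as a hypothetical helper, with 1/0 for an affirmative/negative report and `None` standing in for a wave at which the respondent did not participate:

```python
import math

# Sketch of the cumulative ("cu_") coding rule: an affirmative report at
# any observed wave yields 1; a respondent unobserved at every wave in the
# span is treated as missing (returned here as NaN).
def cumulative(*waves):
    observed = [w for w in waves if w is not None]
    if not observed:
        return math.nan   # never had an opportunity to report
    return 1 if any(observed) else 0

print(cumulative(1, None, None))  # 1 (reported at W1, missing at W2/W3)
print(cumulative(None, None))     # nan (missing for a "_23" variable)
```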

The creation of these variables presented the problem of people changing answers between waves. For example, several individuals reported having had sexual intercourse at Wave 2 but when asked at Wave 3 claimed to have never had sex. For the majority of these cumulative variables we code all cases who ever report the given behavior as “1.” But, for the sexual behavior questions (i.e., CU_TOUCH (cu_touch_13); CU_ORAL (cu_oralsex_13); CU_SEXIN (cu_sexinter_13)), we believed we needed to create a separate category for such changes in reports, due to the possibility of being “born again.” To indicate these cases, we have added a “555” code to note cases that switch their report of the behavior across waves. Analysts should make informed decisions, and document their rationale, when using these variables and cases.
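The switch code can be added to the same kind of sketch; again, this is a hypothetical illustration of the stated rule, with 1 = reports having ever engaged in the behavior, 0 = reports never having done so, and `None` = wave not observed:

```python
# Sketch of the "555" rule for the sexual-behavior cumulative variables:
# a respondent who affirms the behavior at one wave but denies having
# ever done it at a later wave is flagged 555 instead of 1.
def cumulative_sex(*waves):
    observed = [w for w in waves if w is not None]
    if not observed:
        return None
    # A 0 ("never") appearing after a 1 ("ever") is a switched report.
    if 1 in observed and 0 in observed[observed.index(1):]:
        return 555
    return 1 if any(observed) else 0

print(cumulative_sex(0, 1, 1))  # 1   (ordinary progression)
print(cumulative_sex(1, 0, 0))  # 555 (report switched across waves)
```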
Parent Figure Variables
In each wave of survey data collection a series of questions regarding relationships with parent figures was asked. In each wave the respondents were asked to answer questions about the Wave 1 parent figures, who were the parents who took part in the Wave 1 parent survey, along with any reported partners of that parent at the time of the initial Wave 1 survey. In Waves 2 and 3 respondents were asked if there were any people, other than the Wave 1 parent figures, who they considered to be parent figures. If they said ‘yes’, they were asked to answer a series of questions about their relationships to those people. It should be noted, however, that we do not know whether the “other” parent figures reported in Wave 3 are or are not the parent figures that were reported on in Wave 2. We only know that they are not the parent figures initially reported in Wave 1.
Other Wave 3 Data Issues
Users should be aware that 13 cases did not complete the interview. For all questions after the point at which they stopped the interview, these cases are coded as a legitimate skip. These cases are indicated in the intstatus variable.

As is customary with Stata dates, all dates have been converted into days since January 1, 1960. The variable labels are formatted as mm/dd/yyyy, but the variables actually contain a continuous, numeric value marking the days that have elapsed since January 1, 1960.
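To recover calendar dates from these elapsed-day values outside of Stata, add the day count to the origin date; a minimal Python sketch:

```python
from datetime import date, timedelta

# Convert a Stata-style elapsed-day value (days since January 1, 1960)
# back into a calendar date.
def stata_days_to_date(days: int) -> date:
    return date(1960, 1, 1) + timedelta(days=days)

print(stata_days_to_date(0))      # 1960-01-01
print(stata_days_to_date(17500))  # 2007-11-30
```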

Similarly, all time span created variables (e.g., length of longest relationships) are converted into days by multiplying the given: years by 365, months by 30.5, and weeks by 7. These three values (or whichever were present) were then added to the number of days to come up with a continuous measure in days. In cases where the final value ended in .5 (due to an odd number of months being reported), the final value was rounded up to the next whole day.
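The stated conversion can be replicated directly; a small sketch with the rounding rule applied:

```python
import math

# Replicate the stated conversion: years*365 + months*30.5 + weeks*7 + days,
# with a trailing .5 (from an odd number of months) rounded up.
def span_in_days(years=0, months=0, weeks=0, days=0):
    total = years * 365 + months * 30.5 + weeks * 7 + days
    return math.ceil(total)

print(span_in_days(years=1, months=3))          # 457 (456.5 rounded up)
print(span_in_days(months=2, weeks=1, days=3))  # 71  (61 + 7 + 3)
```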

There are 12 cases whose household roster was miscoded to not include a cohabiting romantic partner. These cases are noted as invalid skips on PARTNER (partner_w3) and FSTCOHAB (fstcohab_w3). But these cases also incorrectly received questions in C:11, and in certain cases received incorrect wording for the other cohabitation questions. The combination of the problematic wording and incorrect questions being asked could have created misunderstandings and potentially faulty responses for these 12 cases on the cohabitation questions. Care should be used when handling the responses of these cases on the cohabitation questions.

LIVPAR (livpar_w3) is a created variable that indicates if the respondent reports a parent in the provided household roster. A respondent is considered to be living with a parent only if he/she reports a biological mother/father, step mother/father, or foster mother/father as living in the household. Other adult family members (e.g., grandparents, aunt, uncle, etc.) were not considered parents for this created variable.
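The rule can be sketched as a simple roster check; the relationship labels here are illustrative placeholders, not the dataset's actual response codes:

```python
# Sketch of the LIVPAR rule: 1 only when the household roster includes a
# biological, step, or foster parent. Labels are invented for illustration.
PARENT_ROLES = {
    "biological mother", "biological father",
    "step mother", "step father",
    "foster mother", "foster father",
}

def livpar(roster) -> int:
    return int(any(member in PARENT_ROLES for member in roster))

print(livpar(["biological mother", "brother"]))  # 1
print(livpar(["grandmother", "uncle"]))          # 0
```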

HASMOTH2 (hasmother2_w3) and HASFATH2 (hasfather2_w3) are created variables that indicate the respondent reports considering “someone else” a mother or father figure at Wave 3. “Someone else” here means either someone in addition to or in place of a Wave 1 parent figure, or a “new” mother or father figure (i.e., the respondent did not report on a mother/father figure at Wave 1). In interpreting these variables, however, it is important to note that the questions used to create them incorporated skip patterns based on the respondent’s parental situation at Wave 1. Specifically, parents and respondents were only asked questions about a potential parent figure if that parent figure lived in the same household as the teen. These are the parent figures that are first asked about here in Wave 3. Therefore, caution is warranted in interpreting these cases as having “new” mother or father figures. Rather, the respondents could simply be reporting on individuals they have always considered to be parent figures, but about whom we had not asked because they were not living in the household at Wave 1. Of course, it is possible that these are indeed individuals who are new to the respondent’s life since the first interview. Unfortunately, given the generality of the question, it is impossible to differentiate.