World Religion Dataset: Global Religion Dataset

The World Religion Dataset (WRD) aims to provide detailed information about religious adherence worldwide since 1945. It contains data about the number of adherents by religion in each of the states in the international system. These numbers are given for every half-decade period (1945, 1950, etc., through 2010). Percentages of the states' populations that practice a given religion are also provided. (Note: These percentages are expressed as decimals, ranging from 0 to 1, where 0 indicates that 0 percent of the population practices a given religion and 1 indicates that 100 percent of the population practices that religion.) Some of the religions (as detailed below) are divided into religious families. To the extent data are available, the breakdown of adherents within a given religion into religious families is also provided.
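
As a small illustration of the decimal convention described above, the following Python sketch converts a population share on the 0 to 1 scale into a percentage. The function name and the sample value are hypothetical and are not taken from the dataset or its codebook.

```python
def share_to_percent(share: float) -> float:
    """Convert a population share on the 0-1 scale to a percentage."""
    if not 0.0 <= share <= 1.0:
        raise ValueError(f"share must be between 0 and 1, got {share}")
    return share * 100.0


# Hypothetical value for illustration only; not a figure from the dataset.
example_share = 0.25
print(share_to_percent(example_share))
```

The range check is a convenience for catching values that were already converted to percentages before being passed in.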

The project was developed in three stages. The first stage consisted of the formation of a religion tree. A religion tree is a systematic classification of major religions and of religious families within those major religions. To develop the religion tree we prepared a comprehensive literature review, the aim of which was (i) to define a religion, (ii) to find tangible indicators of a given religion or of religious families within a major religion, and (iii) to identify existing efforts at classifying world religions. (Please see the original survey instrument to view the structure of the religion tree.) The second stage consisted of the identification of major data sources of religious adherence and the collection of data from these sources according to the religion tree classification. This created a dataset that included multiple records for some states for a given point in time. It also had missing data for specific states, time periods and religions. The third stage consisted of cleaning the data, reconciling discrepancies between different sources and imputing data for the missing cases.

The Global Religion Dataset: This dataset uses a religion-by-five-year unit. It aggregates the number of adherents of a given religion and religious group globally by five-year periods.
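
The aggregation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the project's actual procedure: the record layout, the state and religion labels, and the adherent counts are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical state-level records: (state, year, religion, adherents).
# All labels and counts below are invented for illustration.
records = [
    ("A", 1945, "religion_x", 100),
    ("B", 1945, "religion_x", 50),
    ("A", 1945, "religion_y", 30),
    ("A", 1950, "religion_x", 120),
    ("B", 1950, "religion_y", 40),
]


def aggregate_globally(rows):
    """Sum adherents across states for each (year, religion) pair."""
    totals = defaultdict(int)
    for _state, year, religion, adherents in rows:
        totals[(year, religion)] += adherents
    return dict(totals)


global_totals = aggregate_globally(records)
for (year, religion), total in sorted(global_totals.items()):
    print(year, religion, total)
```

Keying the totals on the (year, religion) pair mirrors the religion-by-five-year unit: each state's count contributes to exactly one global cell.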

Data File
Cases: 14
Variables: 77
Weight Variable: None
Data Collection
Date Collected: 2009-2012
Original Survey (Instrument)
World Religion Dataset Codebook
Funded By
The John Templeton Foundation and the University of California, Davis
Collection Procedures
As noted above, the religion data source list was formed first. The sources varied from census-based data to specific estimates of religious groups or specific sources that focused on a given religion in a longitudinal manner (either within a given country or for several countries). Some of the sources contained multiple data points on global or regional levels, but most contained scattered data on specific countries at discrete points in time.

A general instruction sheet for coders was then generated. A number of issues had to be dealt with in order to ensure high inter-coder reliability. First, denominational-level data had to be aggregated into the appropriate religious families. This proved to be a major challenge, especially within the Protestant family of Christianity. For example, some sources coded Anglicans as Protestants. Others included multiple Protestant denominations, sometimes under different labels. A related problem was to classify the various Christian Orthodox denominations under the Eastern Orthodox family. Islamic denominations also presented a significant challenge. The coding instructions were not always sufficiently specific to handle the diversity of categories provided by different sources; hence, the project directors and the data managers had to resolve multiple ambiguities in these sources.

The initial strategy was to collect data from each source on a different record. This was done even if a given source listed only the number (or percentage) of adherents for a single religion. Each data point (or set of data points) was identified by the source from which it was taken, the date of the data and the date these data were coded within the given source. A number of tests were run on the data collected from each of the sources (such as consistency over time, source of the data coded in each source -- e.g., census, secondary data, etc. -- and comprehensiveness of coverage of different religions). A questionnaire was then distributed among project members to solicit reliability estimates for each of the sources used. The sources were then ranked according to an estimate of reliability.

Please see the original survey instrument for more information on how the data were aggregated, how data points from multiple sources were reconciled, how missing data were interpolated, and how the final data cleaning process was undertaken.
Sampling Procedures
Data were collected from a variety of sources. The sources varied from census-based data to specific estimates of religious groups, or specific sources that focused on a given religion in a longitudinal manner (either within a given country or for several countries).
Principal Investigators
Zeev Maoz, University of California, Davis, and Errol Henderson, The Pennsylvania State University
Related Publications
Zeev Maoz and Errol A. Henderson. 2013. The World Religion Dataset, 1945-2010: Logic, Estimates, and Trends. International Interactions, 39(3).
Citation
Please cite the Maoz and Henderson 2013 publication listed under Related Publications above when publishing results that use this dataset.
List of Principal Sources
Please see the link to the Original Codebook above for a complete list of sources. The sources are listed at the end of the Codebook.
