
Analytic methodology for childhood predictor analyses for wave 1 of the Global Flourishing Study

Abstract

In this article, we describe the statistical and design methodology of the childhood predictor analyses used as part of a coordinated set of manuscripts for wave 1 of the Global Flourishing Study (GFS). Aspects covered include the following: the childhood predictor regression analyses, accounting for the complex sampling design, missing data and imputation, sensitivity analysis for unmeasured confounding, and meta-analysis. We provide a brief illustrative example of the childhood predictor analyses using the sense of mastery construct indicator from the GFS survey and conclude by outlining some strengths and limitations of the methodology employed.


Background

The Global Flourishing Study (GFS) is a large, multinational panel study that aims to explore the distribution, determinants, and interrelations of various concepts related to human well-being with more than 200,000 people across a geographically and culturally diverse set of countries around the world [1,2,3]. Interest in flourishing has surged in recent years across various fields like psychology, economics, and public health [4,5,6,7,8,9,10]. However, many aspects of well-being remain underexplored, especially globally, as much of the well-being literature has been shaped by Western perspectives [7, 11]. As a multinational panel study, the GFS provides an avenue to explore well-being and flourishing from a multicultural perspective to fill this gap.

The purpose of this article is to describe the methodology applied to the set of construct-specific childhood predictor analyses that were produced using currently available wave 1 data from the GFS, most of which are planned for inclusion in manuscripts that are being considered for publication as a coordinated set of manuscripts on wave 1 of the GFS. These childhood predictor analyses apply a common preregistration template principally focused on exploring the distribution of the associations that various childhood experiences have with scores on indicators of aspects of flourishing in each country and how the associations vary across countries.

Evaluating the associations of childhood factors with present aspects of flourishing provides a valuable contribution to understanding the determinants of flourishing as shaped by experiences growing up. Retrospective childhood experiences evaluated include quality of relationship with mother, quality of relationship with father, parental marital status growing up, subjective financial status of family growing up, self-reported history of abuse during childhood, feeling like an outsider in one’s family when growing up, and self-rated health growing up. Many of the constructs included in the GFS are seldom included in cross-cultural cohort studies (see survey development report [12]), providing a unique opportunity to strengthen existing knowledge about aspects of well-being from a multinational perspective. All analyses are conducted separately by country, which not only preserves potential heterogeneity in the interpretation of survey items across countries but also allows the results to be contextualized considering the sociocultural particularities within each country. Then, country-specific results are pooled using meta-analytic techniques to summarize the associations of childhood experiences with present aspects of flourishing. These analyses provide a template for evaluating the determinants of flourishing globally, and the use of a consistent methodology across manuscripts allows for comparability of results across many different aspects of flourishing.

There are three core components of the current article. First, we begin by providing a high-level description of the data and measures used in the childhood predictor analyses. Next, we discuss aspects of the methodology, namely the regression analyses, accounting for the complex sampling design, missing data and imputation, and meta-analysis. Lastly, we use the sense of mastery outcome (How often do you feel very capable in most things you do in life?; response options include always, often, rarely, and never) to provide an illustrative example of the analyses and results that will be presented in the construct-specific childhood predictor analysis manuscripts; see Kim et al. [13] for more details on the mastery outcome.

Global flourishing study data

Currently available wave 1 GFS data includes nationally representative samples of the adult population (18 years old and older) from 22 geographically and culturally diverse countries, including Argentina, Australia, Brazil, Egypt, Germany, Hong Kong (Special Administrative Region of China), India, Indonesia, Israel, Japan, Kenya, Mexico, Nigeria, the Philippines, Poland, South Africa, Spain, Sweden, Tanzania, Turkey, the UK, and the USA (wave 1 data will also become available for mainland China once wave 2 data are released in early 2025). These countries were selected to (a) maximize coverage of the world’s population; (b) ensure geographic, cultural, and religious diversity; and (c) prioritize feasibility and existing data collection infrastructure. The study encompasses approximately 64% of the global population. These countries also include some of the world’s largest and most influential communities of religious believers, including Christians (Nigeria, Brazil, Germany, Philippines, South Africa, USA), Muslims (Indonesia, Nigeria, Turkey, Egypt), Buddhists (Japan), Hindus (India), and Jews (Israel). Data collection was carried out by Gallup, a global analytics and advisory organization with decades of experience collecting global data on various aspects of human life. Most of the data for wave 1 were collected in 2023, with some countries beginning data collection in 2022; exact dates of data collection vary by country [14]. The GFS is set to continue with four additional waves of annual panel data collection from 2024 to 2027. The precise sampling design that was used to collect wave 1 data varied by country to ensure nationally representative samples for each country. Further details of the sampling design methodology are available elsewhere [14, 15].

Survey items included numerous aspects of well-being such as happiness and life satisfaction, physical and mental health, meaning and purpose, character and virtue, close social relationships, and financial and material stability [16], along with numerous other demographic, social, economic, political, religious, personality, childhood, community, health, and well-being variables. Development of the GFS survey occurred over eight distinct phases: (1) selection of core well-being and demographic questions; (2) solicitation of social, political, psychological, and demographic questions from domain experts worldwide; (3) revision of the initial survey draft based on feedback from scholars around the world representing various academic disciplines; (4) modification of question items following input from experts in multinational, multiregional, and multicultural survey research; (5) survey draft refinement based on compiled input from an open invitation to comment, posted publicly, and sent to numerous listservs; (6) questionnaire optimization with support from Gallup survey design specialists; (7) adaptation of items from an interviewer-administered to a self-administered survey instrument using best practices for web survey design to minimize item non-response, illogical responses, and incomplete responses; and (8) confirmation by scholars in several participating countries that translations accurately captured the intended meaning of each question [3, 15].

The data are publicly available through the Center for Open Science (https://www.cos.io/gfs). During the translation process, Gallup adhered to the TRAPD model (translation, review, adjudication, pretesting, and documentation) for cross-cultural survey research [17]. Additional information about methodology and survey development can be found in the GFS Questionnaire Development Report [3, 7], as well as the GFS Methodology [14], GFS Codebook (https://osf.io/cg76b), and GFS Translations documents [18].

Measures

Childhood predictor variables

A total of 17 childhood predictors were initially considered and preregistered, comprising all of the retrospective recall items about childhood characteristics that were included in the GFS intake survey. The survey development process included multiple phases, the details of which are described in Lomas et al. [7]. The 17 items included year of birth/age, gender, race/ethnicity, immigration status, childhood abuse, feeling like an outsider in one’s family, childhood health, subjective financial status growing up, parental marital status growing up, relationship with mother during childhood, relationship with father during childhood, feeling loved by mother growing up, feeling loved by father growing up, religious affiliation growing up, religious service attendance growing up, religious service attendance of mother growing up, and religious service attendance of father growing up. Religious affiliation was assessed in all countries, but the observed response options varied greatly across countries. Racial/ethnic identity was assessed in some but not all countries, and response options were unique to each country to be locally relevant. Additional details about the items and response options for each are reported in the GFS Questionnaire Development Report [3] and the GFS Codebook (https://osf.io/cg76b).

In general, recorded responses of “don’t know,” “refused,” “skipped,” “prefer not to answer,” and “does not apply” were coded as missing. However, the parental relationship predictors were re-coded with an additional indicator: if the recorded response to any of the parent relationship items was “does not apply,” this control indicator was set to 1, and it was 0 otherwise. A response to a parental relationship variable may be “does not apply” because the parent had died or was not present for some other reason, which would be an important childhood event to control for in the analysis. We used this additional coded indicator to control for such possible childhood events in the country-specific regression analyses.
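To make this recoding concrete, the following minimal R sketch (using hypothetical variable names and toy responses rather than the actual GFS variables, which follow the GFS Codebook) constructs the control indicator and then sets the non-substantive responses to missing:

```r
# Minimal sketch of the "does not apply" recoding (hypothetical variable names/values).
dat <- data.frame(
  rel_mother = c("Very good", "Does not apply", "Somewhat bad", "Refused"),
  rel_father = c("Somewhat good", "Very good", "Does not apply", "Very bad"),
  stringsAsFactors = FALSE
)

# Single control indicator: 1 if "Does not apply" was recorded on any parental
# relationship item, 0 otherwise
dat$parent_does_not_apply <- as.integer(
  dat$rel_mother == "Does not apply" | dat$rel_father == "Does not apply"
)

# All non-substantive responses (including "Does not apply") are then set to missing
non_substantive <- c("Does not apply", "Refused", "Skipped",
                     "Don't know", "Prefer not to answer")
dat$rel_mother[dat$rel_mother %in% non_substantive] <- NA
dat$rel_father[dat$rel_father %in% non_substantive] <- NA
```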

The entire set of childhood predictors that were included in the initial preregistration could not be used due to issues with multicollinearity encountered during preliminary testing of the statistical analysis code. This multicollinearity was evident in the strong correlations of certain variables with others, in very large standard error estimates for these variables in the regression models, and in distorted heterogeneity estimates in the meta-analysis (see below). As a result of this multicollinearity, a reduced set of 13 childhood predictors was used. The removed predictors were (1) feeling loved by mother, (2) feeling loved by father, (3) religious service attendance of mother, and (4) religious service attendance of father. The relationship quality variables for mother and father were strongly correlated with the feeling loved by mother and father variables. In consultation with partners at Gallup, the loved-by-parent items were dropped from the regression analyses because the relationship items were generally interpreted with greater consistency, and subject to fewer translational issues, across countries and languages. The parental religious service attendance while growing up items were especially collinear with the individual’s self-reported religious service attendance while growing up, prompting the removal of the former from the set of predictors. We comment on multicollinearity in more detail below.

A note on multicollinearity

Although we expected that at least some of the childhood predictors would be correlated, we anticipated that the effects of multicollinearity would be minimal because there are important conceptual distinctions even between those predictors we might expect to be more closely related (e.g., general quality of relationship with father vs. feeling loved by father). We also anticipated that any concerns about multicollinearity might be mitigated by large country sample sizes. This was the case in some countries, such as the USA (N = 38,312), where the full set of initially preregistered childhood predictors was estimable with reasonable standard errors. However, we encountered multicollinearity issues for a number of outcomes in several countries, particularly when sample sizes were smaller. For example, in the Turkey sample (N = 1473), coefficient estimates using the full set of childhood predictors could not be obtained with reasonable standard errors. For illustration, our appendix provides the country-specific results for Turkey and the USA after regressing sense of mastery on the full set of predictors for these two samples (see Additional File 1: Table S1). Additional File 1: Table S1 also includes the meta-analysis results and the estimate of heterogeneity for each effect.

To address concerns about multicollinearity, we considered using a regularizing estimator such as the LASSO [19]. However, in consultation with colleagues at the Institute for Quantitative Social Science at Harvard University, we found no widely accepted approaches to incorporating complex sampling design adjustments or missing data into existing regularizing estimators, nor could we find an approach that was integrated into all software packages that might be used to perform the analyses. Thus, a decision was made to update and document a modification of the preregistration template by removing the four abovementioned childhood predictors from all analyses. Decomposing the effects of such highly related but conceptually distinct aspects of the childhood experience will be left to subsequent investigations where a more nuanced approach to statistical methodology can be considered.

Outcome variables

A range of continuous, binary, Likert-type, and nominal response scales were used to assess the different constructs included in the GFS. All items with at least 10 ordered responses were treated as approximately continuous. All remaining Likert-type and nominal items were recoded into binary variables based on cutoffs specified in the preregistrations for each outcome.
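As a brief sketch of this recoding rule, assuming a hypothetical four-category Likert-type item and the always/often cutoff used in the mastery example later in this article:

```r
# Minimal sketch of the outcome recoding rule (hypothetical responses).
# Items with >= 10 ordered categories are treated as approximately continuous;
# shorter Likert-type items are collapsed to 0/1 at the preregistered cutoff.
mastery_raw <- c("Always", "Often", "Rarely", "Never", "Often")
mastery_bin <- as.integer(mastery_raw %in% c("Always", "Often"))
table(mastery_raw, mastery_bin)
```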

Childhood predictors analyses

This section describes the analyses that were carried out within each country; we later describe the random effects meta-analyses used to summarize results across countries. Analyses were implemented across multiple software packages (R [20], Stata [21], SAS [22], and SPSS [23]) to ensure consistency in results and ease of use by the larger core group [24]. Implementing the analyses in separate software allows for greater reach across fields, enabling others to utilize and replicate our analyses in their software of choice. Any deviations across software package implementations are described below.

Separating analyses by country

The core analyses of the GFS were conducted separately within each country. As described below, summary statistics were obtained by random effects meta-analysis rather than, for example, by use of a multi-level model. A key advantage of this approach is that it does not presume cross-cultural measurement equivalence of the measures, which is important because most constructs were assessed using a single item, and cognitive testing during the survey development process suggested some variation in the interpretation of items across countries [25, 26]. Thus, it may be preferable to treat the measures as closely related, but not identical, assessments of each construct across the countries. We chose to conduct analyses separately for each country because it preserves potential heterogeneity in the interpretation of survey items across countries, and it allows the results to be contextualized in light of the sociocultural particularities of each country. This approach also aligns with our decision to use a random effects meta-analysis to combine effect estimates for each childhood predictor category, which implies that the regression coefficients we combine are not necessarily estimates of a single effect from repeated samples of the same population of respondents, but rather that the effects may vary across countries, which form distinct populations with their own effects. The resulting meta-analyzed effects represent the average effect across populations, without assuming that those effects are equivalent across the different populations.

Regression analyses

Manuscripts that apply the preregistration template for the construct-specific childhood predictor analyses used complex survey-adjusted regression models to evaluate the effect of coded childhood characteristics on the GFS outcome. The type of regression model depended on the scale of the outcome, but the general form of the model is:

$$y=f(\text{X}\beta )+e$$

where \(f\left(.\right)\) is the mean (inverse link) function, which is the identity function for continuous outcomes or the exponential function for modified Poisson regression; \(\text{X}\) is the design matrix of childhood predictors; \(\beta\) is the vector of regression coefficients; and \(e\) is the vector of errors distributed according to the outcome.

When the outcome is binary, we conducted modified Poisson regressions. Modified Poisson regression is a popular approach to estimating risk ratios [27]. The exponentiated coefficients from a modified Poisson model can be interpreted as risk ratios: a risk ratio is the ratio of the probability of an outcome in an exposed group to the probability of the outcome in an unexposed group. Logistic regression was not used for binary outcomes because the resulting odds ratio estimates are frequently misinterpreted as risk ratios. This misinterpretation is especially problematic when binary outcomes are common (prevalence between 0.10 and 0.90), as is the case for many outcomes in the GFS. To aid comparability of results across outcomes, and to avoid these problems with interpretation, risk ratios obtained by modified Poisson regression were employed throughout. However, risk ratios are not invariant under re-labeling of the outcome categories (and readers should not misinterpret them as odds ratios).

Point estimates of effects were obtained using weighted least squares. Robust standard errors were computed using a Taylor series linearization approach and adjusted for stratified sampling when necessary. However, each software package differs in how these methods are implemented, resulting in minor variations in results across packages.

For analyses conducted using R, the survey package [28] was used to estimate the model. For analyses conducted in Stata, the built-in svy: reg and svy: poisson procedures were used [29]. For analyses through SAS, there was no built-in functionality for estimating a complex survey-adjusted modified Poisson model. We modified a macro, %surveygenmod [30], published as a conference proceeding of the SAS Global Forum in 2017, to obtain the necessary functionality in SAS; we needed to make several significant changes to the original macro, culminating in the modified macro surveygenmod2 [31].
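For readers working in R, the following minimal sketch illustrates how a complex survey-adjusted modified Poisson model might be fit with the survey package; the variable names and toy data are hypothetical stand-ins for one country’s (imputed) sample rather than actual GFS variables:

```r
# Minimal sketch of a survey-adjusted modified Poisson regression in R
# (hypothetical variable names; toy data stand in for one imputed country sample).
library(survey)

set.seed(1)
dat <- data.frame(
  y_bin     = rbinom(500, 1, 0.6),           # dichotomized outcome (e.g., mastery)
  abuse     = rbinom(500, 1, 0.2),           # example childhood predictor
  health_12 = sample(1:5, 500, replace = TRUE),
  psu_id    = rep(1:50, each = 10),
  strata_id = rep(1:10, each = 50),
  wgt       = runif(500, 0.5, 2)
)

des <- svydesign(ids = ~psu_id, strata = ~strata_id, weights = ~wgt,
                 data = dat, nest = TRUE)

# Poisson working model with a log link; the survey package's Taylor series
# linearization supplies the robust standard errors
fit <- svyglm(y_bin ~ abuse + factor(health_12), design = des,
              family = quasipoisson(link = "log"))

exp(cbind(RR = coef(fit), confint(fit)))     # risk ratios with 95% CIs
```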

Complete separation

For categorical outcomes, an issue known as complete separation [32] can sometimes occur when estimating the modified Poisson regression model. Complete separation arises when all respondents in one level of a childhood predictor variable have the same value on the outcome. This results in uninterpretable estimates for those effects, as there is no observed variation in the outcome within that level of the predictor. We have noted when this occurs in the country-specific results and the effect that such cases might have on the results of the meta-analyses.

Joint test of effects for groups of coefficients

We used Wald tests to conduct a joint hypothesis test of whether a group of parameters is significantly different from zero. A Wald-type test was the most straightforward approach to implement in different software to obtain comparable results and p-values. The general form of the Wald-type test implemented across software is as follows: the estimated parameter variance–covariance matrix (\(\widehat{\text{V}}\)) is used to test the hypothesis \({H}_{0}: \text{L}\beta =0\), where \(\beta\) is a vector of regression coefficients and \(\text{L}\) is a design-like matrix specifying which elements in \(\beta\) are being tested. The test statistic is computed using:

$${F}_{Wald}=\frac{{\left(\text{L}\widehat{\upbeta }\right)}^{t}{\left(\text{L}\widehat{\text{V}}{\text{L}}^{t}\right)}^{-1}\left(\text{L}\widehat{\upbeta }\right)}{rank(\text{L}\widehat{\text{V}}{\text{L}}^{t})}$$

Then, p-values are obtained from the F-distribution with numerator degrees of freedom equal to the number of tested parameters (i.e., the rank of \(\text{L}\widehat{\text{V}}{\text{L}}^{t}\)), and denominator degrees of freedom equal to the model degrees of freedom from the regression analysis minus the number of parameters tested (i.e., the sum of the weights, approximating the sample size, minus the rank of \(\text{L}\widehat{\text{V}}{\text{L}}^{t}\)). In our case, the sample size for each country is usually large enough that the specific value of the denominator degrees of freedom is likely to have little bearing on the results. The above F-statistic is calculated from the regression results of each imputed dataset. The F-statistic and degrees of freedom from each estimated test were saved and averaged across imputations. A global p-value is then obtained from the F-distribution using the averaged F-statistic and degrees of freedom, \({p}_{global}\text{ } = 1-pF\left({F}_{pooled}|d{f}_{1,pooled}, d{f}_{2,pooled}\right),\) where \(pF(.)\) is the cumulative F-distribution function, \({F}_{pooled}\) is the average F-statistic across imputed datasets, \(d{f}_{1,pooled}\) and \(d{f}_{2,pooled}\) are the average degrees of freedom across imputed datasets, and \({p}_{global}\) is the global p-value reported in each country-specific analysis.
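A minimal R sketch of this pooling step, using hypothetical stored F-statistics and degrees of freedom from five imputed datasets (in practice these would come from regTermTest() in R or the analogous procedure in the other packages), is:

```r
# Minimal sketch of pooling the Wald-type F-tests across imputed datasets
# (hypothetical stored results from m = 5 imputations).
f_stats <- c(3.1, 2.8, 3.4, 2.9, 3.2)   # F statistic from each imputed dataset
df1     <- c(2, 2, 2, 2, 2)             # numerator df (rank of the tested contrast)
df2     <- rep(1468, 5)                 # denominator df

F_pooled   <- mean(f_stats)
df1_pooled <- mean(df1)
df2_pooled <- mean(df2)

p_global <- 1 - pf(F_pooled, df1_pooled, df2_pooled)
p_global
```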

The above F-test may have minor differences depending on the software used; the interested reader is referred to the package-specific documentation for joint tests: for R, see Lumley [33]; for Stata, see StataCorp [29], “test” example 6; for SAS, see SAS Institute [34] for tests with continuous outcomes, and see Padgett and Chen [31] for tests with binary outcomes.

E-values for sensitivity to unmeasured confounding and possible recall bias

For each childhood predictor, we calculated E-values to evaluate the sensitivity of results to potential unmeasured confounding. An E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder must have with both the predictor and the outcome, above and beyond all measured covariates, to explain away an observed association [35]. A high E-value signifies that an unmeasured confounder would need to have a strong association with both the childhood predictor and the outcome to explain away the observed association, suggesting that the result is more likely to reflect a true causal relationship. An E-value close to 1 signifies the opposite: the observed association could be explained away by an unmeasured confounder with only a weak relationship with the predictor and the outcome. Approximate E-values can be obtained for continuous outcomes through scale conversions [35]. E-values are provided for the country-specific results and for the meta-analyzed average effects (both the random effects and population-weighted estimates).
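As an illustration, the following minimal R sketch computes E-values with the EValue package for a hypothetical risk ratio and confidence interval (the numbers are placeholders, not GFS results):

```r
# Minimal sketch of an E-value calculation (hypothetical estimate and CI limits).
library(EValue)

# E-values for the point estimate and for the confidence limit closest to the null
evalues.RR(est = 1.12, lo = 1.05, hi = 1.20)

# For approximately continuous outcomes, the package also offers conversions,
# e.g., evalues.MD() for standardized mean differences
```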

All of the childhood predictors are assessed retrospectively and are thus potentially subject to recall bias. The adult outcome may itself affect how participants recall their childhood experiences. This may be less likely for some childhood predictors (e.g., marital status of parents) than for others (e.g., self-rated health at age 12). Nevertheless, the concern needs to be taken into account in the interpretation of the analyses. It can, however, be shown that for recall bias to completely explain away the observed associations between the childhood predictors and the specific outcome, the effect of the adult outcome on biasing the retrospective assessments of the childhood predictor would essentially have to be at least as strong as the observed predictor-outcome associations themselves [36]. The observed multivariable-adjusted association itself thus constitutes an analogue of sorts to the E-value for differential measurement error due to recall bias [36]. Comment will be made on this issue in the “Discussion” section of each of the individual childhood predictor manuscripts. In some cases, such as that illustrated below with mastery, recall bias might be sufficient to explain away some of the observed associations. However, when effect sizes are larger, this may be less plausible.

Accounting for the complex sampling design

Accounting for the complex sampling design was accomplished by utilizing the information provided by Gallup on the primary sampling unit (PSU) IDs, strata IDs, and sampling weights. The weighting variable and PSU/strata IDs were included in all country-specific analyses. A complexity arises when respondents are recruited face-to-face, because this sometimes results in strata with a single PSU. A stratum with only a single PSU is known as a lonely PSU and makes variance and standard error estimation more complex because traditional methods assume multiple PSUs within each stratum [37, 38]. We elected to use the “certainty” specification, in which single-PSU strata do not contribute to the variance; this maintained relatively comparable results across statistical software, with the degree of comparability depending on the level of missingness in the childhood characteristic and on the specific outcome. Complete details concerning the implementation of these methods to account for the complex sampling design of each country can be found in the open code [39]. The methods were generally the same across all software packages, with very minor exceptions such as the singleton PSU issue mentioned above. We mostly relied on the default settings within each software package, which led to nearly identical results across packages, with slight differences, principally in standard errors, mainly attributable to the imputation of missing data.
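In the R implementation, for example, the “certainty” behavior is requested by setting the survey package’s lonely-PSU option before constructing the design object (Stata’s svyset singleunit(certainty) option plays an analogous role); a minimal sketch:

```r
# Minimal sketch: treat single-PSU ("lonely PSU") strata as sampled with certainty,
# so they contribute nothing to the variance. This option must be set before the
# design object is created and variances are estimated.
library(survey)
options(survey.lonely.psu = "certainty")
```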

Missing data and multiple imputation

All missing values were imputed using multiple imputation by chained equations [40, 41]. The imputation model incorporated the criterion/outcome variable, all childhood/demographic characteristics (including race/ethnicity and childhood religious affiliation when available), and the sampling weights. The sampling weights were included as a variable in the imputation models; including the sampling weight in the multiple imputation procedure allowed study missingness to be related to the propensity of being included in the study. To avoid a singularity in the design matrix due to single-PSU strata, we elected not to include strata as a predictor in countries where strata were available. To account for variations in the assessment of certain variables across countries (e.g., race/ethnicity, childhood religious affiliation), we conducted the imputation process separately for each country. The within-country imputation approach ensured that the imputation model accurately reflected country-specific contexts and assessment methods.

When conducting multiple imputation, five imputed datasets are a commonly used default [42]. However, a more robust recommendation for the number of imputations relates to the fraction of missing information (FMI) of the observed dataset [43]. The rate of missing data for this first wave of the GFS was quite low (< 5% for nearly all variables), and among the childhood predictor variables, the item with the largest percent missing was the parental marital status variable (4.9%). Across all the items used as childhood predictors, the percent of respondents with any missing data was 12.9% (a rough approximation of the FMI is therefore 0.129). Using a commonly applied efficiency argument (FMI/m ≤ 0.05), the number of imputed datasets needed would be approximately 3. In preliminary testing, we evaluated using more imputed datasets (m = 20) and found no meaningful differences in results compared to only 5 imputations, or when compared across software implementations of multiple imputation. Increasing beyond 5 imputed datasets was therefore thought to yield insufficient gains to justify the considerable increase in computational time, given that imputation was conducted separately by country and research team. However, we anticipate higher levels of missing data in subsequent waves due to wave-specific non-response, and analyses of subsequent waves should consider using at least 20 imputations in spite of the additional computing time.
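A minimal R sketch of the within-country imputation step with the mice package, using a hypothetical toy dataset in place of the GFS variables, is shown below; the final line illustrates the FMI-based arithmetic described above:

```r
# Minimal sketch of the within-country imputation step (hypothetical toy data).
library(mice)

set.seed(2023)
dat <- data.frame(
  mastery_bin    = rbinom(300, 1, 0.6),
  abuse          = factor(sample(c("Yes", "No", NA), 300, replace = TRUE,
                                 prob = c(0.15, 0.80, 0.05))),
  parent_marital = factor(sample(c("Married", "Divorced", "Single", NA),
                                 300, replace = TRUE)),
  wgt            = runif(300, 0.5, 2)   # sampling weight enters the imputation model
)

# Five imputed datasets by chained equations; the weight is a predictor but is not imputed
imp      <- mice(dat, m = 5, printFlag = FALSE, seed = 2023)
imp_list <- lapply(1:5, function(i) complete(imp, action = i))

# FMI-based check on the number of imputations: with FMI of roughly 0.129,
# FMI/m <= 0.05 is already satisfied by m of about 3, so m = 5 is comfortable
0.129 / 0.05
```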

Meta-analysis

The 22 countries were chosen to provide broad geographical, cultural, and religious coverage; the countries span all six populated continents and represent about half of the world’s population. The random effects meta-analysis can be interpreted as estimating the pooled effect of each childhood predictor, and the standard deviation of that effect across countries, for a hypothetical underlying population of countries of which the sample of 22 countries is representative. While such an underlying population is hypothetical, given the broad and diverse coverage of the 22 countries, this was viewed as a reasonable target of interest. However, the results for each of the 22 countries are also provided; these are of interest in their own right and may also be useful for readers who would prefer not to consider this underlying hypothetical population. Moreover, we provide a population-weighted fixed effects meta-analysis to evaluate similar childhood predictor effects where the principal target of inference concerns individual people in the 22 countries rather than the countries themselves.

All meta-analyses were conducted in R [20] using the metafor package [44] through an open-source application developed for these analyses [39]. The effect sizes, or values to be meta-analyzed, were the unstandardized regression coefficients and associated standard errors. We did not transform the country-specific regression coefficients.

Random effects meta-analysis

For all the core GFS studies, a general random effects model was used, assuming that the effect sizes in the population of countries are normally distributed [45,46,47], that is:

$${y}_{i}\sim Normal({y}_{i}^{*},{v}_{i})$$
$${y}_{i}^{*}\sim Normal(\theta ,{\tau }^{2})$$

where \({y}_{i}\) is the unstandardized effect of the childhood predictor within each country, \({v}_{i}\) is the variance/uncertainty of \({y}_{i}\) within each country (i.e., the standard error of the regression coefficient), \({y}_{i}^{*}\) is the unknown “true effect” for the childhood predictor in country \(i\), \(\theta\) is the estimated average effect for the childhood predictor, and \({\tau }^{2}\) is the estimated variance/heterogeneity of \({y}_{i}^{*}\). The model was estimated using the Paule and Mandel estimator [48,49,50].
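A minimal R sketch of this model using the metafor package, with hypothetical country-level coefficients and standard errors in place of actual GFS estimates, is:

```r
# Minimal sketch of the random effects meta-analysis with the Paule-Mandel estimator
# (hypothetical country-level coefficients and standard errors).
library(metafor)

yi  <- c(0.08, 0.12, 0.05, 0.15, 0.02, 0.10)   # unstandardized coefficients by country
sei <- c(0.03, 0.04, 0.02, 0.05, 0.03, 0.04)   # their standard errors

res <- rma(yi = yi, sei = sei, method = "PM")
summary(res)   # average effect (theta) and heterogeneity (tau^2)
forest(res)    # country-specific and pooled estimates
```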

When a childhood predictor was highly collinear with others within a country, the resulting standard error of its effect can be large relative to the magnitude of the effect. Meta-analyzing several such imprecise effect estimates can result in the heterogeneity of effects being severely underestimated (\({\widehat{\tau }}^{2}<0.01\)). Such a low estimate does not align with the differences in effects seen when investigating the country-specific results, nor with prior theoretical considerations; it is instead driven by the statistical artifact of extremely large standard errors arising from the multicollinearity of variables. The use of the Paule and Mandel estimator reduced the frequency of this occurring relative to restricted maximum likelihood in preliminary testing, but it did not eliminate the issue completely. In our results, we have noted when the estimates of heterogeneity based on the random effects meta-analyses are smaller than one would expect and refer readers to the accompanying online supplement, where the forest plots provide a better indication of the heterogeneity (e.g., Q-statistics and Q-profile confidence intervals).

Proportion of effects outside a threshold

The output from the random effects meta-analyses includes the estimated proportion of effect estimates across countries lying above or below preregistered thresholds. These thresholds give readers one particular way to gauge the distribution of effect sizes across countries and are not intended as strict benchmarks or cutoffs for determining whether observed associations are statistically or practically meaningful. For continuous outcomes, these thresholds were specified as unstandardized effects above 0.1 or below − 0.1. For binary outcomes, these thresholds were specified as risk ratios above 1.1 or below 0.9. Under substantial effect heterogeneity, there can in principle be notable proportions of effects in both directions. Estimates of the proportion of such effects were obtained using methodology based on calibrated effect sizes in meta-analyses. The calibrated effect size is computed from the meta-analysis results following well-established methods [51,52,53] that use the following formula:

$${\widetilde{y}}_{i}=\widehat{\theta }+\left({y}_{i}-\widehat{\theta }\right){\left(\frac{{\widehat{\tau }}^{2}}{{\widehat{\tau }}^{2}+{v}_{i}}\right)}^{0.5}$$

where \({\widetilde{y}}_{i}\) is the calibrated effect size of country \(i\). The calibrated effect sizes were used to approximate the proportion of effects at least as large as (or at most as large as) a pre-specified bound. The proportion is approximated following Mathur and VanderWeele [52] by counting the number of calibrated effect sizes beyond the bound and dividing by the number of studies (e.g., 22).

For continuous outcomes, we approximated \(Pr( {\widetilde{y}}_{i} < -0.10)\) and \(Pr( {\widetilde{y}}_{i}> 0.10)\), where \({\widetilde{y}}_{i}\) is the calibrated unstandardized linear regression coefficient. For binary outcomes, we approximated \(Pr(exp\left({\widetilde{y}}_{i}\right) < 0.90)\) and \(Pr(exp( {\widetilde{y}}_{i})> 1.10)\), where \(exp({\widetilde{y}}_{i})\) is the risk-ratio associated with effect \({\widetilde{y}}_{i}\). This empirical approach is the default method to approximate the proportion of effects by threshold [51]. However, another approximation is possible based on using the normal distribution, which aligns the estimation of proportions with the assumptions of the random effects meta-analysis. When the number of effects meta-analyzed is large, the meta-analytic mean (\(\widehat{\theta }\)) and standard deviation (\(\widehat{\tau }\)) can be used to estimate the proportion of effects that meet some specified lower threshold (\({q}_{1}\)) or upper threshold \(({q}_{2})\), as follows:

$$\widehat{Pr}\left({y}_{i}<{q}_{1}\right)=\Phi \left(\frac{{q}_{1}-\widehat{\theta }}{\widehat{\tau }}\right)$$
$$\widehat{Pr}\left({y}_{i}>{q}_{2}\right)=1-\Phi \left(\frac{{q}_{2}-\widehat{\theta }}{\widehat{\tau }}\right)$$

where \(\Phi \left(.\right)\) denotes the standard normal cumulative distribution function. This method is an option in our online app for meta-analysis but is not the default due to the relatively low number of countries (22) in the meta-analysis. In testing, we have found the proportions to be similar despite the relatively low number of effects being meta-analyzed.
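Continuing the hypothetical metafor sketch above, the following minimal R code computes the calibrated effect sizes from the formula given earlier and then both the empirical and normal-approximation proportions for the continuous-outcome thresholds:

```r
# Minimal sketch of the calibrated effect sizes and threshold proportions,
# continuing from the hypothetical rma() fit (res, yi, sei) above.
theta_hat <- as.numeric(res$b)   # meta-analytic mean
tau2_hat  <- res$tau2            # estimated heterogeneity
vi        <- sei^2               # within-country sampling variances

# Calibrated effect sizes: shrunk toward the mean in proportion to their precision
y_cal <- theta_hat + (yi - theta_hat) * sqrt(tau2_hat / (tau2_hat + vi))

# Empirical proportions beyond the preregistered thresholds for a continuous outcome
mean(y_cal > 0.10)
mean(y_cal < -0.10)

# Normal-approximation alternative based on the meta-analytic mean and SD
1 - pnorm((0.10 - theta_hat) / sqrt(tau2_hat))
pnorm((-0.10 - theta_hat) / sqrt(tau2_hat))
```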

Population weighted meta-analysis

A fixed effects meta-analysis was conducted as a supplement to the random effects meta-analysis described above, providing an opportunity for researchers to consider both sets of results depending on which interpretative approach is most appropriate for their purposes. Inferences focused on differences across countries may utilize the random effects estimates, as these align with that target of inference, whereas analyses giving individuals equal weight align more with the results of the supplemental fixed effects meta-analyses. While the random effects meta-analysis assumes a distribution of the childhood predictor effects across countries (relaxing assumptions of measurement invariance somewhat), the supplemental fixed effects meta-analysis does not assume a distribution of effects but more directly estimates the weighted average effect over countries, where the weight in this analysis is the total 2023 population (rather than the observed sample size) within each country. Note that the fixed effects approach taken here essentially estimates the effect averaged across individuals in the various countries, and it can be given this interpretation even if there is heterogeneity across countries in effect sizes [54]. The meta-analytic estimate is:

$$\widehat{\theta }=\frac{\sum {w}_{i}{y}_{i}}{\sum {w}_{i}}$$

where \({y}_{i}\) is the regression coefficient for each country and \({w}_{i}\) is the weight for each country. A common choice for the weight is the inverse of the sampling variance \({v}_{i}\), but in this analysis we aimed to estimate the overall effect by giving individuals, rather than countries, equal weight, and without assuming a common effect size across countries. We therefore used a weight for each country that scales with the total 2023 population size of that country.
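A minimal R sketch of this estimator, continuing with the hypothetical country-level estimates from the earlier sketch and using illustrative (not actual) population sizes, is shown below; the standard error formula is one common choice under user-specified weights and is an assumption on our part rather than a detail specified above:

```r
# Minimal sketch of the population-weighted fixed effects estimate, continuing with
# the hypothetical yi and sei vectors above and illustrative population sizes.
pop_2023 <- c(45e6, 26e6, 216e6, 112e6, 84e6, 7.5e6)   # placeholders, not Table 1 values
w <- pop_2023

theta_pw <- sum(w * yi) / sum(w)               # population-weighted average effect

# One common (assumed) choice for its standard error under user-specified weights
se_pw <- sqrt(sum(w^2 * sei^2)) / sum(w)
c(estimate = theta_pw, se = se_pw)

# The same point estimate can be obtained in metafor with user-specified weights:
# rma(yi = yi, sei = sei, method = "FE", weights = w)
```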

Using the population sizes provided by Gallup, the fixed effects meta-analysis estimated the average effect of each childhood predictor weighted by the population size of each country. The country sizes used to create weights are shown in Table 1.

Table 1 Gallup provided estimates of population sizes (2023) for population weighted meta-analysis

Global p-values (combining p-values from country-specific tests)

A decision was made to report p-values because doing so is common practice across many disciplines. The harmonic mean p-value was used to combine p-values across countries [55, 56]. The combined p-value was used to test the null hypothesis of no effect of a variable (all categories have an estimated effect of 0 relative to baseline) in all countries, against the alternative hypothesis that in at least one country the group of regression coefficients (or risk ratios) for a given predictor differs significantly from 0 (or from 1 for risk ratios). The harmonic mean p-value method is more robust to dependency among the pooled p-values [56]. Although the country-specific tests are computed on separate samples, strict independence of the p-values (an underlying assumption of most classic approaches to pooling p-values [57]) may not be entirely tenable given a common underlying set of items, translation procedures, childhood predictors, data cleaning techniques, and imputation models. To account for multiple testing, we present Bonferroni-corrected significance thresholds for the meta-analytic results based on the number of predictors [58, 59]. The Bonferroni adjustment for multiplicity was applied to the significance level cutoff (alpha) rather than to the p-values (we divided alpha by the number of tests rather than multiplying the p-values by the number of tests). Providing both the standard 0.05 significance threshold and the Bonferroni-adjusted threshold provides transparency in how multiplicity was considered. However, the harmonic mean p-value is already relatively robust to multiple testing, maintaining a constant type-I error rate regardless of the number of tests being conducted [56].
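As a brief illustration, the following R sketch combines hypothetical country-specific p-values using the simple (unweighted) harmonic mean and the asymptotically exact test from the harmonicmeanp package, and computes a Bonferroni-adjusted significance threshold for 13 predictors:

```r
# Minimal sketch of combining country-specific p-values (hypothetical values).
p_country <- c(0.001, 0.20, 0.04, 0.30, 0.08, 0.02)

# Simple (unweighted) harmonic mean of the p-values
length(p_country) / sum(1 / p_country)

# Asymptotically exact harmonic mean p-value test from the harmonicmeanp package
library(harmonicmeanp)
p.hmp(p_country, L = length(p_country))

# Bonferroni adjustment applied to the significance threshold (alpha), not the p-values:
# with 13 childhood predictors, the adjusted threshold is
0.05 / 13
```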

Example analysis – sense of mastery

We will illustrate the aforementioned methodology and analyses and corresponding results with an example concerning childhood predictors of a person’s adult sense of mastery; see Kim et al. [13] for further details.

Construct overview and importance

Sense of mastery, the perception that one has the ability to influence one’s environment and elicit desired outcomes [60, 61], is important in its own right, but also because it shapes people’s trajectories of psychological, social, spiritual, behavioral, and physical health [62,63,64,65,66]. Sense of mastery was assessed in the GFS with the question, How often do you feel very capable in most things you do in life?; response options were always, often, rarely, and never, and the item was dichotomized by collapsing always/often versus rarely/never.

Illustrative results

Table 2 provides nationally representative descriptive statistics of the demographic characteristics and candidate childhood predictors for the entire sample. Participant ages spanned the entire adult lifespan (18–80+). Gender was nearly equally balanced between males (48%) and females (51%), with a small representation of other gender identities (0.3%). The majority of participants reported having a somewhat good or very good relationship with their mother (89%) and with their father (80%) while growing up. Attending religious services at least once a week at age 12 was reported by an estimated 41% of participants.

Table 2 Nationally representative descriptive statistics of the observed sample

Table 3 provides the meta-analytic estimates of the effects of childhood experiences on sense of mastery. A similar table is presented in all construct-specific manuscripts that include childhood predictor analyses following the template reported in this article. No single childhood predictor appeared to dominate in predicting sense of mastery in adulthood; instead, a combination of several childhood predictors appears to be at play. For this particular analysis, most effect sizes were small, and the estimated proportions with risk ratios above 1.1 or below 0.9 were likewise quite small. This was not the case with some other outcomes in the GFS, however. The results here are given for illustrative purposes; see Kim et al. [13] for further substantive interpretation.

Table 3 Random effects meta-analysis of regression of sense of mastery on childhood predictors

Table 4 provides E-value estimates for the meta-analysis. E-values indicated that some of the observed associations were potentially slightly to moderately robust to unmeasured confounding (Table 4). For example, when considering relationship with mother, an unmeasured confounder that was associated with both mastery and relationship with mother by risk ratios of 1.22 each (above and beyond the covariates already adjusted for) could explain away the association, but weaker joint confounder associations could not. Furthermore, an unmeasured confounder associated with both mastery and relationship with mother by risk ratios of 1.09 each could shift the confidence interval to include the null, but weaker joint confounder associations could not. In several cases, a combination of unmeasured confounding and statistical uncertainty might suffice to explain away the results; likewise, differential measurement error due to recall bias might suffice to explain away some of the results [35], given the modest effect sizes in Table 3.

Table 4 Sensitivity of meta-analyzed childhood predictors to unmeasured confounding

The online supplemental material of each manuscript will include a corresponding forest plot for each childhood predictor category included in the meta-analysis, displaying the meta-analyzed regression coefficients, the country-specific effect estimates, and additional information on heterogeneity of effects. An example forest plot for the meta-analysis of regression coefficients for sense of mastery across countries, for the effect of a “Very good/somewhat good” relationship with mother, is shown in Fig. 1. The forest plots are constructed such that effects are ordered by magnitude, so the ordering of countries on the y-axis varies from plot to plot; this allows a quick inspection of which countries have a high or low effect and whether these orderings are similar across effects. The country-specific estimates reported in the online supplemental tables of each manuscript provide a complementary view that compares the magnitude of effects across predictors within a country, in contrast to the forest plots, which compare countries for a given predictor category (relative to the reference category). We repeated the meta-analyses using a population-weighted (fixed effects) approach to estimate effects when within-country results are weighted by the size of the population each sample represents. Those results are not shown here but appear in the online supplemental material of each individual manuscript.

Fig. 1

Forest plot for the meta-analysis of the effect of a “Very good/somewhat good” relationship with mother compared to a “Very bad/somewhat bad” relationship on sense of mastery

Strengths and limitations

The analytic methodology employed for the study of childhood predictors of outcomes related to well-being has several strengths and limitations. A notable strength is the broad population coverage of the GFS: the countries included in wave 1 encompass approximately 64% of the world’s population [3]. Most of the analytic methods employed are relatively well-established, with a long history of use in epidemiology, public health, psychology, and sociology research. We aimed to employ rigorous methods that appropriately incorporate the unique complex sampling design used in each country to obtain robust standard errors, implementing macros and packages as needed [31]. All code to reproduce the analyses is openly available in several languages (R, SAS, SPSS, and Stata) for researchers to explore these data and results. All analyses were conducted at the country level, and the results were then pooled using meta-analytic techniques to account for uncertainty in the estimates and quantify heterogeneity across countries. We used a random effects “distribution of effects” perspective for our primary meta-analyses, but also reported a fixed effects “population-weighted” perspective as a supplemental analysis. Using different theoretical perspectives for pooling estimates gives readers the flexibility to decide which set of effects is appropriate for their purposes.

There are limitations to consider as well. Sources of heterogeneity in the relationship between childhood predictors and outcomes across countries could include seasonality effects, differences in interpretation, differences in quality of translation, differences in mode of data collection, differences in the process and variables used for constructing respondent-level weights, and other possible reasons depending on the specific construct of interest [15]. Most of the psychosocial constructs assessed had only a single item to represent the overall construct (e.g., sense of mastery), and many were assessed with binary or ordinal response scales with few categories. Although it is not uncommon for such items to be used in large-scale epidemiologic studies such as the GFS, and decisions about which items and response scales to use were guided by several phases of GFS survey development [12], some measures in the GFS survey may not be a suitable fit for answering certain research questions. The use of single-item assessments provides less construct coverage and generally lower true-score reliability, resulting in less power to detect effects of the candidate childhood predictors on well-being outcomes [67]. The anticipated effect sizes of the childhood predictors using a retrospective recall approach were small, requiring large sample sizes to detect them. The obtained country sample sizes for wave 1 were relatively large (ranging from ~ 1500 to ~ 38,000), which helps to detect the small effects anticipated.

The use of retrospective recall is not without limitations either. The retrospective assessment of the childhood predictors allowed a synthetic longitudinal design to be constructed using wave 1 of the GFS. The findings from these analyses may be influenced by recall bias [68] and common method bias [69]. To mitigate these sources of bias, future research might employ prospective longitudinal designs that follow individuals from childhood to adulthood and complement self-report survey responses with data derived from other sources (e.g., parents). One side effect of using a retrospective recall approach to artificially construct a longitudinal analysis is that the recalled characteristics can be highly correlated: it would be reasonable to expect one’s perception of their health growing up to be related to their family’s financial status growing up, or their perceived relationship with a parent to be related to how much love they felt from that parent. These limitations led to issues with including all initially planned childhood predictors, which was addressed using the approach described in the “Childhood predictor variables” section.

Several limitations of the statistical methods employed are noted next. The regression analyses used a common form (see the “Regression analyses” section) that may not be optimal for all outcomes. For example, the illustrative analysis regressing sense of mastery on all childhood predictors assumes that differences in the log(risk ratio) of endorsing always/often over rarely/never can be modeled as a linear combination of all childhood predictors. Exploratory analyses to evaluate potential nonlinearities in this relationship could be valuable for each outcome in future work. Additionally, the precise implementations of the methods to account for the complex sampling design are sometimes not fully transparent, especially in software packages that require a license (e.g., SAS, Stata). The use of several software packages helped to identify the effects of any software-specific peculiarities. A common issue we needed to address involved handling “lonely PSUs” [38]; we aimed to always use a “certainty” specification that fixed the variance contribution to zero in such cases when estimating variance components. This approach has the limitation of potentially underestimating the variance, or standard error, of a particular estimate. However, to the best of our knowledge, there is no generally agreed upon approach for handling such instances, and our aim is to be transparent about these decisions to reduce non-reproducibility arising from unclear analytic decisions and researcher degrees of freedom [70].

The analyses outlined in this article are relatively straightforward, but also varied to allow for multiple interpretive lenses to be applied (e.g., within-country vs. cross-country patterns). Implementing these coordinated analyses has its challenges, such as complications implementing analyses using complex sampling weights, multiple imputation, modified Poisson regression, and meta-analysis across several statistical packages, and yet we found remarkably similar results across packages in spite of slightly different implementations [24].

Conclusions

The current article provides a description of the methods used in manuscripts reporting childhood predictor analyses that leverage currently available wave 1 data from the GFS, most of which are being considered for publication as a coordinated set of manuscripts based on the GFS. Using nationally representative data from 22 geographically and culturally diverse countries around the world, the current set of planned childhood predictor analyses provides a unique opportunity to (1) explore the relationship between childhood experiences and many construct indicators related to subsequent adult well-being and (2) identify potentially modifiable childhood factors that could be targeted through intervention to support population-level well-being. The interested reader is referred to our companion article, Analytic Methodology for Demographic Variation Analyses for Wave 1 of the Global Flourishing Study [71], for a description of the methods used in the coordinated set of demographic variation analyses of GFS outcomes.

Data availability

Data for wave 1 of the GFS are available through the Center for Open Science (https://www.cos.io/gfs) upon submission of a pre-registration, and will be openly available without pre-registration beginning February 2025. Subsequent waves of the GFS will similarly be made available. Please see https://www.cos.io/gfs-access-data for more information about data access. Code for the GFS childhood predictor analyses in multiple software packages is openly available (https://doi.org/10.17605/osf.io/vbype).

References

  1. Crabtree S, English C, Johnson BR, Ritter Z, VanderWeele TJ. Global Flourishing Study: Questionnaire development report. Gallup Inc., Washington, DC: 2021. [Retrieved on 2024-05-10 from https://osf.io/y3t6m].

  2. Crabtree S, English C, Johnson BR, Ritter Z, VanderWeele TJ. Global Flourishing Study: 2024 questionnaire development report. Gallup Inc., Washington, DC: 2024. [Retrieved on 2024-05-10 from https://osf.io/y3t6m].

  3. Johnson BR, VanderWeele TJ. The Global Flourishing Study: a new era for the study of well-being. Int Bull Mission Res. 2022;46(2):272–5. Available from: https://doi.org/10.1177/23969393211068096.

  4. Adler MD, Fleurbaey M, editors. The Oxford Handbook of Well-Being and Public Policy. New York, NY: Oxford University Press; 2016. Available from: https://doi.org/10.1093/oxfordhb/9780199325818.001.0001.

  5. Crespo RF, Mesurado B. Happiness economics, eudaimonia and positive psychology: from happiness economics to flourishing economics. J Happiness Stud. 2015;16:931–46.

  6. Huppert FA, So TT. Flourishing across Europe: application of a new conceptual framework for defining well-being. Soc Indic Res. 2013;110:837–61.

  7. Lomas T. Making waves in the great ocean: a historical perspective on the emergence and evolution of wellbeing scholarship. J Posit Psychol. 2022;17(2):257–70.

  8. Seligman ME. Flourish: a visionary new understanding of happiness and well-being. New York: Simon and Schuster; 2011.

  9. Trudel-Fitzgerald C, Millstein RA, Von Hippel C, Howe CJ, Tomasso LP, Wagner GR, VanderWeele TJ. Psychological well-being as part of the public health debate? Insight into dimensions, interventions, and policy. BMC Public Health. 2019;19(1):1–11.

  10. VanderWeele TJ, McNeely E, Koh HK. Reimagining health—flourishing. JAMA. 2019;321(17):1667–8.

  11. Henrich J, Heine SJ, Norenzayan A. Most people are not WEIRD. Nature. 2010;466(7302):29. Available from: https://doi.org/10.1038/466029a.

  12. Lomas T, Bradshaw M, Case B, Cowden R, Fogelman A, Johnson K, et al. The development of the Global Flourishing Study survey: charting the evolution of a new 109-item inventory of human flourishing. BMC Global and Public Health. Available from: https://doi.org/10.1186/s44263-025-00139-9.

  13. Kim ES, Bradshaw M, Chen Y, Chopik WJ, Okuzono S, Wilkinson R, Padgett RN, Lachman ME, Johnson BR, VanderWeele TJ. Early echoes of empowerment: characterizing the childhood roots of adult sense of control in a cross-national analysis of 22 countries (in the Global Flourishing Study). 2025. OSF Preprint. Available from: https://doi.org/10.31219/osf.io/v4j5b_v1.

  14. Ritter Z, Srinivasan R, Han Y, Chattopadhyay M, Honohan J, Johnson BR, VanderWeele TJ. Global Flourishing Study Methodology. Washington, DC: Gallup Inc; 2024. Available from: https://osf.io/k2s7u.

  15. Padgett RN, Cowden RG, Chattopadhyay M, Han Y, Honohan J, Ritter Z, Srinivasan R, Johnson BR, VanderWeele TJ. Survey sampling design in wave 1 of the Global Flourishing Study. Eur J Epidemiol. In press. Available from: https://doi.org/10.31234/osf.io/yuc4q.

  16. VanderWeele TJ. On the promotion of human flourishing. Proc Natl Acad Sci. 2017;114(31):8148–56.

  17. Harkness JA. Questionnaire translation. In: Harkness JA, Van de Vijver FJ, Mohler PP, editors. Cross-cultural survey methods, vol. 325. Hoboken: Wiley; 2003. p. 35–56.

  18. Johnson BR, Ritter Z, Fogleman A, Markham L, Stankov T, Srinivasan R, et al. The Global Flourishing Study. Charlottesville: Center for Open Science; 2024. Available from: https://doi.org/10.17605/OSF.IO/3JTZ8.

  19. Tibshirani R. Regression shrinkage and selection via the lasso. J Roy Stat Soc B. 1996;58(1):267–288. Available from: https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.

  20. R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2024.

  21. StataCorp. Stata statistical software: release 18. College Station, TX: StataCorp LLC; 2023.

  22. SAS Institute Inc. SAS/STAT 9.3 User’s Guide. Cary, NC: SAS Institute Inc.; 2011.

  23. IBM Corp. IBM SPSS Statistics for Windows, version 29.0.2.0. Armonk, NY: IBM Corp.; 2023.

  24. Padgett RN, Cowden RG, Bradshaw M, Chen Y, Jang SJ, Shiba K, Johnson BR, VanderWeele TJ. On coordinating “simple” analyses of international survey data across multiple statistical software packages: a case study from the Global Flourishing Study. Open Science Framework; 2024. Available from: https://doi.org/10.31219/osf.io/6d2wf.

  25. Cowden RG, Skinstad D, Lomas T, Johnson BR, VanderWeele TJ. Measuring wellbeing in the Global Flourishing Study: insights from a cross-national analysis of cognitive interviews from 22 countries. Qual Quant. 2024. Available from: https://doi.org/10.1007/s11135-024-01947-1.

  26. Johnson KA, Moon JW, VanderWeele TJ, Schnitker S, Johnson BR. Assessing religion and spirituality in a cross-cultural sample: development of religion and spirituality items for the Global Flourishing Study. Religion, Brain & Behavior. 2024;14(4):345–58. Available from: https://doi.org/10.1080/2153599X.2023.2217245.

  27. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702–6.

  28. Lumley T. Analysis of complex survey samples. J Stat Softw. 2004;9(1):1–19.

  29. StataCorp. Stata 18 postestimation reference manual. College Station, TX: Stata Press.

  30. da Silva AR. %SURVEYGENMOD macro: an alternative to deal with complex survey design for the GENMOD procedure (paper 268–2017). SAS Global Forum; 2017. [Retrieved on 2024-05-17 from https://support.sas.com/resources/papers/proceedings17/0268-2017.pdf].

  31. Padgett RN, Chen Y. surveygenmod2: a SAS macro for estimating generalized linear model and conducting simple Wald-type tests under complex survey designs. arXiv. 2024. Preprint available from: http://arxiv.org/abs/2406.07651.

  32. Albert A, Anderson JA. On the existence of maximum likelihood estimates in logistic regression models. Biometrika. 1984;71(1):1–10. Available from: https://doi.org/10.1093/biomet/71.1.1.

  33. Lumley T. R: survey package regTermTest documentation. 2024. Available from: https://r-survey.r-forge.r-project.org/pkgdown/docs/reference/regTermTest.html.

  34. SAS Institute Inc. The SURVEYREG procedure. 2024. Available from: https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_surveyreg_sect007.htm.

  35. VanderWeele TJ, Li Y. Simple sensitivity analysis for differential measurement error. Am J Epidemiol. 2019;188(10):1823–9.

  36. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann Intern Med. 2017;167(4):268–74.

  37. Lumley T. Complex surveys: a guide to analysis using R. New York: John Wiley & Sons; 2010.

  38. Schneider B. How are R and Stata (mis)handling singleton strata? Practical Significance; accessed 2024-01-25. Available from: https://www.practicalsignificance.com/posts/bugs-with-singleton-strata/.

  39. Padgett RN, Bradshaw M, Chen Y, Jang SJ, Shiba K, Johnson BR, VanderWeele TJ. Global Flourishing Study statistical analyses code. Charlottesville: Center for Open Science; 2024. Available from: https://doi.org/10.17605/osf.io/vbype.

  40. Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393. Available from: https://doi.org/10.1136/bmj.b2393.

  41. van Buuren S. Flexible imputation of missing data. 2nd ed. [Retrieved on 2024-02-07 from https://stefvanbuuren.name/fimd/].

  42. van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):1–67. Available from: https://doi.org/10.18637/jss.v045.i03.

  43. White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011;30(4):377–99.

  44. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw. 2010;36(3):1–48. Available from: https://doi.org/10.18637/jss.v036.i03.

  45. Borenstein M, Hedges LV, Higgins JP, Rothstein HR. A basic introduction to fixed-effect and random-effects models for meta-analysis. Res Synth Methods. 2010;1(2):97–111. Available from: https://doi.org/10.1002/jrsm.12.

  46. Frank MC, Braginsky M, Cachia J, Coles NA, Hardwicke TE, Hawkins RD, Mathur MB, Williams R. Experimentology: an open science approach to experimental psychology methods. Cambridge: MIT Press; 2024. Available from: https://doi.org/10.25936/3JP6-5M50.

  47. Hunter JE, Schmidt FL. Fixed effects vs. random effects meta-analysis models: implications for cumulative research knowledge. Int J Select Assess. 2000;8(4):275–92. Available from: https://doi.org/10.1111/1468-2389.00156.

  48. Paule RC, Mandel J. Consensus values and weighting factors. J Res Natl Bur Stand. 1982;87(5):377–85.

  49. Viechtbauer W. Bias and efficiency of meta-analytic variance estimators in the random-effects model. J Educ Behav Stat. 2005;30(3):261–93.

  50. Viechtbauer W, López-López JA, Sánchez-Meca J, Marín-Martínez F. A comparison of procedures to test for moderators in mixed-effects meta-regression models. Psychol Methods. 2015;20(3):360–74.

  51. Mathur MB, VanderWeele TJ. New metrics for meta-analyses of heterogeneous effects. Stat Med. 2019;38(8):1336–42.

  52. Mathur MB, VanderWeele TJ. Robust metrics and sensitivity analyses for meta-analyses of heterogeneous effects. Epidemiology. 2020;31(3):356–8.

  53. Wang C-C, Lee W-C. A simple method to estimate prediction intervals and predictive distributions: summarizing meta-analyses beyond means and confidence intervals. Res Synth Methods. 2019;10(2):255–66.

  54. Rice K, Higgins JPT, Lumley T. A re-evaluation of fixed effect(s) meta-analysis. J R Stat Soc Series A. 2018;181(1):205–27. Available from: https://doi.org/10.1111/rssa.12275.

  55. Vovk V, Wang R. Combining p-values via averaging. Biometrika. 2020;107(4):791–808.

  56. Wilson DJ. The harmonic mean p-value for combining dependent tests. Proc Natl Acad Sci USA. 2019;116(4):1195–200.

  57. Loughin TM. A systematic comparison of methods for combining p-values from independent tests. Comput Stat Data Anal. 2004;47(3):467–85. Available from: https://doi.org/10.1016/j.csda.2003.11.020.

  58. Abdi H. Bonferroni and Šidák corrections for multiple comparisons. In: Encyclopedia of Measurement and Statistics. 2007;3(1):1–9.

  59. VanderWeele TJ, Mathur MB. Some desirable properties of the Bonferroni correction: is the Bonferroni correction really so bad? Am J Epidemiol. 2019;188(3):617–8.

  60. Bandura A. Self-efficacy: the exercise of control. New York: W. H. Freeman and Company; 1997.

  61. Lachman ME, Neupert SD, Agrigoroaei S. The relevance of control beliefs for health and aging. In: Handbook of the Psychology of Aging. 7th ed. New York: Elsevier; 2011. p. 175–90.

  62. Elliot AJ, Mooney CJ, Infurna FJ, Chapman BP. Perceived control and frailty: the role of affect and perceived health. Psychol Aging. 2018;33(3):473–81. Available from: https://doi.org/10.1037/pag0000218.

  63. Hong JH, Lachman ME, Charles ST, Chen Y, Wilson CL, Nakamura JS, VanderWeele TJ, Kim ES. The positive influence of sense of control on physical, behavioral, and psychosocial health in older adults: an outcome-wide approach. Prev Med. 2021;149:106612.

  64. Infurna FJ, Gerstorf D, Ram N, Schupp J, Wagner GG. Long-term antecedents and outcomes of perceived control. Psychol Aging. 2011;26(3):559.

  65. Infurna FJ, Mayer A, Anstey KJ. The effect of perceived control on self-reported cardiovascular disease incidence across adulthood and old age. Psychol Health. 2018;33(3):340–60. Available from: https://doi.org/10.1080/08870446.2017.1341513.

  66. Robinson SA, Lachman ME. Perceived control and cognition in adulthood: the mediating role of physical activity. Psychol Aging. 2018;33(5):769–81. Available from: https://doi.org/10.1037/pag0000273.

  67. Zimmerman DW, Zumbo BD. Resolving the issue of how reliability is related to statistical power: adhering to mathematical definitions. J Mod Appl Stat Methods. 2015;14(2):9–26.

  68. Blome C, Augustin M. Measuring change in quality of life: bias in prospective and retrospective evaluation. Value Health. 2015;18(1):110–5. Available from: https://doi.org/10.1016/j.jval.2014.10.007.

  69. Groves RM. Survey errors and survey costs. New York: Wiley; 2004.

  70. Silberzahn R, Uhlmann EL, Martin DP, Anselmi P, Aust F, Awtrey E, et al. Many analysts, one data set: making transparent how variations in analytic choices affect results. Adv Methods Pract Psychol Sci. 2018;1(3):337–56. Available from: https://doi.org/10.1177/2515245917747646.

  71. Padgett RN, Bradshaw M, Chen Y, Cowden RG, Jang SJ, Kim E, et al. Analytic methodology for demographic variation analyses for wave 1 of the Global Flourishing Study. BMC Glob Public Health. Available from: https://doi.org/10.1186/s44263-025-00140-2.

Acknowledgements

Not applicable.

Funding

The GFS was supported by funding from the John Templeton Foundation (grant #61665), Templeton Religion Trust (#1308), Templeton World Charity Foundation (#0605), Well-Being for Planet Earth Foundation, Fetzer Institute (#4354), Well Being Trust, Paul L. Foster Family Foundation, and the David and Carol Myers Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of these organizations.

Author information

Contributions

TJV, BRJ, MB, YC, SJJ, RNP, and KS coordinated the writing of code for different software and the preparation of analysis scripts; MB led writing scripts for Stata; YC led writing scripts for SAS; SJJ led writing scripts for SPSS (which later turned into using R through SPSS); KS led writing scripts for R; RNP contributed as needed for each package and wrote the meta-analysis online app; RNP and RGC drafted the initial version of this paper; KSE wrote the initial interpretation of the sense of mastery application; all authors reviewed and helped revise the manuscript; and BRJ and TJV are the principal investigators for the Global Flourishing Study.

Corresponding author

Correspondence to Tyler J. VanderWeele.

Ethics declarations

Ethics approval and consent to participate

Ethical approval was granted by the institutional review boards at Baylor University (IRB Reference #: 1841317) and Gallup (IRB Reference #: 2021-11-02). Gallup is a multinational corporation, and its IRB covers all countries included in the GFS. All participants provided informed consent. The research conformed to the principles of the Helsinki Declaration.

Consent for publication

Consent by participants was given for their responses on the GFS to be used in publications.

Competing interests

Tyler J. VanderWeele reports partial ownership and licensing fees from Gloo Inc. The remaining authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Padgett, R.N., Bradshaw, M., Chen, Y. et al. Analytic methodology for childhood predictor analyses for wave 1 of the Global Flourishing Study. BMC Glob. Public Health 3, 29 (2025). https://doi.org/10.1186/s44263-025-00142-0

Keywords