THE IMPACT OF UNMODELED TIME SERIES PROCESSES IN WITHIN-SUBJECT RESIDUAL STRUCTURE IN CONDITIONAL LATENT GROWTH MODELING: A MONTE CARLO STUDY

By
YUYING SHI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2009
© 2009 Yuying Shi
To my family in China
ACKNOWLEDGMENTS

First, I would like to express my sincere gratitude to my two professors, Dr. Leite and Dr. Algina. Dr. Leite is my advisor and the chair of my committee. He was always available to answer my questions and encouraged me throughout my four years of study. I greatly appreciate his instructive advice, patience, and valuable suggestions on my dissertation. His help was indispensable for my first publication and for this dissertation. Dr. Algina is another professor to whom I owe my deepest gratitude. A respectable, responsible, and resourceful scholar, he has provided me with valuable guidance at every stage of my doctoral study. With his expertise in many areas, he has enlightened me not only on the dissertation but also on my future research.

My gratitude also goes to the other two committee members, Dr. Miller and Dr. Huang. Dr. Miller is an exceptional professor. His keen insights, breadth of knowledge, flexibility, and approachable manner have been an invaluable resource to me. My thanks also go to Dr. Huang for his valuable suggestions and generous encouragement throughout this study.

I would like to thank my many friends at the University of Florida for making my academic life more enjoyable. Special thanks go to my two best friends, Hong and Feiqi, who were always there listening to me and supporting me.

Words cannot express my gratitude to my beloved family for their unconditional love, understanding, and continuous care. They are the best gift that I could ever receive from God.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION

2 LITERATURE REVIEW
  Latent Growth Model
    Unconditional Latent Growth Model
    Conditional Latent Growth Model
      Latent growth model with a time-invariant covariate
      Latent growth model with a time-varying covariate
      Latent growth model with a parallel process
    Assumptions of Growth Modeling
      Within-person residual covariance structure
      Measurement time and missing data
      Functional form of development
  Comparisons with Other Methods
  Stationary Time Series Model
    Autoregressive (AR) Model
    Moving Average (MA) Model
    Autoregressive Moving Average (ARMA) Model
    Modeling Time Series in the Error Structure in Longitudinal Data Analysis
  Studies on the Impact of Misspecifying the Within-Person Error Structure
  Significance of This Study
  Research Questions

3 METHOD
  Design Factors
    Number of Measurement Times
    Sample Size
    Time Series Parameters
    Time Coding
  Population Values
    Within-Person Residual Variance σ²
    Parameter and B in Between-Person Equation
    Residual Variance of Level Equation, Residual Variance of Shape Equation, and Covariance of Level and Shape Residuals
    Mean and Variance of Time-Invariant Covariate
    Parameters of Time-Varying Covariate
    Effect of Time-Invariant Predictor on Latent Level and Latent Shape in Growth Predictor Model (Equation 2-18)
    Effect of the Time-Varying Predictor Variable on the Outcome Variable in LGM with a Time-Varying Covariate (Equation 2-27)
    Effect of the Intercept and Slope of the Predictor on the Intercept and Slope of the Outcome Variable in LGM with a Parallel Process Model
  Summary of Population Values
    LGM with a Time-Invariant Covariate
    LGM with a Time-Varying Covariate
    LGM with a Parallel Process
  Summary of Conditions
  Data Generation
  Data Analysis

4 RESULTS
  Convergence Rate and Non-Positive Definite Covariance Matrix Occurrence Rate
  Fixed Parameter Estimates
    LGM with a Time-Invariant Covariate
    LGM with a Time-Varying Covariate
    LGM with a Parallel Process
  Standard Error of the Fixed Parameter Estimates
    LGM with a Time-Invariant Covariate
    LGM with a Time-Varying Covariate
    LGM with a Parallel Process
    Summary of the Results for the Fixed Parameter Estimates together with Standard Error Estimates
  Variance Component Parameter Estimates
    AR (1) Within-Person Residual Covariance Matrix
    MA (1) Within-Person Residual Covariance Matrix
    ARMA (1, 1) Within-Person Residual Covariance Matrix
    Summary of the Results for Variance Component Parameter Estimates
  Standard Error Estimates of Variance Components
    AR (1) Within-Person Residual Covariance Matrix
    MA (1) Within-Person Residual Covariance Matrix
    ARMA (1, 1) Within-Person Residual Covariance Matrix
    Summary of Standard Error Estimates of the Variance Components
  Chi-Square GOF Test and GOF Indexes
    GOF Test
      AR (1) within-person residual covariance matrix
      MA (1) within-person residual covariance matrix
      ARMA (1, 1) within-person residual covariance matrix
      Summary of results for GOF test
    TLI and CFI
      AR (1) within-person residual covariance matrix
      MA (1) within-person residual covariance matrix
      ARMA (1, 1) within-person residual covariance matrix
      Summary of results for CFI and TLI
    RMSEA and SRMR
      AR (1) within-person residual covariance matrix
      MA (1) within-person residual covariance matrix
      ARMA (1, 1) within-person residual covariance matrix
      Summary of results of SRMR and RMSEA
    Summary of GOF test and GOF indexes

5 DISCUSSION AND CONCLUSION
  General Conclusions and Discussions
  Summary of Impact of Each Factor
    Impact of Analysis Model Type
    Impact of Time Series Parameter
    Impact of Sample Size
    Impact of Length of Waves
  Analytic Results of Variance Components Estimates
  GOF Test and GOF Indexes
  Suggestions to Applied Researchers
  Limitations and Suggestions for Future Research

APPENDIX: MPLUS CODE
  Latent Growth Model with a Time-Invariant Covariate with an AR (1) Process
  Latent Growth Model with a Time-Invariant Covariate with an MA (1) Process
  Latent Growth Model with a Time-Invariant Covariate with an ARMA (1, 1) Process

LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

Table

4-1 Convergence rate for all conditions
4-2 Rate of occurrence of non-positive definite matrix under all conditions
4-3 Marginal mean relative biases of fixed parameter estimates for LGM with a time invariant covariate
4-4 Mean relative biases of fixed parameter estimates for LGM with a time varying covariate
4-5 Mean relative biases of fixed parameter estimates for LGM with a parallel process
4-6 Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a time invariant covariate
4-7 Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a time varying covariate
4-8 Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a parallel process
4-9 Mean relative biases of estimates for three LGMs with an AR (1) within-person residual covariance matrix
4-10 Mean relative biases of estimates for three LGMs with an AR (1) within-person residual covariance matrix
4-11 Mean relative biases of estimates for three LGMs with an AR (1) within-person residual covariance matrix
4-12 Mean relative biases of estimates for three LGMs with a MA (1) within-person residual covariance matrix
4-13 Mean relative biases of estimates for three LGMs with a MA (1) within-person residual covariance matrix
4-14 Mean relative biases of estimates for three LGMs with a MA (1) within-person residual covariance matrix
4-15 Mean relative biases of estimates for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-16 Mean relative biases of estimates for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-17 Mean relative biases of estimates for three LGMs with an ARMA (1, 1) within-person residual matrix, collapsing across sample size
4-18 Mean relative biases of standard error estimates of for three LGMs with an AR (1) within-person residual covariance matrix
4-19 Mean relative biases of standard error estimates of and for three LGMs with an AR (1) within-person residual covariance matrix
4-20 Mean relative biases of standard error estimates of variance components for three LGMs with a MA (1) within-person residual covariance matrix
4-21 Mean relative biases of standard error estimates of for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-22 Mean relative biases of standard error estimates of for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-23 Mean relative biases of standard error estimates of for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-24 Percentage of p value below 0.05 for three LGMs with an AR (1) within-person residual covariance matrix
4-25 Percentage of p value below 0.05 for three LGMs with a MA (1) within-person residual covariance matrix
4-26 Percentage of p value below 0.05 for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-27 Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with an AR (1) within-person residual covariance matrix
4-28 Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with a MA (1) within-person residual covariance matrix
4-29 Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
4-30 Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with an AR (1) within-person residual covariance matrix
4-31 Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with a MA (1, 1) within-person residual covariance matrix
4-32 Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with an ARMA (1, 1) within-person residual covariance matrix
5-1 Biases of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-2 Biases of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-3 Biases of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-4 Biases of standard error estimates of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-5 Biases of standard error estimates of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-6 Biases of standard error estimates of obtained with three data sets for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-7 The frequency table for the standard error estimates of for LGM with a parallel process with an ARMA (1, 1) within-person residual covariance matrix
5-8 Biases of standard error of estimates obtained with and without imposing starting values for LGM with a time invariant covariate with an AR (1) within-person covariance matrix
5-9 The frequency table for the standard error estimates of under LGM with a time invariant covariate with an AR (1) within-person covariance matrix
5-10 Biases of standard error estimates of obtained with two data sets for LGM with a time invariant covariate with an AR (1) within-person residual covariance matrix
LIST OF FIGURES

Figure

2-1 Unconditional latent growth model
2-2 Latent growth model with a time invariant covariate
2-3 Latent growth model with a time varying covariate
2-4 Latent growth model with a parallel process
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

THE IMPACT OF UNMODELED TIME SERIES PROCESSES IN WITHIN-SUBJECT RESIDUAL STRUCTURE IN CONDITIONAL LATENT GROWTH MODELING: A MONTE CARLO STUDY

By
Yuying Shi
August 2009

Chair: Walter Leite
Major: Research and Evaluation Methodology

Because latent growth modeling is a popular method for analyzing longitudinal data, the consequences of model misspecification deserve methodologists' attention. This study investigated the impact of unmodeled time series processes in the within-person residual covariance structure on the parameter estimates and standard error estimates, as well as on the chi-square goodness-of-fit test and some commonly used fit indexes.

It was found that when the analysis model failed to include any type of time series process, the fixed parameter estimates and their standard error estimates were not affected. The variance component estimates were biased to different degrees under some conditions, depending on the type of within-person residual covariance structure. The standard error estimates of these variance components were not affected by model misspecification. Based on the results, it is recommended that applied researchers consider alternative covariance structures.

It was also found that when the within-person residual covariance structure was an AR (1) or an MA (1) process, the chi-square goodness-of-fit test and RMSEA could be used for model selection under many conditions. However, TLI could detect model misspecification under only one condition, while CFI and SRMR were not reliable for model differentiation. When the within-person residual covariance structure was an ARMA (1, 1) process, only RMSEA could be used for model selection under certain conditions.
CHAPTER 1
INTRODUCTION

Longitudinal data, also called panel data, are frequently encountered in the social and behavioral sciences. A longitudinal data set contains observations of a number of subjects (individuals, firms, countries, etc.) measured over two or more time periods. For example, in educational research, a typical longitudinal data set contains the academic scores of a number of students measured at different time periods. Such data sets provide repeated observations for each individual subject and therefore greatly increase the degrees of freedom in model estimation. The most important advantage of a longitudinal data set is that it allows researchers to investigate questions that cannot be addressed with cross-sectional data alone. For example, with a typical educational longitudinal data set, researchers can measure the change or growth in students' academic performance within a specified time period and can identify the factors that affect their growth during this period. Such an investigation of growth cannot be carried out with cross-sectional data.

The popularity of studying change is reflected in the availability of many large-scale national longitudinal data sets in the social sciences. In education, widely used longitudinal data sets include the National Education Longitudinal Study (NELS), High School and Beyond (HSB), the Early Childhood Longitudinal Study (ECLS), and the National Longitudinal Survey of Youth (NLSY). In economics, prominent longitudinal data sets such as the National Longitudinal Surveys of Labor Market Experience (NLS) and the University of Michigan's Panel Study of Income Dynamics (PSID) have been widely analyzed.

Accompanying these widely available data sets, a variety of methods for analyzing longitudinal data have emerged. Commonly used methods include analysis of variance (ANOVA), multivariate analysis of variance (MANOVA), hierarchical linear modeling (HLM), generalized linear models (GLM), fixed effects models, random effects models, and latent growth modeling. Each method has its own advantages and limitations, and its applicability depends on the actual research design and the research questions of interest. Among these models, the latent growth model (LGM), also called the latent curve model or growth curve model, emerged relatively recently but has gained increasing popularity. Moreover, with the recent development of more complex LGMs, such as the mixed-effect LGM, multilevel LGM, and multivariate LGM, latent growth modeling has become a powerful tool in various situations involving longitudinal data analysis. To gauge the popularity of latent growth modeling in social science research, a search was conducted in Academic Search Premier, Business Source Premier, EconLit, Professional Development Collection, Psychology and Behavioral Sciences Collection, PsycINFO, and Sociological Collection using the keyword "latent growth model" in peer-reviewed articles published from January 2000 to December 2008. The search returned 931 articles, which attests to the popularity of this method.

An LGM is composed of the trajectory equation (also called the within-subject or within-person equation) and the level and shape equations (also called the between-subject or between-person equations). The trajectory equation describes the growth trend of each individual. It contains an error term that captures all the unobserved characteristics of a single individual. The level and shape equations describe the latent level and latent shape, respectively, for all individuals. Each of the level and shape equations includes a between-person error term that models the variation in growth level or growth trend between people. The error in the within-person equation describes the difference between the observed value of the outcome variable and the value predicted by the trajectory equation.
It captures all the unmeasured factors for an individual, such as his or her ability, education level, health status, or an event that might affect this person's growth. Compared with traditional methods such as ANOVA and MANOVA, LGM gives substantial flexibility in specifying the within-person residual covariance structure. However, most applied researchers typically assume that the within-person residuals are multivariate normally distributed with a mean of zero and constant variance. That is, the residual variance is equal across time periods and the errors are independent across time. Under this assumption, the correlation of observed scores at any two time points is due solely to the presence of between-person variation.

This simplification raises some concerns. First, some important aspects of change might be captured by the within-person residuals (Biesanz, West, & Kwok, 2003; Hedeker & Mermelstein, 2007). For example, Hedeker and Mermelstein (2007) showed that mood change in smokers could be reflected in the within-person residual covariance structure rather than in the average change.

Second, as mentioned before, anything unmeasured but specific to an individual could be reflected in the within-person error term. If these characteristics remain approximately constant over the sample period, then the independence assumption for the within-person residuals seems reasonable. If these characteristics vary over the sample period, the assumption is less realistic. It is not unreasonable to conjecture that some events might affect the individual over time. Consider evaluating the reading ability of children in elementary school. The reading performance of a child might increase at a relatively constant rate, but individual observations might deviate from this general trend because of a number of factors in the individual's growth period (e.g., a health problem or a family crisis).
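
As a hedged illustration of the model structure and the independence assumption described above, a linear LGM can be sketched in generic notation. The symbols used here ($y_{it}$, $\eta_{0i}$, $\eta_{1i}$, $\lambda_t$, $\alpha$, $\zeta$, $\psi$, $\sigma^2$) are assumptions introduced for this sketch and may differ from the notation adopted in Chapter 2.

```latex
\begin{align}
  y_{it} &= \eta_{0i} + \lambda_t\,\eta_{1i} + \varepsilon_{it}
         && \text{(trajectory, within-person equation)} \\
  \eta_{0i} &= \alpha_0 + \zeta_{0i}, \qquad
  \eta_{1i} = \alpha_1 + \zeta_{1i}
         && \text{(level and shape, between-person equations)} \\
  \boldsymbol{\varepsilon}_i &\sim N\!\left(\mathbf{0},\; \sigma^2 \mathbf{I}\right)
         && \text{(independent residuals, constant variance)}
\end{align}
% With Var(zeta_0i) = psi_00, Var(zeta_1i) = psi_11, Cov(zeta_0i, zeta_1i) = psi_01,
% and residuals independent of the latent factors, for two occasions s != t:
\begin{equation}
  \operatorname{Cov}(y_{is}, y_{it})
    = \psi_{00} + (\lambda_s + \lambda_t)\,\psi_{01} + \lambda_s \lambda_t\,\psi_{11},
  \qquad s \neq t .
\end{equation}
```

In this sketch $\sigma^2$ does not appear in the covariance between two different occasions, which is the sense in which the correlation of observed scores is attributed solely to between-person variation. If the residuals instead followed a time series process, an additional residual covariance term would enter this expression.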
Previous studies have shown that correlated measurement errors often exist in longitudinal data (e.g., Fitzmaurice, Laird, & Ware, 2004; Joreskog, 1979; Marsh, 1993; Rogosa, 1979; Sivo, 1997; Sivo & Willson, 1998). Therefore, the simple uncorrelated within-person error structure cannot fully represent the characteristics of the data. A variety of more complex within-person residual structures have been identified, such as Toeplitz or moving average, autoregressive, and compound symmetry (e.g., Goldstein, 1995; Wolfinger, 1993).

Third, when the within-person residual covariance structure is misspecified, the parameter estimates might be affected and inferences based on these estimates might be inaccurate. Various studies have examined the impact of assumption violations in the within-person residual covariance structure on model parameter estimates (e.g., Yuan & Bentler, 2004; Ferron, Dailey, & Yi, 2002; Singer & Willett, 2003). See Chapter 2 for a presentation of their results.

This study considers three time series within-person error structures: the first-order autoregressive (AR) process, the first-order moving average (MA) process, and the first-order autoregressive moving average (ARMA) process. These three processes are commonly encountered in time series analysis. Although they have been well discussed in fields such as econometrics, they are relatively uncommon in education research.

LGMs can be classified as unconditional or conditional. The two types differ in whether covariates are added to the model. In an unconditional LGM, no time-varying or time-invariant covariate is included, whereas in a conditional LGM at least one covariate is included. Most applied research using LGM is conducted within the framework of conditional LGM, because conditional LGM enables researchers to include predictors and thus to capture the relationship between individual characteristics and growth parameters. However, most previous studies on model misspecification were conducted within the framework of unconditional LGM (e.g., Sivo, Fan, & Witta, 2005; You, 2006).
Although these studies with unconditional LGM shed some light on the possible consequences of model misspecification, whether those results can be generalized to conditional LGM is
unknown. In unconditional LGM, the parameters of interest are the means, variances, and covariance of the latent intercept and latent shape. With the inclusion of time-varying and time-invariant predictors, conditional LGM involves more parameter estimates, for instance, the direct effect of the predictor on the latent factors. Therefore, the impact of model misspecification might differ from that occurring in unconditional LGM. Moreover, even though the AR process has been well discussed in the context of LGM, very few studies to date include a systematic discussion of AR, MA, and ARMA processes at the same time. Given their popularity and importance in time series analysis, they deserve a systematic application in longitudinal data analysis. Furthermore, no studies have been conducted to investigate the consequences of the three unmodeled time series processes for conditional LGM.

The three conditional LGMs investigated in this study are LGM with a time-invariant covariate, LGM with a time-varying covariate, and LGM with a parallel process. These three types of LGMs are representative of the typical conditional LGMs in applied research. They describe the standard ways of including predictors and are commonly used.

The goal of this study is to investigate the impact of unmodeled time series processes in latent growth modeling through a Monte Carlo simulation study. Specifically, this study aims to evaluate how the model parameter estimates and standard errors, as well as the goodness-of-fit (GOF) test and fit indices, are affected when the within-person residual covariance structure follows a time series process but the researcher fails to model this process. This area has received little attention in LGM.
This study is believed to be an important contribution in empirically examining the impact of model misspecification. It could provide researchers with a better understanding of the consequences of assumption violations in growth modeling and with useful information for handling these problems.
CHAPTER 2
LITERATURE REVIEW

This chapter is composed of six parts. The first part introduces the unconditional LGM and three types of conditional LGMs, with a general picture presented at the beginning of the first part and the basic assumptions in LGM introduced at the end of the first part. Then the comparison between LGM and other methods is presented in the second part. The time series models are introduced in the third part, together with studies regarding modeling time series in the error structure in longitudinal data analysis. Followed in the fifth part are previous studies on the impact of model misspecifications. The sixth part presents the research questions and discusses the importance of this study.

Latent Growth Model

LGM can describe individual change in a variety of ways. It can describe the individual initial status and growth trend, which can be linear, quadratic, or of another functional form. It can estimate the variability across individuals in both initial level and trajectories, and it can provide a means for testing the contribution of other predictors to the initial status and growth trajectories. Latent growth modeling methods accomplish these functions by analyzing not only the covariance structure but also the mean structure of the variables. In other words, they can simultaneously estimate the changes in covariances, variances, and means. The covariance structure contains information about individual differences, while the mean structure captures information at the aggregate level.

In LGM, there are three important latent factors: level, shape, and error, which will be illustrated in the subsequent parts. The analytic interest in LGM is not specifically in the indicators but in the latent factors. Each outcome variable measured at any time is a function of these three latent factors. One of the advantages of LGM is that it allows the level and shape to
vary across individuals under the assumption that the conceptualization is correct. The level represents the status of individuals in terms of the outcome variable at the measurement time set as a reference. If the first measurement time is taken as the reference, the level can also be interpreted as the intercept (Muthén & Khoo, 1998). The level of an individual remains constant across all measurement times. For different people, the level can be different from the beginning. The shape factor describes the rate of change across time. When the growth trend is linear, the shape is interpreted as a slope. The errors capture the deviation of the observed variables from the values predicted by the trajectory (within-person) model. The errors come from a variety of sources: they could be measurement error (e.g., the error caused by instrument or rater unreliability) or systematic error (e.g., the error due to unobserved variables or misspecification of the functional form).

Unconditional Latent Growth Model

As described in the introduction, the unconditional latent growth model refers to a model without predictors (see Figure 2-1). The trajectory equation (within-person equation) for this model is expressed as follows:

$$y_{it} = \alpha_i + \lambda_t \beta_i + \varepsilon_{it} \tag{2-1}$$

where $y_{it}$ is the outcome variable measured for the $i$th individual at time $t$. For a simple illustration, data are assumed to be collected at four equally spaced measurement times, and all subsequent formulas follow this four-wave pattern. Therefore, $t = 1, 2, 3, 4$. Parameter $\alpha_i$ refers to the level for the $i$th subject, while parameter $\beta_i$ is the shape for the $i$th subject. The $\alpha_i$ and $\beta_i$ are considered latent factors and are allowed to differ across individuals. The variable $\varepsilon_{it}$ is the trajectory equation error of the $i$th individual at time $t$, with $E(\varepsilon_{it}) = 0$. More about $\varepsilon_{it}$ will be discussed later.
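As a minimal sketch of the trajectory equation 2-1, the following Python snippet computes the four repeated measures of one individual from a level, a shape, and a set of errors. The function name `trajectory`, the loadings 0, 1, 2, 3 (first wave taken as the reference), and all numeric values are illustrative assumptions, not quantities from this study.

```python
import numpy as np

def trajectory(level, shape, loadings, errors):
    """Trajectory equation: y_it = level_i + loading_t * shape_i + error_it."""
    loadings = np.asarray(loadings, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return level + loadings * shape + errors

# Hypothetical individual: level 10, shape 2, four equally spaced waves.
loadings = [0, 1, 2, 3]  # assumed loadings, first wave as the reference
y_no_error = trajectory(10.0, 2.0, loadings, np.zeros(4))
y_with_error = trajectory(10.0, 2.0, loadings, [0.5, -0.3, 0.1, -0.2])
```

Without errors the individual moves along a straight line; the errors only displace each observed score around that line.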
Figure 2-1. Unconditional latent growth model

Parameter $\lambda_t$ refers to the factor loading of the latent shape. Here $\lambda_t$ is fixed to $t - 1$ across all measurement times, that is, $\lambda_t = 0, 1, 2, 3$, which means that all the measurements are taken at equally spaced time points. If the measurements are not taken at equal intervals, for instance at month 1, month 2, month 3.5, and month 6, the $\lambda_t$ are specified as 0, 1, 2.5, and 5. The time point whose loading is fixed to zero is called the reference point of development. In the above example, month 1 is the reference point. In this case,
parameter $\lambda_t$ represents the elapsed time from the reference point to time $t$. The functional form is linear, which means that for equal time periods a given individual grows by the same amount. The individual level and shape can be decomposed into:

$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \zeta_{\beta i} \tag{2-2}$$

where $\mu_\alpha$ and $\mu_\beta$ are the mean level and mean shape respectively. The mean level represents the average individual initial status. The mean shape represents the average growth rate across all sampled individuals. A positive $\mu_\beta$ indicates that on average individuals grow in the observed variable, while a negative $\mu_\beta$ indicates an average decrease in the observed variable. The parameters $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ are the disturbances of the level and shape respectively, with means of zero, variances of $\psi_{\alpha\alpha}$ and $\psi_{\beta\beta}$, and covariance $\psi_{\alpha\beta}$. In unconditional models, the variances of these two disturbances (i.e., $\zeta_{\alpha i}$ and $\zeta_{\beta i}$) also represent the variances of the level and shape respectively. However, the interpretation is not the same when predictors are included in the level and shape equations. When predictors are included (see the subsequent introduction of conditional LGM), the variances of these two disturbances become residual variances, which are interpreted as the variability left over in the level and shape factors after controlling for the effects of the predictors. Higher $\psi_{\alpha\alpha}$ and $\psi_{\beta\beta}$ indicate that the sampled subjects are more diverse. In the extreme case when $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ are all zero, there is no variability of level and shape across people, which means that all individuals have the same intercept and slope for their growth trajectories. A nonzero variance of $\zeta_{\alpha i}$ indicates that the sampled individuals differ from each other from the beginning of the study. A nonzero variance of $\zeta_{\beta i}$ indicates that individuals grow at different rates. Hence, adding predictors to the model can help to account for the variability of individual
growth (Willett & Keiley, 2000). Therefore, the level and shape equations describe the individual differences across the whole sample. The covariance between $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ represents the relationship between the level and the growth trajectory.

Equation 2-1 and equation 2-2 can be combined into a complete model:

$$y_{it} = (\mu_\alpha + \lambda_t \mu_\beta) + (\zeta_{\alpha i} + \lambda_t \zeta_{\beta i} + \varepsilon_{it}) \tag{2-3}$$

This combined model is also called the reduced form equation (Bollen & Curran, 2005), in that the endogenous terms $\alpha_i$ and $\beta_i$ are replaced by their exogenous predictors and disturbances. The variable $y_{it}$ is a combination of a fixed component and a random component, where the fixed component refers to the term $\mu_\alpha + \lambda_t \mu_\beta$ and the random component refers to the term $\zeta_{\alpha i} + \lambda_t \zeta_{\beta i} + \varepsilon_{it}$. It should be noted that the random component is heteroscedastic across time because of the effect of $\lambda_t$, which varies over time.

Equation 2-1 describes a linear relationship between the measurement time and individual growth. A simple way to extend this linear relationship to the broader class of nonlinear relationships is to add higher-order polynomial terms. For example, a quadratic equation becomes:

$$y_{it} = \alpha_i + \lambda_t \beta_{1i} + \lambda_t^2 \beta_{2i} + \varepsilon_{it} \tag{2-4}$$

where $\lambda_t^2$ is simply the squared value of time at measurement time $t$, $\beta_{1i}$ is the slope for the linear term, and $\beta_{2i}$ is the slope for the quadratic term of the curve. The interpretations of the other components of the equation remain the same. Similarly, cubic, quartic, or other higher-power terms of time can be incorporated in this model. In equation 2-4, because the function no longer describes a linear relationship, the change in y is not the same for equal passages of time. For instance, assuming measurement at equal intervals and a reference point at time 1, the change
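of y from time 1 to time 2 is equal to $\beta_{1i} + \beta_{2i}$, but the change of y from time 2 to time 3 is $\beta_{1i} + 3\beta_{2i}$. In the function describing a linear relationship (equation 2-1), the change of y between any two adjacent time points always equals $\beta_i$. The level and shape equations corresponding to equation 2-4 are:

$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_{1i} = \mu_{\beta_1} + \zeta_{\beta_1 i}, \qquad \beta_{2i} = \mu_{\beta_2} + \zeta_{\beta_2 i} \tag{2-5}$$

Equation 2-5 is similar to equation 2-2 except for the addition of the equation for the quadratic slope $\beta_{2i}$. Like $\alpha_i$ and $\beta_{1i}$, $\beta_{2i}$ varies randomly across individuals.

The structural equation form of the above linear trajectory equations can be expressed in LISREL format (e.g., Muthén & Khoo, 1998; Singer & Willett, 2003; Bollen & Curran, 2005). The LISREL formula is presented as follows:

$$\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i \tag{2-6}$$

where $\mathbf{y}_i$ is a $T \times 1$ vector of repeated measures, $\boldsymbol{\Lambda}$ is a $T \times m$ matrix of factor loadings, $m$ is the number of latent factors, $\boldsymbol{\eta}_i$ is an $m \times 1$ vector of latent factors, and $\boldsymbol{\varepsilon}_i$ is a $T \times 1$ vector of random errors. The matrix format of each term in equation 2-6 can be illustrated as follows, assuming four repeated measures:

$$\mathbf{y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{bmatrix}, \quad \boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad \boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}, \quad \boldsymbol{\varepsilon}_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \end{bmatrix} \tag{2-7}$$

$\boldsymbol{\eta}_i$ can be expressed as: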
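$$\boldsymbol{\eta}_i = \boldsymbol{\mu}_\eta + \boldsymbol{\zeta}_i \tag{2-8}$$

where $\boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}$, $\boldsymbol{\mu}_\eta = \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}$, and $\boldsymbol{\zeta}_i = \begin{bmatrix} \zeta_{\alpha i} \\ \zeta_{\beta i} \end{bmatrix}$. The expression of y can be obtained by combining equation 2-7 and equation 2-8:

$$\mathbf{y}_i = \boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\zeta}_i) + \boldsymbol{\varepsilon}_i \tag{2-9}$$

The model-implied covariance matrix of the above equation is

$$\boldsymbol{\Sigma}(\boldsymbol{\theta}) = \boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}' + \boldsymbol{\Theta}_\varepsilon \tag{2-10}$$

where $\boldsymbol{\Psi}$ is the covariance matrix of $\boldsymbol{\zeta}_i$ and $\boldsymbol{\Theta}_\varepsilon$ represents the variance and covariance matrix of the residuals of the outcome variable y. The elements of $\boldsymbol{\Psi}$ and $\boldsymbol{\Theta}_\varepsilon$ are:

$$\boldsymbol{\Psi} = \begin{bmatrix} \psi_{\alpha\alpha} & \psi_{\alpha\beta} \\ \psi_{\alpha\beta} & \psi_{\beta\beta} \end{bmatrix} \tag{2-11}$$

where $\psi_{\alpha\alpha} = \mathrm{var}(\zeta_{\alpha i})$, $\psi_{\beta\beta} = \mathrm{var}(\zeta_{\beta i})$, and $\psi_{\alpha\beta} = \mathrm{cov}(\zeta_{\alpha i}, \zeta_{\beta i})$, and (still assuming T = 4)

$$\boldsymbol{\Theta}_\varepsilon = \begin{bmatrix} \theta_{\varepsilon 1} & 0 & 0 & 0 \\ 0 & \theta_{\varepsilon 2} & 0 & 0 \\ 0 & 0 & \theta_{\varepsilon 3} & 0 \\ 0 & 0 & 0 & \theta_{\varepsilon 4} \end{bmatrix} \tag{2-12}$$

When the estimated model fits the data, the following equality holds:

$$\boldsymbol{\Sigma} = \boldsymbol{\Sigma}(\boldsymbol{\theta}) \tag{2-13}$$

where $\boldsymbol{\Sigma}$ is the population covariance matrix of the y's and $\boldsymbol{\Sigma}(\boldsymbol{\theta})$ is the model-implied covariance matrix of the y's. The elements of $\boldsymbol{\Sigma}$ are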
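$$\boldsymbol{\Sigma} = \begin{bmatrix} \sigma^2_{y_1} & \sigma_{y_1 y_2} & \sigma_{y_1 y_3} & \sigma_{y_1 y_4} \\ \sigma_{y_2 y_1} & \sigma^2_{y_2} & \sigma_{y_2 y_3} & \sigma_{y_2 y_4} \\ \sigma_{y_3 y_1} & \sigma_{y_3 y_2} & \sigma^2_{y_3} & \sigma_{y_3 y_4} \\ \sigma_{y_4 y_1} & \sigma_{y_4 y_2} & \sigma_{y_4 y_3} & \sigma^2_{y_4} \end{bmatrix} \tag{2-14}$$

The model-implied covariance matrix for the observed variables has the elements

$$[\boldsymbol{\Sigma}(\boldsymbol{\theta})]_{tt'} = \begin{cases} \psi_{\alpha\alpha} + 2\lambda_t \psi_{\alpha\beta} + \lambda_t^2 \psi_{\beta\beta} + \theta_{\varepsilon t}, & t = t' \\ \psi_{\alpha\alpha} + (\lambda_t + \lambda_{t'}) \psi_{\alpha\beta} + \lambda_t \lambda_{t'} \psi_{\beta\beta}, & t \neq t' \end{cases} \tag{2-15}$$

Finally, the expected value of the outcome variable equals

$$E(\mathbf{y}_i) = \boldsymbol{\Lambda}\boldsymbol{\mu}_\eta \tag{2-16}$$

The model-implied mean structure is $\boldsymbol{\mu}(\boldsymbol{\theta}) = \boldsymbol{\Lambda}\boldsymbol{\mu}_\eta$. When the expected mean $\boldsymbol{\mu}_y$ is equal to the model-implied mean structure $\boldsymbol{\mu}(\boldsymbol{\theta})$, the following equation should be obtained, in vector notation:

$$\begin{bmatrix} \mu_{y_1} \\ \mu_{y_2} \\ \vdots \\ \mu_{y_T} \end{bmatrix} = \begin{bmatrix} \mu_\alpha + \lambda_1 \mu_\beta \\ \mu_\alpha + \lambda_2 \mu_\beta \\ \vdots \\ \mu_\alpha + \lambda_T \mu_\beta \end{bmatrix} \tag{2-17}$$

The unconditional latent growth model is the simplest form of latent growth modeling. In practice, many researchers fit an unconditional growth model before fitting any more sophisticated type of LGM, such as conditional LGM, multilevel LGM, or mixture LGM. The unconditional LGM can be used to establish the correct growth trajectory. Furthermore, the unconditional LGM describes the variability of the level and shape and serves as an assessment of whether adding predictors is justified. In general, the unconditional LGM is the first step in many applied LGM studies.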
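The model-implied moments of the unconditional linear LGM can be illustrated numerically. The sketch below assumes four equally spaced waves with shape loadings 0, 1, 2, 3 and uncorrelated errors; the function name `implied_moments` and every parameter value are hypothetical choices made only for illustration. It computes the implied mean vector as the loading matrix times the factor means, and the implied covariance matrix as the loading matrix times the factor covariance matrix times the transposed loading matrix, plus the error covariance matrix. It then recomputes one variance and one covariance element by hand.

```python
import numpy as np

# Assumed illustrative parameter values (not taken from the study).
Lambda = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])         # loadings of level and shape, T = 4
mu_eta = np.array([10.0, 2.0])         # mean level, mean shape
Psi = np.array([[4.0, 0.5],
                [0.5, 1.0]])           # var/cov of level and shape disturbances
Theta = np.diag([1.0, 1.0, 1.0, 1.0])  # uncorrelated error variances

def implied_moments(Lambda, mu_eta, Psi, Theta):
    """Return the model-implied mean vector and covariance matrix."""
    mean = Lambda @ mu_eta
    cov = Lambda @ Psi @ Lambda.T + Theta
    return mean, cov

mean_y, cov_y = implied_moments(Lambda, mu_eta, Psi, Theta)

# Element-wise versions of one diagonal and one off-diagonal entry.
lam = Lambda[:, 1]
var_y3 = Psi[0, 0] + 2 * lam[2] * Psi[0, 1] + lam[2] ** 2 * Psi[1, 1] + Theta[2, 2]
cov_y2_y4 = Psi[0, 0] + (lam[1] + lam[3]) * Psi[0, 1] + lam[1] * lam[3] * Psi[1, 1]
```

The matrix and element-wise computations agree, and the implied variances increase across waves because the shape loading grows with time, which is the heteroscedasticity of the random component noted for the reduced form equation.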
Conditional Latent Growth Model

As mentioned above, adding predictors to the model can help to account for the variability of individual growth. In many situations, researchers are interested in more complex research questions. The conditional growth model provides a convenient way to test various hypotheses. For instance, if we want to estimate the change in children's math skills while controlling for their socioeconomic status (SES), SES can be added as a predictor in the model. LGM allows predictors to be incorporated in the model in extremely flexible ways, which will be illustrated in the subsequent examples. Predictors can be time invariant or time varying. Time-invariant predictors are variables that are constant across time, such as gender, nationality, and ethnicity. Time-varying predictors, in contrast, are predictors that change as time passes, such as students' test performance, marital status, and individuals' ability. The conditional LGMs investigated in this study are LGM with a time-invariant predictor, LGM with a time-varying covariate, and LGM with a parallel process. These are commonly used conditional LGMs in applied research.

Latent growth model with a time-invariant covariate

For a simple illustration, only one predictor, measured without error, is included (see Figure 2-2). In real situations, more than one predictor can be incorporated into the model. The trajectory equation is still the same as that in the unconditional model:

$$y_{it} = \alpha_i + \lambda_t \beta_i + \varepsilon_{it} \tag{2-18}$$

The level and shape equations differ from those in the unconditional model:

$$\alpha_i = \mu_\alpha + \gamma_{\alpha 1} x_i + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \gamma_{\beta 1} x_i + \zeta_{\beta i} \tag{2-19}$$
where $\mu_\alpha$ and $\mu_\beta$ are the mean level and mean slope respectively when predictor x is set to zero. The parameters $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ are the disturbances of the level and shape respectively after controlling for the effect of the predictor x. As mentioned before, in unconditional models the variances of these two disturbances also represent the variances of the level and shape respectively. However, once predictors are included in the level and shape equations, the variances of $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ can no longer be interpreted simply as the variances of the level and shape. The coefficients $\gamma_{\alpha 1}$ and $\gamma_{\beta 1}$ are the direct effects of the x variable on the level and shape respectively.

Figure 2-2. Latent growth model with a time invariant covariate
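The distinction between residual variance and total variance of the growth factors can be sketched with a small simulation. The snippet below draws a time-invariant predictor and the two disturbances, builds the level and shape from the level and shape equations of the conditional model, and compares the sample variance of each growth factor with the sum of its explained part (squared coefficient times the variance of x) and its residual part. All names and values (`gamma_alpha`, `gamma_beta`, `var_x`, the sample size, the seed) are illustrative assumptions, and the predictor is assumed to be independent of the disturbances.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200_000                     # hypothetical number of individuals

# Assumed illustrative values (not from the study).
mu_alpha, mu_beta = 10.0, 2.0
gamma_alpha, gamma_beta = 1.5, 0.5   # effects of x on level and shape
psi = np.array([[4.0, 0.5],
                [0.5, 1.0]])         # residual var/cov of the disturbances
var_x = 2.0

x = rng.normal(0.0, np.sqrt(var_x), size=n)
zeta = rng.multivariate_normal([0.0, 0.0], psi, size=n)

alpha = mu_alpha + gamma_alpha * x + zeta[:, 0]  # level equation
beta = mu_beta + gamma_beta * x + zeta[:, 1]     # shape equation

# Total variance of each growth factor = explained part + residual part.
expected_var_alpha = gamma_alpha ** 2 * var_x + psi[0, 0]
expected_var_beta = gamma_beta ** 2 * var_x + psi[1, 1]
```

In this sketch the residual variance of the level (4.0) is less than half of its total variance (8.5), which is why the disturbance variance of a conditional model should not be read as the variance of the growth factor itself.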
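The structural equation form of the model is represented as follows (still assuming four waves):

$$\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i \tag{2-20}$$

where

$$\mathbf{y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{bmatrix}, \quad \boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad \boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}, \quad \boldsymbol{\varepsilon}_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \end{bmatrix}$$

$$\boldsymbol{\eta}_i = \boldsymbol{\mu}_\eta + \boldsymbol{\Gamma} x_i + \boldsymbol{\zeta}_i \tag{2-21}$$

where $\boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}$, $\boldsymbol{\mu}_\eta = \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}$, $\boldsymbol{\Gamma} = \begin{bmatrix} \gamma_{\alpha 1} \\ \gamma_{\beta 1} \end{bmatrix}$, and $\boldsymbol{\zeta}_i = \begin{bmatrix} \zeta_{\alpha i} \\ \zeta_{\beta i} \end{bmatrix}$.

The combined model is obtained by substituting $\boldsymbol{\eta}_i$ in equation 2-21 into equation 2-20:

$$\mathbf{y}_i = \boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\Gamma} x_i + \boldsymbol{\zeta}_i) + \boldsymbol{\varepsilon}_i \tag{2-22}$$

The implied mean structure is

$$\boldsymbol{\mu}_y = \boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\Gamma} \mu_x) \tag{2-23}$$

The model-implied covariance structure can be derived by using deviation scores to simplify the analytical expression of the implied covariance matrix (Bollen and Curran, 2005). The deviation score formula is presented as follows:

$$\mathbf{y}_i - \boldsymbol{\mu}_y = [\boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\Gamma} x_i + \boldsymbol{\zeta}_i) + \boldsymbol{\varepsilon}_i] - [\boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\Gamma} \mu_x)] = \boldsymbol{\Lambda}(\boldsymbol{\Gamma}(x_i - \mu_x) + \boldsymbol{\zeta}_i) + \boldsymbol{\varepsilon}_i \tag{2-24}$$

The model-implied covariance matrix is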
$$\boldsymbol{\Sigma}(\boldsymbol{\theta}) = E\left\{ \begin{bmatrix} \mathbf{y}_i - \boldsymbol{\mu}_y \\ x_i - \mu_x \end{bmatrix} \begin{bmatrix} \mathbf{y}_i - \boldsymbol{\mu}_y \\ x_i - \mu_x \end{bmatrix}' \right\} = \begin{bmatrix} \boldsymbol{\Sigma}_{yy} & \boldsymbol{\Sigma}_{yx} \\ \boldsymbol{\Sigma}_{xy} & \boldsymbol{\Sigma}_{xx} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Lambda}(\boldsymbol{\Gamma}\boldsymbol{\Sigma}_{xx}\boldsymbol{\Gamma}' + \boldsymbol{\Psi})\boldsymbol{\Lambda}' + \boldsymbol{\Theta}_\varepsilon & \boldsymbol{\Lambda}\boldsymbol{\Gamma}\boldsymbol{\Sigma}_{xx} \\ \boldsymbol{\Sigma}_{xx}\boldsymbol{\Gamma}'\boldsymbol{\Lambda}' & \boldsymbol{\Sigma}_{xx} \end{bmatrix} \tag{2-25}$$

where $\boldsymbol{\Sigma}_{xx}$ is the population covariance matrix of the x's, and the other symbols have the same meanings as in the description of the unconditional model.

There are two ways to incorporate time-invariant predictors. One way, as described above, is to let the predictor have a direct effect on the latent curve factors but only an indirect effect on the outcome variables. This model is called the growth predictor model by Stoel, van den Wittenboer, and Hox (2004). It is widely used in social science research. Among the 267 peer-reviewed journal articles found by searching the keyword "latent growth" in the databases Academic Search Premier, Business Source Premier, EconLit, Professional Development Collection, PsycINFO, and Sociological Collection from 2004 to 2008, more than 30% employed this model. Stoel et al. (2004) argued that although this model has the distinctive advantage that the effect of the time-invariant covariate on the growth parameters can be captured directly, its appropriateness rests on the assumption of full mediation. That is, the direct effect of the time-invariant predictor on the outcome variable is equal to zero. If this assumption does not hold, the model is considered incorrect. Based on this argument, they proposed another way to incorporate time-invariant predictors: regress the outcome variables directly on the predictors. This model was termed the direct effect model by Stoel et al. (2004). The model trajectory equation is described as follows:
$$y_{it} = \alpha_i + \lambda_t \beta_i + \gamma_t x_i + \varepsilon_{it} \tag{2-26}$$

where $x_i$ is the time-invariant covariate for each individual and $\gamma_t$ is the regression coefficient of $y_{it}$ on $x_i$. The subscript t of $\gamma_t$ indicates that the effect of $x_i$ on $y_{it}$ changes over time. The level and shape equations are the same as equation 2-2:

$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \zeta_{\beta i} \tag{2-27}$$

where all the symbols have the same meaning as before. Although this model is also widely used in applications, this study focuses only on the growth predictor model.

Latent growth model with a time-varying covariate

The conditional model with a time-varying covariate is more complex than the model with a time-invariant covariate in that the predictor varies with time (see Figure 2-3). The time-varying covariate has to be added to the trajectory equation:

$$y_{it} = \alpha_i + \lambda_t \beta_i + \gamma_t x_{it} + \varepsilon_{it} \tag{2-28}$$

where all the terms are the same as specified in equation 2-26, except that $x_{it}$ is a time-varying covariate measured for individual i at time t and its effect on the outcome variable $y_{it}$ is captured by the coefficient $\gamma_t$. The variable $y_{it}$ is now a function of the level, the shape, a time-specific influence of the covariate $x_{it}$, and a random error. The level and shape equations are the same as those in unconditional growth models:

$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \zeta_{\beta i} \tag{2-29}$$

where all the symbols have the same meaning as before.
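A minimal sketch of the trajectory equation with a time-varying covariate follows. The function name `simulate_tvc`, the loadings, the time-specific coefficients, and the two hypothetical individuals are assumptions chosen for illustration. The first individual has a binary covariate that switches from 0 to 1 at the third wave; the second has a covariate that is constant over time, which reduces to the direct effect model for a time-invariant covariate.

```python
import numpy as np

def simulate_tvc(level, shape, x, gammas, loadings, errors):
    """y_it = level_i + loading_t * shape_i + gamma_t * x_it + error_it.

    level, shape      : length-n sequences, one value per individual
    x, errors         : arrays of shape (n, T)
    gammas, loadings  : length-T sequences, one value per wave
    """
    level = np.asarray(level, dtype=float)[:, None]
    shape = np.asarray(shape, dtype=float)[:, None]
    growth = level + np.asarray(loadings, dtype=float) * shape
    covariate_part = np.asarray(gammas, dtype=float) * np.asarray(x, dtype=float)
    return growth + covariate_part + np.asarray(errors, dtype=float)

# Two hypothetical individuals measured at four waves, no error for clarity.
loadings = [0, 1, 2, 3]
gammas = [0.5, 0.5, 1.0, 1.0]   # assumed time-specific covariate effects
x = [[0, 0, 1, 1],              # covariate switches on at wave 3
     [1, 1, 1, 1]]              # covariate constant across waves
y = simulate_tvc([10.0, 12.0], [2.0, 1.0], x, gammas, loadings, np.zeros((2, 4)))
```

For the first individual the covariate adds nothing at the first two waves and one unit at the last two, so the observed scores depart from the underlying straight line only after the covariate changes.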
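Figure 2-3. Latent growth model with a time varying covariate

The structural equation form of this model is represented as follows:

$$\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_i + \boldsymbol{\Gamma}_t \mathbf{x}_i + \boldsymbol{\varepsilon}_i \tag{2-30}$$

where

$$\mathbf{y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{bmatrix}, \quad \boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad \boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}, \quad \boldsymbol{\Gamma}_t = \begin{bmatrix} \gamma_1 & 0 & 0 & 0 \\ 0 & \gamma_2 & 0 & 0 \\ 0 & 0 & \gamma_3 & 0 \\ 0 & 0 & 0 & \gamma_4 \end{bmatrix}, \quad \mathbf{x}_i = \begin{bmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \\ x_{i4} \end{bmatrix}, \quad \boldsymbol{\varepsilon}_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \end{bmatrix}$$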
$$\boldsymbol{\eta}_i = \boldsymbol{\mu}_\eta + \boldsymbol{\zeta}_i \tag{2-31}$$

where $\boldsymbol{\eta}_i = \begin{bmatrix} \alpha_i \\ \beta_i \end{bmatrix}$, $\boldsymbol{\mu}_\eta = \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}$, and $\boldsymbol{\zeta}_i = \begin{bmatrix} \zeta_{\alpha i} \\ \zeta_{\beta i} \end{bmatrix}$.

According to this model, y is jointly affected by both the underlying random growth process and the time-specific influences associated with the time-varying covariate. A typical example of this model is the study conducted by Curran, Muthén, and Hartford (1998), who investigated the time-specific impact of becoming married on heavy alcohol use. They examined whether becoming married for the first time would affect heavy alcohol use, controlling for the normal developmental trend of alcohol use in early adulthood. This model is well suited to that research question.

Latent growth model with a parallel process

The previous two sections introduced two kinds of conditional LGMs that are also considered univariate LGMs. That is, although there are multiple measurements of the outcome variable, they are multiple measures of one dependent variable. Sometimes the analysis involves more than one outcome variable. Suppose we have a dataset reflecting students' academic performance at school. We might be interested not only in the growth trends in individual mathematics and reading achievement but also in whether the individual concurrent changes in the two areas are interrelated. This allows us to understand the change in several domains and how these domains relate to each other. When LGM includes the latent curve process for more than one outcome variable, the model is called multivariate LGM. In this study, it is referred to as LGM with a parallel process. An example of a parallel process model is presented in Figure 2-4.
Figure 2-4. Latent growth model with a parallel process
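For a simple illustration, only two outcome variables are included. The model equations for the variable y and for the variable x are described as:

$$y_{it} = \alpha_{iy} + \lambda_t \beta_{iy} + \varepsilon_{yit} \tag{2-32}$$

$$x_{it} = \alpha_{ix} + \lambda_t \beta_{ix} + \varepsilon_{xit} \tag{2-33}$$

$$\alpha_{iy} = \mu_{\alpha y} + \gamma_{\alpha 1} \alpha_{ix} + \zeta_{\alpha yi}, \qquad \beta_{iy} = \mu_{\beta y} + \gamma_{\beta 1} \alpha_{ix} + \gamma_{\beta 2} \beta_{ix} + \zeta_{\beta yi} \tag{2-34}$$

and

$$\alpha_{ix} = \mu_{\alpha x} + \zeta_{\alpha xi}, \qquad \beta_{ix} = \mu_{\beta x} + \zeta_{\beta xi} \tag{2-35}$$

where $\alpha_{iy}$ and $\beta_{iy}$ represent the level and shape factors respectively for the outcome variable y; $\alpha_{ix}$ and $\beta_{ix}$ represent the level and shape factors respectively for the variable x; $\mu_{\alpha y}$ and $\mu_{\beta y}$ are the mean level and mean slope respectively for the outcome variable y, controlling for all other terms in their respective equations; $\mu_{\alpha x}$ and $\mu_{\beta x}$ are the mean level and mean slope respectively for the variable x; $\zeta_{\alpha yi}$ and $\zeta_{\beta yi}$ are the disturbances of the level $\alpha_{iy}$ and shape $\beta_{iy}$ respectively; and $\zeta_{\alpha xi}$ and $\zeta_{\beta xi}$ are the disturbances of the level $\alpha_{ix}$ and shape $\beta_{ix}$ respectively.

The coefficient $\gamma_{\alpha 1}$ indicates the effect of the initial status of the x variable on the initial status of the y variable. If $\gamma_{\alpha 1}$ is positive, a higher initial status on the x variable predicts a higher initial status on the y variable. Parameter $\gamma_{\beta 1}$ captures the relationship between the level of the x variable and the shape of the y variable when $\beta_{ix}$ is controlled. Parameter $\gamma_{\beta 2}$ represents the effect of the growth shape of the x variable on the growth shape of the y variable, controlling for the impact of $\alpha_{ix}$. A positive value of $\gamma_{\beta 1}$ indicates that a high initial status on the x variable predicts faster growth on the y variable. A positive value of $\gamma_{\beta 2}$ indicates that individuals growing quickly on the x variable also tend to grow quickly on the y variable.

One difference between univariate LGM and multivariate LGM is that the latent factors have to be subscripted with y or x to identify the repeated measure of interest. With two outcome variables, the relationship between the latent factors of one variable and those of the other becomes much more complex. It is worth noting that there is no effect of the growth shape of the x variable on the level of the y variable. The rationale is that the growth shape of the x variable is realized later than the level of the y variable, and a variable realized in the future cannot be used to predict a current variable.

The structural form of the model can be represented as follows:

$$\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_{iy} + \boldsymbol{\varepsilon}_{yi} \tag{2-36}$$

where

$$\mathbf{y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \end{bmatrix}, \quad \boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad \boldsymbol{\eta}_{iy} = \begin{bmatrix} \alpha_{iy} \\ \beta_{iy} \end{bmatrix}, \quad \boldsymbol{\varepsilon}_{yi} = \begin{bmatrix} \varepsilon_{yi1} \\ \varepsilon_{yi2} \\ \varepsilon_{yi3} \\ \varepsilon_{yi4} \end{bmatrix}$$

and

$$\boldsymbol{\eta}_{iy} = \boldsymbol{\mu}_{\eta y} + \boldsymbol{\Gamma}\boldsymbol{\eta}_{ix} + \boldsymbol{\zeta}_{yi} \tag{2-37}$$

where $\boldsymbol{\eta}_{iy} = \begin{bmatrix} \alpha_{iy} \\ \beta_{iy} \end{bmatrix}$, $\boldsymbol{\mu}_{\eta y} = \begin{bmatrix} \mu_{\alpha y} \\ \mu_{\beta y} \end{bmatrix}$, $\boldsymbol{\Gamma} = \begin{bmatrix} \gamma_{\alpha 1} & 0 \\ \gamma_{\beta 1} & \gamma_{\beta 2} \end{bmatrix}$, $\boldsymbol{\eta}_{ix} = \begin{bmatrix} \alpha_{ix} \\ \beta_{ix} \end{bmatrix}$, and

$$\mathbf{x}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_{ix} + \boldsymbol{\varepsilon}_{xi} \tag{2-38}$$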
where $\mathbf{x}_i = \begin{bmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \\ x_{i4} \end{bmatrix}$, $\boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$, $\boldsymbol{\eta}_{ix} = \begin{bmatrix} \alpha_{ix} \\ \beta_{ix} \end{bmatrix}$, and $\boldsymbol{\varepsilon}_{xi} = \begin{bmatrix} \varepsilon_{xi1} \\ \varepsilon_{xi2} \\ \varepsilon_{xi3} \\ \varepsilon_{xi4} \end{bmatrix}$.

There are several variations of the parallel process model. In the model described above, only the level and shape of the y variable are predicted by the level and shape of the x variable, not vice versa. In many studies, the level and shape of the x and y variables are predicted by each other in a variety of combinations. For example, the shape of the x variable can be predicted by the level of the outcome variable (e.g., Cheong, MacKinnon, & Khoo, 2003; Curran, 2000). The distinction between outcome variable and predictor variable therefore becomes blurred. The key concept is that different domains are interrelated and are not independent of each other. All the variables must be assessed on the same measurement occasions. As pointed out by Muthén (2002), one advantage of growth modeling in a latent variable framework is the ease with which analyses of multiple processes, both parallel in time and sequential, can be carried out. A variety of applications of this model have been discussed recently (e.g., Hudson, 2008; Simons, 2007; Mitchell, Kaufman, & Beals, 2005).

Assumptions of Growth Modeling

The assumptions of growth modeling can be summarized in three aspects: within-person residual covariance structure, measurement time and missing data, and functional form of growth.

Within-person residual covariance structure

When the outcome variable is continuous, it is commonly assumed that the within-person error $\varepsilon_{it}$ is multivariate normally distributed with mean of zero and covariance matrix $\boldsymbol{\Theta}_\varepsilon$. If the outcome variable is categorical, an alternative estimation method would be used, such as weighted
least squares with corrected means and variance (Muthén & Khoo, 1998). With a categorical outcome variable, the assumption of multivariate normality should be relaxed. In a fashion analogous to the assumption in regression analysis, all the variables on the right-hand side of the trajectory equation are uncorrelated with the error. More formally, taking equation 2-1 as an example, $\mathrm{cov}(\varepsilon_{it}, \alpha_i) = 0$ and $\mathrm{cov}(\varepsilon_{it}, \beta_i) = 0$ for all i and t. The variance of $\varepsilon_{it}$ can be constant or nonconstant over time, depending on the data characteristics and the real situation. Although it is mentioned in the introduction that LGM allows the measurement errors to be correlated across time, this is not a general assumption. Many studies assume that the errors are not correlated over time, i.e., $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{i,t+s}) = 0$ for $s \neq 0$. It is also assumed that the errors of different individuals at different times are uncorrelated, that is, $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{j,t+s}) = 0$ for $i \neq j$ and for $s \neq 0$. When the errors are assumed to be uncorrelated over time, the assumption about the residuals is expressed as follows:

$$\begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{it} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} \theta_{\varepsilon 1} & 0 & \cdots & 0 \\ 0 & \theta_{\varepsilon 2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \theta_{\varepsilon t} \end{bmatrix} \right) \tag{2-39}$$

Regarding the level and shape equations, the unconditional LGM is used for a simple illustration:

$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \zeta_{\beta i} \tag{2-40}$$

The disturbances $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ are normally distributed with mean of zero and variances of $\psi_{\alpha\alpha}$ and $\psi_{\beta\beta}$. They are also correlated with each other, with covariance $\psi_{\alpha\beta}$. Furthermore, the two disturbances are assumed to be uncorrelated with the error $\varepsilon_{it}$.
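To make the contrast with the uncorrelated-error assumption concrete, the sketch below builds the within-person residual covariance matrix for four waves under independence and under the three time series processes named in the introduction. It assumes stationary processes with white-noise variance `sigma2` and uses the parameterizations e_t = phi e_{t-1} + v_t for AR(1), e_t = v_t + theta v_{t-1} for MA(1), and e_t = phi e_{t-1} + v_t + theta v_{t-1} for ARMA(1,1). The function names, the sign convention of the moving average coefficient, and the parameter values are assumptions of this sketch and may differ from the parameterization used later in this study.

```python
import numpy as np

def _lags(T):
    """Matrix of absolute time lags |t - t'| for T waves."""
    idx = np.arange(T)
    return np.abs(np.subtract.outer(idx, idx))

def independent_cov(T, sigma2):
    """Uncorrelated, homoscedastic errors."""
    return sigma2 * np.eye(T)

def ar1_cov(T, phi, sigma2):
    """Stationary AR(1): e_t = phi * e_{t-1} + v_t, |phi| < 1."""
    return sigma2 / (1.0 - phi ** 2) * phi ** _lags(T)

def ma1_cov(T, theta, sigma2):
    """MA(1): e_t = v_t + theta * v_{t-1}; only lag-1 covariances are nonzero."""
    lags = _lags(T)
    cov = np.zeros((T, T))
    cov[lags == 0] = sigma2 * (1.0 + theta ** 2)
    cov[lags == 1] = sigma2 * theta
    return cov

def arma11_cov(T, phi, theta, sigma2):
    """Stationary ARMA(1,1): e_t = phi * e_{t-1} + v_t + theta * v_{t-1}."""
    gamma = np.zeros(T)  # autocovariances at lags 0, 1, ..., T-1
    gamma[0] = sigma2 * (1.0 + 2.0 * phi * theta + theta ** 2) / (1.0 - phi ** 2)
    if T > 1:
        gamma[1] = sigma2 * (1.0 + phi * theta) * (phi + theta) / (1.0 - phi ** 2)
    for k in range(2, T):
        gamma[k] = phi * gamma[k - 1]
    return gamma[_lags(T)]

# Hypothetical values: four waves, unit white-noise variance.
theta_ind = independent_cov(4, 1.0)
theta_ar = ar1_cov(4, 0.5, 1.0)
theta_ma = ma1_cov(4, 0.4, 1.0)
theta_arma = arma11_cov(4, 0.5, 0.4, 1.0)
```

Only the independent structure is diagonal. Under the other three structures the off-diagonal elements are nonzero, so fitting a diagonal residual covariance matrix to such data leaves those covariances unmodeled.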
Measurement time and missing data

It is commonly assumed in growth modeling that the repeated measures are equal in number and equally spaced for all individuals and that there are no missing data (Duncan, Duncan, Strycker, Li, & Alpert, 1999). This assumption is considered a serious limitation of LGM (e.g., Willett & Sayer, 1994). However, MacCallum, Kim, Malarkey, and Kiecolt-Glaser (1997) argued that the development of full information maximum likelihood can relax this assumption. This method defines the likelihood function using individual scores instead of the variance and covariance matrix. Therefore, even when the measurement times are irregular and/or there are missing data, the estimation of LGM can still be accomplished by using full information maximum likelihood. However, this method is limited in its application by the available software.

Functional form of development

The fundamental assumption, also considered the most serious limitation of LGM, is that all subjects have to follow the same functional form of growth. That is, all individuals (firms, countries, etc.) have to keep the same linear, quadratic, or other form of trend (Hertzog & Nesselroade, 2003; Lawrence & Hancock, 1998). Therefore, although LGM allows all subjects to have different growth trajectories, their basic functional form has to be the same. With the development of multiple-group SEM, individuals can be separated into different groups if sufficient information about the separation is known beforehand. Different groups can then be described by different functional forms. However, within each separated group the trajectory equation has to take the same functional form for all individuals.

Comparisons with Other Methods

The fact that LGM can explicitly model measurement error is a potential advantage of SEM over more traditional methods such as ANOVA and MANOVA. ANOVA has the
most stringent constraints on the covariance matrix of the observed variables. The covariance matrix has to meet the sphericity assumption: the variances of the difference scores for each pair of time points are equal. MANOVA does not require the sphericity assumption, but it shares a disadvantage with ANOVA: both treat the differences among individuals in their growth trajectories as error variance.

Multilevel modeling is another powerful tool in longitudinal data analysis. It also offers great flexibility in modeling the covariance structure. The relationship between structural equation modeling and multilevel modeling has been extensively investigated (e.g., Curran, 2003; Raudenbush & Bryk, 2002). LGM, within the framework of SEM, is comparable with HLM in many aspects. It is believed that when repeated observations are nested within individuals, SEM and HLM are analytically equivalent methods (Curran, 2003). In HLM, the level-one equation (also called the within-person equation), using notation from Raudenbush and Bryk (2002), is presented as follows:

$$y_{it} = \pi_{0i} + \pi_{1i} a_{it} + e_{it} \tag{2-41}$$

where $y_{it}$ is the outcome variable for person i at time t; $\pi_{0i}$ is the individual's initial status and $\pi_{1i}$ is the growth rate; $a_{it}$ represents the time of measurement and $e_{it}$ is the error. Within the SEM framework, $\pi_{0i}$ and $\pi_{1i}$ correspond to the level and shape factors respectively, and $a_{it}$ corresponds to the factor loading $\lambda_t$. The level-two equation (also called the between-person equation) describes the variability in initial status and growth rate across individuals. The equation is presented as follows:

$$\pi_{0i} = \beta_{00} + r_{0i}, \qquad \pi_{1i} = \beta_{10} + r_{1i} \tag{2-42}$$
where $\beta_{00}$ is the mean initial status and $\beta_{10}$ is the mean growth rate; $r_{0i}$ and $r_{1i}$ are the random components for the level and shape respectively; the covariance of the level and shape is captured by the covariance between $r_{0i}$ and $r_{1i}$. Note that equations 2-41 and 2-42 are similar to the unconditional LGM equations 2-1 and 2-2. Under general conditions, the two modeling methods, HLM and LGM, approach the same problem from different perspectives (Curran, 2003). When a time-invariant covariate is included in the model, it is added to the level-two equation as shown in equation 2-43, and the level-one equation is the same as equation 2-41:

$$\pi_{0i} = \beta_{00} + \beta_{01} X_{1i} + r_{0i}, \qquad \pi_{1i} = \beta_{10} + \beta_{11} X_{1i} + r_{1i} \tag{2-43}$$

where $X_{1i}$ represents the time-invariant covariate, and $\beta_{01}$ and $\beta_{11}$ capture the effect of $X_{1i}$ on initial status and growth rate respectively. This equation is comparable to equation 2-19, introduced in the section entitled "Latent growth model with a time-invariant covariate." Therefore, if only a time-invariant covariate is included, the HLM model equation is analytically the same as LGM with a time-invariant covariate. When a time-varying covariate is added to the model, the level-one equation becomes

$$y_{it} = \pi_{0i} + \pi_{1i} a_{it} + \pi_{2i} x_{it} + e_{it} \tag{2-44}$$

where $x_{it}$ represents the time-varying covariate and $\pi_{2i}$ is the effect of the time-varying covariate on the outcome variable controlling for the influence of time. In LGM, the trajectory equation with a time-varying covariate is presented in equation 2-28. The difference between equation 2-28 and equation 2-44 lies in the regression coefficient for the time-varying covariate. In equation 2-44, the coefficient $\pi_{2i}$ is constant over time, which means the effect
PAGE 42
of the time-varying covariate on the outcome variable remains the same across different time periods. In equation 2-28, as mentioned before, the regression coefficient $\gamma_t$ has a subscript $t$, which shows that the effect of the time-varying covariate on the outcome variable varies across time periods. A point worth mentioning is that $\pi_{2i}$ in equation 2-44 can also be allowed to change with time, although this is not introduced here. The level-two model for the HLM is still the same as equation 2-42. Therefore, when a time-varying covariate is added, LGM and HLM are comparable, but researchers should decide whether the regression coefficient of the time-varying covariate varies with time.

According to Raudenbush and Bryk (2002), within the context of modeling change, the difference between HLM and SEM lies in the limitations of software rather than in a real model difference. They recommended using the SEM approach when correlated errors exist, because the available SEM software allows for easy specification and estimation of correlations between errors.

The above comparison of HLM and SEM focuses on the situation in which repeated observations are nested within individuals. However, when individuals are nested within groups, e.g., students nested in schools, using SEM to implement the analysis is a data management nightmare and an error-prone process (Curran, 2003). The interpretation of parameter estimates using SEM requires special care and attention. For example, the latent factor means in the SEM are regression coefficients in HLM. Curran (2003) recommended HLM as the better approach when no other elements of the SEM are incorporated in a multilevel model.

The analysis of longitudinal data has also been well investigated in econometrics. The most commonly used models are the fixed effects model and the random effects model (also called
PAGE 43
variance components model) (e.g., Hsiao, 2003). For simplicity, only one independent variable is considered. The fixed effects model is given by:

$$y_{it} = \alpha_i + \beta x_{it} + u_{it} \tag{2-45}$$

where $i = 1, \ldots, N$, $t = 1, \ldots, T$, and $y_{it}$ and $x_{it}$ are the endogenous and exogenous variables respectively, measured for the $i$th individual unit at time $t$. The $\alpha_i$ captures the specific effect for an individual unit and is assumed to be constant over time. The $\beta$ is a constant parameter representing the relationship between $x$ and $y$ and is constant across all individual units and time periods. The $u_{it}$ is independently and identically distributed with mean zero and variance $\sigma_u^2$; it represents the effects of the omitted variables that are peculiar to both the individual units and the time periods.

The fixed effects model assumes that both $\alpha_i$ and $\beta$ are nonrandom, whereas in LGM both the individual level and the individual shape are random variables. Both models assume that each individual unit has a different intercept (i.e., $\alpha_i$ differs across individuals). In the fixed effects model, $\beta$ is a slope parameter assumed to be the same across all individuals, while in LGM the shape differs across individuals. The factor loading $\lambda_t$ in LGM can be represented in the fixed effects model by including the time codes for each period as an extra explanatory variable. For example, an extra $x_{it}$ taking the values $(0, 1, \ldots, T)$ could be added for a linear time effect. The coefficient of the time-varying covariate varies with time in LGM but remains constant in the fixed effects model. The random effects model is:

$$y_{it} = \beta x_{it} + v_{it} = \beta x_{it} + \alpha_i + d_t + u_{it} \tag{2-46}$$
PAGE 44
where $v_{it} = \alpha_i + d_t + u_{it}$; $\alpha_i$, $d_t$ and $u_{it}$ are random variables with $\alpha_i \sim \mathrm{iid}(0, \sigma_\alpha^2)$, $d_t \sim \mathrm{iid}(0, \sigma_d^2)$ and $u_{it} \sim \mathrm{iid}(0, \sigma_u^2)$, and the three random variables are jointly independent. Furthermore,

$$\operatorname{cov}(v_{it}, v_{is}) = \begin{cases} \sigma_\alpha^2 + \sigma_d^2 + \sigma_u^2 & \text{for } t = s \\ \sigma_\alpha^2 & \text{for } t \neq s \end{cases} \tag{2-47}$$

and

$$\operatorname{cov}(v_{it}, v_{js}) = \begin{cases} \sigma_d^2 & \text{for } i \neq j,\ t = s \\ 0 & \text{for } i \neq j,\ t \neq s \end{cases} \tag{2-48}$$

The $\alpha_i$ captures individual differences that endure over time and is constant over time; it is therefore a time-invariant individual effect. The $d_t$ represents factors that are peculiar to specific time periods but affect individuals equally, so it is an individual-invariant effect. The parameter $\alpha_i$ in the random effects model is no longer a fixed quantity as in the fixed effects model: $\alpha_i$ and $d_t$, like $u_{it}$, are treated as random variables. The $\alpha_i$ is termed the permanent component and $u_{it}$ the transitory disturbance (MaCurdy, 1982).

To make the comparison between the random effects model and LGM with a time-varying covariate easier to understand, equations 2-28 and 2-29 for LGM with a time-varying covariate are combined into the following equation:

$$y_{it} = \gamma_t x_{it} + \mu_\alpha + \lambda_t \mu_B + \zeta_{\alpha i} + \lambda_t \zeta_{Bi} + \varepsilon_{it} \tag{2-49}$$

With two more explanatory variables $x_{it}$ added, the random effects model is comparable to LGM with a time-varying covariate. One $x_{it}$ contains the time codes for each period as described before, which represent the factor loading $\lambda_t$ in LGM. The regression coefficient of this variable is the mean slope (i.e., $\mu_B$ in equation 2-49). Another variable $x_{it}$ is a vector of
PAGE 45
constant ones, that is, $x_{it} = (1, \ldots, 1)$. The length of this vector equals the total number of time periods. The coefficient of this constant variable is comparable to the mean level in the LGM (i.e., $\mu_\alpha$ in equation 2-49). The $\alpha_i$ in the random effects model is comparable to $\zeta_{\alpha i}$, and $\sigma_\alpha^2$ to $\operatorname{var}(\zeta_{\alpha i})$. The $d_t$ can represent the effect of the product of $\lambda_t$ and $\zeta_{Bi}$, so that $\sigma_d^2$ corresponds to the variance of that product. The residual $u_{it}$ is comparable to the within-person residual $\varepsilon_{it}$. However, as in the fixed effects model, $\gamma_t$, the coefficient of the time-varying covariate $x_{it}$ in LGM, is comparable to $\beta$ in the random effects model except that $\gamma_t$ changes with time whereas $\beta$ remains constant.

Both of the above models assume different intercepts for different individual units and a constant slope coefficient for the time variable and for other time-varying covariates. Therefore, both the fixed effects model and the random effects model belong to the category of variable intercept models. There are also models in econometrics that treat the coefficients as varying, that is, models that allow the coefficients to differ from unit to unit and/or from time to time. The general specification of the variable coefficient model, assuming only cross-sectional differences are present, is

$$y_{it} = \beta_i' x_{it} + u_{it} \tag{2-50}$$

or, assuming only time period differences are present,

$$y_{it} = \beta_t' x_{it} + u_{it} \tag{2-51}$$

where $\beta_i$ or $\beta_t$ is a $K \times 1$ vector of parameters and $x_{it}$ is a $K \times 1$ vector of independent variables.
PAGE 46
Because the variable coefficient model is not as widely used in empirical work as the variable intercept model, owing to its computational complexity (Hsiao, 2003), it is not introduced here.

Stationary Time Series Model

As mentioned before, correlated residuals are often present in longitudinal data. Some of these correlations follow a stationary time series process (e.g., Sivo, 2001; Sivo & Willson, 2000). A time series is a set of observations generated sequentially in time (Box & Jenkins, 1976). A time series can be strongly stationary or weakly stationary. In a strongly stationary time series, the joint probability distribution does not depend on time itself but only on the differences between time points. In other words, statistical properties of the series such as the mean, variance, and covariance do not depend on time $t$; they are all constant over time. Parameters such as the mean and variance of the outcome variable at time 1 should equal those at times 2, 3 and so forth. Furthermore, the covariance between any two observations, say $y_t$ and $y_{t+s}$, is assumed to depend not on time $t$ but only on the number of periods between the two observations. A time series can be stationary in one statistic, e.g., the mean (termed mean stationary), but not in another, e.g., the variance. A time series is weakly stationary when its mean and variance are constant over time and its covariances depend only on the lag between observations. With the stationarity assumption, one can predict that the statistical properties will be the same in the future as they have been in the past.

According to Box and Jenkins (1976), stationary time series data may often be modeled by two distinct stochastic processes: autoregressive (AR) and moving average (MA). The autoregressive moving average (ARMA) process is a mixture of the two processes. Box and
PAGE 47
Jenkins (1976) introduced three linear stationary models accordingly: the AR model, the MA model and the ARMA model. These models are introduced in the following sections.

Autoregressive (AR) Model

The idea of this model is that each measure at time $t$ is a function of the measures at previous times. The equation is as follows:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \tag{2-52}$$

where $y_t$ is the outcome variable at time $t$, $\phi$ is the autoregressive coefficient, which reflects the correlation between outcomes at different times, and $|\phi| < 1$. The variable $\varepsilon_t$ is called white noise (Box & Jenkins, 1976); it consists of a series of independently distributed random shocks with $E(\varepsilon_t) = 0$ and $\operatorname{var}(\varepsilon_t) = \sigma_\varepsilon^2$. The process defined by equation 2-52 is called an autoregressive process of order $p$, or AR($p$) process. The first-order autoregressive model, AR(1), is the model in which the outcome variable at time $t$ is affected only by the immediately preceding outcome at time $t-1$. Under the AR(1) assumption, equation 2-52 simplifies to

$$y_t = \phi_1 y_{t-1} + \varepsilon_t. \tag{2-53}$$

For convenience in determining other properties of a time series process, no intercept is included in the model, which means the mean of $y_t$ is equal to zero. The zero mean of an AR(1) process can be obtained by taking the expected value in equation 2-53:

$$E(y_t) = \phi_1 E(y_{t-1}). \tag{2-54}$$

Since an AR process is stationary, the means at all time periods are equal, that is, $E(y_t) = E(y_{t-1}) = \cdots = \mu$. The following equation is obtained:

$$E(y_{t-1}) = \phi_1 E(y_{t-1}). \tag{2-55}$$

As $\phi_1$ is not equal to one, equation 2-55 implies that $\mu$ is equal to zero. The variance of $y_t$ is
PAGE 48
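$$\sigma_y^2 = \operatorname{var}(y_t) = \operatorname{var}(\phi_1 y_{t-1} + \varepsilon_t) = \phi_1^2 \operatorname{var}(y_{t-1}) + \operatorname{var}(\varepsilon_t) \tag{2-56}$$

since $y_{t-1}$ and $\varepsilon_t$ are independent. The covariance is the same for any two outcome variables one period apart:

$$\operatorname{cov}(y_t, y_{t-1}) = E[(y_t - E(y_t))(y_{t-1} - E(y_{t-1}))] = E[y_t y_{t-1}] = E[(\phi_1 y_{t-1} + \varepsilon_t) y_{t-1}] = \phi_1 E(y_{t-1}^2) + E(\varepsilon_t) E(y_{t-1}) = \phi_1 \sigma_y^2 \tag{2-57}$$

The covariance between $y_t$ and $y_{t-2}$ is given by

$$\operatorname{cov}(y_t, y_{t-2}) = E[(y_t - E(y_t))(y_{t-2} - E(y_{t-2}))] = E(y_t y_{t-2}) = E[(\phi_1 y_{t-1} + \varepsilon_t) y_{t-2}] = \phi_1 E(y_{t-1} y_{t-2}) + E(\varepsilon_t y_{t-2}) = \phi_1^2 \sigma_y^2 \tag{2-58}$$

Similarly, the covariance between $y_t$ and $y_{t-k}$ can be derived as

$$\operatorname{cov}(y_t, y_{t-k}) = \phi_1^k \sigma_y^2 \tag{2-59}$$

Therefore, the covariance matrix associated with an AR(1) model is defined as:

$$\Sigma = \sigma_y^2 \begin{bmatrix} 1 & \phi_1 & \phi_1^2 & \cdots & \phi_1^{t-1} \\ \phi_1 & 1 & \phi_1 & \cdots & \phi_1^{t-2} \\ \phi_1^2 & \phi_1 & 1 & \cdots & \phi_1^{t-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi_1^{t-1} & \phi_1^{t-2} & \phi_1^{t-3} & \cdots & 1 \end{bmatrix} \tag{2-60}$$

This covariance matrix is symmetric with constant $\sigma_y^2$ on the diagonal. A matrix of this form is called an autocovariance matrix, and the corresponding correlation matrix is called an autocorrelation matrix (Box & Jenkins, 1976). The AR(1) process is sometimes called a Markov process because the distribution of $y_t$ given $y_{t-1}, y_{t-2}, y_{t-3}, \ldots$ is exactly the same as the distribution of $y_t$ given $y_{t-1}$. Similarly, the AR(2) model is defined as a model in which the outcome variable at time $t$ is affected only by the two immediately preceding outcomes: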
PAGE 49
$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t \tag{2-61}$$

In terms of practical importance, only the AR(1) and AR(2) models receive considerable attention in applications. For example, McCleary and Hay (1980) investigated the effect of a community crime prevention program on purse snatchings in Hyde Park, Chicago, from January 1969 to September 1973. The number of purse snatchings followed an AR(2) time series.

Moving Average (MA) Model

In this model, a construct measured at one time is affected by autocorrelated residuals:

$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} \tag{2-62}$$

where $y_t$ is the outcome variable at time $t$, $\theta$ is the moving average coefficient, which reflects the correlation between residuals at a given lag, and $|\theta| < 1$. The sequence of residuals $\{\varepsilon_t\}$ is a white noise series with zero mean and constant variance $\sigma_\varepsilon^2$. The process defined in equation 2-62 is called a moving average process of order $q$, abbreviated MA($q$). This process is useful in describing phenomena in which random events have an immediate effect that lasts only for a short period of time. The first-order moving average model, MA(1), analogous to the AR(1) model, is a model in which the outcome variable is affected only by its current residual and the immediately preceding residual. The formula is presented as follows:

$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} \tag{2-63}$$

The mean of $y_t$ for an MA(1) model is equal to 0 because $\{\varepsilon_t\}$ is a series with mean zero and variance $\sigma_\varepsilon^2$. The variance of $y_t$ is given by the following equation:
PAGE 50
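$$\sigma_y^2 = \operatorname{var}(y_t) = E(y_t^2) = E(\varepsilon_t^2 - 2\theta_1 \varepsilon_t \varepsilon_{t-1} + \theta_1^2 \varepsilon_{t-1}^2) = (1 + \theta_1^2)\sigma_\varepsilon^2 \tag{2-64}$$

Therefore, by equation 2-64, the relationship between $\sigma_\varepsilon^2$ and $\sigma_y^2$ is obtained as follows:

$$\sigma_\varepsilon^2 = \sigma_y^2 / (1 + \theta_1^2) \tag{2-65}$$

The covariance between $y_t$ and $y_{t-1}$ is derived as:

$$\operatorname{cov}(y_t, y_{t-1}) = E(y_t y_{t-1}) = E[(\varepsilon_t - \theta_1 \varepsilon_{t-1})(\varepsilon_{t-1} - \theta_1 \varepsilon_{t-2})] = -\theta_1 E(\varepsilon_{t-1}^2) = -\theta_1 \sigma_\varepsilon^2 = \frac{-\theta_1}{1 + \theta_1^2}\sigma_y^2 \tag{2-66}$$

The covariance between $y_t$ and $y_{t-2}$ is derived as:

$$\operatorname{cov}(y_t, y_{t-2}) = E(y_t y_{t-2}) = E[(\varepsilon_t - \theta_1 \varepsilon_{t-1})(\varepsilon_{t-2} - \theta_1 \varepsilon_{t-3})] = 0 \tag{2-67}$$

The covariances between $y_t$ and $y_{t-k}$ for $k \geq 2$ can be derived similarly and are all equal to zero. Thus an MA(1) process has a memory of only one period, which is not true of an AR(1) process. The covariance matrix associated with an MA(1) process is:

$$\Sigma = \sigma_y^2 \begin{bmatrix} 1 & \frac{-\theta_1}{1+\theta_1^2} & 0 & \cdots & 0 \\ \frac{-\theta_1}{1+\theta_1^2} & 1 & \frac{-\theta_1}{1+\theta_1^2} & \cdots & 0 \\ 0 & \frac{-\theta_1}{1+\theta_1^2} & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \tag{2-68}$$

The second-order moving average process is defined as:

$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} \tag{2-69}$$

The MA(1) and MA(2) models, just like the AR(1) and AR(2) models, are particularly important in practice.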
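As an illustration only, the AR(1) and MA(1) covariance structures of equations 2-60 and 2-68 can be written out in a few lines of Python. This is a minimal sketch and not the code used in this study; the function names `ar1_cov` and `ma1_cov` are hypothetical, and the MA(1) function assumes the sign convention in which the lagged residual enters with a negative coefficient.

```python
def ar1_cov(phi, sigma_y2, n_times):
    """AR(1) covariance matrix: entry (r, c) equals sigma_y2 * phi**|r - c|."""
    return [[sigma_y2 * phi ** abs(r - c) for c in range(n_times)]
            for r in range(n_times)]


def ma1_cov(theta, sigma_y2, n_times):
    """MA(1) covariance matrix, assuming y_t = e_t - theta * e_{t-1}.

    Only observations one period apart are correlated; every covariance
    at lag two or more is zero.
    """
    lag_one = -theta / (1.0 + theta ** 2) * sigma_y2

    def entry(r, c):
        lag = abs(r - c)
        if lag == 0:
            return sigma_y2
        if lag == 1:
            return lag_one
        return 0.0

    return [[entry(r, c) for c in range(n_times)] for r in range(n_times)]


# Hypothetical example values: four occasions, variance 50, parameter 0.5.
ar1_matrix = ar1_cov(0.5, 50.0, 4)
ma1_matrix = ma1_cov(0.5, 50.0, 4)
```

In the AR(1) matrix the covariances decay geometrically with the lag, whereas in the MA(1) matrix they vanish beyond lag one, which is the difference in memory noted above.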
PAGE 51
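Autoregressive Moving Average (ARMA) Model

This model captures the process in which the above two situations occur at the same time:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} \tag{2-70}$$

where all the symbols have the same meaning as described above. This process is referred to as an ARMA($p$, $q$) process. It may be thought of as a combination of a $p$th-order autoregressive process and a $q$th-order moving average process. The first-order autoregressive moving average model, ARMA(1,1), is defined as follows:

$$y_t = \phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \tag{2-71}$$

Taking the expected value of both sides of equation 2-71, the mean of $y_t$ for ARMA(1,1) satisfies:

$$E(y_t) = \phi_1 E(y_{t-1}). \tag{2-72}$$

Since $E(y_t) = E(y_{t-1}) = \mu$ and $|\phi_1| < 1$, equation 2-72 implies that $\mu$ is equal to zero, the same as in an AR or an MA model. The variance of the outcome variable in an ARMA(1,1) process is:

$$\sigma_y^2 = \operatorname{var}(y_t) = E(y_t^2) = E[(\phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1})^2] = \phi_1^2 \sigma_y^2 + \sigma_\varepsilon^2 + \theta_1^2 \sigma_\varepsilon^2 - 2\phi_1\theta_1 E(y_{t-1}\varepsilon_{t-1}) = \phi_1^2 \sigma_y^2 + \sigma_\varepsilon^2 + \theta_1^2 \sigma_\varepsilon^2 - 2\phi_1\theta_1 \sigma_\varepsilon^2 \tag{2-73}$$

where $E(y_{t-1}\varepsilon_{t-1}) = E[(\phi_1 y_{t-2} + \varepsilon_{t-1} - \theta_1 \varepsilon_{t-2})\varepsilon_{t-1}] = E(\varepsilon_{t-1}^2) = \sigma_\varepsilon^2$, and $\varepsilon_{t-1}$ is not correlated with $y_{t-2}$ or $\varepsilon_{t-2}$. According to equation 2-73,

$$\sigma_\varepsilon^2 = \frac{(1 - \phi_1^2)\sigma_y^2}{1 + \theta_1^2 - 2\phi_1\theta_1} \tag{2-74}$$

The covariance between $y_t$ and $y_{t-1}$ could be derived as: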
PAGE 52
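$$\operatorname{cov}(y_t, y_{t-1}) = E[(\phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}) y_{t-1}] = \phi_1 E(y_{t-1}^2) - \theta_1 E(y_{t-1}\varepsilon_{t-1}) = \phi_1 \sigma_y^2 - \theta_1 \sigma_\varepsilon^2 \tag{2-75}$$

Substituting $\sigma_\varepsilon^2$ from equation 2-74 into equation 2-75 leads to the following result:

$$\operatorname{cov}(y_t, y_{t-1}) = \frac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1}\sigma_y^2 \tag{2-76}$$

Similarly, because $\varepsilon_t$ and $\varepsilon_{t-1}$ are uncorrelated with $y_{t-2}$, the covariance between $y_t$ and $y_{t-2}$ could be derived as:

$$\operatorname{cov}(y_t, y_{t-2}) = E[(\phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}) y_{t-2}] = \phi_1 E(y_{t-1} y_{t-2}) = \phi_1 \frac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1}\sigma_y^2 \tag{2-77}$$

and the covariance between $y_t$ and $y_{t-k}$ could be derived as:

$$\operatorname{cov}(y_t, y_{t-k}) = \phi_1^{k-1} \frac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1}\sigma_y^2 \tag{2-78}$$

The covariance matrix associated with an ARMA(1,1) process is:

$$\Sigma = \sigma_y^2 \begin{bmatrix} 1 & \rho_1 & \phi_1\rho_1 & \cdots & \phi_1^{t-2}\rho_1 \\ \rho_1 & 1 & \rho_1 & \cdots & \phi_1^{t-3}\rho_1 \\ \phi_1\rho_1 & \rho_1 & 1 & \cdots & \phi_1^{t-4}\rho_1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi_1^{t-2}\rho_1 & \phi_1^{t-3}\rho_1 & \phi_1^{t-4}\rho_1 & \cdots & 1 \end{bmatrix}, \qquad \rho_1 = \frac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1} \tag{2-79}$$

Modeling Time Series in the Error Structure in Longitudinal Data Analysis

The use of time series processes to model the error structure can be found in many longitudinal data analyses. For example, in the econometrics literature, many studies estimated variance component models in which the transitory components followed a time series structure (e.g.,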
PAGE 53
David, 1971; Hause, 1977; Lillard & Willis, 1978; Lillard & Weiss, 1979; MaCurdy, 1982). As introduced in the model comparison part, the transitory component is comparable to the within-person residuals in LGM. Although time series analysis is relatively uncommon in educational research, it has gained popularity lately. Researchers have attempted either to integrate a time series model directly into the growth model (e.g., Curran & Bollen, 2001; Sivo et al., 2005) or to capture the time series process in the within-person residual covariance structure under the framework of HLM or SEM (e.g., Ferron et al., 2002; You, 2006; Kwok, West, & Green, 2007). The most common time series process in a within-person residual structure is the AR(1) error structure (e.g., Wolfinger, 1993; Ferron et al., 2002; Kwok et al., 2007). It is taken as an alternative error structure in many studies investigating misspecification of the within-person error covariance structure (e.g., Ferron et al., 2002; Singer & Willett, 2003; Kwok et al., 2007). Mplus, a commonly used SEM program, even includes the AR(1) within-person residual structure in its demonstration examples. Although MA and ARMA models are less investigated than the AR model, examples are not difficult to locate. For example, Sivo (1997) pointed out that if measurement error correlations were found in longitudinal data sets, these correlations were usually found at a particular lag nearest to the diagonal of the error covariance matrix, indicating an MA or an ARMA process. In other studies that investigated the effects of misspecifying the within-person residual structure, the MA(1) and/or ARMA(1,1) models were also selected as alternative error structures (e.g., Kwok et al., 2007; Singer & Willett, 2003).
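To make the ARMA(1,1) alternative concrete, the covariance structure of equation 2-79 can be sketched in Python as follows. This is an illustrative sketch rather than code from any of the cited studies; the function name `arma11_cov` is hypothetical, and the formula assumes that the lagged residual enters the ARMA(1,1) model with a negative coefficient.

```python
def arma11_cov(phi, theta, sigma_y2, n_times):
    """ARMA(1,1) covariance matrix with constant variance sigma_y2.

    Assumes y_t = phi * y_{t-1} + e_t - theta * e_{t-1}. The lag-one
    correlation is rho1, and each further lag multiplies it by phi.
    """
    rho1 = ((1.0 - phi * theta) * (phi - theta)
            / (1.0 + theta ** 2 - 2.0 * phi * theta))

    def entry(r, c):
        lag = abs(r - c)
        if lag == 0:
            return sigma_y2
        return sigma_y2 * rho1 * phi ** (lag - 1)

    return [[entry(r, c) for c in range(n_times)] for r in range(n_times)]


# Setting theta to zero gives an AR(1) structure; setting phi to zero
# gives an MA(1) structure. The values below are hypothetical.
as_ar1 = arma11_cov(0.5, 0.0, 50.0, 4)
as_ma1 = arma11_cov(0.0, 0.5, 50.0, 4)
```

The two special cases show why the AR(1) and MA(1) structures can be regarded as restricted versions of the ARMA(1,1) structure.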
Studies on the Impact of Misspecifying the Within-Person Error Structure

As mentioned before, although researchers often assume that the within-person residuals are uncorrelated in applied studies, previous literature indicated that this assumption was
PAGE 54
often violated. Failing to take account of the residual structure among repeated measures might bias the model estimates and lead to incorrect inferences (e.g., Fitzmaurice et al., 2004; Singer & Willett, 2003). Therefore, the impact of violating the independence assumption on parameter estimates deserves methodologists' attention.

It is generally believed that in linear mixed models, the fixed effects estimates are consistent regardless of whether the random effects part of the model is correctly specified (Verbeke & Molenberghs, 2000). However, when the random effects are misspecified, the standard errors usually computed for the fixed effects estimates may no longer be appropriate.

Ferron, Dailey, and Yi (2002) investigated the impact of misspecifying the within-person residual structure under the framework of HLM through a series of simulation studies. The models examined included multiple predictors in the level-two equation, a nonlinear growth curve, or missing or unequally spaced observations. They found that when the residual covariance structure was assumed to be a diagonal matrix with constant variance but actually followed an AR(1) or an MA(1) process, under most conditions (the exception being the nonlinear model with an unbalanced design) the estimates of the fixed effects remained unbiased and the tests of the fixed effects were robust to the model misspecification. However, when the model failed to include the AR(1) structure, the intercept and slope variance estimates were inflated while their covariance was deflated. Model fit criteria frequently failed to identify the correct model when the series of measurement occasions was short.

Based on the opposites-naming data, Singer and Willett (2003) compared the following six residual covariance structures in a multilevel model: unstructured, compound symmetric, heterogeneous compound symmetric, autoregressive, heterogeneous autoregressive and Toeplitz.
They found that, except for the Toeplitz and unstructured residual structures, the other residual
PAGE 55
structures did not make a strong improvement in model fit. However, the precision of the fixed effect estimates improved for all the error structures except for the Toeplitz, unstructured and standard residual structures. Their overall conclusion is consistent with that of Verbeke and Molenberghs (2000): estimates of the fixed effects are unbiased regardless of the error structure, but the standard errors of the fixed effect estimates are affected by the choice of error structure. However, conclusions from this study might not be tenable owing to the small sample size (35 participants) and the short measurement series (only 4 occasions per individual).

Yuan and Bentler (2004, 2006) showed analytically that the intercept and slope parameters in linear growth curve models could be estimated consistently even when the covariance structure was misspecified.

You (2006) evaluated how the growth model estimates were affected when both the homoscedasticity and the independence assumptions were violated. Her simulation was conducted with a linear unconditional latent growth model. Results indicated that misspecification of the error structure had no impact on the estimates of the intercept and slope of the growth trajectory, which was consistent with Yuan and Bentler (2004, 2006). For the variance components, You (2006) found that under most conditions the variance estimates of the intercept and slope were inflated, while the covariance of intercept and slope was deflated.

Kwok, West and Green (2007) conducted a Monte Carlo study to investigate the impact of misspecifying the within-subject covariance structure in longitudinal multilevel models. The multilevel model they employed is a random regression coefficient model with only the time variable included in the level-one equation. This model, as
PAGE 56
discussed before, is comparable to the unconditional latent growth model. It was found that when the within-subject covariance matrix had an AR(1) structure, misspecifying it as a diagonal matrix with constant variance resulted in overestimates of the random effects (e.g., $\tau_{00}$, $\tau_{01}$). Regardless of how the within-subject error matrix was misspecified, the fixed effect parameter estimates were unbiased but their standard errors were overestimated, which was consistent with Verbeke and Molenberghs (2000).

Significance of This Study

The above literature review has demonstrated the consequences of unmodeled time series processes in longitudinal studies. It was shown that when the within-person error structure was misspecified, the fixed effect parameter estimates were unbiased, the standard error estimates of the fixed effects were possibly affected, and the random effects were biased. The major focus of this study is to examine the effect of misspecifying the residual structure on the estimation and testing of the fixed effects and the random effects of conditional LGMs. The motivation for this study rests on the following three reasons.

First, the above literature review has shown that latent growth modeling is a powerful tool in the assessment of change, owing to its flexibility in including time-varying and time-invariant covariates, its less strict requirements on the residual covariance structure, and its various model formats. Despite these advantages, if the model is misspecified, the estimates may be biased. Therefore, it is important to examine the extent to which latent growth modeling is robust to model misspecification. However, few simulation studies have examined this issue empirically, especially in the framework of LGM.
Furthermore, although most applications have been conducted within the framework of conditional LGM, since only conditional LGM provides a platform to test more complex research questions, previous
PAGE 57
studies of model misspecification have mainly been performed within the framework of unconditional LGM. Whether the results from unconditional LGM can be generalized to conditional LGM is unknown, given the much more complex nature of conditional LGM. Therefore, it is hoped that this study may contribute to the knowledge about the impact of model misspecification in LGM and make the results more generalizable. It is also hoped that this study will yield recommendations for applied researchers analyzing longitudinal data.

Second, although time series processes have been extensively investigated in econometrics, they are still relatively uncommon in structural equation modeling. Owing to the unique characteristics of longitudinal data, it is common to find time series processes in them. Although the AR process has been well discussed in the context of LGM, very few studies to date include a systematic discussion of the AR, MA and ARMA processes at the same time. Furthermore, no systematic study has investigated the three unmodeled time series processes in the error structure of conditional LGM. Therefore, this study fills a gap between methodological research and applied research, and aims to provide more insightful information to applied researchers.

Third, modeling time series processes in latent growth modeling is a relatively new area, and few studies have tried to integrate the time series process into latent growth modeling. For one reason or another, some SEM researchers view the analysis of time series as something that uses fundamentally different concepts and methods, which inhibits the interchange between the two ways of thinking. The two approaches might be mutually productive and beneficial. This study hopes to contribute to a wider understanding and better application of time series analysis in educational and behavioral research.
PAGE 58
Research Questions

This study aims to investigate how unmodeled time series processes in the error structure of latent growth curve models affect the parameter estimates and their standard error estimates, as well as model fit indexes and the GOF test. The parameters of interest are $\mu_\alpha$ and $\mu_B$ (i.e., the mean of the level and the mean of the shape, controlling for all other terms in the between-person equations), the residual variance of the level equation, the residual variance of the shape equation, and the covariance of the level and shape residuals, as well as the following path coefficients, depending on the model:
1. The direct effects of the time-invariant predictor on the level and shape factors in LGM with a time-invariant predictor (see equation 18);
2. The direct effect of the time-varying predictor on the outcome variables in LGM with a time-varying predictor (i.e., $\gamma_t$ in equation 27);
3. In LGM with a parallel process, the direct effects of the level and shape of the time-varying predictor on the level and shape of the outcome variable (see equation 32).

The fundamental research question this study aims to address is: are LGM parameter estimates and standard errors affected when the within-person residual covariance structure fails to include the time series process? Other research questions include whether commonly used fit indexes and the GOF test can differentiate between two analysis models differing in their within-person covariance structures, and whether the parameter estimates and their standard error estimates are affected by the design factors.
PAGE 59
CHAPTER 3 METHOD

Monte Carlo simulations have been widely used in social science to investigate the possible effects of assumption violations. When an analytical approach is difficult or impossible to implement, Monte Carlo simulation offers researchers an alternative way to address research questions. In this study, the simulation was conducted in R version 2.7.1 (R Development Core Team, 2008). A total of 5000 replications were simulated for each condition. The models investigated in this study were:
1. LGM with one time-invariant covariate.
2. LGM with one time-varying covariate.
3. LGM with a parallel process.

This method section presents the following: (a) the simulation conditions, which include the design factors and the population parameters; (b) the data generation procedure; and (c) the data analysis criteria.

Design Factors

Number of Measurement Times

It is believed that the precision of parameter estimates tends to increase with the number of observations for each individual (Duncan et al., 1999). Kwok et al. (2007) found that among the longitudinal studies published in Developmental Psychology in 2002, 52% of these studies collected three or four waves of data while 48% collected 8 waves. Among the 267 peer-reviewed journal articles obtained by searching Academic Search Premier, Business Source Premier, EconLit, Professional Development Collection, PsycINFO, and Sociological Collection from 2004 to 2008, the number of waves ranged from three to eight. Around 50% of these studies had three or four waves of data, around 30% had five waves of data and around 20% had more than six waves of data. For model identification purposes, a minimum of four measurement occasions is required in growth modeling, assuming the errors are not identical (Muthén & Khoo, 1998). Hence, four was considered the minimum number of measurement periods in this
PAGE 60
simulation design. The rationale for using eight waves is that a large number of measurement periods would make any difference in parameter estimates more obvious, if such a difference exists. Hence, four waves and eight waves were used in this study to represent a small and a large number of repeated measures respectively.

Sample Size

Hamilton, Gagne, and Hancock (2003) argued that a sample size between 100 and 200 was the minimum requirement for univariate LGM. Anderson and Gerbing (1998) and Jackson (2003) also recommended a sample size of 150 or 200 when the maximum likelihood estimation method is used in growth modeling. Fan (2003b) recommended a minimum sample size of 150 for univariate growth modeling, together with 500 and 1000 representing medium and large sample sizes respectively. Kline (1998) recommended a ratio of 10:1, that is, 10 observations for each parameter estimated. In our study, assuming eight waves, the total number of parameters was 15 when a time-invariant covariate was included in the model, 23 when a time-varying covariate was included, and 29 in the model with a parallel process. Hence, the sample size in this study should range from 150 to 290 according to the 10:1 ratio rule. Therefore, sample sizes of 200, 500 and 2000 were simulated to represent small, medium and large samples.

Time Series Parameters

The time series parameters refer to the coefficients in the time series models: the autoregressive coefficient $\phi$ in the AR(1) model, the moving average parameter $\theta$ in the MA(1) model, and $\phi$ and $\theta$ in the ARMA(1,1) model. In this study, only data following an AR(1), an MA(1) or an ARMA(1,1) process were simulated because (a) a common characteristic of longitudinal data
PAGE 61
is that a variable is affected most by its immediately preceding value and (b) assuming a lag-one process frees a substantial number of degrees of freedom. According to the range of values used in past simulation studies (Ferron et al., 2002; Hamaker, Dolan, & Molenaar, 2002; Sivo & Willson, 2000), the values of the AR and MA coefficients were set as follows:
1. When the within-person residual covariance structure followed an AR(1) process, the AR parameter $\phi$ was set to 0.8 and 0.5 to represent high and moderate autocorrelation respectively.
2. When the within-person residual covariance structure followed an MA(1) process, the MA parameter $\theta$ was set to 0.8 and 0.5 to represent high and moderate moving average parameters respectively.
3. When the within-person residual covariance structure followed an ARMA(1,1) process, $\phi$ and $\theta$ were set to 0.2 and 0.8, or to 0.5 and 0.45, respectively.

The values of $\phi$ and $\theta$ were chosen to be either quite different from one another or quite close to one another. The rationale for choosing these values is based on the following reasoning (McCleary & Hay, 1980): the ARMA model is an integration of the AR model and the MA model. If $\phi$ and $\theta$ are not equal but are close to each other, the ARMA(1,1) model reduces approximately to an MA(2) model when $\phi$ is not small and approximately to an MA(1) model when $\phi$ is small. Therefore, it is worthwhile to investigate the results with two different types of ARMA parameter values.

Time Coding

The simulation study assumed linear conditional LGM with equally spaced time intervals for the dependent variable in all models, and for the time-varying covariate in the LGM with a time-varying covariate and in the parallel process LGM. Therefore, the factor loadings for the level were all set equal to 1. The shape loadings were set to 0, 1, 2, 3, 4, and so forth, with the baseline occasion as the reference point.
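The design factors described so far can be summarized as a grid of conditions. The following Python sketch is illustrative only (the simulation itself was run in R) and crosses only the factors listed in this section; it does not cover the three analysis models or the population values. All names are hypothetical, and the sketch assumes that in each ARMA(1,1) pair the first value is the AR parameter and the second is the MA parameter.

```python
from itertools import product

N_WAVES = (4, 8)
SAMPLE_SIZES = (200, 500, 2000)

# (process, AR parameter phi, MA parameter theta); None marks an absent term.
# Assumption: the first value of each ARMA(1,1) pair is phi.
TIME_SERIES_SETTINGS = (
    ("AR(1)", 0.8, None),
    ("AR(1)", 0.5, None),
    ("MA(1)", None, 0.8),
    ("MA(1)", None, 0.5),
    ("ARMA(1,1)", 0.2, 0.8),
    ("ARMA(1,1)", 0.5, 0.45),
)


def loading_matrix(n_waves):
    """Linear growth loadings: level fixed at 1, shape coded 0, 1, 2, ..."""
    return [[1, t] for t in range(n_waves)]


# Fully cross the number of waves, the sample size and the time series setting.
conditions = [
    {"n_waves": waves, "n": n, "process": process, "phi": phi, "theta": theta}
    for waves, n, (process, phi, theta)
    in product(N_WAVES, SAMPLE_SIZES, TIME_SERIES_SETTINGS)
]
```

Crossing two numbers of waves, three sample sizes and six time series settings in this way gives 36 cells for each analysis model, before any further design factors are considered.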
Population Values

The population parameters were based on the analysis of data obtained from the Early Childhood Longitudinal Study, Kindergarten Cohort (ECLS-K) and on values used in other simulation research. This data set provides descriptive information about the status of children from kindergarten to 8th grade. It is the first large national study that followed a cohort of children from kindergarten to middle school. Information was collected in a total of seven measurement periods: the fall and the spring of kindergarten (1998-99), the fall and the spring of 1st grade (1999-2000), the spring of 3rd grade (2002), 5th grade (2004), and 8th grade (2007). Participants included children's teachers, schools and parents. Information was collected on a variety of factors such as children's cognitive, social, emotional, and physical development. Its longitudinal nature and multifaceted character enable researchers to conduct various studies based on this data set (e.g., Bodovski & Farkas, 2007; Hong & Raudenbush, 2006; Kaplan, 2005).

The population parameters in this study were obtained by analyzing the following five waves of data: the fall and the spring of kindergarten (1998-99), the spring of 1st grade (2000), the spring of 3rd grade (2002) and the spring of 5th grade (2004). The outcome variable was children's math performance in each of the five periods. The time invariant covariate was children's SES measured at kindergarten, and the time varying covariate was children's reading score measured at the same time as the math score. Because the number of measurement occasions in this study could be as large as eight, the additional population parameters were extrapolated from the parameters obtained from the five waves of ECLS-K data. Based on the analysis of the ECLS-K data, the population parameters were defined as follows:

Within-Person Residual Variance ($\sigma^2_{\varepsilon}$)

To keep the design simple, the within-person residual variance for the outcome variable measured at different periods was specified to be equal, with a constant value of 50.
Mean of the Level and Mean of the Shape in the Between-Person Equation

These two parameters refer to the mean of the level and the mean of the shape, controlling for all other terms in the between-person equation. Based on the parameter estimates obtained from ECLS-K, the mean of the level was set to 5 and the mean of the shape was set to 4.

Residual Variance of Level Equation, Residual Variance of Shape Equation, and Covariance of Level and Shape Residuals

The disturbances of the level and the shape have means of zero, each with its own residual variance, and they covary with each other. Based on the analysis of ECLS-K data, the residual variance of the level was set to 80, the residual variance of the shape was set to 60, and the covariance of the level and shape residuals was set to 35.

Mean and Variance of Time Invariant Covariate

According to the analysis of ECLS-K data, the time invariant covariate was generated to follow a normal distribution with a mean of 50 and a standard deviation of 10.

Parameters of Time Varying Covariate

The time varying covariate was generated in the same way as the outcome variable and was assumed to be measured at the same time as the outcome variable. The mean level and the mean shape were specified to be 30 and 20, respectively. The variance of the level, the variance of the shape, and the covariance between the level and the shape of the time varying covariate were 85, 30 and 23, respectively. The trajectory equation residual was assumed to be normally distributed with a mean of zero and a constant variance of 70.

Effect of Time Invariant Predictor on Latent Level and Latent Shape in Growth Predictor Model (i.e., the covariate coefficients in Equation 2-18)

The regression coefficients for the time invariant predictor on the latent level and on the latent shape were both set to 0.5.
Effect of the Time Varying Predictor Variable on the Outcome Variable in LGM with a Time Varying Covariate (i.e., the covariate coefficient in Equation 2-27)

To simplify the design, all the regression coefficients between the outcome variable and the time varying covariate were set equal to each other. The regression coefficient was set to 0.4.

Effect of the Intercept and Slope of the Predictor on the Intercept and Slope of the Outcome Variable in LGM with a Parallel Process

The effect of the latent intercept of the predictor on the latent intercept of the outcome variable (see Equation 2-33) was set to 0.6. The effect of the latent intercept of the predictor on the latent slope of the outcome variable and the effect of the latent slope of the predictor on the latent slope of the outcome variable (see Equation 2-33) were set to 0.5 and 0.6, respectively.

Summary of Population Values

The mean of the level and the mean of the shape in the between-person equation were 5 and 4, respectively. The residual variance of the level equation, the residual variance of the shape equation, and the covariance of the level and shape residuals were 80, 60 and 35, respectively. The within-person residual variance for the outcome variable was 50. These values were the same for all three LGMs. Other population values are presented as follows:

LGM with a Time Invariant Covariate

The time invariant covariate was normally distributed with a mean of 50 and a standard deviation of 10. The effects of the time invariant covariate on the level and the shape of the outcome variable were both equal to 0.5.
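The population values for the LGM with a time invariant covariate can be assembled into a small data-generating sketch. The code below is only an illustration, not the study's R code: it assumes shape loadings of 0, 1, 2, ... for the chosen number of waves and, for simplicity, independent within-person residuals (the time series structures are not included here). All function and constant names are hypothetical.

```python
import math
import random

# Population values taken from the summary above.
MEAN_LEVEL, MEAN_SHAPE = 5.0, 4.0
EFFECT_ON_LEVEL = EFFECT_ON_SHAPE = 0.5
VAR_LEVEL, VAR_SHAPE, COV_LEVEL_SHAPE = 80.0, 60.0, 35.0
RESIDUAL_VAR = 50.0
COVARIATE_MEAN, COVARIATE_SD = 50.0, 10.0


def expected_trajectory(x, n_waves):
    """Model-implied mean trajectory for covariate value x (loadings 0, 1, 2, ...)."""
    level = MEAN_LEVEL + EFFECT_ON_LEVEL * x
    shape = MEAN_SHAPE + EFFECT_ON_SHAPE * x
    return [level + t * shape for t in range(n_waves)]


def simulate_person(rng, n_waves):
    """Draw one person's covariate and outcome scores (independent residuals assumed)."""
    x = rng.gauss(COVARIATE_MEAN, COVARIATE_SD)
    # Cholesky factor of the 2x2 covariance matrix of the level and shape disturbances.
    l11 = math.sqrt(VAR_LEVEL)
    l21 = COV_LEVEL_SHAPE / l11
    l22 = math.sqrt(VAR_SHAPE - l21**2)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    level = MEAN_LEVEL + EFFECT_ON_LEVEL * x + l11 * z1
    shape = MEAN_SHAPE + EFFECT_ON_SHAPE * x + l21 * z1 + l22 * z2
    residual_sd = math.sqrt(RESIDUAL_VAR)
    scores = [level + t * shape + rng.gauss(0, residual_sd) for t in range(n_waves)]
    return x, scores


rng = random.Random(2009)  # fixed seed so the sketch is reproducible
sample = [simulate_person(rng, n_waves=4) for _ in range(200)]
```

Here `expected_trajectory` gives the mean growth curve at a given covariate value, while `simulate_person` adds the correlated level and shape disturbances and the occasion-specific residuals.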
LGM with a Time Varying Covariate

The effect of the time varying covariate on the outcome variable was 0.4. The mean level and the mean shape of the time varying covariate were specified to be 30 and 20, respectively. The variance of the level, the variance of the shape, and the covariance between the level and the shape of the time varying covariate were set to 85, 30 and 23, respectively. The trajectory equation residual was normally distributed with a mean of zero and a constant variance of 70.

LGM with a Parallel Process

The predictor variable (i.e., the time varying predictor) was set in the same way as in the LGM with a time varying covariate. That is, the mean level was 30 and the mean shape was 20; the variance of the level was 85 and the variance of the shape was 30; the covariance between the level and the shape was 23; and the within-person residual for the predictor variable was normally distributed with a mean of zero and a constant variance of 70. The effect of the intercept of the predictor on the intercept of the outcome variable was 0.6, while its effect on the slope of the outcome variable was 0.5. The effect of the slope of the predictor variable on the slope of the outcome variable was 0.6.

Summary of Conditions

The conditions included three different LGMs, three types of within-person covariance structures, three different sample sizes, and two different numbers of measurement periods. Within each type of residual covariance structure there were two different sets of time series parameter values. These factors were fully crossed, resulting in a total of 108 (3x3x3x2x2) conditions. The time series parameters, the sample sizes and the measurement periods were design factors. The values of the three design factors are summarized as follows:

1. the sample size (200, 500 and 2000);
2. the number of waves (4 and 8);
3. the time series parameters: (a) the AR parameter (0.8 and 0.5), (b) the MA parameter (0.8 and 0.5), and (c) the ARMA parameters $\phi$ and $\theta$ (0.2 and 0.8, and 0.5 and 0.45).

Data Generation

The data were generated with R version 2.7.1 (R Development Core Team, 2008). The matrix equation of each LGM was used in the data simulation with the population values filled in. The data sets were generated according to the different model types and the different within-person residual covariance structures. Under a given kind of LGM, there were three different generating models, which had the same matrix format but differed in the within-person residual covariance structure.

When the residuals followed an AR(1) process, that is,

\[ \varepsilon_{it} = \phi\,\varepsilon_{i,t-1} + u_{it}, \tag{3-1} \]

where $\varepsilon_{it}$ and $\varepsilon_{i,t-1}$ are the within-person residuals at time $t$ and $t-1$, respectively, with zero mean and constant variance $\sigma^2_{\varepsilon}$, $\phi$ is the autocorrelation coefficient, and $u_{it}$ is the residual at a given time $t$ with $E(u_{it}) = 0$ and $\mathrm{var}(u_{it}) = \sigma^2_u$, the within-person residual covariance matrix is

\[ \Sigma_{\varepsilon} = \sigma^2_{\varepsilon}
\begin{bmatrix}
1 & \phi & \phi^2 & \cdots & \phi^{t-1} \\
\phi & 1 & \phi & \cdots & \phi^{t-2} \\
\phi^2 & \phi & 1 & \cdots & \phi^{t-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\phi^{t-1} & \phi^{t-2} & \phi^{t-3} & \cdots & 1
\end{bmatrix}, \tag{3-2} \]

where $\sigma^2_{\varepsilon}$ was equal to 50 as described before. The parameter $\phi$ was set equal to 0.8 or 0.5.

When the residuals followed an MA(1) process, that is,

\[ \varepsilon_{it} = u_{it} - \theta\,u_{i,t-1}, \tag{3-3} \]
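where $\varepsilon_{it}$ is still the within-person residual at time $t$, $\theta$ denotes the moving average parameter, and $\{u_{it}\}$ is a zero-mean series with constant variance $\sigma^2_u$, the within-person residual covariance matrix is

\[ \Sigma_{\varepsilon} = \sigma^2_{\varepsilon}
\begin{bmatrix}
1 & \frac{-\theta}{1+\theta^2} & 0 & \cdots & 0 \\
\frac{-\theta}{1+\theta^2} & 1 & \frac{-\theta}{1+\theta^2} & \cdots & 0 \\
0 & \frac{-\theta}{1+\theta^2} & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \frac{-\theta}{1+\theta^2} \\
0 & 0 & 0 & \frac{-\theta}{1+\theta^2} & 1
\end{bmatrix}. \tag{3-4} \]

The variance $\sigma^2_{\varepsilon}$ was set to 50 and $\theta$ was set equal to 0.8 or 0.5.

When the residuals followed an ARMA(1,1) process,

\[ \varepsilon_{it} = \phi\,\varepsilon_{i,t-1} + u_{it} - \theta\,u_{i,t-1}, \tag{3-5} \]

all the terms are defined as in the AR(1) and MA(1) models. The accompanying within-person covariance structure is

\[ \Sigma_{\varepsilon} = \sigma^2_{\varepsilon}
\begin{bmatrix}
1 & \rho_1 & \phi\rho_1 & \cdots & \phi^{t-2}\rho_1 \\
\rho_1 & 1 & \rho_1 & \cdots & \phi^{t-3}\rho_1 \\
\phi\rho_1 & \rho_1 & 1 & \cdots & \phi^{t-4}\rho_1 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\phi^{t-2}\rho_1 & \phi^{t-3}\rho_1 & \phi^{t-4}\rho_1 & \cdots & 1
\end{bmatrix},
\qquad
\rho_1 = \frac{(\phi-\theta)(1-\phi\theta)}{1+\theta^2-2\phi\theta}. \tag{3-6} \]

The within-person residual variance $\sigma^2_{\varepsilon}$ was 50. The ARMA parameters $\phi$ and $\theta$ were set to 0.2 and 0.8, or to 0.5 and 0.45.

The data sets were generated according to the three LGMs and the three different within-person residual structures. A total of 9 (3x3) generating models with different values of design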
factors were formulated. Under each condition, a data set was generated and analyzed with an incorrect analysis model. Then another data set under the same condition was generated and analyzed with a correct analysis model. The incorrect analysis model failed to consider the time series process in the within-person residual structure. The within-person residual covariance matrix under the incorrect analysis model was a diagonal matrix with non-constant variance. The within-person residual covariance matrix under the correct analysis model was the same as in the generating model above. In Mplus 5.2, the default covariance structure assumes uncorrelated errors. The time series process in the residual structure was modeled using the constraint command.

A total of 5000 replications were simulated for each of the 108 conditions. As two data sets were generated under the same condition for the two analysis models, a total of 1,080,000 (5000x108x2) data sets were generated. The data were simulated in R and saved to disk. The Mplus software was then used to fit the models to the generated data. Under each condition, not all of the 5000 replications converged. Therefore, the non-convergence rate was calculated, and extra data sets were simulated and analyzed until 5000 converged results were obtained. Mplus output all parameter estimates, the accompanying standard error estimates, fit indexes, and warning messages. These outputs were saved for later analysis.

Data Analysis

There are several criteria for evaluating the performance of latent growth models when the residual covariance structure is misspecified. In this simulation study, a high convergence rate was expected for estimating the LGMs. However, it is very likely that estimation will not converge for all replications of all conditions. Therefore, the convergence rate was calculated for each condition.
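The three within-person residual covariance structures used in the generating models can be sketched as small matrix builders. The code below is an illustration rather than the study's R code. It assumes that the value 50 is the marginal residual variance on the diagonal, and that the MA term enters with a minus sign, so the lag-1 correlations are -theta/(1+theta^2) for MA(1) and (phi-theta)(1-phi*theta)/(1+theta^2-2*phi*theta) for ARMA(1,1). The function names are hypothetical.

```python
def toeplitz(first_row):
    """Symmetric Toeplitz matrix whose (i, j) entry depends only on |i - j|."""
    n = len(first_row)
    return [[first_row[abs(i - j)] for j in range(n)] for i in range(n)]


def ar1_cov(n_waves, phi, variance=50.0):
    """AR(1): the lag-k correlation is phi**k."""
    return toeplitz([variance * phi**k for k in range(n_waves)])


def ma1_cov(n_waves, theta, variance=50.0):
    """MA(1): only lag 1 is nonzero (assumed sign convention: u_t - theta*u_(t-1))."""
    rho1 = -theta / (1 + theta**2)
    row = [variance, variance * rho1] + [0.0] * (n_waves - 2)
    return toeplitz(row[:n_waves])


def arma11_cov(n_waves, phi, theta, variance=50.0):
    """ARMA(1,1): the lag-k correlation is phi**(k-1) * rho1 for k >= 1."""
    rho1 = (1 - phi * theta) * (phi - theta) / (1 + theta**2 - 2 * phi * theta)
    row = [variance] + [variance * rho1 * phi ** (k - 1) for k in range(1, n_waves)]
    return toeplitz(row)


# Example: the four-wave AR(1) structure with an autocorrelation of 0.5.
for row in ar1_cov(4, 0.5):
    print(row)
```

A useful check on the sketch is that `arma11_cov` reduces to `ar1_cov` when theta is 0 and to `ma1_cov` when phi is 0, which mirrors the description of the ARMA model as an integration of the AR and MA models.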
Moreover, Mplus provides a warning message regarding the occurrence of a non-positive definite latent variable covariance matrix. The occurrence of a non-positive definite matrix indicates an improper solution, such as a negative variance for a latent variable or a correlation greater than or equal to one between two latent variables. The percentage of the 5000 replications in which a non-positive definite matrix occurred was reported for each condition.

To evaluate the performance of the models with an unmodeled time series process, relative bias was calculated for both the parameter estimates and the standard error estimates. The relative parameter bias was calculated by using the following formula (Hoogland & Boomsma, 1998):

\[ B(\hat{\theta}_{ij}) = \frac{\mathrm{Mean}(\hat{\theta}_{ij}) - \theta}{\theta}, \tag{3-7} \]

where $\hat{\theta}_{ij}$ is the parameter estimate obtained for replication $i$ of condition $j$, $\mathrm{Mean}(\hat{\theta}_{ij})$ is the mean of the estimates $\hat{\theta}_{ij}$ under condition $j$, and $\theta$ is the population parameter.

To evaluate the standard errors, the estimated values were compared with empirical standard errors, which were obtained by computing the standard deviation of the parameter estimates from all the simulated data sets in a condition. The relative standard error bias was calculated using the following equation:

\[ B(SE_{\hat{\theta}_{ij}}) = \frac{\overline{SE}_{\hat{\theta}_{ij}} - SD_{\hat{\theta}_{ij}}}{SD_{\hat{\theta}_{ij}}}, \tag{3-8} \]

where $\overline{SE}_{\hat{\theta}_{ij}}$ is the average estimated standard error for $\hat{\theta}_{ij}$ across all 5000 replications under condition $j$, and $SD_{\hat{\theta}_{ij}}$ is the empirical standard error, calculated as the standard deviation of the 5000 estimates of $\theta$ under condition $j$. According to Hoogland and Boomsma (1998), the
acceptable cutoff values for the relative parameter bias and the relative standard error bias are 0.05 and 0.1, respectively. Values beyond this range would be considered unacceptable.

Results for the chi-square goodness of fit (GOF) test were also reported. The percentage of p values below 0.05 under each selected condition was reported. In SEM, a p value equal to or greater than 0.05 indicates adequate model fit. It is expected that the p value is sensitive to model misspecification. That is, with the correct analysis model the p value should be at least 0.05, and with the incorrect analysis model the p value should be less than 0.05.

There exist many fit indexes in SEM, which are important criteria in evaluating whether the model fits the data adequately. In this study, four commonly used fit indexes were selected for evaluation: the comparative fit index (CFI), the Tucker-Lewis index (TLI), the standardized root mean square residual (SRMR) and the root mean square error of approximation (RMSEA). The criteria that suggest adequate model fit for the four fit indexes are as follows (Hu & Bentler, 1999): CFI greater than 0.95, TLI greater than 0.95, SRMR less than 0.08, and RMSEA less than 0.06. In the results section, the percentage of replications that met each of the four criteria is presented. That is, for each criterion and condition, the percentage of replications in which the criterion was met was calculated. It should be mentioned that the criteria used to suggest adequate model fit are not unique and remain controversial in the literature (Marsh, Hau, & Grayson, 2005). The criteria used in this study were chosen simply to allow the examination of the effect of model misspecification. Other model selection criteria such as Akaike's Information Criterion (AIC) or Schwarz's Bayesian Criterion (SBC) were not compared here.
It has been shown in previous literature that these criteria do not always lead to the correct selection of the covariance structure (e.g., Keselman, Algina, Kowalchuk, & Wolfinger, 1998; Ferron et al., 2002).
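The two relative bias measures and their cutoffs described above can be sketched in a few lines. This is only an illustration of the calculations, not the analysis code used in the study. It assumes the empirical standard error is the sample standard deviation (n - 1 in the denominator) of the estimates within a condition, and the function names are hypothetical.

```python
from statistics import mean, stdev

PARAMETER_BIAS_CUTOFF = 0.05  # Hoogland & Boomsma (1998)
SE_BIAS_CUTOFF = 0.10         # Hoogland & Boomsma (1998)


def relative_parameter_bias(estimates, population_value):
    """(mean of the estimates - population value) / population value."""
    return (mean(estimates) - population_value) / population_value


def relative_se_bias(standard_errors, estimates):
    """(mean estimated SE - empirical SE) / empirical SE.

    Assumption: the empirical SE is the sample SD of the estimates.
    """
    empirical_se = stdev(estimates)
    return (mean(standard_errors) - empirical_se) / empirical_se


def is_acceptable(bias, cutoff):
    """A bias is acceptable when its absolute value does not exceed the cutoff."""
    return abs(bias) <= cutoff


# Toy numbers for illustration only; they are not results from the study.
toy_estimates = [4.0, 6.0, 5.3]
toy_bias = relative_parameter_bias(toy_estimates, population_value=5.0)
print(round(toy_bias, 3), is_acceptable(toy_bias, PARAMETER_BIAS_CUTOFF))
```

In the study each condition contributes 5000 estimates and 5000 standard errors per parameter, so the same two functions would be applied once per parameter and condition.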
CHAPTER 4
RESULTS

This chapter is composed of six sections. The first section presents the convergence rate and the occurrence rate of non-positive definite matrices. The second through the sixth sections report results for fixed parameter estimates, standard errors of fixed parameter estimates, variance component estimates, standard errors of variance component estimates, and the chi-square GOF test and GOF indexes, in this order. In the two sections for fixed parameter estimates and standard error estimates of fixed parameters, results are presented for the latent growth model (LGM) with a time invariant covariate first, the LGM with a time varying covariate second, and the LGM with a parallel process third. From section four (variance component estimates) to section six (GOF test and GOF indexes), results are presented according to the different within-person covariance structures, with the AR(1) error structure first, the MA(1) error structure second, and the ARMA(1,1) error structure last. Except for section one, a summary is presented at the end of each section.

Tables of relative biases are displayed according to combinations of conditions that show differences in the acceptability of the relative biases. The combinations of conditions are based on the following factors: sample size, time series parameters, analysis model type and number of waves. The four factors were fully crossed, resulting in a total of 24 conditions (3x2x2x2). If there was any unacceptable bias among the 24 conditions, all the mean relative biases under the 24 conditions were reported. If the relative biases were all acceptable, the marginal mean relative biases aggregated by analysis model type were reported.
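The reporting rule for the bias tables can be expressed as a short sketch. The code below enumerates the 24 fully crossed conditions and applies the rule described above: report every condition if any bias is unacceptable, otherwise report the marginal means by analysis model type. The time series parameter values shown are those of one structure (AR(1)) as an example, and the function and variable names are hypothetical.

```python
from itertools import product
from statistics import mean

SIZES = (200, 500, 2000)
PARAMETERS = (0.5, 0.8)             # example: the two AR(1) parameter values
MODELS = ("incorrect", "correct")   # analysis model type
WAVES = (4, 8)

# 3 x 2 x 2 x 2 = 24 fully crossed conditions.
CONDITIONS = list(product(SIZES, PARAMETERS, MODELS, WAVES))


def biases_to_report(bias_by_condition, cutoff):
    """Return all 24 biases if any is unacceptable, else marginal means by model type."""
    if any(abs(b) > cutoff for b in bias_by_condition.values()):
        return dict(bias_by_condition)
    return {
        model: mean(b for (_, _, m, _), b in bias_by_condition.items() if m == model)
        for model in MODELS
    }


# Toy input for illustration only: every condition has the same small bias.
toy_biases = {condition: 0.01 for condition in CONDITIONS}
print(biases_to_report(toy_biases, cutoff=0.05))
```

With the toy input every bias is within the cutoff, so only the two marginal means are returned; changing a single value to exceed the cutoff makes the function return all 24 entries.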
Convergence Rate and Non-Positive Definite Covariance Matrix Occurrence Rate

The convergence rate for all misspecified analysis models was 100% for all LGMs (see Table 4-1). The convergence rate for the correct analysis model depended on the within-person covariance structure and the four factors. With the correct analysis model, the convergence rate was highest with an MA(1) error structure, lowest with an ARMA(1,1) error structure, and in between with an AR(1) error structure. This is the expected result, as the ARMA(1,1) covariance structure was the most complex and the MA(1) error structure was the least complex among the three. The correct analysis models with an MA(1) error structure led to a convergence rate of 99% or more. For the correct analysis model with an AR(1) structure, the convergence rate ranged from 74% to 87% under conditions in which the number of waves was four, the AR parameter was 0.8 and the sample size was 200 or 500; the convergence rate under all other conditions was at least 97%. With an AR(1) error structure, more measurement periods, a larger sample size, or a smaller AR parameter resulted in more converged solutions, holding other conditions constant. The analysis model with an ARMA(1,1) error structure had a lower convergence rate than the models with an AR(1) or an MA(1) error structure, especially under the conditions in which the ARMA parameter values were 0.5 and 0.45, where the convergence rate ranged from 52% to 78%. With an ARMA(1,1) error structure, a larger sample size did not necessarily lead to a higher convergence rate, but more measurement periods did, with other conditions fixed. The convergence rate did not differ much across the three kinds of LGMs under the same condition.

Table 4-1. Convergence rate for all conditions

                                             Wave = 4              Wave = 8
Model      Structure    Size  Parameter   LGM 1 LGM 2 LGM 3    LGM 1 LGM 2 LGM 3
Incorrect  AR(1)         200  0.5          100%  100%  100%     100%  100%  100%
                         500  0.5          100%  100%  100%     100%  100%  100%
                        2000  0.5          100%  100%  100%     100%  100%  100%
                         200  0.8          100%  100%  100%     100%  100%  100%
                         500  0.8          100%  100%  100%     100%  100%  100%
                        2000  0.8          100%  100%  100%     100%  100%  100%
           MA(1)         200  0.5          100%  100%  100%     100%  100%  100%
                         500  0.5          100%  100%  100%     100%  100%  100%
                        2000  0.5          100%  100%  100%     100%  100%  100%
                         200  0.8          100%  100%  100%     100%  100%  100%
                         500  0.8          100%  100%  100%     100%  100%  100%
                        2000  0.8          100%  100%  100%     100%  100%  100%
           ARMA(1,1)     200  0.2, 0.8     100%  100%  100%     100%  100%  100%
                         500  0.2, 0.8     100%  100%  100%     100%  100%  100%
                        2000  0.2, 0.8     100%  100%  100%     100%  100%  100%
                         200  0.5, 0.45    100%  100%  100%     100%  100%  100%
                         500  0.5, 0.45    100%  100%  100%     100%  100%  100%
                        2000  0.5, 0.45    100%  100%  100%     100%  100%  100%
Correct    AR(1)         200  0.5           99%   98%   98%     100%  100%  100%
                         500  0.5          100%  100%  100%     100%  100%  100%
                        2000  0.5          100%  100%  100%     100%  100%  100%
                         200  0.8           76%   76%   74%      97%   97%   97%
                         500  0.8           85%   87%   84%     100%  100%  100%
                        2000  0.8           98%   98%   98%     100%  100%  100%
           MA(1)         200  0.5           99%  100%  100%     100%  100%  100%
                         500  0.5           99%  100%  100%     100%  100%  100%
                        2000  0.5          100%  100%  100%     100%  100%  100%
                         200  0.8           99%  100%  100%     100%  100%  100%
                         500  0.8          100%  100%  100%     100%  100%  100%
                        2000  0.8          100%  100%  100%     100%  100%  100%
           ARMA(1,1)     200  0.2, 0.8      85%   88%   85%      97%   97%   95%
                         500  0.2, 0.8      90%   90%   87%      98%   97%   95%
                        2000  0.2, 0.8      95%   91%   86%      97%   97%   94%
                         200  0.5, 0.45     59%   63%   65%      67%   64%   73%
                         500  0.5, 0.45     58%   60%   62%      66%   67%   72%
                        2000  0.5, 0.45     56%   56%   52%      72%   78%   74%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively. Parameter refers to the time series parameter.

The occurrence rate of non-positive definite covariance matrices in each condition is presented in Table 4-2. Provided that the number of measurement waves was eight, failing to include the time series process in the analysis model did not result in any occurrences of non-positive definite matrices. When the number of waves was four and the time series process was
not included in the analysis model, occurrence rates depended on the model used to generate the data. When the generating model was an AR(1), the occurrence rate was zero except when the sample size was 200 and the parameter value was 0.8. Even then the occurrence rate did not exceed 2%. When the generating model was an MA(1), the occurrence rate was at least 20% when the parameter value was 0.8 but 24% or less when the parameter value was 0.5. In both cases the occurrence of non-positive definite matrices decreased as the sample size increased. When the generating model was an ARMA(1,1), there were no non-positive definite matrices when the parameter values $\phi$ and $\theta$ were 0.5 and 0.45, respectively; when the parameter values were 0.2 and 0.8, occurrence rates were less than 19% and declined as the sample size increased.

When the analysis model was correct, occurrence rates again depended on the time series process. With the MA(1) model the occurrence rate was zero except in one condition: parallel process model, sample size of 200 and a parameter value of 0.8. Even then the occurrence rate was only 1%. For the AR(1) model the occurrence rates were less than 18%, were smaller when the number of waves was eight and the parameter value was 0.5, and tended to decline as the sample size increased. For the ARMA(1,1) model occurrence rates were less than 20% and were smaller when the number of waves was larger, the parameter values $\phi$ and $\theta$ were 0.2 and 0.8, and the sample size was larger. Similar to the results for the convergence rate, the occurrence rate did not differ much across the three LGMs under the same condition.

Table 4-2. Rate of occurrence of non-positive definite matrix under all conditions

                                             Wave = 4              Wave = 8
Model      Structure    Size  Parameter   LGM 1 LGM 2 LGM 3    LGM 1 LGM 2 LGM 3
Incorrect  AR(1)         200  0.5            0%    0%    0%       0%    0%    0%
                         500  0.5            0%    0%    0%       0%    0%    0%
                        2000  0.5            0%    0%    0%       0%    0%    0%
                         200  0.8            1%    1%    2%       0%    0%    0%
                         500  0.8            0%    0%    0%       0%    0%    0%
                        2000  0.8            0%    0%    0%       0%    0%    0%
           MA(1)         200  0.5           20%   24%   22%       0%    0%    0%
                         500  0.5            8%    8%    9%       0%    0%    0%
                        2000  0.5            0%    0%    0%       0%    0%    0%
                         200  0.8           43%   46%   44%       0%    0%    0%
                         500  0.8           36%   37%   35%       0%    0%    0%
                        2000  0.8           21%   21%   20%       0%    0%    0%
           ARMA(1,1)     200  0.2, 0.8      15%   18%   16%       0%    0%    0%
                         500  0.2, 0.8       5%    5%    5%       0%    0%    0%
                        2000  0.2, 0.8       0%    0%    0%       0%    0%    0%
                         200  0.5, 0.45      0%    0%    0%       0%    0%    0%
                         500  0.5, 0.45      0%    0%    0%       0%    0%    0%
                        2000  0.5, 0.45      0%    0%    0%       0%    0%    0%
Correct    AR(1)         200  0.5           13%   14%   14%       0%    0%    0%
                         500  0.5            5%    4%    5%       0%    0%    0%
                        2000  0.5            0%    0%    0%       0%    0%    0%
                         200  0.8           16%   15%   15%      10%   11%   10%
                         500  0.8           16%   17%   17%       3%    3%    3%
                        2000  0.8           11%   12%   11%       0%    0%    0%
           MA(1)         200  0.5            0%    0%    0%       0%    0%    0%
                         500  0.5            0%    0%    0%       0%    0%    0%
                        2000  0.5            0%    0%    0%       0%    0%    0%
                         200  0.8            0%    1%    0%       0%    0%    0%
                         500  0.8            0%    0%    0%       0%    0%    0%
                        2000  0.8            0%    0%    0%       0%    0%    0%
           ARMA(1,1)     200  0.2, 0.8       5%    6%    7%       1%    0%    1%
                         500  0.2, 0.8       1%    0%    1%       0%    0%    0%
                        2000  0.2, 0.8       0%    0%    0%       0%    0%    0%
                         200  0.5, 0.45     15%   13%   20%       6%   12%   14%
                         500  0.5, 0.45      9%    8%   14%       3%    7%    9%
                        2000  0.5, 0.45      5%    6%   10%       1%    2%    3%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively. Parameter refers to the time series parameter.
Fixed Parameter Estimates

LGM with a Time Invariant Covariate

The fixed parameters in the LGM with a time invariant covariate refer to the mean of the level and the mean of the shape in the between-person equation, and the direct effects of the time invariant predictor on the level and the shape factors (see Equation 2-19).

With each of the three kinds of residual covariance structures, the relative biases under all 24 conditions were acceptable. Therefore, only the marginal mean relative biases aggregated by analysis model type are reported. Results in Table 4-3 indicate that, regardless of the residual covariance structure, all the relative biases were trivial, ranging from -0.006 to 0.003. The relative biases for the two covariate effects were almost zero, indicating that estimates of these two parameters were quite close to their respective population values.

Table 4-3. Marginal mean relative biases of fixed parameter estimates for LGM with a time invariant covariate

                         Mean of   Mean of   Effect on   Effect on
Structure   Model        level     shape     level       shape
AR(1)       Incorrect    0.004     0.003     0.000       0.000
            Correct      0.006     0.001     0.000       0.000
MA(1)       Incorrect    0.003     0.000     0.000       0.000
            Correct      0.000     0.001     0.000       0.000
ARMA(1,1)   Incorrect    0.003     0.000     0.000       0.000
            Correct      0.002     0.003     0.000       0.000

LGM with a Time Varying Covariate

The fixed parameters in the LGM with a time varying covariate refer to the mean of the level and the mean of the shape in the between-person equation, and the direct effect of the time varying predictor on the outcome variables (see Equation 2-28).
With each of the three kinds of residual covariance structures, the relative biases under each of the 24 conditions were acceptable. All the marginal mean relative biases were trivial, ranging from 0.001 to 0 (see Table 4-4). Most marginal mean relative biases were zero, indicating that the parameter estimates were quite close to their population values.

Table 4-4. Mean relative biases of fixed parameter estimates for LGM with a time varying covariate

                         Mean of   Mean of   Covariate
Structure   Model        level     shape     effect
AR(1)       Incorrect    0.000     0.000     0.000
            Correct      0.001     0.000     0.000
MA(1)       Incorrect    0.000     0.000     0.000
            Correct      0.001     0.000     0.000
ARMA(1,1)   Incorrect    0.001     0.000     0.000
            Correct      0.000     0.000     0.000

LGM with a Parallel Process

The fixed parameters in the LGM with a parallel process are the mean of the level and the mean of the shape, controlling for all other terms in the between-person equation, and the direct effects of the level and the shape of the time varying predictor on the level and the shape of the outcome variable (i.e., the three regression coefficients in Equation 2-33).

The relative bias under each of the 24 conditions was less than 0.05, and therefore only the marginal mean relative biases are reported. Results in Table 4-5 indicate that all the relative biases were acceptable. The absolute marginal mean relative biases of the mean of the level and the mean of the shape were larger than those of the three regression coefficients. Seventy-five percent of the absolute marginal mean relative biases for the mean of the level or the mean of the shape were greater than 0.01, while the marginal mean relative biases for the three regression coefficients were trivial, ranging from 0.003 to 0.009.
Table 4-5. Mean relative biases of fixed parameter estimates for LGM with a parallel process

                         Mean of   Mean of   Level on   Level on   Shape on
Structure   Model        level     shape     level      shape      shape
AR(1)       Incorrect    0.011     0.009     0.003      0.005      0.003
            Correct      0.014     0.012     0.004      0.005      0.003
MA(1)       Incorrect    0.012     0.011     0.003      0.004      0.002
            Correct      0.008     0.005     0.002      0.002      0.001
ARMA(1,1)   Incorrect    0.011     0.013     0.003      0.005      0.002
            Correct      0.018     0.015     0.009      0.008      0.000

Note: The last three columns are the effects of the predictor's level or shape on the outcome's level or shape.

Standard Error of the Fixed Parameter Estimates

LGM with a Time Invariant Covariate

The relative biases under each of the 24 conditions were less than 0.1 and therefore were all acceptable. Accordingly, only the marginal mean relative biases are reported. Under each of the three covariance structures, the marginal mean relative biases of the standard errors of the fixed parameter estimates were trivial, ranging from 0.012 to 0.003 (see Table 4-6). Only 2 out of the 24 mean relative biases had absolute values greater than 0.01, indicating that the standard error estimates of the fixed parameters were close to their respective empirical standard errors.

Table 4-6. Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a time invariant covariate

                         Mean of   Mean of   Effect on   Effect on
Structure   Model        level     shape     level       shape
AR(1)       Incorrect    0.001     0.011     0.000       0.012
            Correct      0.002     0.002     0.002       0.002
MA(1)       Incorrect    0.002     0.002     0.001       0.003
            Correct      0.004     0.003     0.004       0.003
ARMA(1,1)   Incorrect    0.003     0.004     0.003       0.004
            Correct      0.008     0.006     0.007       0.004

LGM with a Time Varying Covariate

With each of the three residual covariance structures, all the absolute relative biases of the standard error estimates of the fixed parameters were less than 0.1 and therefore were all acceptable. The range of the marginal mean relative biases was from 0.009 to 0.002, indicating
that the estimated standard errors were quite close to their respective empirical standard errors (see Table 4-7).

Table 4-7. Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a time varying covariate

                         Mean of   Mean of   Covariate
Structure   Model        level     shape     effect
AR(1)       Incorrect    0.006     0.009     0.002
            Correct      0.008     0.01      0.000
MA(1)       Incorrect    0.002     0.007     0.001
            Correct      0.001     0.000     0.001
ARMA(1,1)   Incorrect    0.003     0.006     0.001
            Correct      0.005     0.004     0.006

LGM with a Parallel Process

The relative biases under each of the 24 conditions were all acceptable, and therefore only the marginal mean relative biases are reported. Results in Table 4-8 indicate that the range of the marginal mean relative biases was from 0.018 to 0.003, with only 6 out of 30 (20%) absolute marginal mean biases larger than 0.01. Based on the 0.10 criterion for the relative bias of standard error estimates, all the marginal mean relative biases were trivial, indicating that the estimated standard errors were quite close to their respective empirical standard errors.

Table 4-8. Marginal mean relative biases of standard error estimates of fixed parameters for LGM with a parallel process

                         Mean of   Mean of   Level on   Level on   Shape on
Structure   Model        level     shape     level      shape      shape
AR(1)       Incorrect    0.007     0.003     0.007      0.002      0.000
            Correct      0.004     0.005     0.004      0.018      0.016
MA(1)       Incorrect    0.01      0.003     0.01       0.012      0.016
            Correct      0.002     0.000     0.004      0.009      0.009
ARMA(1,1)   Incorrect    0.007     0.003     0.007      0.002      0.000
            Correct      0.004     0.005     0.004      0.018      0.016

Note: The last three columns are the effects of the predictor's level or shape on the outcome's level or shape.
Summary of the Results for the Fixed Parameter Estimates Together with Standard Error Estimates

When the generating model included any of the three types of within-person residual covariance structures, the relative biases of the fixed parameter estimates and of their standard error estimates were acceptable under all conditions. None of the four factors had an impact on the acceptability of these biases. Many of the biases were trivial, indicating that even when the within-person covariance structure was misspecified, neither the estimation of the fixed parameters nor the tests of the fixed effects were affected.

Variance Component Parameter Estimates

The variance components refer to the residual variance of the latent level, the residual variance of the latent slope, and the covariance between the residual of the latent intercept and the residual of the latent slope. Beginning with the present section, the results are organized by the within-person residual covariance structure rather than by the LGM. Under each type of residual covariance structure, the results for the level residual variance are presented first, followed by the results for the slope residual variance, and the results for the covariance of the level and slope residuals are presented last.

AR(1) Within-Person Residual Covariance Matrix

When the generating model was an AR(1), not all the relative biases were acceptable; therefore, the relative biases under each of the 24 conditions are reported (see Table 4-9). When the analysis model failed to include the AR(1) time series process, all the relative biases of the level residual variance were inflated and none of the biases were acceptable. However, the analysis model that included the AR(1) process also resulted in some unacceptable negative biases. These unacceptable biases were observed under conditions in which the AR parameter value was 0.5, the number of waves was four and the sample size was 200 or 500, or in which the AR parameter was 0.8, excluding
the condition in which the number of waves was eight and the sample size was 2000. Holding other factors constant, the magnitudes of these unacceptable biases were smaller than those obtained with the incorrect analysis model.

Table 4-9. Mean relative biases of estimates for three LGMs with an AR(1) within-person residual covariance matrix
Model      AR   Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.5  4     200    0.379   0.390   0.370
           0.5  4     500    0.387   0.390   0.380
           0.5  4     2000   0.390   0.390   0.390
           0.5  8     200    0.353   0.330   0.340
           0.5  8     500    0.360   0.330   0.360
           0.5  8     2000   0.363   0.340   0.370
           0.8  4     200    0.550   0.550   0.540
           0.8  4     500    0.553   0.560   0.550
           0.8  4     2000   0.559   0.560   0.560
           0.8  8     200    0.610   0.570   0.610
           0.8  8     500    0.620   0.580   0.620
           0.8  8     2000   0.624   0.580   0.620
Correct    0.5  4     200    0.327   0.280   0.320
           0.5  4     500    0.088   0.070   0.090
           0.5  4     2000   0.015   0.010   0.020
           0.5  8     200    0.022   0.010   0.020
           0.5  8     500    0.009   0.010   0.010
           0.5  8     2000   0.001   0.000   0.000
           0.8  4     200    0.276   0.250   0.280
           0.8  4     500    0.374   0.380   0.380
           0.8  4     2000   0.232   0.230   0.240
           0.8  8     200    0.248   0.230   0.250
           0.8  8     500    0.081   0.070   0.080
           0.8  8     2000   0.013   0.010   0.020
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Holding other factors constant, a higher value of the AR parameter tended to result in larger biases than a lower value. The three LGMs did not differ much in the estimates
of in terms of the number of unacceptable mean relative biases and the magnitudes of these biases.

For the estimates of , results in Table 4-10 indicate that when the correct analysis model was used, all the mean relative biases of were acceptable. When the incorrect analysis model was used, only the biases observed with eight waves were acceptable; none of the biases obtained with four waves was acceptable, and these unacceptable biases were inflated. These unacceptable biases increased as the sample size increased. Similar to the results for , the biases of estimates of did not differ much across the three LGMs.

Table 4-10. Mean relative biases of estimates for three LGMs with an AR(1) within-person residual covariance matrix
Model      AR   Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.5  4     200    0.071   0.079   0.052
           0.5  4     500    0.081   0.083   0.075
           0.5  4     2000   0.083   0.085   0.083
           0.5  8     200    0.017   0.016   0.004
           0.5  8     500    0.020   0.019   0.015
           0.5  8     2000   0.023   0.020   0.021
           0.8  4     200    0.064   0.069   0.046
           0.8  4     500    0.069   0.072   0.065
           0.8  4     2000   0.074   0.073   0.072
           0.8  8     200    0.018   0.022   0.011
           0.8  8     500    0.024   0.021   0.020
           0.8  8     2000   0.027   0.024   0.026
Correct    0.5  4     200    0.02    0.009   0.039
           0.5  4     500    0.009   0.004   0.017
           0.5  4     2000   0.002   0.001   0.004
           0.5  8     200    0.013   0.003   0.019
           0.5  8     500    0.004   0.003   0.009
           0.5  8     2000   0.001   0.000   0.002
           0.8  4     200    0.001   0.003   0.021
           0.8  4     500    0.001   0.002   0.008
           0.8  4     2000   0.001   0.002   0.003
           0.8  8     200    0.011   0.004   0.018
Table 4-10. Continued.
Model      AR   Wave  Size   LGM 1    LGM 2    LGM 3
           0.8  8     500    -0.005   -0.002   -0.009
           0.8  8     2000   -0.002   0.000    -0.002
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Table 4-11. Mean relative biases of estimates for three LGMs with an AR(1) within-person residual covariance matrix
Model      AR   Wave  Size   LGM 1    LGM 2    LGM 3
Incorrect  0.5  4     200    -0.213   -0.207   -0.215
           0.5  4     500    -0.205   -0.207   -0.206
           0.5  4     2000   -0.204   -0.205   -0.204
           0.5  8     200    -0.148   -0.124   -0.154
           0.5  8     500    -0.146   -0.127   -0.147
           0.5  8     2000   -0.143   -0.125   -0.142
           0.8  4     200    -0.158   -0.158   -0.165
           0.8  4     500    -0.158   -0.157   -0.154
           0.8  4     2000   -0.152   -0.153   -0.155
           0.8  8     200    -0.176   -0.149   -0.177
           0.8  8     500    -0.175   -0.147   -0.176
           0.8  8     2000   -0.171   -0.146   -0.172
Correct    0.5  4     200    0.014    0.019    0.008
           0.5  4     500    0.002    0.004    0.002
           0.5  4     2000   0.002    0.003    0.001
           0.5  8     200    -0.006   -0.004   -0.007
           0.5  8     500    -0.002   0.000    -0.004
           0.5  8     2000   0.000    0.000    0.000
           0.8  4     200    -0.035   -0.035   -0.044
           0.8  4     500    -0.014   -0.013   -0.013
           0.8  4     2000   0.000    -0.001   -0.001
           0.8  8     200    -0.005   0.005    -0.010
           0.8  8     500    -0.001   0.001    0.000
           0.8  8     2000   -0.001   0.001    0.000
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Results in Table 4-11 indicate that under each of the three LGMs, the incorrect analysis model resulted in unacceptable and negative biases of the estimates of , while the correct
analysis model did not result in any unacceptable mean relative biases. The pattern and the magnitudes of the mean relative biases did not differ much across the three LGMs.

MA(1) Within-Person Residual Covariance Matrix
Results in Table 4-12 indicate that all the relative biases of estimates obtained with the incorrect analysis model were unacceptable and negative. The relative biases obtained with the correct analysis model were all acceptable, except those obtained under conditions in which the number of waves was four, the sample size was 200 and the MA parameter was 0.8. The biases obtained with the misspecified analysis model were larger than those obtained with the correct analysis model, and the estimates observed with four waves were more negatively biased than those observed with eight waves.

Table 4-12. Mean relative biases of estimates for three LGMs with an MA(1) within-person residual covariance matrix
Model      MA   Wave  Size   LGM 1    LGM 2    LGM 3
Incorrect  0.5  4     200    0.362    0.362    0.368
           0.5  4     500    0.357    0.357    0.360
           0.5  4     2000   0.353    0.353    0.353
           0.5  8     200    -0.199   -0.199   -0.200
           0.5  8     500    0.193    0.193    0.193
           0.5  8     2000   0.188    0.188    0.188
           0.8  4     200    0.448    0.448    0.449
           0.8  4     500    0.439    0.439    0.440
           0.8  4     2000   0.436    0.436    0.436
           0.8  8     200    0.237    0.237    0.24
           0.8  8     500    0.231    0.231    0.232
           0.8  8     2000   0.228    0.228    0.227
Correct    0.5  4     200    0.047    0.047    0.042
           0.5  4     500    0.02     0.02     0.019
           0.5  4     2000   0.004    0.004    0.004
           0.5  8     200    0.009    0.009    0.014
           0.5  8     500    0.002    0.002    0.006
           0.5  8     2000   0.001    0.001    0.003
           0.8  4     200    0.068    0.068    0.066
Table 4-12. Continued.
Model      MA   Wave  Size   LGM 1   LGM 2   LGM 3
           0.8  4     500    0.037   0.037   0.037
           0.8  4     2000   0.016   0.016   0.015
           0.8  8     200    0.016   0.016   0.018
           0.8  8     500    0.006   0.006   0.009
           0.8  8     2000   0.001   0.001   0.001
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

The relative biases obtained with the correct analysis model were all acceptable, while the biases of observed with the incorrect analysis model and four waves were unacceptable (see Table 4-13). These unacceptable biases were all negative.

Table 4-13. Mean relative biases of estimates for three LGMs with an MA(1) within-person residual covariance matrix
Model      MA   Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.5  4     200    0.093   0.093   0.109
           0.5  4     500    0.086   0.086   0.093
           0.5  4     2000   0.083   0.083   0.083
           0.5  8     200    0.019   0.019   0.030
           0.5  8     500    0.016   0.016   0.017
           0.5  8     2000   0.012   0.012   0.013
           0.8  4     200    0.111   0.111   0.131
           0.8  4     500    0.107   0.107   0.113
           0.8  4     2000   0.105   0.105   0.105
           0.8  8     200    0.021   0.021   0.032
           0.8  8     500    0.017   0.017   0.020
           0.8  8     2000   0.013   0.013   0.014
Correct    0.5  4     200    0.022   0.022   0.038
           0.5  4     500    0.011   0.011   0.016
           0.5  4     2000   0.002   0.002   0.004
           0.5  8     200    0.009   0.009   0.017
           0.5  8     500    0.003   0.003   0.007
           0.5  8     2000   0.001   0.001   0.003
           0.8  4     200    0.029   0.029   0.046
           0.8  4     500    0.015   0.015   0.019
           0.8  4     2000   0.005   0.005   0.006
           0.8  8     200    0.006   0.006   0.018
Table 4-13. Continued.
Model      MA   Wave  Size   LGM 1   LGM 2   LGM 3
           0.8  8     500    0.003   0.003   0.007
           0.8  8     2000   0.000   0.000   0.002
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

All the biases of obtained with the incorrect analysis model were unacceptable and positive, while all the biases of obtained with the correct analysis model were acceptable (see Table 4-14). With the incorrect analysis model, the estimates observed with four waves were more positively biased than those observed with eight waves, and the biases tended to increase as the sample size increased.

Table 4-14. Mean relative biases of estimates for three LGMs with an MA(1) within-person residual covariance matrix
Model      MA   Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.5  4     200    0.255   0.255   0.250
           0.5  4     500    0.261   0.261   0.260
           0.5  4     2000   0.263   0.263   0.262
           0.5  8     200    0.062   0.062   0.060
           0.5  8     500    0.067   0.067   0.069
           0.5  8     2000   0.071   0.071   0.071
           0.8  4     200    0.323   0.323   0.310
           0.8  4     500    0.329   0.329   0.320
           0.8  4     2000   0.329   0.329   0.328
           0.8  8     200    0.077   0.077   0.072
           0.8  8     500    0.084   0.084   0.079
           0.8  8     2000   0.085   0.085   0.086
Correct    0.5  4     200    0.021   0.021   0.008
           0.5  4     500    0.007   0.007   0.005
           0.5  4     2000   0.001   0.001   0.000
           0.5  8     200    0.006   0.006   0.01
           0.5  8     500    0.000   0.000   0.002
           0.5  8     2000   0.001   0.001   0.002
           0.8  4     200    0.038   0.038   0.032
           0.8  4     500    0.021   0.021   0.022
           0.8  4     2000   0.011   0.011   0.010
           0.8  8     200    0.004   0.004   0.009
Table 4-14. Continued.
Model      MA   Wave  Size   LGM 1   LGM 2   LGM 3
           0.8  8     500    0.002   0.002   0.005
           0.8  8     2000   0.001   0.001   0.001
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

ARMA(1, 1) Within-Person Residual Covariance Matrix
Results in Table 4-15 indicate that the magnitude of the relative biases of was affected by the ARMA parameter value, the number of waves and the analysis model type. For the LGM with a time invariant covariate, all the relative biases observed with an ARMA parameter of 0.2 and 0.8 were unacceptable regardless of the analysis model type. The biases observed with an ARMA parameter value of 0.5 and 0.45 and with the correct analysis model were unacceptable. For the LGM with a time-varying covariate, the pattern of the acceptability of the relative biases was similar to that for the LGM with a time invariant covariate, except that the relative biases observed with eight waves and the correct analysis model were acceptable regardless of the value of the ARMA parameter. For the LGM with a parallel process, all the biases observed with an ARMA parameter of 0.2 and 0.8 were unacceptable, while the biases observed with an ARMA parameter of 0.5 and 0.45 were all acceptable (except one bias that was barely unacceptable). Across the three LGMs, holding other factors constant, the absolute biases observed with an ARMA parameter of 0.2 and 0.8 were higher than those observed with an ARMA parameter of 0.5 and 0.45, and the absolute unacceptable biases observed with four waves were higher than those observed with eight waves.

Table 4-15. Mean relative biases of estimates for three LGMs with an ARMA(1, 1) within-person residual covariance matrix
ARMA       Model      Wave  Size   LGM 1    LGM 2    LGM 3
0.2, 0.8   Incorrect  4     200    0.350    0.335    0.353
           Incorrect  4     500    0.342    0.340    0.345
           Incorrect  4     2000   -0.339   -0.339   -0.338
Table 4-15. Continued.
ARMA       Model      Wave  Size   LGM 1    LGM 2    LGM 3
           Incorrect  8     200    0.219    0.219    0.221
           Incorrect  8     500    0.212    0.213    0.214
           Incorrect  8     2000   0.209    0.214    0.210
           Correct    4     200    0.214    0.204    0.263
           Correct    4     500    -0.202   -0.204   -0.251
           Correct    4     2000   0.200    0.207    0.241
           Correct    8     200    0.073    0.038    0.102
           Correct    8     500    0.067    0.032    0.087
           Correct    8     2000   0.062    0.025    0.071
0.5, 0.45  Incorrect  4     200    0.026    0.040    0.025
           Incorrect  4     500    0.038    0.038    0.037
           Incorrect  4     2000   0.040    0.041    0.041
           Incorrect  8     200    0.023    0.030    0.015
           Incorrect  8     500    0.030    0.032    0.027
           Incorrect  8     2000   0.032    0.034    0.031
           Correct    4     200    0.162    0.124    0.054
           Correct    4     500    0.140    0.099    0.021
           Correct    4     2000   0.130    0.079    0.003
           Correct    8     200    0.053    0.039    0.019
           Correct    8     500    0.050    0.035    0.010
           Correct    8     2000   0.046    0.038    0.007
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Results in Table 4-16 indicate that for all three LGMs, all the unacceptable biases occurred when the ARMA parameter was equal to 0.2 and 0.8 and the number of waves was four. Under these conditions, the biases of in the three LGMs were all unacceptable and negative when the analysis model was misspecified but were acceptable when the analysis model was correct (a few relative biases for the LGM with a time invariant covariate and one relative bias for the LGM with a parallel process were barely unacceptable). The biases under the other conditions were all acceptable.
Table 4-16. Mean relative biases of estimates for three LGMs with an ARMA(1, 1) within-person residual covariance matrix
ARMA       Model      Wave  Size   LGM 1    LGM 2    LGM 3
0.2, 0.8   Incorrect  4     200    -0.088   -0.079   -0.104
           Incorrect  4     500    0.083    0.077    0.087
           Incorrect  4     2000   0.078    0.078    0.079
           Incorrect  8     200    0.022    0.017    0.03
           Incorrect  8     500    0.016    0.015    0.019
           Incorrect  8     2000   0.013    0.015    0.014
           Correct    4     200    0.055    0.037    0.054
           Correct    4     500    0.051    0.035    0.029
           Correct    4     2000   0.054    0.039    0.024
           Correct    8     200    0.017    0.007    0.024
           Correct    8     500    0.008    0.004    0.013
           Correct    8     2000   0.005    0.002    0.007
0.5, 0.45  Incorrect  4     200    0.004    0.002    0.024
           Incorrect  4     500    0.005    0.004    0.003
           Incorrect  4     2000   0.007    0.008    0.005
           Incorrect  8     200    0.009    0.000    0.016
           Incorrect  8     500    0.002    0.001    0.004
           Incorrect  8     2000   0.001    0.002    0.000
           Correct    4     200    0.021    0.011    0.039
           Correct    4     500    0.011    0.005    0.014
           Correct    4     2000   0.004    0.003    0.003
           Correct    8     200    0.013    0.006    0.019
           Correct    8     500    0.005    0.001    0.007
           Correct    8     2000   0.001    0.001    0.001
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Results in Table 4-17 indicate that unacceptable biases for were observed only when the ARMA parameter was equal to 0.2 and 0.8. With these parameter values, the bias was unacceptable when the analysis model was incorrect, regardless of the number of waves, but the bias was larger when the number of waves was four. With the 0.2 and 0.8 parameter values, the bias was also unacceptable when the analysis model was correct and the number of waves was four, but only for the LGM with a time invariant covariate and the LGM with a time-varying covariate. The unacceptable biases increased as the sample size increased.
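As a rough illustration of how the per-condition entries in these tables can be produced and flagged, the sketch below computes the mean relative bias of a variance component estimate for each condition, assuming the usual definition (mean estimate minus true value, divided by the true value). The acceptability cutoff for parameter estimates is not restated in this section, so it is left as an argument; the 0.05 used in the toy call, the condition labels, and all numbers are hypothetical.

```python
from statistics import mean

def relative_bias(estimates, true_value):
    """Mean relative bias of a parameter estimate over replications."""
    if true_value == 0:
        raise ValueError("relative bias is undefined for a true value of 0")
    return (mean(estimates) - true_value) / true_value

def flag_conditions(results, true_value, criterion):
    """Map each condition to (relative bias, acceptable?).

    `results` maps a condition label to the list of estimates obtained
    across replications in that condition.
    """
    table = {}
    for condition, estimates in results.items():
        rb = relative_bias(estimates, true_value)
        table[condition] = (rb, abs(rb) <= criterion)
    return table

# Hypothetical toy estimates of one variance component, keyed by
# (analysis model, number of waves, sample size).
toy = {
    ("incorrect", 4, 200): [0.118, 0.125, 0.121],
    ("correct", 4, 200): [0.101, 0.097, 0.103],
}
flags = flag_conditions(toy, true_value=0.10, criterion=0.05)
for condition, (rb, ok) in flags.items():
    print(condition, round(rb, 3), "acceptable" if ok else "unacceptable")
```

Keeping the sign of the relative bias, rather than only its absolute value, is what allows statements such as "inflated" or "negatively biased" to be read directly from the table.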
Table 4-17. Mean relative biases of estimates for three LGMs with an ARMA(1, 1) within-person residual matrix collapsing across sample size
ARMA       Model      Wave  Size   LGM 1   LGM 2   LGM 3
0.2, 0.8   Incorrect  4     200    0.230   0.229   0.220
           Incorrect  4     500    0.232   0.238   0.229
           Incorrect  4     2000   0.235   0.235   0.235
           Incorrect  8     200    0.070   0.085   0.066
           Incorrect  8     500    0.076   0.090   0.075
           Incorrect  8     2000   0.079   0.088   0.076
           Correct    4     200    0.097   0.087   0.031
           Correct    4     500    0.112   0.088   0.042
           Correct    4     2000   0.134   0.101   0.054
           Correct    8     200    0.017   0.013   0.021
           Correct    8     500    0.022   0.007   0.026
           Correct    8     2000   0.022   0.008   0.027
0.5, 0.45  Incorrect  4     200    0.034   0.026   0.032
           Incorrect  4     500    0.024   0.022   0.023
           Incorrect  4     2000   0.023   0.022   0.024
           Incorrect  8     200    0.02    0.016   0.024
           Incorrect  8     500    0.016   0.013   0.017
           Incorrect  8     2000   0.014   0.013   0.015
           Correct    4     200    0.006   0.016   0.001
           Correct    4     500    0.011   0.009   0.002
           Correct    4     2000   0.006   0.004   0.001
           Correct    8     200    0.008   0.005   0.016
           Correct    8     500    0.003   0.002   0.011
           Correct    8     2000   0.001   0       0.009
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Summary of the Results for Variance Component Parameter Estimates
The above results indicate that, unlike those of the fixed parameter estimates and their accompanying standard error estimates, not all the relative biases of the variance component estimates were acceptable. Whether the biases were acceptable depended on the type of within-person residual covariance structure. When the residual covariance structure was an AR(1) or an MA(1) process, the analysis model type had the most important impact on the mean relative biases. It was either the analysis model type, or the interaction of the analysis model type with other factors (e.g., the
number of waves or the time series parameters), that had an impact on the relative biases. In comparison, when the residual covariance structure was an ARMA(1, 1) process, it was the ARMA parameter that played the important role: all the relative biases were affected by either the ARMA parameter or the interaction of the ARMA parameter with other factors (i.e., the analysis model type, the number of waves, or both).

With an AR(1) or an MA(1) within-person residual covariance structure, the impact of the analysis model type was as follows. When the incorrect analysis model was used, all the estimates of were biased and some estimates of and were biased; when the correct analysis model was used, all the biases of and were acceptable, but many biases of were unexpectedly unacceptable. Because some unacceptable biases were found with the correct analysis model, further investigations were conducted and are discussed later. The general pattern of the impact of the model type on the variance component estimates was that, under the same condition, the correct analysis model always led to better estimates than the incorrect analysis model.

With an AR(1) or an MA(1) within-person error structure, the sample size only had an impact on the acceptability of the biases of the estimates of with a correct analysis model, not on the estimates of and . The unacceptable biases of that occurred with the correct analysis model were affected by the sample size only under conditions in which the AR parameter was 0.5, the number of waves was four and the sample size was 200 or 500, and under conditions in which the AR parameter was 0.8, the number of waves was eight and the sample size was 200 or 500. Under these conditions, the unacceptable biases of could be avoided by increasing the sample size to 2000. With an MA(1) structure and a correct analysis model, the only unacceptable bias occurred under the condition in which the MA parameter was 0.8, the sample size was 200 and the number of waves was four. All other biases observed with the correct analysis model
were acceptable. On average, the magnitude of the biases of the variance components decreased as the sample size increased. However, exceptions were noted for the estimates of with an AR(1) error structure and the estimates of with an MA(1) error structure: the biases observed with the incorrect analysis model increased slightly as the sample size increased.

With an AR(1) or an MA(1) within-person residual covariance structure, none of the biases obtained with four waves was smaller in magnitude than the corresponding bias obtained with eight waves, holding other conditions equal. The number of waves played an important role in the estimates of : the incorrect analysis model resulted in unacceptable biases only when the number of waves was four. When the number of waves was eight, all the biases of were acceptable regardless of the analysis model type. The AR parameter and the MA parameter only affected the acceptability of the biases of the estimates of the . Holding other conditions constant, a higher value of the AR or MA parameter resulted in higher absolute biases for the estimates of than a lower value did.

The pattern of biases with an ARMA(1, 1) within-person residual structure was different from that with an AR(1) or an MA(1) error structure. All the relative biases were affected by the value of the ARMA parameter. Controlling for other factors, an ARMA parameter value of 0.2 and 0.8 always led to more biased estimates than its counterpart value of 0.5 and 0.45. The analysis model type affected the relative biases through its interaction with the ARMA parameter. Without controlling for the effect of the ARMA parameter, the relative biases
observed with the correct analysis model were no better than those observed with the incorrect analysis model. When the ARMA parameter was 0.2 and 0.8, the unacceptable biases caused by the incorrect analysis model were worse than those caused by the correct analysis model. For the estimates of and , regardless of the analysis model type, all the unacceptable relative biases were observed only when the ARMA parameter was 0.2 and 0.8, and all the relative biases observed when the ARMA parameter was 0.5 and 0.45 were acceptable. For the estimates of , unacceptable biases were observed with both ARMA parameter values. However, when the ARMA parameter value was 0.2 and 0.8, the unacceptable biases were observed regardless of the analysis model type; when the ARMA parameter value was 0.5 and 0.45, only the biases observed with the correct analysis model were unacceptable.

With an ARMA(1, 1) error structure, the unacceptable biases of the estimates of increased as the sample size increased, which was unexpected. In terms of the impact of the number of waves, controlling for other factors, the biases observed with four waves were always worse than the biases observed with eight waves. Similar to the AR(1) or MA(1) error structure, the number of waves played an important role in the estimates of : when the ARMA parameter was 0.2 and 0.8, as long as the number of waves was eight, the biases of were acceptable no matter what the analysis model type was.

Whenever the biases were unacceptable:
1. when the residual covariance structure was an AR(1) process, estimates of and were inflated and estimates of were deflated;
2. when the residual covariance structure was an MA(1) or an ARMA(1, 1) process, estimates of and were deflated and estimates of were inflated.
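To make the three within-person residual structures compared above concrete, the following sketch builds the implied residual covariance matrix across waves from the standard stationary autocovariances of an ARMA(1, 1) process, x_t = phi * x_{t-1} + e_t + theta * e_{t-1}; setting theta = 0 gives AR(1) and setting phi = 0 gives MA(1). The function names, the unit innovation variance, the sign convention for theta, and the reading of the "0.2 and 0.8" pair as phi = 0.2 and theta = 0.8 are assumptions made for illustration only.

```python
def arma11_autocov(phi, theta, sigma2, n_waves):
    """Autocovariances gamma_0 .. gamma_{n_waves-1} of a stationary
    ARMA(1, 1) process with innovation variance sigma2."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    denom = 1.0 - phi ** 2
    gamma = [sigma2 * (1.0 + 2.0 * phi * theta + theta ** 2) / denom]
    if n_waves > 1:
        gamma.append(sigma2 * (1.0 + phi * theta) * (phi + theta) / denom)
    for _ in range(2, n_waves):
        gamma.append(phi * gamma[-1])  # gamma_k = phi * gamma_{k-1}, k >= 2
    return gamma

def residual_cov(phi, theta, sigma2, n_waves):
    """Toeplitz residual covariance matrix implied by the process."""
    g = arma11_autocov(phi, theta, sigma2, n_waves)
    return [[g[abs(i - j)] for j in range(n_waves)] for i in range(n_waves)]

ar1 = residual_cov(phi=0.5, theta=0.0, sigma2=1.0, n_waves=4)   # AR(1)
ma1 = residual_cov(phi=0.0, theta=0.5, sigma2=1.0, n_waves=4)   # MA(1)
arma = residual_cov(phi=0.2, theta=0.8, sigma2=1.0, n_waves=4)  # ARMA(1, 1)

for name, matrix in (("AR(1)", ar1), ("MA(1)", ma1), ("ARMA(1, 1)", arma)):
    print(name, [round(v, 3) for v in matrix[0]])
```

An analysis model that ignores the time series process in effect constrains all off-diagonal entries of this matrix to zero, so the omitted serial dependence has to be absorbed elsewhere in the model, for example by the variance components.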
Taking the number of acceptable biases and the magnitude of these biases under the same condition as a measure of sensitivity to model misspecification, estimates of were more sensitive to model misspecification than those of and , and estimates of were the least sensitive of the three. Regarding the performance of the three LGMs, there was no substantial difference among them.

Standard Error Estimates of Variance Components
The standard error estimates of the variance components refer to the standard error estimates for , and . The structure of this part follows the same pattern as that of the previous section on the variance component estimates. Within each residual covariance structure, the results for the standard error of are presented first, followed by the results for the standard error of . The results for the standard error of are presented last.

AR(1) Within-Person Residual Covariance Matrix
Results in Table 4-18 indicate that for the three LGMs, the relative biases of the standard error estimates of were all acceptable when the analysis model was misspecified but were not all acceptable when the analysis model was correct. All the unacceptable biases were observed with a sample size of 200 or 500. However, not all biases that occurred with a sample size of 200 or 500 were unacceptable: when the AR parameter was 0.5 and the number of waves was eight, the biases that occurred with a sample size of 200 or 500 were still acceptable. The biases observed with a sample size of 2000 were all acceptable (except that one bias for the LGM with a parallel process was barely unacceptable). When the sample size increased from 200 to 500, the sign of some of the unacceptable biases changed from positive to negative and the magnitude of the biases increased, which was unexpected. Some estimates were substantially positively biased (inflated more than
100%) under the condition of an AR parameter of 0.8 and four waves. The magnitudes of the unacceptable biases observed with an AR value of 0.8 were larger than those observed with an AR value of 0.5.

Table 4-18. Mean relative biases of standard error estimates of for three LGMs with an AR(1) within-person residual covariance matrix
Model      AR   Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.5  4     200    0.014   0.001   0.001
           0.5  4     500    0.008   0.010   0.004
           0.5  4     2000   0.016   0.006   0.019
           0.5  8     200    0.014   0.006   0.006
           0.5  8     500    0.016   0.013   0.020
           0.5  8     2000   0.007   0.008   0.001
           0.8  4     200    0.015   0.005   0.015
           0.8  4     500    0.009   0.008   0.010
           0.8  4     2000   0.012   0.002   0.015
           0.8  8     200    0.002   0.009   0.021
           0.8  8     500    0.018   0.004   0.004
           0.8  8     2000   0.010   0.009   0.006
Correct    0.5  4     200    0.146   0.086   0.125
           0.5  4     500    0.435   0.115   0.407
           0.5  4     2000   0.025   0.014   0.004
           0.5  8     200    0.013   0.002   0.006
           0.5  8     500    0.024   0.001   0.007
           0.5  8     2000   0.015   0.001   0.004
           0.8  4     200    2.245   2.061   2.167
           0.8  4     500    1.136   1.350   1.316
           0.8  4     2000   0.070   0.097   0.131
           0.8  8     200    0.101   0.127   0.116
           0.8  8     500    0.193   0.223   0.162
           0.8  8     2000   0.004   0.012   0.011
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

For the three conditional LGMs, the biases of the standard error estimates of or were all acceptable, and therefore only the marginal means based on the analysis model type are reported (see Table 4-19). The range of the mean relative biases of standard error estimates of
and was from -.004 to .009, indicating that the estimates were quite close to their empirical values.

Table 4-19. Mean relative biases of standard error estimates of and for three LGMs with an AR(1) within-person residual covariance matrix
             LGM 1            LGM 2            LGM 3
Model
Incorrect    0.004   0.000    0.000   0.001    0.003   0.004
Correct      0.009   0.005    0.007   0.011    0.008   0.005
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively.

MA(1) Within-Person Residual Covariance Matrix
The biases of the standard error estimates of all three variance components were acceptable and trivial, ranging from 0.006 to 0.001, indicating that the estimates were quite close to their respective empirical values (see Table 4-20).

Table 4-20. Mean relative biases of standard error estimates of variance components for three LGMs with an MA(1) within-person residual covariance matrix
             LGM 1                    LGM 2                    LGM 3
Model
Incorrect    0.000   0.003   0.005    0.000   0.003   0.005    0.005   0.003   0.006
Correct      0.001   0.001   0.004    0.001   0.001   0.004    0.004   0.000   0.004
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively.

ARMA(1, 1) Within-Person Residual Covariance Matrix
Results in Table 4-21 indicate that the misspecified model did not lead to any unacceptable biases. However, whenever the analysis model was correct, the biases were unacceptable (with one exception for the LGM with a time invariant covariate). The magnitudes of these unacceptable biases were much larger with an ARMA parameter value of 0.5 and 0.45 than with an ARMA parameter value of 0.2 and 0.8. Many biases observed with a value of 0.5 and 0.45 and the correct analysis model were larger than 1. The unacceptable biases observed with
an ARMA parameter value of 0.2 and 0.8 demonstrated unexpected trends: the magnitude of these biases increased on average as the sample size increased, and the biases observed with four waves changed sign when the sample size changed from 200 to 500.

Table 4-21. Mean relative biases of standard error estimates of for three LGMs with an ARMA(1, 1) within-person residual covariance matrix
Model      ARMA       Wave  Size   LGM 1    LGM 2    LGM 3
Incorrect  0.2, 0.8   4     200    0.002    0.000    0.004
           0.2, 0.8   4     500    0.022    0.005    0.006
           0.2, 0.8   4     2000   0.008    0.002    0.017
           0.2, 0.8   8     200    0.026    0.002    0.011
           0.2, 0.8   8     500    0.007    0.005    0.000
           0.2, 0.8   8     2000   0.01     0.012    0.011
           0.5, 0.45  4     200    0.004    0.009    0.000
           0.5, 0.45  4     500    0.032    0.000    0.006
           0.5, 0.45  4     2000   0.011    0.011    0.024
           0.5, 0.45  8     200    0.007    0.013    0.016
           0.5, 0.45  8     500    0.005    0.009    0.004
           0.5, 0.45  8     2000   0.003    0.012    0.019
Correct    0.2, 0.8   4     200    0.036    0.229    0.130
           0.2, 0.8   4     500    0.266    0.183    0.251
           0.2, 0.8   4     2000   0.552    0.562    0.588
           0.2, 0.8   8     200    -0.199   -0.113   -0.207
           0.2, 0.8   8     500    0.409    0.220    0.346
           0.2, 0.8   8     2000   0.673    0.500    0.604
           0.5, 0.45  4     200    2.120    1.076    0.793
           0.5, 0.45  4     500    1.749    1.062    0.891
           0.5, 0.45  4     2000   1.522    0.995    0.548
           0.5, 0.45  8     200    1.365    1.192    1.136
           0.5, 0.45  8     500    1.748    1.317    1.103
           0.5, 0.45  8     2000   1.260    1.018    0.791
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Results in Table 4-22 indicate that for the three LGMs, the only unacceptable biases occurred under the condition in which the analysis model was correct, the ARMA parameter was
equal to 0.2 and 0.8, the number of waves was four and the sample size was 2000. Biases obtained under all other conditions were acceptable.

Table 4-22. Mean relative biases of standard error estimates of for three LGMs with an ARMA(1, 1) within-person residual covariance matrix
Model      ARMA       Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.2, 0.8   4     200    0.016   0.008   0.004
           0.2, 0.8   4     500    0.003   0.003   0.011
           0.2, 0.8   4     2000   0.004   0.001   0.003
           0.2, 0.8   8     200    0.012   0.003   0.006
           0.2, 0.8   8     500    0.007   0.000   0.019
           0.2, 0.8   8     2000   0.02    0.003   0.006
           0.5, 0.45  4     200    0.003   0.014   0.007
           0.5, 0.45  4     500    0.000   0.003   0.002
           0.5, 0.45  4     2000   0.004   0.002   0.016
           0.5, 0.45  8     200    0.01    0.007   0.007
           0.5, 0.45  8     500    0.025   0.004   0.003
           0.5, 0.45  8     2000   0.006   0.006   0.031
Correct    0.2, 0.8   4     200    0.043   0.039   0.020
           0.2, 0.8   4     500    0.077   0.058   0.046
           0.2, 0.8   4     2000   0.256   0.242   0.178
           0.2, 0.8   8     200    0.001   0.001   0.003
           0.2, 0.8   8     500    0.002   0.001   0.018
           0.2, 0.8   8     2000   0.027   0.017   0.044
           0.5, 0.45  4     200    0.013   0.023   0.024
           0.5, 0.45  4     500    0.004   0.009   0.030
           0.5, 0.45  4     2000   0.017   0.015   0.043
           0.5, 0.45  8     200    0.010   0.009   0.007
           0.5, 0.45  8     500    0.015   0.007   0.001
           0.5, 0.45  8     2000   0.007   0.006   0.020
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

The relative biases of the standard error estimates of were all acceptable when the analysis model was misspecified (see Table 4-23). However, a few unacceptable biases were observed under conditions in which the analysis model was correct, the ARMA parameter value was 0.2 and 0.8 and the sample size was 500 or 2000. These unacceptable biases were negative
and the magnitude of these unacceptable biases increased as the sample size increased, which was unexpected.

Table 4-23. Mean relative biases of standard error estimates of for three LGMs with an ARMA(1, 1) within-person residual covariance matrix
Model      ARMA       Wave  Size   LGM 1   LGM 2   LGM 3
Incorrect  0.2, 0.8   4     200    0.016   0.001   0.007
           0.2, 0.8   4     500    0.004   0.011   0.009
           0.2, 0.8   4     2000   0.022   0.015   0.013
           0.5, 0.45  8     200    0.001   0.004   0.000
           0.5, 0.45  8     500    0.004   0.015   0.005
           0.5, 0.45  8     2000   0.010   0.008   0.001
           0.2, 0.8   4     200    0.014   0.005   0.012
           0.2, 0.8   4     500    0.014   0.000   0.004
           0.2, 0.8   4     2000   0.012   0.013   0.011
           0.5, 0.45  8     200    0.014   0.006   0.020
           0.5, 0.45  8     500    0.003   0.003   0.009
           0.5, 0.45  8     2000   0.002   0.005   0.016
Correct    0.2, 0.8   4     200    0.053   0.068   0.026
           0.2, 0.8   4     500    0.135   0.143   0.070
           0.2, 0.8   4     2000   0.405   0.398   0.262
           0.5, 0.45  8     200    0.094   0.025   0.039
           0.5, 0.45  8     500    0.164   0.031   0.088
           0.5, 0.45  8     2000   0.376   0.147   0.240
           0.2, 0.8   4     200    0.018   0.030   0.085
           0.2, 0.8   4     500    0.013   0.007   0.084
           0.2, 0.8   4     2000   0.018   0.026   0.087
           0.5, 0.45  8     200    0.013   0.022   0.034
           0.5, 0.45  8     500    0.013   0.010   0.011
           0.5, 0.45  8     2000   0.015   0.018   0.041
Note: LGM 1, LGM 2 and LGM 3 represent LGM with a time invariant covariate, LGM with a time-varying covariate and LGM with a parallel process, respectively. Numbers in bold indicate unacceptable bias.

Summary of Standard Error Estimates of the Variance Components
Across the three LGMs, the relative biases of the standard error estimates of the variance components were all acceptable when the analysis model failed to include the time series process in the within-person residual covariance structure. Therefore, the model misspecification did not affect the standard error estimates of the variance components. However, some biases
PAGE 100
were found to be unacceptable with the correct analysis model under an AR (1) or an ARMA (1, 1) error structure. All the biases observed with an MA (1) error structure were acceptable, regardless of the analysis model type.

When the within-person residual covariance structure was AR (1) and the correct analysis model was employed, unacceptable biases were observed only for the standard error estimates for not for the standard error estimate for or. When the residual covariance structure was ARMA (1, 1) and the correct analysis model was used, unacceptable biases were observed for the standard error estimates of all three variance components.

Furthermore, some unexpected patterns were noticed. With an AR (1) error structure, the magnitudes of some of the unacceptable biases increased as the sample size increased, and the sign of these unacceptable biases changed from positive to negative. With an ARMA (1, 1) error structure, the magnitude of the unacceptable biases observed with ARMA parameter values of 0.2 and 0.8 increased as the sample size increased. For the standard error estimates of some of the unacceptable biases changed sign as the sample size changed. A few biases obtained with the correct analysis model were larger than 1. All the unacceptable biases of and were observed only with a sample size of 500 or 2000, not with a sample size of 200. All of these unexpected findings are investigated and discussed later.

With the correct analysis model, the absolute values of the unacceptable biases observed with an AR parameter of 0.8 were higher than those observed with an AR parameter of 0.5. The unacceptable biases of the standard error estimates for and were observed only with ARMA parameter values of 0.2 and 0.8, not with ARMA parameter values of 0.5 and 0.45, but for the standard error estimates of both sets of parameter values led to unacceptable biases, and
PAGE 101
these biases observed with a 0.5 and 0.45 value were larger than those observed with a 0.2 and 0.8 value. On average, estimates obtained with four waves were more biased than those obtained with eight waves. The three LGMs showed little difference in performance in terms of the pattern of acceptability and the magnitude of the relative biases.

Chi-Square GOF Test and GOF Indexes

The chi-square GOF test and GOF indexes are used to detect model misspecification. Therefore, the results reported in this section address whether the GOF test and GOF indexes could successfully differentiate the correct analysis model from the incorrect analysis model. Their performance is presented in the following sequence: GOF test (p value), CFI and TLI, RMSEA and SRMR.

GOF Test

The null hypothesis in a GOF test is that the targeted model fits the data exactly. A Type I error is committed when a true null hypothesis is rejected. Therefore, rejecting a correct model in a GOF test is a Type I error. As a common criterion is to control the Type I error rate at 5%, the following section reports the percentage of p values below 0.05 for each of the 24 conditions. The percentage reported here is therefore the estimated Type I error rate when the null hypothesis is true and the estimated power when the null hypothesis is false. The p value was said to be able to differentiate between the two types of analysis models when the Type I error rate was less than 5% and the power was larger than 80%.

AR (1) Within-Person Residual Covariance Matrix

Results in Table 4-24 indicate that under conditions in which the number of waves was four and the sample size was 2000, or the number of waves was eight, the Type I error rate was 5% or
PAGE 102
slightly inflated, and the power was 100% for all three LGMs. For the LGM with a time invariant covariate, when the AR parameter was 0.5, the number of waves was four and the sample size was 500, the power was 82%. The power under all other conditions was less than 80%.

Table 4-24. Percentage of p values below 0.05 for three LGMs with an AR (1) within-person residual covariance matrix

Model      AR   Wave  Size  LGM 1  LGM 2  LGM 3
Incorrect  0.5  4     200   39%    31%    22%
           0.5  4     500   82%    73%    56%
           0.5  4     2000  100%   100%   100%
           0.8  4     200   34%    28%    21%
           0.8  4     500   78%    70%    52%
           0.8  4     2000  100%   100%   100%
           0.5  8     200   100%   100%   100%
           0.5  8     500   100%   100%   100%
           0.5  8     2000  100%   100%   100%
           0.8  8     200   100%   100%   100%
           0.8  8     500   100%   100%   100%
           0.8  8     2000  100%   100%   100%
Correct    0.5  4     200   5%     6%     6%
           0.5  4     500   5%     5%     6%
           0.5  4     2000  5%     5%     6%
           0.8  4     200   5%     6%     6%
           0.8  4     500   5%     5%     6%
           0.8  4     2000  5%     5%     5%
           0.5  8     200   5%     8%     8%
           0.5  8     500   5%     6%     7%
           0.5  8     2000  5%     5%     5%
           0.8  8     200   6%     9%     9%
           0.8  8     500   5%     6%     6%
           0.8  8     2000  5%     6%     5%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.
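The percentages in Table 4-24 are rejection rates over replications: the share of p values below 0.05, read as a Type I error rate under the correct analysis model and as power under the incorrect one. The following is a minimal sketch of that bookkeeping, not the code used in the study. The function names and the p values are hypothetical, and the use of `<=` for the Type I error criterion is an assumption made so that a rate reported as 5% meets it.

```python
def rejection_rate(p_values, alpha=0.05):
    """Proportion of replications whose GOF p value is below alpha."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one p value is required")
    return sum(p < alpha for p in p_values) / len(p_values)


def can_differentiate(type1_rate, power, max_type1=0.05, min_power=0.80):
    """Apply the Type I error and power criterion described in the text.

    Assumption: "<=" is used for the Type I error rate so that a rate
    reported as exactly 5% still passes.
    """
    return type1_rate <= max_type1 and power > min_power


# Hypothetical p values from 20 replications per analysis model.
p_correct = [0.62, 0.03, 0.41, 0.88, 0.17, 0.55, 0.29, 0.73, 0.09, 0.36,
             0.47, 0.95, 0.12, 0.66, 0.21, 0.81, 0.33, 0.58, 0.07, 0.44]
p_incorrect = [0.001, 0.004, 0.02, 0.0003, 0.03, 0.011, 0.048, 0.002, 0.26, 0.007,
               0.015, 0.0009, 0.039, 0.006, 0.012, 0.08, 0.001, 0.022, 0.003, 0.018]

# Rejections of the correct model are Type I errors.
type1_rate = rejection_rate(p_correct)
# Rejections of the incorrect model are correct detections (power).
power = rejection_rate(p_incorrect)

print(type1_rate, power, can_differentiate(type1_rate, power))
```

Keeping the thresholds as parameters makes it easy to check how sensitive the "can differentiate" verdict is to the exact cutoffs.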
PAGE 103
MA (1) Within-Person Residual Covariance Matrix

Results for the p value were similar to those obtained with an AR (1) residual covariance structure (see Table 4-25). The power was 100% for every condition in which the number of waves was eight, or the number of waves was four and the sample size was 2000. All the Type I error rates were 5% or slightly inflated. For the LGM with a time invariant covariate, the power was at least 80% with four waves and a sample size of 500.

Table 4-25. Percentage of p values below 0.05 for three LGMs with an MA (1) within-person residual covariance matrix

Model      MA   Wave  Size  LGM 1  LGM 2  LGM 3
Incorrect  0.5  4     200   37%    31%    23%
           0.5  4     500   80%    72%    56%
           0.5  4     2000  100%   100%   100%
           0.8  4     200   50%    42%    31%
           0.8  4     500   92%    86%    74%
           0.8  4     2000  100%   100%   100%
           0.5  8     200   100%   100%   100%
           0.5  8     500   100%   100%   100%
           0.5  8     2000  100%   100%   100%
           0.8  8     200   100%   100%   100%
           0.8  8     500   100%   100%   100%
           0.8  8     2000  100%   100%   100%
Correct    0.5  4     200   7%     8%     7%
           0.5  4     500   6%     11%    5%
           0.5  4     2000  6%     10%    6%
           0.8  4     200   7%     8%     7%
           0.8  4     500   6%     8%     6%
           0.8  4     2000  6%     6%     6%
           0.5  8     200   6%     8%     10%
           0.5  8     500   6%     6%     7%
           0.5  8     2000  5%     5%     6%
           0.8  8     200   6%     9%     9%
           0.8  8     500   6%     6%     7%
           0.8  8     2000  5%     5%     5%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.
PAGE 104
For the LGM with a time varying covariate, the power was 86% when the MA parameter was 0.8, the number of waves was four and the sample size was 500.

ARMA (1, 1) Within-Person Residual Covariance Matrix

Results in Table 4-26 indicate that with an ARMA (1, 1) within-person residual covariance structure included in the generating model, the Type I error rates ranged from 6% to 9% with ARMA parameter values of 0.2 and 0.8 but ranged from 9% to 44% with ARMA parameter values of 0.5 and 0.45. The Type I error rates obtained with four waves and with eight waves were similar for the values of 0.2 and 0.8, but a larger number of waves resulted in a higher Type I error rate for the values of 0.5 and 0.45 with a sample size of 200 or 500. With values of 0.5 and 0.45, the Type I error rate tended to increase as the sample size increased.

The power to detect the incorrect analysis model was at least 95% when the ARMA parameter values were 0.5 and 0.45 and the number of waves was eight, and at least 88% when the number of waves was four, the ARMA parameter values were 0.5 and 0.45, and the sample size was 2000. The power under all other conditions was at most 43%.

Table 4-26. Percentage of p values below 0.05 for three LGMs with an ARMA (1, 1) within-person residual covariance matrix

Model      ARMA       Wave  Size  LGM 1  LGM 2  LGM 3
Incorrect  0.2, 0.8   4     200   6%     6%     6%
           0.2, 0.8   4     500   5%     5%     5%
           0.2, 0.8   4     2000  6%     6%     5%
           0.5, 0.45  4     200   18%    16%    12%
           0.5, 0.45  4     500   43%    35%    26%
           0.5, 0.45  4     2000  98%    96%    88%
           0.2, 0.8   8     200   7%     10%    10%
           0.2, 0.8   8     500   7%     8%     7%
           0.2, 0.8   8     2000  18%    15%    11%
           0.5, 0.45  8     200   100%   99%    95%
           0.5, 0.45  8     500   100%   100%   100%
PAGE 105
Table 4-26. Continued.

Model      ARMA       Wave  Size  LGM 1  LGM 2  LGM 3
           0.5, 0.45  8     2000  100%   100%   100%
Correct    0.2, 0.8   4     200   7%     7%     7%
           0.2, 0.8   4     500   6%     6%     6%
           0.2, 0.8   4     2000  6%     6%     6%
           0.5, 0.45  4     200   9%     10%    10%
           0.5, 0.45  4     500   13%    14%    13%
           0.5, 0.45  4     2000  34%    33%    40%
           0.2, 0.8   8     200   7%     9%     8%
           0.2, 0.8   8     500   7%     6%     7%
           0.2, 0.8   8     2000  7%     6%     7%
           0.5, 0.45  8     200   26%    16%    33%
           0.5, 0.45  8     500   27%    16%    36%
           0.5, 0.45  8     2000  27%    16%    44%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

Summary of Results for GOF Test

The p value performed differently with the three types of error structures. When the generating model included an AR (1) or an MA (1) within-person residual covariance structure, the p value could be used to differentiate between the two analysis models for all three LGMs, based on the Type I error rate and power criteria, as long as the number of waves was eight or the sample size was 2000. For the LGM with a time invariant covariate and the LGM with a time varying covariate, a few more conditions met these criteria.

The situation was different for an ARMA (1, 1) residual covariance structure. Under the conditions in which the power was more than 80%, the Type I error rate was at least 16%. Under the conditions in which the Type I error rate was less than 10%, the power was at most 18%. Therefore, the p value could not be used to differentiate between the two types of analysis models with an ARMA (1, 1) error structure. With values of 0.5 and 0.45, the Type I error rates
PAGE 106
were large and tended to increase as the sample size increased, and eight waves led to a higher Type I error rate than four waves when the sample size was 200 or 500.

In summary, when the within-person residual covariance structure was an AR (1) or an MA (1) structure, the GOF test could be used to differentiate between the two analysis models for all three LGMs under conditions in which the number of waves was eight or the sample size was 2000. When the within-person residual covariance structure was an ARMA (1, 1) structure, the GOF test was of little use in model selection.

TLI and CFI

For TLI and CFI, the criteria for adequate model fit are CFI greater than 0.95 and TLI greater than 0.95. The percentage of CFI and TLI values that met these criteria is reported in this section.

AR (1) Within-Person Residual Covariance Matrix

The TLI and CFI suggested adequate model fit for all replications under all conditions for the LGM with a parallel process, and in at least 98% of the replications under all conditions for the LGM with a time invariant covariate (see Table 4-27). This was true for both the correct and the incorrect analysis models. For the LGM with a time varying covariate, CFI and TLI suggested adequate model fit for all replications when the analysis model was correct. When the analysis model was incorrect, CFI and TLI could not differentiate the correct analysis model from the incorrect analysis model if the number of waves was four. Differentiation was minimal with eight waves and a parameter value of 0.5, but more substantial with a parameter value of 0.8. Nevertheless, with eight waves perfect differentiation did not occur for CFI, and occurred for TLI only when the sample size was 2000 and the AR parameter was 0.8.
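The sketch below shows how CFI and TLI can be computed from the chi-square statistics of the analysis model and of a baseline (independence) model, and how the 0.95 cutoffs stated above are then applied. It uses the common textbook definitions of the two indexes. The source does not show how its software computes them, so treating these formulas as equivalent is an assumption, and the function names and chi-square values are hypothetical.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    # When neither model shows excess misfit, CFI is taken to be 1.
    return 1.0 if denom == 0.0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index (assumes the baseline chi2/df ratio differs from 1)."""
    ratio_model = chi2_model / df_model
    ratio_base = chi2_base / df_base
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def adequate_fit(cfi_value, tli_value, cutoff=0.95):
    """Criterion from the text: both indexes greater than 0.95."""
    return cfi_value > cutoff and tli_value > cutoff


# Hypothetical replication: a chi-square of 30 on 10 df exceeds the 0.05
# critical value (about 18.31), so the GOF test would reject this model.
chi2_model, df_model = 30.0, 10
chi2_base, df_base = 2000.0, 20

cfi_value = cfi(chi2_model, df_model, chi2_base, df_base)
tli_value = tli(chi2_model, df_model, chi2_base, df_base)
print(round(cfi_value, 4), round(tli_value, 4), adequate_fit(cfi_value, tli_value))
```

In this hypothetical case both indexes stay above 0.95 even though the chi-square test rejects the model, because the baseline chi-square is very large relative to the model's misfit. This is one way the pattern of near-uniform adequate fit could arise, although the sketch does not establish that it is the mechanism in the simulation.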
PAGE 107
Table 4-27. Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with an AR (1) within-person residual covariance matrix

                            LGM 1         LGM 2         LGM 3
Model      AR   Wave  Size  TLI    CFI    TLI    CFI    CFI    TLI
Incorrect  0.5  4     200   100%   100%   100%   100%   100%   100%
           0.5  4     500   100%   100%   100%   100%   100%   100%
           0.5  4     2000  100%   100%   100%   100%   100%   100%
           0.8  4     200   100%   100%   100%   100%   100%   100%
           0.8  4     500   100%   100%   100%   100%   100%   100%
           0.8  4     2000  100%   100%   100%   100%   100%   100%
           0.5  8     200   100%   100%   97%    99%    100%   100%
           0.5  8     500   100%   100%   100%   100%   100%   100%
           0.5  8     2000  100%   100%   99%    100%   100%   100%
           0.8  8     200   99%    98%    32%    52%    100%   100%
           0.8  8     500   100%   100%   23%    56%    100%   100%
           0.8  8     2000  98%    100%   0%     61%    100%   100%
Correct    0.5  4     200   100%   100%   100%   100%   100%   100%
           0.5  4     500   100%   100%   100%   100%   100%   100%
           0.5  4     2000  100%   100%   100%   100%   100%   100%
           0.8  4     200   100%   100%   100%   100%   100%   100%
           0.8  4     500   100%   100%   100%   100%   100%   100%
           0.8  4     2000  100%   100%   100%   100%   100%   100%
           0.5  8     200   100%   100%   100%   100%   100%   100%
           0.5  8     500   100%   100%   100%   100%   100%   100%
           0.5  8     2000  100%   100%   100%   100%   100%   100%
           0.8  8     200   100%   100%   100%   100%   100%   100%
           0.8  8     500   100%   100%   100%   100%   100%   100%
           0.8  8     2000  100%   100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

MA (1) Within-Person Residual Covariance Matrix

Results obtained for TLI and CFI with an MA (1) within-person residual covariance structure were similar to those obtained with an AR (1) within-person residual covariance structure (see Table 4-28). The two statistics suggested adequate model fit in at least 99% of the replications under all conditions for the LGM with a time invariant covariate and the LGM with a parallel process, even when the analysis model was incorrect. For the LGM with a time varying
PAGE 108
108 covariate, CFI and TLI suggested adequate model fit for all replications when the analysis model was correct. With the incorrec t analysis model, CFI could not detect model misspecification when the number of waves was four, or when the number of waves was eight and the MA parameter was 0.5, and the differentiation was minimal when the number of waves was eight and the MA parameter was 0.8. With the incorrect analysis model, TLI detected model misspecification in 95% of the replications when the number of waves was eight, the sample size was 2000 and the AR parameter was 0.8. Its differentiation was minimal under all other conditio ns. Table 4 28. Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with a MA (1) within -person residual covariance matrix LGM 1 LGM 2 LGM 3 Model MA Wave Size TLI CFI TLI CFI CFI TLI Incorrect 0.5 4 200 100% 100 % 96% 100% 100% 100% 0.5 4 500 100% 100% 100% 100% 100% 100% 0.5 4 2000 100% 100% 100% 100% 100% 100% 0.8 4 200 100% 100% 93% 100% 100% 100% 0.8 4 500 100% 100% 99% 100% 100% 100% 0.8 4 2000 100% 100% 100% 100% 100% 100% 0.5 8 200 100% 99% 99% 100% 100% 100% 0.5 8 500 100% 100% 100% 100% 100% 100% 0.5 8 2000 100% 100% 100% 100% 100% 100% 0.8 8 200 100% 100% 76% 88% 100% 100% 0.8 8 500 100% 100% 92% 99% 100% 100% 0.8 8 2000 100% 100% 5% 100% 100% 100% Correct 0.5 4 200 100% 100% 100% 100% 100% 100% 0.5 4 500 100% 100% 100% 100% 100% 100% 0.5 4 2000 100% 100% 100% 100% 100% 100% 0.8 4 200 100% 100% 100% 100% 100% 100% 0.8 4 500 100% 100% 100% 100% 100% 100% 0.8 4 2000 100% 100% 100% 100% 100% 100% 0.5 8 200 100% 100% 100% 100% 100% 100% 0.5 8 500 100% 100% 100% 100% 100% 100% 0.5 8 2000 100% 100% 100% 100% 100% 100% 0.8 8 200 100% 100% 100% 100% 100% 100%
PAGE 109
Table 4-28. Continued.

                            LGM 1         LGM 2         LGM 3
Model      MA   Wave  Size  TLI    CFI    TLI    CFI    CFI    TLI
           0.8  8     500   100%   100%   100%   100%   100%   100%
           0.8  8     2000  100%   100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

ARMA (1, 1) Within-Person Residual Covariance Matrix

Results in Table 4-29 indicate that the two statistics suggested adequate model fit in almost 100% of the replications under all conditions, even when the incorrect analysis model was used.

Table 4-29. Percentage of TLI and CFI statistics that indicated adequate model fit for three LGMs with an ARMA (1, 1) within-person residual covariance matrix

                                  LGM 1         LGM 2         LGM 3
Model      ARMA       Wave  Size  TLI    CFI    TLI    CFI    CFI    TLI
Incorrect  0.2, 0.8   4     200   100%   100%   99%    100%   100%   100%
           0.2, 0.8   4     500   100%   100%   100%   100%   100%   100%
           0.2, 0.8   4     2000  100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     200   100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     500   100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     2000  100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     200   100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     500   100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     2000  100%   100%   100%   100%   100%   100%
           0.5, 0.45  8     200   100%   100%   100%   100%   100%   100%
           0.5, 0.45  8     500   100%   100%   100%   100%   100%   100%
           0.5, 0.45  8     2000  100%   100%   100%   100%   100%   100%
Correct    0.2, 0.8   4     200   100%   100%   100%   100%   100%   100%
           0.2, 0.8   4     500   100%   100%   100%   100%   100%   100%
           0.2, 0.8   4     2000  100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     200   100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     500   100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     2000  100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     200   100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     500   100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     2000  100%   100%   100%   100%   100%   100%
PAGE 110
Table 4-29. Continued.

                                  LGM 1         LGM 2         LGM 3
Model      ARMA       Wave  Size  TLI    CFI    TLI    CFI    CFI    TLI
           0.5, 0.45  8     200   100%   100%   100%   100%   100%   100%
           0.5, 0.45  8     500   100%   100%   100%   100%   100%   100%
           0.5, 0.45  8     2000  100%   100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

Summary of Results for CFI and TLI

For each of the three time series processes in the within-person residual covariance structure, CFI and TLI could not be used to differentiate between the two types of analysis models under any condition, with one exception for TLI. When the generating model included an AR (1) or an MA (1) residual covariance structure, for the LGM with a time varying covariate, TLI differentiated in at least 95% of the replications under the condition in which the number of waves was eight, the sample size was 2000 and the AR or MA parameter was 0.8.

RMSEA and SRMR

For RMSEA and SRMR, the criteria for adequate model fit are SRMR less than 0.08 and RMSEA less than 0.06. The percentage of the two statistics that met the criteria is presented in the following subsections.

AR (1) Within-Person Residual Covariance Matrix

Results in Table 4-30 indicate that the percentage of replications in which the fit of the model was considered acceptable by SRMR was 100% under all conditions for each of the three types of LGMs. When the analysis model was correct, RMSEA indicated adequate model fit in more than 90% of the replications of each condition for each of the three LGMs. When the analysis model was incorrect, results depended on the type of LGM. Under the LGM with a time invariant covariate and the LGM with a time varying covariate, RMSEA perfectly differentiated
PAGE 111
between the two analysis models when the number of waves was eight. Under the LGM with a parallel process, RMSEA differentiated between the correct and incorrect models in 100% of the replications when the number of waves was eight and the AR parameter was 0.8, and in at least 97% of the replications when the number of waves was eight, the AR parameter was 0.5 and the sample size was 500 or 2000. One unexpected finding was noticed: for the LGM with a time varying covariate and the LGM with a parallel process, with the incorrect analysis model and four waves, RMSEA became less capable of rejecting the model as the sample size became larger.

Table 4-30. Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with an AR (1) within-person residual covariance matrix

                            LGM 1          LGM 2          LGM 3
Model      AR   Wave  Size  RMSEA  SRMR   RMSEA  SRMR   RMSEA  SRMR
Incorrect  0.5  4     200   47%    100%   67%    100%   88%    100%
           0.5  4     500   43%    100%   79%    100%   99%    100%
           0.5  4     2000  29%    100%   94%    100%   100%   100%
           0.8  4     200   53%    100%   71%    100%   89%    100%
           0.8  4     500   50%    100%   82%    100%   99%    100%
           0.8  4     2000  41%    100%   97%    100%   100%   100%
           0.5  8     200   0%     100%   0%     100%   13%    100%
           0.5  8     500   0%     100%   0%     100%   3%     100%
           0.5  8     2000  0%     100%   0%     100%   0%     100%
           0.8  8     200   0%     100%   0%     100%   0%     100%
           0.8  8     500   0%     100%   0%     100%   0%     100%
           0.8  8     2000  0%     100%   0%     100%   0%     100%
Correct    0.5  4     200   91%    100%   94%    100%   98%    100%
           0.5  4     500   100%   100%   100%   100%   100%   100%
           0.5  4     2000  100%   100%   100%   100%   100%   100%
           0.8  4     200   92%    100%   95%    100%   98%    100%
           0.8  4     500   100%   100%   100%   100%   100%   100%
           0.8  4     2000  100%   100%   100%   100%   100%   100%
           0.5  8     200   100%   100%   100%   100%   100%   100%
           0.5  8     500   100%   100%   100%   100%   100%   100%
           0.5  8     2000  100%   100%   100%   100%   100%   100%
           0.8  8     200   100%   100%   100%   100%   100%   100%
PAGE 112
Table 4-30. Continued.

                            LGM 1          LGM 2          LGM 3
Model      AR   Wave  Size  RMSEA  SRMR   RMSEA  SRMR   RMSEA  SRMR
           0.8  8     500   100%   100%   100%   100%   100%   100%
           0.8  8     2000  100%   100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

MA (1) Within-Person Residual Covariance Matrix

Results in Table 4-31 indicate that, for each LGM and every condition, SRMR indicated adequate model fit in 100% of the replications. Under the LGM with a time invariant covariate and the LGM with a time varying covariate, when the analysis model was incorrect, RMSEA differentiated between the two analysis models in at least 94% of the replications when the number of waves was eight. With four waves, it did so in 95% of the replications only for the LGM with a time invariant covariate under the condition in which the MA parameter was 0.5 and the sample size was 2000. Under the LGM with a parallel process, RMSEA differentiated between the two analysis models in at least 96% of the replications when the number of waves was eight and the MA parameter was 0.8. For the LGM with a time varying covariate and the LGM with a parallel process, with the incorrect analysis model and four waves, RMSEA became less capable of indicating inadequate model fit as the sample size became larger. For the LGM with a parallel process, the same problem also occurred when the MA parameter was 0.5 and the number of waves was eight.

Table 4-31. Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with an MA (1) within-person residual covariance matrix

                            LGM 1          LGM 2          LGM 3
Model      MA   Wave  Size  RMSEA  SRMR   RMSEA  SRMR   RMSEA  SRMR
Incorrect  0.5  4     200   38%    100%   67%    100%   88%    100%
           0.5  4     500   24%    100%   78%    100%   99%    100%
           0.5  4     2000  5%     100%   95%    100%   100%   100%
           0.8  4     200   50%    100%   56%    100%   83%    100%
PAGE 113
Table 4-31. Continued.

                            LGM 1          LGM 2          LGM 3
Model      MA   Wave  Size  RMSEA  SRMR   RMSEA  SRMR   RMSEA  SRMR
           0.8  4     500   45%    100%   61%    100%   96%    100%
           0.8  4     2000  32%    100%   67%    100%   100%   100%
           0.5  8     200   0%     100%   6%     100%   46%    100%
           0.5  8     500   0%     100%   0%     100%   55%    100%
           0.5  8     2000  0%     100%   0%     100%   62%    100%
           0.8  8     200   0%     100%   0%     100%   4%     100%
           0.8  8     500   0%     100%   0%     100%   0%     100%
           0.8  8     2000  0%     100%   0%     100%   0%     100%
Correct    0.5  4     200   89%    100%   92%    100%   98%    100%
           0.5  4     500   99%    100%   99%    100%   100%   100%
           0.5  4     2000  100%   100%   100%   100%   100%   100%
           0.8  4     200   90%    100%   93%    100%   98%    100%
           0.8  4     500   100%   100%   99%    100%   100%   100%
           0.8  4     2000  100%   100%   100%   100%   100%   100%
           0.5  8     200   100%   100%   100%   100%   99%    100%
           0.5  8     500   100%   100%   100%   100%   100%   100%
           0.5  8     2000  100%   100%   100%   100%   100%   100%
           0.8  8     200   99%    100%   100%   100%   100%   100%
           0.8  8     500   100%   100%   100%   100%   100%   100%
           0.8  8     2000  100%   100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.

ARMA (1, 1) Within-Person Residual Covariance Matrix

Results in Table 4-32 indicate that, for each LGM and every condition, SRMR indicated adequate model fit in 100% of the replications. For the LGM with a time varying covariate and the LGM with a parallel process, with the correct analysis model, RMSEA indicated adequate model fit in more than 93% of the replications. When the analysis model was incorrect, RMSEA failed to reject the model in at least 45% of the replications for the LGM with a time varying covariate and in at least 89% of the replications for the LGM with a parallel process. For the LGM with a time invariant covariate, when the analysis model was correct, RMSEA indicated adequate model fit in at least 78% of the replications. When the analysis model was
PAGE 114
incorrect, RMSEA could detect the model misspecification only under conditions in which the number of waves was eight and the ARMA parameter values were 0.5 and 0.45, where RMSEA rejected the model in at least 98% of the replications.

Table 4-32. Percentage of RMSEA and SRMR statistics that indicated adequate model fit for three LGMs with an ARMA (1, 1) within-person residual covariance matrix

                                  LGM 1          LGM 2          LGM 3
Model      ARMA       Wave  Size  RMSEA  SRMR   RMSEA  SRMR   RMSEA  SRMR
Incorrect  0.2, 0.8   4     200   89%    100%   83%    100%   94%    100%
           0.2, 0.8   4     500   99%    100%   96%    100%   100%   100%
           0.2, 0.8   4     2000  100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     200   72%    100%   93%    100%   97%    100%
           0.5, 0.45  4     500   83%    100%   100%   100%   100%   100%
           0.5, 0.45  4     2000  97%    100%   100%   100%   100%   100%
           0.2, 0.8   8     200   99%    100%   45%    100%   89%    100%
           0.2, 0.8   8     500   100%   100%   50%    100%   100%   100%
           0.2, 0.8   8     2000  100%   100%   51%    100%   100%   100%
           0.5, 0.45  8     200   2%     100%   100%   100%   100%   100%
           0.5, 0.45  8     500   0%     100%   100%   100%   100%   100%
           0.5, 0.45  8     2000  0%     100%   100%   100%   100%   100%
Correct    0.2, 0.8   4     200   89%    100%   90%    100%   96%    100%
           0.2, 0.8   4     500   99%    100%   99%    100%   100%   100%
           0.2, 0.8   4     2000  100%   100%   100%   100%   100%   100%
           0.5, 0.45  4     200   85%    100%   93%    100%   98%    100%
           0.5, 0.45  4     500   98%    100%   100%   100%   100%   100%
           0.5, 0.45  4     2000  100%   100%   100%   100%   100%   100%
           0.2, 0.8   8     200   100%   100%   98%    100%   98%    100%
           0.2, 0.8   8     500   100%   100%   98%    100%   100%   100%
           0.2, 0.8   8     2000  100%   100%   99%    100%   100%   100%
           0.5, 0.45  8     200   80%    100%   100%   100%   100%   100%
           0.5, 0.45  8     500   79%    100%   100%   100%   100%   100%
           0.5, 0.45  8     2000  78%    100%   100%   100%   100%   100%

Note: LGM 1, LGM 2 and LGM 3 represent the LGM with a time invariant covariate, the LGM with a time varying covariate and the LGM with a parallel process, respectively.
PAGE 115
For each LGM, when the analysis model was incorrect, the ability of RMSEA to reject the model tended to decrease as the sample size increased for all conditions except when the ARMA parameter values were 0.5 and 0.45 and the number of waves was eight.

Summary of Results of SRMR and RMSEA

The above results indicate that SRMR indiscriminately suggested adequate model fit in 100% of the replications under all conditions for each of the three types of LGMs with each of the three within-person residual covariance structures.

The performance of RMSEA depended both on the type of within-person error structure and on the type of LGM. When the generating model included an AR (1) error structure, RMSEA perfectly differentiated between the two analysis models for all three LGMs when the number of waves was eight and the AR parameter was 0.8, but when the number of waves was eight and the AR parameter was 0.5 it performed perfectly only for the LGM with a time invariant covariate and the LGM with a time varying covariate. When the generating model included an MA (1) error structure, for the LGM with a time invariant covariate and the LGM with a time varying covariate, RMSEA was able to differentiate between the two types of analysis models with good accuracy under conditions in which the number of waves was eight. For the LGM with a parallel process, RMSEA performed well in differentiation only under conditions in which the number of waves was eight and the MA parameter was 0.8. When the within-person residual covariance structure followed an ARMA (1, 1) structure, RMSEA could be used to detect the model misspecification only under conditions in which the number of waves was eight and the ARMA parameter values were 0.5 and 0.45, and only for the LGM with a time invariant covariate.
PAGE 116
It was noticed that with each type of within-person covariance structure, under certain conditions in which the incorrect analysis model was employed, RMSEA became less capable of rejecting the model as the sample size increased.

Summary of GOF Test and GOF Indexes

Regarding whether the GOF test and fit indexes can be used to differentiate the correct analysis model from the incorrect analysis model, CFI and SRMR were found not to be able to detect model misspecification under any condition. For the others, performance depended on the type of within-person residual covariance structure included in the generating model and on the type of LGM.

When the within-person residual covariance structure was an AR (1) or an MA (1) process, the GOF test could be used for model differentiation for all LGMs as long as the sample size was 2000 or the number of waves was eight. TLI could differentiate between the correct analysis model and the incorrect analysis model only for the LGM with a time varying covariate under conditions in which the number of waves was eight, the sample size was 2000 and the AR or MA parameter was 0.8. RMSEA could be used in model selection with an AR (1) error structure for all three LGMs under conditions in which the number of waves was eight, although it performed less well for the LGM with a parallel process than for the other two LGMs. With an MA (1) error structure, for the LGM with a time invariant covariate and the LGM with a time varying covariate, RMSEA could be used to differentiate between the two analysis models when the number of waves was eight. For the LGM with a parallel process, RMSEA could be used under conditions in which the number of waves was eight and the MA parameter was 0.8.

When the within-person residual covariance structure was an ARMA (1, 1), the p value could not be used for model selection. TLI suggested adequate model fit no matter what type of
PAGE 117
analysis model was used and therefore was not recommended. RMSEA could be used for model selection only for the LGM with a time invariant covariate and under conditions in which the number of waves was eight and the ARMA parameter values were 0.5 and 0.45.

Some unexpected results were found. With an ARMA (1, 1) error structure, the Type I error rate tended to increase as the sample size increased when the parameter values were 0.5 and 0.45. For RMSEA, it was noticed that under certain conditions, for each type of within-person covariance structure, RMSEA became less capable of rejecting the misspecified model as the sample size increased.
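One way to look at the RMSEA pattern noted above is through the usual formula for the index. The sketch below is an illustration under stated assumptions, not the author's explanation and not the code used in the study. It assumes the common definition RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1))) (some software divides by N instead), and it approximates the expected chi-square of a misspecified model as df + (N - 1) * F0, where F0 is a hypothetical population discrepancy. The values of F0 and df are invented for the example.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA from a chi-square statistic, its degrees of freedom and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def expected_chi2(f0, df, n):
    """Approximate expected chi-square under a fixed population discrepancy f0."""
    return df + (n - 1) * f0


F0 = 0.01   # hypothetical population misfit of the misspecified model
DF = 5      # hypothetical degrees of freedom

for n in (200, 500, 2000):
    chi2 = expected_chi2(F0, DF, n)
    # The chi-square grows with n, while RMSEA stays near sqrt(F0 / DF).
    print(n, round(chi2, 2), round(rmsea(chi2, DF, n), 4))
```

Under these assumptions the chi-square statistic grows with the sample size while RMSEA stays near sqrt(F0 / df). If that value lies below the 0.06 cutoff, a larger sample reduces the sampling noise around it, so RMSEA would signal adequate fit more consistently. Whether this accounts for the pattern in Tables 4-30 to 4-32 is not established here.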
PAGE 118
CHAPTER 5
DISCUSSION AND CONCLUSION

Although correlated errors are often found in longitudinal data, current practice in LGM normally assumes that the within-person errors are uncorrelated. As the literature review has indicated, misspecification of the error structure affects certain parameter estimates in LGM. Therefore, the consequences of this model misspecification deserve methodologists' attention. Within the framework of SEM, no study has systematically investigated the impact when the within-person residual covariance structure follows one of the three commonly encountered time series processes but the analysis model fails to include that process. Furthermore, previous studies of model misspecification within SEM were mostly conducted on unconditional LGMs. Therefore, this study specifically investigated the consequences of misspecifying the within-person error structure under three commonly used conditional LGMs, with the aim of making the results more generalizable.

General Conclusions and Discussions

The results of this study have shown that when the analysis model failed to include one of the three types of time series processes in the within-person residual covariance structure, the fixed parameters and their accompanying standard error estimates under any of the three conditional LGMs were unaffected. This conclusion is consistent with what was found in previous studies (e.g., Yuan & Bentler, 2004, 2006; Ferron et al., 2002). Furthermore, the model misspecification did not bias the standard error estimates of the variance components, but did bias the estimates of the variance components under some conditions. However, the variance components and their accompanying standard error estimates were unexpectedly biased under some conditions when the analysis model was correct.
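For a concrete picture of the three time series processes, the sketch below builds the within-person residual covariance matrix implied by a stationary ARMA (1, 1) process, with AR (1) and MA (1) as special cases and uncorrelated errors as the case with both parameters set to zero. This is an illustration only. The parameterization e_t = phi * e_(t-1) + a_t + theta * a_(t-1), the unit innovation variance, the function name, and the assignment of 0.5 to the AR term and 0.45 to the MA term are all assumptions, and the parameterization used in the study may differ.

```python
def arma11_cov(n_waves, phi=0.0, theta=0.0, sigma2=1.0):
    """Covariance matrix of n_waves residuals from a stationary ARMA(1, 1) process.

    Assumed model: e_t = phi * e_(t-1) + a_t + theta * a_(t-1),
    where a_t is white noise with variance sigma2.
    """
    if n_waves < 1:
        raise ValueError("n_waves must be at least 1")
    if abs(phi) >= 1.0:
        raise ValueError("phi must satisfy |phi| < 1 for stationarity")
    gamma = [0.0] * n_waves
    # Lag-0 autocovariance (the residual variance).
    gamma[0] = sigma2 * (1.0 + 2.0 * phi * theta + theta ** 2) / (1.0 - phi ** 2)
    if n_waves > 1:
        # Lag-1 autocovariance.
        gamma[1] = sigma2 * (1.0 + phi * theta) * (phi + theta) / (1.0 - phi ** 2)
    # Higher lags decay geometrically at rate phi.
    for lag in range(2, n_waves):
        gamma[lag] = phi * gamma[lag - 1]
    return [[gamma[abs(i - j)] for j in range(n_waves)] for i in range(n_waves)]


ar1 = arma11_cov(4, phi=0.5)                 # AR (1): theta = 0
ma1 = arma11_cov(4, theta=0.5)               # MA (1): phi = 0
arma = arma11_cov(4, phi=0.5, theta=0.45)    # ARMA (1, 1), assumed assignment
uncorrelated = arma11_cov(4)                 # diagonal: errors assumed uncorrelated

for row in ar1:
    print([round(value, 3) for value in row])
```

The last matrix is diagonal, which corresponds to the assumption of uncorrelated within-person errors described above. The other three show the off-diagonal covariances that an analysis model ignores when it omits the time series process.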
With an AR(1) or an MA(1) within-person error structure included in the generating model, some unexpected results were found only in the estimates of . It was shown that although all the biases of  obtained with the incorrect analysis model were unacceptable, some biases were also unacceptable with the correct analysis model. For the standard error estimates of , it was found that, with an AR(1) error structure, using the incorrect analysis model did not lead to any unacceptable biases, but using the correct analysis model caused some unacceptable biases.

When the generating model included an ARMA(1, 1) process, whether the biases of the variance components were acceptable depended mainly on the ARMA parameters, and the results are difficult to interpret using the analysis model type alone. ARMA parameter values of 0.5 and 0.45 always led to acceptable biases of the estimates of  and , no matter what type of analysis model was used. However, for the estimates of , ARMA parameter values of 0.5 and 0.45 resulted in unacceptable bias only with the correct analysis model. With the correct analysis model, ARMA parameter values of 0.2 and 0.8 caused some unacceptable biases in the estimates of all three variance components. For the standard error estimates of the three variance components, failing to include an ARMA(1, 1) process in the error structure did not lead to biased estimates, but including an ARMA(1, 1) process in the model specification caused some unacceptable biases for all three.

In summary, for the variance component estimates, although the incorrect analysis model caused some biased estimates, the correct analysis model also caused some biases, and under certain conditions only the correct analysis model caused unacceptable bias. For the standard error estimates of the variance components with an AR(1) or an ARMA(1, 1) residual structure, only the correct analysis model caused biased estimates. All the biases of the standard error estimates of the variance components were acceptable when the analysis model was misspecified.
Therefore, it deserves attention to investigate why the correct analysis model caused these unexpectedly biased estimates. As more unexpected biases were found with an ARMA(1, 1) error structure, estimates with an ARMA(1, 1) structure were examined, and the findings also applied to an AR(1) or an MA(1) structure. As the analysis model including the ARMA(1, 1) process resulted in a high rate of non-positive definite covariance matrices (see Table 4-2), a first thought was to investigate whether the occurrence of non-positive definite matrices caused these unexpected estimates. Therefore, new data sets were created by removing all replications with a negative variance or a correlation greater than or equal to one and supplementing the data with replications without non-positive definite covariance matrices. The biases of the variance components and their standard error estimates under the LGM with a parallel process, obtained separately with the original data sets and the new data sets, are given as an illustration (see Table 5-1 to Table 5-6). Only the results for the LGM with a parallel process with an ARMA(1, 1) error structure are reported; similar results were found for the other two LGMs and are therefore not reported here. The results displayed in Table 5-1 to Table 5-3 indicate that the estimates of the variance components were almost the same with or without removing the non-positive definite covariance matrices.

Table 5-1. Biases of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

ARMA       Model      Wave  Size  Data 1   Data 2   Data 3
0.2, 0.8   Incorrect  4     200   0.353    0.322    0.353
0.2, 0.8   Incorrect  4     500   0.345    0.338    0.345
0.2, 0.8   Incorrect  4     2000  0.338    0.338    0.338
0.2, 0.8   Incorrect  8     200   0.221    0.221    0.221
0.2, 0.8   Incorrect  8     500   0.214    0.214    0.214
0.2, 0.8   Incorrect  8     2000  0.210    0.210    0.210
0.2, 0.8   Correct    4     200   0.263    0.254    0.263
0.2, 0.8   Correct    4     500   0.251    0.251    0.251
0.2, 0.8   Correct    4     2000  -0.241   -0.241   -0.241
0.2, 0.8   Correct    8     200   0.102    0.102    0.102
0.2, 0.8   Correct    8     500   0.087    0.087    0.087
0.2, 0.8   Correct    8     2000  0.071    0.071    0.071
0.5, 0.45  Incorrect  4     200   0.025    0.025    0.025
0.5, 0.45  Incorrect  4     500   0.037    0.037    0.037
0.5, 0.45  Incorrect  4     2000  0.041    0.041    0.041
0.5, 0.45  Incorrect  8     200   0.015    0.015    0.015
0.5, 0.45  Incorrect  8     500   0.027    0.027    0.027
0.5, 0.45  Incorrect  8     2000  0.031    0.031    0.031
0.5, 0.45  Correct    4     200   0.054    0.006    0.021
0.5, 0.45  Correct    4     500   0.021    0.022    0.011
0.5, 0.45  Correct    4     2000  0.003    0.020    0.019
0.5, 0.45  Correct    8     200   0.019    0.009    0.010
0.5, 0.45  Correct    8     500   0.010    0.006    0.004
0.5, 0.45  Correct    8     2000  0.007    0.007    0.006

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.

Table 5-2. Biases of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

ARMA       Model      Wave  Size  Data 1   Data 2   Data 3
0.2, 0.8   Incorrect  4     200   0.104    0.092    0.104
0.2, 0.8   Incorrect  4     500   0.087    0.085    0.087
0.2, 0.8   Incorrect  4     2000  0.079    0.079    0.079
0.2, 0.8   Incorrect  8     200   0.030    0.030    0.030
0.2, 0.8   Incorrect  8     500   0.019    0.019    0.019
0.2, 0.8   Incorrect  8     2000  0.014    0.014    0.014
0.2, 0.8   Correct    4     200   -0.054   0.048    -0.054
0.2, 0.8   Correct    4     500   0.029    0.028    0.029
0.2, 0.8   Correct    4     2000  0.024    0.024    0.024
0.2, 0.8   Correct    8     200   0.024    0.024    0.024
0.2, 0.8   Correct    8     500   0.013    0.013    0.013
0.2, 0.8   Correct    8     2000  0.007    0.007    0.007
0.5, 0.45  Incorrect  4     200   0.024    0.024    0.024
0.5, 0.45  Incorrect  4     500   0.003    0.003    0.003
0.5, 0.45  Incorrect  4     2000  0.005    0.005    0.005
0.5, 0.45  Incorrect  8     200   0.016    0.016    0.016
0.5, 0.45  Incorrect  8     500   0.004    0.004    0.004
0.5, 0.45  Incorrect  8     2000  0.000    0.000    0.000
0.5, 0.45  Correct    4     200   0.039    0.034    0.038
0.5, 0.45  Correct    4     500   0.014    0.012    0.013
0.5, 0.45  Correct    4     2000  0.003    0.003    0.003
0.5, 0.45  Correct    8     200   0.019    0.019    0.019
0.5, 0.45  Correct    8     500   0.007    0.007    0.007
0.5, 0.45  Correct    8     2000  0.001    0.001    0.001

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.

Table 5-3. Biases of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

ARMA       Model      Wave  Size  Data 1   Data 2   Data 3
0.2, 0.8   Incorrect  4     200   0.220    0.184    0.220
0.2, 0.8   Incorrect  4     500   0.229    0.219    0.229
0.2, 0.8   Incorrect  4     2000  0.235    0.235    0.235
0.2, 0.8   Incorrect  8     200   0.066    0.066    0.066
0.2, 0.8   Incorrect  8     500   0.075    0.075    0.075
0.2, 0.8   Incorrect  8     2000  0.076    0.076    0.076
0.2, 0.8   Correct    4     200   0.031    0.015    0.031
0.2, 0.8   Correct    4     500   0.042    0.041    0.042
0.2, 0.8   Correct    4     2000  0.054    0.054    0.054
0.2, 0.8   Correct    8     200   0.021    0.021    0.021
0.2, 0.8   Correct    8     500   0.026    0.026    0.026
0.2, 0.8   Correct    8     2000  0.027    0.027    0.027
0.5, 0.45  Incorrect  4     200   0.032    0.032    0.032
0.5, 0.45  Incorrect  4     500   0.023    0.023    0.023
0.5, 0.45  Incorrect  4     2000  0.024    0.024    0.024
0.5, 0.45  Incorrect  8     200   0.024    0.024    0.024
0.5, 0.45  Incorrect  8     500   0.017    0.017    0.017
0.5, 0.45  Incorrect  8     2000  0.015    0.015    0.015
0.5, 0.45  Correct    4     200   0.001    0.014    0.005
0.5, 0.45  Correct    4     500   0.002    0.005    0.004
0.5, 0.45  Correct    4     2000  0.001    0.000    0.001
0.5, 0.45  Correct    8     200   0.016    0.017    0.017
0.5, 0.45  Correct    8     500   0.011    0.011    0.011
0.5, 0.45  Correct    8     2000  0.009    0.009    0.009

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.
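The screening rule used to build Data 2 (drop a replication with a negative variance estimate or an implied correlation of one or more) can be sketched as follows. The function name, the intercept and slope labels, and the three example replications are hypothetical. Applying the correlation criterion to the absolute value, and treating a zero variance as inadmissible, are assumptions that the text does not spell out.

```python
import math

def is_admissible(var_intercept: float, var_slope: float, cov: float) -> bool:
    """True if both variances are positive and the implied correlation lies in (-1, 1)."""
    if var_intercept <= 0.0 or var_slope <= 0.0:
        return False
    corr = cov / math.sqrt(var_intercept * var_slope)
    return abs(corr) < 1.0

# Hypothetical replications: (intercept variance, slope variance, covariance).
replications = [(100.0, 25.0, 10.0), (100.0, -3.0, 5.0), (4.0, 1.0, 2.5)]
kept = [r for r in replications if is_admissible(*r)]
print(kept)  # only the first replication passes both checks
```

In this toy input the second replication fails on the negative variance and the third on an implied correlation of 1.25, which are the two symptoms of a non-positive definite matrix named in the text.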
The results presented in Table 5-4 to Table 5-6 indicate that there was barely any difference between the two sets of biases for the standard error estimates of  and . When the replications with non-positive definite matrices were removed, the relative biases of the standard error estimates of  obtained with the correct analysis model and ARMA parameter values of 0.5 and 0.45 were still unacceptable. Therefore, the occurrence of non-positive definite matrices could not explain those unexpected biases. Moreover, it was found that under some conditions (e.g., the sample size was 200, the ARMA parameters were 0.2 and 0.8, and the number of waves was 4), even though the rate of occurrence of non-positive definite matrices with the incorrect analysis model (16%) was higher than that with the correct analysis model (7%), the biases obtained with the incorrect analysis model were still acceptable (see Table 4-2 and Table 4-17). Furthermore, Leite (2007) found that removing replications with non-positive definite matrices changes the normal distribution of the variance component estimates to a skewed distribution. Therefore, the unexpected findings could not be attributed to the occurrence of non-positive definite covariance matrices.

Table 5-4. Biases of standard error estimates of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

Model      ARMA       Wave  Size  Data 1   Data 2   Data 3
Incorrect  0.2, 0.8   4     200   0.004    0.092    0.004
Incorrect  0.2, 0.8   4     500   0.006    0.031    0.006
Incorrect  0.2, 0.8   4     2000  0.017    0.018    0.017
Incorrect  0.2, 0.8   8     200   0.011    0.011    0.011
Incorrect  0.2, 0.8   8     500   0.000    0.000    0.000
Incorrect  0.2, 0.8   8     2000  0.011    0.011    0.011
Incorrect  0.5, 0.45  4     200   0.000    0.000    0.000
Incorrect  0.5, 0.45  4     500   0.006    0.006    0.006
Incorrect  0.5, 0.45  4     2000  0.024    0.024    0.024
Incorrect  0.5, 0.45  8     200   0.016    0.016    0.016
Incorrect  0.5, 0.45  8     500   0.004    0.004    0.004
Incorrect  0.5, 0.45  8     2000  0.019    0.019    0.019
Correct    0.2, 0.8   4     200   0.130    0.112    0.063
Correct    0.2, 0.8   4     500   0.251    0.250    0.251
Correct    0.2, 0.8   4     2000  0.588    0.588    0.588
Correct    0.2, 0.8   8     200   0.207    0.207    0.207
Correct    0.2, 0.8   8     500   -0.346   -0.346   -0.346
Correct    0.2, 0.8   8     2000  0.604    0.604    0.604
Correct    0.5, 0.45  4     200   0.793    0.860    0.561
Correct    0.5, 0.45  4     500   0.891    0.892    0.827
Correct    0.5, 0.45  4     2000  0.548    1.598    1.033
Correct    0.5, 0.45  8     200   1.136    1.199    0.536
Correct    0.5, 0.45  8     500   1.103    1.259    0.782
Correct    0.5, 0.45  8     2000  0.791    0.797    0.625

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.

Table 5-5. Biases of standard error estimates of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

Model      ARMA       Wave  Size  Data 1   Data 2   Data 3
Incorrect  0.2, 0.8   4     200   0.004    0.004    0.004
Incorrect  0.2, 0.8   4     500   0.011    0.011    0.011
Incorrect  0.2, 0.8   4     2000  0.003    0.003    0.003
Incorrect  0.2, 0.8   8     200   0.006    0.006    0.006
Incorrect  0.2, 0.8   8     500   0.019    0.019    0.019
Incorrect  0.2, 0.8   8     2000  0.006    0.006    0.006
Incorrect  0.5, 0.45  4     200   0.007    0.007    0.007
Incorrect  0.5, 0.45  4     500   0.002    0.002    0.002
Incorrect  0.5, 0.45  4     2000  0.016    0.016    0.016
Incorrect  0.5, 0.45  8     200   0.007    0.007    0.007
Incorrect  0.5, 0.45  8     500   0.003    0.003    0.003
Incorrect  0.5, 0.45  8     2000  0.031    0.031    0.031
Correct    0.2, 0.8   4     200   0.020    0.019    0.020
Correct    0.2, 0.8   4     500   0.046    0.046    0.046
Correct    0.2, 0.8   4     2000  0.178    0.178    0.178
Correct    0.2, 0.8   8     200   0.003    0.003    0.003
Correct    0.2, 0.8   8     500   0.018    0.018    0.018
Correct    0.2, 0.8   8     2000  0.044    0.044    0.044
Correct    0.5, 0.45  4     200   0.024    0.023    0.025
Correct    0.5, 0.45  4     500   0.030    0.028    0.029
Correct    0.5, 0.45  4     2000  0.043    0.040    0.043
Correct    0.5, 0.45  8     200   0.007    0.006    0.005
Correct    0.5, 0.45  8     500   0.001    0.000    0.030
Correct    0.5, 0.45  8     2000  0.020    0.020    0.019

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.

Table 5-6. Biases of standard error estimates of  obtained with three data sets for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

Model      ARMA       Wave  Size  Data 1   Data 2   Data 3
Incorrect  0.2, 0.8   4     200   0.007    0.007    0.007
Incorrect  0.2, 0.8   4     500   0.009    0.009    0.009
Incorrect  0.2, 0.8   4     2000  0.013    0.013    0.013
Incorrect  0.2, 0.8   8     200   0.000    0.000    0.000
Incorrect  0.2, 0.8   8     500   0.005    0.005    0.005
Incorrect  0.2, 0.8   8     2000  0.001    0.001    0.001
Incorrect  0.5, 0.45  4     200   0.012    0.012    0.012
Incorrect  0.5, 0.45  4     500   0.004    0.004    0.004
Incorrect  0.5, 0.45  4     2000  0.011    0.011    0.011
Incorrect  0.5, 0.45  8     200   0.020    0.020    0.020
Incorrect  0.5, 0.45  8     500   0.009    0.009    0.009
Incorrect  0.5, 0.45  8     2000  0.016    0.016    0.016
Correct    0.2, 0.8   4     200   0.026    0.021    0.025
Correct    0.2, 0.8   4     500   0.070    0.070    0.070
Correct    0.2, 0.8   4     2000  0.262    0.262    0.262
Correct    0.2, 0.8   8     200   0.039    0.040    0.039
Correct    0.2, 0.8   8     500   0.088    0.088    0.088
Correct    0.2, 0.8   8     2000  0.240    0.240    0.240
Correct    0.5, 0.45  4     200   0.085    0.075    0.083
Correct    0.5, 0.45  4     500   0.084    0.083    0.084
Correct    0.5, 0.45  4     2000  0.087    0.085    0.089
Correct    0.5, 0.45  8     200   0.034    0.036    0.034
Correct    0.5, 0.45  8     500   0.011    0.012    0.011
Correct    0.5, 0.45  8     2000  0.041    0.041    0.041

Note: Data 1 refers to the original data. Data 2 refers to the data after deleting replications with a non-positive definite covariance matrix. Data 3 refers to the data with the 0.4% extreme values trimmed.
To find out why the correct analysis model resulted in worse estimates than the incorrect analysis model, further investigation was conducted. The estimates of the standard error of  ranged from 2.74 to 19799.94 across a total of 120,000 replications. The frequency table of these estimates showed that 119,572 estimates fell in the range of 0 to 499, which accounted for 99.6% of all estimates (see Table 5-7). The remaining 428 estimates (0.4% of the total) were greater than 500 and varied substantially in value. It should be noted that the largest empirical standard error of  was only 47.29 (under the condition in which the sample size was 200, the number of waves was 4, the ARMA parameters were 0.5 and 0.45, and the analysis model was correct). Therefore, around 0.4% of the estimates of the standard error of  were more than 10 times larger than the empirical standard error, and these included many extreme values. Among the 428 standard error estimates of , 42% were obtained with a non-positive definite covariance matrix and 100% were obtained with the correct analysis model.

Table 5-7. The frequency table for the standard error estimates of  for LGM with a parallel process with an ARMA(1, 1) within-person residual covariance matrix

Standard error estimate  Frequency
0-499                    119572
500-1499                 317
1500-2499                51
2500-3499                20
3500-4499                17
4500-5499                6
5500-6499                3
6500-7499                4
7500-8499                2
8500-9499                1
9500-10499               2
10500-11499              3
11500-12499              1
19500-20499              1
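The binning behind a frequency table such as Table 5-7 can be sketched as below. The bin edges (0 to 499, then 1,000-wide bins starting at 500) follow the table. Assigning an estimate to a bin after rounding it to the nearest integer is an assumption about how the boundaries were handled, and the input values are made up to span the reported range.

```python
from collections import Counter

def bin_label(estimate: float) -> str:
    """Label of the bin that contains a non-negative standard error estimate."""
    value = round(estimate)
    if value < 500:
        return "0-499"
    lower = 500 + 1000 * ((value - 500) // 1000)
    return f"{lower}-{lower + 999}"

def frequency_table(estimates: list[float]) -> Counter:
    """Count how many estimates fall into each bin."""
    return Counter(bin_label(e) for e in estimates)

# Made-up estimates spanning the reported range (2.74 to 19799.94).
sample = [2.74, 30.0, 512.0, 1499.2, 1600.0, 19799.94]
table = frequency_table(sample)
print(table)

# Share of estimates outside the first bin (0.4% in the study, far larger in this toy sample).
share_extreme = sum(n for label, n in table.items() if label != "0-499") / len(sample)
print(share_extreme)
```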
These observations indicated that all of the extreme values were associated with the correct analysis model. Therefore, it is suspected that the extreme values caused the unexpected findings. It is worth mentioning that among all the estimates greater than 1500 (a total of 111 estimates), 72% were accompanied by a non-positive definite covariance matrix. These observations indicate that there might be a relationship among the occurrence of non-positive definite matrices, the extreme values, and the correct analysis model, which deserves future investigation.

Regarding the standard error estimates of  and , the estimates did not vary as much as the standard error estimates of . The variations of the standard error estimates of ,  and , measured by the variance of these standard error estimates, were 22506.37, 6.42 and 5.44, respectively. The ranges of the standard error estimates of  and  were from 1.83 to 313.81 and from 1.79 to 40.28, respectively. Therefore, although there were some unexpected biases for the standard error estimates of  and  with an ARMA(1, 1) error structure, the severity, in terms of the number of unacceptable biases and the magnitude of the biases, was much less than that for the standard error estimates of .

Based on the above observations regarding the extreme values and the variation of the estimates, it is suspected that the unexpected findings were attributable to the extreme values. Therefore, the 0.4% of estimates that were greater than 500 were trimmed from the original data, and the biases obtained with the trimmed data were calculated. The results in Table 5-4 indicate that, on average, the standard error estimates of  obtained with the trimmed data were better than those obtained with the original data. The relative mean biases of the standard error estimates for  and  obtained with the trimmed data were almost no different from those obtained with the original data (see Table 5-5 and Table 5-6). This is the expected result, since the removal of the extreme values substantially affected the range of the standard error estimates of  but barely changed the ranges of the standard error estimates of  and . The results for the biases of the variance component estimates with the trimmed data are also presented (see Table 5-1 to Table 5-3). The variances of the estimates of ,  and  across all the replications were 407.82, 36.31 and 44.29. There were not as many extreme values as there were for the estimates of . Therefore, the removal of extreme values barely changed the bias estimates.

Starting values were also imposed to see whether the estimates with complex covariance structures could be improved. Besides the unexpected biases occurring with the correct analysis model, other unexpected results were also found. Special interest was placed on the estimates of , as generally more problems occurred with  than with the other two variance component estimates. For example, with an AR(1) error structure, the biases of the standard error estimates of  obtained with the correct analysis model fluctuated substantially under certain conditions when the sample size was 200 or 500: they changed from a positive value to a negative value, or their magnitude increased with the increase of sample size (see Table 4-17). The population values of the three variance components were imposed as starting values to see whether the estimates improved. Results for the biases of the standard error of  with sample sizes of 200 and 500 and with four waves, under the LGM with a time-invariant covariate with an AR(1) covariance structure, are presented in Table 5-8 as an illustration. Note that the irregularity did not change; that is, the biases still changed from negative to positive under some conditions, and the magnitude of the biases increased with an increase of sample size.
Table 5-8. Biases of standard error of  estimates obtained with and without imposing starting values for LGM with a time-invariant covariate with an AR(1) within-person covariance matrix

Model    AR   Size  Wave  Original  New
Correct  0.5  200   4     0.146     0.069
Correct  0.5  500   4     0.435     1.927
Correct  0.8  200   4     2.245     0.24
Correct  0.8  500   4     1.136     1.193

Note: Original refers to the original data; New refers to results obtained with the population values imposed as starting values.

Furthermore, there still existed many extreme values of the standard error estimates of . In the original data without the starting values imposed, the standard error estimates of  ranged from 4.98 to 52599. In the new data set with the starting values imposed, and without including estimates obtained with a sample size of 2000, the range was from 7.33 to 40348. The variations of the standard error estimates of ,  and  were 1044.32, 2.25 and 2.06, respectively, in the original data, and 1084.27, 1.61 and 1.45, respectively, in the new data. The variation of the estimates of the standard error of  was still substantially larger than those of the other two. To inspect whether removing the extreme values could remove the unexpected findings, estimates with a value greater than 500 were trimmed based on the frequency table (see Table 5-9); these amounted to 1,282 estimates, about 1% of all estimates.

Table 5-9. The frequency table for the standard error estimates of  under LGM with a time-invariant covariate with an AR(1) within-person covariance matrix

Standard error estimate  Frequency
0-499                    118718
500-1499                 708
1500-2499                186
2500-3499                90
3500-4499                67
4500-5499                37
5500-6499                39
6500-7499                27
7500-8499                19
8500-9499                20
9500-10499               15
10500-11499              11
11500-12499              10
12500-13499              10
13500-14499              11
14500-15499              7
15500-16499              2
16500-17499              3
17500-27499              16
31500-33499              2
41500-42499              1
52500-53499              1

The estimates obtained with the trimmed data improved in many respects (see Table 5-10). With the trimmed data, the number of unacceptable biases decreased, the average magnitude of the unacceptable biases decreased, there was no longer an unexpected change in the sign of the biases, and biases no longer increased in magnitude with the increase of the sample size.

Table 5-10. Biases of standard error estimates of  obtained with two data sets for LGM with a time-invariant covariate with an AR(1) within-person residual covariance matrix

Model      AR   Wave  Size  Original  Trimmed
Incorrect  0.5  4     200   0.014     0.014
Incorrect  0.5  4     500   0.008     0.008
Incorrect  0.5  4     2000  0.016     0.016
Incorrect  0.5  8     200   0.014     0.014
Incorrect  0.5  8     500   0.016     0.016
Incorrect  0.5  8     2000  0.007     0.007
Incorrect  0.8  4     200   0.015     0.015
Incorrect  0.8  4     500   0.009     0.009
Incorrect  0.8  4     2000  0.012     0.012
Incorrect  0.8  8     200   0.002     0.002
Incorrect  0.8  8     500   0.018     0.018
Incorrect  0.8  8     2000  0.010     0.010
Correct    0.5  4     200   0.146     0.114
Correct    0.5  4     500   0.435     0.066
Correct    0.5  4     2000  0.025     0.025
Correct    0.5  8     200   0.013     0.013
Correct    0.5  8     500   0.024     0.024
Correct    0.5  8     2000  0.015     0.015
Correct    0.8  4     200   2.245     1.021
Correct    0.8  4     500   1.136     0.794
Correct    0.8  4     2000  0.07      0.206
Correct    0.8  8     200   0.101     0.140
Correct    0.8  8     500   0.193     0.076
Correct    0.8  8     2000  0.004     0.004

Note: Original refers to the original data; Trimmed refers to the data with extreme values trimmed.

Based on the above findings, it is suspected that the extreme values caused the unexpected biases, since removing the extreme values on average reduced the magnitude of the unacceptable biases associated with the correct analysis model. As an analysis model including an ARMA(1, 1) or an AR(1) within-person residual covariance structure is substantially more complex than the misspecified models, model complexity could be a factor determining the occurrence of extreme standard errors. The decrease in convergence rate with increasing complexity of the analysis models is further evidence (see Table 4-1). Although the inclusion of an ARMA(1, 1) within-person residual covariance matrix within the framework of HLM analyzed with SAS was found to result in better estimates of random effects and smaller accompanying standard error estimates (Kwok et al., 2007), this did not hold in the framework of LGM in this study. The different results might be due to the use of different software. As mentioned before, the ARMA process is less frequently encountered in social science. Based on the above discussion, researchers are not recommended to include the ARMA(1, 1) process in the within-person residual covariance structure, because of the estimation difficulty with current SEM software.
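The effect of trimming on the relative bias of standard error estimates, as compared in Table 5-10, can be illustrated with a small sketch. It assumes that relative bias is defined as (mean estimated standard error - empirical standard error) / empirical standard error, with the empirical standard error taken as the standard deviation of the parameter estimates across replications. That definition, the function names, and the five made-up replications are assumptions, while the cutoff of 500 follows the text.

```python
from statistics import mean, stdev

# One replication: (parameter estimate, estimated standard error).
Replication = tuple[float, float]

def se_relative_bias(reps: list[Replication]) -> float:
    """(mean estimated SE - empirical SE) / empirical SE, under the assumed definition."""
    empirical_se = stdev([est for est, _ in reps])
    mean_se = mean([se for _, se in reps])
    return (mean_se - empirical_se) / empirical_se

def trim(reps: list[Replication], cutoff: float = 500.0) -> list[Replication]:
    """Drop replications whose estimated SE exceeds the cutoff (500 in the text)."""
    return [(est, se) for est, se in reps if se <= cutoff]

# Made-up replications: a single extreme standard error dominates the mean.
reps = [(98.0, 4.0), (102.0, 3.8), (95.0, 4.1), (105.0, 3.9), (100.0, 2000.0)]
print(se_relative_bias(reps))        # very large positive relative bias
print(se_relative_bias(trim(reps)))  # modest relative bias after trimming
```

Because the mean is not robust, one extreme standard error among otherwise reasonable estimates is enough to produce a very large relative bias, which is the mechanism suspected in the discussion above.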
Summary of Impact of Each Factor

Impact of Analysis Model Type

When the within-person residual covariance structure was an AR(1) or an MA(1) process, the analysis model type had no impact on the estimates of either the fixed parameters or their standard errors, but it did affect the variance component estimates. Under the same condition, the correct analysis model always led to better estimates than the incorrect analysis model. Except for some unacceptable biases of  caused by the correct analysis model, the correct analysis model led to unbiased estimates, and biased estimates were observed with the incorrect analysis model. With an ARMA(1, 1) error structure, the effect of the analysis model type on the variance component estimates could not be separated from the effect of the ARMA parameter values. With ARMA parameter values of 0.2 and 0.8, the absolute biases that occurred with the incorrect analysis model were higher than those that occurred with the correct analysis model. However, some biases observed with the correct analysis model were also unacceptable, which deserved further investigation and was discussed above.

Regarding the impact of the analysis model type on the standard error estimates of the variance components, the estimates were not affected by misspecification of the analysis model with any of the three types of error structure. However, unacceptable biases were observed only with the correct analysis model, which was also discussed in the previous section.

The convergence rate and the occurrence rate of non-positive definite matrices were affected by the analysis model type. Whenever the analysis model was incorrect, all the estimates converged. The correct analysis model resulted in a substantially lower convergence rate with an ARMA(1, 1) error structure (as low as 52%) or with an AR(1) error structure (as low as 74%), but had little effect on the convergence rate with an MA(1) structure. These results were expected, as among the three types of residual covariance structure the most complex was the ARMA(1, 1) error structure and the least complex was the MA(1) structure. The complexity of the residual covariance structure added to the difficulty of estimation.

The impact of the analysis model type on the occurrence of non-positive definite matrices also depended on the number of waves. When the number of waves was eight, the misspecified model did not lead to any non-positive definite covariance matrices, but some occurred when the model included the AR(1) or ARMA(1, 1) error structure. When the number of waves was four, analysis models failing to include the AR(1) error structure resulted in a much lower occurrence rate than models that included the AR(1) error structure. Analysis models failing to include the MA(1) error structure led to a high occurrence rate, while the occurrence rate was zero when models included the MA(1) error structure. For the ARMA(1, 1) error structure, a correct analysis model resulted in a lower occurrence rate than an incorrect analysis model when the ARMA parameter values were 0.2 and 0.8. However, when the ARMA parameter values were 0.5 and 0.45, the occurrence rate was zero with an incorrect analysis model but ranged from 5% to 20% with a correct analysis model. Bollen and Curran (2005) pointed out that the possible causes of non-positive definite matrices include sampling fluctuations, nonconvergence, outliers, model misspecification and empirical underidentification. In this study, as only converged and identified results were analyzed, it was suspected that sampling fluctuation and model complexity caused the occurrence of non-positive definite matrices. However, this suspicion deserves further investigation.
Impact of Time Series Parameter

The time series parameter value had no impact on the estimates of the fixed effect parameters or their standard error estimates. The AR parameter and the MA parameter only affected the acceptability of the biases of the estimates of . A higher value of the AR or MA parameter tended to result in larger unacceptable biases for the estimates of  than a lower value. The ARMA parameter values were critical in deciding whether the variance component estimates were biased in many situations: ARMA parameter values of 0.2 and 0.8 always led to more biased estimates than the counterpart values of 0.5 and 0.45.

The time series parameters also had an impact on the standard error estimates of the variance components with an AR(1) or an ARMA(1, 1) error structure, but not with an MA(1) structure. A higher value of the AR parameter led to more biased standard error estimates of  than a lower value. For the standard error estimates of  and , unacceptable biases were observed only with ARMA parameter values of 0.2 and 0.8, not with values of 0.5 and 0.45. However, for the standard error estimates of , unacceptable biases occurred with both sets of parameter values, and the values of 0.5 and 0.45 resulted in more substantially biased estimates than the values of 0.2 and 0.8.

Additionally, with the correct analysis model, the convergence rate was lower with a higher AR parameter value than with a lower one, and the convergence rate was higher with ARMA parameter values of 0.2 and 0.8 than with values of 0.5 and 0.45. With the incorrect analysis model, the occurrence rate of non-positive definite matrices was higher with a higher MA parameter, and the occurrence rate was zero for ARMA parameter values of 0.5 and 0.45 but was high with values of 0.2 and 0.8. With the correct analysis model, the occurrence rate of non-positive definite matrices was higher with a higher AR parameter, and the occurrence rate was high for ARMA parameter values of 0.5 and 0.45 but not all zero with values of 0.2 and 0.8.
As mentioned above, when the within-person residual covariance structure was an ARMA(1, 1) process, whether the biases of the variance component estimates and their standard error estimates were acceptable, as well as the performance of some fit indexes, was found to be related to the ARMA parameters. As mentioned in the method chapter, the ARMA process is an integration of the AR and MA processes. If the AR parameter and the MA parameter are close to each other and the AR parameter is not small (i.e., 0.8 in this study), the ARMA(1, 1) model reduces approximately to an MA(2) model. As found consistently in this study, the pattern of acceptability of biases and the differentiating ability of the fit indexes depended mainly on the type of within-person residual covariance structure, which explains why the values of the time series parameters played an important role in the analysis of estimates obtained with an ARMA(1, 1) error structure.

Impact of Sample Size

It was found that sample size did not affect the parameter estimates of the fixed effects or their standard errors. This finding regarding the fixed effects estimates is consistent with previous studies. Ferron et al. (2002) found that sample size did not bias the parameter estimates of the fixed effects when the residual structure of the level-one equation in HLM was misspecified.

Regarding the effect of sample size on the acceptability of the biases of the variance components in this study, with each of the three types of error structure only the estimates of  were affected. The other variance component estimates were not affected by the sample size. However, the sample size was observed to be related to the unacceptable biases of  only because these unacceptable biases occurred with the correct analysis model. Based on the discussion above, it is suspected that without the unexpected findings the sample size would not affect the variance component estimates.
Some unexpected findings were also noted for the estimates of  with an AR(1) error structure and for the estimates of  with an MA(1) structure and an ARMA(1, 1) error structure: the magnitude of the unacceptable biases increased with the increase of the sample size. These suspicious problems deserve further investigation.

The sample size affected the standard error estimates of the variance components with an AR(1) or an ARMA(1, 1) error structure, and led to some suspicious problems, such as changes in the sign of the unacceptable biases and increases in the magnitude of the unacceptable biases with the increase of the sample size. However, based on the discussions above, all these findings regarding the impact of sample size on the variance components and their standard error estimates were suspected to be untenable. Hamilton et al. (2003) found that in linear latent growth modeling the variance and covariance estimates of the intercept and slope were not biased to a substantive degree by the sample size; the sample size in their study ranged from 25 to 1000. You (2006) found that sample size had no significant effect on the variance component estimates in growth modeling when the within-person residual structure was misspecified as homoscedastic and uncorrelated. Hamilton et al. (2003) also found that sample size did not bias the standard error estimates of the variance components. Based on these previous findings and the discussions in this study, it is suspected that sample size may not affect the variance components and their standard error estimates.

Hamilton et al. (2003) found that a larger sample size usually reduced improper estimates and convergence failures and improved model fit. In this study, an increase in sample size was found to increase the convergence rate with an AR(1) or an MA(1) error structure when the correct analysis model was used, but this result did not apply to an ARMA(1, 1) error structure.
Regarding the improper estimates, consistent with what Hamilton, et al. (2003) found, a larger sample size tended to reduce the occurrence rate of non-positive definite matrices across the three types of residual structure. A large sample size also increased the differentiating ability of the p value: when the error structure was an AR(1) or an MA(1) process, the p value could differentiate between the correct analysis model and the incorrect model in 100% of replications when the sample size was 2000.

Impact of Length of Waves

It is generally believed that increasing the number of measurement periods may improve the accuracy of parameter estimates in growth modeling (Duncan, et al., 1999; Fan, 2003b). In this study, holding other conditions constant, the shorter series (i.e., four waves) on average resulted in more biased estimates than the longer series (i.e., eight waves), which is consistent with the general expectation. It was found that the number of waves played an important role in the estimates of one variance component: as long as the number of waves was eight, the biases of its estimates were acceptable regardless of the analysis model type. Results in this study indicated that more waves substantially increased the convergence rate and reduced the occurrence rate of improper solutions, and that RMSEA could be used for model differentiation only under certain conditions in which the number of waves was eight. Hamilton, et al. (2003) suggested that adding more measurement periods increased the convergence rate and decreased the RMSEA upper bound; the findings of this study were consistent with their conclusions. Furthermore, the number of waves affected the performance of the p value. When the generating model included an AR(1) or an MA(1) process, the p value performed perfectly in differentiating between the incorrect and the correct analysis models under all conditions in which the number of waves was eight, but it could differentiate a misspecified ARMA(1, 1) error structure from a correct one only under certain conditions in which the number of waves was eight.
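Analytic Results of Variance Components Estimates

As mentioned in the results part, whenever the variance component estimates were unacceptably biased, it was found that (1) the estimates of the intercept variance ($\psi_{11}$) and the slope variance ($\psi_{22}$) were inflated and the estimates of the intercept-slope covariance ($\psi_{21}$) were deflated when the residual covariance structure followed an AR(1) process; and (2) the estimates of $\psi_{11}$ and $\psi_{22}$ were deflated and the estimates of $\psi_{21}$ were inflated when the residual covariance structure followed an MA(1) or an ARMA(1, 1) process. These results can be verified analytically by examining the variance/covariance matrices under the two different analysis models. As mentioned in chapter 2, the implied variance/covariance matrix for an LGM is

\[
\Sigma(\theta) = \Lambda \Psi \Lambda' + \Theta \tag{5-1}
\]

where all the symbols retain the same meaning as before. For simplicity of illustration, the variance/covariance matrix of the unconditional LGM was employed. When the within-person residual covariance structure is an AR(1) process, the model implied covariance matrix for the observed variables at two time points $t_1$ and $t_2$ is

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + 2t_1\psi_{21} + t_1^2\psi_{22} + \sigma_e^2 & \\
\psi_{11} + (t_1 + t_2)\psi_{21} + t_1 t_2\psi_{22} + \rho^{\,t_2 - t_1}\sigma_e^2 & \psi_{11} + 2t_2\psi_{21} + t_2^2\psi_{22} + \sigma_e^2
\end{bmatrix}
\tag{5-2}
\]

where $\rho$ is the AR parameter, $\sigma_e^2$ is the within-person residual variance, and all the other symbols retain the same meaning as before. When $\Theta$ is misspecified as a diagonal matrix with uncorrelated errors, the model implied covariance matrix is as follows, which is the same as that in equation 2-15:

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + 2t_1\psi_{21} + t_1^2\psi_{22} + \sigma_{e1}^2 & \\
\psi_{11} + (t_1 + t_2)\psi_{21} + t_1 t_2\psi_{22} & \psi_{11} + 2t_2\psi_{21} + t_2^2\psi_{22} + \sigma_{e2}^2
\end{bmatrix}
\tag{5-3}
\]

where all the symbols retain the same meaning as described before. From equation 5-1, it should be noted that $\Psi$ and $\Theta$ compensate for each other. For equation 5-2 and equation 5-3 to be equal, each pair of corresponding elements of the two matrices must be equal. Therefore, the difference in $\Theta$ has to be absorbed by the $\Psi$ matrix. If a variance element in $\Theta$ is increased, some of the variance elements in $\Psi$ have to decrease to achieve equivalence, and vice versa. In our study, if four waves were assumed, the value of t was defined to be 0, 1, 2, 3. Plugging these numbers into equation 5-2 led to (the matrix is symmetric, so only the lower triangle is shown)

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + A & & & \\
\psi_{11} + \psi_{21} + A_1 & \psi_{11} + 2\psi_{21} + \psi_{22} + A & & \\
\psi_{11} + 2\psi_{21} + A_2 & \psi_{11} + 3\psi_{21} + 2\psi_{22} + A_1 & \psi_{11} + 4\psi_{21} + 4\psi_{22} + A & \\
\psi_{11} + 3\psi_{21} + A_3 & \psi_{11} + 4\psi_{21} + 3\psi_{22} + A_2 & \psi_{11} + 5\psi_{21} + 6\psi_{22} + A_1 & \psi_{11} + 6\psi_{21} + 9\psi_{22} + A
\end{bmatrix}
\tag{5-4}
\]

where $A = \sigma_e^2$, $A_1 = \rho\,\sigma_e^2$, $A_2 = \rho^2\sigma_e^2$, and $A_3 = \rho^3\sigma_e^2$. Similarly, plugging the values of t into equation 5-3 led to

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + \sigma_{e1}^2 & & & \\
\psi_{11} + \psi_{21} & \psi_{11} + 2\psi_{21} + \psi_{22} + \sigma_{e2}^2 & & \\
\psi_{11} + 2\psi_{21} & \psi_{11} + 3\psi_{21} + 2\psi_{22} & \psi_{11} + 4\psi_{21} + 4\psi_{22} + \sigma_{e3}^2 & \\
\psi_{11} + 3\psi_{21} & \psi_{11} + 4\psi_{21} + 3\psi_{22} & \psi_{11} + 5\psi_{21} + 6\psi_{22} & \psi_{11} + 6\psi_{21} + 9\psi_{22} + \sigma_{e4}^2
\end{bmatrix}
\tag{5-5}
\]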
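The two four-wave matrices in equations 5-4 and 5-5 can also be built numerically to see where they differ. The following is a minimal sketch in Python rather than part of the original analysis: the function names are hypothetical, the population values (80, 60, and 35 for the growth variance components, 50 for the residual variance, 0.8 for the AR parameter) are the ones reported in this chapter, and the residual variances of the diagonal model are simply fixed at 50 so that the comparison isolates the off-diagonal elements.

```python
def lgm_implied_cov(psi11, psi21, psi22, theta, times=(0, 1, 2, 3)):
    """Implied covariance Lambda * Psi * Lambda' + Theta of a linear growth model."""
    n = len(times)
    return [
        [
            psi11
            + (times[i] + times[j]) * psi21
            + times[i] * times[j] * psi22
            + theta[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def ar1_theta(sigma2, rho, n=4):
    """AR(1) residual covariance matrix: sigma2 * rho ** |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def diagonal_theta(variances):
    """Residual covariance matrix with uncorrelated errors."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Illustrative values taken from the chapter (assumed here, not re-estimated).
sigma_ar1 = lgm_implied_cov(80, 35, 60, ar1_theta(50, 0.8))
sigma_diag = lgm_implied_cov(80, 35, 60, diagonal_theta([50] * 4))

# Element-wise difference: what the misspecified model has to absorb.
gap = [[sigma_ar1[i][j] - sigma_diag[i][j] for j in range(4)] for i in range(4)]
```

In this sketch the diagonal of `gap` is zero and the off-diagonal entries are approximately 40, 32, and 25.6 at lags one, two, and three, which is the covariance the growth components must take up when the AR(1) process is left out of the analysis model.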
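It is not difficult to show that when the analysis model fails to consider an AR(1) within-person residual covariance structure, the estimates of the intercept variance ($\psi_{11}$) and the slope variance ($\psi_{22}$) are inflated and the estimates of the intercept-slope covariance ($\psi_{21}$) are deflated. Suppose the $\Psi$ matrix of the correct analysis model is

\[
\Psi =
\begin{bmatrix}
\psi_{11} & \\
\psi_{21} & \psi_{22}
\end{bmatrix}
\tag{5-6}
\]

and the $\Psi$ matrix of the incorrect analysis model is

\[
\Psi^{*} =
\begin{bmatrix}
\psi_{11}^{*} & \\
\psi_{21}^{*} & \psi_{22}^{*}
\end{bmatrix}
\tag{5-7}
\]

The first step was to take the element in row one and column two and the element in row one and column three in equation 5-4. Setting each of the two elements equal to the corresponding element in equation 5-5 led to the following equations:

\[
\psi_{11} + \psi_{21} + \rho\,\sigma_e^2 = \psi_{11}^{*} + \psi_{21}^{*}, \qquad
\psi_{11} + 2\psi_{21} + \rho^2\sigma_e^2 = \psi_{11}^{*} + 2\psi_{21}^{*}
\tag{5-8}
\]

As mentioned in the method part, the value of $\sigma_e^2$ was set to be 50 and $\rho$ was assumed to be 0.8. Plugging these numbers into equation 5-8 led to the following equations:

\[
\psi_{11}^{*} + \psi_{21}^{*} = \psi_{11} + \psi_{21} + 40, \qquad
\psi_{11}^{*} + 2\psi_{21}^{*} = \psi_{11} + 2\psi_{21} + 32
\tag{5-9}
\]

The solution to equation 5-9 was

\[
\psi_{11}^{*} = \psi_{11} + 48, \qquad \psi_{21}^{*} = \psi_{21} - 8
\tag{5-10}
\]

Therefore, in the misspecified model, the estimate of $\psi_{11}$ was inflated by a value of 48 and the estimate of $\psi_{21}$ was deflated by a value of 8.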
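The two-equation step above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `row_one_shifts` is a hypothetical helper, and it assumes, as in the text, that the lag-one and lag-two residual covariances of the AR(1) process (the residual variance times the AR parameter, and times its square) must be absorbed by the intercept variance and the intercept-slope covariance.

```python
def row_one_shifts(lag1_cov, lag2_cov):
    """Solve d11 + d21 = lag1_cov and d11 + 2 * d21 = lag2_cov.

    d11 is the shift in the intercept variance and d21 the shift in the
    intercept-slope covariance under the misspecified model.
    """
    d21 = lag2_cov - lag1_cov
    d11 = lag1_cov - d21
    return d11, d21


sigma2, rho = 50.0, 0.8  # values used in the chapter
d11, d21 = row_one_shifts(sigma2 * rho, sigma2 * rho ** 2)
print(round(d11, 6), round(d21, 6))
```

Up to floating-point rounding, the helper returns a shift of 48 for the intercept variance and of -8 for the covariance, in line with the solution reported above.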
Then, taking the element in row two and column three of the two covariance matrices in equation 5-4 and equation 5-5 and setting them equal to each other led to the following equation:

\[
\psi_{11}^{*} + 3\psi_{21}^{*} + 2\psi_{22}^{*} = \psi_{11} + 3\psi_{21} + 2\psi_{22} + 40
\tag{5-11}
\]

With the results in equation 5-10 plugged into equation 5-11, it was shown that

\[
\psi_{22}^{*} = \psi_{22} + 8
\tag{5-12}
\]

Therefore, the estimates of the intercept variance ($\psi_{11}$) and the slope variance ($\psi_{22}$) were inflated and the estimates of the intercept-slope covariance ($\psi_{21}$) were deflated when the within-person error structure failed to include the AR(1) process. The population values of $\psi_{11}$, $\psi_{22}$, and $\psi_{21}$ were set to be 80, 60, and 35, respectively. Therefore, the percentages of inflation for the estimates of $\psi_{11}$ and $\psi_{22}$ were 60% (48/80) and 13.3% (8/60), respectively, and the percentage of deflation for the estimates of $\psi_{21}$ was 22.9% (8/35). All the relative deviations were substantially greater than 0.05. Therefore, when the analysis model failed to include an AR(1) process and the number of waves was four, all the biases of the three variance component estimates were unacceptable, and the magnitude of the bias of $\psi_{11}$ was the largest. The above findings were consistent with what was found in this simulation study and in previous studies (e.g., Ferron, et al., 2002; Kwok, et al., 2007).

It is worthwhile to point out that there is no single solution for the three variance component estimates in the $\Psi$ matrix that makes equation 5-4 equal to equation 5-5 exactly. For example, if at the beginning the elements in row one, column two and row one, column four of equations 5-4 and 5-5 were used, instead of the elements in row one, column two and row one, column three, the final solution becomes
\[
\psi_{11}^{*} = \psi_{11} + 47.2, \qquad \psi_{21}^{*} = \psi_{21} - 7.2, \qquad \psi_{22}^{*} = \psi_{22} + 7.2
\tag{5-13}
\]

which differs a little from what was obtained before (i.e., with the elements in row one, column two and row one, column three). However, any of these solutions would still make the two covariance matrices in equation 5-4 and equation 5-5 close to each other.

To see the impact of the value of the AR parameter on the variance component estimates, the value of the AR parameter was changed from 0.8 to 0.5. A similar calculation gave the following results:

\[
\psi_{11}^{*} = \psi_{11} + 37.5, \qquad \psi_{21}^{*} = \psi_{21} - 12.5, \qquad \psi_{22}^{*} = \psi_{22} + 12.5
\tag{5-14}
\]

Changing the value of the AR parameter from 0.8 to 0.5 resulted in an inflation of the intercept variance ($\psi_{11}$) estimate by 46.9%, an inflation of the slope variance ($\psi_{22}$) estimate by 20.8%, and a deflation of the intercept-slope covariance ($\psi_{21}$) estimate by 35.7%. All the biases were still unacceptable. The magnitude of the bias was the least for the estimates of $\psi_{22}$ and the highest for the estimates of $\psi_{11}$. Compared with the results obtained with an AR parameter of 0.8, the estimates of $\psi_{11}$ were inflated less, but the estimates of $\psi_{22}$ were inflated more and the estimates of $\psi_{21}$ were deflated more. These findings are consistent with what was found in this simulation study.

When the within-person residual covariance structure is an MA(1) process, the model implied covariance matrix for the observed variables becomes
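\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + 2t_1\psi_{21} + t_1^2\psi_{22} + \sigma_e^2 & \\
\psi_{11} + (t_1 + t_2)\psi_{21} + t_1 t_2\psi_{22} - \sigma_e^2\,\theta_1/(1 + \theta_1^2) & \psi_{11} + 2t_2\psi_{21} + t_2^2\psi_{22} + \sigma_e^2
\end{bmatrix}
\tag{5-15}
\]

for two adjacent time points $t_1$ and $t_2$, where $\theta_1$ is the MA parameter and all the other symbols retain the same meaning as before. Assuming four waves, we obtained the following covariance matrix (lower triangle shown):

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + \sigma_e^2 & & & \\
\psi_{11} + \psi_{21} + A & \psi_{11} + 2\psi_{21} + \psi_{22} + \sigma_e^2 & & \\
\psi_{11} + 2\psi_{21} & \psi_{11} + 3\psi_{21} + 2\psi_{22} + A & \psi_{11} + 4\psi_{21} + 4\psi_{22} + \sigma_e^2 & \\
\psi_{11} + 3\psi_{21} & \psi_{11} + 4\psi_{21} + 3\psi_{22} & \psi_{11} + 5\psi_{21} + 6\psi_{22} + A & \psi_{11} + 6\psi_{21} + 9\psi_{22} + \sigma_e^2
\end{bmatrix}
\tag{5-16}
\]

where $A = -\sigma_e^2\,\theta_1/(1 + \theta_1^2)$ was the extra component caused by the MA(1) process. If $\Theta$ is misspecified as a diagonal matrix with uncorrelated errors, the model implied covariance matrix for the observed variables is the same as that in equation 5-5:

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + \sigma_{e1}^2 & & & \\
\psi_{11} + \psi_{21} & \psi_{11} + 2\psi_{21} + \psi_{22} + \sigma_{e2}^2 & & \\
\psi_{11} + 2\psi_{21} & \psi_{11} + 3\psi_{21} + 2\psi_{22} & \psi_{11} + 4\psi_{21} + 4\psi_{22} + \sigma_{e3}^2 & \\
\psi_{11} + 3\psi_{21} & \psi_{11} + 4\psi_{21} + 3\psi_{22} & \psi_{11} + 5\psi_{21} + 6\psi_{22} & \psi_{11} + 6\psi_{21} + 9\psi_{22} + \sigma_{e4}^2
\end{bmatrix}
\tag{5-17}
\]

Applying the same calculation as was done with the AR(1) covariance structure, by first taking the element in row one and column two and the element in row one and column three and setting the MA parameter to 0.8 (so that $A \approx -24$), led to the following equations, where an asterisk marks the value under the incorrect analysis model:

\[
\psi_{11}^{*} + \psi_{21}^{*} = \psi_{11} + \psi_{21} - 24, \qquad
\psi_{11}^{*} + 2\psi_{21}^{*} = \psi_{11} + 2\psi_{21}
\tag{5-18}
\]

The solution to equation 5-18 was

\[
\psi_{11}^{*} = \psi_{11} - 48, \qquad \psi_{21}^{*} = \psi_{21} + 24
\tag{5-19}
\]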
where all the symbols retain the same meaning as before. Then setting the elements in row two and column three of the two matrices in equation 5-16 and equation 5-17 equal, and using the results in equation 5-19, led to the following solution:

\[
\psi_{22}^{*} = \psi_{22} - 24
\tag{5-20}
\]

However, the solution is not unique. If the element in row two and column four was taken in the previous step instead, the solution became

\[
\psi_{22}^{*} = \psi_{22} - 16
\tag{5-21}
\]

Despite the multiple solutions, the general pattern of the estimates from the incorrect analysis model for the three variance components was that the estimates of the intercept variance ($\psi_{11}$) and the slope variance ($\psi_{22}$) were deflated and the estimates of the intercept-slope covariance ($\psi_{21}$) were inflated when the within-person residual covariance structure failed to include an MA(1) process. The percentage of deflation of the $\psi_{11}$ estimates was 60%, the percentage of deflation of the $\psi_{22}$ estimates was 26.7% (for a decrease of 16) or 40% (for a decrease of 24), and the percentage of inflation of the $\psi_{21}$ estimates was 68.6%. If the MA parameter was changed to 0.5 and the same calculation was followed, the solution for the estimates of the three variance components was

\[
\psi_{11}^{*} = \psi_{11} - 40, \qquad \psi_{21}^{*} = \psi_{21} + 20, \qquad \psi_{22}^{*} = \psi_{22} - 20
\tag{5-22}
\]

or

\[
\psi_{11}^{*} = \psi_{11} - 40, \qquad \psi_{21}^{*} = \psi_{21} + 20, \qquad \psi_{22}^{*} = \psi_{22} - 13.3
\tag{5-23}
\]
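The MA(1) solutions, including their non-uniqueness, can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions, not the procedure used in the study: the helper names are hypothetical, and the lag-one residual covariance is taken to be negative (the sign convention in which the MA term is subtracted), which is the assumption that reproduces the directions of bias reported in the text.

```python
def ma1_lag1_cov(sigma2, ma):
    """Lag-one residual covariance of an MA(1) process with total variance sigma2.

    Assumption: the MA term enters with a minus sign, e_t = a_t - ma * a_(t-1).
    """
    return -sigma2 * ma / (1 + ma ** 2)


def row_one_shifts(lag1_cov, lag2_cov):
    """Solve d11 + d21 = lag1_cov and d11 + 2 * d21 = lag2_cov."""
    d21 = lag2_cov - lag1_cov
    d11 = lag1_cov - d21
    return d11, d21


def slope_shift(target, d11, d21, time_sum, time_prod):
    """Solve d11 + time_sum * d21 + time_prod * d22 = target for d22."""
    return (target - d11 - time_sum * d21) / time_prod


lag1 = ma1_lag1_cov(50.0, 0.5)             # MA parameter 0.5, residual variance 50
d11, d21 = row_one_shifts(lag1, 0.0)       # an MA(1) process has no lag-two covariance
d22_a = slope_shift(lag1, d11, d21, 3, 2)  # row 2, column 3 (t = 1, 2): lag-one pair
d22_b = slope_shift(0.0, d11, d21, 4, 3)   # row 2, column 4 (t = 1, 3): lag-two pair
```

With an MA parameter of 0.5 the sketch gives shifts of -40 and 20 for the intercept variance and the covariance, and -20 or about -13.3 for the slope variance depending on which element is matched. With an MA parameter of 0.8 the lag-one covariance is about -24.4, which the text rounds to 24.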
The percentage of deflation of the estimates of the intercept variance ($\psi_{11}$) was 50%, the percentage of deflation of the estimates of the slope variance ($\psi_{22}$) was 22.2% (for a decrease of 13.3) or 33.3% (for a decrease of 20), and the percentage of inflation of the estimates of the intercept-slope covariance ($\psi_{21}$) was 57.1%. Therefore, changing the value of the MA parameter did not change the acceptability of the biases or the direction of these biases. As long as the analysis model type was wrong and the number of waves was small, the biases of the three variance components were unacceptable, and the estimates of $\psi_{11}$ and $\psi_{22}$ were deflated while the estimates of $\psi_{21}$ were inflated. The magnitude of these biases was the least for the estimates of $\psi_{22}$. Furthermore, a higher value of the MA parameter resulted in more biased estimates than a lower value did. The above findings were consistent with what was found in the simulation study.

When the within-person residual covariance structure is an ARMA(1, 1) process, the model implied covariance matrix for the observed variables becomes (assuming four waves)

\[
\Sigma(\theta) =
\begin{bmatrix}
A_{11} & A_{12} & A_{13} & A_{14} \\
A_{21} & A_{22} & A_{23} & A_{24} \\
A_{31} & A_{32} & A_{33} & A_{34} \\
A_{41} & A_{42} & A_{43} & A_{44}
\end{bmatrix}
\tag{5-24}
\]

Each element in the matrix in equation 5-24 is
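\[
\begin{aligned}
A_{11} &= \psi_{11} + \sigma_e^2, \\
A_{22} &= \psi_{11} + 2\psi_{21} + \psi_{22} + \sigma_e^2, \\
A_{33} &= \psi_{11} + 4\psi_{21} + 4\psi_{22} + \sigma_e^2, \\
A_{44} &= \psi_{11} + 6\psi_{21} + 9\psi_{22} + \sigma_e^2, \\
A_{12} = A_{21} &= \psi_{11} + \psi_{21} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}, \\
A_{13} = A_{31} &= \psi_{11} + 2\psi_{21} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}\,\rho, \\
A_{14} = A_{41} &= \psi_{11} + 3\psi_{21} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}\,\rho^2, \\
A_{23} = A_{32} &= \psi_{11} + 3\psi_{21} + 2\psi_{22} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}, \\
A_{24} = A_{42} &= \psi_{11} + 4\psi_{21} + 3\psi_{22} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}\,\rho, \text{ and} \\
A_{34} = A_{43} &= \psi_{11} + 5\psi_{21} + 6\psi_{22} + \sigma_e^2\,\frac{(\rho - \theta_1)(1 - \rho\theta_1)}{1 + \theta_1^2 - 2\rho\theta_1}
\end{aligned}
\]

where $\rho$ and $\theta_1$ are the AR and MA parameters of the ARMA(1, 1) process. If the analysis model failed to include the ARMA(1, 1) process in $\Theta$, the model implied covariance matrix for the observed variables is still the same as that in equation 5-5:

\[
\Sigma(\theta) =
\begin{bmatrix}
\psi_{11} + \sigma_{e1}^2 & & & \\
\psi_{11} + \psi_{21} & \psi_{11} + 2\psi_{21} + \psi_{22} + \sigma_{e2}^2 & & \\
\psi_{11} + 2\psi_{21} & \psi_{11} + 3\psi_{21} + 2\psi_{22} & \psi_{11} + 4\psi_{21} + 4\psi_{22} + \sigma_{e3}^2 & \\
\psi_{11} + 3\psi_{21} & \psi_{11} + 4\psi_{21} + 3\psi_{22} & \psi_{11} + 5\psi_{21} + 6\psi_{22} & \psi_{11} + 6\psi_{21} + 9\psi_{22} + \sigma_{e4}^2
\end{bmatrix}
\tag{5-25}
\]

where all the symbols retain the same meaning as before. Assume that the values of the ARMA parameters were 0.5 and 0.45 (i.e., $\rho = 0.5$ and $\theta_1 = 0.45$); then the value of $(\rho - \theta_1)(1 - \rho\theta_1)/(1 + \theta_1^2 - 2\rho\theta_1)$ was approximately equal to 0.05. When the elements in row one, column two and in row one, column three of equation 5-24 were set equal to the corresponding elements of equation 5-25, the following equations were obtained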
\[
\psi_{11}^{*} + \psi_{21}^{*} = \psi_{11} + \psi_{21} + 2.5, \qquad
\psi_{11}^{*} + 2\psi_{21}^{*} = \psi_{11} + 2\psi_{21} + 1.25
\tag{5-26}
\]

where $\psi_{11}$, $\psi_{21}$, and $\psi_{22}$ are the intercept variance, the intercept-slope covariance, and the slope variance, and an asterisk marks the value under the incorrect analysis model. The solution to equation 5-26 was

\[
\psi_{11}^{*} = \psi_{11} + 3.75, \qquad \psi_{21}^{*} = \psi_{21} - 1.25
\tag{5-27}
\]

Then equating the elements in row two and column three of equation 5-24 and equation 5-25 led to the following equation:

\[
\psi_{11}^{*} + 3\psi_{21}^{*} + 2\psi_{22}^{*} = \psi_{11} + 3\psi_{21} + 2\psi_{22} + 2.5
\tag{5-28}
\]

Then, with the results in equation 5-27 plugged into equation 5-28, the following equation was obtained:

\[
\psi_{22}^{*} = \psi_{22} + 1.25
\tag{5-29}
\]

Based on the results shown in equation 5-27 and in equation 5-29, it was found that the estimates of $\psi_{11}$ and $\psi_{22}$ were inflated and the estimates of $\psi_{21}$ were deflated when the covariance structure did not include an ARMA(1, 1) process. However, the percentage of inflation of the $\psi_{11}$ estimates was only 4.7%, the percentage of inflation of the $\psi_{22}$ estimates was 2.1%, and the percentage of deflation of the $\psi_{21}$ estimates was 3.6%. Therefore, none of these variance estimates was biased according to the 0.05 criterion. The result is consistent with what was found in the simulation study: when the ARMA parameter values were 0.5 and 0.45, model misspecification did not lead to biased estimates of any of the variance components. When the ARMA parameter values were set to be 0.2 and 0.8, a similar calculation showed that the estimates of $\psi_{11}$ were deflated by a value of 34.4 (a decrease of 43%), the estimates of $\psi_{22}$ were deflated by a value of 15.2 (a decrease of 25.3%), and the estimates of $\psi_{21}$ were inflated by a value of 15.2 (an increase of 43%). The directions of these biases were the same as those obtained with the MA(1) process. The magnitude of the biases was the least for the estimates of $\psi_{22}$ among the three. The above findings are consistent with what was found in the simulation study: they explain why most unacceptable biases occurred when the ARMA parameters were 0.2 and 0.8.

GOF Test and GOF Indexes

It was found that when the actual within-person residual covariance structure was an AR(1) or an MA(1) process, the statistics that could reliably differentiate between the two types of analysis models were the p value and RMSEA, under certain conditions in which the number of measurement periods was eight or the sample size was 2000 (the latter only for the p value). TLI could be used only under a very restrictive condition and for only one type of LGM. When the within-person residual covariance structure was an ARMA(1, 1) process, only RMSEA could be used for model selection, and only under certain conditions. As both the p value and RMSEA are based on chi-square statistics, their sensitivity to model selection is expected to be similar. CFI and SRMR are not recommended because, with any type of analysis model, CFI indiscriminately suggested adequate model fit for most of the conditions and SRMR could not detect model misspecification across the conditions. You (2006) found that when the within-person error structure was misspecified as uncorrelated and homoscedastic, CFI was generally not sensitive to model misspecification and RMSEA was sensitive to model selection, which was consistent with our findings.

Some suspicious results were found with the p value and RMSEA. For the p value, it was found that under certain conditions the power did not depend on the sample size and the Type I error rate increased as the sample size increased. For RMSEA, under certain conditions the ability to reject the misspecified model decreased as the sample size got larger, which deserves further investigation.

Suggestions to Applied Researchers

Suggestions are made according to different research plans. When researchers are only interested in fixed effects, the use of a simple diagonal within-person residual covariance structure would not produce problems in the parameter estimates or tests of fixed effects. Researchers interested in interpreting the variance parameters should consider the possibility of alternative error structures. When an AR(1) or an MA(1) process was present in the within-person residual covariance structure, some variance component parameters were biased, and the biases were more severe for the incorrect analysis model than for the correct analysis model. Therefore, for better estimates of variance components, researchers should consider alternative covariance structures other than the simple diagonal covariance structure. However, this recommendation is subject to the limitations of sample size and number of measurement periods. As discussed before, a small sample size and fewer measurement periods sometimes resulted in biased estimates even for the correct analysis model because of the complexity of the covariance structure. Therefore, a larger sample size and more measurement periods are recommended for applied researchers. A larger sample size and more measurement periods also help to obtain unbiased standard error estimates of variance components and increase the sensitivity of the GOF test and fit indexes to model misspecification. There were some unexpected findings in this study, most of which were related to an ARMA(1, 1) covariance structure. It is believed that the complexity of the covariance structure makes it difficult for the current software to give the right estimates.
However, because the ARMA process is rarely encountered in social science and in many situations approximates an AR or MA process (McCleary & Hay, 1980), and because the AR(1) and MA(1) processes are the most commonly encountered time series, a complex ARMA covariance structure should be considered only after an AR(1) or an MA(1) covariance structure is ruled out.

Limitations and Suggestions for Future Research

This study inevitably suffers from the same limitation as many other simulation studies: its scope is limited by the conditions that were examined, and more conditions can be included in the future. The AR and MA parameters were chosen to represent a medium or large effect to make the results more obvious. An AR or MA parameter of 0.8 may not be encountered often in educational research. It is worthwhile to test whether a slightly misspecified model, in which the AR or MA parameter is small, would lead to the same results. Another limitation is that only the linear growth model was examined. It has been shown that a nonlinear growth model coupled with unequally spaced data caused problems for both the estimation and the tests of fixed effects (Ferron, et al., 2002). As nonlinear growth models are common in applied research, they deserve investigation in future research. It should be pointed out that in this simulation study the population value of the within-person residual variance $\sigma_e^2$ was 50, which was not far from the population values of the intercept variance (i.e., 80), the slope variance (i.e., 60), and the intercept-slope covariance (i.e., 35). As most random effects in LGM come from between-person variation, it is unknown whether the results would be different when $\sigma_e^2$ is specified to be much smaller than the between-person variance components (e.g., changing $\sigma_e^2$ from 50 to 1 while keeping the values of the between-person variance components the same). Furthermore, Muthén and Muthén (2002) pointed out that in most applied literature the ratio between the variance of the level and the variance of the shape was 5 to 1. In this study, the ratio of the intercept variance to the slope variance was 4:3. Although the population values were obtained from the ECLS-K data, it is unknown whether a different ratio would lead to the same results.
The present study only examined the situation in which the within-person residual covariance structure was misspecified as a diagonal matrix. Although this is the most likely misspecification in practice, other possible misspecifications also deserve examination: for example, an MA(1) residual covariance structure is used when an AR(1) structure is actually operating, or an overly complex covariance structure is used when a simpler structure would suffice. In this study, the choice of ARMA parameters of 0.5 and 0.45 made the ARMA(1, 1) process approximately reduce to an MA(2) process. Therefore, using an ARMA(1, 1) process can serve as an example of using an overly complex error structure. As no MA(2) error structure was examined in this study, further investigation can be conducted later.
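The contrast between the two ARMA(1, 1) parameter pairs discussed in this chapter can be illustrated by repeating the element-matching calculation for each pair. The sketch below is a hedged illustration, not the study's code: the names are hypothetical, the first value of each pair is assumed to be the AR parameter and the second the MA parameter, and the population values (80, 60, 35, and a residual variance of 50) are those reported in this chapter. Because the exact lag-one factor is used instead of the rounded value 0.05, the numbers differ slightly from the rounded ones in the text.

```python
POPULATION = {"intercept_var": 80.0, "slope_var": 60.0, "covariance": 35.0}
SIGMA2 = 50.0  # within-person residual variance used in the chapter


def arma11_lag1_factor(ar, ma):
    """Lag-one autocorrelation of an ARMA(1, 1) process (MA term subtracted)."""
    return (1 - ar * ma) * (ar - ma) / (1 + ma ** 2 - 2 * ar * ma)


def misspecification_shifts(ar, ma, sigma2=SIGMA2):
    """Shifts absorbed by the growth components when the process is ignored."""
    lag1 = sigma2 * arma11_lag1_factor(ar, ma)
    lag2 = lag1 * ar
    d_cov = lag2 - lag1                        # from rows (1, 2) and (1, 3)
    d_int = lag1 - d_cov
    d_slope = (lag1 - d_int - 3 * d_cov) / 2   # from row 2, column 3
    return {"intercept_var": d_int, "slope_var": d_slope, "covariance": d_cov}


def relative_bias(shifts):
    """Shift divided by the population value of each component."""
    return {name: shifts[name] / POPULATION[name] for name in shifts}


bias_small = relative_bias(misspecification_shifts(0.5, 0.45))
bias_large = relative_bias(misspecification_shifts(0.2, 0.8))
```

Under these assumptions every entry of `bias_small` stays below 0.05 in absolute value, while every entry of `bias_large` exceeds it (roughly -0.43, -0.25, and 0.44 for the intercept variance, the slope variance, and the covariance), which matches the pattern described for the two parameter pairs.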
APPENDIX
MPLUS CODE

Latent Growth Model with a Time Invariant Covariate with an AR (1) Process

DATA:     FILE IS "c:\mplus\invar\data\filelist.txt";
          TYPE = montecarlo;
VARIABLE: NAMES ARE t1 t2 t3 t4 x;
          USEVARIABLES ARE t1 t2 t3 t4 x;
ANALYSIS: TYPE IS general;
          iterations = 5000;
          estimator = ML;
MODEL:    i s | t1@0 t2@1 t3@2 t4@3;
          [i* s*];
          [t1-t4@0];
          i* s*;
          ! equal residual variances, labeled e
          t1-t4 (e);
          i with s*;
          i s on x;
          ! residual covariances at lag 1, lag 2, and lag 3
          t1-t3 pwith t2-t4 (cov1);
          t1-t2 pwith t3-t4 (cov2);
          t1 with t4 (cov3);
MODEL CONSTRAINT:
          new(lag);
          cov1 = e*lag;
          cov2 = e*lag**2;
          cov3 = e*lag**3;
SAVEDATA: results are results.txt;
OUTPUT:   tech9;

Latent Growth Model with a Time Invariant Covariate with an MA (1) Process

DATA:     FILE IS "c:\mplus\invar\data\filelist.txt";
          TYPE = montecarlo;
VARIABLE: NAMES ARE t1 t2 t3 t4 x;
          USEVARIABLES ARE t1 t2 t3 t4 x;
ANALYSIS: TYPE IS general;
          iterations = 5000;
          estimator = ML;
MODEL:    i s | t1@0 t2@1 t3@2 t4@3;
          [i* s*];
          [t1-t4@0];
          i* s*;
          t1-t4 (e);
          i with s*;
          i s on x;
          ! only the lag 1 residual covariances are nonzero
          t1-t3 pwith t2-t4 (cov);
MODEL CONSTRAINT:
          new(lag);
          cov = lag*e/(1+lag^2);
SAVEDATA: results are results.txt;
OUTPUT:   tech9;

Latent Growth Model with a Time Invariant Covariate with an ARMA (1, 1) Process

DATA:     FILE IS "c:\mplus\invar\data\filelist.txt";
          TYPE = montecarlo;
VARIABLE: NAMES ARE t1 t2 t3 t4 x;
          USEVARIABLES ARE t1 t2 t3 t4 x;
ANALYSIS: TYPE IS general;
          iterations = 5000;
          estimator = ML;
MODEL:    i s | t1@0 t2@1 t3@2 t4@3;
          [i* s*];
          [t1-t4@0];
          i* s*;
          t1-t4 (e);
          i with s*;
          i s on x;
          t1-t3 pwith t2-t4 (cov1);
          t1-t2 pwith t3-t4 (cov2);
          t1 with t4 (cov3);
MODEL CONSTRAINT:
          ! pho is the AR parameter and r is the MA parameter
          new(pho);
          new(r);
          cov1 = e*(1-pho*r)*(pho-r)/(1+r^2-2*pho*r);
          cov2 = e*(1-pho*r)*(pho-r)/(1+r^2-2*pho*r)*pho;
          cov3 = e*(1-pho*r)*(pho-r)/(1+r^2-2*pho*r)*pho^2;
OUTPUT:   tech4 tech9;
SAVEDATA: results are results.txt;
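The ARMA(1, 1) constraint lines above can be sanity-checked outside Mplus. The following Python sketch is an assumption-laden illustration rather than part of the appendix: it takes the process to be e_t = pho*e_(t-1) + a_t - r*a_(t-1) (MA term subtracted), computes autocovariances from a truncated moving-average expansion, and compares them with the closed-form expressions used for cov1, cov2, and cov3. The function names are hypothetical.

```python
def arma11_autocov_numeric(pho, r, lag, innovation_var=1.0, n_terms=200):
    """Autocovariance at a given lag from a truncated MA(infinity) expansion.

    Weights of e_t = pho * e_(t-1) + a_t - r * a_(t-1):
    w_0 = 1 and w_j = (pho - r) * pho ** (j - 1) for j >= 1.
    """
    weights = [1.0] + [(pho - r) * pho ** (j - 1) for j in range(1, n_terms)]
    return innovation_var * sum(
        weights[j] * weights[j + lag] for j in range(n_terms - lag)
    )


def constraint_cov(e, pho, r, lag):
    """Closed form used in the MODEL CONSTRAINT lines (lag = 1, 2, 3)."""
    return e * (1 - pho * r) * (pho - r) / (1 + r ** 2 - 2 * pho * r) * pho ** (lag - 1)


# Compare both routes for the two parameter pairs discussed in the text.
max_gap = 0.0
for pho, r in [(0.5, 0.45), (0.2, 0.8)]:
    e = arma11_autocov_numeric(pho, r, 0)  # residual variance implied by the process
    for lag in (1, 2, 3):
        gap = abs(arma11_autocov_numeric(pho, r, lag) - constraint_cov(e, pho, r, lag))
        max_gap = max(max_gap, gap)
```

In this sketch `max_gap` is at the level of floating-point error, which suggests that the three constraint expressions describe the lag-one to lag-three covariances of a stationary ARMA(1, 1) process under the stated sign convention.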
PAGE 156
156 LIST OF REFERENCES Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two -step approach Psychological Bulletin, 103(3), 411423. Biesanz, J.C., West, S.G., & Kwok, O. (2003). Personality over time: Methodological approaches to the study of short term and long term development and change Journal of Personality, 71, 905941. Bodovski, K. & Farkas, G. (2007). Do i nstructional p ractices c ontribute to i nequality in a chievement? The c ase of m athematics i nstruction in k indergarten. The Journal of Early Childhood Research, 5(3) 301322. Bolle n, K.A., & Curran, P.J. (2005). Latent curve models: A structural equation perspective. Hoboken NJ: John Wiley & Sons, Inc. Box, G. E.P., & Jenkins, G.M. (1976). Time series analysis: Forecasting and control. Oakland, California: Holden Day Cheong, J., Mac kinnon, D., & Khoo, S. (2003). Investigation of meditational processes using parallel process latent growth curve modeling. Structural Equation Modeling, 10(2), 238 262 Curran, P.J. (2000). A latent curve framework for the study of developmental trajectori es in adolescent substance use. In J. Rose, L.Chassin, C. Presson, & J. Sherman (Eds.). Multivariate applications in substance use research (pp.1 42). Mahwah. NJ: Erlbaum. Curran,P.J. (2003). Have multilevel models been structural equation models all along Multivariate Behavioral Research, 38, 529569. Curran, P., & Bollen, K. (2001). The best of both worlds: Combining autoregressive and latent curve models. In L.Collins & A. Sayer (Eds.), New methods for the analysis of change: Decades of behavior (pp. 107 135). Washington, DC: American Psychological Association. Curran, P. J., Muthen, B.O., & Harford, T.C. (1998). The influence of changes in marital status on development trajectories of alcohol use in young adults Journal of Studies on Alcohol, 59, 647658. David, M., (1971). 
Lifetime income variability and income profiles Proceedings of the annual meeting of the American Statistical Association, Aug., 285295 Duncan,T.E., Duncan, S.C., Strycker, L.A., Li, F., & Alpert, A. (1999). An introduction to latent variable growth curve modeling. Mahwah, NJ: Lawrence Eribaum Associates. Fan, X. (2003b). Power of latent growth modeling for detecting linear growth: number of measurements and comparison with other analytic approaches. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
PAGE 157
157 Ferron, J., Dailey, R., & Yi, Q. (2002). Effects of misspecifying the first level error structure in two level models of change. Multivariate Behavioral Research, 37, 379403. Fitzmauri ce, G. M., Laird, N. M., & Ware, J. H. (2004). Applied longitudinal data analysis. Hoboken, NJ: John Wiley & Sons, Inc. Goldstein, H. (1995). Multilevel statistical models (2nd ED). New York: Wiley. Hamaker, E. L., Dolan, C. V., & Molenaar, P. C. M. (2002) On the nature of SEM estimates of ARMA parameters. Structural Equation Modeling, 9, 347 368. Hamilton, J., Gagne, P.E., & Hancock, G.R. (2003). The effect of sample size on latent growth models. Paper presented at the Annual Meeting of the American Educational Research Association. Chicago, IL. Hause, J. (1977). The covariance structure of earnings and the onthe job training hypothesis Annals of Economic and Social Measurement, 335366. Hedeker, D., & Mermelstein, R. (2007). Mixed -effects regression mod els with heterogeneous variance: Analyzing ecological momentary assessment (EMA) data of smoking. In T.D. Little, J.A. Bovaird, and Noel A. Card (Eds.), Modeling contextual effects in longitudinal studies. Mahwah, NJ: LEA. Hertzog, C., & Nesselroade, J. R. (2003). Assessing psychological change in adulthood: An overview of methodological issues. Psychology and Aging, 18(4), 639657. Hong, G., and Raudenbush, S.W. (2006). Evaluating k indergarten r etention p olicy: A c ase s tudy of c ausal i nference for m ulti le vel o bservational d ata. Journal of the American Statistical Association, 101(45), 901910. Hoogland, J.J. & Boomsma, A. (1998). Robustness s tudies in c ovariance s tructure m odeling. Sociological Methods and Research, 26(3), 329367. Hu, L., & Bentler P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6 (1), 1 55. Hsiao, C. (2003). Analysis of panel data. Cambridge: Cambridge University Press. Hudson, C. G. 
(2008). The i mpact of m anaged c are on the p sychiatric o ffset e ffect. International Journal of Mental Health, 37(1), 32 60. Jackson, D. L. (2003). Revisiting sample size and number of parameter estimates: Some support for the N: q hypothesis. Structural E quation Modeling, 10(1), 128141. Joreskog, K.G. (1979). Statistical estimation of structural models in longitudinal -developmental investigations. In J.R. Nesselroade & P.B.Baltes (Eds.), Longitudinal research in the study of behavior and development (pp.3 03352). New York: Academic.
PAGE 158
158 Kaplan, D. (2005). A s tage -s equential m odel of r eading. Journal of Educational Psychology, 97(4) 551563. Keselman, H. J., Algina, J., Kowalchuk, R. K., & Wolfinger, R. D. (1998). A comparison of two approaches for selecting covariance structures in the analysis of repeated measurements. Communications in S tatistics: Simulation, 27, 591604. Kline, R. B. (1998). Principles and practice of structural equation modeling New York: Guilford Press. Kwok, O., West, S. G., & Green, S B. (2007). The impact of misspecifying the within-subject covariance structure in multiwave longitudinal multilevel models: A Monte Carlo study. Multivariate Behavioral Research, 42(3), 557592 Lawrence, F. R. & Hancock, G. R. (1998). Assessing change over time using latent growth modeling. Measurement and Evaluation in Counseling and Development, 30(4), 211225. Leite, W. L. (2007). A c omparison of l atent g rowth m odels for c onstructs m easured by m ultiple i tems. Structural Equation Modeling. 14(4), 581610. Lillard, L. & Weiss, Y. (1979). Components of variation in panel earnings data: American scientist 19601970. Econometrica, 473 454. Lillard, L. & Willis, R. (1978). Dynamic aspects of earnings mobility, Econometrica, 9851012 MacCallum, R. C., Kim, C., Malarkey, W. B., & Kiecolt -Glaser, J. K. (1997). Studying multivariate change using multilevel models and latent curve models. Multivariate Behavioral Research, 32(3), 215 253. MaCurdy, T.E. (1982). The use of time series processes to model the error s tructure of earnings in a longitudinal data analysis. Journal of Econometrics, 18, 83114 Marsh, H.W. (1993). Stability of individual differences in multiwave panel studies:Comparison of simplex models and one -factor models. Journal of Educational Measurem ent, 30, 157183. Marsh, H. W., Hau, K., & Grayson, D. (2005). Goodness of fit in structural equation models. In A. Maydeu Olivares & J. J. McArdle (Eds.), Contemporary p sychometrics: A festschrift for Roderick P. McDonald (pp. 
275340). Mahwah, NJ: Lawrence Erlbaum Associates. McCleary, R & Hay, R. (1980). Applied time series analysis for the social sciences. Bevery Hills, Longdon: Sage. Mitchell, C. M., Kaufman, C. E., & Beals, J. (2005). Resistive e fficacy and m ultiple s exual p artners a mong American Indi an y oung a dults: A p arallel -p rocess l atent g rowth c urve m odel. Applied Developmental Science, 9 (3), 160171.
PAGE 159
159 Muthn, B. O. (2002). Beyond S EM: General latent variable modeling. Behaviormetrika, 29, 81 117. Muthn, B. O., & Khoo, S.T.(1998). Longitudinal s tudies of achievement growth using latent variable modeling. Learning and individual differences, 10, 73101 Muthn, L. K., & Muthn, B. O. (2002). How to use a monte carlo study to decide on sample size and determine power. Structural Equation Modeling, 9 (4), 599620. R Development Core Team. (2008). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Raudenbush, S.W., & Bryk, A.S. (2002). Hierarchical linear models: Applications and data analy sis methods (2nd Ed.). Newbury Park, CA: Sage Rogosa, D. (1979). Causal models in longitudinal research: Rationale, formulation, and interpretation. In J.R. Nesselroade & P.B. Baltes (Eds.), Longitudinal research in the study of behavior and development (p p. 263302). New York: Academic. Simons, M.B. (2007). Social influences on adolescent substance use. American Journal of Health Behavior, 31(6), 672684 Singer, J. D., & Willet, J.B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford University Press. Sivo, S. A.(1997). Modeling causal error structures in longitudinal data. Dissertation Abstracts International, 58, 04B (University Micofilms No.AAG9729271) Sivo, S.A. (2001). Multiple indicator stationary time series models. Structural Equation Modeling, 8, 599612. Sivo, S. A. Fan.X. & Witta, L ( 2005). The biasing effects of unmodeled ARMA time series processes on latent growth curve model estimates. Structural Equation Modeling, 12, 215231. Sivo, S. A., & W ilson, V.L. (1998). Is parsimony always desirable? Identifying the correct model for a longitudinal panel data set. Journal of Experimental Education, 66, 249 255. Sivo, S. A., & Willson, V.L.(2000). Modeling causal error structures in longitudinal panel data: A Monte Carlo study. 
Structural Equation Modeling, 7, 174205. Stoel, R. D., Van den Wittenboer, D. & Hox, J. (2004). Including time invariant covariate in the latent growth curve model. Structural Equation Modeling, 11, 155167. Verbeke, G. & Molenb erghs G. (2000). Linear mixed models for longitudinal data. New York: Springer -Verlag.
PAGE 160
160 Willett, J. B., & Keiley, M. K. (2000). Using covariance structure analysis to model change over time. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied mul tivariate statistics and mathematical modeling (pp. 665 694). San Diego, CA: Academic Press. Willett, J. B., & Sayer, A. G. (1994). Using covariance structure analysis to detect correlates and predictors of individual change over time. Psychological Bulletin, 116(2). Wolfinger, R. (1993). Covariance structure selection in general mixed models. Communications in Statistics, Simulation and Computation, 22, 10791106 You, W. (2006). Assessing the impact of failure to adequately model the residual structure in growth modeling. Doctoral dissertation, University of Virginia, Charlottesville, VA. Yuan, K., & Bentler, P. M. (2004). On chi -square difference and z tests in mean and covariance structure analysis when the base model is misspecified. Educational and Psyc hological Measurement, 64 (5), 737757. Yuan, K., & Bentler, P. M. (2006). Mean comparison: Manifest variable versus latent variable. Psychometrika, 71(1), 139159.
BIOGRAPHICAL SKETCH

Yuying Shi was born in China. She obtained her bachelor's degree in economics, with a major in finance, from Shanghai University of Finance and Economics. In 2003, she came to Kent State University for graduate study, where she received her master's degree in economics. She began her Ph.D. study in research and evaluation methodology at the University of Florida in 2005 and received her doctorate in August 2009.