Metodološki zvezki, Vol. 1, No. 1, 2004, 185-204
Estimation of Dynamic Structural Equation Models with Latent Variables
Dario Cziráky1
Abstract
The paper proposes a time series generalisation of the structural equation model with latent variables (SEM). An instrumental variable estimator is considered and its asymptotic properties are analysed. Special emphasis is placed on the potential use of lagged observed variables as instruments, and consistency of such estimation is established under some general assumptions about the stochastic properties of the modelled variables. In addition, an identification procedure suitable for both static and dynamic structural equation models is described. The methods are illustrated in an empirical application to dynamic panel estimation of a consumption function using UK household data.
1     Introduction
Latent variable methods for time series data are notably underdeveloped in comparison with cross-sectional methods. So far, the main developments in the literature have focused on simple factor analysis models without causal or structural relationships between latent variables.
Stock and Watson (1989) considered a time series single-factor model of asset return and Stock and Watson (1999) analysed factor analytic models for forecasting purposes. Lewbel (1991) and Donald (1997) considered factor analytic models for time series data and proposed a procedure for determining the number of factors. Similarly Cragg and Donald (1997), Connor and Korajczyk (1993), Stock and Watson (1998), and Bai and Ng (2002) developed procedures for determining the number of factors in time series and panel models. An early selection procedure for pure time series factor models was proposed by Mallows (1973).
Sargent and Sims (1977), Geweke (1977), and Forni et al. (2000) considered estimation of dynamic factor models.2 Chamberlain and Rothschild (1983) analysed approximate factor models allowing for correlation in the idiosyncratic components of the latent errors. Recently, Bai (2003) developed asymptotic inferential theory for a principal components estimator of factor models suitable for large panels. However, time series generalisations of latent variable models that include structural (causal) relationships among latent variables, such as the general structural equation model with latent variables (SEM or LISREL) developed by Jöreskog (1973) and Jöreskog et al. (2000), have not been developed.
1 Department of Statistics, London School of Economics; d.ciraki@lse.ac.uk
2 A dynamic factor model is specified as $x_t = \sum_{i=1}^{p} \Lambda_i \xi_{t-i} + e_t$, i.e. the contemporaneous observable indicators are assumed to be caused by both contemporaneous and lagged latent factors.
In this paper we propose a time series generalisation of the structural equation model with latent variables in the form of a structural autoregressive distributed lag model with latent variables and propose a general estimation procedure. We show how instrumental variables methods can be used to estimate dynamic latent variable models and we analyse the asymptotic properties of these estimators. In particular, we consider instruments in the form of the lagged observable indicators and show that these can be used for consistent estimation.
The paper is organised as follows. The second section describes the static structural equation model with latent variables and the third section generalises this model to a dynamic structural equation model. The fourth section describes IV estimation procedures, the fifth section deals with the identification of the model, and the sixth section presents an empirical application.
2     Static structural equation model
The static structural equation model with latent variables (Jöreskog and Sörbom, 1996) is specified with three matrix equations: the structural equation, the measurement equation for latent exogenous variables, and the measurement equation for latent endogenous variables,
$$\eta = \alpha_\eta + B\eta + \Gamma\xi + \zeta, \qquad x = \alpha_x + \Lambda_x\xi + \delta, \qquad y = \alpha_y + \Lambda_y\eta + \varepsilon, \tag{2.1}$$
where $\eta$ is an $(m \times 1)$ vector of endogenous latent variables; $\xi$ is a $(g \times 1)$ vector of exogenous latent variables; $B$ and $\Gamma$ are $(m \times m)$ and $(m \times g)$ matrices of structural coefficients, respectively; $\Lambda_x$ and $\Lambda_y$ are $(k \times g)$ and $(l \times m)$ matrices of factor loadings, respectively; $\alpha_\eta$, $\alpha_x$, and $\alpha_y$ are $(m \times 1)$, $(k \times 1)$, and $(l \times 1)$ vectors of intercepts, respectively.
3     Dynamic structural equation model (DSEM)
We formulate a dynamic structural equation model with latent variables (DSEM) as a time series generalisation of the static structural equation model with latent variables.3 Specifically, we define a structural autoregressive distributed lag model of the form
$$\eta_t = \alpha_\eta + \sum_{j=0}^{p} B_j \eta_{t-j} + \sum_{j=0}^{q} \Gamma_j \xi_{t-j} + \zeta_t, \tag{3.1}$$
where $\alpha_\eta$, $B_0$, and $\Gamma_0$ are coefficient matrices from the static model (2.1), and $B_1, B_2, \ldots, B_p$, $\Gamma_1, \Gamma_2, \ldots, \Gamma_q$ are the additional $p + q$ matrices that contain coefficients of the lagged endogenous and exogenous latent variables.4 Note that the specification (3.1) is "structural" because contemporaneous endogenous latent variables might be included as regressors (i.e. $B_0 \neq 0$). If we assume time-invariance of the measurement model, the usual specification of the measurement models for $x_t$ and $y_t$ applies; thus the structural part of the model (3.1) can be augmented with the measurement equation for the latent exogenous variables
$$x_t = \alpha_x + \Lambda_x \xi_t + \delta_t \tag{3.2}$$
and for the latent endogenous variables
$$y_t = \alpha_y + \Lambda_y \eta_t + \varepsilon_t. \tag{3.3}$$
The matrix equations (3.1)-(3.3) provide a full specification of a general DSEM model, directly extending the static structural equation model with latent variables (SEM) to time series. It follows that the static SEM is a special case of the DSEM model.
3 A static version of this model can be easily estimated by software packages such as LISREL 8.54 (see e.g. Cziráky, 2004).
However, the DSEM model from (3.1)-(3.3) cannot be directly estimated due to the presence of unobserved latent components. To solve this problem and enable estimation of the model parameters, we rewrite the latent variable specification in terms of the observed variables and latent errors only, following the approach similar to Bollen (1996; 2001; 2002). Bollen used such specification to enable non-parametric estimation of standard (cross-sectional) structural equation models with an aim of achieving greater robustness to misspecification and non-normality.
In this paper we show that a similar approach can be used to re-write the DSEM model in the observed form specification (OFS) and to subsequently estimate all model parameters (except latent error terms) by generalised instrumental variables methods.
The OFS uses the fact that in the measurement model for each latent variable one loading can be fixed to one without loss of generality. Thus, we can re-write the measurement models for xt and yt as
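$$x_t = \begin{pmatrix} x_{1t} \\ x_{2t} \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_2^{(x)} \end{pmatrix} + \begin{pmatrix} I \\ \Lambda_2^{(x)} \end{pmatrix} \xi_t + \begin{pmatrix} \delta_{1t} \\ \delta_{2t} \end{pmatrix} \tag{3.4}$$
and
$$y_t = \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_2^{(y)} \end{pmatrix} + \begin{pmatrix} I \\ \Lambda_2^{(y)} \end{pmatrix} \eta_t + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}. \tag{3.5}$$
Note that the observed indicators with unit loadings were placed in the top part of the vectors for $x_t$ and $y_t$ and thus the upper part of the lambda matrix is an identity matrix. Having divided $x_t$ into $x_{1t}$ and $x_{2t}$, note that for $x_{1t}$ it holds that
$$x_{1t} = \xi_t + \delta_{1t} \;\Rightarrow\; \xi_t = x_{1t} - \delta_{1t} \tag{3.6}$$
and, similarly, for $y_{1t}$ we can replace the latent variable with its unit-loading indicators
$$y_{1t} = \eta_t + \varepsilon_{1t} \;\Rightarrow\; \eta_t = y_{1t} - \varepsilon_{1t}. \tag{3.7}$$
4 Note that (3.1) does not require specification of lagged latent variables as separate variables; rather each vector containing all modelled and exogenous latent variables is written for each included lag separately, with a separate coefficient matrix. Also note that (3.1) allows different lag lengths for different latent variables (i.e., elements of the $\eta$ and $\xi$ vectors) by appropriate specification of the $B_j$ and $\Gamma_j$ matrices (e.g., zero elements).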
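It is now possible to use the relations in (3.6) and (3.7) to re-write the measurement model for $x_{2t}$ as
$$\begin{aligned} x_{2t} &= \alpha_2^{(x)} + \Lambda_2^{(x)} (x_{1t} - \delta_{1t}) + \delta_{2t} \\ &= \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + (\delta_{2t} - \Lambda_2^{(x)} \delta_{1t}) \end{aligned} \tag{3.8}$$
and for $y_{2t}$ as
$$\begin{aligned} y_{2t} &= \alpha_2^{(y)} + \Lambda_2^{(y)} (y_{1t} - \varepsilon_{1t}) + \varepsilon_{2t} \\ &= \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + (\varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t}). \end{aligned} \tag{3.9}$$
Following the same principle it is possible to re-write the structural part of the model using definitions (3.6) and (3.7) as follows
$$y_{1t} - \varepsilon_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j (y_{1t-j} - \varepsilon_{1t-j}) + \sum_{j=0}^{q} \Gamma_j (x_{1t-j} - \delta_{1t-j}) + \zeta_t. \tag{3.10}$$
Separating the observed part of the model from the latent errors we obtain
$$y_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j y_{1t-j} + \sum_{j=0}^{q} \Gamma_j x_{1t-j} + \left( \zeta_t + \varepsilon_{1t} - \sum_{j=0}^{p} B_j \varepsilon_{1t-j} - \sum_{j=0}^{q} \Gamma_j \delta_{1t-j} \right), \tag{3.11}$$
with the measurement model for the latent endogenous variables
$$y_{2t} = \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + (\varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t}), \tag{3.12}$$
and for the latent exogenous variables
$$x_{2t} = \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + (\delta_{2t} - \Lambda_2^{(x)} \delta_{1t}). \tag{3.13}$$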
Aside from the specific structure of the latent error terms, (3.11)-(3.13) present a classical structural equation system with observed variables. However, the OFS form of the DSEM model differs from the standard econometric simultaneous equation system with respect to the exogeneity status of the OFS variables, which are generally observable indicators of the latent variables.
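To make the structure of the composite error in (3.11) concrete, consider a hypothetical scalar special case, given here only as an illustration: $m = g = 1$, $p = 1$, $q = 0$, no contemporaneous self-effect ($B_0 = 0$), and measurement errors assumed to be serially uncorrelated and uncorrelated with the latent variables and with $\zeta_t$. Under these assumptions the OFS regressors are correlated with the composite error, while indicators lagged at least two periods are not:

```latex
\begin{aligned}
y_{1t} &= \alpha_\eta + \beta_1 y_{1,t-1} + \gamma_0 x_{1t} + u_{1t}, \\
u_{1t} &= \zeta_t + \varepsilon_{1t} - \beta_1 \varepsilon_{1,t-1} - \gamma_0 \delta_{1t}, \\
\operatorname{Cov}(y_{1,t-1}, u_{1t}) &= -\beta_1 \operatorname{Var}(\varepsilon_{1,t-1}), \qquad
\operatorname{Cov}(x_{1t}, u_{1t}) = -\gamma_0 \operatorname{Var}(\delta_{1t}), \\
\operatorname{Cov}(y_{1,t-2}, u_{1t}) &= 0 .
\end{aligned}
```

This is why least squares applied to the OFS equations would generally be inconsistent, and why lags beyond those already included in the equation are natural candidates for instruments.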
It can be shown that estimation of the OFS equations might be possible by the use of the instrumental variable (IV) methods. Furthermore, it can be shown that IV estimation might be based on model-implied instruments in the form of various lags of the OFS variables.
We propose a limited information generalised IV (GIVE) technique for consistent estimation of the OFS equations by using the model-implied instruments in the form of the lagged indicators of the latent variables.
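The idea of using lagged indicators as instruments can be sketched with a small simulation. The design below is an assumption made purely for illustration and is not taken from the paper: a single latent AR(1) factor with two indicators, white-noise measurement errors, and hypothetical parameter values `phi` and `lam`. The OFS measurement equation $y_{2t} = \lambda y_{1t} + u_{2t}$ is estimated by least squares and by a simple IV estimator that uses $y_{2,t-1}$ as the instrument.

```python
import numpy as np

def simulate_ofs(T=50_000, phi=0.7, lam=0.8, seed=0):
    """Simulate one latent AR(1) factor with two noisy indicators (illustrative design)."""
    rng = np.random.default_rng(seed)
    zeta = rng.standard_normal(T)
    eta = np.empty(T)
    eta[0] = zeta[0]
    for t in range(1, T):
        eta[t] = phi * eta[t - 1] + zeta[t]       # latent structural equation
    y1 = eta + rng.standard_normal(T)             # unit-loading indicator
    y2 = lam * eta + rng.standard_normal(T)       # second indicator with loading lam
    return y1, y2

def ols_slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def iv_slope(x, y, z):
    """Simple IV slope of y on x using a single instrument z (with intercept)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))

y1, y2 = simulate_ofs()
# OFS measurement equation: y2_t = lam * y1_t + (eps2_t - lam * eps1_t)
lam_ols = ols_slope(y1[1:], y2[1:])               # regressor correlated with the error
lam_iv = iv_slope(y1[1:], y2[1:], y2[:-1])        # instrument: lagged indicator y2_{t-1}
print(f"OLS: {lam_ols:.3f}  IV with lagged indicator: {lam_iv:.3f}")
```

In this design the least-squares slope is attenuated by the measurement error in $y_{1t}$, whereas the IV slope should be close to the assumed loading of 0.8 in large samples.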
4    Estimation of the OFS system
4.1    Full-sample specification
Estimation of the OFS equations aims at consistent and, possibly, efficient estimation of the structural and measurement-model parameters. However, the structural (latent) errors cannot be directly estimated. Therefore, ignoring the specific structure of the measurement error terms, let $u_{1t} = \zeta_t + \varepsilon_{1t} - \sum_{j=0}^{p} B_j \varepsilon_{1t-j} - \sum_{j=0}^{q} \Gamma_j \delta_{1t-j}$, $u_{2t} = \varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t}$, and $u_{3t} = \delta_{2t} - \Lambda_2^{(x)} \delta_{1t}$. The structural OFS equations can then be written as
$$y_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j y_{1t-j} + \sum_{j=0}^{q} \Gamma_j x_{1t-j} + u_{1t}, \tag{4.1}$$
with the measurement models
$$y_{2t} = \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + u_{2t}, \tag{4.2}$$
and
$$x_{2t} = \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + u_{3t}. \tag{4.3}$$
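For notational convenience, we switch to full-sample notation, assuming that $\max(p, q)$ pre-sample observations are available for estimation. Define $y_{kj} = (y_0^{(kj)}, y_1^{(kj)}, \ldots, y_T^{(kj)})'$, for $k = 1, 2$, and $x_{2j} = (x_0^{(2j)}, x_1^{(2j)}, \ldots, x_T^{(2j)})'$, where the subscript $j$ refers to the $j$th equation; there are $m$ individual $y_1$ equations, $n$ individual $y_2$ equations, and $h$ individual $x_2$ equations. Further define $Y_{1j} = (Y_{1jt}, Y_{1jt-k})$ and $X_{1j} = (X_{1jt}, X_{1jt-k})$, where
$$Y_{1jt} = \begin{pmatrix} y_0^{(11)} & y_0^{(12)} & \cdots & y_0^{(1m)} \\ y_1^{(11)} & y_1^{(12)} & \cdots & y_1^{(1m)} \\ y_2^{(11)} & y_2^{(12)} & \cdots & y_2^{(1m)} \\ \vdots & \vdots & & \vdots \\ y_T^{(11)} & y_T^{(12)} & \cdots & y_T^{(1m)} \end{pmatrix}, \qquad X_{1jt} = \begin{pmatrix} x_0^{(11)} & x_0^{(12)} & \cdots & x_0^{(1g)} \\ x_1^{(11)} & x_1^{(12)} & \cdots & x_1^{(1g)} \\ x_2^{(11)} & x_2^{(12)} & \cdots & x_2^{(1g)} \\ \vdots & \vdots & & \vdots \\ x_T^{(11)} & x_T^{(12)} & \cdots & x_T^{(1g)} \end{pmatrix},$$
$$Y_{1jt-k} = \begin{pmatrix} y_{-1}^{(11)} & y_{-1}^{(12)} & \cdots & y_{-1}^{(1m)} & \cdots & y_{-p}^{(11)} & y_{-p}^{(12)} & \cdots & y_{-p}^{(1m)} \\ y_{0}^{(11)} & y_{0}^{(12)} & \cdots & y_{0}^{(1m)} & \cdots & y_{1-p}^{(11)} & y_{1-p}^{(12)} & \cdots & y_{1-p}^{(1m)} \\ y_{1}^{(11)} & y_{1}^{(12)} & \cdots & y_{1}^{(1m)} & \cdots & y_{2-p}^{(11)} & y_{2-p}^{(12)} & \cdots & y_{2-p}^{(1m)} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ y_{T-1}^{(11)} & y_{T-1}^{(12)} & \cdots & y_{T-1}^{(1m)} & \cdots & y_{T-p}^{(11)} & y_{T-p}^{(12)} & \cdots & y_{T-p}^{(1m)} \end{pmatrix},$$
and
$$X_{1jt-k} = \begin{pmatrix} x_{-1}^{(11)} & x_{-1}^{(12)} & \cdots & x_{-1}^{(1g)} & \cdots & x_{-q}^{(11)} & x_{-q}^{(12)} & \cdots & x_{-q}^{(1g)} \\ x_{0}^{(11)} & x_{0}^{(12)} & \cdots & x_{0}^{(1g)} & \cdots & x_{1-q}^{(11)} & x_{1-q}^{(12)} & \cdots & x_{1-q}^{(1g)} \\ x_{1}^{(11)} & x_{1}^{(12)} & \cdots & x_{1}^{(1g)} & \cdots & x_{2-q}^{(11)} & x_{2-q}^{(12)} & \cdots & x_{2-q}^{(1g)} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ x_{T-1}^{(11)} & x_{T-1}^{(12)} & \cdots & x_{T-1}^{(1g)} & \cdots & x_{T-q}^{(11)} & x_{T-q}^{(12)} & \cdots & x_{T-q}^{(1g)} \end{pmatrix}.$$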
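In addition, we define the following notation for the parameter vectors
$$\beta_j = \left( \beta_0^{(11)}, \beta_0^{(12)}, \ldots, \beta_0^{(1m)}, \beta_1^{(11)}, \beta_1^{(12)}, \ldots, \beta_1^{(1m)}, \ldots, \beta_p^{(11)}, \beta_p^{(12)}, \ldots, \beta_p^{(1m)} \right)',$$
$$\gamma_j = \left( \gamma_0^{(11)}, \gamma_0^{(12)}, \ldots, \gamma_0^{(1g)}, \gamma_1^{(11)}, \gamma_1^{(12)}, \ldots, \gamma_1^{(1g)}, \ldots, \gamma_q^{(11)}, \gamma_q^{(12)}, \ldots, \gamma_q^{(1g)} \right)',$$
$$\lambda_j^{(y)} = \left( \lambda_{yj}^{(21)}, \lambda_{yj}^{(22)}, \ldots, \lambda_{yj}^{(2n)} \right)', \quad \text{and} \quad \lambda_j^{(x)} = \left( \lambda_{xj}^{(21)}, \lambda_{xj}^{(22)}, \ldots, \lambda_{xj}^{(2h)} \right)'.$$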
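Using the above notation, we can now write (4.1)-(4.3) as
$$y_{1j} = \alpha_{1j}^{(y)} + Y_{1j} \beta_j + X_{1j} \gamma_j + u_{1j}, \tag{4.4}$$
$$y_{2j} = \alpha_{2j}^{(y)} + Y_{1jt} \lambda_j^{(y)} + u_{2j}, \tag{4.5}$$
$$x_{2j} = \alpha_{2j}^{(x)} + X_{1jt} \lambda_j^{(x)} + u_{3j}. \tag{4.6}$$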
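Note that the individual OFS equations are specified as
$$y_{1jt} = \alpha_{1j}^{(y)} + \sum_{k=1}^{m} \sum_{i=0}^{p} \beta_i^{(1k)} y_{t-i}^{(1k)} + \sum_{k=1}^{g} \sum_{i=0}^{q} \gamma_i^{(1k)} x_{t-i}^{(1k)} + u_{1jt},$$
for the structural part of the model, and as
$$y_{2jt} = \alpha_{2j}^{(y)} + \sum_{k=1}^{m} \lambda_{2jk}^{(y)} y_t^{(1k)} + u_{2jt}, \qquad x_{2jt} = \alpha_{2j}^{(x)} + \sum_{k=1}^{g} \lambda_{2jk}^{(x)} x_t^{(1k)} + u_{3jt},$$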
for the measurement models. This completes the specification of the DSEM model. It remains to show that the available instruments in the form of lags of the observed variables can enable consistent estimation. The issue of the choice of instruments is also discussed in Bollen (1996; 2001), however he does not discuss this issue in the context of dynamic models. The following discussion takes into account the specific structure of the OFS system and the implications derived from the composition of the latent errors. This (known) composition of the latent error terms and their implied relation with the observed components of the model, as a consequence of the latent structure, presents the major difference between the DSEM OFS equations and classical econometric models. Specifically, it is not possible to
simply assume the availability of external instrumental variables that satisfy some general conditions such as being uncorrelated with the errors and correlated with the regressors. Rather, it will be necessary to show under which conditions the lagged modelled variables can serve as valid instruments in the estimation of the OFS equations.
4.2    Consistency conditions and instrumental variables
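The standard consistency conditions needed for the validity of instrumental variables (see e.g. Judge et al., 1985, and Davidson and MacKinnon, 1993) can be stated in terms of the data matrix $X$ defined as $X = (\iota, Y_{1j}, X_{1j})$, with $\iota$ a vector of ones, where $Y_{1j} = (Y_{1jt}, Y_{1jt-k})$ and $X_{1j} = (X_{1jt}, X_{1jt-k})$, as defined above. Let $Z$ be a matrix of valid instruments defined as $Z = (Y_1^*, Y_2^*, X_1^*, X_2^*)$, where $Y_1^* \equiv (Y_{11}^*, Y_{12}^*, \ldots, Y_{1a}^*)$, $Y_2^* \equiv (Y_{21}^*, Y_{22}^*, \ldots, Y_{2b}^*)$, $X_1^* \equiv (X_{11}^*, X_{12}^*, \ldots, X_{1c}^*)$, $X_2^* \equiv (X_{21}^*, X_{22}^*, \ldots, X_{2d}^*)$, and
$$Y_{1k}^* = \begin{pmatrix} y_{-p-k}^{(11)} & y_{-p-k}^{(12)} & \cdots & y_{-p-k}^{(1m)} \\ y_{1-p-k}^{(11)} & y_{1-p-k}^{(12)} & \cdots & y_{1-p-k}^{(1m)} \\ y_{2-p-k}^{(11)} & y_{2-p-k}^{(12)} & \cdots & y_{2-p-k}^{(1m)} \\ \vdots & \vdots & & \vdots \\ y_{T-p-k}^{(11)} & y_{T-p-k}^{(12)} & \cdots & y_{T-p-k}^{(1m)} \end{pmatrix}, \qquad Y_{2l}^* = \begin{pmatrix} y_{-l}^{(21)} & y_{-l}^{(22)} & \cdots & y_{-l}^{(2n)} \\ y_{-l+1}^{(21)} & y_{-l+1}^{(22)} & \cdots & y_{-l+1}^{(2n)} \\ y_{-l+2}^{(21)} & y_{-l+2}^{(22)} & \cdots & y_{-l+2}^{(2n)} \\ \vdots & \vdots & & \vdots \\ y_{T-l}^{(21)} & y_{T-l}^{(22)} & \cdots & y_{T-l}^{(2n)} \end{pmatrix},$$
$$X_{1i}^* = \begin{pmatrix} x_{-q-i}^{(11)} & x_{-q-i}^{(12)} & \cdots & x_{-q-i}^{(1g)} \\ x_{1-q-i}^{(11)} & x_{1-q-i}^{(12)} & \cdots & x_{1-q-i}^{(1g)} \\ x_{2-q-i}^{(11)} & x_{2-q-i}^{(12)} & \cdots & x_{2-q-i}^{(1g)} \\ \vdots & \vdots & & \vdots \\ x_{T-q-i}^{(11)} & x_{T-q-i}^{(12)} & \cdots & x_{T-q-i}^{(1g)} \end{pmatrix}, \qquad X_{2j}^* = \begin{pmatrix} x_{-j}^{(21)} & x_{-j}^{(22)} & \cdots & x_{-j}^{(2h)} \\ x_{-j+1}^{(21)} & x_{-j+1}^{(22)} & \cdots & x_{-j+1}^{(2h)} \\ x_{-j+2}^{(21)} & x_{-j+2}^{(22)} & \cdots & x_{-j+2}^{(2h)} \\ \vdots & \vdots & & \vdots \\ x_{T-j}^{(21)} & x_{T-j}^{(22)} & \cdots & x_{T-j}^{(2h)} \end{pmatrix},$$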
where k = 1, 2, . . ., a; l = 1, 2, . . ., b; i = 1, 2, . . ., c; and j = 1, 2, . . ., d.
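The lagged instrument blocks defined above can be assembled mechanically from the observed series. The following is a minimal NumPy sketch under stated assumptions: the helper names `lag_block` and `instrument_matrix` are hypothetical, the input array is assumed to hold the pre-sample observations in its first `n_pre` rows, and each column is one indicator.

```python
import numpy as np

def lag_block(series, lag, n_pre):
    """Return the rows t = 0..T of `series` lagged by `lag` periods.

    `series` is assumed to be ordered in time with `n_pre` pre-sample
    rows at the top, so row `n_pre` corresponds to t = 0.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]
    if not 0 <= lag <= n_pre:
        raise ValueError("lag must lie between 0 and the number of pre-sample rows")
    n_rows = series.shape[0] - n_pre        # number of estimation-sample rows (T + 1)
    start = n_pre - lag                     # first row of the lagged block
    return series[start:start + n_rows]

def instrument_matrix(series, lags, n_pre):
    """Stack lagged blocks side by side, one block per requested lag."""
    return np.hstack([lag_block(series, lag, n_pre) for lag in lags])

# Two indicators observed over 10 periods, the first 3 treated as pre-sample.
data = np.arange(20, dtype=float).reshape(10, 2)
Z = instrument_matrix(data, lags=[2, 3], n_pre=3)
print(Z.shape)
```

Each row of `Z` is aligned with one estimation-sample period, so `Z` can be passed directly to an IV routine together with regressors built over the same rows.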
We state the general conditions for these instruments in terms of the joint matrices $X$ and $Z$ though, in practice, only subsets of these matrices will be used in estimated models. It is generally necessary that
$$\operatorname{plim} \left( T^{-1} Z'Z \right) = \lim_{T \to \infty} T^{-1} Z'Z = \Sigma_{ZZ},$$
and also that
$$\operatorname{plim} \left( T^{-1} Z'X \right) = \lim_{T \to \infty} T^{-1} Z'X = \Sigma_{ZX},$$
where $\Sigma_{ZZ}$ is a positive definite matrix and $\Sigma_{ZX}$ is a finite matrix of full column rank. These conditions will generally hold for the case of lagged instruments given they satisfy certain stochastic conditions. In addition, we assume homoscedastic residuals, i.e., $E(u_i u_j') = \sigma_{ij} I$ and, in particular, $E(Z'u_i) = 0$.
To assure the consistency of the IV estimator we will need to make the following assumption about the stochastic properties of the observed variables.
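Assumption 4.2.1 For stochastic processes $\{y_t\}$ and $\{x_t\}$ suppose that:
A1. $E(y_{ij,t}) = \mu_{ij}^{(y)}, \quad \forall t$
A2. $E(x_{ij,t}) = \mu_{ij}^{(x)}, \quad \forall t$
A3. $E\left[ (y_{ij,t-r} - \mu_{ij}^{(y)}) (y_{ef,t-w} - \mu_{ef}^{(y)}) \right] = \gamma_{r-w}^{(ijef)}, \quad \forall t$
A4. $E\left[ (x_{ij,t-r} - \mu_{ij}^{(x)}) (x_{ef,t-w} - \mu_{ef}^{(x)}) \right] = \gamma_{r-w}^{(ijef)}, \quad \forall t$
A5. $E\left[ (y_{ij,t-r} - \mu_{ij}^{(y)}) (x_{ef,t-w} - \mu_{ef}^{(x)}) \right] = \gamma_{r-w}^{(ijef)}, \quad \forall t$
A6. $\sum_{k=0}^{\infty} \left| \gamma_k^{(ijef)} \right| < \infty$ for each of the autocovariance sequences defined in A3, A4, and A5.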
We will also need the following two lemmas.
Lemma 4.2.2 Let $w_t$ be a covariance-stationary process with finite fourth moments and absolutely summable autocovariances. Then the sample mean satisfies
$$\frac{1}{T} \sum_{t=1}^{T} w_t \xrightarrow{m.s.} E(w_t),$$
where m.s. denotes convergence in mean square.
Proof. Omitted. See Hamilton (1994: 188), Proposition 7.5.
Lemma 4.2.3 Let $y_t$ and $x_t$ be stochastic processes satisfying Assumption 4.2.1. Then the following convergence results hold:
(i) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-s} \xrightarrow{p} E(y_{ijt}) = \mu_{ij}^{(y)}$
(ii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-s}^2 \xrightarrow{p} E(y_{ijt}^2) = \gamma_0^{(ij)} + (\mu_{ij}^{(y)})^2$
(iii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-r} y_{ef,t-w} \xrightarrow{p} E(y_{ij,t-r} y_{ef,t-w}) = \gamma_{r-w}^{(ijef)} + \mu_{ij}^{(y)} \mu_{ef}^{(y)}$
(iv) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-s} \xrightarrow{p} E(x_{ijt}) = \mu_{ij}^{(x)}$
(v) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-s}^2 \xrightarrow{p} E(x_{ijt}^2) = \gamma_0^{(ij)} + (\mu_{ij}^{(x)})^2$
(vi) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-r} x_{ef,t-w} \xrightarrow{p} E(x_{ij,t-r} x_{ef,t-w}) = \gamma_{r-w}^{(ijef)} + \mu_{ij}^{(x)} \mu_{ef}^{(x)}$
(vii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-r} x_{ef,t-w} \xrightarrow{p} E(y_{ij,t-r} x_{ef,t-w}) = \gamma_{r-w}^{(ijef)} + \mu_{ij}^{(y)} \mu_{ef}^{(x)}$
Proof. Omitted. See Cziráky (2003) for details.
The main underlying assumption in Lemma 4.2.2 and Lemma 4.2.3 is that of covariance stationarity for the observable variables. Therefore, to apply these methods to non-stationary variables the data would need to be differenced to achieve stationarity.
Proposition 4.2.4 Let $X = (\iota, Y_{1j}, X_{1j})$ where $Y_{1j} = (Y_{1jt}, Y_{1jt-k})$ and $X_{1j} = (X_{1jt}, X_{1jt-k})$. Let $Z$ be a matrix of valid instruments defined as $Z = (Y_1^*, Y_2^*, X_1^*, X_2^*)$. Assuming that $E(u_i u_j') = \sigma_{ij} I$, the following results hold:
(i) $\operatorname{plim} \left( T^{-1} Z'Z \right) = \Sigma_{ZZ}$
(ii) $\operatorname{plim} \left( T^{-1} Z'X \right) = \Sigma_{ZX}$
(iii) $E(Z'u_i) = 0$
Proof. Omitted. See Cziráky (2003) for details.
The above results allow consistent GIVE estimation of the OFS equations using the available, model-implied (lagged) instruments contained in $Z$, which includes all available eligible instruments that do not come from outside the modelled data. It must be mentioned that nothing precludes the availability of valid instruments that are not merely lags of the modelled variables. However, the nature of structural equation models with latent variables casts doubt on whether such variables will be available. In any case, valid instruments will satisfy the same conditions, but we have shown that eligible instruments might already exist in the data used, in the form of lagged values not already included in the model.
4.3    Consistent generalised instrumental variable estimation of the OFS equations
Formulation and estimation of the OFS equations requires reliance on specific structure and status of the modelled variables. This structure is determined by the latent-form specification and makes specification of the OFS equations rather complex. In order to derive generalised instrumental variable estimators (GIVE) for the OFS equations, we start from the system of equations given in (4.4), (4.5), and (4.6) and write it by positioning its matrix and vector elements in the way that will facilitate the use of more concise notation, i.e.,
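$$\begin{aligned} y_{1j} &= \alpha_{1j}^{(y)} + Y_{1j} \beta_j + X_{1j} \gamma_j + u_{1j} \\ y_{2j} &= \alpha_{2j}^{(y)} + Y_{1jt} \lambda_j^{(y)} + u_{2j} \\ x_{2j} &= \alpha_{2j}^{(x)} + X_{1jt} \lambda_j^{(x)} + u_{3j} \end{aligned} \tag{4.7}$$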
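We are now able to simplify our notation by stacking all of the right-hand-side variables of each of the three parts of the system (4.7) by making the following definitions: $W_{1j} = (\iota, Y_{1j}, X_{1j})$, $W_{2j} = (\iota, Y_{1jt})$, $W_{3j} = (\iota, X_{1jt})$, $\delta_{1j}^{(y)} = (\alpha_{1j}^{(y)}, \beta_j', \gamma_j')'$, $\delta_{2j}^{(y)} = (\alpha_{2j}^{(y)}, \lambda_j^{(y)\prime})'$, and $\delta_{2j}^{(x)} = (\alpha_{2j}^{(x)}, \lambda_j^{(x)\prime})'$. It is now possible to re-write the system (4.7) in a simpler, more concise notation as
$$\begin{aligned} y_{1j} &= W_{1j} \delta_{1j}^{(y)} + u_{1j} \\ y_{2j} &= W_{2j} \delta_{2j}^{(y)} + u_{2j} \\ x_{2j} &= W_{3j} \delta_{2j}^{(x)} + u_{3j} \end{aligned} \tag{4.8}$$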
An appropriate matrix of instruments Z need not contain all available eligible instruments, but it needs to have at least as many of them as there are endogenous variables in each equation. The matrix of instruments Z can differ across different (individual) equations of the system (4.8). For simplicity we assume that Z is correctly specified.
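We proceed by defining the GIVE estimator. First, by premultiplying each part of the system by $Z'$ we obtain the matrix equations $Z'y_{1j} = Z'W_{1j} \delta_{1j}^{(y)} + Z'u_{1j}$, $Z'y_{2j} = Z'W_{2j} \delta_{2j}^{(y)} + Z'u_{2j}$, and $Z'x_{2j} = Z'W_{3j} \delta_{2j}^{(x)} + Z'u_{3j}$. We now define the usual GIVE estimators for the coefficient vectors $\delta_{1j}^{(y)}$, $\delta_{2j}^{(y)}$, and $\delta_{2j}^{(x)}$ as
$$\hat\delta_{1j}^{(y)} = \left( W_{1j}' Z (Z'Z)^{-1} Z' W_{1j} \right)^{-1} W_{1j}' Z (Z'Z)^{-1} Z' y_{1j}, \tag{4.9}$$
$$\hat\delta_{2j}^{(y)} = \left( W_{2j}' Z (Z'Z)^{-1} Z' W_{2j} \right)^{-1} W_{2j}' Z (Z'Z)^{-1} Z' y_{2j}, \tag{4.10}$$
and
$$\hat\delta_{2j}^{(x)} = \left( W_{3j}' Z (Z'Z)^{-1} Z' W_{3j} \right)^{-1} W_{3j}' Z (Z'Z)^{-1} Z' x_{2j}. \tag{4.11}$$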
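It is easy to show that (4.9), (4.10), and (4.11) are consistent estimators of the unknown coefficient vectors $\delta_{1j}^{(y)}$, $\delta_{2j}^{(y)}$, and $\delta_{2j}^{(x)}$. To show this, note that
$$\hat\delta_{ij}^{(*)} = \delta_{ij}^{(*)} + \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1} W_{ij}' Z (Z'Z)^{-1} Z' u_{ij}.$$
Taking probability limits we obtain
$$\begin{aligned} \operatorname{plim} \left( \hat\delta_{ij}^{(*)} \right) &= \delta_{ij}^{(*)} + \left[ \operatorname{plim} \left( T^{-1} W_{ij}' Z \right) \operatorname{plim} \left( T^{-1} Z'Z \right)^{-1} \operatorname{plim} \left( T^{-1} Z' W_{ij} \right) \right]^{-1} \\ &\quad \times \operatorname{plim} \left( T^{-1} W_{ij}' Z \right) \operatorname{plim} \left( T^{-1} Z'Z \right)^{-1} \operatorname{plim} \left( T^{-1} Z' u_{ij} \right) \\ &= \delta_{ij}^{(*)} + \left( \Sigma_{WZ} \Sigma_{ZZ}^{-1} \Sigma_{ZW} \right)^{-1} \Sigma_{WZ} \Sigma_{ZZ}^{-1} \cdot 0 = \delta_{ij}^{(*)}. \end{aligned}$$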
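The estimators (4.9)-(4.11) share one algebraic form, so a single routine can serve all three. The sketch below is a minimal NumPy version under stated assumptions: the function name `give` is hypothetical, `W` already contains a column of ones, and linear systems are solved directly instead of forming explicit inverses, which is a numerical choice rather than anything prescribed by the paper.

```python
import numpy as np

def give(W, Z, y):
    """Generalised IV estimate (W'P_Z W)^{-1} W'P_Z y with P_Z = Z (Z'Z)^{-1} Z'."""
    W = np.asarray(W, dtype=float)
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    if Z.shape[1] < W.shape[1]:
        raise ValueError("need at least as many instruments as regressors")
    # Projection of the regressors on the instrument space: Z (Z'Z)^{-1} Z'W
    W_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ W)
    # W_hat'W = W'Z(Z'Z)^{-1}Z'W and W_hat'y = W'Z(Z'Z)^{-1}Z'y
    return np.linalg.solve(W_hat.T @ W, W_hat.T @ y)

# Small check on noise-free data, where the estimator must return the true vector.
rng = np.random.default_rng(0)
T = 200
instruments = rng.standard_normal((T, 2))
Z = np.column_stack([np.ones(T), instruments])
w = instruments @ np.array([1.0, -0.5]) + rng.standard_normal(T)
W = np.column_stack([np.ones(T), w])
delta_true = np.array([1.0, 2.0])
delta_hat = give(W, Z, W @ delta_true)
print(delta_hat)
```

With noise-free data the routine recovers the coefficient vector up to rounding, which is a convenient sanity check before applying it to the OFS equations.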
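The above result holds for each of the vectors $\hat\delta_{1j}^{(y)}$, $\hat\delta_{2j}^{(y)}$, and $\hat\delta_{2j}^{(x)}$, where superscripts $(y, x)$ were replaced by asterisks, and subscripts $(1, 2)$ by $i$. For computational purposes, the GIVE estimators using the OFS notation defined above can be written in more detail as follows. Firstly, the three sets of coefficient vectors in the structural part of the model are estimated by
$$\begin{pmatrix} \hat\alpha_{1j}^{(y)} \\ \hat\beta_j \\ \hat\gamma_j \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' Y_{1j} & \iota' Z (Z'Z)^{-1} Z' X_{1j} \\ Y_{1j}' Z (Z'Z)^{-1} Z' \iota & Y_{1j}' Z (Z'Z)^{-1} Z' Y_{1j} & Y_{1j}' Z (Z'Z)^{-1} Z' X_{1j} \\ X_{1j}' Z (Z'Z)^{-1} Z' \iota & X_{1j}' Z (Z'Z)^{-1} Z' Y_{1j} & X_{1j}' Z (Z'Z)^{-1} Z' X_{1j} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' y_{1j} \\ Y_{1j}' Z (Z'Z)^{-1} Z' y_{1j} \\ X_{1j}' Z (Z'Z)^{-1} Z' y_{1j} \end{pmatrix}.$$
Secondly, the GIVE estimators of the measurement model are given by
$$\begin{pmatrix} \hat\alpha_{2j}^{(y)} \\ \hat\lambda_j^{(y)} \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' Y_{1jt} \\ Y_{1jt}' Z (Z'Z)^{-1} Z' \iota & Y_{1jt}' Z (Z'Z)^{-1} Z' Y_{1jt} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' y_{2j} \\ Y_{1jt}' Z (Z'Z)^{-1} Z' y_{2j} \end{pmatrix},$$
and
$$\begin{pmatrix} \hat\alpha_{2j}^{(x)} \\ \hat\lambda_j^{(x)} \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' X_{1jt} \\ X_{1jt}' Z (Z'Z)^{-1} Z' \iota & X_{1jt}' Z (Z'Z)^{-1} Z' X_{1jt} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' x_{2j} \\ X_{1jt}' Z (Z'Z)^{-1} Z' x_{2j} \end{pmatrix}.$$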
Asymptotic distribution of these estimators does not depend on the assumption that the modelled data is multivariate normal and, thus, GIVE estimators of the DSEM model are asymptotically distribution free. This is an advantage over the maximum likelihood estimator of the static structural equation model, and therefore, GIVE estimator can prove to be more robust to both misspecification of certain parts of the model and to departure from normality.5
The asymptotic distribution of the GIVE estimators is normal and it can be derived by noting that
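$$\sqrt{T} \left( \hat\delta_{ij}^{(*)} - \delta_{ij}^{(*)} \right) = \left[ \left( T^{-1} W_{ij}' Z \right) \left( T^{-1} Z'Z \right)^{-1} \left( T^{-1} Z' W_{ij} \right) \right]^{-1} \left( T^{-1} W_{ij}' Z \right) \left( T^{-1} Z'Z \right)^{-1} \left( T^{-1/2} Z' u_{ij} \right).$$
If we assume that $T^{-1/2} Z' u_{ij} \xrightarrow{d} N(0, \sigma_{ij} \Sigma_{ZZ})$, we can conclude that the asymptotic distribution of the DSEM coefficient estimates is
$$\sqrt{T} \left( \hat\delta_{ij}^{(*)} - \delta_{ij}^{(*)} \right) \xrightarrow{d} N \left( 0, \sigma_{ij} \left( \Sigma_{W_{ij}Z} \Sigma_{ZZ}^{-1} \Sigma_{ZW_{ij}} \right)^{-1} \right).$$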
5 Misspecification of one OFS equation will not necessarily affect coefficients of other equations since these are estimated separately using a limited information estimator
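The asymptotic covariance matrix $\sigma_{ij} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1}$ can be estimated with $\hat\Sigma_{ij}^{*} = \hat\sigma_{ij} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1}$, where
$$\hat\sigma_{ij} = T^{-1} \hat u_{ij}' \hat u_{ij} = T^{-1} \left[ \left( y_{ij} - W_{ij} \hat\delta_{ij}^{(*)} \right)' \left( y_{ij} - W_{ij} \hat\delta_{ij}^{(*)} \right) \right].$$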
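The covariance estimate above translates directly into code. This is a minimal sketch, assuming the same data layout as the estimator formulas (a regressor matrix `W` with a column of ones, an instrument matrix `Z`, and a response `y`); the function name `give_with_cov` is hypothetical.

```python
import numpy as np

def give_with_cov(W, Z, y):
    """Return the GIVE coefficients, their estimated covariance matrix, and standard errors."""
    W = np.asarray(W, dtype=float)
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    T = W.shape[0]
    W_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ W)   # Z (Z'Z)^{-1} Z'W
    A = W_hat.T @ W                                 # W'Z (Z'Z)^{-1} Z'W
    delta = np.linalg.solve(A, W_hat.T @ y)
    resid = y - W @ delta                           # residuals use W, not W_hat
    sigma2 = resid @ resid / T                      # T^{-1} u'u
    cov = sigma2 * np.linalg.inv(A)
    return delta, cov, np.sqrt(np.diag(cov))
```

When `Z` is set equal to `W` the routine reduces to least squares with the covariance matrix scaled by $T^{-1}\hat u'\hat u$, which gives a simple way to check an implementation.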
The empirical validity of instrumental variables, as opposed to their model-implied eligibility, is testable. The validity of the choice of the instrumental variables can be tested by Sargan's (1964) $\chi^2$ test. Applied to the OFS equations, the Sargan test can be calculated as
$$\frac{y_{ij}' Z (Z'Z)^{-1} Z' y_{ij} - \hat\delta_{ij}^{(*)\prime} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right) \hat\delta_{ij}^{(*)}}{T^{-1} \hat u_{ij}' \hat u_{ij}} \;\overset{app}{\sim}\; \chi^2(d), \tag{4.12}$$
where d is the number of over-identifying instruments, assumed to be independent of the equation error. It is important to note that selection of the IV’s on the basis of the model-implied eligibility without testing for their empirical validity can result in considerable bias in the estimated coefficients. As the choice of instruments affects consistency of GIVE estimates, inappropriate IV selection might result in estimates that will not be robust to misspecification. Therefore, testing for the validity of IV’s should be an important part in empirical estimation of DSEM models.
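A minimal sketch of the statistic in (4.12), under the assumption that `W` and `Z` both include a column of ones; the function name `sargan_statistic` is hypothetical, and the p-value is left to the reader (it would be obtained from a $\chi^2(d)$ distribution).

```python
import numpy as np

def sargan_statistic(W, Z, y):
    """Sargan statistic as in (4.12) and its degrees of freedom (over-identifying instruments)."""
    W, Z, y = (np.asarray(a, dtype=float) for a in (W, Z, y))
    T = W.shape[0]
    ZtW, Zty = Z.T @ W, Z.T @ y
    ZtZ_inv_ZtW = np.linalg.solve(Z.T @ Z, ZtW)
    ZtZ_inv_Zty = np.linalg.solve(Z.T @ Z, Zty)
    A = ZtW.T @ ZtZ_inv_ZtW                         # W'Z (Z'Z)^{-1} Z'W
    delta = np.linalg.solve(A, ZtW.T @ ZtZ_inv_Zty)
    resid = y - W @ delta
    # Numerator of (4.12): y'P_Z y - delta'(W'P_Z W) delta
    numerator = Zty @ ZtZ_inv_Zty - delta @ A @ delta
    stat = numerator / (resid @ resid / T)
    dof = Z.shape[1] - W.shape[1]
    return float(stat), int(dof)
```

In the exactly identified case the numerator is zero by construction, so the test is informative only when there are more instruments than regressors.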
5    Identification
Identification of the static structural equation models with latent variables is generally problematic. An early discussion of this topic can be found already in Wiley (1973), but a simple and straightforward procedure still does not exist. On the other hand, identification is well defined and straightforward in classical econometric simultaneous equation systems, and a similar approach can be developed for the OFS equations.
We propose a simple procedure that uses only the coefficient matrices from the latent specification for identifying the OFS estimation equations. The following technique provides sufficient conditions for identification of all equations in the systems.
Proposition 5.0.1 Given a DSEM model with the structural equation of the form $\eta_t = \alpha_\eta + \sum_{j=0}^{p} B_j \eta_{t-j} + \sum_{j=0}^{q} \Gamma_j \xi_{t-j} + \zeta_t$ and the measurement model given by $x_t = \alpha_x + \Lambda_x \xi_t + \delta_t$ and $y_t = \alpha_y + \Lambda_y \eta_t + \varepsilon_t$, define
$$K = \begin{pmatrix} (I - B_0)' & -\Lambda_2^{(y)\prime} & 0 \\ 0 & I_n & 0 \\ 0 & 0 & I_h \end{pmatrix}, \qquad G = \begin{pmatrix} -\alpha_\eta' & -\alpha_2^{(y)\prime} & -\alpha_2^{(x)\prime} \\ -B_1' & 0 & 0 \\ -B_2' & 0 & 0 \\ \vdots & \vdots & \vdots \\ -B_p' & 0 & 0 \\ -\Gamma_0' & 0 & -\Lambda_2^{(x)\prime} \\ -\Gamma_1' & 0 & 0 \\ \vdots & \vdots & \vdots \\ -\Gamma_q' & 0 & 0 \end{pmatrix}.$$
Then, the $j$th equation of the system will be identified iff
$$\operatorname{rank} \left( R_j \begin{pmatrix} K \\ G \end{pmatrix} \right) \geq m + n + h - 1, \tag{5.1}$$
where $R_j$ is a zero-one selection matrix having ones in places of omitted variables and one row for each omission. Note that if the equality holds the equation is exactly identified, otherwise it is overidentified.
Corollary 5.0.2 A corollary to Proposition 5.0.1 states that unless
$$\operatorname{rank} (R_j) \geq m + n + h - 1 \tag{5.2}$$
the $j$th equation is not identified. The condition (5.2) is necessary for identification, while condition (5.1) is sufficient.
Proof. Omitted. See Cziráky (2003) for details.
It is therefore possible to use these rules to check for identification of each individual equation. The relevance of this approach lies in its ability to check for identification of the model that is specified in latent form and thus it avoids the need to derive the OFS equations. In addition, this method is equally applicable for both static and dynamic structural equation models with latent variables.
6 An empirical application: Estimating a dynamic latent consumption function using UK household data
We apply the methods proposed above by estimating a latent consumption function model that incorporates liquidity effects using micro data from the last 10 waves
Table 1: Variables and notation.
Symbol    Description
Ft         Annual personal food expenditure
Ht        Annual personal housing costs
Lt         Annual labour income
NLt       Annual non-labour, non-investment income
It         Annual investment income
St         Annual personal savings
Bt         Cumulative credit repayment problem
EDt       Highest level of academic education
(years) of the British Household Panel Survey (BHPS). We merged the 10 waves of the BHPS (Taylor et al., 2001) into a panel. Since the available variables on consumption expenditure, types of income, and liquidity constraints indicators vary across waves, we use only those variables that were available across all 10 waves.6
The variables we use in estimation are shown in Table 1. Household data (expenditures) were first spread to the individual level and then combined with the individual level income data, thus creating all-individual data files for each wave. Finally, wave-specific files were merged into a joint panel for all individuals across all waves in the "long format", meaning that the first individual in the sample is recorded at each time point, followed by the second individual, etc. In this analysis we do not address the issues of missing values and attrition, but we note that we used data on 3,324 individuals that had no missing values. Thus our panel with NT observations amounted to 33,240 observations.
We specify our model in the general DSEM form. The model assumes that current consumption, modelled as a latent variable, depends on current (latent) income, previous period consumption and previous period income. Simultaneously, current income depends on the last period income and on education, which for simplicity is assumed to be perfectly measured by a single indicator, EDt. Finally, the (latent) liquidity constraints are directly incorporated into the model and assumed to depend on previous period consumption.7 The structural part of the model describes the relationships among the latent variables and is specified as
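$$\begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix} = \begin{pmatrix} \alpha_{LQ} \\ \alpha_{C} \\ \alpha_{Y} \end{pmatrix} + \begin{pmatrix} 0 & 0 & \beta_{13}^{(0)} \\ \beta_{21}^{(0)} & 0 & \beta_{23}^{(0)} \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix} + \begin{pmatrix} \beta_{11}^{(1)} & \beta_{12}^{(1)} & \beta_{13}^{(1)} \\ 0 & \beta_{22}^{(1)} & \beta_{23}^{(1)} \\ 0 & 0 & \beta_{33}^{(1)} \end{pmatrix} \begin{pmatrix} LQ_{t-1} \\ C_{t-1} \\ Y_{t-1} \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \gamma_{31} \end{pmatrix} (E_t) + \begin{pmatrix} \zeta_{LQ} \\ \zeta_{C} \\ \zeta_{Y} \end{pmatrix}$$
6 See Cziráky (2002) for details on data manipulation and computer implementation.
7 Logically, we expect that excessive spending in one year causes a greater degree of liquidity constraints in the following year.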
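There are three measurement models, for the latent consumption, income and liquidity constraints variables. The measurement model is given by
$$\begin{pmatrix} S_t \\ F_t \\ L_t \\ B_t \\ H_t \\ NL_t \\ I_t \end{pmatrix} = \begin{pmatrix} \alpha_S^{(y)} \\ \alpha_F^{(y)} \\ \alpha_L^{(y)} \\ \alpha_B^{(y)} \\ \alpha_H^{(y)} \\ \alpha_{NL}^{(y)} \\ \alpha_I^{(y)} \end{pmatrix} + \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \lambda_{41}^{(y)} & 0 & 0 \\ 0 & \lambda_{52}^{(y)} & 0 \\ 0 & 0 & \lambda_{63}^{(y)} \\ 0 & 0 & \lambda_{73}^{(y)} \end{pmatrix} \begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix} + \begin{pmatrix} \varepsilon_{St} \\ \varepsilon_{Ft} \\ \varepsilon_{Lt} \\ \varepsilon_{Bt} \\ \varepsilon_{Ht} \\ \varepsilon_{NLt} \\ \varepsilon_{It} \end{pmatrix}$$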
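We can re-write the model in the specific OFS form. The OFS form for the structural model is thus given by:
\[
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
=
\begin{pmatrix} \alpha_{S} \\ \alpha_{F} \\ \alpha_{L} \end{pmatrix}
+
\begin{pmatrix}
0 & 0 & \beta_{13}^{(0)} \\
\beta_{21}^{(0)} & 0 & \beta_{23}^{(0)} \\
0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
+
\begin{pmatrix}
\beta_{11}^{(1)} & \beta_{12}^{(1)} & \beta_{13}^{(1)} \\
0 & \beta_{22}^{(1)} & \beta_{23}^{(1)} \\
0 & 0 & \beta_{33}^{(1)}
\end{pmatrix}
\begin{pmatrix} S_{t-1} \\ F_{t-1} \\ L_{t-1} \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \\ \gamma_{31} \end{pmatrix}
(ED_t)
+
\begin{pmatrix} u_{11t} \\ u_{12t} \\ u_{13t} \end{pmatrix}
\]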
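and the OFS for the measurement model is given by:
\[
\begin{pmatrix} B_t \\ H_t \\ NL_t \\ I_t \end{pmatrix}
=
\begin{pmatrix} \alpha_{B}^{(y)} \\ \alpha_{H}^{(y)} \\ \alpha_{NL}^{(y)} \\ \alpha_{I}^{(y)} \end{pmatrix}
+
\begin{pmatrix}
\lambda_{41}^{(y)} & 0 & 0 \\
0 & \lambda_{52}^{(y)} & 0 \\
0 & 0 & \lambda_{63}^{(y)} \\
0 & 0 & \lambda_{73}^{(y)}
\end{pmatrix}
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
+
\begin{pmatrix} u_{21t} \\ u_{22t} \\ u_{23t} \\ u_{24t} \end{pmatrix}
\]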
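The need for instruments in the OFS equations can be seen in a small simulation. Under the measurement model above, replacing a latent variable by its unit-loading indicator moves that indicator's measurement error into the composite OFS error, so the regressor and the error become correlated. The sketch below is a static, single-loading illustration with assumed values (a loading of 0.5, unit variances, and a hypothetical extra indicator `z` standing in for a lagged observed variable). It is not the paper's estimator or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 200_000, 0.5                      # assumed sample size and loading

latent = rng.normal(size=n)                # latent variable, e.g. LQ_t
s = latent + rng.normal(size=n)            # scaling indicator (unit loading)
b = lam * latent + rng.normal(size=n)      # second indicator with loading lam
z = latent + rng.normal(size=n)            # hypothetical instrument

# OFS equation: b = lam * s + u with u = eps_b - lam * eps_s.
# s contains eps_s, so s and u are correlated and OLS is attenuated.
ols = (s @ b) / (s @ s)
# z is correlated with s through the latent variable but not with u.
iv = (z @ b) / (z @ s)

print("true loading:", lam)
print("OLS estimate:", round(float(ols), 3))
print("IV estimate: ", round(float(iv), 3))
```

All variables have zero mean by construction, so no intercept is included. With unit variances the OLS probability limit is half the true loading, while the IV estimate is consistent for it.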
We estimate the model using the GIVE technique. Table 2 shows the IV-validity test results (Sargan, 1964) for individual equations estimated by GIVE methods using different Zj matrices. Due to the panel nature of the BHPS data, it was necessary to estimate differenced equations using appropriately constructed IV matrices (see Arellano and Bond, 1991). While differences as well as lags can be used as instruments (given they are selected from the set of eligible instruments for each equation), we used lagged variables only (see Arellano, 1989, for more details on problems caused by the use of differences for instruments in simple error-component models).

Table 2: Validity of instruments tests.
Equation for:   Instruments                                       χ²      d.f.
ΔSt             Bt-3, Ht-3, EDt                                   0.687   2
ΔFt             Bt-3, Ht-3, EDt                                   0.253   1
ΔLt             Bt-3, Ht-3, Ft-3                                  1.253   3
ΔBt             NLt-2, It-2, EDt                                  9.167   2
ΔHt             Bt-2, Ht-2, NLt-2, It-2, St-2, Ft-2, Lt-2, EDt    2.771   7
ΔNLt            Bt-2, Ht-2, EDt                                   1.379   2
ΔIt             Bt-2, Ht-2, EDt                                   2.007   2
The first column of Table 2 shows the endogenous variable for which each OFS equation was estimated. Note that the constant term cannot be estimated in the differenced model; since the intercepts have no substantive importance here, we do not attempt to recover them.
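The following sketch illustrates why the constant drops out of a differenced equation and how lagged levels can be aligned with first differences in a balanced panel. The helper `difference_with_lagged_levels` and the toy numbers are assumptions of this example. The actual instrument matrices follow Arellano and Bond (1991) and are not reproduced here.

```python
import numpy as np

def difference_with_lagged_levels(y, lag):
    """Return (dy, z): first differences and levels lagged `lag` periods.

    y has shape (N, T), one row per individual and one column per wave.
    Each dy = y[i, t] - y[i, t-1] is paired with z = y[i, t-lag].
    Both outputs are flattened individual by individual (long format).
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    dy = y[:, lag:] - y[:, lag - 1:-1]   # differences for t = lag, ..., T-1
    z = y[:, :-lag]                      # levels dated t - lag
    return dy.ravel(), z.ravel()


# Hypothetical panel with N = 2 individuals and T = 4 waves.
y = np.array([[1.0, 2.0, 4.0, 7.0],
              [0.0, 1.0, 1.0, 3.0]])
dy, z = difference_with_lagged_levels(y, lag=2)

# Adding an individual-specific constant leaves the differences unchanged,
# which is why intercepts cannot be estimated from the differenced model.
shifted = y + np.array([[5.0], [-3.0]])
dy_shifted, _ = difference_with_lagged_levels(shifted, lag=2)

print("differences:    ", dy)
print("lag-2 levels:   ", z)
print("constant removed:", np.allclose(dy, dy_shifted))
```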
The second column shows which instruments were used for estimation. The selection was based on minimisation of Sargan's validity-of-instruments test statistic.
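A minimal sketch of this selection rule is given below, using the textbook two-stage least squares form of the generalised IV estimator and the Sargan statistic computed as n times the uncentred R² from regressing the IV residuals on the instruments. The simulated data, the function names `give` and `sargan`, and the two candidate instrument sets are assumptions of this example and do not reproduce the estimates in Table 2.

```python
import numpy as np

def give(y, X, Z):
    """Generalised IV (2SLS) estimate of y = X b + u with instruments Z."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # project X on Z
    b = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    return b, y - X @ b                                # estimate, IV residuals

def sargan(y, X, Z):
    """Sargan statistic and degrees of freedom (instruments minus regressors)."""
    _, u = give(y, X, Z)
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]   # residuals projected on Z
    stat = len(y) * (u_fit @ u_fit) / (u @ u)
    return float(stat), Z.shape[1] - X.shape[1]


rng = np.random.default_rng(1)
n = 5000
Z = rng.normal(size=(n, 3))                            # three valid instruments
e = rng.normal(size=n)                                 # structural error
x = Z.sum(axis=1) + 0.8 * e + rng.normal(size=n)       # endogenous regressor
y = 0.5 * x + e
X = x[:, None]

Z_bad = Z.copy()
Z_bad[:, 0] += e                                       # contaminated instrument

stat_ok, df = sargan(y, X, Z)
stat_bad, _ = sargan(y, X, Z_bad)
chosen = "valid set" if stat_ok < stat_bad else "contaminated set"
print(f"valid set:        Sargan = {stat_ok:.3f} on {df} d.f.")
print(f"contaminated set: Sargan = {stat_bad:.3f} on {df} d.f.")
print("chosen by minimum Sargan statistic:", chosen)
```

Under valid instruments the statistic is asymptotically chi-squared with the stated degrees of freedom, so a small value gives no evidence against instrument validity.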
We report the coefficient estimates in Table 3. Looking at the individual estimates, most coefficients appear well determined, with small standard errors. The attempt to model the degree of liquidity constraints and its influence on the relationship between consumption and income provided little new insight into this well-researched topic. Namely, the effort to construct and model a liquidity constraints variable that includes a cumulative measure of credit repayment problems, though conceptually promising, produced poor statistical results: the coefficient of Bt turned out to be insignificant. Effectively, all that remains in the liquidity constraints measurement model is personal savings, which has a small (though significant) negative effect on consumption (higher savings, in turn, result in lower consumption). A significant negative effect of the LQt-1 variable suggests a cyclical saving pattern, i.e., saving is lower in the current period when it was unusually high in the previous period, and vice versa. The γ31 coefficient can be interpreted as the increase in income for each additional year of education.
Table 3: Coefficient estimates.
Coefficient   Estimate    Standard error
β13^(0)       0.0914      0.0267
β11^(1)       0.3420      0.0059
β12^(1)       0.0254      0.0287
β13^(1)       0.0178      0.0047
β21^(0)       0.0719      0.4430
β23^(0)       0.1553      0.0921
β22^(1)       0.3171      0.0908
β23^(1)       0.0239      0.0121
β33^(1)       0.1690      0.0006
γ31           118.1200    12.3740
λ41^(y)       0.0002      0.0000
λ52^(y)       0.1933      0.2763
λ63^(y)       0.3368      0.0699
λ73^(y)       0.0349      0.0261
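How well determined each coefficient is can be read from the ratio of the estimate to its standard error. The sketch below computes these ratios from the values printed in Table 3. It assumes the usual two-sided 5% normal critical value of 1.96, which the paper does not state, and the dictionary keys are labels chosen for this example. A standard error printed as 0.0000 is rounded, so no ratio is computed for that row.

```python
# (estimate, standard error) pairs as printed in Table 3.
table3 = {
    "beta13(0)": (0.0914, 0.0267),
    "beta11(1)": (0.3420, 0.0059),
    "beta12(1)": (0.0254, 0.0287),
    "beta13(1)": (0.0178, 0.0047),
    "beta21(0)": (0.0719, 0.4430),
    "beta23(0)": (0.1553, 0.0921),
    "beta22(1)": (0.3171, 0.0908),
    "beta23(1)": (0.0239, 0.0121),
    "beta33(1)": (0.1690, 0.0006),
    "gamma31":   (118.1200, 12.3740),
    "lambda41":  (0.0002, 0.0000),
    "lambda52":  (0.1933, 0.2763),
    "lambda63":  (0.3368, 0.0699),
    "lambda73":  (0.0349, 0.0261),
}

CRITICAL = 1.96  # assumed 5% two-sided normal critical value


def t_ratio(estimate, std_error):
    """Estimate divided by its standard error; None if the SE rounds to zero."""
    return estimate / std_error if std_error > 0 else None


ratios = {name: t_ratio(*pair) for name, pair in table3.items()}
significant = [name for name, t in ratios.items()
               if t is not None and abs(t) > CRITICAL]

for name, t in ratios.items():
    shown = "n/a (SE rounds to zero)" if t is None else f"{t:8.2f}"
    print(f"{name:10s} {shown}")
print("above the assumed critical value:", significant)
```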
7    Conclusion
In this paper we considered a time series generalisation of the structural equation model with latent variables and proposed an asymptotically distribution-free approach to its estimation. We described a limited-information instrumental variable estimator and analysed the suitability of lagged observable indicators as instruments. We showed that such lagged variables can be used for consistent estimation under some general assumptions regarding the stochastic properties of the observed variables. The main limitation of this research is the stationarity requirement; hence further extensions should consider non-stationary, possibly cointegrated variables. Another direction for further research would be to consider full-information estimation of the OFS equations, such as maximum likelihood and 3-SLS methods. In addition, diagnostic and fit statistics, including specification and misspecification tests, should be developed, and the small-sample performance of the available estimators should be studied further.
References
[1] Arellano, M. (1989): A note on the Anderson-Hsiao estimator for panel data. Economics Letters, 31, 337–341.
[2] Arellano, M. and Bond, S. (1991): Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58, 277–297.
[3] Bai, J. and Ng, S. (2002): Determining the number of factors in approximate factor models. Econometrica, 70, 191–221.
[4] Bai, J. (2003): Inferential theory for factor models of large dimensions. Econometrica, 71, 135–171.
[5] Bollen, K.A. (1996): An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika, 61, 109–121.
[6] Bollen, K.A. (2001): Two-stage least squares and latent variable models: simultaneous estimation and robustness to misspecification. In R. Cudeck, S. Du Toit, and D. Sörbom (Eds.): Structural Equation Modeling: Present and Future. Chicago: Scientific Software International, 119–138.
[7] Bollen, K.A. (2002): A Note on a Two-Stage Least Squares Estimator for Higher-Order Factor Analysis. Sociological Methods and Research, 30, 568–579.
[8] Connor, G. and Korajczyk, R. (1993): A test for the number of factors in an approximate factor model. Journal of Finance, 48, 1263–1291.
[9] Cragg, J. and Donald, S. (1997): Inferring the rank of a matrix. Journal of Econometrics, 76, 223–250.
[10] Cziráky, D. (2002): Estimation of a general structural equation latent variable autoregressive distributed lag model with an application to UK micro-consumption function. University of Essex, unpublished MA thesis (downloadable from www.policy.hu/cziraky/papers.htm).
[11] Cziráky, D. (2003): Estimation of a dynamic structural equation model with latent variables. International Conference on Methodology and Statistics; September 14–17, Ljubljana, Slovenia (downloadable from www.policy.hu/cziraky/papers.htm).
[12] Cziráky, D. (2004): LISREL 8.54: A programme for structural equation modelling with latent variables. Journal of Applied Econometrics, 19, 135–141.
[13] Cziráky, D., Tišma, S., and Pisarović, A. (2003): Determinants of the low SME loan approval rate in Croatia: A latent variable structural equation approach. Small Business Economics Journal, forthcoming.
[14] Davidson, R. and MacKinnon, J. (1993): Estimation and Inference in Econometrics. Oxford: Oxford University Press.
[15] Dhrymes, P.J., Friend, I., and Gultekin, N.B. (1984): A critical reexamination of the empirical evidence on the arbitrage pricing theory. Journal of Finance, 39, 323–346.
[16] Donald, S. (1997): Inference concerning the number of factors in a multivariate nonparametric relationship. Econometrica, 65, 103–132.
[17] Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000): The generalized dynamic factor model: Identification and estimation. Review of Economics and Statistics, 82, 540–554.
[18] Forni, M. and Reichlin, L. (1998): Let’s get real: A factor-analytic approach to disaggregated business cycle dynamics. Review of Economic Studies, 65, 453–473.
[19] Geweke, J. (1977): The dynamic factor analysis of economic time series. In D.J. Aigner and A.S. Goldberger (Eds.): Latent Variables in Socio-Economic Models. Amsterdam: North Holland.
[20] Jöreskog, K.G. (1973): A general method for estimating a linear structural equation system. In A.S. Goldberger and O.D. Duncan (Eds.): Structural Equation Models in the Social Sciences. Chicago: Academic Press, 85–112.
[21] Judge, G.G., Griffiths, W.E., Hill, R.C., Lütkepohl, H., and Lee, T.C. (1985): The Theory and Practice of Econometrics. 2nd ed. New York: John Wiley.
[22] Lewbel, A. (1991): The rank of demand systems: Theory and nonparametric estimation. Econometrica, 59, 711–730.
[23] Mallows, C.L. (1973): Some comments on Cp. Technometrics, 15, 661–675.
[24] Sargent, T. and Sims, C. (1977): Business cycle modelling without pretending to have too much a priori economic theory. In C. Sims (Ed.): New Methods in Business Cycle Research. Minneapolis: Federal Reserve Bank of Minneapolis.
[25] Stock, J.H. and Watson, M. (1989): New indexes of coincident and leading economic indicators. In O.J. Blanchard and S. Fischer (Eds.): NBER Macroeconomics Annual 1989. Cambridge, MA: M.I.T. Press.
[26] Stock, J.H. and Watson, M. (1998): Diffusion indexes. NBER Working Paper 6702.
[27] Stock, J.H. and Watson, M. (1999): Forecasting inflation. Journal of Monetary Economics, 44, 293–335.
[28] Taylor, M.F. (Ed.) with Brice, J., Buck, N., and Prentice-Lane, E. (2001): British Household Panel Survey User Manual, Volume A: Introduction, Technical Report and Appendices. Colchester: University of Essex.