Yoke

The concept of a yoke, introduced in [a3], is of great importance for geometric, i.e. parametrization-invariant, calculations on statistical models (cf. also Differential geometry in statistical inference; Statistical manifold). A yoke on a model $ M $ induces a metric and families of connections, derivative strings and tensors on $ M $, in terms of which geometric properties of $ M $ may be formulated, see [a5]. Yokes also provide a common framework for discussing the differences and similarities between the expected and observed geometry of $ M $, see [a5]. Furthermore, invariant Taylor expansions of functions defined on $ M $ are obtainable via yokes. Finally, a relationship between yokes and symplectic forms has been established in [a4].

In order to define a yoke, let $ M $ be a smooth $ d $-dimensional manifold and let $ \omega = ( \omega ^ {1} \dots \omega ^ {d} ) $ and, correspondingly, $ ( \omega, \omega ^ \prime ) = ( \omega ^ {1} \dots \omega ^ {d} , \omega ^ {\prime 1 } \dots \omega ^ {\prime d } ) $ denote local coordinates on $ M $ and $ M \times M $, respectively. Arbitrary components of $ \omega $ will be denoted by the letters $ i,j,k,m, \dots $. For two sets of indices $ K _ {t} = k _ {1} \dots k _ {t} $ and $ M _ {u} = m _ {1} \dots m _ {u} $ and a smooth function $ g : {M \times M } \rightarrow \mathbf R $, the symbol $ /g _ {K _ {t} ;M _ {u} } $ is used for the values of the function

$$ g _ {K _ {t} ;M _ {u} } ( \omega, \omega ^ \prime ) = { \frac{\partial ^ {t + u } g ( \omega, \omega ^ \prime ) }{\partial \omega ^ {k _ {1} } \dots \partial \omega ^ {k _ {t} } \partial \omega ^ {\prime m _ {1} } \dots \partial \omega ^ {\prime m _ {u} } } } $$

evaluated at the diagonal of $ M \times M $, i.e.

$$ /g _ {K _ {t} ;M _ {u} } ( \omega ) = g _ {K _ {t} ;M _ {u} } ( \omega, \omega ) . $$

With this notation, a yoke is a smooth function $ g : {M \times M } \rightarrow \mathbf R $ such that for every $ \omega \in M $:

i) $ /g _ {i; } ( \omega ) = 0 $;

ii) the matrix $ [ /g _ {i;j } ( \omega ) ] $ is non-singular.

A normalized yoke is a yoke satisfying the additional condition $ g ( \omega, \omega ) = 0 $. For any yoke $ g $ there exists a corresponding normalized yoke $ {\overline{g}\; } $, given by $ {\overline{g}\; } ( \omega, \omega ^ \prime ) = g ( \omega, \omega ^ \prime ) - g ( \omega ^ \prime , \omega ^ \prime ) $, and a dual yoke $ g ^ {*} $, given by $ g ^ {*} ( \omega, \omega ^ \prime ) = {\overline{g}\; } ( \omega ^ \prime , \omega ) $.
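The two constructions above can be sketched in code. The following is a minimal illustration and is not taken from the references: the helper names `normalize` and `dual` are hypothetical, and the example function `g` (of Poisson type, in a one-dimensional parameter) is an assumption, chosen only because it is a yoke that is neither normalized nor symmetric in its arguments.

```python
from math import exp

def normalize(g):
    """Return the normalized yoke (w, wp) -> g(w, wp) - g(wp, wp)."""
    return lambda w, wp: g(w, wp) - g(wp, wp)

def dual(g):
    """Return the dual yoke (w, wp) -> g_bar(wp, w), g_bar being the normalized yoke."""
    g_bar = normalize(g)
    return lambda w, wp: g_bar(wp, w)

# Illustrative (assumed) yoke on M = R: g(w, wp) = w * exp(wp) - exp(w).
# Its derivative in w, exp(wp) - exp(w), vanishes on the diagonal (condition i),
# and its mixed derivative, exp(wp), is never zero (condition ii).
def g(w, wp):
    return w * exp(wp) - exp(w)

g_bar = normalize(g)
g_star = dual(g)

print(g(0.5, 0.5))      # non-zero, so g itself is not normalized
print(g_bar(0.5, 0.5))  # 0.0: the normalized yoke vanishes on the diagonal
print(g_star(0.2, 0.9) == g_bar(0.9, 0.2))  # True by construction
```

For this choice the normalized yoke works out to $ e ^ {\omega ^ \prime } ( \omega - \omega ^ \prime ) - e ^ \omega + e ^ {\omega ^ \prime } $, which differs from its dual because it is not symmetric in $ ( \omega, \omega ^ \prime ) $.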

In the statistical context the two most important examples of normalized yokes are the expected and the observed likelihood yoke. For a parametric statistical model with parameter space $ M $, sample space $ {\mathcal X} $ and log-likelihood function $ l : {M \times {\mathcal X} } \rightarrow \mathbf R $, the expected likelihood yoke is given by

$$ g ( \omega, \omega ^ \prime ) = {\mathsf E} _ {\omega ^ \prime } \{ l ( \omega ;x ) - l ( \omega ^ \prime ;x ) \} . $$

The observed likelihood yoke is given by

$$ g ( \omega, \omega ^ \prime ) = l ( \omega ; \omega ^ \prime ,a ) - l ( \omega ^ \prime ; \omega ^ \prime ,a ) . $$

Here, $ a $ is an auxiliary statistic such that the function $ x \rightarrow ( {\widehat \omega } ,a ) $, where $ {\widehat \omega } $ denotes the maximum-likelihood estimator of $ \omega $ (cf. also Maximum-likelihood method), is bijective. Further examples of statistical yokes are related to contrast functions, see [a5].
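
As a simple illustration of the expected likelihood yoke (the model is chosen here for convenience and is not taken from the references), consider the normal location model with unit variance and $ M = \mathbf R $, so that, up to an additive constant, $ l ( \omega ;x ) = - ( x - \omega ) ^ {2} /2 $. Since $ {\mathsf E} _ {\omega ^ \prime } ( x - \omega ) ^ {2} = 1 + ( \omega - \omega ^ \prime ) ^ {2} $, one finds

$$ g ( \omega, \omega ^ \prime ) = - \frac{1}{2} \left \{ 1 + ( \omega - \omega ^ \prime ) ^ {2} \right \} + \frac{1}{2} = - \frac{1}{2} ( \omega - \omega ^ \prime ) ^ {2} . $$

This function vanishes on the diagonal, satisfies $ /g _ {1; } ( \omega ) = 0 $ and has $ /g _ {1;1 } ( \omega ) = 1 $, the Fisher information of the model, so it is a normalized yoke. More generally, the expected likelihood yoke equals minus the Kullback–Leibler divergence $ {\mathsf E} _ {\omega ^ \prime } \{ l ( \omega ^ \prime ;x ) - l ( \omega ;x ) \} $.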

Some further notation is needed for the discussion of properties of yokes. If $ f : M \rightarrow \mathbf R $ is a smooth function, one sets

$$ f _ {/K _ {t} } = { \frac{\partial ^ {t} f ( \omega ) }{\partial \omega ^ {k _ {1} } \dots \partial \omega ^ {k _ {t} } } } . $$

Furthermore, if $ \psi = ( \psi ^ {1} \dots \psi ^ {d} ) $ is an alternative set of local coordinates, arbitrary components of which are denoted by the letters $ a,b,c,d, \dots $, and if, for $ t, \tau = 1,2, \dots $, $ C _ {t} $ and $ K _ \tau $ are two sets of indices related to the local coordinates $ \psi $ and $ \omega $, respectively, one sets

$$ \omega _ {/C _ {t} } ^ {K _ \tau } = \sum _ {C _ {t} / \tau } \omega _ {/C _ {t1 } } ^ {k _ {1} } \dots \omega _ {/C _ {t \tau } } ^ {k _ \tau } . $$

Here, the summation is over ordered partitions of $ C _ {t} = c _ {1} \dots c _ {t} $ into $ \tau $ (non-empty) subsets $ C _ {t1 } , \dots, C _ {t \tau } $ such that the order of the indices in each of the subsets is the same as the order within $ C _ {t} $ and such that for $ \mu = 1, \dots, \tau - 1 $ the first index of $ C _ {t \mu } $ comes before the first index of $ C _ {t, \mu + 1 } $ as compared with the ordering within $ C _ {t} $. For $ \tau > t $, the sum is to be interpreted as $ 0 $.
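
The index sets summed over above can be enumerated mechanically. The sketch below is an illustration only, and the function name `ordered_partitions` is hypothetical. It places the indices of $ C _ {t} $ one at a time, either into an existing block or into a new last block, which produces exactly the partitions described: indices keep their order within each block, and blocks are ordered by their first index.

```python
def ordered_partitions(indices, tau):
    """Yield the partitions of `indices` into `tau` non-empty blocks.

    Within each block the indices keep their original order, and the blocks
    are listed in the order of their first elements.
    """
    indices = tuple(indices)

    def extend(pos, blocks):
        if pos == len(indices):
            if len(blocks) == tau:
                yield tuple(tuple(block) for block in blocks)
            return
        item = indices[pos]
        # Option 1: append the next index to one of the existing blocks.
        for block in blocks:
            block.append(item)
            yield from extend(pos + 1, blocks)
            block.pop()
        # Option 2: open a new block, which becomes the last one.
        if len(blocks) < tau:
            blocks.append([item])
            yield from extend(pos + 1, blocks)
            blocks.pop()

    yield from extend(0, [])

for partition in ordered_partitions("abc", 2):
    print(partition)
```

For $ t = 3 $ and $ \tau = 2 $ this lists the three partitions $ ( c _ {1} c _ {2} \mid c _ {3} ) $, $ ( c _ {1} c _ {3} \mid c _ {2} ) $ and $ ( c _ {1} \mid c _ {2} c _ {3} ) $. The number of terms is the Stirling number of the second kind, and nothing is yielded when $ \tau > t $, matching the convention that the sum is then $ 0 $.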

Let $ g $ be an arbitrary yoke and let $ /g _ {;} = \{ {/g _ {K _ {t} ;M _ {u} } } : {t,u = 1,2, \dots } \} $. Then the most important properties of $ g $ are:

a) $ /g _ {;} $ satisfies the balance relation

$$ /g _ {K _ {t} ; } + \sum _ {K _ {t} /2 } /g _ {K _ {t1 } ;K _ {t2 } } = 0. $$

b) $ /g _ {;} $ is a double derivative string, i.e. the transformation law is

$$ /g _ {C _ {t} ;D _ {u} } = \sum _ {\tau = 1 } ^ { t } \sum _ {\nu = 1 } ^ { u } /g _ {K _ \tau ;M _ \nu } \omega _ {/C _ {t} } ^ {K _ \tau } \omega _ {/D _ {u} } ^ {M _ \nu } . $$

In particular, $ /g _ {i;j } $ is a symmetric non-singular $ ( 0,2 ) $-tensor, and consequently $ M $ equipped with this metric is a Riemannian manifold. The inverse of the matrix $ [ /g _ {i;j } ] $ will be denoted by $ [ /g ^ {i;j } ] $.

c) For $ \alpha \in \mathbf R $ the collection of arrays $ {\Gamma ^ \alpha } = \{ { {\Gamma ^ \alpha } {} _ {K _ {t} } ^ {i} } : {t = 1,2, \dots } \} $, where

$$ {\Gamma ^ \alpha } {} _ {K _ {t} } ^ {i} = \left \{ { \frac{1 + \alpha }{2} } /g _ {K _ {t} ;j } + { \frac{1 - \alpha }{2} } /g _ {j;K _ {t} } \right \} /g ^ {i;j } $$

is a connection string, i.e. $ {\Gamma ^ \alpha } $ satisfies the transformation law

$$ {\Gamma ^ \alpha } {} _ {C _ {t} } ^ {a} = \left \{ \sum _ {\tau = 1 } ^ { t } {\Gamma ^ \alpha } {} _ {K _ \tau } ^ {i} \omega _ {/C _ {t} } ^ {K _ \tau } \right \} \psi _ {/i } ^ {a} . $$

In particular, $ {\Gamma ^ \alpha } {} _ {k _ {1} k _ {2} } ^ {i} $ is the (upper) Christoffel symbol of a torsion-free affine connection $ {\nabla ^ \alpha } $, the so-called $ \alpha $-connection corresponding to the yoke $ g $.

The expected and observed $ \alpha $-geometries, see [a1] and [a2], are those corresponding to the expected and observed likelihood yokes, respectively.

d) For $ \alpha \in \mathbf R $ there exists a sequence of tensors $ {T ^ \alpha } _ {;} = \{ { {T ^ \alpha } _ {I _ \tau ;J _ \upsilon } } : {\tau, \upsilon = 1,2, \dots } \} , $ such that $ {T ^ \alpha } _ {I _ \tau ;J _ \upsilon } $ is a covariant tensor of degree $ \tau + \upsilon $. The quantities $ {T ^ \alpha } _ {;} $ are referred to as the tensorial components of $ /g _ {;} $ with respect to $ {\Gamma ^ \alpha } $ and are obtained by intertwining $ /g _ {;} $ and $ {\Gamma ^ \alpha } $, i.e. determined recursively by the equations

$$ /g _ {K _ {t} ;M _ {u} } = \sum _ {\tau = 1 } ^ { t } \sum _ {\upsilon = 1 } ^ { u } {T ^ \alpha } _ {I _ \tau ;J _ \upsilon } {\Gamma ^ \alpha } {} _ {K _ {t} } ^ {I _ \tau } {\Gamma ^ \alpha } {} _ {M _ {u} } ^ {J _ \upsilon } , $$

where

$$ {\Gamma ^ \alpha } {} _ {K _ {t} } ^ {I _ \tau } = \sum _ {K _ {t} / \tau } {\Gamma ^ \alpha } {} _ {K _ {t1 } } ^ {i _ {1} } \dots {\Gamma ^ \alpha } {} _ {K _ {t \tau } } ^ {i _ \tau } . $$
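
Properties a) and c) can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions and is not part of the cited theory: it uses the Bernoulli model with the success probability $ p $ as the single coordinate, its expected likelihood yoke, and central finite differences (the helper `diag_partial` and its step size are choices made here). In one dimension the balance relation reads $ /g _ {11; } + /g _ {1;1 } = 0 $ for $ t = 2 $ and $ /g _ {111; } + 2 /g _ {11;1 } + /g _ {1;11 } = 0 $ for $ t = 3 $.

```python
from math import comb, log

def expected_yoke(p, q):
    """Bernoulli expected likelihood yoke E_q{l(p; x) - l(q; x)},
    where l(p; x) = x log p + (1 - x) log(1 - p)."""
    return q * log(p / q) + (1 - q) * log((1 - p) / (1 - q))

def diag_partial(g, t, u, w, h=1e-3):
    """Central-difference estimate of the mixed derivative of g, of order t
    in the first argument and u in the second, evaluated at (w, w)."""
    total = 0.0
    for a in range(t + 1):
        for b in range(u + 1):
            weight = (-1) ** (a + b) * comb(t, a) * comb(u, b)
            total += weight * g(w + (t / 2 - a) * h, w + (u / 2 - b) * h)
    return total / h ** (t + u)

p = 0.4  # evaluation point, an arbitrary choice inside (0, 1)
g2_0 = diag_partial(expected_yoke, 2, 0, p)  # /g_{11;}
g1_1 = diag_partial(expected_yoke, 1, 1, p)  # /g_{1;1}, the metric
g3_0 = diag_partial(expected_yoke, 3, 0, p)  # /g_{111;}
g2_1 = diag_partial(expected_yoke, 2, 1, p)  # /g_{11;1}
g1_2 = diag_partial(expected_yoke, 1, 2, p)  # /g_{1;11}

balance_2 = g2_0 + g1_1             # balance relation for t = 2
balance_3 = g3_0 + 2 * g2_1 + g1_2  # balance relation for t = 3

def gamma(alpha):
    """Christoffel symbol of the alpha-connection in the coordinate p."""
    return ((1 + alpha) / 2 * g2_1 + (1 - alpha) / 2 * g1_2) / g1_1

print("metric:", g1_1, "Fisher information:", 1 / (p * (1 - p)))
print("balance residuals:", balance_2, balance_3)
print("Gamma for alpha = 1, 0, -1:", gamma(1), gamma(0), gamma(-1))
```

Under these assumptions the computed metric agrees with the Fisher information $ 1/ ( p ( 1 - p ) ) $ and both balance residuals are zero up to discretization error. The symbol for $ \alpha = - 1 $ is also numerically zero in this coordinate, because the first derivative of this yoke with respect to $ \omega $ is linear in $ \omega ^ \prime $, so that $ /g _ {1;11 } = 0 $.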

In terms of the local coordinates $ \omega $, an invariant Taylor expansion, around $ m \in M $ or $ \omega ^ \prime = \omega ^ \prime ( m ) $, of a smooth function $ f $ is of the form

$$ f ( \omega ) = f ( \omega ^ \prime ) + \sum _ {\tau = 1 } ^ \infty { \frac{1}{\tau ! } } {f ^ { 1 } } _ {//I _ \tau } ( \omega ^ \prime ) \gamma ^ {I _ \tau } , $$

where $ \{ { {f ^ { 1 } } _ {//I _ \tau } } : {\tau = 1,2, \dots } \} $ are the tensorial components of the derivatives $ \{ {f _ {/K _ {t} } } : {t = 1,2, \dots } \} $ with respect to the connection string $ {\Gamma ^ { 1 } } $, given recursively by

$$ f _ {/K _ {t} } = \sum _ {\tau = 1 } ^ { t } {f ^ { 1 } } _ {//I _ \tau } {\Gamma ^ { 1 } } {} _ {K _ {t} } ^ {I _ \tau } . $$

Furthermore, $ \gamma ^ {I _ \tau } = \gamma ^ {i _ {1} } \dots \gamma ^ {i _ \tau } $, where $ \gamma $ indicates the extended normal coordinates around $ m $ whose components are given by

$$ \gamma ^ {i} ( \omega ) = {\overline{g}\; } _ {;j } ( \omega, \omega ^ \prime ) /g ^ {i;j } , $$

$ {\overline{g}\; } $ being the normalized yoke corresponding to $ g $ and $ \omega ^ \prime = \omega ^ \prime ( m ) $.

The Taylor expansion is invariant in the sense that $ {f ^ { 1 } } _ {//I _ \tau } $ and $ \gamma ^ {I _ \tau } $ are tensors.
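
As a hedged illustration of the expansion to second order (the model and the calculation are supplied here and are not taken from the references), consider the Poisson model with the mean $ \mu > 0 $ as the single coordinate, for which $ l ( \mu ;x ) = x \log \mu - \mu $ up to an additive constant. Its expected likelihood yoke is already normalized:

$$ g ( \mu, \mu ^ \prime ) = \mu ^ \prime \log ( \mu / \mu ^ \prime ) - \mu + \mu ^ \prime . $$

Direct differentiation gives $ /g _ {1;1 } ( \mu ) = 1/ \mu $, $ /g _ {11;1 } ( \mu ) = - 1/ \mu ^ {2} $ and $ g _ {;1 } ( \mu, \mu ^ \prime ) = \log ( \mu / \mu ^ \prime ) $. Taking $ /g ^ {1;1 } $ in the definition of $ \gamma $ to be evaluated at $ \mu ^ \prime $ (an assumption of this illustration), one obtains

$$ {\Gamma ^ { 1 } } {} _ {1} ^ {1} = 1, \quad {\Gamma ^ { 1 } } {} _ {11 } ^ {1} = - \frac{1} \mu , \quad \gamma ( \mu ) = \mu ^ \prime \log ( \mu / \mu ^ \prime ) . $$

The recursion for the tensorial components then reads $ f _ {/1 } = {f ^ { 1 } } _ {//1 } $ and $ f _ {/11 } = {f ^ { 1 } } _ {//1 } {\Gamma ^ { 1 } } {} _ {11 } ^ {1} + {f ^ { 1 } } _ {//11 } $, so that $ {f ^ { 1 } } _ {//1 } = f ^ \prime $ and $ {f ^ { 1 } } _ {//11 } = f ^ {\prime\prime} + f ^ \prime / \mu $. Writing $ \delta = \mu - \mu ^ \prime $, one has $ \gamma = \delta - \delta ^ {2} / ( 2 \mu ^ \prime ) + O ( \delta ^ {3} ) $, and hence

$$ f ^ \prime ( \mu ^ \prime ) \gamma + \frac{1}{2} \left \{ f ^ {\prime\prime} ( \mu ^ \prime ) + \frac{f ^ \prime ( \mu ^ \prime ) }{\mu ^ \prime } \right \} \gamma ^ {2} = f ^ \prime ( \mu ^ \prime ) \delta + \frac{1}{2} f ^ {\prime\prime} ( \mu ^ \prime ) \delta ^ {2} + O ( \delta ^ {3} ) , $$

in agreement with the ordinary Taylor expansion of $ f $ around $ \mu ^ \prime $ to second order.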

References

[a1] S-I. Amari, "Differential-geometrical methods in statistics", Lecture Notes in Statistics, 28, Springer (1985)
[a2] O.E. Barndorff-Nielsen, "Likelihood and observed geometries", Ann. Stat., 14 (1986) pp. 856–873
[a3] O.E. Barndorff-Nielsen, "Differential geometry and statistics. Some mathematical aspects", Indian J. Math. (Ramanujan Centenary Volume), 29 (1987) pp. 335–350
[a4] O.E. Barndorff-Nielsen, P.E. Jupp, "Statistics, yokes and symplectic geometry", Ann. Toulouse, to appear (1997)
[a5] P. Blæsild, "Yokes and tensors derived from yokes", Ann. Inst. Statist. Math., 43 (1991) pp. 95–113