Fisher information – Wikipedia

Posted on April 19, 2022 by lordneo


Fisher information $\mathcal{I}_X(\theta)$ is a quantity that appears in statistics and information theory. It measures the amount of "information" that a random variable $X$ carries about a parameter $\theta$.

Let $\theta$ be a parameter and let $X$ be a random variable whose probability density function is $f(x|\theta)$.
The likelihood function $L(\theta|x)$ of $\theta$ is then defined by

$$L(\theta|x) = f(x|\theta),$$

and the score function is defined as the derivative of the log-likelihood:

$$V(x;\theta) = \frac{\partial}{\partial\theta} \ln L(\theta|x).$$

The Fisher information $\mathcal{I}_X(\theta)$ is defined as the second moment of the score:

$$\begin{aligned}
\mathcal{I}_X(\theta) &= \mathrm{E}[V(x;\theta)^2 \mid \theta] \\
&= \mathrm{E}\left[\left.\left(\frac{\partial}{\partial\theta} \ln L(\theta|x)\right)^2 \,\right|\, \theta\right].
\end{aligned}$$

When there is no ambiguity, the subscript $X$ is omitted and the quantity is written $\mathcal{I}(\theta)$. Because the expectation is taken over $X$, the Fisher information depends only on the probability density function $f(x|\theta)$ that $X$ follows. In particular, if $X$ and $Y$ have the same probability density function, their Fisher information is identical.

The score function is known to satisfy

$$\mathrm{E}[V(x;\theta) \mid \theta] = 0,$$

so that

$$\mathcal{I}_X(\theta) = \mathrm{var}(V(x;\theta))$$

holds, where $\mathrm{var}$ denotes the variance.
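As a small sanity check of the two identities above, the following Python sketch evaluates the mean and the second moment of the score exactly for a single Bernoulli trial with success probability θ. The Bernoulli model and the helper names (`bernoulli_pmf`, `score`, `expectation`) are illustrative assumptions, not part of the original text.

```python
def bernoulli_pmf(x, theta):
    """f(x|theta) for x in {0, 1}."""
    return theta if x == 1 else 1.0 - theta


def score(x, theta):
    """d/dtheta of ln f(x|theta) = x*ln(theta) + (1-x)*ln(1-theta)."""
    return x / theta - (1 - x) / (1.0 - theta)


def expectation(g, theta):
    """Exact expectation of g(X) over the two outcomes of one trial."""
    return sum(g(x) * bernoulli_pmf(x, theta) for x in (0, 1))


theta = 0.3
mean_score = expectation(lambda x: score(x, theta), theta)
fisher = expectation(lambda x: score(x, theta) ** 2, theta)

print("E[V]   =", mean_score)
print("E[V^2] =", fisher, "vs 1/(theta(1-theta)) =", 1.0 / (theta * (1.0 - theta)))
```

Because the mean of the score is zero (up to rounding), its second moment and its variance coincide, which is what the identity $\mathcal{I}_X(\theta) = \mathrm{var}(V)$ expresses.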

Furthermore, if $\ln f(x|\theta)$ is twice differentiable and the regularity condition

$$\int \frac{\partial^2}{\partial\theta^2} f(x;\theta)\,dx = 0$$

holds, the Fisher information can be rewritten as

$$\mathcal{I}(\theta) = -\mathrm{E}\left[\frac{\partial^2}{\partial\theta^2} \ln f(X;\theta)\right].$$

In this form, the Fisher information can be read as the "sharpness" of the support curve (the logarithm of $f$ viewed as a function of $\theta$) near the maximum likelihood estimate of $\theta$. A "blunt" support curve, one with a shallow maximum, has a second derivative of small magnitude and therefore a small Fisher information. A sharp support curve has a second derivative of large magnitude and therefore a large Fisher information.
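The agreement between the squared-score form and the second-derivative form can be checked numerically. The sketch below uses a Poisson model with rate λ, an assumed example chosen because $\ln f(x|\lambda) = x\ln\lambda - \lambda - \ln x!$ has simple derivatives. The infinite sum over outcomes is truncated at a hypothetical cutoff of 200 terms, which is far into the negligible tail for the rate used here.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability mass, computed in log space for stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


lam = 2.5
support = range(200)  # truncation of the infinite support (assumption)

# E[(d/dlam ln f)^2], with score = x/lam - 1
squared_score = sum((x / lam - 1.0) ** 2 * poisson_pmf(x, lam) for x in support)

# -E[d^2/dlam^2 ln f], with second derivative = -x/lam^2
neg_second_derivative = sum((x / lam ** 2) * poisson_pmf(x, lam) for x in support)

print(squared_score, neg_second_derivative, 1.0 / lam)
```

Both sums agree with the closed form 1/λ for the Poisson model, up to rounding and truncation error.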


Fisher information matrix

When there are N parameters, that is, when $\boldsymbol{\theta}$ is the N-dimensional vector $\theta = (\theta_1, \theta_2, \cdots, \theta_N)^T$, the Fisher information extends to the N×N matrix defined by

$$\mathcal{I}(\boldsymbol{\theta}) = \mathrm{E}\left[\frac{\partial}{\partial\boldsymbol{\theta}} \ln f(X;\theta)\, \frac{\partial}{\partial\boldsymbol{\theta}^T} \ln f(X;\theta)\right].$$

This is called the Fisher information matrix (FIM). Written in components, it reads

$$\left(\mathcal{I}(\theta)\right)_{i,j} = \mathrm{E}\left[\frac{\partial}{\partial\theta_i} \ln f(X;\theta)\, \frac{\partial}{\partial\theta_j} \ln f(X;\theta)\right].$$

The Fisher information matrix is an N×N symmetric positive semidefinite matrix. Where it is positive definite, its components define a metric on the N-dimensional parameter space, the Fisher information metric.

For a likelihood with $p$ parameters, if the element in the i-th row and j-th column of the Fisher information matrix is zero, the two parameters $\theta_i$ and $\theta_j$ are said to be orthogonal. When parameters are orthogonal, their maximum likelihood estimators are asymptotically independent and can be computed separately, which makes them easier to handle. For this reason, researchers working on a given problem commonly spend some time looking for a parametrization of the relevant probability density in which the parameters are orthogonal.
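To make the component formula concrete, here is a minimal Python sketch that builds the Fisher information matrix as the expected outer product of the score vector. The model, one draw from a three-category distribution with free parameters `p1` and `p2` and third probability `1 - p1 - p2`, is an assumed example, and the function names are hypothetical.

```python
def categorical_score(outcome, p1, p2):
    """Score vector (d/dp1, d/dp2) of ln f for one draw with outcome 0, 1 or 2."""
    p3 = 1.0 - p1 - p2
    if outcome == 0:
        return (1.0 / p1, 0.0)
    if outcome == 1:
        return (0.0, 1.0 / p2)
    # Outcome 2 has probability p3 = 1 - p1 - p2, which depends on both parameters.
    return (-1.0 / p3, -1.0 / p3)


def fisher_matrix(p1, p2):
    """Expected outer product of the score, summed exactly over the outcomes."""
    probs = (p1, p2, 1.0 - p1 - p2)
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for outcome, prob in enumerate(probs):
        s = categorical_score(outcome, p1, p2)
        for i in range(2):
            for j in range(2):
                fim[i][j] += prob * s[i] * s[j]
    return fim


fim = fisher_matrix(0.2, 0.3)
for row in fim:
    print(row)
```

In this parametrization the off-diagonal entry equals $1/p_3$, which is not zero, so `p1` and `p2` are not orthogonal in the sense described above.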

Basic properties

The Fisher information satisfies

$$0 \leq \mathcal{I}(\theta) < \infty.$$

If $X$ and $Y$ are independent random variables, then

$$\mathcal{I}_{X,Y}(\theta) = \mathcal{I}_X(\theta) + \mathcal{I}_Y(\theta)$$

(additivity of Fisher information) holds. That is, "the information that $(X,Y)$ carries about $\theta$" is the sum of "the information that $X$ carries about $\theta$" and "the information that $Y$ carries about $\theta$".

In particular, the Fisher information carried by n randomly drawn samples is n times the Fisher information carried by a single sample, provided the observations are independent.
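Additivity does not require $X$ and $Y$ to be identically distributed. The sketch below checks it exactly for an assumed pair of independent variables: $X$ is Bernoulli with success probability θ and $Y$ is Bernoulli with success probability θ² (for instance, two heads in a row). All function names are illustrative.

```python
from itertools import product


def pmf(x, p):
    """Bernoulli probability of outcome x with success probability p."""
    return p if x == 1 else 1.0 - p


def score(x, p, dp):
    """d/dtheta ln f for Bernoulli(p(theta)), where dp = dp/dtheta (chain rule)."""
    return dp * (x / p - (1 - x) / (1.0 - p))


def info_single(p, dp):
    """Fisher information about theta in one Bernoulli(p(theta)) draw."""
    return sum(pmf(x, p) * score(x, p, dp) ** 2 for x in (0, 1))


def info_joint(theta):
    """Fisher information of the independent pair (X, Y), by enumeration."""
    px, dpx = theta, 1.0
    py, dpy = theta ** 2, 2.0 * theta
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        prob = pmf(x, px) * pmf(y, py)
        # The joint log-likelihood is a sum, so the joint score is a sum too.
        total += prob * (score(x, px, dpx) + score(y, py, dpy)) ** 2
    return total


theta = 0.6
i_x = info_single(theta, 1.0)
i_y = info_single(theta ** 2, 2.0 * theta)
i_xy = info_joint(theta)
print(i_x, i_y, i_x + i_y, i_xy)
```

The cross term vanishes because the two scores are independent and each has mean zero, which is the reason the joint information splits into a sum.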

Cramér–Rao inequality

Any unbiased estimator $\hat{\theta}$ of $\theta$ satisfies the following Cramér–Rao inequality:

$$\mathrm{var}(\hat{\theta}) \geq \frac{1}{\mathcal{I}(\theta)}.$$

To see the intuitive meaning of this inequality, take the reciprocal of both sides and make the dependence on the random variable $X$ explicit:

$$\mathcal{I}_X(\theta) \geq \frac{1}{\mathrm{var}(\hat{\theta}(X))}.$$

In general, the smaller the variance of an estimator (and hence the larger the reciprocal of its variance), the more tightly its values concentrate around the parameter $\theta$, and the "better" the estimator. The inequality therefore says that when $\theta$ is estimated with $\hat{\theta}(X)$, no unbiased estimator computed from $X$ can be "better" than the "information" that $X$ itself carries.
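A worked instance may help. The following sketch assumes n independent Bernoulli trials, for which this article gives $\mathcal{I}(\theta) = n/(\theta(1-\theta))$, and compares two unbiased estimators of θ against the bound $1/\mathcal{I}(\theta)$: the sample mean $A/n$ and the first observation alone. The choice of estimators and all names are illustrative assumptions.

```python
import math


def binom_pmf(a, n, theta):
    """Probability of a successes in n independent Bernoulli(theta) trials."""
    return math.comb(n, a) * theta ** a * (1.0 - theta) ** (n - a)


def variance_of_sample_mean(n, theta):
    """Exact variance of A/n, computed by summing over the binomial distribution."""
    mean = sum(binom_pmf(a, n, theta) * a / n for a in range(n + 1))
    return sum(binom_pmf(a, n, theta) * (a / n - mean) ** 2 for a in range(n + 1))


n, theta = 10, 0.3
bound = theta * (1.0 - theta) / n      # 1 / I(theta) with I = n / (theta (1 - theta))
var_mean = variance_of_sample_mean(n, theta)
var_first = theta * (1.0 - theta)      # variance of the estimator that uses only X_1

print("Cramér–Rao bound   :", bound)
print("var(sample mean)   :", var_mean)
print("var(first sample)  :", var_first)
```

In this example the sample mean attains the bound, while the estimator that discards all but one observation has a variance n times larger.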

Relation to sufficient statistics

In general, if $T = t(X)$ is a statistic, then

$$\mathcal{I}_T(\theta) \leq \mathcal{I}_X(\theta)$$

holds. That is, "the information about $\theta$ carried by the value $T = t(X)$ computed from $X$" is no greater than "the information about $\theta$ carried by $X$ itself".

Equality holds in the above if and only if $T$ is a sufficient statistic.
This can be proved using the fact that $T(X)$ is a sufficient statistic for $\theta$ exactly when there exist functions $g$ and $h$ such that

$$f(X;\theta) = g(T(X),\theta)\,h(X)$$

holds (the Neyman factorization criterion).
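The inequality and its equality case can be illustrated with two independent Bernoulli(θ) observations, an assumed toy model. The sketch computes $\sum_t (\partial p_t/\partial\theta)^2 / p_t$ for the distribution $p_t$ of a statistic, using a central finite difference for the derivative. It compares the full data, the sum (a sufficient statistic) and the maximum (not sufficient). The function names and the step size `eps` are hypothetical choices.

```python
from collections import defaultdict
from itertools import product


def statistic_pmf(stat, theta, n=2):
    """Distribution of stat(X_1..X_n) for n independent Bernoulli(theta) trials."""
    pmf = defaultdict(float)
    for xs in product((0, 1), repeat=n):
        prob = 1.0
        for x in xs:
            prob *= theta if x == 1 else 1.0 - theta
        pmf[stat(xs)] += prob
    return pmf


def fisher_info_of_statistic(stat, theta, n=2, eps=1e-6):
    """Sum over values t of (d p_t / d theta)^2 / p_t, with a central difference."""
    lo = statistic_pmf(stat, theta - eps, n)
    hi = statistic_pmf(stat, theta + eps, n)
    mid = statistic_pmf(stat, theta, n)
    return sum(((hi[t] - lo[t]) / (2.0 * eps)) ** 2 / mid[t] for t in mid)


theta = 0.3
info_full = fisher_info_of_statistic(lambda xs: xs, theta)  # X itself
info_sum = fisher_info_of_statistic(sum, theta)             # sufficient statistic
info_max = fisher_info_of_statistic(max, theta)             # loses information

print(info_full, info_sum, info_max)
```

The sum retains all of the information in the pair, whereas the maximum, which cannot distinguish one success from two, carries strictly less.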

Relation to the Kullback–Leibler divergence

Let $X_{\vec{\theta}}$ be a random variable with parameter $\vec{\theta} = (\theta_1, \ldots, \theta_n)$. The Kullback–Leibler divergence $D_{\mathrm{KL}}$ and the Fisher information matrix are then related by

$$D_{\mathrm{KL}}(X_{\vec{\theta}+\vec{h}} \,\|\, X_{\vec{\theta}}) = \frac{{}^{t}\vec{h} \cdot \mathcal{I}(\vec{\theta}) \cdot \vec{h}}{2} + o(|\vec{h}|^2),$$

where ${}^{t}\vec{h}$ denotes the transpose of $\vec{h}$. In other words, the Fisher information matrix appears as the second-order term when the Kullback–Leibler divergence is Taylor-expanded in $\vec{h}$. The zeroth- and first-order terms are zero.
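The second-order relation can be observed numerically in one dimension. The sketch below assumes a single Bernoulli trial, whose Fisher information is $1/(\theta(1-\theta))$, and compares the exact divergence between Bernoulli(θ + h) and Bernoulli(θ) with the quadratic approximation $h^2\,\mathcal{I}(\theta)/2$ for shrinking h. The step sizes and names are illustrative.

```python
import math


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(q))."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


theta = 0.3
fisher = 1.0 / (theta * (1.0 - theta))

ratios = {}
for h in (0.1, 0.01, 0.001):
    exact = kl_bernoulli(theta + h, theta)
    quadratic = 0.5 * fisher * h * h
    ratios[h] = exact / quadratic
    print(f"h={h}: exact={exact:.3e}, quadratic={quadratic:.3e}, ratio={ratios[h]:.5f}")
```

The ratio approaches 1 as h shrinks, consistent with the remainder being $o(|h|^2)$.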

Bernoulli distribution

The Bernoulli distribution is the distribution of a random variable with two outcomes: "success", which occurs with probability θ, and "failure", which occurs otherwise (a Bernoulli trial). A coin toss in which heads comes up with probability θ and tails with probability 1 − θ is an example.

The Fisher information contained in n independent Bernoulli trials can be obtained as follows. In the formulas below, A is the number of successes, B is the number of failures, and n = A + B is the total number of trials. The second derivative of the log-likelihood is

$$\begin{aligned}
\frac{\partial^2}{\partial\theta^2} \ln f(A;\theta) &= \frac{\partial^2}{\partial\theta^2} \ln\left[\theta^A (1-\theta)^B \frac{(A+B)!}{A!\,B!}\right] \\
&= \frac{\partial^2}{\partial\theta^2} \left[A \ln(\theta) + B \ln(1-\theta)\right] \\
&= -\frac{A}{\theta^2} - \frac{B}{(1-\theta)^2},
\end{aligned}$$

so that

$$\begin{aligned}
\mathcal{I}(\theta) &= -\mathrm{E}\left[\frac{\partial^2}{\partial\theta^2} \ln(f(A;\theta))\right] \\
&= \frac{n\theta}{\theta^2} + \frac{n(1-\theta)}{(1-\theta)^2},
\end{aligned}$$

where we used the fact that the expectation of A is nθ and the expectation of B is n(1 − θ).

The final result is therefore

$$\mathcal{I}(\theta) = \frac{n}{\theta(1-\theta)}.$$

This equals the reciprocal of the variance of the mean number of successes, A/n, in n Bernoulli trials.
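The derivation above can be reproduced numerically by taking the expectation of $A/\theta^2 + B/(1-\theta)^2$ over the binomial distribution of A. This is a minimal sketch; the function names and the particular values of n and θ are arbitrary illustrative choices.

```python
import math


def neg_second_derivative(a, n, theta):
    """-(d^2/dtheta^2) ln f(A; theta) for a successes and b = n - a failures."""
    b = n - a
    return a / theta ** 2 + b / (1.0 - theta) ** 2


def fisher_information(n, theta):
    """Expectation of the negative second derivative over A ~ Binomial(n, theta)."""
    return sum(
        math.comb(n, a) * theta ** a * (1.0 - theta) ** (n - a)
        * neg_second_derivative(a, n, theta)
        for a in range(n + 1)
    )


n, theta = 20, 0.25
numeric = fisher_information(n, theta)
closed_form = n / (theta * (1.0 - theta))
reciprocal_variance = 1.0 / (theta * (1.0 - theta) / n)  # 1 / var(A/n)

print(numeric, closed_form, reciprocal_variance)
```

All three quantities coincide up to rounding, matching both the closed form and the remark about the variance of A/n.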

Gamma distribution

For the gamma distribution with shape parameter α and scale parameter β, the Fisher information matrix is given by

$$\mathcal{I}(\alpha,\beta) = \begin{pmatrix} \psi'(\alpha) & \frac{1}{\beta} \\ \frac{1}{\beta} & \frac{\alpha}{\beta^2} \end{pmatrix},$$

where ψ(α) denotes the digamma function, so that ψ′(α) is its derivative (the trigamma function).
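To evaluate this matrix without a special-function library, ψ′ can be approximated directly. The sketch below is one possible approach, not a reference implementation: it uses the recurrence $\psi'(x) = \psi'(x+1) + 1/x^2$ to shift the argument to at least 10 (an assumed threshold) and then applies the first terms of the standard asymptotic series. The names `trigamma` and `gamma_fisher_matrix` are hypothetical.

```python
import math


def trigamma(x):
    """Approximate psi'(x) for x > 0 via recurrence plus an asymptotic series."""
    acc = 0.0
    while x < 10.0:            # shift upward: psi'(x) = psi'(x + 1) + 1 / x^2
        acc += 1.0 / (x * x)
        x += 1.0
    # Asymptotic expansion: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return acc + series


def gamma_fisher_matrix(alpha, beta):
    """Fisher information matrix for shape alpha and scale beta, as given above."""
    return [
        [trigamma(alpha), 1.0 / beta],
        [1.0 / beta, alpha / beta ** 2],
    ]


print(trigamma(1.0), math.pi ** 2 / 6.0)   # known value: psi'(1) = pi^2 / 6
print(gamma_fisher_matrix(2.0, 3.0))
```

The off-diagonal entry 1/β is nonzero, so shape and scale are not orthogonal parameters in this parametrization.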

Normal distribution

For the normal distribution N(μ, σ²) with mean μ and variance σ², the Fisher information matrix is given by

$$\mathcal{I}(\mu,\sigma^2) = \begin{pmatrix} \frac{1}{\sigma^2} & 0 \\ 0 & \frac{1}{2(\sigma^2)^2} \end{pmatrix}.$$
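This matrix can be checked by integrating the outer product of the score vector against the normal density. The sketch below uses a plain midpoint rule over ±12 standard deviations. The integration range, the step count and the function names are assumptions made for illustration, and the variance σ² is treated as the second parameter, as in the formula above.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def score(x, mu, var):
    """Derivatives of ln f with respect to mu and to var = sigma^2."""
    d = x - mu
    return (d / var, -0.5 / var + d * d / (2.0 * var * var))


def fisher_matrix(mu, var, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of E[score score^T]."""
    sigma = math.sqrt(var)
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        weight = normal_pdf(x, mu, var) * dx
        s = score(x, mu, var)
        for i in range(2):
            for j in range(2):
                fim[i][j] += weight * s[i] * s[j]
    return fim


mu, var = 1.0, 4.0
fim = fisher_matrix(mu, var)
print(fim)
print("expected diagonal:", 1.0 / var, 1.0 / (2.0 * var ** 2))
```

The vanishing off-diagonal entries show that μ and σ² are orthogonal parameters of the normal distribution.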

Multivariate normal distribution

The Fisher information matrix of an N-variate multivariate normal distribution has a special form. Let the mean vector be

$$\mu(\theta) = \begin{pmatrix} \mu_1(\theta), \mu_2(\theta), \cdots, \mu_N(\theta) \end{pmatrix},$$

and let $\Sigma(\theta)$ be the covariance matrix. Then the components $\mathcal{I}_{m,n}\ (0 \leq m, n < N)$ of the Fisher information matrix of $X \sim N(\mu(\theta), \Sigma(\theta))$ are given by

$$\mathcal{I}_{m,n} = \frac{\partial\mu}{\partial\theta_m} \Sigma^{-1} \frac{\partial\mu^{\top}}{\partial\theta_n} + \frac{1}{2}\mathrm{tr}\left(\Sigma^{-1} \frac{\partial\Sigma}{\partial\theta_m} \Sigma^{-1} \frac{\partial\Sigma}{\partial\theta_n}\right).$$

Here $(\cdot)^{\top}$ denotes the transpose of a vector and $\mathrm{tr}(\cdot)$ denotes the trace of a square matrix. The derivatives are defined as follows.

$$\frac{\partial\mu}{\partial\theta_m} = \begin{pmatrix} \frac{\partial\mu_1}{\partial\theta_m}, & \frac{\partial\mu_2}{\partial\theta_m}, & \cdots, & \frac{\partial\mu_N}{\partial\theta_m} \end{pmatrix}$$

$$\frac{\partial\Sigma}{\partial\theta_m} = \begin{pmatrix}
\frac{\partial\Sigma_{1,1}}{\partial\theta_m} & \frac{\partial\Sigma_{1,2}}{\partial\theta_m} & \cdots & \frac{\partial\Sigma_{1,N}}{\partial\theta_m} \\
\frac{\partial\Sigma_{2,1}}{\partial\theta_m} & \frac{\partial\Sigma_{2,2}}{\partial\theta_m} & \cdots & \frac{\partial\Sigma_{2,N}}{\partial\theta_m} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial\Sigma_{N,1}}{\partial\theta_m} & \frac{\partial\Sigma_{N,2}}{\partial\theta_m} & \cdots & \frac{\partial\Sigma_{N,N}}{\partial\theta_m}
\end{pmatrix}.$$
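The component formula translates directly into code. The following sketch is restricted to the two-variate case so that the matrix helpers stay short; the helper names and the example model are assumptions. The example takes $X = (X_1, X_2)$ with mean $(\theta_0, \theta_0)$ and covariance $\theta_1$ times the identity, that is, two independent observations sharing one mean and one variance.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def quad(u, m, v):
    """Row vector u times matrix m times column vector v."""
    return sum(u[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


def gaussian_fim(dmu, dsigma, sigma):
    """dmu[m] and dsigma[m] are the derivatives of mu and Sigma w.r.t. theta_m."""
    sinv = inv2(sigma)
    k = len(dmu)
    fim = [[0.0] * k for _ in range(k)]
    for m in range(k):
        for n in range(k):
            mean_term = quad(dmu[m], sinv, dmu[n])
            prod = matmul(matmul(sinv, dsigma[m]), matmul(sinv, dsigma[n]))
            fim[m][n] = mean_term + 0.5 * (prod[0][0] + prod[1][1])  # 0.5 * trace
    return fim


theta1 = 4.0                                   # common variance
sigma = [[theta1, 0.0], [0.0, theta1]]
dmu = [(1.0, 1.0), (0.0, 0.0)]                 # d mu / d theta_0, d mu / d theta_1
dsigma = [
    [[0.0, 0.0], [0.0, 0.0]],                  # d Sigma / d theta_0
    [[1.0, 0.0], [0.0, 1.0]],                  # d Sigma / d theta_1
]
fim = gaussian_fim(dmu, dsigma, sigma)
print(fim)
```

For this example the result is twice the univariate matrix $\mathrm{diag}(1/\sigma^2,\ 1/(2(\sigma^2)^2))$ with $\sigma^2 = \theta_1$, as additivity over two independent observations predicts.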
