非線形最小二乗法 – Wikipedia

Posted on October 31, 2019 by lordneo

非線形最小二乗法^[1]^[2]（ひせんけいさいしょうにじょうほう、英: non-linear least squares）とは、観測データに対するカーブフィッティング手法の一つであり、最小二乗法を非線形なモデル関数に拡張したものである。非線形最小二乗法は、未知パラメータ（フィッティングパラメータ）を非線形の形で持つ関数モデルを用いて、観測データを記述すること、すなわち、データに最も当てはまりの良い^{[注 1]}フィッティングパラメータを推定することを目的とする。

Table of Contents

最小二乗法の主張[編集]

m{displaystyle m}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/0a07d98bb302f3856cbabc47b2b9016692e3f7bc" aria-hidden="true" alt="m" width="878.5" height="721.6">$ 個のデータポイント

(xi,yi),(x2,y2),…,(xm,ym){displaystyle (x_{i},y_{i}),(x_{2},y_{2}),dots ,(x_{m},y_{m})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/1924af0d84e27eef6269deec2117e53159234bde" aria-hidden="true" alt="{displaystyle (x_{i},y_{i}),(x_{2},y_{2}),dots ,(x_{m},y_{m})}" width="12575" height="1223.9">$ からなるセットに対し、

n{displaystyle n}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/a601995d55609f2d9f5e233e36fbe9ea26011b3b" aria-hidden="true" alt="n" width="600.5" height="721.6">$ 個^{[注 2]}のフィッティングパラメータ

β1,β2,…,βn{displaystyle beta _{1},beta _{2},dots ,beta _{n}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/05a3d8a9959f48bf33d478599de569796b7f53f9" aria-hidden="true" alt="{displaystyle beta _{1},beta _{2},dots ,beta _{n}}" width="5806.6" height="1080.4">$ を持つモデル関数

y=f(x,β){displaystyle y=f(x,{boldsymbol {beta }})} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/18ddc3f5ba56ee42835a751de7abcf8d39a69c1e" aria-hidden="true" alt="y=f(x,{boldsymbol beta })" width="4839.2" height="1223.9">

(1-1)

をあてはめる場合を考える。ここで、それぞれのデータ

(xm,ym){displaystyle (x_{m},y_{m})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/6ad1904bbe759ab59a4fe514a28dfbd10fff49c4" aria-hidden="true" alt="{displaystyle (x_{m},y_{m})}" width="3729.6" height="1223.9">$ において、

xi{displaystyle x_{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/e87000dd6142b81d041896a30fe58f0c3acb2158" aria-hidden="true" alt="x_{i}" width="916.8" height="865.1">$ は説明変数とし、

yi{displaystyle y_{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/67d30d30b6c2dbe4d6f150d699de040937ecc95f" aria-hidden="true" alt="y_i" width="834.8" height="865.1">$ は目的変数とする。

β=(β1,β2,…,βn){displaystyle {boldsymbol {beta }}=(beta _{1},beta _{2},dots ,beta _{n})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/7151cf3c53d0210f6ced5a6e261eaec067b42e84" aria-hidden="true" alt="{displaystyle {boldsymbol {beta }}=(beta _{1},beta _{2},dots ,beta _{n})}" width="8580.2" height="1223.9">$ は、前記の

n{displaystyle n}

βi{displaystyle beta _{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/f111c43e5cfce37bcffc6121a19b81c6efd825ed" aria-hidden="true" alt="beta _{i}" width="910.8" height="1080.4">$ からなる実数ベクトルとする。

また、以下で定まる残差

ri=yi−f(xi,β)(i=1,2,…,m){displaystyle r_{i}=y_{i}-f(x_{i},{boldsymbol {beta }})qquad (i=1,2,dots ,m)} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/0cffa77d6faa5ec9a086f62acb2ad67e8bafa9be" aria-hidden="true" alt="r_{i}=y_{i}-f(x_{i},{boldsymbol beta })qquad (i=1,2,dots ,m)" width="16552.3" height="1223.9">

(1-2)

のそれぞれは、それぞれ、期待値

0{displaystyle 0}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2aae8864a3c1fec9585261791a809ddec1489950" aria-hidden="true" alt="{displaystyle 0}" width="500.5" height="936.9">$ 、標準偏差

σi{displaystyle sigma _{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/6ab3208a7d0c634ef720e03ff5a9949e8310edc4" aria-hidden="true" alt="sigma _{i}" width="915.8" height="865.1">$ の正規分布に従うとする。また、話を簡単にするため、

xi{displaystyle x_{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/e87000dd6142b81d041896a30fe58f0c3acb2158" aria-hidden="true" alt="x_{i}" width="916.8" height="865.1">$ それぞれは、いずれも誤差を持たないとする。

このとき、考えるべき問題は、もっとも当てはまりのよい

β{displaystyle {boldsymbol {beta }}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/702cafc420cc00c54896f6d125112820956aaf6b" aria-hidden="true" alt="{displaystyle {boldsymbol {beta }}}" width="660.5" height="1080.4">$ を見つけ出すことである。

非線形最小二乗法では、以下の残差平方和（より正確に言えば、標準化された残差平方和）

S(β)=∑i=1mri22σi2=∑i=1m(yi−f(xi,β))22σi2{displaystyle S({boldsymbol {beta }})=sum _{i=1}^{m}{frac {r_{i}^{2}}{2{sigma }_{i}^{2}}}=sum _{i=1}^{m}{frac {({y}_{i}-f({x}_{i},{boldsymbol {beta }}))^{2}}{2{sigma }_{i}^{2}}}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/8504a10d53bbd8955dfd558702f815f29c1b74a1" aria-hidden="true" alt="S({boldsymbol beta })=sum _{{i=1}}^{{m}}{frac {r_{i}^{2}}{2{sigma }_{{i}}^{{2}}}}=sum _{{i=1}}^{{m}}{frac {({y}_{{i}}-f({x}_{{i}},{boldsymbol {beta }}))^{{2}}}{2{sigma }_{{i}}^{{2}}}}" width="16865.3" height="3017.9">

(1-3)

を最小とするような

β{displaystyle {boldsymbol {beta }}}

f{displaystyle f}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/132e57acb643253e7810ee9702d9581f159a1c61" aria-hidden="true" alt="f" width="550.5" height="1080.4">$ を与えるフィッティングパラメータと考える^[1]^[2]。

この考え方は、数多ある考え方の一つに過ぎない。他の考え方としては、例えば

$∑i=1n|ri|{displaystyle {sum }_{i=1}^{n}|{r}_{i}|} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/d44bef594620e8c78cc45393a5a3a9a64bc161b2" aria-hidden="true" alt="{sum }_{{i=1}}^{{n}}|{r}_{{i}}|" width="4046" height="1654.5">$ を最小にする考え方
$∑i=1m(yi−f(xi,β))2{displaystyle sum _{i=1}^{m}(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/36817522b80341580027134d37bb15aae3d09009" aria-hidden="true" alt="sum _{{i=1}}^{m}(y_{i}-f(x_{i},{boldsymbol beta }))^{2}" width="8087.1" height="2946.1">$ を最小とする考え方（単に各データのバラつきが同じと勝手に仮定しただけ）。
データ、モデル関数共に何らかの変換（例えば対数変換）を加えたうえで、最小二乗法をする考え方。
カイ二乗値を最小にする考え方^[3]。

等があり得る。これらの考え方で”最適”となったフッティングパラメータは、最小二乗法では”最適”とは限らない^{[注 3]}。

ただし、最小二乗法の考え方は、確率論的に尤もらしさが裏付けられている^[2]。このことについては、次節にて論じる。

最小二乗法の尤もらしさ[編集]

最小二乗法は、正規分布に対応したフィッティングパラメータの最尤推定法である^[4]。ここでは最小二乗法の尤もらしさについて、確率論を援用して検討する^[2]。すなわち、残差

ri{displaystyle {boldsymbol {r_{i}}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/aad5fef6e7d6a9d068bc11981336fdc84f4b50a0" aria-hidden="true" alt="{displaystyle {boldsymbol {r_{i}}}}" width="916.2" height="865.1">$ それぞれが、期待値

0{displaystyle {boldsymbol {0}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/58fe04d1d35aac19861a0d7c9f7c374c88d42db0" aria-hidden="true" alt="boldsymbol{0}" width="575.5" height="936.9">$ 、標準偏差

σi{displaystyle {boldsymbol {sigma _{i}}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/65c295b6ce7a50240444f99a9f1547ef948fbc88" aria-hidden="true" alt="{displaystyle {boldsymbol {sigma _{i}}}}" width="1073.2" height="865.1">$ の正規分布に従う確率変数であり、かつ、

ri{displaystyle r_{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/a0b6d651eaf432dbf1f106021c8bb499ae83fd1f" aria-hidden="true" alt="r_{i}" width="795.8" height="865.1">$ からなる確率変数の族は、独立試行と考え、確率論を援用する。

仮定より、残差

ri{displaystyle r_{i}}

0{displaystyle 0}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2aae8864a3c1fec9585261791a809ddec1489950" aria-hidden="true" alt="{displaystyle 0}" width="500.5" height="936.9">$ 、標準偏差

σi{displaystyle sigma _{i}}

(xi,yi){displaystyle (x_{i},y_{i})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/d6dbb919b91ccacf17ed47898048428a1baf9703" aria-hidden="true" alt="(x_{i},y_{i})" width="2975.8" height="1223.9">$ において、その測定値が

yi{displaystyle y_{i}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/67d30d30b6c2dbe4d6f150d699de040937ecc95f" aria-hidden="true" alt="y_i" width="834.8" height="865.1">$ となる確率

P(yi){displaystyle P(y_{i})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/9779f0d16e3112e9b3f1e57fed4c70f14af64264" aria-hidden="true" alt="{displaystyle P(y_{i})}" width="2365.3" height="1223.9">$ は、

P(yi)=1σ2πexp⁡(−ri22σ2){displaystyle {P}({y}_{i})={frac {1}{sigma {sqrt {2pi }}}}exp left(-{frac {{r}_{i}^{2}}{2sigma ^{2}}}right)} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/f21930911fc241bd4e22c4cebe07b2b1e07bf0b3" aria-hidden="true" alt="{P}({y}_{{i}})={frac {1}{sigma {sqrt {2pi }}}}exp left(-{frac {{r}_{{i}}^{{2}}}{2sigma ^{2}}}right)" width="12486.3" height="3233.2">

　(2-1)

となる。

今、データの測定は（数学的に言えば残差

ri{displaystyle {boldsymbol {r_{i}}}}

m{displaystyle {boldsymbol {m}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4af144129c1eea0bb04a677c237d04d0cddfba1f" aria-hidden="true" alt="{displaystyle {boldsymbol {m}}}" width="1032.5" height="721.6">$ 個のデータポイントのセット

(x1,y1),(x2,y2),…,(xm,ym){displaystyle {boldsymbol {(x_{1},y_{1}),(x_{2},y_{2}),ldots ,(x_{m},y_{m})}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/06d5d66fd398c146f032214201a82e9b85b3569e" aria-hidden="true" alt="{displaystyle {boldsymbol {(x_{1},y_{1}),(x_{2},y_{2}),ldots ,(x_{m},y_{m})}}}" width="14502.1" height="1223.9">$ が得られる確率

P(y1,…,ym){displaystyle {boldsymbol {P(y_{1},ldots ,y_{m})}}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/8d8a0b38503bdbff2a6f46328c6989817ef6c8d2" aria-hidden="true" alt="{displaystyle {boldsymbol {P(y_{1},ldots ,y_{m})}}}" width="6695" height="1223.9">$ は、

P(y1,…,ym)=∏i=1mP(yi)=∏i=1m1σ2πexp⁡(−ri22σ2)=1(σ2π)mexp⁡(∑i=1m(−(yi−f(xi,β))22σ2)){displaystyle {begin{aligned}P(y_{1},dots ,y_{m})&=prod _{i=1}^{m}P(y_{i})&=prod _{i=1}^{m}{frac {1}{sigma {sqrt {2pi }}}}exp left(-{frac {r_{i}^{2}}{2sigma ^{2}}}right)&={frac {1}{(sigma {sqrt {2pi }})^{m}}}exp left(sum _{i=1}^{m}left(-{frac {(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}}{2sigma ^{2}}}right)right)end{aligned}}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/a185387ff905198dba8bb66a3ad1b19d797803a1" aria-hidden="true" alt="{begin{aligned}P(y_{1},dots ,y_{m})&amp;=prod _{{i=1}}^{m}P(y_{i})&amp;=prod _{{i=1}}^{m}{frac {1}{sigma {sqrt {2pi }}}}exp left(-{frac {r_{i}^{2}}{2sigma ^{2}}}right)&amp;={frac {1}{(sigma {sqrt {2pi }})^{m}}}exp left(sum _{{i=1}}^{m}left(-{frac {(y_{i}-f(x_{i},{boldsymbol beta }))^{2}}{2sigma ^{2}}}right)right)end{aligned}}" width="26172.4" height="9547.9">

　(2-2)

となる。ここで、

Πi=1n{displaystyle {Pi }_{i=1}^{n}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/e4f2c9326bc1d1e0062492942b06df420186b9a6" aria-hidden="true" alt="{Pi }_{{i=1}}^{{n}}" width="1999.2" height="1223.9">$ は、連乗積を表す。

上式において、正規分布の単峰性より、確率

P(yi,…,ym){displaystyle P(y_{i},ldots ,y_{m})}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/1ee7f5e3b5547ff46f546cd063cf1f357f9690b5" aria-hidden="true" alt="{displaystyle P(y_{i},ldots ,y_{m})}" width="5806.5" height="1223.9">$ は、

S(β)=∑i=1m(yi−f(xi,β))22σ2{displaystyle S(beta )=sum _{i=1}^{m}{frac {(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}}{2sigma ^{2}}}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2927deb422c088b79e553520f059b42c5b02d9e5" aria-hidden="true" alt="S(beta )=sum _{{i=1}}^{m}{frac {(y_{i}-f(x_{i},{boldsymbol beta }))^{2}}{2sigma ^{2}}}" width="11945.9" height="3017.9">

　(2-3)

が最小（最も

0{displaystyle 0}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2aae8864a3c1fec9585261791a809ddec1489950" aria-hidden="true" alt="{displaystyle 0}" width="500.5" height="936.9">$ に近いとき）において、最大（最尤）となる。すなわち、最尤法の教えるところによれば、このとき、もっとも当てはまりがよいと考えるのが妥当だろうということになる。

勾配方程式への帰着[編集]

我々が考えるべき問題は、標準化された残差平方和

S(β)=∑i=1mri22σi2=∑i=1m(yi−f(xi,β))22σi2{displaystyle S({boldsymbol {beta }})=sum _{i=1}^{m}{frac {r_{i}^{2}}{2sigma _{i}^{2}}}=sum _{i=1}^{m}{frac {(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}}{2sigma _{i}^{2}}}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/ec95349b2be58d4aed2b6fb5cd0b3f063be82de3" aria-hidden="true" alt="S({boldsymbol beta })=sum _{{i=1}}^{m}{frac {r_{i}^{2}}{2sigma _{i}^{2}}}=sum _{{i=1}}^{m}{frac {(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}}{2sigma _{i}^{2}}}" width="16865.3" height="3017.9">

　(3-1)

を最小とするようなパラメータ

β{displaystyle {boldsymbol {beta }}}

このような

β{displaystyle {boldsymbol {beta }}}

S{displaystyle S}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4611d85173cd3b508e67077d4a1252c9c05abca2" aria-hidden="true" alt="S" width="645.5" height="936.9">$ の勾配 grad

S{displaystyle S}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4611d85173cd3b508e67077d4a1252c9c05abca2" aria-hidden="true" alt="S" width="645.5" height="936.9">$ は

0{displaystyle 0}

β{displaystyle {boldsymbol {beta }}}

∂S∂βj=2∑i=1mri∂ri∂βj=0(j=1,…,n)(1){displaystyle {frac {partial S}{partial beta _{j}}}=2sum _{i=1}^{m}r_{i}{frac {partial r_{i}}{partial beta _{j}}}=0quad (j=1,dots ,n)qquad (1)} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/9eefc6ab28d68876c03a96ed301b4d94bb46e64c" aria-hidden="true" alt="{frac {partial S}{partial beta _{j}}}=2sum _{{i=1}}^{m}r_{i}{frac {partial r_{i}}{partial beta _{j}}}=0quad (j=1,dots ,n)qquad (1)" width="20149.7" height="2946.1">

　(3-2)

数値解法[編集]

線形の最小二乗法では、式(3-2)は未知パラメータ

β{displaystyle {boldsymbol {beta }}}

$<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/702cafc420cc00c54896f6d125112820956aaf6b" aria-hidden="true" alt="{boldsymbol {beta }}" width="660.5" height="1080.4">$ についての連立一次方程式になるため、行列を用いて容易に解くことができるが、非線形最小二乗法では反復解法を用いる必要がある。解法には以下のような方法が知られている^[4]。

脚注・参考文献[編集]

参考文献[編集]

^ ^a ^b 本間仁; 春日屋伸昌『次元解析・最小二乗法と実験式』コロナ社、1989年。
^ ^a ^b ^c ^d
T. Strutz: Data Fitting and Uncertainty (A practical introduction to weighted least squares and beyond). Vieweg+Teubner, ISBN 978-3-8348-1022-9.
Ch6に、非線形最小二乗法の尤もらしさに関する記述が記載されている。
^ http://www.hulinks.co.jp/support/kaleida/curvefit.html
^ ^a ^b 中川徹; 小柳義夫『最小二乗法による実験データ解析』東京大学出版会、1982年、19, 95-124頁。ISBN 4-13-064067-4。

脚注[編集]

^ 実際には、重解が出る場合も多い。
^ 少なくとも $m>n{displaystyle m>n} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/637039c4a193f33fee72ebfeb6cb003593696160" aria-hidden="true" alt="{displaystyle m></img>n}”/></span> でなければナンセンスとなる。</span> </li> <li id=" width="2813.1" height="793.3">^ 無論、例えば一つの特別な状況として、いずれの残差の標準偏差も、全て同じ値σである時、すなわち、 ri{displaystyle r_{i}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/a0b6d651eaf432dbf1f106021c8bb499ae83fd1f" aria-hidden="true" alt="r_{i}" width="795.8" height="865.1"> それぞれが、期待値 0{displaystyle 0} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2aae8864a3c1fec9585261791a809ddec1489950" aria-hidden="true" alt="{displaystyle 0}" width="500.5" height="936.9">、標準偏差 σ{displaystyle sigma } <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/59f59b7c3e6fdb1d0365a494b81fb9a696138c36" aria-hidden="true" alt="sigma" width="572.5" height="721.6"> の正規分布に従う場合には、残差平方和 S{displaystyle S} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4611d85173cd3b508e67077d4a1252c9c05abca2" aria-hidden="true" alt="S" width="645.5" height="936.9"> から、共通項 1/(2σi2){displaystyle 1/(2{sigma _{i}}^{2})} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/557ba15d0222b69bb9e550604768d4225f4c11bf" aria-hidden="true" alt="{displaystyle 1/(2{sigma _{i}}^{2})}" width="3650.2" height="1367.4"> がくくりだせる。したがって、この場合には、最小二乗法は、 ∑i=1m(yi−f(xi,β))2{displaystyle sum _{i=1}^{m}(y_{i}-f(x_{i},{boldsymbol {beta }}))^{2}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/36817522b80341580027134d37bb15aae3d09009" aria-hidden="true" alt="sum _{{i=1}}^{m}(y_{i}-f(x_{i},{boldsymbol beta }))^{2}" width="8087.1" height="2946.1"> を最小とするような β{displaystyle {boldsymbol {beta }}} <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/702cafc420cc00c54896f6d125112820956aaf6b" aria-hidden="true" alt="{displaystyle {boldsymbol {beta }}}" width="660.5" height="1080.4">$ が、最も当てはまりが良いと考えるのと同等である。

非線形最小二乗法 – Wikipedia

最小二乗法の主張[編集]

最小二乗法の尤もらしさ[編集]

勾配方程式への帰着[編集]

数値解法[編集]

脚注・参考文献[編集]

参考文献[編集]

脚注[編集]

Recent Posts

Recent Comments

Archives

Categories

Meta