Shrinkage Fields (image restoration) – Wikipedia

Shrinkage fields is a random field-based machine learning technique that aims to perform high-quality image restoration (denoising and deblurring) with low computational overhead.

The restored image $x$ is predicted from a corrupted observation $y$ after training on a set of sample images $S$.

A shrinkage (mapping) function

$$f_{\pi_i}(v) = \sum_{j=1}^{M} \pi_{i,j} \exp\left(-\frac{\gamma}{2}\left(v - \mu_j\right)^2\right)$$

is directly modeled as a linear combination of radial basis function kernels, where $\gamma$ is the shared precision parameter, $\mu_j$ denotes the (equidistant) kernel positions, and $M$ is the number of Gaussian kernels.
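As an illustration only, the shrinkage function can be sketched in NumPy as a weighted sum of Gaussian kernels. The function name `shrinkage`, the kernel count, the weights, and the value of `gamma` below are made-up assumptions for the example, not learned parameters.

```python
import numpy as np

def shrinkage(v, weights, centers, gamma):
    """Evaluate f_pi(v) = sum_j pi_j * exp(-gamma/2 * (v - mu_j)^2) element-wise."""
    v = np.asarray(v, dtype=float)
    # Broadcast v against the M kernel centers: shape v.shape + (M,)
    diff = v[..., np.newaxis] - centers
    kernels = np.exp(-0.5 * gamma * diff ** 2)
    # Linear combination of the M kernel responses with weights pi_j
    return kernels @ weights

# Hypothetical settings: M = 5 equidistant kernel positions on [-1, 1]
centers = np.linspace(-1.0, 1.0, 5)
weights = np.array([-0.5, -0.2, 0.0, 0.2, 0.5])  # made-up, odd-symmetric weights
gamma = 8.0                                      # made-up shared precision
```

Because the function acts element-wise, the same call works on a scalar, a vector, or a whole filter-response image.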

Because the shrinkage function is directly modeled, the optimization procedure is reduced to a single quadratic minimization per iteration, denoted as the prediction of a shrinkage field

$$g_{\Theta}(\mathrm{x}) = \mathcal{F}^{-1}\left[\frac{\mathcal{F}\left(\lambda K^{T} y + \sum_{i=1}^{N} F_i^{T} f_{\pi_i}\left(F_i x\right)\right)}{\lambda \check{K}^{*} \circ \check{K} + \sum_{i=1}^{N} \check{F}_i^{*} \circ \check{F}_i}\right] = \Omega^{-1} \eta$$

where $\mathcal{F}$ denotes the discrete Fourier transform, $F\mathrm{x}$ is the 2D convolution $\mathrm{f} \otimes \mathrm{x}$ with the point spread function filter $\mathrm{f}$, $\check{F}$ is an optical transfer function defined as the discrete Fourier transform of $\mathrm{f}$, and $\check{F}^{*}$ is the complex conjugate of $\check{F}$. The symbol $\circ$ denotes element-wise multiplication, and the division is likewise element-wise.
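The prediction formula can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes circular (periodic) boundary conditions so that every convolution becomes an element-wise product in the Fourier domain, and the names `to_otf` and `predict`, as well as the idea of passing the shrinkage functions as plain callables, are hypothetical.

```python
import numpy as np

def to_otf(psf, shape):
    """Zero-pad a small filter to `shape`, centre it at the origin, take its 2D DFT."""
    padded = np.zeros(shape)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    # Shift so the filter centre sits at index (0, 0) (circular boundary assumption).
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def predict(x, y, blur_psf, filters, shrink_fns, lam):
    """One shrinkage-field prediction g(x) for observation y (sketch)."""
    K = to_otf(blur_psf, y.shape)
    numerator = lam * np.conj(K) * np.fft.fft2(y)   # F(lambda * K^T y)
    denominator = lam * np.abs(K) ** 2              # lambda * conj(K) o K
    X = np.fft.fft2(x)
    for f, shrink in zip(filters, shrink_fns):
        F = to_otf(f, y.shape)
        response = np.real(np.fft.ifft2(F * X))     # filter response F_i x
        # F(F_i^T f_pi_i(F_i x)): shrink the response, then apply the transposed filter
        numerator += np.conj(F) * np.fft.fft2(shrink(response))
        denominator += np.abs(F) ** 2               # conj(F_i) o F_i
    # Element-wise division in the Fourier domain solves the quadratic problem.
    return np.real(np.fft.ifft2(numerator / denominator))
```

For denoising, `blur_psf` can be the 1×1 filter `np.array([[1.0]])`, which makes the denominator at least `lam` everywhere and so avoids division by zero.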

The estimate $\hat{x}_t$ is obtained as $\hat{x}_t = g_{\Theta_t}\left(\hat{x}_{t-1}\right)$ for each iteration $t$, with the initial case $\hat{x}_0 = y$. This forms a cascade of Gaussian conditional random fields, also called a cascade of shrinkage fields (CSF). Loss minimization is used to learn the model parameters $\Theta_t = \left\{\lambda_t, \pi_{ti}, f_{ti}\right\}_{i=1}^{N}$.
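The cascade itself is a plain loop over stages. The sketch below assumes each stage is available as a callable that maps the previous estimate to the next one (with its own parameters and the observation already bound); `run_cascade` and the stage list are hypothetical names used only for illustration.

```python
def run_cascade(y, stages):
    """Apply x_t = g_t(x_{t-1}) for t = 1..T, starting from x_0 = y.

    Returns the list [x_0, x_1, ..., x_T] of all intermediate estimates.
    """
    x = y
    estimates = [x]
    for g in stages:
        x = g(x)            # one quadratic minimization per stage
        estimates.append(x)
    return estimates
```

Keeping the intermediate estimates is a design choice for the sketch: stage-wise training needs the output of stage $t-1$ as the input of stage $t$.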

The learning objective function is defined as

$$J\left(\Theta_t\right) = \sum_{s=1}^{S} l\left(\hat{x}_t^{(s)}; x_{gt}^{(s)}\right)$$

where $l$ is a differentiable loss function which is greedily minimized using training data $\left\{x_{gt}^{(s)}, y^{(s)}, k^{(s)}\right\}_{s=1}^{S}$ and $\hat{x}_t^{(s)}$.
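Evaluating the objective for a single stage can be sketched as below. The text only requires a differentiable loss, so the squared error used here is an assumption, as are the names `squared_error` and `objective` and the convention that a stage is a callable taking the previous estimate, the observation, and the blur kernel. The sketch only evaluates $J$; the minimization over the stage parameters is left out.

```python
import numpy as np

def squared_error(x_hat, x_gt):
    """Assumed loss l: sum of squared differences (any differentiable loss would do)."""
    return float(np.sum((np.asarray(x_hat) - np.asarray(x_gt)) ** 2))

def objective(stage, prev_estimates, samples, loss=squared_error):
    """J(Theta_t) = sum_s loss(x_hat_t^(s), x_gt^(s)) for one stage.

    `samples` holds triples (x_gt, y, k); `prev_estimates` holds the outputs
    of the previous stage, which stay fixed while this stage is trained.
    """
    total = 0.0
    for x_prev, (x_gt, y, k) in zip(prev_estimates, samples):
        x_hat = stage(x_prev, y, k)   # prediction of the current stage
        total += loss(x_hat, x_gt)
    return total
```

In a greedy scheme, this value would be minimized for stage 1 first, then for stage 2 with the stage-1 outputs as `prev_estimates`, and so on.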

Performance

Preliminary tests by the author suggest that RTF5[1] obtains slightly better denoising performance than $\text{CSF}_{7\times 7}^{\{3,4,5\}}$, followed by $\text{CSF}_{5\times 5}^{5}$, $\text{CSF}_{7\times 7}^{2}$, $\text{CSF}_{5\times 5}^{\{3,4\}}$, and BM3D.

BM3D denoising speed falls between that of $\text{CSF}_{5\times 5}^{4}$ and $\text{CSF}_{7\times 7}^{4}$, with RTF being an order of magnitude slower.

Advantages

  • Results are comparable to those obtained by BM3D (the reference in state-of-the-art denoising since its introduction in 2007)
  • Minimal runtime compared to other high-performance methods (potentially applicable within embedded devices)
  • Parallelizable (e.g., a GPU implementation is possible)
  • Predictability:
  • Fast training, even on a CPU

References

  1. Jancsary, Jeremy; Nowozin, Sebastian; Sharp, Toby; Rother, Carsten (10 April 2012). Regression Tree Fields – An Efficient, Non-parametric Approach to Image Labeling Problems. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). Providence, RI, USA: IEEE Computer Society. doi:10.1109/CVPR.2012.6247950.