Farkas’ lemma – Wikipedia

Posted on February 28, 2016 by lordneo

Farkas’ lemma is a solvability theorem for a finite system of linear inequalities in mathematics. It was originally proven by the Hungarian mathematician Gyula Farkas.^[1]
Farkas’ lemma is the key result underpinning linear programming duality and has played a central role in the development of mathematical optimization (alternatively, mathematical programming). It is used, among other things, in the proof of the Karush–Kuhn–Tucker theorem in nonlinear programming.^[2]
Remarkably, in the area of the foundations of quantum theory, the lemma also underlies the complete set of Bell inequalities in the form of necessary and sufficient conditions for the existence of a local hidden-variable theory, given data from any specific set of measurements.^[3]

Generalizations of Farkas’ lemma concern solvability theorems for convex inequalities,^[4] i.e., infinite systems of linear inequalities. Farkas’ lemma belongs to a class of statements called “theorems of the alternative”: theorems stating that exactly one of two systems has a solution.^[5]

Statement of the lemma

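There are a number of slightly different (but equivalent) formulations of the lemma in the literature. The one given here is due to Gale, Kuhn and Tucker (1951).^[6]

Let $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{b} \in \mathbb{R}^{m}$. Then exactly one of the following two assertions is true:

1. There exists an $\mathbf{x} \in \mathbb{R}^{n}$ such that $\mathbf{A}\mathbf{x} = \mathbf{b}$ and $\mathbf{x} \geq 0$.
2. There exists a $\mathbf{y} \in \mathbb{R}^{m}$ such that $\mathbf{A}^{\mathsf{T}}\mathbf{y} \geq 0$ and $\mathbf{b}^{\mathsf{T}}\mathbf{y} < 0$.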

Here, the notation $\mathbf{x} \geq 0$ means that all components of the vector $\mathbf{x}$ are nonnegative.

Example

Let m, n = 2, $\mathbf{A} = \begin{bmatrix} 6 & 4 \\ 3 & 0 \end{bmatrix}$, and $\mathbf{b} = \begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix}$. The lemma says that exactly one of the following two statements must be true (depending on b₁ and b₂):

1. There exist x₁ ≥ 0, x₂ ≥ 0 such that 6 x₁ + 4 x₂ = b₁ and 3 x₁ = b₂, or
2. There exist y₁, y₂ such that 6 y₁ + 3 y₂ ≥ 0, 4 y₁ ≥ 0, and b₁y₁ + b₂y₂ < 0.

Here is a proof of the lemma in this special case:

If b₂ ≥ 0 and b₁ − 2b₂ ≥ 0, then option 1 is true, since the solution of the linear equations is x₁ = b₂/3 and x₂ = (b₁ − 2b₂)/4, and both values are nonnegative. Option 2 is false: its constraints force y₁ ≥ 0 and 6 y₁ + 3 y₂ ≥ 0, so b₁y₁ + b₂y₂ ≥ 2 b₂y₁ + b₂y₂ = b₂ (2 y₁ + y₂) = b₂ (6 y₁ + 3 y₂) / 3 ≥ 0, which contradicts b₁y₁ + b₂y₂ < 0.
Otherwise, option 1 is false, since the unique solution of the linear equations is not nonnegative. But in this case, option 2 is true:
- If b₂ < 0, then we can take e.g. y₁ = 0 and y₂ = 1.
- If b₁ − 2b₂ < 0, then, for some number B > 0, b₁ = 2b₂ − B, so: b₁y₁ + b₂y₂ = 2 b₂y₁ + b₂y₂ − B y₁ = b₂ (6 y₁ + 3 y₂) / 3 − B y₁. Thus we can take, for example, y₁ = 1, y₂ = −2.
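
The case analysis above can be turned into a small program. The following is only a sketch: the function names `farkas_2x2` and `check` are made up for this illustration, and exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction


def farkas_2x2(b1, b2):
    """For A = [[6, 4], [3, 0]], return ("x", (x1, x2)) if option 1 holds,
    otherwise ("y", (y1, y2)) with a certificate for option 2."""
    b1, b2 = Fraction(b1), Fraction(b2)
    # Unique solution of 6*x1 + 4*x2 = b1 and 3*x1 = b2.
    x1, x2 = b2 / 3, (b1 - 2 * b2) / 4
    if x1 >= 0 and x2 >= 0:
        return "x", (x1, x2)
    # Option 1 fails; pick y as in the two sub-cases of the proof.
    if b2 < 0:
        return "y", (Fraction(0), Fraction(1))
    return "y", (Fraction(1), Fraction(-2))


def check(kind, v, b1, b2):
    """Verify that v really satisfies option 1 (kind "x") or option 2 (kind "y")."""
    if kind == "x":
        x1, x2 = v
        return x1 >= 0 and x2 >= 0 and 6 * x1 + 4 * x2 == b1 and 3 * x1 == b2
    y1, y2 = v
    return 6 * y1 + 3 * y2 >= 0 and 4 * y1 >= 0 and b1 * y1 + b2 * y2 < 0


# Sample right-hand sides chosen for illustration.
for b in [(10, 3), (1, 3), (5, -1), (0, 0)]:
    kind, v = farkas_2x2(*b)
    assert check(kind, v, *b)
```

For every right-hand side the function returns exactly one kind of witness, which mirrors the “exactly one of two statements” form of the lemma.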

Geometric interpretation

Consider the closed convex cone $C(\mathbf{A})$ spanned by the columns of $\mathbf{A}$; that is,

$$C(\mathbf{A}) = \{\mathbf{A}\mathbf{x} \mid \mathbf{x} \geq 0\}.$$

Observe that $C(\mathbf{A})$ is the set of the vectors $\mathbf{b}$ for which the first assertion in the statement of Farkas’ lemma holds. On the other hand, the vector $\mathbf{y}$ in the second assertion is orthogonal to a hyperplane that separates $\mathbf{b}$ and $C(\mathbf{A})$. The lemma follows from the observation that $\mathbf{b}$ belongs to $C(\mathbf{A})$ if and only if there is no hyperplane that separates it from $C(\mathbf{A})$.

More precisely, let $\mathbf{a}_{1}, \dots, \mathbf{a}_{n} \in \mathbb{R}^{m}$ denote the columns of $\mathbf{A}$. In terms of these vectors, Farkas’ lemma states that exactly one of the following two statements is true:

1. There exist non-negative coefficients $x_{1}, \dots, x_{n} \in \mathbb{R}$ such that $\mathbf{b} = x_{1}\mathbf{a}_{1} + \dots + x_{n}\mathbf{a}_{n}$.
2. There exists a vector $\mathbf{y} \in \mathbb{R}^{m}$ such that $\mathbf{a}_{i}^{\mathsf{T}}\mathbf{y} \geq 0$ for $i = 1, \dots, n$, and $\mathbf{b}^{\mathsf{T}}\mathbf{y} < 0$.

The sums $x_{1}\mathbf{a}_{1} + \dots + x_{n}\mathbf{a}_{n}$ with nonnegative coefficients $x_{1}, \dots, x_{n}$ form the cone spanned by the columns of $\mathbf{A}$. Therefore, the first statement says that $\mathbf{b}$ belongs to $C(\mathbf{A})$.

The second statement says that there exists a vector $\mathbf{y}$ such that the angle of $\mathbf{y}$ with the vectors $\mathbf{a}_{i}$ is at most 90°, while the angle of $\mathbf{y}$ with the vector $\mathbf{b}$ is more than 90°. The hyperplane normal to this vector has the vectors $\mathbf{a}_{i}$ on one side and the vector $\mathbf{b}$ on the other side. Hence, this hyperplane separates the cone spanned by $\mathbf{a}_{1}, \dots, \mathbf{a}_{n}$ from the vector $\mathbf{b}$.

For example, let n, m = 2, a₁ = (1, 0)^T, and a₂ = (1, 1)^T. The convex cone spanned by a₁ and a₂ can be seen as a wedge-shaped slice of the first quadrant in the xy plane. Now, suppose b = (0, 1)^T. Then b is not in the convex cone a₁x₁ + a₂x₂, because b = a₁x₁ + a₂x₂ forces x₂ = 1 and x₁ = −1 < 0. Hence, there must be a separating hyperplane. Let y = (1, −1)^T. We can see that a₁ · y = 1, a₂ · y = 0, and b · y = −1. Hence, the hyperplane with normal y indeed separates the convex cone a₁x₁ + a₂x₂ from b.
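
The dot products in this example are easy to check mechanically. The snippet below is a minimal sketch; the helper names `dot` and `separates` are introduced here for illustration only.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def separates(y, columns, b):
    """True if a_i . y >= 0 for every column a_i and b . y < 0,
    i.e. y is a witness for the second statement of the lemma."""
    return all(dot(a, y) >= 0 for a in columns) and dot(b, y) < 0


a1, a2 = (1, 0), (1, 1)
b = (0, 1)
y = (1, -1)

print([dot(a1, y), dot(a2, y), dot(b, y)])  # [1, 0, -1]
print(separates(y, [a1, a2], b))            # True
```

Note that a₂ lies on the hyperplane itself (its dot product with y is 0), which the weak inequality in the second statement allows.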

Logic interpretation

A particularly suggestive and easy-to-remember version is the following: if a set of linear inequalities has no solution, then a contradiction can be produced from it by linear combination with nonnegative coefficients. In formulas: if $Ax \leq b$ is unsolvable, then $y^{\mathsf{T}}A = 0$, $y^{\mathsf{T}}b = -1$, $y \geq 0$ has a solution.^[7] Note that $y^{\mathsf{T}}A$ is a combination of the left-hand sides and $y^{\mathsf{T}}b$ a combination of the right-hand sides of the inequalities. Since the nonnegative combination produces a zero vector on the left and a −1 on the right, the contradiction is apparent.
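
To make the contradiction concrete, here is a small worked instance. The system (two inequalities in one unknown) is an illustrative assumption chosen for this sketch and is not taken from the cited references.

```latex
\begin{align*}
&\text{System: } x \le 1, \quad -x \le -2,
  \qquad A = \begin{bmatrix} 1 \\ -1 \end{bmatrix},\;
  b = \begin{bmatrix} 1 \\ -2 \end{bmatrix}. \\
&\text{Take } y = (1, 1)^{\mathsf{T}} \ge 0: \quad
  y^{\mathsf{T}}A = 1 \cdot 1 + 1 \cdot (-1) = 0, \qquad
  y^{\mathsf{T}}b = 1 \cdot 1 + 1 \cdot (-2) = -1. \\
&\text{Adding the two inequalities with weights } y: \quad
  0 = x - x \le 1 - 2 = -1.
\end{align*}
```

The last line is false for every x, so the system has no solution, and y is the certificate of that fact.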

Thus, Farkas’ lemma can be viewed as a theorem of logical completeness: $Ax \leq b$ is a set of “axioms”, the linear combinations are the “derivation rules”, and the lemma says that, if the set of axioms is inconsistent, then it can be refuted using the derivation rules.^[8]^: 92–94

Variants

Farkas’ lemma has several variants with different sign constraints (the first one is the original version):^[8]^: 92

The latter variant is mentioned for completeness; it is not actually a “Farkas lemma” since it contains only equalities. Its proof is an exercise in linear algebra.

Generalizations

Generalized Farkas’ lemma can be interpreted geometrically as follows: either a vector is in a given closed convex cone, or there exists a hyperplane separating the vector from the cone; there are no other possibilities. The closedness condition is necessary; see Separation theorem I in Hyperplane separation theorem. For the original Farkas’ lemma, $\mathbf{S}$ is the nonnegative orthant $\mathbb{R}_{+}^{n}$, hence the closedness condition holds automatically. Indeed, for a polyhedral convex cone, i.e., when there exists a $\mathbf{B} \in \mathbb{R}^{n \times k}$ such that $\mathbf{S} = \{\mathbf{B}\mathbf{x} \mid \mathbf{x} \in \mathbb{R}_{+}^{k}\}$, the closedness condition holds automatically. In convex optimization, various kinds of constraint qualification, e.g. Slater’s condition, are responsible for closedness of the underlying convex cone $C(\mathbf{A})$.

By setting $\mathbf{S} = \mathbb{R}^{n}$ and $\mathbf{S}^{*} = \{0\}$ in generalized Farkas’ lemma, we obtain the following corollary about the solvability for a finite system of linear equalities:

Further implications

Farkas’ lemma can be varied into many further theorems of the alternative by simple modifications,^[5] such as Gordan’s theorem: Either $Ax < 0$ has a solution x, or $A^{\mathsf{T}}y = 0$ has a nonzero solution y with y ≥ 0.
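
Both sides of Gordan’s alternative can be checked directly for a candidate witness. The sketch below assumes A is given as a list of rows; the function names `solves_strict` and `is_gordan_certificate` and the two sample matrices are hypothetical, chosen only to illustrate the statement.

```python
def solves_strict(A, x):
    """True if every component of A x is strictly negative."""
    return all(sum(a * xi for a, xi in zip(row, x)) < 0 for row in A)


def is_gordan_certificate(A, y):
    """True if y >= 0, y != 0 and A^T y = 0."""
    # zip(*A) iterates over the columns of A, so each entry of
    # `combo` is one component of A^T y.
    combo = [sum(a * yi for a, yi in zip(col, y)) for col in zip(*A)]
    nonneg = all(yi >= 0 for yi in y)
    nonzero = any(yi != 0 for yi in y)
    return nonneg and nonzero and all(c == 0 for c in combo)


# x1 - x2 < 0 and -x1 + x2 < 0 cannot both hold; y = (1, 1) certifies this.
A1 = [[1, -1], [-1, 1]]
print(is_gordan_certificate(A1, (1, 1)))  # True

# For the identity matrix, x = (-1, -1) solves A x < 0 instead.
A2 = [[1, 0], [0, 1]]
print(solves_strict(A2, (-1, -1)))        # True
```

These functions only verify a given witness; finding one in general requires solving a linear program.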

Common applications of Farkas’ lemma include proving the strong duality theorem associated with linear programming and the Karush–Kuhn–Tucker conditions. An extension of Farkas’ lemma can be used to analyze the strong duality conditions for and construct the dual of a semidefinite program. It is sufficient to prove the existence of the Karush–Kuhn–Tucker conditions using the Fredholm alternative but for the condition to be necessary, one must apply von Neumann’s minimax theorem to show the equations derived by Cauchy are not violated.
