[{"@context":"http:\/\/schema.org\/","@type":"BlogPosting","@id":"https:\/\/wiki.edu.vn\/en\/wiki5\/vapnik-chervonenkis-dimension-wikipedia\/#BlogPosting","mainEntityOfPage":"https:\/\/wiki.edu.vn\/en\/wiki5\/vapnik-chervonenkis-dimension-wikipedia\/","headline":"Vapnik\u2013Chervonenkis dimension – Wikipedia","name":"Vapnik\u2013Chervonenkis dimension – Wikipedia","description":"Notion in supervised machine learning In Vapnik\u2013Chervonenkis theory, the Vapnik\u2013Chervonenkis (VC) dimension is a measure of the capacity (complexity, expressive","datePublished":"2022-03-27","dateModified":"2022-03-27","author":{"@type":"Person","@id":"https:\/\/wiki.edu.vn\/en\/wiki5\/author\/lordneo\/#Person","name":"lordneo","url":"https:\/\/wiki.edu.vn\/en\/wiki5\/author\/lordneo\/","image":{"@type":"ImageObject","@id":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","height":96,"width":96}},"publisher":{"@type":"Organization","name":"Enzyklop\u00e4die","logo":{"@type":"ImageObject","@id":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","url":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","width":600,"height":60}},"image":{"@type":"ImageObject","@id":"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/75a9edddcca2f782014371f75dca39d7e13a9c1b","url":"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/75a9edddcca2f782014371f75dca39d7e13a9c1b","height":"","width":""},"url":"https:\/\/wiki.edu.vn\/en\/wiki5\/vapnik-chervonenkis-dimension-wikipedia\/","wordCount":14781,"articleBody":"Notion in supervised machine learningIn Vapnik\u2013Chervonenkis theory, the Vapnik\u2013Chervonenkis (VC) dimension is a measure of the capacity (complexity, expressive power, richness, or flexibility) of a set of functions that can be learned by a statistical binary classification algorithm. It is defined as the cardinality of the largest set of points that the algorithm can shatter, which means the algorithm can always learn a perfect classifier for any labeling of at least one configuration of those data points. It was originally defined by Vladimir Vapnik and Alexey Chervonenkis.[1]Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the thresholding of a high-degree polynomial: if the polynomial evaluates above zero, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. But one can expect that the classifier will make errors on other points, because it is too wiggly. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. This notion of capacity is made rigorous below.Table of ContentsDefinitions[edit]VC dimension of a set-family[edit]VC dimension of a classification model[edit]Examples[edit]In statistical learning theory[edit]In computational geometry[edit]VC dimension of a finite projective plane[edit]VC dimension of a boosting classifier[edit]VC dimension of a neural network[edit]Generalizations[edit]See also[edit]References[edit]Definitions[edit]VC dimension of a set-family[edit]Let H{displaystyle H} be a set family (a set of sets) and C{displaystyle C} a set. 
VC dimension of a classification model

A binary classification model $f$ with some parameter vector $\theta$ is said to shatter a set of generally positioned data points $(x_1, x_2, \ldots, x_n)$ if, for every assignment of labels to those points, there exists a $\theta$ such that the model $f$ makes no errors when evaluating that set of data points.[citation needed]

The VC dimension of a model $f$ is the maximum number of points that can be arranged so that $f$ shatters them. More formally, it is the maximum cardinal $D$ such that some generally positioned set of $D$ data points can be shattered by $f$.

Examples

1. $f$ is a constant classifier (with no parameters); its VC dimension is 0, since it cannot shatter even a single point. In general, the VC dimension of a finite classification model, which can return at most $2^d$ different classifiers, is at most $d$: shattering $n$ points requires realizing all $2^n$ possible labelings, which is impossible once $2^n > 2^d$ (this is an upper bound on the VC dimension; the Sauer–Shelah lemma gives a lower bound on the dimension).

2. $f$ is a single-parameter threshold classifier on real numbers; i.e., for a certain threshold $\theta$, the classifier $f_\theta$ returns 1 if the input number is larger than $\theta$ and 0 otherwise. The VC dimension of $f$ is 1 because: (a) it can shatter a single point: for every point $x$, a classifier $f_\theta$ labels it as 0 if $\theta \geq x$ and as 1 if $\theta < x$; (b) it cannot shatter any pair of points $x_1 < x_2$, since no threshold realizes the labeling that assigns 1 to $x_1$ and 0 to $x_2$ ($x_1 > \theta$ forces $x_2 > \theta$). A numerical check of this example follows.
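The following sketch (again not part of the original article; `f_theta` and `can_shatter` are illustrative names) verifies example 2 by enumerating labelings: a threshold classifier realizes both labelings of one point but only three of the four labelings of two points.

```python
from itertools import product

def f_theta(x, theta):
    """Threshold classifier from example 2: returns 1 iff x exceeds theta."""
    return 1 if x > theta else 0

def can_shatter(points, thetas):
    """True if every labeling of `points` is realized by some theta in `thetas`."""
    for labels in product([0, 1], repeat=len(points)):
        if not any(all(f_theta(x, t) == y for x, y in zip(points, labels))
                   for t in thetas):
            return False  # this labeling is unrealizable by any candidate threshold
    return True

# One threshold below, between, and above the sample points suffices, since a
# threshold's behavior on a finite set depends only on where it falls among them.
thetas = [-1.0, 0.5, 2.0]
print(can_shatter([0.0], thetas))       # True: a single point is shattered
print(can_shatter([0.0, 1.0], thetas))  # False: labeling (1, 0) is impossible
```

This confirms the argument in the text: the labeling that assigns 1 to the smaller point and 0 to the larger one can never be realized, so the VC dimension of the threshold classifier is 1.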