[{"@context":"http:\/\/schema.org\/","@type":"BlogPosting","@id":"https:\/\/wiki.edu.vn\/en\/wiki24\/polya-urn-model-wikipedia\/#BlogPosting","mainEntityOfPage":"https:\/\/wiki.edu.vn\/en\/wiki24\/polya-urn-model-wikipedia\/","headline":"P\u00f3lya urn model – Wikipedia","name":"P\u00f3lya urn model – Wikipedia","description":"before-content-x4 In statistics, a P\u00f3lya urn model (also known as a P\u00f3lya urn scheme or simply as P\u00f3lya’s urn), named","datePublished":"2015-09-13","dateModified":"2015-09-13","author":{"@type":"Person","@id":"https:\/\/wiki.edu.vn\/en\/wiki24\/author\/lordneo\/#Person","name":"lordneo","url":"https:\/\/wiki.edu.vn\/en\/wiki24\/author\/lordneo\/","image":{"@type":"ImageObject","@id":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","height":96,"width":96}},"publisher":{"@type":"Organization","name":"Enzyklop\u00e4die","logo":{"@type":"ImageObject","@id":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","url":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","width":600,"height":60}},"image":{"@type":"ImageObject","@id":"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/398c16f942e75bc5908ad484918c54794cfbe477","url":"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/398c16f942e75bc5908ad484918c54794cfbe477","height":"","width":""},"url":"https:\/\/wiki.edu.vn\/en\/wiki24\/polya-urn-model-wikipedia\/","wordCount":9036,"articleBody":" (adsbygoogle = window.adsbygoogle || []).push({});before-content-x4In statistics, a P\u00f3lya urn model (also known as a P\u00f3lya urn scheme or simply as P\u00f3lya’s urn), named after George P\u00f3lya, is a type of statistical model used as an idealized mental exercise framework, unifying many treatments. (adsbygoogle = window.adsbygoogle || []).push({});after-content-x4In an urn model, objects of real interest (such as atoms, people, cars, etc.) are represented as colored balls in an urn or other container. In the basic P\u00f3lya urn model, the urn contains x white and y black balls; one ball is drawn randomly from the urn and its color observed; it is then returned in the urn, and an additional ball of the same color is added to the urn, and the selection process is repeated. Questions of interest are the evolution of the urn population and the sequence of colors of the balls drawn out.This endows the urn with a self-reinforcing property sometimes expressed as the rich get richer.Note that in some sense, the P\u00f3lya urn model is the “opposite” of the model of sampling without replacement, where every time a particular value is observed, it is less likely to be observed again, whereas in a P\u00f3lya urn model, an observed value is more likely to be observed again. In both of these models, the act of measurement has an effect on the outcome of future measurements. (For comparison, when sampling with replacement, observation of a particular value has no effect on how likely it is to observe that value again.) In a P\u00f3lya urn model, successive acts of measurement over time have less and less effect on future measurements, whereas in sampling without replacement, the opposite is true: After a certain number of measurements of a particular value, that value will never be seen again. (adsbygoogle = window.adsbygoogle || []).push({});after-content-x4One of the reasons for interest in this particular rather elaborate urn model (i.e. with duplication and then replacement of each ball drawn) is that it provides an example in which the count (initially x black and y white) of balls in the urn is not concealed, which is able to approximate the correct updating of subjective probabilities appropriate to a different case in which the original urn content is concealed while ordinary sampling with replacement is conducted (without the P\u00f3lya ball-duplication). Because of the simple “sampling with replacement” scheme in this second case, the urn content is now static, but this greater simplicity is compensated for by the assumption that the urn content is now unknown to an observer. A Bayesian analysis of the observer’s uncertainty about the urn’s initial content can be made, using a particular choice of (conjugate) prior distribution. Specifically, suppose that an observer knows that the urn contains only identical balls, each coloured either black or white, but he does not know the absolute number of balls present, nor the proportion that are of each colour. Suppose that he holds prior beliefs about these unknowns: for him the probability distribution of the urn content is well approximated by some prior distribution for the total number of balls in the urn, and a beta prior distribution with parameters (x,y) for the initial proportion of these which are black, this proportion being (for him) considered approximately independent of the total number. Then the process of outcomes of a succession of draws from the urn (with replacement but without the duplication) has approximately the same probability law as does the above P\u00f3lya scheme in which the actual urn content was not hidden from him. The approximation error here relates to the fact that an urn containing a known finite number m of balls of course cannot have an exactly beta-distributed unknown proportion of black balls, since the domain of possible values for that proportion are confined to being multiples of 1\/m{displaystyle 1\/m}, rather than having the full freedom to assume any value in the continuous unit interval, as would an exactly beta distributed proportion. This slightly informal account is provided for reason of motivation, and can be made more mathematically precise.This basic P\u00f3lya urn model has been enriched and generalized in many ways. (adsbygoogle = window.adsbygoogle || []).push({});after-content-x4Table of ContentsDistributions related to the P\u00f3lya urn[edit]Mathematical Derivation[edit]See also[edit]References[edit]Further reading[edit]Bibliography[edit]Distributions related to the P\u00f3lya urn[edit]beta-binomial distribution: The distribution of the number of successful draws (trials), e.g. number of extractions of white ball, given n{displaystyle n} draws from a P\u00f3lya urn.Beta negative binomial distribution: The distribution of number of white balls observed until a fixed number black balls are observed.Dirichlet-multinomial distribution (also known as the multivariate P\u00f3lya distribution): The distribution over the number of balls of each color, given n{displaystyle n} draws from a P\u00f3lya urn where there are k{displaystyle k} different colors instead of only two.Dirichlet negative multinomial distribution: The distribution over the number of balls of each color until a fixed number of stopping colored balls are observed.martingales, the Beta-binomial distribution and the beta distribution: Let w and b be the number of white and black balls initially in the urn, and w+nw{displaystyle w+n_{w}} the number of white balls currently in the urn after n draws. Then the sequence of values w+nww+b+n{displaystyle {frac {w+n_{w}}{w+b+n}}} for n=1,2,3,\u2026{displaystyle n=1,2,3,dots } is a normalized version of the Beta-binomial distribution. It is a martingale and converges to the beta distribution when n\u00a0\u2192\u00a0\u221e.Dirichlet process, Chinese restaurant process, Hoppe urn: Imagine a modified P\u00f3lya urn scheme as follows. We start with an urn with \u03b1{displaystyle alpha } black balls. When drawing a ball from the urn, if we draw a black ball, put the ball back along with a new ball of a new non-black color randomly generated from a uniform distribution over an infinite set of available colours, and consider the newly generated color to be the “value” of the draw. Otherwise, put the ball back along with another ball of the same color, as for the standard P\u00f3lya urn scheme. The colors of an infinite sequence of draws from this modified P\u00f3lya urn scheme follow a Chinese restaurant process. If, instead of generating a new color, we draw a random value from a given base distribution and use that value to label the ball, the labels of an infinite sequence of draws follow a Dirichlet process.[1]Moran model: An urn model used to model genetic drift in theoretical population genetics. This is closely similar to the P\u00f3lya urn model except that, in addition to adding a new ball of the same color, a randomly drawn ball is removed from the urn. The number of balls in the urn thus remains constant. Continued sampling then leads ultimately to an urn with all balls of one color, the probability of each color being the proportion of that color in the original urn. There are variants of the Moran model that insist that the ball removed from the urn be a different ball from one originally sampled in that step, and variants that do the removal of a ball immediately after the new ball is placed in the urn, so that the new ball is one of the balls available to be removed. This makes a small difference in the time taken to reach the state in which all balls are the same color. The Moran process models genetic drift in a population with overlapping generations.Mathematical Derivation[edit]Suppose we have an urn containing \u03b3{displaystyle gamma } white balls and \u03b1{displaystyle alpha } black balls. Each time we pick a random ball from the urn, record its color and return the ball to the urn with another ball of the same color. Every time we draw a new ball from the urn, we define a random variable called Xi{displaystyle X_{i}} for the color of the ball. Xi=1{displaystyle X_{i}=1} if the color of the ball is black and Xi=0{displaystyle X_{i}=0} otherwise. The more random variables prior to Xi{displaystyle X_{i}} that are one, the more likely that Xi{displaystyle X_{i}} will also be one, because more black balls have been added to the urn. Therefore, these variables are not independent of each other, but as shown below, they are interchangeable.[2]Assume that n{displaystyle n} balls are picked from the urn, and out of these n{displaystyle n} balls k{displaystyle k} balls are black and n\u2212k{displaystyle n-k} are white. The first time the number of balls in the urn is \u03b3+\u03b1{displaystyle gamma +alpha }, the second time it is \u03b3+\u03b1+1{displaystyle gamma +alpha +1} and so on. So in the i{displaystyle i}-th time, the number of balls will be \u03b3+\u03b1+i\u22121{displaystyle gamma +alpha +i-1}. Now suppose that we have seen all the black balls before the white balls, the probability of this event is simply:\u03b1\u03b3+\u03b1\u00d7\u03b1+1\u03b3+\u03b1+1\u00d7\u22ef\u00d7\u03b1+k\u22121\u03b3+\u03b1+k\u22121\u00d7\u03b3\u03b3+\u03b1+k\u00d7\u03b3+1\u03b3+\u03b1+k+1\u22ef\u03b3+n\u2212k\u22121\u03b3+\u03b1+n\u22121{displaystyle {frac {alpha }{gamma +alpha }}times {frac {alpha +1}{gamma +alpha +1}}times cdots times {frac {alpha +k-1}{gamma +alpha +k-1}}times {frac {gamma }{gamma +alpha +k}}times {frac {gamma +1}{gamma +alpha +k+1}}cdots {frac {gamma +n-k-1}{gamma +alpha +n-1}}}Now we have to prove that if the order of black and white balls is changed arbitrarily, there will be no change in the final probability. As we can see in the line above, the denominators of the fractions will not change by changing the order of observing the balls, the i{displaystyle i}th denominator will always be \u03b3+\u03b1+i\u22121{displaystyle gamma +alpha +i-1}, because this is the number of balls in the urn at that round.If we see j{displaystyle j}-th black ball in round t{displaystyle t}, the probability Xt=1{displaystyle X_{t}=1} will be equal to \u03b1+j\u22121\u03b3+\u03b1+t\u22121{displaystyle {frac {alpha +j-1}{gamma +alpha +t-1}}}, i.e. the numerator will be equal to \u03b1+j\u22121{displaystyle alpha +j-1}. With a similar argument, we can also calculate the probability for white balls. Therefore, the final probability will be equal to the following expression:p(X1=x1,X2=x2,...,Xn=xn)=\u220fi=1k(\u03b1+i\u22121)\u00d7\u220fi=1n\u2212k(\u03b3+i\u22121)\u220fi=1n(\u03b3+\u03b1+i\u22121)=(\u03b1+k\u22121)!\u00d7(\u03b3+n\u2212k\u22121)!\u00d7(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!\u00d7(\u03b3\u22121)!(\u03b1+\u03b3+n\u22121)!{displaystyle {begin{aligned}p(X_{1}=x_{1},X_{2}=x_{2},…,X_{n}=x_{n})&={frac {prod _{i=1}^{k}left(alpha +i-1right)times prod _{i=1}^{n-k}left(gamma +i-1right)}{prod _{i=1}^{n}left(gamma +alpha +i-1right)}}\\&={frac {left(alpha +k-1right)!times left(gamma +n-k-1right)!times left(alpha +gamma -1right)!}{left(alpha -1right)!times left(gamma -1right)!left(alpha +gamma +n-1right)!}}end{aligned}}}This probability is not related to the order of seeing black and white balls and only depends on the total number of white balls and the total number of black balls.[2]According to the De Finetti’s theorem, there must be a unique prior distribution such that the joint distribution of observing the sequence is a Bayesian mixture of the Bernoulli probabilities. It can be shown that this prior distribution is a beta distribution with parameters \u03b2(\u22c5;\u03b1,\u03b3){displaystyle beta left(cdot ;,alpha ,,gamma right)}. In the De Finetti’s theorem, if \u03c0(\u22c5){displaystyle pi (cdot )} with \u03b2(\u22c5;\u03b1,\u03b3){displaystyle beta left(cdot ;,alpha ,,gamma right)} we get the previous equation:[2]p(X1=x1,X2=x2,...,Xn=xn)=\u222b\u03b8(\u2211i=1nxi)\u00d7(1\u2212\u03b8)(n\u2212\u2211i=1nxi)\u03b2(\u03b8;\u03b1,\u03b3)d(\u03b8)=\u222b\u03b8(\u2211i=1nxi)\u00d7(1\u2212\u03b8)(n\u2212\u2211i=1nxi)(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!(\u03b3\u22121)!\u03b8\u03b1\u22121(1\u2212\u03b8)\u03b3\u22121d(\u03b8)=\u222b\u03b8(\u03b1\u22121+\u2211i=1nxi)\u00d7(1\u2212\u03b8)(n+\u03b3\u22121\u2212\u2211i=1nxi)(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!(\u03b3\u22121)!d(\u03b8)=\u222b\u03b8(\u03b1+k\u22121)\u00d7(1\u2212\u03b8)(n\u2212k\u22121+\u03b3)(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!(\u03b3\u22121)!d(\u03b8)=(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!(\u03b3\u22121)!\u222b\u03b8(\u03b1+k\u22121)\u00d7(1\u2212\u03b8)(n\u2212k+\u03b3\u22121)d(\u03b8)=(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!(\u03b3\u22121)!\u0393(\u03b3+n\u2212k)\u0393(\u03b1+k)\u0393(\u03b1+\u03b3+n)=(\u03b1+k\u22121)!\u00d7(\u03b3+n\u2212k\u22121)!\u00d7(\u03b1+\u03b3\u22121)!(\u03b1\u22121)!\u00d7(\u03b3\u22121)!(\u03b1+\u03b3+n\u22121)!{displaystyle {begin{aligned}p(X_{1}=x_{1},X_{2}=x_{2},…,X_{n}=x_{n})&=int theta ^{left({sum _{i=1}^{n}x_{i}}right)}times left(1-theta right)^{left(n-{sum _{i=1}^{n}x_{i}}right)},beta left(theta ;alpha ,,gamma right)dleft(theta right)\\&=int theta ^{left({sum _{i=1}^{n}x_{i}}right)}times left(1-theta right)^{left(n-{sum _{i=1}^{n}x_{i}}right)},{dfrac {(alpha +gamma -1)!}{(alpha -1)!,(gamma -1)!}}theta ^{alpha -1}(1-theta )^{gamma -1}dleft(theta right)\\&=int theta ^{left({alpha -1+sum _{i=1}^{n}x_{i}}right)}times left(1-theta right)^{left(n+gamma -1-{sum _{i=1}^{n}x_{i}}right)},{dfrac {(alpha +gamma -1)!}{(alpha -1)!,(gamma -1)!}}dleft(theta right)\\&=int theta ^{left({alpha +k-1}right)}times left(1-theta right)^{left(n-k-1+gamma right)},{dfrac {(alpha +gamma -1)!}{(alpha -1)!,(gamma -1)!}}dleft(theta right)\\&={dfrac {(alpha +gamma -1)!}{(alpha -1)!,(gamma -1)!}}int theta ^{left({alpha +k-1}right)}times left(1-theta right)^{left(n-k+gamma -1right)},dleft(theta right)\\&={dfrac {(alpha +gamma -1)!}{(alpha -1)!,(gamma -1)!}}{dfrac {Gamma (gamma +n-k)Gamma (alpha +k)}{Gamma (alpha +gamma +n)}}\\&={dfrac {left(alpha +k-1right)!times left(gamma +n-k-1right)!times left(alpha +gamma -1right)!}{left(alpha -1right)!times left(gamma -1right)!left(alpha +gamma +n-1right)!}}end{aligned}}}In this equation k=\u2211i=1nxi{displaystyle k=sum _{i=1}^{n}x_{i}}.See also[edit]References[edit]Further reading[edit]F. Alajaji and T. Fuja, “A Communication Channel Modeled on Contagion,” IEEE Transactions on Information Theory, Vol. 40, pp.\u00a02035\u20132041, November 1994.A. Banerjee, P. Burlina and F. Alajaji, “Image Segmentation and Labeling Using the P\u00f3lya Urn Model,” IEEE Transactions on Image Processing, Vol. 8, No. 9, pp.\u00a01243\u20131253, September 1999.Bibliography[edit]N.L. Johnson and S.Kotz, (1977) “Urn Models and Their Application.” John Wiley.Hosam Mahmoud, (2008) “P\u00f3lya Urn Models.” Chapman and Hall\/CRC. ISBN\u00a0978-1420059830. (adsbygoogle = window.adsbygoogle || []).push({});after-content-x4"},{"@context":"http:\/\/schema.org\/","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"https:\/\/wiki.edu.vn\/en\/wiki24\/#breadcrumbitem","name":"Enzyklop\u00e4die"}},{"@type":"ListItem","position":2,"item":{"@id":"https:\/\/wiki.edu.vn\/en\/wiki24\/polya-urn-model-wikipedia\/#breadcrumbitem","name":"P\u00f3lya urn model – Wikipedia"}}]}]