Kaplan-Meier estimator
The Kaplan-Meier estimator [1], [2], also known as the product-limit estimator, is an estimator of the survival function from lifetime data. In medical research, it is often used to measure the fraction of patients still alive a certain length of time after treatment. It is also used in economics and ecology.
The estimator is named after Edward L. Kaplan and Paul Meier.
A plot of the Kaplan-Meier estimate of the survival function is a series of horizontal steps of decreasing height which, when a sufficiently large sample is used, approaches the true survival function of the population. The value of the survival function between successive distinct observations is taken to be constant.
An example of a Kaplan-Meier plot for two variables associated with patient survival.
An important advantage of the Kaplan-Meier curve is that the method can take into account certain types of censored data, in particular right-censoring, which occurs when a patient drops out of a study, that is, when the patient's data are no longer available before the expected event (for example, death) is observed. On the plot, small vertical tick marks indicate these censored observations. When no truncation or censoring occurs, the Kaplan-Meier curve coincides with the empirical survival function.
Let S(t) be the probability that a member of a given population has a lifetime exceeding t. For a sample of size N from this population, let the observed times until death of the N sample members be:
$$t_1 \le t_2 \le t_3 \le \cdots \le t_N.$$
To each t_i there corresponds an n_i, the number of individuals "at risk" just before time t_i, and a d_i, the number of deaths at time t_i.
Note that the intervals between events are not uniform. For example, a small data set might begin with 10 cases. Suppose that subject 1 dies on day 3, subjects 2 and 3 die on day 11, and subject 4 is lost to follow-up (a censored observation) on day 9. The data up to day 11 would be as follows:
i | 1 | 2
---|---|---
t_i | 3 | 11
d_i | 1 | 2
n_i | 10 | 8
The Kaplan-Meier estimator is the nonparametric maximum likelihood estimate of S(t). It is a product of the form:
$$\hat{S}(t) = \prod_{t_i < t} \frac{n_i - d_i}{n_i}.$$
When there is no censoring, n_i is the number of survivors just before time t_i.
When there is censoring, n_i is the number of survivors minus the number of losses (censored cases). Only the surviving cases that are still under observation (not yet censored) are "at risk" of an (observed) death.
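As a rough illustration, the following Python sketch builds the rows (t_i, d_i, n_i) from raw follow-up data and multiplies the factors (n_i - d_i) / n_i. The function names `risk_table` and `kaplan_meier` are illustrative and do not come from any library. The example in the text describes only four of the ten subjects; the sketch assumes, purely for illustration, that the other six are censored on day 12.

```python
from collections import Counter


def risk_table(durations, observed):
    """Build rows (t_i, d_i, n_i) for each distinct time with a death.

    durations: follow-up time of each subject.
    observed:  True if the subject died at that time, False if censored.
    """
    deaths = Counter(t for t, died in zip(durations, observed) if died)
    rows = []
    for t in sorted(deaths):
        # At risk just before t: subjects whose follow-up lasts at least to t
        # (earlier deaths and earlier censored cases are excluded).
        n = sum(1 for u in durations if u >= t)
        rows.append((t, deaths[t], n))
    return rows


def kaplan_meier(rows):
    """Return [(t_i, estimate just after t_i)] as a running product."""
    s = 1.0
    curve = []
    for t, d, n in rows:
        s *= (n - d) / n
        curve.append((t, s))
    return curve


# Subjects 1-4 follow the example in the text; the six subjects censored
# on day 12 are a hypothetical completion of the 10-case data set.
durations = [3, 11, 11, 9] + [12] * 6
observed = [True, True, True, False] + [False] * 6

rows = risk_table(durations, observed)
curve = kaplan_meier(rows)
print(rows)   # rows (t_i, d_i, n_i)
print(curve)  # estimate just after each death time
```

Under that assumption the rows are (3, 1, 10) and (11, 2, 8), as in the example, and the estimate falls to 9/10 = 0.9 after day 3 and to 0.9 × 6/8 = 0.675 after day 11. The subject censored on day 9 contributes no factor of its own; it only reduces the number at risk on day 11 from 9 to 8.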
Here is another definition that is sometimes used:
$$\hat{S}(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}.$$
The two definitions differ only at the observed event times. The latter definition is right-continuous, whereas the former is left-continuous.
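The difference between the two conventions shows up only when the estimate is evaluated exactly at an event time. A minimal sketch, using the two event times of the example (days 3 and 11) and a hypothetical helper named `survival_at`:

```python
def survival_at(rows, t, right_continuous=True):
    """Evaluate the product estimate at time t from rows (t_i, d_i, n_i).

    right_continuous=True  multiplies over rows with t_i <= t;
    right_continuous=False multiplies over rows with t_i <  t
    (the left-continuous convention).
    """
    s = 1.0
    for t_i, d_i, n_i in rows:
        included = t_i <= t if right_continuous else t_i < t
        if included:
            s *= (n_i - d_i) / n_i
    return s


rows = [(3, 1, 10), (11, 2, 8)]  # (t_i, d_i, n_i) from the example

for t in (3, 5):
    print(t, survival_at(rows, t, True), survival_at(rows, t, False))
```

At t = 3 the right-continuous version already includes the death on day 3 (9/10), while the left-continuous version still equals 1. At t = 5, between event times, the two agree.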
Let T be the random variable measuring the time of failure and let F(t) be its cumulative distribution function. Note that:
$$S(t) = P[T > t] = 1 - P[T \le t] = 1 - F(t).$$
Consequently, the right-continuous definition of the estimator may be preferred, so that the estimate is compatible with a right-continuous estimate of F(t).
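The Kaplan-Meier estimator is a statistic, and several estimators are used to approximate its variance. One of the most common is Greenwood's formula, written here for the right-continuous definition:
$$\widehat{\operatorname{Var}}\left(\hat{S}(t)\right) = \hat{S}(t)^2 \sum_{t_i \le t} \frac{d_i}{n_i (n_i - d_i)}.$$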
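Greenwood's formula multiplies the squared estimate by the sum of d_i / (n_i (n_i - d_i)) over the event times up to t. The sketch below applies it to the two event times of the example; the helper name `greenwood_variance` is illustrative, and the choice of the right-continuous convention (t_i <= t) is an assumption of this sketch.

```python
import math


def greenwood_variance(rows, t):
    """Greenwood variance estimate at time t from rows (t_i, d_i, n_i).

    Uses rows with t_i <= t and assumes n_i > d_i for every included row,
    since the corresponding term is undefined otherwise.
    """
    s = 1.0
    total = 0.0
    for t_i, d_i, n_i in rows:
        if t_i > t:
            continue
        if n_i == d_i:
            raise ValueError("Greenwood term undefined when n_i == d_i")
        s *= (n_i - d_i) / n_i
        total += d_i / (n_i * (n_i - d_i))
    return s * s * total


rows = [(3, 1, 10), (11, 2, 8)]  # (t_i, d_i, n_i) from the example
var = greenwood_variance(rows, 11)
print(var, math.sqrt(var))  # variance and standard error at day 11
```

For these rows the sum is 1/90 + 2/48 and the estimate at day 11 is 0.675, which gives a variance of about 0.024 and a standard error of about 0.155.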
In 1983, Edward L. Kaplan recounted the genesis of the Kaplan-Meier estimator [3].
It all began in 1952, Kaplan reveals, when Paul Meier (then a postdoctoral fellow at Johns Hopkins University, in Maryland), having read Greenwood's 1926 article on the duration of cancer, set out to propose a powerful survival estimator based on clinical trial results. In 1953, the mathematician Kaplan (then working at Bell Laboratories, New Jersey) wanted to propose an estimator of the lifetime of the vacuum tubes used to amplify and relay signals in the submarine telephone cable system. Kaplan submitted his draft article to Professor John W. Tukey, who also worked at Bell Laboratories and who had just been Meier's thesis advisor [4] at Princeton, New Jersey. Each of the two young researchers had submitted his manuscript to the Journal of the American Statistical Association, which recommended that they get in touch with each other and merge the two articles. Kaplan and Meier then set about reconciling their points of view by postal correspondence. During the four years that this phase lasted, their only fear was that a third party would publish an article proposing an equivalent solution before they did.
The article Nonparametric estimation from incomplete observations was finally published in 1958 (Journal of the American Statistical Association, vol. 53, p. 457–481) [1].
Implementation in programming languages
Several programming languages and statistical software packages offer implementations of the Kaplan-Meier estimator, notably:
- SAS: the LIFETEST procedure;
- R: the survival package;
- Stata: the sts command;
- Python: the lifelines and scikit-survival packages.
- Kaplan, E. L.; Meier, P.: Nonparametric estimation from incomplete observations. J. Amer. Statist. Assn. 53: 457–481, 1958.
- Kaplan, E. L., in a retrospective on the seminal paper in "This week's citation classic". Current Contents 24, 14 (1983). Available from UPenn as PDF.
- On April 15, 1983, Edward L. Kaplan (then in the Department of Mathematics at Oregon State University) recounted the genesis of the 1958 article presenting the Kaplan-Meier estimator. Retrospective note published in the "This week's citation classic" section of Current Contents, no. 24, June 13, 1983. Note made available by the University of Pennsylvania [read online (page consulted on August 15, 2011)].
- "Appendix C: Ph.D. Students", p. 1569, in: David R. Brillinger, "John W. Tukey: His life and professional contributions", Annals of Statistics, Department of Statistics, University of California, vol. 30, no. 6, p. 1535-1575 (read online).
- The LIFETEST Procedure.
- "survival: Survival Analysis", R Project.
- Frans Willekens, Multistate Analysis of Life Histories with R, Cham, Springer, 323 p. (ISBN 978-3-319-08383-4, DOI 10.1007/978-3-319-08383-4_6, read online), "The Survival Package".
- Ding-Geng Chen and Karl E. Peace, Clinical Trial Data Analysis Using R, CRC Press, p. 99–108 (read online).
- "sts - Generate, graph, list, and test the survivor and cumulative hazard functions", Stata Manual.
- Mario Cleves, An Introduction to Survival Analysis Using Stata, College Station, Stata Press, second ed., 372 p. (ISBN 978-1-59718-041-2 and 1-59718-041-6, read online).
- "lifelines".
- "sksurv.nonparametric.kaplan_meier_estimator - scikit-survival 0.12.1.dev4+gba84551.d20200501 documentation", on scikit-survival.readthedocs.io.