[{"@context":"http:\/\/schema.org\/","@type":"BlogPosting","@id":"https:\/\/wiki.edu.vn\/en\/wiki21\/isolation-forest-wikipedia\/#BlogPosting","mainEntityOfPage":"https:\/\/wiki.edu.vn\/en\/wiki21\/isolation-forest-wikipedia\/","headline":"Isolation forest – Wikipedia","name":"Isolation forest – Wikipedia","description":"Algorithm for anomaly detection. Isolation Forest is an algorithm for data anomaly detection initially developed by Fei Tony","datePublished":"2019-04-21","dateModified":"2019-04-21","author":{"@type":"Person","@id":"https:\/\/wiki.edu.vn\/en\/wiki21\/author\/lordneo\/#Person","name":"lordneo","url":"https:\/\/wiki.edu.vn\/en\/wiki21\/author\/lordneo\/","image":{"@type":"ImageObject","@id":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c9645c498c9701c88b89b8537773dd7c?s=96&d=mm&r=g","height":96,"width":96}},"publisher":{"@type":"Organization","name":"Enzyklop\u00e4die","logo":{"@type":"ImageObject","@id":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","url":"https:\/\/wiki.edu.vn\/wiki4\/wp-content\/uploads\/2023\/08\/download.jpg","width":600,"height":60}},"image":{"@type":"ImageObject","@id":"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/8\/84\/Normalized_Anomaly_Scores_of_Isolation_Forest.png\/220px-Normalized_Anomaly_Scores_of_Isolation_Forest.png","url":"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/8\/84\/Normalized_Anomaly_Scores_of_Isolation_Forest.png\/220px-Normalized_Anomaly_Scores_of_Isolation_Forest.png","height":"220","width":"220"},"url":"https:\/\/wiki.edu.vn\/en\/wiki21\/isolation-forest-wikipedia\/","wordCount":8334,"articleBody":"Algorithm for anomaly detection. Isolation Forest is an algorithm for data anomaly detection 
initially developed by Fei Tony Liu and Zhi-Hua Zhou in 2008.[1] Isolation Forest detects anomalies using binary trees. The algorithm has linear time complexity and a low memory requirement, which makes it well suited to high-volume data.[2][3] Figure: the normalized anomaly scores of the Isolation Forest algorithm fit on the Old Faithful dataset. Isolation Forest splits the data space using axis-parallel lines (each orthogonal to one coordinate axis) and assigns higher anomaly scores to data points that need fewer splits to be isolated. The figure shows an application of the Isolation Forest algorithm to the waiting time between eruptions and the duration of the eruption of the Old Faithful geyser in Yellowstone National Park. Darker shades of red indicate higher estimated anomaly scores. History: The Isolation Forest (iForest) algorithm was initially proposed by Fei Tony Liu, Kai Ming Ting and Zhi-Hua Zhou in 2008.[2] In 2010, an extension of the algorithm, SCiforest,[4] was developed to address clustered and axis-parallel anomalies. 
In 2012[3] the same authors demonstrated that iForest has linear time complexity, a small memory requirement, and is applicable to high-dimensional data. In 2013 Zhiguo Ding and Minrui Fei proposed a framework based on iForest for detecting anomalies in streaming data.[5] More applications of iForest to streaming data are described in papers by Tan et al.,[6] Susto et al.[7] and Weng et al.[8] In 2018,[9] an extension of iForest was proposed that aims to improve the reliability of the anomaly score produced for a given data point. In 2022,[10] the algorithm was applied to microscopy data to push the detection limit of single unlabeled proteins. Algorithm: Fig. 2 – an example of isolating a non-anomalous point in a 2D Gaussian distribution. The premise of the Isolation Forest algorithm is that anomalous data points are easier to separate from the rest of the sample. To isolate a data point, the algorithm recursively generates partitions on the sample by randomly selecting an attribute and then randomly selecting a split value between the minimum and maximum values allowed for that attribute. Fig. 3 – an example of isolating an anomalous point in a 2D Gaussian distribution. An example of random partitioning in a 2D dataset of normally distributed points is given in Fig. 2 for a non-anomalous point and in Fig. 3 for a point that is more likely to be an anomaly. The pictures show that anomalies require fewer random partitions to be isolated than normal points. Recursive partitioning can be represented by a tree structure named Isolation Tree, while the number of partitions required to isolate a point can be interpreted as the length of the path, within the tree, to reach a terminating node starting from the root. For example, the path length of point x_i in Fig. 2 is greater than the path length of x_j in Fig. 
3. Let X = {x_1, \u2026, x_n} be a set of d-dimensional points and X\u2032 \u2282 X. An Isolation Tree (iTree) is defined as a data structure with the following properties: for each node T in the tree, T is either an external-node with no child, or an internal-node with one \u201ctest\u201d and exactly two child nodes (T_l and T_r); a test at node T consists of an attribute q and a split value p such that the test q < p determines the traversal of a data point to either T_l or T_r. [\u2026] c(m) = 2H(m\u22121) \u2212 2(m\u22121)/n for m > 2; c(m) = 1 for m = 2; c(m) = 0 otherwise."},{"@context":"http:\/\/schema.org\/","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"https:\/\/wiki.edu.vn\/en\/wiki21\/#breadcrumbitem","name":"Enzyklop\u00e4die"}},{"@type":"ListItem","position":2,"item":{"@id":"https:\/\/wiki.edu.vn\/en\/wiki21\/isolation-forest-wikipedia\/#breadcrumbitem","name":"Isolation forest – Wikipedia"}}]}]