Order-maintenance problem


In computer science, the order-maintenance problem involves maintaining a totally ordered set supporting the following operations:

  • insert(X, Y), which inserts X immediately after Y in the total order;
  • order(X, Y), which determines if X precedes Y in the total order; and
  • delete(X), which removes X from the set.
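For concreteness, the following is a minimal sketch of this interface backed by a plain Python list, in which every operation takes linear time. It illustrates only the semantics of the operations, not the efficient data structures discussed below; the class and element names are illustrative.

```python
class NaiveOrderMaintenance:
    """Illustrative baseline: every operation scans a Python list (O(n) time)."""

    def __init__(self, first):
        self._items = [first]                 # the total order, left to right

    def insert(self, x, y):
        """Insert x immediately after y in the total order."""
        self._items.insert(self._items.index(y) + 1, x)

    def order(self, x, y):
        """Return True if x precedes y in the total order."""
        return self._items.index(x) < self._items.index(y)

    def delete(self, x):
        """Remove x from the set."""
        self._items.remove(x)


s = NaiveOrderMaintenance("a")
s.insert("b", "a")            # order is now a, b
s.insert("c", "a")            # order is now a, c, b
print(s.order("c", "b"))      # True: c precedes b
s.delete("c")
print(s.order("a", "b"))      # True
```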

Paul Dietz first introduced a data structure to solve this problem in 1982.[1] This data structure supports insert(X, Y) in $O(\log n)$ (in Big O notation) amortized time and order(X, Y) in constant time, but does not support deletion. Athanasios Tsakalidis used BB[α] trees to achieve the same bounds while also supporting deletion in $O(\log n)$ time, and improved insertion and deletion to $O(1)$ amortized time with indirection.[2] Dietz and Daniel Sleator published an improvement to worst-case constant time in 1987.[3] Michael Bender, Richard Cole and Jack Zito published significantly simplified alternatives in 2004.[4] Bender, Fineman, Gilbert, Kopelowitz and Montes also published a deamortized solution in 2017.[5]

Efficient data structures for order-maintenance have applications in
many areas, including data structure persistence,[6] graph algorithms[7][8] and fault-tolerant data structures.[9]

List labeling

A problem related to the order-maintenance problem is the list-labeling problem, in which, instead of the order(X, Y) operation, the solution must maintain an assignment of labels from a universe of integers $\{1, 2, \ldots, m\}$ to the elements of the set such that X precedes Y in the total order if and only if X is assigned a lesser label than Y. It must also support an operation label(X) returning the label of any node X. Note that order(X, Y) can be implemented simply by comparing label(X) and label(Y), so that any solution to the list-labeling problem immediately gives one to the order-maintenance problem. In fact, most solutions to the order-maintenance problem are solutions to the list-labeling problem augmented with a level of data structure indirection to improve performance. We will see an example of this below.
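As a sketch of this reduction, suppose some list-labeling scheme has already produced the (hypothetical) label assignment below; order(X, Y) then reduces to a single integer comparison.

```python
# Hypothetical labels assigned by some list-labeling scheme over {1, ..., m};
# the assignment itself is not computed here.
labels = {"u": 10, "w": 25, "v": 40}    # u precedes w precedes v

def label(x):
    return labels[x]

def order(x, y):
    # X precedes Y in the total order iff X carries the smaller label.
    return label(x) < label(y)

print(order("u", "w"))   # True:  10 < 25
print(order("v", "w"))   # False: 40 > 25
```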

For a list-labeling problem on sets of size up to $n$, the cost of list labeling depends on how large $m$ is as a function of $n$. The relevant parameter ranges for order maintenance are $m = n^{1+\Theta(1)}$, for which an $O(\log n)$ amortized cost solution is known,[10] and $m = 2^{\Omega(n)}$, for which a constant amortized time solution is known.[11]

O(1) amortized insertion via indirection

Indirection is a technique used in data structures in which a problem is split into multiple levels of a data structure in order to improve efficiency. Typically, a problem of size $n$ is split into $n/\log n$ problems of size $\log n$. For example, this technique is used in y-fast tries. This strategy also works to improve the insertion and deletion performance of the data structure described above to constant amortized time. In fact, this strategy works for any solution of the list-labeling problem with $O(\log n)$ amortized insertion and deletion time.

Depiction of indirection in a tree-based solution to the order-maintenance problem: the elements of the total order are stored in contiguous sublists, each of which has a representative in the outer list-labeling structure.

The new data structure is completely rebuilt whenever it grows too large or too small. Let $N$ be the number of elements of the total order when it was last rebuilt. The data structure is rebuilt whenever the invariant $\tfrac{N}{3} \leq n \leq 2N$ is violated by an insertion or deletion. Since rebuilding can be done in linear time, this does not affect the amortized performance of insertions and deletions.
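As a minimal sketch of the rebuild trigger, assuming n is the current number of elements and N the number of elements at the last rebuild:

```python
def needs_rebuild(n, N):
    # Rebuild when the current size n drifts outside [N/3, 2N],
    # where N is the size recorded at the last rebuild.
    return not (N / 3 <= n <= 2 * N)

print(needs_rebuild(10, 12))   # False: 4 <= 10 <= 24
print(needs_rebuild(3, 12))    # True: shrunk below N/3
print(needs_rebuild(25, 12))   # True: grew past 2N
```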

During the rebuilding operation, the $N$ elements of the total order are split into $O(N/\log N)$ contiguous sublists, each of size $\Omega(\log N)$. The list-labeling problem is solved on the set of nodes representing the sublists, in their original list order. The labels for this subproblem are taken from a polynomial range, say $m = N^{2}$, so that they can be compared in constant time and updated in amortized $O(\log N)$ time.
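The following sketch of the rebuilding step assumes the total order is available as a plain Python list. It cuts the order into contiguous sublists of about $\log N$ elements each and spreads the top-level labels evenly over a range of size $N^{2}$; it illustrates only the splitting and labeling described above, not the full data structure.

```python
import math

def rebuild(items):
    """Split the order into ~log N sized sublists and label them from {1, ..., N^2}."""
    N = len(items)
    size = max(1, int(math.log2(N)))                            # ~log N elements per sublist
    sublists = [items[i:i + size] for i in range(0, N, size)]   # O(N / log N) sublists
    m = N * N                                                   # polynomial label universe
    gap = m // (len(sublists) + 1)
    labels = [(k + 1) * gap for k in range(len(sublists))]      # evenly spaced labels
    return sublists, labels

sublists, labels = rebuild(list(range(16)))
print(len(sublists), labels)   # 4 sublists of 4 items; labels 51, 102, 153, 204
```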

For each sublist, a doubly linked list of its elements is built, storing with each element a pointer to its representative in the tree as well as a local integer label. The local integer labels are also taken from a range of size $m = N^{2}$, so that they can be compared in constant time; but because each local problem involves only $\Theta(\log N)$ items, the label range $m$ is exponential in the number of items being labeled. Thus, the labels can be updated in $O(1)$ amortized time.
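A sketch of the per-element record just described, with illustrative field names (the concrete layout is not prescribed by the sources):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sublist:
    top_label: int                        # label from the outer list-labeling structure

@dataclass
class Element:
    value: object
    local_label: int                      # local label from a range of size N^2
    sublist: Sublist                      # pointer to the representative of its sublist
    prev: "Optional[Element]" = None      # doubly linked list links within the sublist
    next: "Optional[Element]" = None
```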

See the list-labeling problem for details of both solutions.

Order

Given the sublist nodes X and Y, order(X, Y) can be
answered by first checking if the two nodes are in the same
sublist. If so, their order can be determined by comparing their local
labels. Otherwise the labels of their representatives in the first list-labeling problem are compared.
These comparisons take constant time.
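A sketch of this two-level comparison, with hypothetical example state; plain dictionaries stand in for the pointers and labels maintained by the data structure.

```python
# Each element knows its sublist and its local label; each sublist has a
# top-level label assigned by the outer list-labeling structure.
sublist_of    = {"x": "s1", "y": "s1", "z": "s2"}
local_label   = {"x": 100, "y": 300, "z": 50}
sublist_label = {"s1": 7, "s2": 12}

def order(a, b):
    if sublist_of[a] == sublist_of[b]:
        # Same sublist: compare local labels.
        return local_label[a] < local_label[b]
    # Different sublists: compare the labels of their representatives.
    return sublist_label[sublist_of[a]] < sublist_label[sublist_of[b]]

print(order("x", "y"))   # True: same sublist and 100 < 300
print(order("y", "z"))   # True: sublist s1 (label 7) precedes s2 (label 12)
```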

Insert

Given a new sublist node for X and a pointer to the sublist node Y, insert(X, Y) inserts X immediately after Y in the sublist of Y if there is room for X in the list, that is, if the length of the list is no greater than $2\log N$ after the insertion. Its local label is given by the local list-labeling algorithm for exponential labels. This case takes $O(1)$ amortized time.

If the local list overflows, it is split evenly into two lists of size $\log N$, and the items in each list are given new labels from their (independent) ranges. This creates a new sublist, which is inserted into the list of sublists, and the new sublist node is given a label in the list of sublists by the list-labeling algorithm. Finally, X is inserted into the appropriate list.
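A simplified sketch of the sublist-level part of insert(X, Y), assuming a sublist is just a Python list of elements kept in order; the assignment of local labels and the update of the outer list-labeling structure are indicated only in comments.

```python
import math

def insert_after(sublist, x, y, N):
    """Insert x immediately after y in y's sublist, splitting evenly on overflow.

    Returns one sublist if there was room, or two sublists after a split.
    """
    capacity = 2 * max(1, int(math.log2(N)))   # room for up to 2 log N elements
    sublist.insert(sublist.index(y) + 1, x)    # a local (exponential-range) label
                                               # would be assigned to x here
    if len(sublist) <= capacity:
        return [sublist]
    # Overflow: split evenly into two sublists of about log N elements; both halves
    # get fresh labels from independent ranges, and the new sublist would be inserted
    # into the outer list of sublists by the list-labeling algorithm (not shown).
    mid = len(sublist) // 2
    return [sublist[:mid], sublist[mid:]]

parts = insert_after(list("abcdefgh"), "x", "d", N=16)   # capacity 2 * log2(16) = 8
print(parts)   # the ninth element forces a split into sublists of sizes 4 and 5
```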

This sequence of operations takes $O(\log N)$ time, but there have been $\Omega(\log N)$ insertions since the list was created or last split. Thus the amortized time per insertion is $O(1)$.

Delete

Given a sublist node X to be deleted, delete(X) simply removes X from its sublist in constant time. If this leaves the sublist empty, then we need to remove its representative from the list of sublists, which takes $O(\log N)$ time. Since $\Omega(\log N)$ elements have been deleted from the sublist since it was first built, we can afford this cost, and the amortized cost of a deletion is $O(1)$.
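A sketch of the deletion logic, with a Python list of lists standing in for the structure; in the real data structure the removal of an emptied sublist goes through the outer list-labeling structure.

```python
def delete(sublists, sublist, x):
    """Remove x from its sublist; drop the sublist itself if it becomes empty."""
    sublist.remove(x)             # constant time with a real doubly linked list
    if not sublist:
        # Removing the representative costs O(log N) in the outer list-labeling
        # structure, amortized against the Omega(log N) deletions that emptied it.
        sublists.remove(sublist)

order_of_sublists = [["a", "b"], ["c"]]
delete(order_of_sublists, order_of_sublists[1], "c")
print(order_of_sublists)   # [['a', 'b']]: the emptied sublist was removed
```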

References

  1. ^ Dietz, Paul F. (1982), “Maintaining order in a linked list”, Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC ’82), New York, NY, USA: ACM, pp. 122–127, doi:10.1145/800070.802184, ISBN 978-0897910705.
  2. ^ Tsakalidis, Athanasios K. (1984), “Maintaining order in a generalized linked list”, Acta Informatica, 21 (1): 101–112, doi:10.1007/BF00289142, MR 0747173.
  3. ^ Dietz, P.; Sleator, D. (1987), “Two algorithms for maintaining order in a list”, Proceedings of the 19th Annual ACM Symposium on Theory of Computing (STOC ’87), New York, NY, USA: ACM, pp. 365–372, doi:10.1145/28395.28434, ISBN 978-0897912211. Full version,
    Tech. Rep. CMU-CS-88-113, Carnegie Mellon
    University, 1988.
  4. ^ Bender, Michael A.; Cole, Richard; Zito, Jack (2004), “Two Simplified Algorithms for Maintaining Order in a List”, https://www.researchgate.net/publication/2865732_Two_Simplified_Algorithms_for_Maintaining_Order_in_a_List. Retrieved 2019-06-14.
  5. ^ Bender, M.; Fineman, J.; Gilbert, S.; Kopelowitz, T.; Montes, P. (2017), “File Maintenance: When in Doubt, Change the Layout!”, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, doi:10.1137/1.9781611974782.98, eISBN 978-1-61197-478-2. Retrieved 2019-06-15.
  6. ^ Driscoll, James R.; Sarnak, Neil; Sleator, Daniel D.; Tarjan, Robert E. (1989), “Making data structures persistent”, Journal of Computer and System Sciences, 38 (1): 86–124, doi:10.1016/0022-0000(89)90034-2, MR 0990051.
  7. ^ Eppstein, David; Galil, Zvi; Italiano, Giuseppe F.; Nissenzweig, Amnon (1997), “Sparsification—a technique for speeding up dynamic graph algorithms”, Journal of the ACM, 44 (5): 669–696, doi:10.1145/265910.265914, MR 1492341.
  8. ^ Katriel, Irit; Bodlaender, Hans L. (2006), “Online topological ordering”, ACM Transactions on Algorithms, 2 (3): 364–379, CiteSeerX 10.1.1.78.7933, doi:10.1145/1159892.1159896, MR 2253786.
  9. ^ Aumann, Yonatan; Bender, Michael A. (1996), “Fault tolerant data structures”, Proceedings of the 37th Annual Symposium on Foundations of Computer Science (FOCS 1996), pp. 580–589, doi:10.1109/SFCS.1996.548517, ISBN 978-0-8186-7594-2.
  10. ^ Itai, Alon; Konheim, Alan G.; Rodeh, Michael (1981), “A Sparse Table Implementation of Priority Queues”, ICALP, pp. 417–431
  11. ^ Bulánek, Jan; Koucký, Michal; Saks, Michael E. (2015), “Tight Lower Bounds for the Online Labeling Problem”, SIAM Journal on Computing, vol. 44, pp. 1765–1797.
