Activity selection problem – Wikipedia

Combinatorial optimization problem

The activity selection problem is a combinatorial optimization problem concerning the selection of non-conflicting activities to perform within a given time frame, given a set of activities each marked by a start time (si) and finish time (fi). The problem is to select the maximum number of activities that can be performed by a single person or machine, assuming that a person can only work on a single activity at a time. The activity selection problem is also known as the Interval scheduling maximization problem (ISMP), which is a special type of the more general Interval Scheduling problem.

A classic application of this problem is in scheduling a room for multiple competing events, each having its own time requirements (start and end time), and many more arise within the framework of operations research.

Formal definition[edit]

Assume there exist n activities with each of them being represented by a start time si and finish time fi. Two activities i and j are said to be non-conflicting if sifj or sjfi. The activity selection problem consists in finding the maximal solution set (S) of non-conflicting activities, or more precisely there must exist no solution set S’ such that |S’| > |S| in the case that multiple maximal solutions have equal sizes.

Optimal solution[edit]

The activity selection problem is notable in that using a greedy algorithm to find a solution will always result in an optimal solution. A pseudocode sketch of the iterative version of the algorithm and a proof of the optimality of its result are included below.


Greedy-Iterative-Activity-Selector(A, s, f): 

    Sort A by finish times stored in f
    S = {A[1]} 
    k = 1
    n = A.length
    for i = 2 to n:
        if s[i]  f[k]: 
            S = S U {A[i]}
            k = i
    return S


Line 1: This algorithm is called Greedy-Iterative-Activity-Selector, because it is first of all a greedy algorithm, and then it is iterative. There’s also a recursive version of this greedy algorithm.

Note that these arrays are indexed starting from 1 up to the length of the corresponding array.

Line 3: Sorts in increasing order of finish times the array of activities

A{displaystyle A}

by using the finish times stored in the array

f{displaystyle f}

. This operation can be done in

O(nlogn){displaystyle O(ncdot log n)}

time, using for example merge sort, heap sort, or quick sort algorithms.

Line 4: Creates a set

S{displaystyle S}

to store the selected activities, and initialises it with the activity

A[1]{displaystyle A[1]}

that has the earliest finish time.

Line 5: Creates a variable

k{displaystyle k}

that keeps track of the index of the last selected activity.

Line 9: Starts iterating from the second element of that array

A{displaystyle A}

up to its last element.

Lines 10,11: If the start time

s[i]{displaystyle s[i]}

of the

ith{displaystyle ith}

activity (

A[i]{displaystyle A[i]}

) is greater or equal to the finish time

f[k]{displaystyle f[k]}

of the last selected activity (

A[k]{displaystyle A[k]}

), then

A[i]{displaystyle A[i]}

is compatible to the selected activities in the set

S{displaystyle S}

, and thus it can be added to

S{displaystyle S}


Line 12: The index of the last selected activity is updated to the just added activity

A[i]{displaystyle A[i]}


Proof of optimality[edit]


S={1,2,,n}{displaystyle S={1,2,ldots ,n}}

be the set of activities ordered by finish time. Assume that

AS{displaystyle Asubseteq S}

is an optimal solution, also ordered by finish time; and that the index of the first activity in A is

k1{displaystyle kneq 1}

, i.e., this optimal solution does not start with the greedy choice. We will show that

B=(A{k}){1}{displaystyle B=(Asetminus {k})cup {1}}

, which begins with the greedy choice (activity 1), is another optimal solution. Since

f1fk{displaystyle f_{1}leq f_{k}}

, and the activities in A are disjoint by definition, the activities in B are also disjoint. Since B has the same number of activities as A, that is,

|A|=|B|{displaystyle |A|=|B|}

, B is also optimal.

Once the greedy choice is made, the problem reduces to finding an optimal solution for the subproblem. If A is an optimal solution to the original problem S containing the greedy choice, then

A=A{1}{displaystyle A^{prime }=Asetminus {1}}

is an optimal solution to the activity-selection problem

S={iS:sif1}{displaystyle S’={iin S:s_{i}geq f_{1}}}


Why? If this were not the case, pick a solution B′ to S′ with more activities than A′ containing the greedy choice for S′. Then, adding 1 to B′ would yield a feasible solution B to S with more activities than A, contradicting the optimality.

Weighted activity selection problem[edit]

The generalized version of the activity selection problem involves selecting an optimal set of non-overlapping activities such that the total weight is maximized. Unlike the unweighted version, there is no greedy solution to the weighted activity selection problem. However, a dynamic programming solution can readily be formed using the following approach:[1]

Consider an optimal solution containing activity k. We now have non-overlapping activities on the left and right of k. We can recursively find solutions for these two sets because of optimal sub-structure. As we don’t know k, we can try each of the activities. This approach leads to an

O(n3){displaystyle O(n^{3})}

solution. This can be optimized further considering that for each set of activities in

(i,j){displaystyle (i,j)}

, we can find the optimal solution if we had known the solution for

(i,t){displaystyle (i,t)}

, where t is the last non-overlapping interval with j in

(i,j){displaystyle (i,j)}

. This yields an

O(n2){displaystyle O(n^{2})}

solution. This can be further optimized considering the fact that we do not need to consider all ranges

(i,j){displaystyle (i,j)}

but instead just

(1,j){displaystyle (1,j)}

. The following algorithm thus yields an

O(nlogn){displaystyle O(nlog n)}


Weighted-Activity-Selection(S):  // S = list of activities

    sort S by finish time
    opt[0] = 0  // opt[j] represents optimal solution (sum of weights of selected activities) for S[1,2..,j]
    for i = 1 to n:
        t = binary search to find activity with finish time <= start time for i
            // if there are more than one such activities, choose the one with last finish time
        opt[i] = MAX(opt[i-1], opt[t] + w(i))
    return opt[n]


External links[edit]