Solution: Majority Element

Solution for the Majority Element Problem.

We'll cover the following

Naive solution

Here is the naive algorithm for solving the Majority Element Problem with quadratic running time:


 MajorityElement(A[1..n])MajorityElement(A[1..n]):
 for ii from 11 to nn:
  currentElementcurrentElement \leftarrow A[i]A[i]
  countcount \leftarrow 0
  for jj from 11 to nn:
    if A[j]A[j] = currentElementcurrentElement:
    countcount \leftarrow count+1count + 1
  if count>n/2count > n/2:
    return 11
  return 00


In practice, we can scan the input sequence and save the number of occurrences of each element in an associative array. The running time of this solution depends on a particular implementation of an associative array. If it is implemented as a balanced search tree, every lookup in the array will cost O(logn)O(logn) and the overall running time will be O(nlogn)O(nlogn). For hash tables, the lookups are efficient in practice, though they might vary depending on the input data.

The divide-and-conquer strategy results in a simple algorithm with running time O(nlogn)O(n log n). A simple, but crucial idea: if ee is a majority element in a sequence, then ee must be a majority element in at least one of its halves. Note that the converse is not true; both halves of a sequence (2,3,3,7,5,7)(2,3,3,7,5,7) contain majority elements (33 and 77, respectively), but none of them is a majority element of the original sequence. This leads to the following algorithm: find a majority element recursively in both halves and for each of them check its number of occurrences in the original sequence. For the last step, we need two linear scans that can be performed in time O(n)O(n). Therefore, the running time T(n)T (n) satisfies T(n)2T(n/2)+O(n)T (n) ≤ 2T (n/2) + O(n) and therefore, T(n)=O(nlogn)T(n)=O(nlogn).

Exercise break: Can you design an even faster O(n)O(n) algorithm? It’s based on the following idea. Partition the input elements into pairs. For each pair, if the two elements are different, discard both of them; otherwise, discard one of them.

Code

The following code implements the divide-and-conquer algorithm discussed above. Note that we are using semiopen intervals for recursive calls.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.