Online Median Problem

Online Median Problem Statement

Given a list of numbers, the task is to insert these numbers into a stream and find the median of the stream after each insertion.
If the median is a non-integer, consider it’s floor value.

The median of a sorted array is defined as the middle element when the number of elements is odd and the mean of the middle two elements when the number of elements is even.

Example

{
"stream": [3, 8, 5, 2]
}

Output:

[3, 5, 5, 4]
Iteration Stream Sorted Stream Median
1 [3] [3] 3
2 [3, 8] [3, 8] (3 + 8) / 2 => 5
3 [3, 8, 5] [3, 5, 8] 5
4 [3, 8, 5, 2] [2, 3, 5, 8] (3 + 5) / 2 => 4

Notes

Constraints:

  • 1 <= length of stream <= 105
  • 1 <= any value in the stream <= 105
  • The stream can contain duplicates.

We have provided three solutions.

We will start with a brute-force and an optimized brute-force approach that solves the problem in polynomial time complexity, later we present an optimized solution that uses heap. Throughout this editorial, we will refer to the input array as stream and its size as n.

Online Median Solution 1: Brute Force

The idea is to follow the exact same steps described in the problem statement.

for i = 1 to n:

  • Create an array current_stream containing stream[1], stream[2], …, stream[i].
  • Sort current_stream.
  • Compute the median of the current_stream array.

Time Complexity

O(n2 * log(n)).

To iterate in n elements: O(n).

In each iteration,

  • Create and sort the current_stream array: O(n * log(n)).
  • Compute median: O(1).

Total complexity: O(n2 * log(n)).

Auxiliary Space Used

O(n).

Space used to create a temporary array in each iteration: O(n).

Space Complexity

O(n).

Space used for input: O(n).

Auxiliary space used: O(n).

Space used for output: O(n).

So, total space complexity: O(n).

Code For Online Median Solution 1: Brute Force


/*
Asymptotic complexity in terms of `n` = size of the input array:
* Time: O(n^2 * log(n)).
* Auxiliary space: O(n).
* Total space: O(n).
*/

vector<int> online_median(vector<int> &stream) {
    vector<int> medians;

    for (int i = 1; i <= stream.size(); i++) {
        // Slice the stream till i index.
        vector<int> current_stream(stream.begin(), stream.begin() + i);
        sort(current_stream.begin(), current_stream.end());

        int median;
        int size = current_stream.size();
        if (size % 2 == 0)
            median = (current_stream[size / 2] + current_stream[size / 2 - 1]) / 2;
        else
            median = current_stream[size / 2];

        medians.push_back(median);
    }
    return medians;
}

Online Median Solution 2: Optimised Brute Force

As the name suggests, the solution will be similar to brute_force_solution.cpp. We always need a sorted array after fetching a new element from the stream to compute the median, we can use the sorted array from the previous iteration and apply the concept of insertion sort where we insert this new element into the previously sorted array.

for i = 1 to n:

  • Insert stream[i] to an already sorted substream stream[1, 2, ..., i - 1].
  • Compute the median of the substream stream[1, 2, ..., i].

We use the concept of insertion sort to remove the log(n) factor from the time complexity and reduce the auxiliary space requirement to O(1).

Time Complexity

O(n2).

To iterate in n elements: O(n).

In each iteration,

  • Insert new element in the sorted array: O(n).
  • Compute median: O(1).

Total complexity: O(n2).

Auxiliary Space Used

O(1).

Space Complexity

O(n).

Space used for input: O(n).

Auxiliary space used: O(1).

Space used for output: O(n).

So, total space complexity: O(n).

Code For Online Median Solution 2: Optimised Brute Force


/*
Asymptotic complexity in terms of `n` = size of the input array:
* Time: O(n^2).
* Auxiliary space: O(1).
* Total space: O(n).
*/

vector<int> online_median(vector<int> &stream) {
    vector<int> medians;

    for (int i = 0; i < stream.size(); i++) {
        // Applying insertion sort.
        // Insert stream[i] in stream[0, 1,..., i - 1] which is already sorted.
        for (int j = i - 1; j >= 0; j--) {
            if (stream[j + 1] < stream[j])
                swap(stream[j + 1], stream[j]);
            else
                break;
        }

        int median;
        int current_elements = i + 1;
        if (current_elements % 2 == 0)
            median = (stream[current_elements / 2] + stream[current_elements / 2 - 1]) / 2;
        else
            median = stream[current_elements / 2];

        medians.push_back(median);
    }
    return medians;
}

Online Median Solution 3: Heap

The median of an array can be computed only when the array is sorted. To add an element from the stream, we need to maintain a sorted array, and adding an element in the sorted array requires O(sizeofsorted_array) time. As we need only the middle element/s, this complexity can be improved by using a min-heap and a max-heap.

The min-heap will store the larger half of the stream and the max-heap will store the lower half of the sorted stream. For every element that is added from the stream, we keep the sizes of the heaps the same or they differ maximum by 1. Without the loss of generality, we will have the extra element in max-heap whenever required. This way, if the total size of the stream is odd, the element on top of the max-heap is our median, else the floor of the average of the elements on the top of the min-heap and the max-heap is our required value.
Please have a look at the solution for a better understanding.

Time Complexity

O(n * log(n)).

To iterate in n elements: O(n).

In each iteration,

  • Add OR remove element to/from heap: O(log(n)).
  • Access top element of the heap: O(1).

Total complexity: O(n * log(n)).

Auxiliary Space Used

O(n).

Space used for max-heap = O(n / 2).

Space used for min-heap = O(n / 2).

Space Complexity

O(n).

Space used for input: O(n).

Auxiliary space used: O(n).

Space used for output: O(n).

So, total space complexity: O(n).

Code For Online Median Solution 3: Heap


/*
Asymptotic complexity in terms of `n` = size of the input array:
* Time: O(n * log(n)).
* Auxiliary space: O(n).
* Total space: O(n).
*/

priority_queue<int> max_heap;                             // To store the smaller half of the input numbers.
priority_queue<int, vector<int>, greater<int>> min_heap;  // To store the larger half of the input numbers.

void add_new_element(int num) {
    // Balancing heaps to make sure:
    // - smaller half of input numbers are always in the max heap
    // - larger half of input numbers are always in the min heap
    max_heap.push(num);
    min_heap.push(max_heap.top());
    max_heap.pop();

    // Maintain size property.
    // 1. max_heap.size() = min_heap.size(), when number of elements is even
    // 2. max_heap.size() = min_heap.size() + 1, when number of elements is odd
    if (min_heap.size() > max_heap.size()) {
        max_heap.push(min_heap.top());
        min_heap.pop();
    }
}

int get_current_stream_median() {
    // If number of elements in the stream is even.
    if (max_heap.size() == min_heap.size())
        return (max_heap.top() + min_heap.top()) / 2;

    // If number of elements in the stream is odd.
    return max_heap.top();
}

vector<int> online_median(vector<int> &stream) {
    vector<int> medians;

    for (int i = 0; i < stream.size(); i++) {
        add_new_element(stream[i]);
        medians.push_back(get_current_stream_median());
    }
    return medians;
}

We hope that these solutions to online medium problem have helped you level up your coding skills. You can expect problems like these at top tech companies like Amazon and Google.

If you are preparing for a tech interview at FAANG or any other Tier-1 tech company, register for Interview Kickstart’s FREE webinar to understand the best way to prepare.

Interview Kickstart offers interview preparation courses taught by FAANG+ tech leads and seasoned hiring managers. Our programs include a comprehensive curriculum, unmatched teaching methods, and career coaching to help you nail your next tech interview.

We offer 18 interview preparation courses, each tailored to a specific engineering domain or role, including the most in-demand and highest-paying domains and roles, such as:

‍To learn more, register for the FREE webinar.

Try yourself in the Editor

Note: Input and Output will already be taken care of.

Add Two Numbers Represented By Lists Problem

Binary Tree Level Order Traversal Problem with Examples

Jump Game

Zigzag Sort Problem

Boggle Solver Problem

Shortest String Transformation Using A Dictionary Problem

Register for our webinar

How to Nail your next Technical Interview

Loading_icon
Loading...
1 Enter details
2 Select slot
By sharing your contact details, you agree to our privacy policy.

Select a Date

Time slots

Time Zone:

Get tech interview-ready to navigate a tough job market

Best suitable for: Software Professionals with 5+ years of exprerience
Register for our FREE Webinar

Next webinar starts in

00
DAYS
:
00
HR
:
00
MINS
:
00
SEC