Tuesday, April 15, 2008

Nth Smallest Number

Find the nth smallest number in an unsorted array of numbers. Do it in linear time and without using any added memory.


It is easy to come up with a solution to this problem: sort it then return the nth element. The problem is that sorting is O(n log n). The insight comes from knowing how quicksort is implemented and then realizing that it can be modified not to sort all elements but only try to sort those parts of the array that contain the nth element.

One should also ask about the size of n. If n is always a very small or very large (equal to the array size) value then a more straight-forward linear search might be better. The approach shown below works for all values of n and has a linear expected run time.

This algorithm was first published by C.A.R Hoare (quicksort) and appears in Programming Pearls as Problem 11.9.

#!/usr/bin/python
from Numeric import *
from random import *

def split(a):
"""Using a random pivot, order a such that
a[0..x-1] <= (pivot = a[x]) < a[x+1..]
Returns x
"""
pivot = randrange(0,len(a))
a[0],a[pivot] = a[pivot],a[0]
last = 1;
for i in range(1,len(a)):
if (a[i] <= a[0]):
a[last],a[i] = a[i],a[last]
last = last + 1
a[last-1],a[0] = a[0],a[last-1]
return last - 1

def sortNthElement(a, n, first = 0):
"""Sorts at least the nth element of a[]. That is, the nth
element of a sorted a[] is placed in the nth position. Other
elements are also moved in a[]."
""
if (len(a) <= 1):
return
mid = split(a)
if (n < first + mid):
sortNthElement(a[0:mid],n,first)
elif (n > first + mid):
sortNthElement(a[mid+1:],n,first+mid+1)


def findNthSmallest(a,n):
"""Returns the nth smallest number in a[].
Effects: modifies the order of numbers in a[]
"""
sortNthElement(a,n)
return a[n]

#Using a regular array, like a = [0,1,2], will not work because
# slicing (a[0:5]) those creates new copies, but slices of an
# array([]) still refer to the original.

def runTests():
l = []
for i in range(0,1000):
l.append(i)
a = array(l)
for i in range(0,1000):
shuffle(a)
choice = randrange(0,len(a))
assert (findNthSmallest(a,choice) == choice)
#now, with repeats
l = []
for i in range(0,1000):
l.append(i)
l.append(i)
a = array(l)
for i in range(0,1000):
shuffle(a)
choice = randrange(0,len(a))
assert (findNthSmallest(a,choice) == choice / 2)
print "Tests OK"



No comments:

ShareThis