Thursday, April 10, 2008

Find Duplicate in Linear Time

You are given an array of numbers that contains all numbers from 0 to n, in some random order, except that one number is missing and another number is repeated (thus, there are still n numbers in the array). Find the repeated number in linear time and do not use any other data structure.

The numbers in the array can just be placed in the appropriate place in the array, just use the number as it's own position index. If when you go to place a number i in position i you notice that a[i] already contains i then you know you found the duplicate. If not, place it and use the number that was there as your new index.

#!/usr/bin/python
from Numeric import *

def find(a):
i = 0
while True:
print a
if (a[i] == i):
i += 1
continue
if (a[a[i]] == a[i]):
print "Repeat is %d" % a[i]
return a[i]
c = a[a[i]]
a[a[i]] = a[i]
a[i] = c
i = c


#Test cases
a = array([0,3,2,1,4,4,6])
a = array([0,1,2,3,4,4,6])
a = array([4,4,2,3,0,1,6])
a = array([4,4,0,1,2,3,6])

find(a)

3 comments:

jmvidal said...

I don't know of a way to do it without writing to an array. If you have a better way, I'd love to see it!

Anunay said...

instead of i = c in the last line, why can't we do i++ ? If I replace i=c with i++, the program finds the duplicate with less number of iteration in all the test cases I could try.

I would like to know if there is a case where i = c will find the duplicate but the program would fail for i++?

Thanks

Mahaveer said...

What if numbers are signed?

ShareThis