Thursday, April 10, 2008

Find Duplicate in Linear Time

You are given an array of numbers that contains all numbers from 0 to n, in some random order, except that one number is missing and another number is repeated (thus, there are still n numbers in the array). Find the repeated number in linear time and do not use any other data structure.

The numbers in the array can just be placed in the appropriate place in the array, just use the number as it's own position index. If when you go to place a number i in position i you notice that a[i] already contains i then you know you found the duplicate. If not, place it and use the number that was there as your new index.

#!/usr/bin/python
from Numeric import *

def find(a):
i = 0
while True:
print a
if (a[i] == i):
i += 1
continue
if (a[a[i]] == a[i]):
print "Repeat is %d" % a[i]
return a[i]
c = a[a[i]]
a[a[i]] = a[i]
a[i] = c
i = c


#Test cases
a = array([0,3,2,1,4,4,6])
a = array([0,1,2,3,4,4,6])
a = array([4,4,2,3,0,1,6])
a = array([4,4,0,1,2,3,6])

find(a)

7 comments:

parth patel said...

How would you do it if your array was immutable.

parth patel said...

Moreover, there is a simpler solution to above problem than one you have given using mathematical serieses.

Jose M Vidal said...

I don't know of a way to do it without writing to an array. If you have a better way, I'd love to see it!

parth patel said...

Sorry, i was looking at a modified problem, if you are given that the numbers are only from 0 to n-2 (n-1 excluded).

Then you can use series formula along with the sum of the given array.

For the given problem, your solution is the optimal.

parth patel said...

There is one more similar qu.

If you are told that there is more than 1 duplicates and numbers are from 0 to n-2. Then how would you find one of the duplicates ?

Anunay said...

instead of i = c in the last line, why can't we do i++ ? If I replace i=c with i++, the program finds the duplicate with less number of iteration in all the test cases I could try.

I would like to know if there is a case where i = c will find the duplicate but the program would fail for i++?

Thanks

mahaveer said...

What if numbers are signed?

ShareThis