## Thursday, April 10, 2008

### Sorting a Big File

You have a file that contains at most 10^7 positive integers, all distinct, all between 0 and 10^7. You want to sort these numbers but only have about 2 megabytes of RAMs. How do you do it?

Before you read the answer here is a tip: the best answer sorts the file in linear time reading every number into memory only once.

Give up? Since the file contains all distinct integers between 0 and 10,000,000, we can use bits to represent whether every number in this range exists. For this we need only 10^7 bits or 1,250,000 bytes (1.25MB) so it can fit in memory. Here is some pseudocode:

`boolean bit[10000000]; //An array of bits. I'm assuming a boolean array of size 8// uses only one byte.//Set it to 0for (int i = i; i < n; i++){  bit[i] = 0; }//read every number from filef = filestream("filename"); //pseudocodewhile (int i << f){  bit[i] = 1; };//write it outfor (int i = i; i < n; i++){  if (bit[i] == 1)    cout << i; };`

This is a very old question. It appears in Programming Pearls by Jon Bentley, a really fun read!