|< < 22 > >|

Unique

UniqueHash

Duplicates can also be eliminated by hashing.

If the input does not fit in memory

I.e., N > B

open(): Partition the table, so that duplicated rows are always in the same partition as one another:

Create B partitions on disk.
Use each disk buffer page to write out to one of the partitions.
For each input row, hash to one of the partitions.

|< < 22 > >|