Joins
Block Nested Loops Join
Further improvement
This is still enumerating the cartesian product: match(r, s) is called
pR NR pS NS times.
Instead:
- Build an in-memory hash table for each block of R.
- When scanning S: for each row of S, search the hash table for
matching rows of R.
for each block of B−2 pages of R:
H: hash table
for each r in block:
H.insert(join columns(r), r)
for each row s in S:
for each row r in H.find(join columns(s)):
output join(r, s)
This reduces CPU time, but does not change I/O costs.
|