|< < 38 > >|

Joins

Block Nested Loops Join

Further improvement

This is still enumerating the cartesian product: match(r, s) is called pR NR pS NS times. Instead:

  • Build an in-memory hash table for each block of R.

  • When scanning S: for each row of S, search the hash table for matching rows of R.
for each block of B−2 pages of R: H: hash table for each r in block: H.insert(join columns(r), r) for each row s in S: for each row r in H.find(join columns(s)): output join(r, s)

This reduces CPU time, but does not change I/O costs.

|< < 38 > >|