Your goal in this assignment is to implement the operators of the Relational Algebra. You will be extending the code you wrote for Assignment 1. As in the previous assignment, your code must pass provided unit tests, and manage memory properly (no leaks; no reads from, or writes to unallocated memory).
Get started as follows: (You need to be careful to get the newer versions of the provided C++ files. Doing the setup exactly as described here will ensure this.)
Note that the archive contained two source files that were not present in Assignment 1: RelationalAlgebra.h and RelationalAlgebra.cpp. RelationalAlgebra.h specifies the interface you are to implement, with documentation describing the behavior of each function. You will need to fill in the functions in RelationalAlgebra.cpp.
Table *onion(Table *r, Table *s);
Table *intersect(Table *r, Table *s);
Table *diff(Table *r, Table *s);
Table *product(Table *r, Table *s);
Table *rename(Table *r, const NameMap &name_map);
Table *select(Table *r, RowPredicate predicate);
Table *project(Table *r, const ColumnNames &columns);
Table *join(Table *r, Table *s);
(Note that union is a keyword in C++, so the union operator is represented by the onion function.)
Comments in RelationalAlgebra.h explain the behavior of each operator, along with a discussion of possible exceptions. You will need to implement these functions, in RelationalAlgebra.cpp. Your implementation will rely on the Database, Table and Row types, and as you proceed, you may find it useful to extend the interfaces of these types, (i.e., add member functions). Don't modify the provided interfaces, as the tests will then probably not compile.
Tables need names, but we don't really care about the names of these tables, (we don't use the names in any way). You can use Database::new_table_name() to generate a table name that will be guaranteed to be unique.
Each Relational Algebra operator creates a new table containing new rows. Yes, even rename. Specifically:
test.cpp contains 36 tests that test your relational algebra implementation, both "positive" cases (e.g. compute an intersection and see that the results are as expected), and "negative" ones, (e.g. try inserting a row with three columns into a table created with two columns).
Here is a typical positive test:
static void union_compatible_both_non_empty()
{
Table* r = Database::new_table("r", ColumnNames{"a", "b"});
add(r, {"1", "44"});
add(r, {"1", "55"});
add(r, {"2", "66"});
Table* s = Database::new_table("s", ColumnNames{"b", "a"});
add(s, {"1", "44"});
add(s, {"3", "77"});
Table* control = Database::new_table("control", ColumnNames{"x", "y"});
add(control, {"1", "44"});
add(control, {"1", "55"});
add(control, {"2", "66"});
add(control, {"3", "77"});
assert(table_eq(control, onion(r, s)));
}
This creates two tables, r and s, both with columns a and b (in different orders). Rows are added to the tables. Note that the {"1", "44"} row is added to each. Next, a control table is created. This table contains the expected result. In this table, note that {"1", "44"} appears once, because the output of any relational algebra operation is a table, and tables do not have duplicates. A buggy implementation might have two occurrences, violating uniqueness, and this would cause the test to fail.
The last statement in the code compares the union of r and s with the control, and asserts that output from the union and the control should have the same rows.
Here is a typical negative test:
static void join_no_join_columns()
{
Table* r = Database::new_table("r", ColumnNames{"a", "b"});
Table* s = Database::new_table("s", ColumnNames{"c", "d"});
try {
join(r, s);
fail();
} catch (JoinException& e) {
// expected
}
}
The test starts by creating a new table named r with columns named a and b, and a table s with columns c and d. Next, inside a try block, we try to join the tables. This should not succeed because there are no join columns (i.e., no columns with matching names in the two tables). The test expects that JoinException will be thrown. If it is not thrown, the test fails.
jao@mintyzack ~/teaching/tufts/115/a2/a2 $ make
g++ -std=c++11 -Wall -Wno-unused-function -c ColumnNames.cpp -o ColumnNames.o
g++ -std=c++11 -Wall -Wno-unused-function -c Database.cpp -o Database.o
g++ -std=c++11 -Wall -Wno-unused-function -c RelationalAlgebra.cpp -o RelationalAlgebra.o
g++ -std=c++11 -Wall -Wno-unused-function -c Row.cpp -o Row.o
g++ -std=c++11 -Wall -Wno-unused-function -c RowCompare.cpp -o RowCompare.o
g++ -std=c++11 -Wall -Wno-unused-function -c Table.cpp -o Table.o
g++ -std=c++11 -Wall -Wno-unused-function -c main.cpp -o main.o
g++ -std=c++11 -Wall -Wno-unused-function -c test.cpp -o test.o
g++ -std=c++11 -Wall -Wno-unused-function -c unittest.cpp -o unittest.o
g++ -std=c++11 -Wall -Wno-unused-function -c util.cpp -o util.o
g++ -std=c++11 -Wall -Wno-unused-function ColumnNames.o Database.o RelationalAlgebra.o Row.o RowCompare.o Table.o main.o test.o unittest.o util.o -o a2
jao@mintyzack ~/teaching/tufts/115/a2/a2 $ ./a2
:-( 1/36 -- union_incompatible: FAILED -- Not yet implemented!
:-( 2/36 -- union_compatible_different_names: FAILED -- Not yet implemented!
...
:-( 35/36 -- join_both_non_empty: FAILED -- Not yet implemented!
:-( 36/36 -- join_both_non_empty_2: FAILED -- Not yet implemented!
SUMMARY:
36 tests
0 passed
36 failed
When you successfully implement RelationalAlgebra, you should see this:
jao@mintyzack ~/teaching/tufts/115/a2/a2 $
As in Assignment 1, use valgrind frequently to rule out memory corruption and leaks.