1.1. Department, Major, Student

1.2. Department <- Major <- Student

1.3.

Entity ids determine entity attributes:

    DI -> DN, DA
    MI -> MN, MDI
    SI -> SN, SMI, SGD

Entity ids determine ids of associated ids in the one-to-many
direction):

    SI -> MI
    MI -> DI

MDI is a copy of DI, and MI is a copy of SMI. So:

    DI -> MDI
    MDI -> DI
    MI -> SMI
    SMI -> MI

1.4.

University is not BCNF
Candidate keys: (SI)
DI -> DN, DA is a violation
Closure: {DI, DN, DA, MDI}
Decomposition: 
  - D(DI, DN, DA, MDI)
  - MS(MI, MN, SI, SN, SMI, SGD, DI)

D is BCNF

MS is not BCNF
Candidate keys: (SI)
MI -> MN is a violation
Closure: {MI, MN, SMI}
Decomposition:
  - M(MI, MN, SMI)
  - S(SI, SN, SGD)

M is BCNF
S is BCNF

COMMENTS: 

- MDI and SMI are useless, as they just copy DI and MI. They could be
removed from the M and D tables, although this cannot be accomplished
by the BCNF decomposition.

- I was quite lenient in grading, accepting a wide variety of
different decompositions leading to approximately the same result as
obtained above, (e.g. omitting the SI/SMI and DI/MDI
dependencies). Also, any order of decompositions was accepted, (since
the decomposition algorithm specifies the selection of any FD
violating BCNF).

----------------------------------------------------------------------

2.1.

select distinct name
from salesperson join sale using (salesperson_id)
where when = '1/5/2018'

2.2.

select average(price)
from sale join sale_date using (when)
where weekday

2.3.

select name from salesperson
except
select name from salesperson join sale using (salesperson_id)

COMMENTS:

- A number of students had the clever idea of using an outer join:

select name
from salesperson p left join sale s using (salesperson_id)
where s.sale_id is null

This answer received full credit.

- To be really precise, the difference should be done on ids, because
  there could be two different salespeople with the same name, e.g.

select name from (
    select salesperson_id from salesperson
    except
    select salesperson_id from sale
) x

This answer is actually better.

I accepted both answers.

2.4.

select d.when, sum(s.price)
from sale_date d left join sale s using (when)
group by d.when
order by d.when

----------------------------------------------------------------------

3.

GENERAL COMMENTS: This is a question about B+-trees, not
B-trees. Answers that relied on records (as opposed to keys) in
interior nodes were penalized.

3.1. 

[10, 20]

3.2. 

       [3,7]
       | | |
   +---+ | +---+
   |     |     |
[1,2]  [3,6] [7,8]

COMMENTS: 

- Many students provided trees that, while correct for the set of
records, were not obtainable from the starting point by inserting 2
and then 6. 

3.3.

          [37,55]
          |  |  |
   +------+  |  +-----+
   |         |        |
[11,23] [37,42,53] [59,73]

3.4. 

                          [51]
                          |  |
              +-----------+  +---------+
              |                        |
           [27,49]                   [82]
           |  |  |                   |  | 
     +-----+  |  +----+         +----+  +--+
     |        |       |         |          |
[11,23,25] [37,42] [49,50] [64,73,80] [83,92,97]

COMMENTS:

- Interior node capacity is three keys. 49 is promoted to the
  [27,51,82] node, requiring a split and further promotion of 51. This
  leaves [82] all by itself. This violates the rule on space
  utilization (min 50%), but this is unavoidable in a B+-tree with
  wuch a low (and odd) capacity.

3.5.

Two possible answers, depending on whether the middle node is merged
with its left neighbor or its right neighbor, (both acceptable).

1) 

        [55]
        |  |
     +--+  +--+
     |        |
[11,23,42] [64,73]

2) 

      [42]
      |  |
   +--+  +--+
   |        |
[11,23] [42,64,73]

COMMENTS: Leaving a leaf with 0-1 records was not an acceptable
answer.

----------------------------------------------------------------------

4. GENERAL COMMENTS:

There seems to have been some confusion on this question. An index can
be used even if the predicate specifies a PREFIX of the key
columns. E.g., consider 4.1. The predicate is a = 1 and b = 2 and c >
3. The abc index can be used for this predicate. The ca index can ALSO
be used, because the c column is the first column of the index. The ca
index won't help with the a = 1 part of the query, but the index has
still been used.

So many students seem to have assumed that ALL index columns must be
used (to consider the index for the query), that I put down the low
scores on this question to a lack of clarity in the question itself. I
will be scaling up the scores on this question.

The answer below shows what indexes have been used, and which columns
of the index. So for 4.1 again, abc indicates that all columns of the
abc index can be used to evaluate the query. c(a) indicates that only
the c column of the ca index is used. (I did not ask you to specify
which columns are used. That information is provided here to explain
my answers.)

4.1. abc        c(a)

4.2. abc        ca

4.3. a(bc)      ca

4.4. a(bc)      ca

4.5. ab(c)

4.6. a(bc)

4.7. a(bc)      ca

4.8. c(a)