Assignment 3 — Data Modeling and Schema Design

This assignment involves several aspects of data modeling and schema design:

This is purely a paper and pencil (or vi or emacs) assignment — no programming is involved.

Part 1: Create a data model for a university

You are going to model a university. Some of the obvious entities are:

Your model also has to describe each Offering of a course, e.g. COMP115 in the Spring 2018 semester. An Offering of a Course is taught by one Professor, assisted by zero or more TAs. Associated with a course offering are details of evaluation: a set of tasks (exams, assignments), each with a weight. For example, the tasks might be:

These details may vary from one semester to the next, so you need to model this information to be associated with Course Offerings, not Courses. Your model also has to record each student's score (between 0 and 100) for each of these tasks.

With all of this information, the data model does not have to include the student's grade for the course, as it can be computed from the scores on each task. (Since this assignment is about data modeling and not querying, you won't need to write the query that computes the final grade.)

Part 1.1: Entity/Relationship diagram

Create an Entity/Relationship model for the university database. Your work should be in the form of an Entity/Relationship diagram, using the notation discussed in class (boxes for entities, labels or diamonds for relationships). For binary relationships, connecting lines should include arrows to indicate the one side of a relationship, as shown here.

A diagram prepared with a drawing tool would be ideal. Export your drawing as a JPG or PNG file, don't submit the Office document. If you must draw by hand, please be very concerned with legibility, and take a clear photo with no reflections or shadows.

Part 1.2: List of Entity Attributes

For each entity, list the attributes. Don't include the attributes in the diagram, list them separately, in tabular form.

Part 2: Database schema

Part 2.1: DDL

Provide a database schema corresponding to your Entity/Relationship model. This should be provided as DDL: a set of CREATE TABLE statements, (and possibly ALTER TABLE statements), and should be as complete as possible: column names, column types, constraints, primary keys, foreign keys.

Think carefully about the primary keys for each table. While it is always possible to use a surrogate key (e.g., a generated integer, or a UUID), you should also consider using natural keys when possible. For example, should the primary key of a Course be the course's catalog number or a surrogate (artificial) key?

Part 2.2: Discuss primary keys

Write a short discussion of how you designed the primary keys in your schema. What are the reasons for introducing surrogate keys? You don't have to discuss every table individually, as that would probably be repetitive. Just describe your general approach.

Part 3: Ruling out FDs based on data

Functional dependencies rule out certain states of a table. For example, if Y → Z applies to table T(Y, Z), then T can't have both rows (1, 10) and (1, 20). So if you look at data, you can rule out certain functional dependencies. If you see rows (1, 10) and (1, 20), then obviously Y → Z does not hold.

Part 3.1: Enumerate all possible functional dependencies for a 3-column table

Suppose we have a table with three columns, X, Y, and Z.

Enumerate all of the non-trivial functional dependencies that could possibly exist with this table, (e.g. X, Y → X is trivial because X is a subset of (X, Y)).

For example, a table with two columns has two possible non-trivial dependencies, X → Y and Y → X.

Part 3.2: Which functional dependencies are ruled out by this data?

Suppose the tables has these rows:

X Y Z
157
168
268
5710
6810
7120
7220
8930
9930

Of the functional dependencies you enumerated, which ones can be eliminated by examining this data? In each case, show the data that leads you to your conclusion.

Part 3.3: Same question, less data

Suppose, instead, the table has just these two rows:

X Y Z
157
168

What FDs can you rule out due to this data?

Part 3.4: Same question, much less data

Suppose, instead that the table is empty. What FDs can you rule out now? (Again, show the data.)

Part 4: BCNF decomposition

For each of the following problems, do the following:

Part 4.1: Problem 1

Table columns: V, W, X, Y, Z.

Functional dependencies:

Part 4.2: Problem 2

Table columns: W, X, Y, Z.

Functional dependencies:

Checklist

Here is a list of the items to be submitted for this assignment: