Exercise: Database Normalization

Create an ER Diagram in Vertabelo for the Shakespeare database that puts the tables in 3NF.

First import the existing schema in Vertabelo. In pgAdmin, right-click the shakespeare schema and choose Backup.... Make sure the Format type is set to “Plain”, and on the “Dump Options #1” tab, select “Only schema”.

Then in Vertabelo choose File→New Model. Assign a name and set the Database engine to PostgreSQL 9.x. In the Initial model section choose “From SQL” and navigate to the schema-only backup that you just created in pgAdmin. When you hit the Start Modeling button, you should see all of the tables, with foreign key relationships represented as lines connecting the entities (tables).

Delete the wordform table, as we will not be using it.

  1. For each table (other than wordform), state the highest normal form (up to 3NF) the table satisfies. (For example, if you state 2NF, you are indicating that the table is not in 3NF). Explain how the table fails to satisfy the requirements of the lowest normal form that it fails. Consider issues of data redundancy, modification anomalies, and functional dependencies.
  2. Modify all tables based on your answer to #1 to bring them into 3NF. You may need to add or drop fields, change the primary key definition, or add new tables.
  3. Add a table genretype that decodes the single character code appearing in the genretype field of the work table, as follows: c = comedy, h = history, p = poems, s = sonnets, t = tragedy. Add a foreign key to the work table referencing this new table.
  4. Export the SQL from Vertabelo and run it in the scratch schema. Note, you may have to alter table names to avoid conflicting with other tables created by other students. Delete (DROP) the tables after you prove to yourself that they are created successfully.
  5. Write a query that returns the genre type name (not code) and the average number of words in the works in that genre type. Note that you will be writing this query by looking at your ERD, without any data to validate it against. We will go over the solution in class in two weeks.

Please submit via email your answers to #1 in a Word DOC(X), PDF, or Google Doc, with an embedded image of your ERD from #2 & 3, and the query you wrote in #5.