Primary Keys
Adding a primary key:
ALTER TABLE STATE ADD PRIMARY KEY (Name); Removing the primary key:
ALTER TABLE STATE DROP PRIMARY KEY;Lecture Notes
August 8, 2022 (05:47:16 PM)
CAMPUSSQLSQLSQL codingSQL statements for the COFFEE databaseAs of August 2022, the main author of those notes (Dr. Aubert) is not scheduled to teach CSCI 3410 - Database Systems in the forseeable future. As a result, those notes are archived.
Please, also note that pandoc-include-code and pandoc-numbering, required to compile those notes, are not compatible with the current version of pandoc (cf. the INSTALL.md instructions). As a result, compiling those notes will require increasing version tinkering.
These lecture notes are written in an elusive style: they are a support for the explanations that will be made at the board. Reading them before coming to the lecture will help you getting a sense of the next topic we will be discussing, but you may sometimes have trouble deciphering their … unique style.
On top of the notes, you will find in this document:
Any feedback is greatly appreciated. Please refer to https://spots.augusta.edu/caubert/db/ln/README.html#contributing for how to contribute to those notes. The syllabus is at https://spots.augusta.edu/caubert/db/, and the webpage for those notes is at https://spots.augusta.edu/caubert/db/ln/.
Please, refer to those notes using this entry (Aubert 2019):
@report{AubertCSCI3410-DatabaseSystems,
author={Aubert, Clément},
title={CSCI 3410 - Database Systems},
url={https://spots.augusta.edu/caubert/db/ln/},
urldate={2019-11-03},
year={2019},
institution={{School of Computer and Cyber Sciences, Augusta University}},
location={Augusta, Georgia, USA},
langid={en},
type={Lecture notes}
}There are four way to access the code shared in those lecture notes:
For this latter aspect, note that some portion of code starts with a path in comment, and are followed by a link, like so:
HW_HelloWorld.sqlThis means that this code can be found at
and that you can click the link below the code directly to access it1.
The SQL code frequently starts with
DROP SCHEMA IF EXISTS HW_NAME_OF_SCHEMA;
CREATE SCHEMA HW_NAME_OF_SCHEMA;
USE HW_NAME_OF_SCHEMA;This parts starts by deleting the schema HW_NAME_OF_SCHEMA if it exists, then create and use it: it allows the code to run independently of your installation. It needs to be used with care, though, since it would delete everything you have in the HW_NAME_OF_SCHEMA schema before re-creating it, but empty.
Finally, the comments
-- start snippet somethingand
-- end snippet somethingcan be ignored, as their are an artifice from pandoc-include-code to select which portion of the code to display in those notes.
A typical (meeting twice a week, ±17 weeks, ±30 classes) semester is divided as follows:
For information purposes, an indication like this:
marks the (usual) separation between two lectures.
To give you a sense of what you will be asked during the exams, quizzes and projects, or simply to practise, please find below the exams given previous semesters, in reverse chronological order. The quizzes are not indicated, but were generally a mix of up to five exercises and one problem from the relevant chapter(s).
Due to the Covid-19 pandemic, only one exam took place, and the final exam was taken remotely on D2L. A second project, more ambitious, was also asked from the students, and accounted for a large portion of their grade.
OR operator, and Exercise 3.30)SQL coding)SQL statements for the COFFEE database)SQL coding)SQL statements for the COFFEE database)The source code for those notes is hosted at rocketgit, typeset in markdown, and then compiled using pandoc and multiple filters (pandoc-numbering, the citeproc library, pandoc-include-code). The drawings use various LaTeX packages, including PGF, TikZ, tikz-er2, pgf-umlcd and tikz-dependency. The help from the TeX - LaTeX Stack Exchange community greatly improved this document. The u͟n͟d͟e͟r͟l͟i͟n͟e͟3 text is obtained using YayText, the unicode symbols are searched in the “Unicode characters and corresponding LaTeX math mode commands”. Finally, the pdf version of the document uses Linux Libertine fonts, the html version uses Futura.
Those lecture notes were created under an Affordable Learning Georgia Mini-Grant for Ancillary Materials Creation and Revision (Proposal M71).
Those lecture notes have greatly benefited from the contributions of many students, included but not limited to Crystal Anderson, Bobby Mcmanus, Minh Nguyen and Poonam Veeral. Additionally, (Redacted), Mark Holcomb, Assya Sellak, Sydney Strong and Patrick Woolard helped smash some bugs in the tools used to produce this document.
Please refer to https://spots.augusta.edu/caubert/db/ln/README.html#authors-and-contributors for a detail of the contributions.
You can find at the end of this document the list of references, and some particular resources listed at the beginning of each chapter. Let me introduce some of them:
Those resources are listed as complements, but it is not require to read them to understand the content of those notes. (Watt and Eng 2014) –being available free of charge– is more descriptive than the current notes, and as such can constitutes a great complement. Unfortunately, it lacks some technical aspects, and the database program aspect is not discussed in detail.
This work is under Creative Commons Attribution 4.0 International License or later.
Some figures and resources are borrowed from other sources, in which case it is indicated clearly.
There is a good chance that any programming language you can think of is Turing complete. Actually, even some of the extremely basic tools you may be using may be Turing complete. However, being complete does not mean being good at any task: it just means that any computable problem can be solved, but does not imply anything in terms of efficiency, comfort, or usability.
In theory, pretty much any programming language can be used to
But to obtain a system that is fast in reading and writing on the disk, convenient to search in the data, and that provides as many “built-in” tools as possible, one should use a specialized tool.
In those lecture notes, we will introduce one of this tool–the SQL programming language– and the theory underneath it–the relational model–. We will also observe that a careful design is a mandatory step before implementing a catalog, and that how good a catalog is can be assessed, and introduce the tools to do so. Finally, we will discuss how an application interacting with a database can be implemented and secured, and the alternatives to SQL offered by the NoSQL approach, as well as the limitations and highlights of both models.
A database (DB) is a collection of related data.
It has two components, the data (= information, can be anything, really) and the management (= logical organization) of the data, generally through a Database Management System.
A database
The mini-world is the part of the world, or universe, that will be represented in the database: as we can not represent the whole universe (every position of every atom at any given moment since the big-bang!), we must agree on what “slice” of it we should represent. Typically, a data-base designed to help in calculating students’ grades will include students’ names, transcript, classes taken, etc., but certainly not their height, favorite color or where they usually seat in class: altough all of this information is “part of the universe”, we will not need it and decide to exclude it from our data.
A DBMS has multiple components, as follows:
Note that
A DBMS contains a general purpose software that is used to
You can think of a tool to
Exactly like a program can have
a DBMS offers multiple (sub)tasks and can be interacted with different persons with different roles.
| Role | Task |
|---|---|
| Client | Specify the business statement, the specifications |
| DB Administrator | Install, configure, secure and maintain up-to-date the DBMS |
| Designer | Lay out the global organization of the data |
| Programmer | Implement the database, work on the programs that will interface with it |
| User | Provide, search, and edit the data (usually) |
In those lecture notes, the main focus will be on design and implementation, but we will have to do a little bit of everything, without forgetting which role we are currently playing.
From the business statement to the usage, a project generally follows one of this path:
Note that reverse-engineering can sometimes happen, i.e., if you are given a poor implementation and want to extract a relational model from it, to normalize it.
Let us consider the following:
STUDENT
| Name | Student_number | Class | Major |
|---|---|---|---|
| Morgan | 18 | 2 | IT |
| Bob | 17 | 1 | CS |
COURSE
| Course_name | Course_number | Credit_hours | Department |
|---|---|---|---|
| Intro. to CS | 1301 | 4 | CS |
| DB Systems | 3401 | 3 | CS |
| Principles of Scripting and Automation | 2120 | 3 | AIST |
SECTION
| Section_identifier | Course_num | Semster | Year | Instructor |
|---|---|---|---|---|
| 2910 | 1301 | Fall | 2019 | Kate |
| 9230 | 2103 | Spring | 2020 | Todd |
GRADE_REPORT
| Student_number | Section_identifier | Grade |
|---|---|---|
| 17 | 2910 | A |
| 18 | 2910 | B |
PREREQUISITE
| Course_number | Prerequisite_number |
|---|---|
| 2120 | 1301 |
| 1302 | 1301 |
You can describe the structure as a collection of relations, and a collection of columns:
RELATIONS
| Relation Name | Number of Columns |
|---|---|
| STUDENT | 4 |
| COURSE | 4 |
| SECCTION | 5 |
| GRADE_REPORT | 3 |
| PREREQUISITE | 2 |
COLUMNS
| Column Name | Datatype | Belongs to relation |
|---|---|---|
| Name | String | STUDENT |
| Student_number | Integer | STUDENT |
| Class | String | STUDENT |
| Major | String | STUDENT |
| Course_name | String | COURSE |
| Course_number | Integer | COURSE |
| Credit_hours | Integer | COURSE |
| Department | String | COURSE |
| … | … | … |
| Prerequisite_number | Integer | PREREQUISITE |
This organization will allow some interactions. For instance, we can obtain the answer to questions like
“What is the name of the course whose number is 1301?”,
“What courses is Kate teaching this semester?”,
“Does Bob meets the pre-requisite for 2910?”
Note that this last query is a bit different, as it forces us to look up information in multiple relations.
We should also be able to perform updates, removal, addition of records in an efficient way (using auxiliary files (indexes), optimization).
Finally, selection (for any operation) requires care: do we want all the records, some of them, exactly one?
Why are the files separated like that? Why do not we store the section with the course with the students? For multiple reasons:
In separating the datae, we also need to remember to be careful about consistency and referential integrity, which is a topic we will discuss in detail.
There is a gradation, from really abstract specification that is easy to modify, to more solidified description of what needs to be coded. When we will be discussing high-level models, we will come back to those notions. The global idea is that it is easier to move things around early in the conception, and harder once everything is implemented.
What is the difference between a database and the meta-data of the database?
Is a pile of trash a database? Why, or why not?
Define the word “miniworld”.
Expand the acronym “DBMS”.
Name two DBMS.
Name the four different kinds of action that can be performed on data.
Assign each of the following task to one of the “character” (administrator, client, etc.) we introduced:
| Task | Assigned to |
|---|---|
| Install a DBMS on a server. | |
| Sketch the schema so that the data will not be redundant. | |
| Write client-side application that uses the DBMS API. | |
| Establish the purpose of the database. |
List some of the tasks assigned to the Database Administrator.
Why do DBMS include concurrency control?
Do I have to change my DBMS if I want to change the structure of my data?
What is independence between program and data? Why does it matter?
Assume that I have a file where one record corresponds to one student. Should the information about the classes a student is taking (e.g. room, instructor, code, etc.) being stored in the same file? Why, or why not?
Which one comes first, the physical design, the conceptual design, or the logical design?
What is a virtual data? How can I access it?
The data is the information we want to store, the meta-data is its organization, how we are going to store it. Meta-data is information about the data, but of no use on its own.
No, because it lacks a logical structure.
The mini-world is the part of the universe we want to represent in the database. It is supposed to be meaningful and will serve a purpose.
Database Management System
Oracle RDBMS, IBM DB2, Microsoft SQL Server, MySQL, PostgreSQL, Microsoft Access, etc., are valid answers. Are not valid “SQL”, “NoSQL”, “Relational Model”, or such: we are asking for the names of actual softwares!
The four actions are:
We can have something like:
| Task | Assigned to |
|---|---|
| Install a DBMS on a server. | Administrator, IT service |
| Sketch the schema so that the data will not be redundant. | Designer |
| Write client-side application that uses the DBMS API. | Programmer, Developer |
| Establish the purpose of the database. | Client, business owner |
The database administrator is in charge of installing, configuring, securing and keeping up-to-date the database management system. They also control the accesses and the performance of the system, troubleshoot it, and create backup of the data.
DBMS have concurrency control to ensure that several users trying to update the same data will do so in a controlled manner. It is to avoid inconsistency to appear in the data.
Normally no, data and programs are independent. But actually, this is true only if the model does not change: shifting to a “less structured model”, e.g., one of the NoSQL models, can require to change the DBMS.
The application should not be sensible to the “internals” of the definition and organization of the data. It matters because having this independence means that changing the data will not require to change the programs.
If we were to store all the information about the classes in the student records, then we would have to store it as many time as its number of students! It is better to store it in a different file, and then to “link” the two files, to avoid redundancy.
The conceptual design.
It is a set of information that is derived from the database but not directly stored in it. It is accessed through queries. For instance, we can infer the age of a person if their date of birth is in the database, but strictly speaking the age is not an information stored in the database.
CAMPUS)Define a CAMPUS database organized into three files as follows:
BUILDING file storing the name and GPS coordinates of each building.ROOM file storing the building, number and floor of each room.PROF file storing the name, phone number, email and room number where the office is located for each professor.A database catalog is made of two part: a table containing the relations’ name and their number of columns, and a table containing the columns’ name, their data type, and the relation to which they belong. Refer to the example we made previously or consult, e.g., (Elmasri and Navathe 2010, fig. 1.3) or (Elmasri and Navathe 2015, fig. 1.3). Write the database catalog corresponding to the CAMPUS database.
Invent data for such a database, with two buildings, three rooms and two professors.
Answer the following, assuming all the knowledge you have of the situation comes from the CAMPUS database, which is an up-to-date and accurate representation of its miniworld:
CAMPUS)The database catalog should be similar to the following:
RELATIONS
| Relation name | Number of columns |
|---|---|
BUILDING |
3 |
ROOM |
3 |
PROF |
4 |
COLUMNS
| Column name | Datatype | Belongs to relation |
|---|---|---|
| Building_Name | Character(30) | Building |
| GPSLat | Decimal(9,6) | Building |
| GPSLon | Decimal(9,6) | Building |
| Building_Name | Character(30) | ROOM |
| Room_Number | Integer(1) | ROOM |
| Floor | Integer (1) | ROOM |
| Prof_Name | Character (30) | PROF |
| Phone | Integer (10) | PROF |
| Character (30) | PROF |
|
| Room_Number | Integer (1) | PROF |
For the data, you could have:
BUILDING file, we could have:(Allgood Hall, 33.47520, -82.02503)
(Institut Galilé, 48.959001, 2.339999)ROOM file, we could have:(Allgood Hall, 128, 1)
(Institut Galilé, 205, 3)
(Allgood Hall, 228, 2)PROF file, we could have:(Aubert, 839401, dae@ipn.net, 128)
(Mazza, 938130, Dm@fai.net, 205)If everything we knew about the campus came from that database, then
The relational data model (or relational database schema) is:
List_of_major as an enumerated data type, instead of just String), enforce some constraints (i.e., UNIQUE, to force all the values to be different), or even have a default value.NULL value.NULL is N/A, unknown, unavailable (or withheld).We now study constraints on the tuples. There are constraints on the scheme, for instance, “a relation cannot have two attributes with the same name”, but we studied those already. The goal of those constraints is to maintain the validity of the relations, and to enforce particular connexions between relations.
Those are part of the definition of the relational model and are independent of the particular relation we are looking at.
Those constraints are parts of the schema.
NOT NULL, UNIQUE).NULL5.Those last two constraints will be studied in the next section.
Constraints that cannot be expressed in the schema, and hence must be enforced by
Examples: “the age of an employee must be greater than 16”, “this year’s salary increase must be more than last year’s”.
Since we can not have two identical tuples in the same relation, there must be a subset of values that distinguish them. We study the corresponding subset of attributes.
Let us consider the following example:
| A | B | C | D |
|---|---|---|---|
| Yellow | Square | 10 | (5, 3) |
| Blue | Rectangle | 10 | (3, 9) |
| Blue | Circle | 9 | (4, 6) |
and the following sets of attributes:
| {A, B, C, D} | {A} | {B, C} | {D} | |
|---|---|---|---|---|
| Superkey ? | ✔ | ✘ | ✔ | ✔ |
| Key ? | ✘ | ✘ | ✘ | ✔ |
Note that here we “retro-fit” those definitions, in database design, they come first (i.e., you define what attributes should always distinguish between tuples before populating your database). We are making the assumption that the data pre-exist to the specification to make the concept clearer.
A foreign key (FK) is a set of attributes whose values must match the value in a tuple in another, pre-defined relation. Formally, the set of attributes FK in the relation schema R1 is a foreign key of R1 (“referencing relation”) that references R2 (“referenced relation”) if
NULLIf there is a foreign key from R1 to R2, then we say that there is a referential integrity constraint from R1 to R2. We draw it with an arrow from the FK to the PK. Note that it is possible that R1 = R2.
NULL. Note also that all the values must be different, as the same value cannot occur twice as the primary key of tuples: we don’t want to enter the same VIN twice, that would mean we are registering a car that was already registered in our database!NULL. Furthermore, their pair must be different from all the other values. Stated differently, you can have <GA, 1234>, <GA, 0000> and <NC, 1234> as values for the <State, Licence-num> pair, even if they have one element in common, what is forbidden is to have both element in common (i.e., you cannot have <GA, 1234> twice). If both elements were common, that would mean that we are registering a driver that was already in the database.NULL (which it could be), then it has to be a value that occurs as the VIN value of some tuple in the CAR relation. For the Insured-Driver-State and Insured-Driver-Licence-Num, the situation is similar: they must either both be NULL, or be values that occurs paired together as the values for State and Licence-Num in a tuple in the CAR relationship. If e.g. Insured-Car was containing the VIN of a car not in the CAR relation, that would mean we are trying to insure a car that is “not known” from the database’s perspective, something we certainly want to avoid.The operations you can perform on your data are of two kinds: retrievals and updates.
They are two constraints for updates:
A transaction is a series of retrievals and updates performed by an application program, that leaves the database in a consistent state.
In the following, we give examples of insertion, deletion and update that could be performed, as well as how they could lead a database to become inconsistent. The annotations (1.), (2.) and (3.) refer to the “remedies”, discussed afterward.
Insert <109920, Honda, Accord, 2012> into CAR
How things can go wrong:
NULL for any value of the attributes of the primary key (1.)Delete the DRIVER tuple with State = GA and Licence_number = 123
How things can go wrong:
Update Name of tuple in DRIVER where State = GA and Licence_number = 123 to Georges
How things can go wrong:
NULL for the any value of the attributes of the primary key (1.)When the operation leads the database to become inconsistent, you can either:
NULL, the corresponding value(s).What are the meta-data and the data called in the relational model?
Connect the dots:
| Row • | • Attribute | |
| Column header • | • Tuple | |
| Table • | • Relation |
What do we call the number of attributes in a relation?
At the logical level, does the order of the tuples in a relation matter?
What is the difference between a database schema and a database state?
What should we put as a value in an attribute if its value is unknown?
What, if any, is the difference between a superkey, a key, and a primary key?
Name the two kinds of integrity that must be respected by the tuples in a relation.
What is entity integrity? Why is it useful?
Are we violating an integrity constraint if we try to set the value of an attribute that is part of a primary key to NULL? If yes, which one?
If in a relation R1, an attribute A1 is a foreign key referencing an attribute A2 in a relation R2, what does this implies about A2?
Give three examples of operations.
What is the difference between an operation and a transaction?
Consider the following two relations:
COMPUTER(Owner, RAM, Year, Brand)
OS(Name, Version, Architecture)
For each, give
Give three different ways to deal with operations whose execution in isolation would result in the violation of one of the constraint.
Define what is the domain constraint.
Circle the correct statements:
Consider the following three relations:
For each relation, answer the following:
Consider the following three relations
What are the foreign keys in the ASSIGNED-TO relation? What are they refering?
In the ASSIGNED-TO relation, explain why the Date attribute is part of the primary key. What would happen if it was not?
Assuming the database is empty, are the following instructions valid? If not, what integrity constraint are they violating?
Insert <'AM-356', 'Surfliner', 2012> into TRAINInsert <NULL, 'Graham Palmer', 'Senior'> into CONDUCTORInsert <'XB-124', 'GPalmer', '02/04/2018'> into ASSIGNED-TOInsert <'BTed, 'Bobby Ted', 'Senior'> and <'BTed', 'Bobby Ted Jr.', 'Junior'> into CONDUCTORConsider the following relation schema and state:
| A | B | C | D |
|---|---|---|---|
| 2 | Blue | Austin | true |
| 1 | Yellow | Paris | true |
| 1 | Purple | Pisa | false |
| 2 | Yellow | Augusta | true |
Assuming that this is all the data we will ever have, discuss whenever {A, B, C, D}, {A, B} and {B} are superkeys and/or keys.
Consider the following relation and possible state. Assuming that this is all the data we will ever have, give two superkeys, and one key, for this relation.
| A | B | C | D |
|---|---|---|---|
| 1 | Austin | true |
Shelly |
| 1 | Paris | true |
Cheryl |
| 3 | Pisa | false |
Sheila |
| 1 | Augusta | true |
Ash |
| 1 | Pisa | true |
Linda |
Consider the following relation and possible state. Assuming that this is all the data we will ever have, give three superkeys for this relation, and, for each of them, indicate if they are a key as well.
| A | B | C | D |
|---|---|---|---|
| 1 | A | Austin | true |
| 2 | B | Paris | true |
| 1 | C | Pisa | false |
| 2 | C | Augusta | true |
| 1 | B | Augusta | true |
Consider the following two relations:
INSERT and one UPDATE instruction. Both should violate the integrity of your database.Consider the following two relations:
The meta-data is called the schema, and the data is called the relation state. You can refer to the diagram we studied at the beginnig of the Chapter for a reminder.
Row is Tuple, Column header is Attribute, Table is Relation.
The degree, or arity, of the relation.
No, it is a set.
The schema is the organization of the database (the meta-data), while the state is the state is the content of the database (the data).
NULL
A superkey is a subset of attributes such that no two tuples have the same combination of values for all those attributes. A key is a minimal superkey, i.e., a superkey from which we cannot remove any attribute without losing the uniqueness constraint. The primary key is one of the candidate key, i.e., the key that was chosen.
Referential integrity and entity integrity.
Entity integrity ensures that each row of a table has a unique and non-null primary key value. It allows to make sure that every tuple is different from the others, and helps to “pick” elements in the database.
Yes, the entity integrity constraint.
Then we know that A2 is the primary key of R2, and that A1 and A2 have the same domain.
Reading from the database, performing UPDATE or DELETE operations.
An operation is an “atomic action” that can be performed on the database (adding an element, updating a value, removing an element, etc.). A transaction is a series of such operations, and the assumption is that, even if it can be made of operations that, taken individually, could violate a constraint, the overall transaction will leave the database in a consistent state.
An operation whose execution in isolation would result in the violation of a constraint can either a) be “restricted” (i.e., not executed), b) result in a propagation (i.e., the tuples that would violate a constraint are updated or deleted accordingly), or c) result in some values in tuples that would violate a constraint to be set to a default value, or the NULL value (this last option works only if the constraint violated is the referential entity constraint).
The requirement that each tuple must have for an attribute A an atomic value from the domain dom(A), or NULL.
“Every key is a superkey.”, “Every primary key is a key.” and “Every superkey with one element is a key.” are correct statements.
To answer 1 and 2, the diagram would become:
For the last question, the answer is yes: based on the ISSN of the book, we can retrieve the author of the book. Hence, knowing which book was awarded which year, by looking in the GAINED-AWARD table, gives us the answer to that question.
NULL can be given as a value to an attribute that is part of the PK.'XB-124 and 'GPalmer' are not values in TRAIN.Ref and CONDUCTOR.CompanyID.INSERT <"A.H.", NULL> would violate the requirement not to have two tuples with the same value for the attributes that constitute the primary key in the BUILDING relation. UPDATE ROOM with CODE = 12 to Building = "G.C.C." would create an entry referencing a name in the BUILDING relation that does not exist.
Consider the relation representing classes taught in a university:
CLASS(Major, Number, Section, Instructor, Term, Year, Time, Weekdays, Room)
The goal is to be able to have multiple offerings (classes) of courses over several semesters. Here are some examples of values for the attributes:
| Attribute | Possible Value |
|---|---|
| Major | CSCI, AIST, CYBER, HIST, … |
| Number | 1301, 3401, 1201, … |
| Section | A, B, C, … |
| Instructor | John Smith, Sophie Adams, … |
| Term | Spring, Fall, … |
| Year | 1990, 2010, … |
| Time | 1400, 1230, 0900, … |
| Weekdays | M, MW, MWF, … |
| Room | UH 120, GCC 3014, … |
List three possible candidate keys and describe under what conditions each candidate key would be valid.
A cinema company wants you to design a relational model for the following set-up:
Propose a relational model for the following situation:
Propose a relational model for the following situation:
We want to design a relational model for an auction website. Members (that can be buyers, sellers, both or neither) can participate in the sale of items.
When creating your schema, do not add any new information, and try as much as possible to avoid relations that will create redundant data and NULL entries. Note that we should be able to uniquely determine the member account linked to the seller account, and similarly for buyers accounts. Furthermore, members can have at most one buyer and one seller account.
We want to design a relational model for an animal shelter, with three goals in mind: to keep track of the pets currently sheltered, of the veterinarian for each type of pet, and of each pet’s favorite toy (needed during a visit to the veterinarian!).
Follow the specification below:
When creating your schema (that you can draw at the back of previous page), do not add any new information (except possibly “id” attributes), and try as much as possible to avoid relations that will create redundant data and NULL entries. Identify the primary key for each relation that you create. When you are done, answer the true / false question below.
| With your model … | Yes | No |
|---|---|---|
| …it is possible to determine which pet don’t have a favorite toy. | ||
| …it is possible to determine what is the average stay in the shelter. | ||
| …it is possible to determine if a pet’s favorite toy is best suited for their type. | ||
| …it is possible for multiple types of animal to have the same veterinarian. | ||
| …it is possible for multiple veterinarians to be attributed to the same type. |
We discuss four possible choices:
All in all, {Major, Number, Term, Year, Section} seems like the safest choice.
A possible solution is:
Be careful: saying that a bill has a unique sponsor does not imply that a the sponsor is a good primary key for the bills: a house member could very well be the sponsor of multiple bills! It just implies that a single attribute is enough to hold the name of the sponsor.
For simplicity, we added an ID to our MEMBER and BILL relations. Note that having a “role” in the MEMBER relation to store the information about speaker, etc., would be extremely inefficient, since we would add an attribute to the ~435 members that would be NULL in ~430 of them.
A possible solution follows. The part that is the hardest to accomodate is the fact that a course can have multiple codes. We are reading here “cross-listed” as “a course that is offered under more than one departmental heading and can receive different codes (e.g., CSCI XXXX and AIST YYYY)”.
A possible solution follows.
In this model,
…it is possible to determine which pet don’t have a favorite toy.
…it is not possible to determine what is the average stay in the shelter, because their exit date is not stored.
…it is possible to determine if a pet’s favorite toy is best suited for their type.
…it is possible for multiple types of animal to have the same veterinarian, as the same value for “Veterinarian” could occur in multiple tuples in the TYPE relation. If both “Veterinarian” and “Name” were parts of the primary key, then that would not be the case.
…it is not possible for multiple veterinarians to be attributed to the same type, as the name of the type is the primary key in the TYPE relation.
SQL, but none of its implementation.This chapter will be “code-driven”: the code will illustrate and help you understand some concepts. You may want to have a look at the “Setting Up Your Work Environment” Section as early as possible in this lecture. On top of being a step-by-step guide to install and configure a relational database managment system, it contains a list of useful links.
| “Common” / Relational | SQL |
|---|---|
| “Set of databases” | Catalog (named collection of schema)7 |
| “Database” | Schema |
| Relation | Table |
| Tuple | Row |
| Attribute | Column, or Field |
A schema is made of
Type and domains are two different things in some implementations, cf. for instance PostgreSQL, where a domain is defined to be essentially a datatype with constraint.8
SQL is a programming language: it has a strict syntax, sometimes cryptic error messages, it evolves, etc. Some of its salient aspects are:
SQL is “kind of” case-insensitive9, does not care about spaces and new lines.--, multi-line comments uses /* …*/.;.The following is an adaptation of w3resource.com, the canonical source being MySQL’s documentation:
INTEGER (or its short-hand notation INT) or SMALLINT.FLOAT and DOUBLE (or its synonym, REAL). MySQL also allows the syntax FLOAT(M,D) or REAL(M,D), where the values can be stored up to M digits in total where D represents the decimal point.DECIMAL(10, 2) (or its synonym in MySQL NUMERIC).CHAR and VARCHAR: the length (resp. maximal length) of the CHAR (resp. VARCHAR) has to be declared, and CHAR are right-padded with spaces to the specified length. Historically, 255 was the size used, because it is the largest number of characters that can be counted with an 8-bit number, but, whenever possible, the “right size” should be used.BIT(1), and a boolean using BOOLEAN (or BOOL, both actually being aliases for TINYINT(1)).DATE, TIME, DATETIME and TIMESTAMP (which convert the current day / time to from the current time zone to UTC).There are many other datatypes, but they really depends on the particular implementation, so we will not consider them too much.
/* code/sql/HW_Faculty.sql */
-- We first drop the schema if it already exists:
DROP SCHEMA IF EXISTS HW_Faculty;
-- Then we create the schema:
CREATE SCHEMA HW_Faculty;
/*
Or we could have use the syntax:
CREATE DATABASE HW_FACUTLY;
*/
-- Now, let us create a table in it:
CREATE TABLE HW_Faculty.PROF (
Fname VARCHAR(15),
/*
No String!
The value "15" vas picked randomly, any value below 255 would
more or less do the same. Note that declaring extremely large
values without using them can impact the performance of
your database, cf. for instance https://dba.stackexchange.com/a/162117/
*/
Room INT,
/*
shorthand for INTEGER, are also available: SMALLINT, FLOAT, REAL, DEC
The "REAL" datatype is like the "DOUBLE" datatype of C# (they are actually synonyms in SQL):
more precise than the "FLOAT" datatype, but not as exact as the "NUMERIC" datatype.
cf. https://dev.mysql.com/doc/refman/8.0/en/numeric-types.html
*/
Title CHAR(3),
-- fixed-length string, padded with blanks if needed
Tenured BIT(1),
Nice BOOLEAN,
-- True / False (= 0) / Unknown
Hiring DATE,
/*
The DATE is always supposed to be entered in a YEAR/MONTH/DAY variation.
To tune the way it will be displayed, you can use the "DATE_FORMAT" function
(cf. https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_date-format),
but you can enter those values only using the "standard" literals
(cf. https://dev.mysql.com/doc/refman/8.0/en/date-and-time-literals.html )
*/
Last_seen TIME,
FavoriteFruit ENUM ('apple', 'orange', 'pear'),
PRIMARY KEY (Fname, Hiring)
);
/*
Or, instead of using the fully qualified name HW_Faculty.PROF,
we could have done:
USE HW_Faculty;
CREATE TABLE PROF(…)
*/
-- Let us use this schema, from now on.
USE HW_Faculty;
-- Let us insert some "Dummy" value in our table:
INSERT INTO PROF
VALUES (
"Clément", -- Or 'Clément'.
290,
'PhD',
0,
NULL,
'19940101', -- Or '940101', '1994-01-01', '94/01/01'
'090500', -- Or '09:05:00', '9:05:0', '9:5:0', '090500'
-- Note also the existence of DATETIME, with 'YYYY-MM-DD
-- HH:MM:SS'
'Apple' -- This is not case-sensitive, oddly enough.
);The following commands are particularly useful. They allow you to get a sense of the current state of your databases.
In the following, <SchemaName> should be substituted with an actual schema name.
SHOW SCHEMAS; -- List the schemas.
SHOW TABLES; -- List the tables in a schema.
DROP SCHEMA <SchemaName>; -- "Drop" (erase) SchemaName.You can also use the variation
DROP SCHEMA IF EXISTS <SchemaName>;that will not issue an error if <SchemaName> does not exist.
In the following, <TableName> should be substituted with an actual table name.
SHOW CREATE TABLE <TableName>-- Gives the command to "re-construct" TableName.
DESCRIBE <TableName>; -- Show the structure of TableName.
DROP TABLE <TableName>; -- "Drop" (erase) TableName.Note that if the table <TableName> you are trying to erase is referenced by other tables through foreign keys, you will obtain an error
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint failsyou must delete first the table containing the foreign key, as by default this operation is restricted.
If you want to erase a table if it exists, you can use the variation
DROP TABLE IF EXISTS <TableName>;that will not issue an error if <TableName> does not exist.
SELECT * FROM <TableName> -- List all the rows in TableName.
SHOW WARNINGS; -- Show the content of the latest warning issued.There are six different kind of constraints that one can add to an attribute:
NOT NULLUNIQUEDEFAULTCHECKWe already know the first two from the relational model. The other four are new, and could not be described in this model.
We will review them below, and show how they can be specified at the time the table is declared, or added and removed later. For more in-depth examples, you can refer to https://www.w3resource.com/mysql/creating-table-advance/constraint.php.
Note that all of them but DEFAULT are indeed, constraints, as they prevent the user from inserting some data (i.e. you can not insert NULL if the attribute has the constraint NOT NULL). DEFAULT is not a constraint in that sense, as it does not prevent some data from being inserted, but it is called a constraint nevertheless. We will see another example of such “helper” qualification with AUTO-INCREMENT.
We will now see how to declare those constraints when we create the table (except for the foreign key, which we save for later).
/* code/sql/HW_ConstraintsPart1.sql */
DROP SCHEMA IF EXISTS HW_ConstraintsPart1;
CREATE SCHEMA HW_ConstraintsPart1;
USE HW_ConstraintsPart1;
CREATE TABLE HURRICANE (
Name VARCHAR(25) PRIMARY KEY,
WindSpeed INT DEFAULT 76 CHECK (WindSpeed > 74 AND
WindSpeed < 500),
-- 75mph is the minimum to be considered as a hurricane
-- cf. https://www.hwn.org/resources/bws.html
Above VARCHAR(25)
);
CREATE TABLE STATE (
Name VARCHAR(25) UNIQUE,
Postal_abbr CHAR(2) NOT NULL
);If we wanted to combine multiple constraints, we could10, but we would have to follow the order described at https://dev.mysql.com/doc/refman/8.0/en/create-table.html, which is NOT NULL, DEFAULT, AUTO_INCREMENT, UNIQUE, PRIMARY KEY, CHECK (even if, in practise, derivation from this order is oftentimes accepted by DBMSes).
MySQL can output a description of those tables for us:
MariaDB [HW_ConstraintsPart1]> DESCRIBE HURRICANE;
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| Name | varchar(25) | NO | PRI | NULL | |
| WindSpeed | int(11) | YES | | 76 | |
| Above | varchar(25) | YES | | NULL | |
+-----------+-------------+------+-----+---------+-------+
3 rows in set (0.01 sec)
MariaDB [HW_ConstraintsPart1]> DESCRIBE STATE;
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| Name | varchar(25) | NO | PRI | NULL | |
| Postal_abbr | char(2) | NO | UNI | NULL | |
+-------------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)Note that more than one attribute can be the primary key, in which case the syntax needs to be something like the following:
/* code/sql/HW_PKtest.sql */
DROP SCHEMA IF EXISTS HW_PKtest;
CREATE SCHEMA HW_PKtest;
USE HW_PKtest;
CREATE TABLE TEST (
A INT,
B INT,
PRIMARY KEY (A, B)
);Note that in this case, a statement like
INSERT INTO TEST VALUE (1, NULL);would result in an error: all the values that are part of the primary key needs to be non-NULL.
For the UNIQUE constraint, note that NULL can be inserted: the rationale is that all the values need to be different from one another or NULL.
A couple of comments about the CHECK constraint:
WindSpeed INT CHECK (WindSpeed > 74 AND WindSpeed < 500) would have been parsed but would not have any effect, cf. https://mariadb.com/kb/en/constraint/#check-constraints. Since MariaDB 10.2.1, the CHECK constraint are enforced.CHECK constraint, with a command likeINSERT INTO HURRICANE VALUES ("Test1", 12, NULL);then the insertion would not take place, and the system would issue an error message:
ERROR 4025 (23000): CONSTRAINT `HURRICANE.WindSpeed` failed for `HW_ConstraintsPart1]>`.`HURRICANE`To use the DEFAULT value, use
INSERT INTO HURRICANE VALUES ("Test2", DEFAULT, NULL);Note that, by default, the DEFAULT value is NULL, regardless of the datatype. You can experiment it by running the following code:
Let us know pretend that we want to edit some attributes, by either adding or removing constraints. SQL’s syntax is a bit inconsistent on this topic, because it treats the constraints as being of different natures.
Adding a primary key:
ALTER TABLE STATE ADD PRIMARY KEY (Name); Removing the primary key:
ALTER TABLE STATE DROP PRIMARY KEY;UNIQUE ConstraintAdding a UNIQUE constraint:
ALTER TABLE STATE ADD UNIQUE (Postal_abbr);Removing a UNIQUE constraint:
ALTER TABLE STATE DROP INDEX Postal_abbr;Note the difference between adding and removing the UNIQUE constraint: the parenthesis around (Postal_abbr) are mandatory when adding the constraint, but would cause an error when removing it!
NOT NULL ConstraintAdding the NOT NULL constraint:
ALTER TABLE STATE MODIFY Postal_abbr CHAR(2) NOT NULL;Removing the NOT NULL constraint:
ALTER TABLE STATE MODIFY Postal_abbr CHAR(2);The syntax of NOT NULL comes from the fact that this constraint is taken to be part of the datatype.
Changing the default value:
ALTER TABLE HURRICANE ALTER COLUMN WindSpeed SET DEFAULT 74;Removing the default value:
ALTER TABLE HURRICANE ALTER COLUMN WindSpeed DROP DEFAULT;Note that if you change the default value, it does not change the values you inserted retro-actively. To resume on our previous example, the values inserted with DEFAULT as a value would still be NULL even after executing the following instruction:
Adding a foreign key constraint:
ALTER TABLE HURRICANE ADD FOREIGN KEY (Above) REFERENCES STATE(Name); Removing a foreign key constraint is out of the scope of this lecture. If you are curious, you can have a look at https://www.w3schools.com/sql/sql_foreignkey.asp: dropping a foreign key constraint requires your constraint to have a name, something we did not introduce.
Two important remarks:
Refer to Problem 3.4 (Constraints on foreign keys) for a slightly more accurate picture of the constraints related to the creation of foreign keys. Note that a foreign key could be declared at the time of creation of the table as well, using the syntax we will introduce below.
Let us test our constraints:
INSERT INTO STATE VALUES('Georgia', 'GA');
INSERT INTO STATE VALUES('Texas', 'TX');
INSERT INTO STATE VALUES('FLORIDA', 'FL');
UPDATE STATE SET Name = 'Florida'
WHERE Postal_abbr = 'FL';
-- There's an error with the following request. Why?
INSERT INTO HURRICANE VALUES('Irma', 150, 'FL');
/*
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`HW_ConstraintsPart1`.`HURRICANE`, CONSTRAINT `HURRICANE_ibfk_1` FOREIGN KEY (`Above`) REFERENCES `STATE` (`Name`))
*/
INSERT INTO HURRICANE VALUES('Harvey', DEFAULT, 'Texas');
INSERT INTO HURRICANE VALUES('Irma', 150, 'Florida');
DELETE FROM HURRICANE
WHERE Name = 'Irma';
INSERT INTO HURRICANE VALUES('Irma', 150, 'Georgia');
UPDATE HURRICANE SET Above = 'Georgia'
WHERE Name = 'Irma';
/*
MariaDB [HW_ConstraintsPart1]> SELECT * FROM HURRICANE;
+--------+-----------+---------+
| Name | WindSpeed | Above |
+--------+-----------+---------+
| Harvey | 74 | Texas |
| Irma | 150 | Georgia |
+--------+-----------+---------+
*/
-- There's an error with the following request. Why?
UPDATE HURRICANE SET Above = 'North Carolina'
WHERE Name = 'Irma';
-- Let's patch it, by adding North Carolina to our STATE table.
INSERT INTO STATE VALUES('North Carolina', 'NC');
UPDATE HURRICANE SET Above = 'North Carolina'
WHERE Name = 'Irma';Let us come back more specifically to foreign key.
In the example below, we introduce the foreign key update and delete rules. We also introduce, passing by, the enumerated data type, and how to edit it.
CREATE TABLE STORM (
NAME VARCHAR(25) PRIMARY KEY,
Kind ENUM ("Tropical
Storm", "Hurricane"),
WindSpeed INT,
Creation DATE
);
-- We can change the enumerated datatype:
ALTER TABLE STORM MODIFY Kind ENUM ("Tropical Storm",
"Hurricane", "Typhoon");
CREATE TABLE STATE (
NAME VARCHAR(25) UNIQUE,
Postal_abbr CHAR(2) PRIMARY KEY,
Affected_by VARCHAR(25),
FOREIGN KEY (Affected_by) REFERENCES STORM (NAME) ON
DELETE SET NULL ON UPDATE CASCADE
);
Note that we can “inline” the foreign key constraint like we “inlined” the primary key constraint (cf. https://stackoverflow.com/q/24313143/), but that it will not be enforced!
Let us now illustrate this table by introducing some data in it:
INSERT INTO STORM
VALUES (
"Harvey",
"Hurricane",
130,
"2017-08-17");
-- In the following, the entry gets created, but date is
-- "corrected" to "2017-17-08"!
-- INSERT INTO STORM
-- VALUES ("Dummy", "Hurricane", 120,
-- "2017-17-08");
-- The error message returned is
-- ERROR 1292 (22007) at line 34:
-- Incorrect
-- date
-- value:
-- "2017-17-08" for column
-- `HW_STORM`.`STORM`.`Creation`
-- at
-- row 1
-- In the following, we explicitely use
-- "DATE",
-- and
-- since
-- the date is incorrect, nothing gets
-- inserted.
-- INSERT INTO STORM
-- VALUES ("Dummy2", "Hurricane", 120,
-- DATE
-- "2017-17-08");
-- ERROR 1525 (HY000) at line 40:
-- Incorrect
-- DATE
-- value:
-- "2017-17-08"
-- The next one sets NULL for DATE.
INSERT INTO STORM
VALUES (
"Irma",
"Tropical Storm",
102,
DEFAULT);
MySQL will always notify you if there is an error in a date attribute when you use the DATE prefix.
INSERT INTO STATE
VALUES (
"Georgia",
"GA",
NULL);
INSERT INTO STATE
VALUES (
"Texas",
"TX",
NULL);
INSERT INTO STATE
VALUES (
"Florida",
"FL",
NULL);
-- This instruction is not using the primary key, is that a
-- problem?
UPDATE
STATE
SET Affected_by = "Harvey"
WHERE Name = "Georgia";
UPDATE
STORM
SET Name = "Harley"
WHERE Name = "Harvey";
DELETE FROM STORM
WHERE Name = "Harley";
We will see in the “Reverse-Engineering” section why this schema is poorly designed, but for now, let’s focus on the foreign keys and their restrictions.
The following is a code-driven explanation of the foreign key update and delete rules (or “restrictions”). It is intended to make you understand the default behavior of foreig keys, and to understand how the system reacts to the possible restrictions.
CREATE TABLE F_Key(
Attribute VARCHAR(25) PRIMARY KEY
);
CREATE TABLE Table_default(
Attribute1 VARCHAR(25) PRIMARY KEY,
Attribute2 VARCHAR(25),
FOREIGN KEY (Attribute2) REFERENCES F_Key(Attribute)
);
-- By default, this foreign key will restrict.
CREATE TABLE Table_restrict(
Attribute1 VARCHAR(25) PRIMARY KEY,
Attribute2 VARCHAR(25),
FOREIGN KEY (Attribute2) REFERENCES F_Key(Attribute)
ON DELETE RESTRICT
ON UPDATE RESTRICT
);
CREATE TABLE Table_cascade(
Attribute1 VARCHAR(25) PRIMARY KEY,
Attribute2 VARCHAR(25),
FOREIGN KEY (Attribute2) REFERENCES F_Key(Attribute)
ON DELETE CASCADE
ON UPDATE CASCADE
);
CREATE TABLE Table_set_null(
Attribute1 VARCHAR(25) PRIMARY KEY,
Attribute2 VARCHAR(25),
FOREIGN KEY (Attribute2) REFERENCES F_Key(Attribute)
ON DELETE SET NULL
ON UPDATE SET NULL
);
/*
* You might encounter a
* ON UPDATE SET DEFAULT
* but this reference option (cf. https://mariadb.com/kb/en/library/foreign-keys/ )
* worked only with a particular engine ( https://mariadb.com/kb/en/library/about-pbxt/ )
* and will not be treated here.
*/
INSERT INTO F_Key VALUES('First Test');
INSERT INTO Table_default VALUES('Default', 'First Test');
INSERT INTO Table_restrict VALUES('Restrict', 'First Test');
INSERT INTO Table_cascade VALUES('Cascade', 'First Test');
INSERT INTO Table_set_null VALUES('Set null', 'First Test');
SELECT * FROM Table_default;
SELECT * FROM Table_restrict;
SELECT * FROM Table_cascade;
SELECT * FROM Table_set_null;
-- The following will fail because of the Table_default table:
UPDATE F_Key SET Attribute = 'After Update'
WHERE Attribute = 'First Test';
DELETE FROM F_Key
WHERE Attribute = 'First Test';
-- Let us drop this table, and try again.
DROP TABLE Table_default;-- The following fails too, this time because of the Table_restrict table:
UPDATE F_Key SET Attribute = 'After Update'
WHERE Attribute = 'First Test';
DELETE FROM F_Key
WHERE Attribute = 'First Test';
-- Let us drop this table, and try again.
DROP TABLE Table_restrict;
-- Let's try again:
UPDATE F_Key SET Attribute = 'After Update' WHERE Attribute = 'First Test';
-- And let's print the situation after this update:
SELECT * FROM Table_cascade;
SELECT * FROM Table_set_null;
/*
MariaDB [HW_CONSTRAINTS_PART3]> SELECT * FROM Table_cascade;
+------------+--------------+
| Attribute1 | Attribute2 |
+------------+--------------+
| Cascade | After Update |
+------------+--------------+
1 row in set (0.00 sec)
MariaDB [HW_CONSTRAINTS_PART3]> SELECT * FROM Table_set_null;
+------------+------------+
| Attribute1 | Attribute2 |
+------------+------------+
| Set null | NULL |
+------------+------------+
1 row in set (0.00 sec)
*/
-- Let's make a second test.
INSERT INTO F_Key VALUES('Second Test');
INSERT INTO Table_cascade VALUES('Default', 'Second Test');
INSERT INTO Table_set_null VALUES('Restrict', 'Second Test');
DELETE FROM F_Key
WHERE Attribute = 'Second Test';
/*
MariaDB [HW_CONSTRAINTS_PART3]> SELECT * FROM Table_cascade;
+------------+--------------+
| Attribute1 | Attribute2 |
+------------+--------------+
| Cascade | After Update |
+------------+--------------+
1 row in set (0.00 sec)
MariaDB [HW_CONSTRAINTS_PART3]> SELECT * FROM Table_set_null;
+------------+------------+
| Attribute1 | Attribute2 |
+------------+------------+
| Restrict | NULL |
| Set null | NULL |
+------------+------------+
2 rows in set (0.00 sec)
*/
Then note that we cannot create both tables as pictured directly, as PROF requires DEPARTMENT to exist, to have a foreign key referencing it, and similarly for DEPARTMENT: it is an egg and chicken situation! Hence, we have to first create a table without the foreign key, and then add it later on, as described below:
/* code/sql/HW_ProfExample.sql */
CREATE TABLE PROF (
Login VARCHAR(25) PRIMARY KEY,
NAME VARCHAR(25),
Department CHAR(5)
);
CREATE TABLE DEPARTMENT (
Code CHAR(5) PRIMARY KEY,
NAME VARCHAR(25),
Head VARCHAR(25),
FOREIGN KEY (Head) REFERENCES PROF (LOGIN) ON UPDATE CASCADE
);
ALTER TABLE PROF
ADD FOREIGN KEY (Department) REFERENCES DEPARTMENT (Code);
Note the structure of the ALTER TABLE command:
KEY Department REFERENCES Code;⇒ errorKEY (Department) REFERENCES (Code);⇒ errorKEY PROF(Department) REFERENCES DEPARTMENT(Code); ⇒ okCREATE TABLE STUDENT (
Login VARCHAR(25) PRIMARY KEY,
NAME VARCHAR(25),
Registered DATE,
Major CHAR(5),
FOREIGN KEY (Major) REFERENCES DEPARTMENT (Code)
);
CREATE TABLE GRADE (
Login VARCHAR(25),
Grade INT,
PRIMARY KEY (LOGIN, Grade),
FOREIGN KEY (LOGIN) REFERENCES STUDENT (LOGIN)
);
On a side note, note that we do not have the same difficulty when inserting a value in a table that contains a foreign key referencing itself: it is accepted to insert a value that is referencing itself, as illustrated below.
HW_FK_Self_Reference.sqlWe can insert multiple values at once:
HW_ProfExample.sqlWe can specify which attributes we are giving:
HW_ProfExample.sqlAnd we can even specify the order (even the trivial one):
INSERT INTO PROF (
LOGIN,
Department,
Name)
VALUES (
"caubert",
"CS",
"Clément Aubert");
INSERT INTO PROF (
LOGIN,
Name,
Department)
VALUES (
"aturing",
"Alan Turing",
"CS"),
(
"perdos",
"Paul
Erdős",
"MATH"),
(
"bgates",
"Bill Gates",
"CYBR");
INSERT INTO STUDENT (
LOGIN,
Name,
Registered,
Major)
VALUES (
"jrakesh",
"Jalal Rakesh",
DATE "2017-12-01",
"CS"),
(
"svlatka",
"Sacnite Vlatka",
"2015-03-12",
"MATH"),
(
"cjoella",
"Candice Joella",
"20120212",
"CYBR"),
(
"aalyx",
"Ava Alyx",
20121011,
"CYBR"),
(
"caubert",
"Clément Aubert",
NULL,
"CYBR");
INSERT INTO GRADE
VALUES (
"jrakesh",
3.8),
(
"svlatka",
2.5);
(Note the date literals)
By default, the values that are not given are set to their respective DEFAULT values.
Note that we can create foreign keys to primary keys made of multiple attributes, and to the primary key of the table we are currently creating.
/* code/sql/HW_AdvancedFK.sql */
CREATE TABLE T1 (
A1 INT,
A2 INT,
B INT,
PRIMARY KEY (A1, A2)
);
CREATE TABLE T2 (
A1 INT,
A2 INT,
B1 INT PRIMARY KEY,
B2 INT,
-- We can create a "pair" of foreign key in one line, as
-- follows:
FOREIGN KEY (A1, A2) REFERENCES T1 (A1, A2),
-- We can create a foreign key that references the primary
-- key of the table we are currently creating, and name
-- it, as follows:
CONSTRAINT My_pk_to_T1 FOREIGN KEY (B2) REFERENCES T2 (B1)
);In the example, we are also naming our foreign key. The benefit of naming our fk constraint is that, if we violate it, for instance with
INSERT INTO T2 VALUES (1, 1, 1, 3);then the name of the constraint (here “My_pk_to_T1”) will be displayed in the error message:
Cannot add or update a child row: a foreign key constraint fails (`db_9_9837c1`.`t2`, CONSTRAINT
`My_pk_to_T1` FOREIGN KEY (`B2`) REFERENCES `t2`(`B1`))Order of clauses does not matter, not even for optimization purpose.
UPDATE <table>
SET <attribute1> = <value1>, <attribute2> = <value2>, …
WHERE <condition>; SELECT <attribute list, called projection attributes>
FROM <table list>
WHERE <condition>;DELETE FROM <table list>
WHERE <condition>;Conditions can
=, equal to>, greater than,< less than,>= greater than or equal to,<= less than or equal to,<> not equal to.condition1 AND condition2condition1 OR condition2NOT conditionLIKE,\,_ will match one character (any character), % will match any number of character,REGEXP keyword.SELECT LOGIN
FROM STUDENT;
UPDATE
DEPARTMENT
SET Head = "aturing"
WHERE Code = "MATH";
UPDATE
DEPARTMENT
SET Head = "bgates"
WHERE Code = "CS"
OR Code = "CYBR";
SELECT LOGIN
FROM STUDENT
WHERE NOT Major = "CYBR";
SELECT LOGIN,
Name
FROM PROF
WHERE Department = "CS";
SELECT LOGIN
FROM STUDENT
WHERE Major = "CYBR"
AND Registered > DATE "20121001";
SELECT LOGIN
FROM STUDENT
WHERE Name LIKE "Ava%";
SELECT Name
FROM PROF
WHERE LOGIN LIKE "_aubert";
Note that LIKE is by default case-insensitive, both in MariaDB and in MySQL. The COLLATE operator can be used to force the search to be case-sensitive, as well as LIKE BINARY.
Cf. (Elmasri and Navathe 2010, 5.1.1), (Elmasri and Navathe 2015, 7.1.1)
The Boolean logic in SQL is three-valued: a statement can be true, false or unknown. If you pick the following two commands:
/* code/sql/HW_DefaultTest.sql */
SELECT *
FROM TEST
WHERE TestA = "A";
SELECT *
FROM TEST
WHERE TestA <> "A";
you may believe that they will capture all the tuples in the TEST table, as the value for TestA is either "A" or not "A", but you would be wrong. If the value of TestA is NULL, then both conditions would fail, as SQL cannot say that the value is or is not "A": it is simply undefined!
NULLNULL is
Unknown value (“Nobody knows”)
What is the date of birth of Jack the Ripper?
Unavailable / Withheld (“I do not have that information with me at the moment”)
What is the number of english spies in France?
What is the VIN of your car?
What is the identity of the Tiananmen Square person?
Not Applicable (“Your question does not make sense”)
What is the US SSN of a French person?
What is the email address of an author of the XIXth century?
If NULL is involved in a comparison, the result evaluates to “Unknown.”
| AND | T | F | U |
| T | T | F | U |
| F | F | F | F |
| U | U | F | U |
| OR | T | F | U |
| T | T | T | T |
| F | T | F | U |
| U | T | U | U |
| NOT | |
| T | F |
| F | T |
| U | U |
You can test if a value is NULL with IS NULL.
INSERT INTO DEPARTMENT
VALUES (
"Hist",
"History",
NULL);
SELECT *
FROM DEPARTMENT
WHERE Head IS NULL;
SELECT *
FROM DEPARTMENT
WHERE Head IS NOT NULL;
SELECT COUNT(*)
FROM GRADE
WHERE Grade IS NULL;
Note that you can not use IS to compare values: this key word is reserved to test if a value is (not) NULL, and nothing else.
This means that if you want to capture all the tuples, you cannot write
/* code/sql/HW_DefaultTest.sql */
SELECT *
FROM TEST
WHERE TestA = "A";
SELECT *
FROM TEST
WHERE TestA <> "A";
but should have something like
HW_DefaultTest.sqlThere are no if…then…else statements in SQL, but you can do something similar with CASE (cf. https://dev.mysql.com/doc/refman/8.0/en/case.html). However, note that SQL is probably not the right place to try to control the flow of execution.
This probably depends on the system a lot, but one could wonder if MySQL uses some form of short-cut evaluation when comparing with NULL. Unfortunately, even with three times (!) the verbose option, MySQL does not give more insight as to whenever it drops comparing values once a NULL was encountered (cf. https://dev.mysql.com/doc/refman/8.0/en/mysql-command-options.html#option_mysql_verbose, you can log-in using mysql -u testuser -p --password=password -v -v -v to activate the most verbose mode). Normally, EXPLAIN (https://dev.mysql.com/doc/refman/8.0/en/explain.html) should be useful in answering this question, but failed to answer it as well.
For DISTINCT, ALL and UNION, cf. (Elmasri and Navathe 2010, 4.3.4) or (Elmasri and Navathe 2015, 6.3.4). For ORDER BY, cf. (Elmasri and Navathe 2010, 4.3.6) or (Elmasri and Navathe 2015, 6.3.6). For aggregate functions, cf. (Elmasri and Navathe 2010, 5.1.7) or (Elmasri and Navathe 2015, 7.1.7).
AUTO_INCREMENTSomething that is not exactly a constraint, but that can be used to “qualify” domains, is the AUTO_INCREMENT feature of MySQL. Cf. https://dev.mysql.com/doc/refman/8.0/en/example-auto-increment.html, you can have MySQL increment a particular attribute (most probably intended to be your primary key, or some form of counter) for you.
A typical example could be:
/* code/sql/HW_AutoIncrement.sql */
CREATE TABLE PERSON (
PersonID INT AUTO_INCREMENT,
Name VARCHAR(255),
PRIMARY KEY (PersonID)
);
INSERT INTO PERSON (
Name)
VALUES (
'Lars'),
(
'Kristina'),
(
'Sophie');
SELECT *
FROM PERSON;
This way, the burden of having to keep track of the persons’ ids is left to the program, and not to the person inserting data in the table.
We can save the current state, and start a series of transactions, with the command
START TRANSACTION;All the commands that follows are “virtually” executed: you can undo them all using
ROLLBACK;This puts you back in the state you were in before starting the transaction. If you want all the commands you typed in-between to be actually enforced, you can use the command
COMMIT;Nested transactions are technically possible, but they are counter-intuitive and should be avoided, cf. https://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-2630-nested-transactions-are-real/.
DISTINCT / ALLThe result of a SELECT query, for instance, is a table, and SQL treats tables as multi-set, hence there can be repetitions in the result of a query, but we can remove them:
SELECT DISTINCT Major FROM STUDENT;The default behaviour is equivalent to specifying ALL, and it display the duplicates. In this case, it would be
> SELECT Major FROM STUDENT;
+-------+
| Major |
+-------+
| CS |
| CYBR |
| CYBR |
| CYBR |
| MATH |
+-------+UNIONSet-theoretic operations are available as well. For instance, one can use:
(SELECT Login FROM STUDENT) UNION (SELECT Login FROM PROF);to collect all the logins from both tables.
There is also INTERSECT and EXCEPT in the specification, but MySQL does not implement them (cf. https://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems#Database_capabilities).
ORDER BYYou can have ORDER BY specifications:
SELECT LOGIN
FROM GRADE
WHERE Grade > 2.0
ORDER BY Grade;
SELECT LOGIN
FROM GRADE
WHERE Grade > 2.0
ORDER BY Grade DESC;
SELECT LOGIN,
Major
FROM STUDENT
ORDER BY Major,
Name;
ORDER BY order by ascending order by default.
You can use MAX, SUM, MIN, AVG, COUNT to peform simple operations.
SELECT MAX(Registered) FROM STUDENT;returns the “greatest” date of registration of a student, i.e., the date of the latest registration.
SELECT COUNT(Name) FROM STUDENT;returns the number of names, i.e., the number of students.
SELECT COUNT(DISTINCT Name) FROM STUDENT;returns the number of different names (which in this case is the same as the number of names, since we have no homonyms).
Note that AVG returns the average of all non-NULL values, as we can see on the following example:
/* code/sql/HW_Avg.sql */
CREATE TABLE TEST (
Test INT
);
INSERT INTO TEST
VALUES (
NULL),
(
0),
(
10);
SELECT AVG(Test)
FROM TEST;
-- Returns 5.0The same goes for e.g. MAX:
HW_Max.sqlWe can use aliases for the columns. Compare
SELECT Login FROM PROF;
+---------+
| Login |
+---------+
| aturing |
| caubert |
| bgates |
| perdos |
+---------+with
SELECT Login AS "Username" FROM PROF;
+----------+
| Username |
+----------+
| aturing |
| caubert |
| bgates |
| perdos |
+----------+Aliases can also be used on table names. Adiases for columns are a helpful way of describing the result of the query, while alias on table have a specific purpose that will be clearer as we study select-project-join queries.
For select-project-join, cf. (Elmasri and Navathe 2010, 4.3.1) or (Elmasri and Navathe 2015, 6.3.1). For aliases, cf. (Elmasri and Navathe 2010, 4.3.2) or (Elmasri and Navathe 2015, 6.3.2), For nested queries, cf. (Elmasri and Navathe 2010, 5.1.2) or (Elmasri and Navathe 2015, 7.1.2).
Department.Name = 'Mathematics' is the selection conditionDepartment = Code is the join condition, because it combines two tuples.Name?Grade > 3.0 is the selection conditionSTUDENT.Login = GRADE.Login is the join conditionWe can have two join conditions!
SELECT PROF.Name
FROM PROF,
DEPARTMENT,
STUDENT
WHERE STUDENT.Name = "Ava Alyx"
AND STUDENT.Major = DEPARTMENT.Code
AND DEPARTMENT.Head = PROF.Login;
Note that for the kind of join we are studiying (called “inner joins”), the order does not matter.
In Problem 3.3 (Duplicate rows in SQL), we saw that SQL was treating tables as multi-sets, i.e., repetitions are allowed. This can lead to strange behaviour when performing Select-Project-Join queries. Consider the following example:
We can use aliases on tables to shorten the previous query:
SELECT PROF.Name
FROM PROF,
DEPARTMENT,
STUDENT AS B
WHERE B.Name = "Ava Alyx"
AND B.Major = DEPARTMENT.Code
AND DEPARTMENT.Head = PROF.Login;
We can use multiple aliases to make it even shorter (but less readable):
SELECT A.Name
FROM PROF AS A,
DEPARTMENT AS B,
STUDENT AS C
WHERE C.Name = "Ava Alyx"
AND C.Major = B.Code
AND B.Head = A.Login;
For those two, aliases are convenient, but not required to write the query. In some cases, we cannot do without aliases. For instance if we want to compare two rows in the same table:
SELECT Other.Login
FROM GRADE AS Mine,
GRADE AS Other
WHERE Mine.Login = "aalyx"
AND Mine.Grade < Other.Grade;
Generally, when you want to perform a join within the same table, then you have to “make two copies of the tables” and name them differently using aliases. Let us try to write a query that answers the question
What are the login of the professors that have the same department as the professor whose login is caubert?
We need a way of distinguising between the professors we are projecting on (the one whole login is caubert) and the one we are joining with (the ones that have the same department). This can be done using something like:
SELECT JOINT.Login
FROM PROF AS PROJECT,
PROF AS JOINT
WHERE PROJECT.Login = "caubert"
AND PROJECT.Department = JOINT.Department;
Note that we are “opening up two copies of the PROF tables”, and naming them differently (PROJECT and JOINT).
Another (improved) example of a similar query is
SELECT Fellow.Name AS "Fellow of Ava"
FROM STUDENT AS Ava,
STUDENT AS Fellow
WHERE Ava.Name = "Ava Alyx"
AND Fellow.Major = Ava.Major
AND NOT Fellow.Login = Ava.Login;
A couple of remarks about this query:
AS "Fellow of Ava" is another kind of aliasing, mentioned in a previous section.NOT Fellow.Login = Ava.Login guarantees that we will not select Ava again, and exclude her from the results (Ava is not supposed to be a fellow of herself).NOT Fellow.Name = Me.Name; instead of NOT Fellow.Login = Ava.Login would prevent the homonym from occuring in the results.AND NOT Me = Fellow by NOT Fellow.Login = Ava.Login would not work: you have to compare attributes of the tuples, not the tuples.Let us look at a first example
HW_ProfExample.sqlA nested query is made of an outer query (SELECT Login…) and an inner query (SELECT AVG(Grade)…). Note that the inner query does not terminate with a ;.
Logical operators such as ALL or IN can be used in nested queries. To learn more about those operators, refer to https://www.w3schools.com/sql/sql_operators.asp.
An example could be
HW_ProfExample.sqlNote that
>=, and not >, since no grade is strictly greater than itself.IS NOT NULL is needed: otherwise, if one of the grade is NULL, then the comparison would yelds “unknown”, and no grade would be greater than all of the others.MAX:Answering the question
What are the logins of the professors belonging to a department that is the major of at least one student whose name ends with an “a”?
–that sounds like the what would ask a police officer in a whodunit– could be answer using
HW_ProfExample.sqlFor this query, we could not use =, since more than one major could be returned.
Furthermore, nested query that uses = can often be rewritten without being nested. For instance,
becomes
HW_ProfExample.sqlConversly, you can sometimes write select-project-join as nested queries For instance,
HW_ProfExample.sqlbecomes
HW_ProfExample.sqlA “stored” procedure is a SQL function statements that can take arguments and may be called from another part of your program. Stated differently, a procedure is a serie of statements stored in a schema, that can easily be executed repeatedly, cf. https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html or https://mariadb.com/kb/en/library/create-procedure/.
Imagine we have the following:
/* code/sql/HW_ProcedureExamples.sql */
CREATE TABLE STUDENT (
Login INT PRIMARY KEY,
NAME VARCHAR(30),
Major VARCHAR(30),
Email VARCHAR(30)
);
INSERT INTO STUDENT
VALUES (
123,
"Test A",
"CS",
"a@a.edu"),
(
124,
"Test B",
"IT",
"b@a.edu"),
(
125,
"Test C",
"CYBR",
"c@a.edu");
SQL is extremely litteral: when it reads the delimiter ;, it must execute the command that was shared. But a procedure, being composed of commands, will contain the ; symbol. To “solve” this (weird) issue, and be able to define a procedure, we have to “temporarily alter the language”, using DELIMITER // that makes the delimiter being // instead of ;11.
In any case, we can then define and execute a simpe procedure called STUDENTLIST as follows:
DELIMITER $$
CREATE PROCEDURE STUDENTLIST ()
BEGIN
SELECT *
FROM STUDENT;
-- This ";" is not the end of the procedure definition!
END;
$$
-- This is the delimiter that marks the end of the procedure
-- definition.
DELIMITER ;
-- Now, we want ";" to be the "natural" delimiter again.
CALL STUDENTLIST ();
-- Now, we want ";" to be the "natural" delimiter again.
CALL STUDENTLIST ();
A procedure an also take arguments, and an example could be:
DELIMITER $$
CREATE PROCEDURE STUDENTLOGIN (
NameP VARCHAR(30)
)
BEGIN
SELECT LOGIN
FROM STUDENT
WHERE NameP = Name;
END;
$$
DELIMITER ;
SHOW CREATE PROCEDURE STUDENTLOGIN;
-- This display information about the procedure just created.
-- We can pass quite naturally an argument
-- to
-- our
-- procedure.
CALL STUDENTLOGIN ("Test A");
A trigger is a series of statements stored in a schema that can be automatically executed whenever a particular event in the schema occurs. Triggers are extremely powerfull, and are a way of automating part of the work in your database. In MariaDB, you could have the following program.
Imagine we have the following:
CREATE TABLE STUDENT (
Login VARCHAR(30) PRIMARY KEY,
Average FLOAT
);
CREATE TABLE GRADE (
Student VARCHAR(30),
Exam VARCHAR(30),
Grade INT,
PRIMARY KEY (Student, Exam),
FOREIGN KEY (Student) REFERENCES STUDENT (LOGIN)
);
We want to create a trigger that counts the number of times something was inserted in our STUDENT table. SQL supports some primitive form of variables (cf. https://dev.mysql.com/doc/refman/8.0/en/user-variables.html and https://mariadb.com/kb/en/library/user-defined-variables/). There is no “clear” form of type, https://dev.mysql.com/doc/refman/8.0/en/user-variables.html reads:
In addition, the default result type of a variable is based on its type at the beginning of the statement. This may have unintended effects if a variable holds a value of one type at the beginning of a statement in which it is also assigned a new value of a different type. To avoid problems with this behavior, either do not assign a value to and read the value of the same variable within a single statement, or else set the variable to 0, 0.0, or ’’ to define its type before you use it.
In other words, SQL just “guess” the type of your value and go with it. Creating a simple trigger that increment a variable every time an insertion is performed in the STUDENT table can be done as follows:
SET @number_of_student = 0;
CREATE TRIGGER NUMBER_OF_STUDENT_INC
AFTER INSERT ON STUDENT
FOR EACH ROW
SET @number_of_student = @number_of_student + 1;
Now, assume we want to create a trigger that calculates the average for us. Note that the trigger will need to manipulate two tables (STUDENT and GRADE) at the same time.
CREATE TRIGGER STUDENT_AVERAGE
AFTER INSERT ON GRADE
FOR EACH ROW -- Woh, a whole query inside our trigger!
UPDATE STUDENT
SET STUDENT.Average = (
SELECT AVG(Grade)
FROM GRADE
WHERE GRADE.Student = STUDENT.Login)
WHERE STUDENT.Login = NEW.Student;
-- The "NEW" keyword here refers to the "new" entry
-- that is being inserted by the INSERT
-- statement
-- triggering
-- the trigger.The source code contains examples of insertion and explanations on how to witness the trigger in action.
This part is a short tutorial to install and configure a working relational DBMS. We will proceed in 5 steps:
You will install the MySQL DataBase Managment System, or its community-developed fork, MariaDB. Below are the instruction to install MySQL Community Edition on Windows 10 and macOS, and MariaDB on Linux-based distribution, but both are developped for every major operating system (macOS, Windows, Debian, Ubuntu, etc.): feel free to pick one or the other, it will not make a difference in this course (up to some minor aspects). MySQL is more common, MariaDB is growing, both are released under GNU General Public License, well-documented and free of charge for their “community” versions.
It is perfectly acceptable, and actually encouraged, to install MySQL or MariaDB on a virtual machine for this class. You can use the Windows Subsystem for Linux, VMware or Virtual Box to run a “sandboxed” environment that should keep your data isolated from our experimentations.
Below are precise and up-to-date instructions, follow them carefully, read the messages displayed on your screen, make sure a step was correctly executed before moving to the next one, and everything should be all right. Also, remember:
The following links could be useful:
Visit https://dev.mysql.com/downloads/installer/, click on “Download” next to
Windows (x86, 32-bit), MSI Installer XXX YYY (mysql-installer-web-community-XXX.msi)
where XXX is a number version (e.g., 8.0.13.0.), and YYY is the size of the file (e.g., 16.3M). On the next page, click on the (somewhat hidden) “No thanks, just start my download.” button.
Save the “mysql-installer-web-community-XXX.msi” file, and open it. If there is an updated version of the installer available, agree to download it. Accept the license term.
We will now install the various components needed for this class, leaving all the choices by defaults. This means that you need to do the following:
We now want to make sure that MySQL is running: launch Windows’ “Control Panel”, then click on “Administrative Tools”, and on “Services”. Look for “MySQLXX”, its status should be “Running”. If it is not, right-click on it and click on “Start”.
Open a command prompt (search for cmd, or use PowerShell) and type
cd "C:\Program Files\MySQL\MySQL Server 8.0\bin"If this command fails, it is probably because the version number changed: open the file explorer, go to C:\Program Files\MySQL\, look for the right version number, and update the command accordingly.
Then, enter
mysql -u root -pand enter the password you picked previously for the root account. You are now logged as root in your database management system, you should see a brief message, followed by a prompt
mysql >Now, move on to “Creating a User”.
The instructions are almost the same as for Windows. Read https://dev.mysql.com/doc/refman/8.0/en/osx-installation-pkg.html and download the file from https://dev.mysql.com/downloads/mysql/ once you selected “macOS” as your operating system. Install it, leaving everything by default but adding a password (refer to the instructions for windows). Then, open a command-line interface (the terminal), enter
mysql -u root -pand enter the password you picked previously for the root account. You are now logged as root in your database management system, you should see a brief message, followed by a prompt
mysql >Now, move on to “Creating a User”.
Install, through your standard package management system (apt or aptitude for debian-based systems, pacman for Arch Linux, etc.), the packages mysql-client and mysql-server (or default-mysql-client and default-mysql-server) as well as their dependencies12.
Open a terminal and type
/etc/init.d/mysql statusor, as root,
service mysql statusto see if MySQL is running: if you read something containing
Active: active (running)then you can move on to the next step, otherwise run (as root)
service mysqld startand try again.
As root, type in your terminal
mysql_secure_installationYou will be asked to provide the current password for the root MySQL user: this password has not be defined yet, so just hit “Enter”. You will be asked if you want to set a new password (that you can freely chose, just make sure to memorize it). Then, answer “n” to the question “Remove anonymous users?”, “Y” to “Disallow root login remotely?”, “n” to “Remove test database and access to it?” and finally “Y” to “Reload privilege tables now?”.
Still as root, type in your terminal
mysql -u root -pand enter the password you picked previously for the root account. You are now logged as root in your database management system: you should see a brief message, followed by a prompt
MariaDB [(none)]>Now, move on to “Creating a User”.
This step will create a non-root user13 and grant it some rights. Copy-and-paste or type the following three commands, one by one (that is, enter the first one, hit “enter”, enter the second, hit “enter”, etc.). This step will create a non-root user14 and grant it some rights. Copy-and-paste or type the following three commands, one by one (that is, enter the first one, hit “enter”, enter the second, hit “enter”, etc.).
We first create a new user called testuser on our local installation, and give it the password password:
CREATE USER 'testuser'@'localhost' IDENTIFIED BY 'password';Then, we grant the user all the privileges on the databases whose name starts with HW_:
GRANT ALL PRIVILEGES ON `HW\_%` . * TO 'testuser'@'localhost';Be careful: backticks (`) are surrounding HW\_% whereas single quotes (') are surrounding testuser and localhost.
And then we quit the DBMS, using
EXIT;The message displayed after the two first commands should be
Query OK, 0 rows affected (0.00 sec)and the message displayed after the last command should be
ByeWe now log in as the normal user called “testuser”.
Linux users should type as a normal user, i.e., not as root, in their terminal the following, and Windows users should type in their command prompt the following15:
mysql -u testuser -pEnter password as your password. If you are prompted with a message
ERROR 1045 (28000): Access denied for user 'testuser'@'localhost' (using password: YES)then you probably typed the wrong password. Otherwise, you should see a welcoming message from MySQL or MariaDB and a prompt.
To save yourself the hassle of typing the password, you can use
mysql -u testuser -ppasswordor
mysql -u testuser -p --password=passwordto log-in as testuser immediately.
If at some point you want to know if you are logged as root or testuser, simply enter
\s;Now, let us create our first schema, our first table, populate it with data, and display various information.
We first create the schema (or database) HW_FirstTest:
CREATE DATABASE HW_FirstTest; -- Or CREATE SCHEMA HW_FirstTest;Let us make sure that we created it:
SHOW DATABASES;Let us use it:
USE HW_FirstTest;And see what it contains now:
SHOW TABLES;We now create a table called TableTest, wtih two integer attributes called Attribute1 and Attribute2:
CREATE TABLE TableTest (Attribute1 INT, Attribute2 INT);And can make sure that the table was indeed created:
SHOW TABLES;We can further ask our DBMS to display the structure of the table we just created:
DESCRIBE TableTest; -- Can be abbreviated as DESC TableTest;And even ask to get back the code that would create the exact same structure (but without the data!):
SHOW CREATE TABLE TableTest;Now, let us populate it with some data:
INSERT INTO TableTest
VALUES (1,2),
(3,4),
(5,6);Note that the SQL syntax and your DBMS are completely fine with your statement spreading over multiple lines. Let us now display the data stored in the table:
SELECT * FROM TableTest;After that last command, you should see
+------------+------------+
| Attribute1 | Attribute2 |
+------------+------------+
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |
+------------+------------+Finally, we can erase the content of the table, then erase (“drop”) the table, and finally the schema:
DELETE FROM TableTest; -- Delete the rows
DROP TABLE TableTest; -- Delete the table
DROP DATABASE HW_FirstTest; -- Delete the schemaYou’re all set! All you have to do is to quit, using the command
EXIT;Note that we were quite careless when we set-up our installation:
testuser that is common to all of our installation.mysqldump -u testuser -ppassword means that the password will be stored in the history of your command-line interface (that you should be able to access using history or Get-History for Powershell) and could be accessed by anyone having access to it.All of those are obvious security risks, and make this installation unsafe to be a production environment. We will only use it as a testing / learning environment, but it is strongly recommended to:
For each of the following, fill in the blanks:
SQL, a relation is called a ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ .SQL, every statement ends with ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟, and in-line comments start with a ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ .SQL, there is no string datatype, so we have to use ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟.SQL’s role is to ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ ͟ .What does it mean to say that SQL is at the same time a “data definition language” and a “data manipulation language”?
Name three kind of objects (for lack of a better word) a CREATE statement can create.
Write a SQL statement that adds a primary key constraint to an attribute named ID in an already existing table named STAFF.
Complete each row of the following table with either a datatype or two different examples:
| Data type | Examples |
|---|---|
4, -32 |
|
| Char(4) | |
| VarChar(10) | 'Train', 'Michelle' |
| Bit(4) | |
TRUE, UNKNOWN |
In the datatype CHAR(3), what does the 3 indicate?
Explain this query: CREATE SCHEMA FACULTY;.
Write code to
You are free to come up with an example (even very simple or cryptic) or to re-use an example from class.
Explain this query:
ALTER TABLE TABLEA
DROP INDEX Attribute1;If I want to enter January 21, 2016, as a value for an attribute with the DATE datatype, what value should I enter?
Write a statement that inserts the values "Thomas" and 4 into the table TRAINS.
If PkgName is the primary key in the table MYTABLE, what can you tell about the number of rows returned by the following statement?
SELECT * FROM MYTABLE WHERE PkgName = 'MySQL';.
If you want that every time a referenced row is delted, all the refering rows are deleted as well, what mechanism should you use?
By default, does the foreign key restrict, cascade, or set null on update? Can you justify this choice?
If a database designer is using the ON UPDATE SET NULL for a foreign key, what mechanism is (s)he implementing (i.e., describe how the database will react a certain operation)?
If the following is part of the design of a table:
FOREIGN KEY (DptNumber) REFERENCES DEPARTMENT(Number)
ON DELETE SET DEFAULT
ON UPDATE CASCADE;What happen to the rows whose foreign key DptNumber are set to 3 if the row in the DEPARTEMENT table with primary key Number set to 3 is…
5?If the following is part of the design of a WORKER table:
FOREIGN KEY WORKER(DptNumber) REFERENCES DEPARTMENT(DptNumber)
ON UPDATE CASCADE;What happen to the rows whose foreign key DptNumber are set to 3 if the row in the DEPARTMENT table with primary key Number set to 3 is…
5?Given a relation TOURIST(Name, EntryDate, Address), write a SQL statement printing the name and address of all the tourists who entered the territory after the 15 September, 2012.
Describe what the star do in the statement
SELECT ALL * FROM MYTABLE;What is the fully qualified name of an attribute? Give an example.
If DEPARTMENT is a database, what is DEPARTMENT.*?
What is a multi-set? What does it mean to say that MySQL treats tables as multisets?
What is the difference between
SELECT ALL * FROM MYTABLE;and
SELECT DISTINCT * FROM MYTABLE;How are the results the same? How are they different?
What is wrong with the statement
SELECT * WHERE Name = 'CS' FROM DEPARTMENT;Write a query that returns the number of row (i.e., of entries, of tuples) in a table named BOOK.
When is it useful to use a select-project-join query?
When is a tuple variable useful?
Write a query that changes the name of the professor whose Login is 'caubert' to 'Hugo Pernot' in the table PROF.
Can an UPDATE statement have a WHERE condition using an attribute that is not the primary key? If no, justify, if yes, tell what could happen.
Give the three possible meaning of the NULL value, and an example for each of them.
What are the values of the following expressions (i.e., do they evaluate to TRUE, FALSE, or UNKNOWN)?
TRUE AND FALSETRUE AND UNKNOWNNOT UNKNOWNFALSE OR UNKNOWNWrite the truth table for AND for the three-valued logic of SQL.
What comparison expression should you use to test if a value is different from NULL?
Explain this query:
SELECT Login
FROM PROF
WHERE Department IN ( SELECT Major
FROM STUDENT
WHERE Login = 'jrakesh');Can you rewrite it without nesting queries?
What is wrong with this query?
SELECT Name FROM STUDENT
WHERE Login IN
( SELECT Code FROM Department WHERE head = 'aturing');Write a query that returns the sum of all the values stored in the Pages attribute of a BOOK table.
Write a query that adds a Pages attribute of type INT into a (already existing) BOOK table.
Write a query that removes the default value for a Pages attribute in a BOOK table.
Under which conditions does SQL allow you to enter the same row in a table twice?
Explain this query: ROLLBACK;.
Explain this query: DELIMITER ;.
The blanks can be filled as follow:
SQL, a relation is called a ͟ ͟ ͟ ͟table ͟ ͟ ͟ ͟ ͟ ͟ ͟ .SQL, every statement ends with ͟ ͟a semi-colon (;) ͟ ͟, and in-line comments start with a ͟ ͟ ͟two minus signs (--) ͟ ͟ .SQL, there is no string datatype, so we have to use ͟ ͟ ͟VARCHAR(x) or CHAR(x) where x is an integer reflecting the maximum (or fixed) size of the string ͟ ͟ ͟ ͟ ͟ .SQL’s role is to ͟ ͟ control access to the data stored, by creating users and granting them rights ͟ ͟ .It can specify the conceptual and internal schema, and it can manipulate the data.
Database (schema), table, view, assertion, trigger, etc.
ALTER TABLE STAFF ADD PRIMARY KEY(ID);
| Data type | Examples |
|---|---|
INT |
4, -32 |
CHAR(4) |
'abCD', "dEfG" |
VARCHAR(10) |
'Train', 'Michelle' |
BIT(4) |
B'1010', B'0101' |
BOOL`` |TRUE,FALSE,NULL`
NULL is actually a valid answer for every single type of
That we can store exactly three characters.
It creates a schema, i.e., a database, named Faculty.
/* code/sql/HW_Short.sql */
CREATE TABLE A (
Att1 INT PRIMARY KEY,
Att2 INT
);
CREATE TABLE B (
Att3 INT PRIMARY KEY,
Att4 INT,
FOREIGN KEY (Att4) REFERENCES A (Att1)
);
INSERT INTO A
VALUES (
1,
2);
INSERT INTO B
VALUES (
3,
1);
It removes the UNIQUE constraint on the Attribute1 in the TABLEA table.
DATE'2016-01-21', '2016-01-21', '2016/01/21', '20160121'.
INSERT INTO TRAINS VALUES('Thomas', 4);
We know that at most one (but possibly 0) row will be returned.
We should use a referential triggered action clause, ON DELETE CASCADE.
By default, the foreign key restricts updates. This prevents unwanted update of information: if an update needs to be propagated, then it needs to be “acknowledged” and done explicitely.
If the referenced row is updated, then the attribute of the referencing rows are set to NULL.
In the referencing rows,
DEPARTMENT table with primary key Number set to 3 cannot be deleted if a row in the WORKER table references it.We could use the following:
SELECT Name, Address
FROM TOURIST
WHERE EntryDate > DATE'2012-09-15';It selects all the attributes, it is a wildcard.
The name of the relation with the name of its schema and a period beforehand. An example would be EMPLOYEE.Name.
All the tables in that database.
A multiset is a set where the same value can occur twice. In MySQL, the same row can occur twice in a table.
They both select all the rows in the MYTABLE table, but ALL will print the duplicate values, whereas DISTINCT will print them only once.
You cannot have the WHERE before FROM.
SELECT COUNT(*) FROM BOOK;
We use those query that projects on attributes using a selection and join conditions when we need to construct for information based on pieces of data spread in multiple tables.
It makes the distinction between two different rows of the same table, it is useful when we want to select a tuple in a relation that is in a particular relation with a tuple in the same relation. Quoting https://stackoverflow.com/a/7698796/:
They are useful for saving typing, but there are other reasons to use them:
- If you join a table to itself you must give it two different names otherwise referencing the table would be ambiguous.
- It can be useful to give names to derived tables, and in some database systems it is required… even if you never refer to the name.
We could use the following:
UPDATE PROF SET Name = 'Hugo Pernot'
WHERE Login = 'caubert';Yes, we can have select condition that does not use primary key. In that case, it could be the case that we update more than one tuple with such a command (which is not necessarily a bad thing).
Unknown value (“Will it rain tomorrow?”), unavailable / withheld (“What is the phone number of Harry Belafonte?”), N/A (“What is the email address of Abraham Lincoln?”).
TRUE AND FALSE → FALSETRUE AND UNKNOWN → UNKNOWNNOT UNKNOWN → UNKNOWNFALSE OR UNKNOWN → FALSETRUE AND TRUE → TRUETRUE AND FALSE → FALSETRUE AND UNKNOWN → UNKNOWNFALSE AND FALSE → FALSEUNKNOWN AND UNKNOWN → UNKNOWNFALSE AND UNKNOWN → FALSEFor a more compact presentation, refer to the three-valued truth table.
IS NOT
It list the login of the professors teaching in the department where the student whose login is “jrakesh” is majoring. It can be rewritten as
SELECT PROF.Login
FROM PROF, STUDENT
WHERE Department = Major
AND STUDENT.Login = 'jrakesh';It tries to find a Login in a Code.
SELECT SUM(Pages) FROM BOOK;
ALTER TABLE BOOK ADD COLUMN Pages INT;
ALTER TABLE BOOK ALTER COLUMN Pages DROP DEFAULT;
Essentially, if there are no primary key in the relation, and if no attribute has the UNIQUE constraint. Cf. also this previous problem.
This command, ROLLBACK, undoes the last transaction, i.e. it allows to “roll back” to a previous point identified by START TRANSACTION;: everything that happened between those two commands is “undone”, as if it had never been executed.
This command, DELIMITER ;, changes the SQL interpreter delimiter back to being “;”. It is useful when defining procedures, as the traditional delimiter’s role have to be “suspended” to wrap a series of commands inside a function.
The goal of this problem is to learn where to find the documentation for your DBMS, and to understand how to read the syntax of SQL commands.
You can consult (Elmasri and Navathe 2010, Table 5.2, p. 140) or (Elmasri and Navathe 2015, Table 7.2, p. 235), for a very quick summary of the most common commands. Make sure you are familiar with the Backus–Naur form (BNF) notation commonly used:
<…>
[…]
{…}
(…|…|…)
The most complete lists of commands are probably at
Those are the commands implemented in the DBMS you are actually using. Since there are small variations from one implementation to the other, it is better to take one of this link as a reference in the future.
As a starting point, looking at the syntax for CREATE TABLE commands is probably a good start, cf. https://mariadb.com/kb/en/create-table/ or https://dev.mysql.com/doc/refman/8.0/en/create-table.html.
SQL)This problem will guide you in manipulating a very simple table in SQL.
Log in as testuser, create a database named HW_Address, use it, and create two tables:
CREATE TABLE NAME(
FName VARCHAR(15),
LName VARCHAR(15),
ID INT,
PRIMARY KEY(ID)
);
CREATE TABLE ADDRESS(
StreetName VARCHAR(15),
Number INT,
Habitants INT,
PRIMARY KEY(StreetName, Number)
);Observe the output produced by the command DESC ADDRESS;.
Add a foreign key to the ADDRESS table, using
ALTER TABLE ADDRESS
ADD FOREIGN KEY (Habitants)
REFERENCES NAME(ID);And observe the new output produced by the command DESC ADDRESS;.
Is it what you would have expected? How informative is it? Can you think of a command that would output more detailled information, including a reference to the existence of the foreign key?
Draw the relational model corresponding to that database and identify the primary and foreign keys.
Add this data to the NAME table:
INSERT INTO NAME VALUES ('Barbara', 'Liskov', 003);
INSERT INTO NAME VALUES ('Tuong Lu', 'Kim', 004);
INSERT INTO NAME VALUES ('Samantha', NULL, 080);What command can you use to display this infomation back? Do you notice anything regarding the values we entered for the ID attribute?
Add some data into the ADDRESS table:
INSERT INTO ADDRESS
VALUES
('Armstrong Drive', 10019, 003),
('North Broad St.', 23, 004),
('Robert Lane', 120, NULL);What difference do you notice with the insertions we made in the NAME table? Which syntax seems more easy to you?
Write a SELECT statement that returns the ID number of the person whose first name is “Samantha”.
Write a statement that violates the entity integrity constraint. What is the error message returned?
Execute an UPDATE statement that violates the referential integrity constraint. What is the error message returned?
Write a statement that violates another kind of constraint. Explain what constraint you are violating and explain the error message.
SQL)Log in as testuser and create a database titled HW_REPETITION. Create in that database a table (the following questions refer to this table as EXAMPLE, but you are free to name it whatever you want) with at least two attributes that have different data types. Do not declare a primary key yet. Answer the following:
Add a tuple to your table using
INSERT INTO EXAMPLE VALUES(X, Y);where the X and Y are values have the right datatype. Try to add this tuple again. What do you observe? (You can use SELECT * FROM EXAMPLE; to observe what is stored in this table.)
Alter your table to add a primary key, using
ALTER TABLE EXAMPLE ADD PRIMARY KEY (Attribute);where Attribute is the name of the attribute you want to be a primary key. What do you observe?
Empty your table using
DELETE FROM EXAMPLE;and alter your table to add a primary key, using the command we gave at the previous step. What do you observe?
Try to add the same tuple twice. What do you observe?
From the notes, recall the following about foreign keys:
Two important remarks:
- The datatype of the foreign key has to be the exactly the same as the datatype of the attribute to which we are referring.
- The target of the foreign key must be the primary key.
But, the situation is slightly more complex. Test for yourself by editing the following code as indicated:
/* code/sql/HW_FKtest.sql */
DROP SCHEMA IF EXISTS HW_FKtest;
CREATE SCHEMA HW_FKtest;
USE HW_FKtest;
CREATE TABLE TARGET (
Test VARCHAR(15) PRIMARY KEY
);
CREATE TABLE SOURCE (
Test VARCHAR(25),
FOREIGN KEY (Test) REFERENCES TARGET (Test)
);PRIMARY KEY constraint.PRIMARY KEY with UNIQUE.VARCHAR(25) with CHAR(25).VARCHAR(25) with INT.VARCHAR(25) with VARCHAR(15)Create the PROF, DEPARTMENT, STUDENT and GRADE tables as in the “Constructing and populating a new example” section. Populate them with some data (copy it from the notes or come up with your own data).
To obtain exactly the same schema as the one we developped and edited, you can use mysqldump to “dump” this table, with a command like
mysqldump -u testuser -ppassword\
-h localhost --add-drop-database\
--skip-comments --compact\
HW_ProfExample > dump.sqlThe code we studied during the lecture is more or less the following.
/* code/sql/HW_ProfExampleRevisitedRevisited.sql */
DROP SCHEMA IF EXISTS HW_ProfExampleRevisited;
CREATE SCHEMA HW_ProfExampleRevisited;
USE HW_ProfExampleRevisited;
CREATE TABLE PROF (
Login VARCHAR(25) PRIMARY KEY,
NAME VARCHAR(25),
Department CHAR(5)
);
CREATE TABLE DEPARTMENT (
Code CHAR(5) PRIMARY KEY,
NAME VARCHAR(25),
Head VARCHAR(25),
FOREIGN KEY (Head) REFERENCES PROF (LOGIN) ON UPDATE CASCADE
);
ALTER TABLE PROF
ADD FOREIGN KEY (Department) REFERENCES DEPARTMENT (Code);
CREATE TABLE STUDENT (
Login VARCHAR(25) PRIMARY KEY,
NAME VARCHAR(25),
Registered DATE,
Major CHAR(5),
FOREIGN KEY (Major) REFERENCES DEPARTMENT (Code)
);
CREATE TABLE GRADE (
Login VARCHAR(25),
Grade INT,
PRIMARY KEY (LOGIN, Grade),
FOREIGN KEY (LOGIN) REFERENCES STUDENT (LOGIN)
);
INSERT INTO DEPARTMENT
VALUES (
'MATH',
'Mathematics',
NULL),
(
'CS',
'Computer
Science',
NULL);
INSERT INTO DEPARTMENT (
Code,
Name)
VALUES (
'CYBR',
'Cyber Secturity');
INSERT INTO PROF (
LOGIN,
Department,
Name)
VALUES (
'caubert',
'CS',
'Clément Aubert');
INSERT INTO PROF (
LOGIN,
Name,
Department)
VALUES (
'aturing',
'Alan Turing',
'CS'),
(
'perdos',
'Paul
Erdős',
'MATH'),
(
'bgates',
'Bill Gates',
'CYBR');
INSERT INTO STUDENT (
LOGIN,
Name,
Registered,
Major)
VALUES (
'jrakesh',
'Jalal Rakesh',
DATE '2017-12-01',
'CS'),
(
'svlatka',
'Sacnite Vlatka',
'2015-03-12',
'MATH'),
(
'cjoella',
'Candice Joella',
'20120212',
'CYBR'),
(
'aalyx',
'Ava Alyx',
20121011,
'CYBR'),
(
'caubert',
'Clément Aubert',
NULL,
'CYBR');
INSERT INTO GRADE
VALUES (
'jrakesh',
3.8),
(
'svlatka',
2.5);
We will resume working on this model, and enhance it.
Draw the complete relational model for this database (i.e., for the PROF, DEPARTMENT, STUDENT and GRADE relations).
Create and populate a LECTURE table as follows:
Name, Instructor, Code, and Year, of types VARCHAR(25) for the first two, CHAR(5) for Code, and YEAR(4) for Year.Year and Code attributes should be the primary key (yes, have two attributes be the primary key).Instructor attribute should be a foreign key referencing the Login attribute in PROF.LECTURE table with some made-up data.Try to think about some of the weakenesses of this representation. For instance, can it accomodate two instructors for the same class? Write down two possibles scenarios in which this schema would not be appropriate.
The GRADE table had some limitations too. For example, every student could have only one grade. Add two columns to the GRADE table using:
ALTER TABLE GRADE
ADD COLUMN LectureCode CHAR(5),
ADD COLUMN LectureYear YEAR(4);Add a foreign key:
ALTER TABLE GRADE
ADD FOREIGN KEY (LectureYear, LectureCode)
REFERENCES LECTURE(Year, Code);Use DESCRIBE and SELECT to observe the schema of the GRADE table and its rows. Is it what you would have expected?
GRADE with some made-up data. Now every row should contain, in addition to a login and a grade, a lecture year and a lecture code.
Write SELECT statements answering the following questions (where PROF.Name, LECTURE.Name, YYYY, LECTURE.Code and STUDENT.Login should be relevant values considering your data):
LECTURE.Name in YYYY?”YYYY without any duplicates?”LECTURE.Code?”LECTURE.Code taught?”LECTURE.Code?”STUDENT.Login?”SQL coding)Look at the SQL code below and then answer the following questions.
/* code/sql/HW_Train.sql */
CREATE TABLE TRAIN (
ID VARCHAR(30),
Model VARCHAR(30),
ConstructionYear YEAR (4)
);
CREATE TABLE CONDUCTOR (
ID VARCHAR(20),
NAME VARCHAR(20),
ExperienceLevel VARCHAR(20)
);
CREATE TABLE ASSIGNED_TO (
TrainId VARCHAR(20),
ConductorId VARCHAR(20),
Day DATE,
PRIMARY KEY (TrainId, ConductorId)
);
Modify the CREATE statement that creates the TRAIN table (lines 1–5), so that ID would be declared as the primary key. It is sufficient to only write the line(s) that need to change.
Write an ALTER statement that makes ID become the primary key of the CONDUCTOR table.
Modify the CREATE statement that creates the ASSIGNED_TO table (lines 13–18), so that it has two foreign keys: ConductorId references the ID attribute in CONDUCTOR and TrainId references the ID attribute in TRAIN. It is sufficient to only write the line(s) that need to change.
Write INSERT statements that insert one tuple of your choosing in each relation (no NULL values). These statements should respect all the constraints (including the ones we added in the previous questions) and result in actual insertions. (Remember that four digits is a valid value for an attribute with the YEAR(4) datatype.)
Write a statement that sets the value of the ExperienceLevel attribute to “Senior” in all the tuples where the ID attribute is “GP1029” in the CONDUCTOR relation.
Write a SELECT statement that answers each of the following questions:
SQL statements for the COFFEE database)Suppose we have the relational model depicted below, with the indicated data in it:
COFFEE
| Ref | Origin | TypeOfRoast | PricePerPound |
|---|---|---|---|
| 001 | Brazil | Light | 8.90 |
| 121 | Bolivia | Dark | 7.50 |
| 311 | Brazil | Medium | 9.00 |
| 221 | Sumatra | Dark | 10.25 |
CUSTOMER
| CardNo | Name | |
|---|---|---|
| 001 | Bob Hill | b.hill@isp.net |
| 002 | Ana Swamp | swampa@nca.edu |
| 003 | Mary Sea | brig@gsu.gov |
| 004 | Pat Mount | pmount@fai.fr |
SUPPLY
| Provider | Coffee |
|---|---|
| Coffee Unl. | 001 |
| Coffee Unl. | 121 |
| Coffee Exp. | 311 |
| Johns & Co. | 221 |
PROVIDER
| Name | |
|---|---|
| Coffee Unl. | bob@cofunl.com |
| Coffee Exp. | pat@coffeex.dk |
| Johns & Co. | NULL |
In the following, we will assume that this model was implemented in a DBMS (MySQL or MariaDB), the primary keys being COFEE.Ref, CURTOMER.CardNo, SUPPLY.Provider and SUPPLY.Coffee, and PROVIDER.Name, and the foreign keys being as follows:
| FavCoffee in the CUSTOMER relation | refers to | Ref in the COFFEE relation |
| Provider in the SUPPLY | refers to | Name in the PROVIDER relation |
| Coffee in the SUPPLY | refers to | Ref in the COFFEE relation |
Read and write SQL commands for the following “what-if” scenarios. Assume that:
SQL’s syntax (there’s no “a semi-colon is missing” trap).You can use COFFEE.1 to denote the first tuple (or row) in COFFEE, and similarly for other relations and tuples (so that, for instance, SUPPLY.4 corresponds to "Johns & Co"., 221).
Draw the relational model of this table.
NULL and should be unique”), the referential integrity constraint (“the foreign key must refer to something that exists”), if there would be some other kind of error (ignoring the plausability / revelance of inserting that tuple), or if it would result in successful insertion.
INSERT INTO CUSTOMER VALUES(005, 'Bob Hill', NULL, 001);
INSERT INTO COFFEE VALUES(002, "Peru", "Decaf", 3.00);
INSERT INTO PROVIDER VALUES(NULL, "contact@localcof.com");
INSERT INTO SUPPLY VALUES("Johns Co.", 121);
INSERT INTO SUPPLY VALUES("Coffee Unl.", 311, 221);Assuming that the referential triggered action clause ON UPDATE CASCADE is used for each of the foreign keys in this database, list the tuples modified by the following statements:
UPDATE CUSTOMER SET FavCoffee = 001
WHERE CardNo = 001;
UPDATE COFFEE SET TypeOfRoast = 'Decaf'
WHERE Origin = 'Brazil';
UPDATE PROVIDER SET Name = 'Coffee Unlimited'
WHERE Name = 'Coffee Unl.';
UPDATE COFFEE SET PricePerPound = 10.00
WHERE PricePerPound > 10.00;ON DELETE CASCADE is used for each of the foreign keys in this database, list the tuples modified by the following statements:
DELETE FROM CUSTOMER
WHERE Name LIKE '%S%';
DELETE FROM COFFEE
WHERE Ref = 001;
DELETE FROM SUPPLY
WHERE Provider = 'Coffee Unl.'
AND Coffee = 001;
DELETE FROM PROVIDER
WHERE Name = 'Johns & Co.';SQL queries that answer the following questions:
Consider the following SQL code:
/* code/sql/HW_Department.sql */
CREATE TABLE DEPARTMENT (
ID INT PRIMARY KEY,
NAME VARCHAR(30)
);
CREATE TABLE EMPLOYEE (
ID INT PRIMARY KEY,
NAME VARCHAR(30),
Hired DATE,
Department INT,
FOREIGN KEY (Department) REFERENCES DEPARTMENT (ID)
);
INSERT INTO DEPARTMENT
VALUES (
1,
"Storage"),
(
2,
"Hardware");
INSERT INTO EMPLOYEE
VALUES (
1,
"Bob",
20100101,
1),
(
2,
"Samantha",
20150101,
1),
(
3,
"Mark",
20050101,
2),
(
4,
"Karen",
NULL,
1),
(
5,
"Jocelyn",
20100101,
1);
Write queries that return the following information. The values returned in this set-up will be in parenthesis, but keep the queries general.
Storage department ("Bob", "Samantha", "Karen" and "Jocelyn"),"Mark"),Storage department who has(have) been hired for the longest period of time. Phrased differently, the oldest employees of the Storage department ("Bob" and "Jocelyn").Consider the following SQL code:
/* code/sql/HW_Computer.sql */
DROP SCHEMA IF EXISTS HW_Computer;
CREATE SCHEMA HW_Computer;
USE HW_Computer;
CREATE TABLE COMPUTER (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40)
);
CREATE TABLE PRINTER (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40)
);
CREATE TABLE CONNEXION (
Computer VARCHAR(20),
Printer VARCHAR(20),
PRIMARY KEY (Computer, Printer),
FOREIGN KEY (Computer) REFERENCES COMPUTER (ID),
FOREIGN KEY (Printer) REFERENCES PRINTER (ID)
);
INSERT INTO COMPUTER
VALUES (
'A',
'DELL A'),
(
'B',
'HP X'),
(
'C',
'ZEPTO D'),
(
'D',
'MAC Y');
INSERT INTO PRINTER
VALUES (
'12',
'HP-140'),
(
'13',
'HP-139'),
(
'14',
'HP-140'),
(
'15',
'HP-139');
INSERT INTO CONNEXION
VALUES (
'A',
'12'),
(
'A',
'13'),
(
'B',
'13'),
(
'C',
'14');Write queries that return the following information. The values returned in this set-up will be in parenthesis, but keep the queries general.
ID is '13' (2).2).ID is 'A' ('HP-140' and 'HP-139').'D').Consider the following SQL code:
/* code/sql/HW_SocialMedia.sql */
CREATE TABLE ACCOUNT (
ID INT PRIMARY KEY,
NAME VARCHAR(25),
Email VARCHAR(25) UNIQUE
);
CREATE TABLE SUBSCRIBE (
Subscriber INT,
Subscribed INT,
DATE DATE,
FOREIGN KEY (Subscriber) REFERENCES ACCOUNT (ID),
FOREIGN KEY (Subscribed) REFERENCES ACCOUNT (ID),
PRIMARY KEY (Subscriber, Subscribed)
);
CREATE TABLE VIDEO (
ID INT PRIMARY KEY,
Title VARCHAR(25),
Released DATE,
Publisher INT,
FOREIGN KEY (Publisher) REFERENCES ACCOUNT (ID)
);
CREATE TABLE THUMBS_UP (
Account INT,
Video INT,
DATE DATE,
PRIMARY KEY (Account, Video),
FOREIGN KEY (Account) REFERENCES ACCOUNT (ID),
FOREIGN KEY (Video) REFERENCES VIDEO (ID)
);
INSERT INTO ACCOUNT
VALUES (
1,
"Bob Ross",
"bob@ross.com"),
(
2,
NULL,
"anon@fai.com"),
(
3,
"Martha",
NULL);
INSERT INTO SUBSCRIBE
VALUES (
2,
1,
DATE "2020-01-01"),
(
3,
1,
DATE "2019-03-03"),
(
3,
2,
DATE "2019-03-03"),
(
2,
2,
DATE "2019-03-03"),
(
1,
2,
DATE "2019-03-03");
-- The first entry means that 2 subscribed to 1, not the
-- other way around.
-- And similarly for the other entries.
INSERT INTO VIDEO
VALUES (
10,
"My first video!",
DATE "2020-02-02",
1),
(
20,
"My second video!",
DATE "2020-02-03",
1),
(
30,
"My vacations",
DATE "2020-02-04",
2);
INSERT INTO THUMBS_UP
VALUES (
2,
10,
DATE "2020-02-02"),
(
3,
10,
DATE "2020-02-02"),
(
2,
20,
DATE "2020-02-02"),
(
1,
30,
DATE "2020-02-05");
Write queries that return the following information. The values returned in this set-up will be in parenthesis, but keep the queries general.
"My first video!", "My second video!", "My vacations")."My first video!" ("2020-02-02")."2")."video" ("10", "20")."My vacations" ("1")."My first video!")."Bob Ross")."2").Consider the following SQL code:
CREATE TABLE COMPUTER (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40)
);
CREATE TABLE PERIPHERAL (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40),
TYPE ENUM ('mouse', 'keyboard', 'screen', 'printer')
);
CREATE TABLE CONNEXION (
Computer VARCHAR(20),
Peripheral VARCHAR(20),
PRIMARY KEY (Computer, Peripheral),
FOREIGN KEY (Computer) REFERENCES COMPUTER (ID),
FOREIGN KEY (Peripheral) REFERENCES PERIPHERAL (ID)
);
INSERT INTO COMPUTER
VALUES (
'A',
'Apple IIc Plus'),
(
'B',
'Commodore SX-64');
INSERT INTO PERIPHERAL
VALUES (
'12',
'Trendcom Model',
'printer'),
(
'14',
'TP-10
Thermal Matrix',
'printer'),
(
'15',
'IBM Selectric',
'keyboard');
INSERT INTO CONNEXION
VALUES (
'A',
'12'),
(
'B',
'14'),
(
'A',
'15');
Write queries that return the following information. The values returned in this set-up will be in parenthesis, but keep the queries general.
ID is 'A' ('Apple IIc Plus').ID is '14' (printer).NAME starts with 'IBM' ('IBM Selectric').ID is 'A' (Trendcom Model, IBM Selectric).2).A friend of yours wants you to review and improve the code for a role-playing game.
The original idea was that each character has a name, a class (e.g., Bard, Assassin, Druid), a certain amount of experience, a level, one or more weapons (providing bonuses) and the ability to complete quests. A quest has a name and rewards the characters who completed it with a certain amount of experience and, on rare occaisions, with a special item.
Your friend came up with the following code:
CREATE TABLE CHARACTER(
Name VARCHAR(30) PRIMARY KEY,
Class VARCHAR(30),
XP INT,
LVL INT,
Weapon_Name VARCHAR(30),
Weapon_Bonus INT,
Quest_Completed VARCHAR(30)
);
CREATE TABLE QUEST(
ID VARCHAR(20) PRIMARY KEY,
Completed_By VARCHAR(30),
XP_Gained INT,
Special_Item VARCHAR(20),
FOREIGN KEY (Completed_By) REFERENCES CHARACTER(Name)
);
ALTER TABLE CHARACTER
ADD FOREIGN KEY (Quest_Completed) REFERENCES QUEST(ID);However, there are several problems with the code:
CHARACTER table to add an arbitrary number of weapons ended up creating horrible messes.)Can you provide a relational model (there is no need to write the SQL code, but do remember to indicate the primary and foreign keys) that would solve all of your friend’s troubles?
Consider the following code:
/* code/sql/HW_SimpleBook.sql */
DROP SCHEMA IF EXISTS HW_SimpleBook;
CREATE SCHEMA HW_SimpleBook;
USE HW_SimpleBook;
CREATE TABLE AUTHOR (
FName VARCHAR(30),
LName VARCHAR(30),
Id INT PRIMARY KEY
);
CREATE TABLE PUBLISHER (
NAME VARCHAR(30),
City VARCHAR(30),
PRIMARY KEY (NAME, City)
);
CREATE TABLE BOOK (
Title VARCHAR(30),
Pages INT,
Published DATE,
PublisherName VARCHAR(30),
PublisherCity VARCHAR(30),
FOREIGN KEY (PublisherName, PublisherCity) REFERENCES
PUBLISHER (NAME, City),
Author INT,
FOREIGN KEY (Author) REFERENCES AUTHOR (Id),
PRIMARY KEY (Title, Published)
);
INSERT INTO AUTHOR
VALUES (
"Virginia",
"Wolve",
01),
(
"Paul",
"Bryant",
02),
(
"Samantha",
"Carey",
03);
INSERT INTO PUBLISHER
VALUES (
"Gallimard",
"Paris"),
(
"Gallimard",
"New-York"),
(
"Jobs Pub.",
"New-York");
INSERT INTO BOOK
VALUES (
"What to eat",
213,
DATE '20170219',
"Gallimard",
"Paris",
01),
(
"Where to live",
120,
DATE '20130212',
"Gallimard",
"New-York",
02),
(
"My Life, I",
100,
DATE '18790220',
"Gallimard",
"Paris",
03),
(
"My Life, II",
100,
DATE '18790219',
"Jobs Pub.",
"New-York",
03);The values inserted in the database is just to provide some examples; you should assume there is more data in it than what we have inserted. In this long problem, you will be asked to write commands to select, update, delete, insert data, and to improve upon the relational model.
Write a command that selects:
Title of all the books.Name of the publishers.Titles and Published dates of the books published since January 31, 2012."Gallimard" (from any city)."New-York".ID of the authors who published a book whose name starts with "Where"."Wolve".ID is 3 to "BANNED". Is there any reason for this command to be rejected by the system? If yes, explain the reason.
ID is 3 and all the books written by that author. Make sure you do not violate any foreign key constraints.
A certificate for a website has a serial number (SN) and a common name (CN). It must belong to an organization and be signed by a certificate authority (CA). The organization and CA must both have an SN and a CN. A CA can be trusted, not trusted, or not evaluated. The code below is an attempt to represent this situation and is populated with examples.
CREATE TABLE ORGANIZATION(
SN VARCHAR(30) PRIMARY KEY,
CN VARCHAR(30)
);
CREATE TABLE CA(
SN VARCHAR(30) PRIMARY KEY,
CN VARCHAR(30),
Trusted BOOL
);
CREATE TABLE CERTIFICATE(
SN VARCHAR(30) PRIMARY KEY,
CN Varchar(30),
Org VARCHAR(30) NOT NULL,
Issuer VARCHAR(30) NOT NULL,
Valid_Since DATE,
Valid_Until DATE,
FOREIGN KEY (Org)
REFERENCES ORGANIZATION(SN)
ON DELETE CASCADE,
FOREIGN KEY (Issuer) REFERENCES CA(SN)
);
INSERT INTO ORGANIZATION VALUES
('01', 'Wikimedia Foundation'),
('02', 'Free Software Foundation');
INSERT INTO CA VALUES
('A', "Let's Encrypt", true),
('B', 'Shady Corp.', false),
('C', 'NewComer Ltd.', NULL);
INSERT INTO CERTIFICATE VALUES
('a', '*.wikimedia.org', '01', 'A',
20180101, 20200101),
('b', '*.fsf.org', '02', 'A',
20180101, 20191010),
('c', '*.shadytest.org', '02', 'B',
20190101, 20200101),
('d', '*.wikipedia.org', '01', 'C',
20200101, 20220101);CN’s of all certificates ("*.wikimedia.org, \*.fsf.org, \*.shadytest.org, \*.wikipedia.org").SN’s of the organizations whose CN contains "Foundation" ("01, 02").CN’s and expiration dates of all the certificates that expired, assuming today is the 6th of December 2019 ("\*.fsf.org", 2019 − 10 − 10).CN’s of the CA’s that are not trusted ("Shady Corp., NewComer Ltd."),CN’s of the certificates that are signed by a CA that is not trusted ("\*.shadytest.org, \*.wikipedia.org").CA whose CN is "Let's encrypt" (2).CN’s of the organizations along with the CN’s of their certificates ("Wikimedia Foundation, \*.wikimedia.org, Free Software Foundation, \*.fsf.org, Free Software Foundation, \*.shadytest.org, Wikimedia Foundation, \*.wikipedia.org").DELETE FROM CA WHERE SN = 'A';UPDATE ORGANIZATION SET CN = "FSF" WHERE SN = '02';UPDATE ORGANIZATION SET SN = "01" WHERE SN = '02';DELETE FROM ORGANIZATION;Consider the following code:
/* code/sql/HW_Work.sql */
CREATE TABLE AUTHOR (
NAME VARCHAR(30) PRIMARY KEY,
Email VARCHAR(30)
);
CREATE TABLE WORK (
Title VARCHAR(30) PRIMARY KEY,
Author VARCHAR(30),
FOREIGN KEY (Author) REFERENCES AUTHOR (NAME) ON DELETE
CASCADE ON UPDATE CASCADE
);
CREATE TABLE BOOK (
ISBN INT PRIMARY KEY,
Work VARCHAR(30),
Published DATE,
Price DECIMAL(10, 2),
FOREIGN KEY (WORK) REFERENCES WORK (Title) ON DELETE
RESTRICT ON UPDATE CASCADE
);
CREATE TABLE EBOOK (
ISBN INT PRIMARY KEY,
Work VARCHAR(30),
Published DATE,
Price DECIMAL(10, 2),
FOREIGN KEY (WORK) REFERENCES WORK (Title) ON DELETE
RESTRICT ON UPDATE CASCADE
);
INSERT INTO AUTHOR
VALUES (
"Virginia W.",
"vw@isp.net"), -- A.1
(
"Paul B.", "pb@isp.net"), -- A.2
(
"Samantha T.", "st@fai.fr") -- A.3
;
INSERT INTO WORK
VALUES (
"What to eat",
"Virginia W.") -- W.1
;
INSERT INTO BOOK
VALUES (
15155627,
"What to eat",
DATE '20170219',
12.89) -- B.1
;
INSERT INTO EBOOK
VALUES (
15155628,
"What to eat",
DATE '20170215',
9.89) -- E.1
;
Assume the following:
SQL’s syntax (there’s no “a semi-colon is missing” trap).Also, note that each row inserted between line 39 and 50 is given a name in comment ("A.1, A.2, A.3, W.1", etc.).
Determine if the following insertion statements would violate the entity integrity constraint, the referential integrity constraint, if there would be some other kind of error, or if it would result in successful insertion.
INSERT INTO EBOOK VALUES (0, NULL, 20180101, 0);
INSERT INTO AUTHOR VALUES("Mary B.", "mb@fai.fr", NULL);
INSERT INTO WORK VALUES("My Life", "Claude A.");
INSERT INTO BOOK VALUES(00000000, NULL, DATE'20001225', 90.9);
INSERT INTO AUTHOR VALUES("Virginia W.", "alt@isp.net");List the rows (A.2, W.1, etc.) modified by the following statements. Be careful about the conditions on foreign keys!
UPDATE AUTHOR SET Email = 'Deprecated' WHERE Email LIKE '%isp.net';
UPDATE WORK SET Title = "How to eat" WHERE Title = "What to eat";
DELETE FROM WORK;
DELETE FROM AUTHOR WHERE Name = "Virginia W.";Assume that there is more data than what we inserted. Write a command that selects:
Consider the following code:
/* code/sql/HW_TextbookAuthored.sql */
DROP SCHEMA IF EXISTS HW_TEXTBOOKAUTHORED;
CREATE SCHEMA HW_TEXTBOOKAUTHORED;
USE HW_TEXTBOOKAUTHORED;
CREATE TABLE TEXTBOOK (
Title VARCHAR(50),
ISBN CHAR(13) PRIMARY KEY,
Price DECIMAL(10, 2)
);
CREATE TABLE AUTHOR (
LName VARCHAR(30),
FName VARCHAR(30),
Email VARCHAR(30),
PRIMARY KEY (Lname, Fname)
);
CREATE TABLE AUTHORED (
Book CHAR(13),
FOREIGN KEY (Book) REFERENCES TEXTBOOK (ISBN),
AuthorLName VARCHAR(30),
AuthorFName VARCHAR(30),
FOREIGN KEY (AuthorLName, AuthorFName) REFERENCES AUTHOR
(LName, Fname)
);
INSERT INTO TEXTBOOK
VALUES (
'Starting Out with Java: Early Objects',
9780133776744,
30.00),
(
'NoSQL for Mere Mortals',
9780134023212,
47.99);
INSERT INTO AUTHOR
VALUES (
'Sullivan',
'Dan',
NULL),
(
'Gaddis',
'Tony',
NULL);
INSERT INTO AUTHORED
VALUES (
9780134023212,
'Sullivan',
'Dan'),
(
9780133776744,
'Gaddis',
'Tony');The meaning of the AUTHORED table is that a tuple < I, L, F> represents that the author whose last name is L and whose first name is F wrote the textbook whose ISBN is I.
Answer the following:
TEXTBOOK table. No value should be NULL.Price attribute of the TEXTBOOK relation.TEXTBOOK table and have the price set to the default value.EDITOR with three attributes: Name, Address, and Website. The Name attribute should be the primary key. Insert two tuples in the EDITOR table, making sure that one should has the Name attribute set to “Pearson.”PUBLISHED with two attributes: Editor and Textbook. The Editor attribute should reference the EDITOR table and the Textbook attribute should reference the TEXTBOOK table.Answer the following short questions based on what is in our model so far:
ISBN be inserted in the TEXTBOOK relation?Consider the following code:
/* code/sql/HW_Capstone.sql */
DROP SCHEMA IF EXISTS HW_CAPSTONE;
CREATE SCHEMA HW_CAPSTONE;
USE HW_CAPSTONE;
CREATE TABLE STUDENT (
FName VARCHAR(50),
Id CHAR(13) PRIMARY KEY,
GraduationYear INT,
GraduationSemester ENUM ("Fall", "Spring", "Summer")
);
CREATE TABLE PROGRAMMING_LANGUAGE (
NAME VARCHAR(50) PRIMARY KEY,
Licence VARCHAR(50)
);
CREATE TABLE PROJECT (
CodeName VARCHAR(50),
Leader CHAR(13),
PRIMARY KEY (CodeName, Leader),
FOREIGN KEY (Leader) REFERENCES STUDENT (Id)
);
CREATE TABLE USED_LANGUAGE (
ProjectCodeName VARCHAR(50),
ProjectLeader CHAR(13),
UsedLanguage VARCHAR(50),
PRIMARY KEY (ProjectCodeName, ProjectLeader, UsedLanguage),
FOREIGN KEY (ProjectCodeName, ProjectLeader) REFERENCES
PROJECT (CodeName, Leader),
FOREIGN KEY (UsedLanguage) REFERENCES PROGRAMMING_LANGUAGE (NAME)
);
/*
*/
INSERT INTO STUDENT
VALUES (
"Mary",
"0123456789100",
2025,
"Summer"),
(
"Steve",
"0000000000000",
2025,
"Fall"),
(
"Claude",
"9999999999999",
2024,
"Fall"),
(
"Meghan",
"0987654321098",
2023,
"Spring");
INSERT INTO PROGRAMMING_LANGUAGE
VALUES (
"Rust",
"MIT"),
(
".NET Core",
"MIT"),
(
"Racket",
"LGPL"),
(
"Python",
"PSF");
-- Taken from
-- https://en.wikipedia.org/wiki/Comparison_of_open-source_programming_language_licensing
INSERT INTO PROJECT
VALUES (
"Brick Break",
"0123456789100"),
(
"Brick Break",
"0000000000000"),
(
"Grade Calculator",
"0123456789100"),
(
"Undecided",
"9999999999999");
INSERT INTO USED_LANGUAGE
VALUES (
"Brick Break",
"0123456789100",
"Rust"),
(
"Brick Break",
"0000000000000",
".NET Core"),
(
"Brick Break",
"0000000000000",
"Python"),
(
"Grade Calculator",
"0123456789100",
"Racket");The meaning of the USED_LANGUAGE table is that a tuple < N, L, U> represents the fact that the project whose code name is N and whose leader is L uses the programming language U.
Answer the following short questions based on the model implemented above. You can simply answer “True” or “False”, or justify your reasoning (e.g. with code).
Draw the relational model corresponding to this code.
Write the following commands.
Consider the following code:
/* code/sql/HW_Vaccine.sql */
CREATE TABLE COMPANY (
Name VARCHAR(50) PRIMARY KEY,
Website VARCHAR(255) CHECK (Website LIKE "https://%")
);
CREATE TABLE DISEASE (
Name VARCHAR(50) PRIMARY KEY,
Communicable BOOL,
-- Whether the disease can be transmitted from a human to
-- another.
TYPE ENUM ("infectious", "deficiency", "hereditary")
);
CREATE TABLE VACCINE (
Name VARCHAR(50) PRIMARY KEY,
Manufacturer VARCHAR(50) NOT NULL,
FOREIGN KEY (Manufacturer) REFERENCES COMPANY (NAME) ON
UPDATE CASCADE
);
CREATE TABLE EFFICACY (
DiseaseName VARCHAR(50),
VaccineName VARCHAR(50),
Efficacy DECIMAl(5, 2),
PRIMARY KEY (DiseaseName, VaccineName),
FOREIGN KEY (DiseaseName) REFERENCES DISEASE (NAME),
FOREIGN KEY (VaccineName) REFERENCES VACCINE (NAME)
);
INSERT INTO COMPANY
VALUES (
"Moderna",
"https://www.modernatx.com/");
INSERT INTO DISEASE
VALUES (
"Coronavirus disease 2019",
TRUE,
"infectious");
INSERT INTO VACCINE
VALUES (
"mRNA-1273",
"Moderna");
INSERT INTO EFFICACY
VALUES (
"Coronavirus disease 2019",
"mRNA-1273",
94.1);
Answer the following short questions. In our implementation…
Answer the following questions:
What does CHECK (Website LIKE "https://*") do?
Why did we picked the DECIMAl(5,2) datatype?
What is the benefit / are the benefits of having a separate EFFICACY table over having something like
CREATE TABLE VACCINE(
Name VARCHAR(50) PRIMARY KEY,
Manufacturer VARCHAR(50),
Disease VARCHAR(50),
Efficacy DECIMAl(5,2),
FOREIGN KEY (Manufacturer) REFERENCES COMPANY (Name)
);?
Draw the relational model corresponding to this code.
Write the following commands.
Consider the following code:
/* code/sql/HW_Residency.sql */
DROP SCHEMA IF EXISTS HW_RESIDENCY;
CREATE SCHEMA HW_RESIDENCY;
USE HW_RESIDENCY;
CREATE TABLE PERSON (
FName VARCHAR(40),
LName VARCHAR(40),
SSN VARCHAR(11) PRIMARY KEY,
Birthdate DATE
);
CREATE TABLE HOUSE (
Address VARCHAR(40) PRIMARY KEY,
Color ENUM ("blue", "white", "green")
);
CREATE TABLE RESIDENCY (
Person VARCHAR(11),
House VARCHAR(40),
PrincipalResidence BOOLEAN,
Status ENUM ("own", "rent", "squat", "other"),
FOREIGN KEY (Person) REFERENCES PERSON (SSN),
FOREIGN KEY (House) REFERENCES HOUSE (Address) ON DELETE CASCADE
);
INSERT INTO PERSON
VALUES (
NULL,
"Doe",
"000-00-0000",
NULL), -- P.1
(
"Michael", "Keal", "000-00-0001", DATE "1983-02-11"), -- P.2
(
"James", "Baldwin", "000-00-0002", DATE
"1967-01-01"), -- P.3
(
"Mridula", "Warrier", "000-00-0003", DATE "1990-02-11");
-- P.4
INSERT INTO HOUSE
VALUES (
"123 Main St.",
"blue"), -- H.1
(
"456 Second St.", "white"), -- H.2
(
"11 Third St.", "blue");
-- H.3
INSERT INTO RESIDENCY
VALUES (
"000-00-0001",
"123 Main St.",
TRUE,
"own"), -- R.1
(
"000-00-0001", "456 Second St.", FALSE, "own"), -- R.2
(
"000-00-0002", "123 Main St.", TRUE, "rent"), -- R.3
(
"000-00-0003", "456 Second St.", TRUE, "own");
-- R.4Note that each row inserted in the PERSON, HOUSE and RESIDENCY tables is given the name and noted as afterwards as a comment ("P.1, P.2, P.3, P.4, H.1", etc.).
Answer the following questions and problems, assuming that none of the commands in the rest of the problem are actually executed.
Draw the relational model corresponding to this series of commands (it is not necessary to include the state).
List the rows (e.g. “P.2”, “H.1”, or “none”) modified by the following statements:
UPDATE HOUSE SET COLOR = "green";DELETE FROM RESIDENCY WHERE House LIKE "1%";DELETE FROM HOUSE WHERE Address = "456 Second St.";DELETE FROM PERSON WHERE Birthdate=DATE"1990-02-11";Write queries that return the following information. The values returned in this set-up will be in parenthesis, but keep the queries general.
Addresses' of the houses in the system (“11 Third St., 123 Main St., 456 Second St.”`).SSN’s of the people whose first name was not entered in the system ("000-00-0000")."white, blue").Address of the residency of "James Baldwin" ("123 Main St.")."James")."Michael Keal"’s principal residency address ("123 Main St.")."Michael Keal, Mridula Warrier").SSN’s of the people that have the same principal residency as "James Baldwin" ("000-00-0001").SSN of "James Baldwin" to "000-00-0010". Is there any reason for this command to be rejected by the system? If yes, explain the reason.
Answer the following short questions from the data in our model, as it is currently:
SSN to be any series of 11 characters?RESIDENCY table and give a possible primary key.
RESIDENCY table is a good choice.
Consider the following code:
/* code/sql/HW_ScientificResearch.sql */
CREATE TABLE SCIENTIST (
SSN INT PRIMARY KEY,
Name VARCHAR(30) NOT NULL
);
CREATE TABLE PROJECT (
Code CHAR(4) PRIMARY KEY,
Name VARCHAR(150) NOT NULL
);
CREATE TABLE CONTRIBUTESTO (
Scientist INT,
Project CHAR(4),
Hours INT,
PRIMARY KEY (Scientist, Project),
FOREIGN KEY (Scientist) REFERENCES SCIENTIST (SSN),
FOREIGN KEY (Project) REFERENCES PROJECT (Code) ON DELETE
CASCADE ON UPDATE CASCADE
);
CREATE TABLE FUNDINGAGENCY (
Name VARCHAR(150) PRIMARY KEY,
TYPE ENUM ("State", "Federal", "Foundation"),
Creation YEAR
);
CREATE TABLE FUNDS (
Agency VARCHAR(150),
Project CHAR(4),
Amount DECIMAL(12, 2),
FOREIGN KEY (Agency) REFERENCES FUNDINGAGENCY (NAME) ON
UPDATE CASCADE ON DELETE RESTRICT,
FOREIGN KEY (Project) REFERENCES PROJECT (Code)
);
INSERT INTO SCIENTIST
VALUES (
"000000000",
"Mike"), -- S.1
(
"000000001", "Sabine"), -- S.2
(
"000000002", "James"), -- S.3
(
"000000003", "Emily"), -- S.4
(
"000000004", "Claire");
-- S.5
INSERT INTO PROJECT
VALUES (
"AA",
"Advancing Airplanes"), -- P.1
(
"BA", "Better Airplanes"), -- P.2
(
"BB", "Better Buildings"), -- P.3
(
"CC", "Creative Creation");
-- P.4
INSERT INTO CONTRIBUTESTO
VALUES (
"000000001",
"AA",
12), -- C.1
(
"000000001", "BB", 10), -- C.2
(
"000000002", "AA", 5), -- C.3
(
"000000003", "BA", 3), -- C.4
(
"000000000", "BB", 1), -- C.5
(
"000000000", "AA", 1);
-- C.6
INSERT INTO FUNDINGAGENCY
VALUES (
"National Science Foundation",
"Federal",
1950), -- FA.1
(
"French-American Cultural Exchange", "Foundation", 2017);
-- FA.2
INSERT INTO FUNDS
VALUES (
"National Science Foundation",
"AA",
100000), -- F.1
(
"French-American Cultural Exchange", "CC", 10000);
-- F.2Note that each row inserted in the tables is given a name and noted as afterwards as a comment ("S.1, S.2, P.1, C.1, FA.1", etc.).
Answer the following questions and problems, assuming that none of the commands in the rest of the problem are actually executed.
Draw the relational model corresponding to this series of commands (it is not necessary to include the state).
Draw the relational model corresponding to this series of commands (no need to include the state).
How could you edit line 12 so that negative values and NULL would not be admitted as values for Hours?
Write a command that would violate the referential integrity constraint.
List the rows affected (updated or deleted) by the following commands. If no rows are affected because the command would would violate the entity integrity constraint, the referential integrity constraint, or if there would be some other kind of error, please indicate it.
UPDATE SCIENTIST SET SSN = "000000001" WHERE Name = "Claire";UPDATE FUNDINGAGENCY SET Name = "NSF" WHERE Name = "National Science Foundation";DELETE FROM FUNDINGAGENCY WHERE Name = "French-American Cultural Exchange";Write a query that selects …(In parenthesis, the values returned in this set-up, but you have to be general.)
…the name of the funding agencies created after 2000 ("French-American Cultural Exchange")
…the code of the projects that contains the word "Airplanes" ("AA", "BA")
…the number of hours scientists contributed to the project "AA" (18)
…the code of the projects to which the scientist named Sabine contributed ("AA", "BB")
…the name of the projects who benefited from federal funds ("Advancing Airplanes")
…the name of the scientist who contributed to the same project as Mike ("Sabine", "James")
…the name of the projects that are not funded by an agency ("Better Airplanes", "Better Buildings")
…the name of the scientist who contributed the most (in terms of hours) to the project named "Advancing Airplanes" (Sabine).
Identify and discuss two limitations of this model, and offer a way to remedy at least one of them.
Consider the following code:
CREATE TABLE ROOM(
Nickname VARCHAR(40) PRIMARY KEY,
Size INT,
ComputerOrPhoneInIt BOOL NOT NULL
);
CREATE TABLE COMPUTER(
Nickname VARCHAR(40) PRIMARY KEY,
OperatingSystem VARCHAR(50),
Room VARCHAR(40),
FOREIGN KEY (Room) REFERENCES ROOM(Nickname)
);
CREATE TABLE PHONE(
Nickname VARCHAR(40) PRIMARY KEY,
OperatingSystem VARCHAR(50),
Room VARCHAR(40),
FOREIGN KEY (Room) REFERENCES ROOM(Nickname)
);
CREATE TABLE PRINTER(
Nickname VARCHAR(40) PRIMARY KEY,
ConnectedTo VARCHAR(40),
Room VARCHAR(40),
FOREIGN KEY (Room) REFERENCES ROOM(Nickname)
);It was written by some friends of yours to store data for their printing station: their shop offers computers, phones and printers located in different rooms, and they want to keep track of some information about those. They have multiple issues with this implementation, and require your help to identify them and design a new model addressing those.
For each issue listed below, explain what causes it, and if there is a way to address it (you do not need to write actual code, simply explain how you would proceed, using keywords if it clarifies).
ROOM.ComputerOrPhoneInIt is a bit cumbersome, as they have to remember to update it to FALSE if the last computer or phone is removed from a room.OperatingSystem attributes in PHONE and COMPUTER are not very convenient, as they do not provide an easy way for instance to list all the windows computers and phones, or all the 64-bits architectures. Examples of values are “Windows 10 IoT Core version 1.8.9, 64 bits”,“Android Pie”, “macOS 11 Big Sur, updated last week”, etc.Draw a relational model (no need to write sql code) that would improve their implementation. You are free (and encouraged!) to create new relations and alter the original attributes. No need to specify the domains unless you want to add particular constraints, but remember to draw the primary and foreign keys.
Consider the following code:
CREATE TABLE COMPUTER (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40)
);
CREATE TABLE PERIPHERAL (
ID VARCHAR(20) PRIMARY KEY,
Model VARCHAR(40),
Type ENUM ('mouse', 'keyboard', 'screen', 'printer'),
LastConnexion DATETIME
);
CREATE TABLE CONNEXION (
Computer VARCHAR(20),
Peripheral VARCHAR(20),
PRIMARY KEY (Computer, Peripheral),
FOREIGN KEY (Computer) REFERENCES COMPUTER (ID)
ON DELETE CASCADE
ON UPDATE CASCADE,
FOREIGN KEY (Peripheral) REFERENCES PERIPHERAL (ID)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
CREATE TRIGGER last_connexion_update
BEFORE INSERT ON CONNEXION
FOR EACH ROW
UPDATE PERIPHERAL
SET LastConnexion = NOW()
WHERE NEW.Peripheral = PERIPHERAL.ID;
INSERT INTO COMPUTER
VALUES
('A','Apple IIc Plus'), -- C.1
('B','Commodore SX-64'); -- C.2
INSERT INTO PERIPHERAL(ID, Model, Type)
VALUES
('12', 'Trendcom Model', 'printer'), -- P.1
('14', 'TP-10 Thermal Matrix', 'printer'), -- P.2
('15', 'IBM Selectric', 'keyboard'); -- P.3
INSERT INTO CONNEXION
VALUES
('A', '12'), -- X.1
('B', '14'), -- X.2
('A', '15'); -- X.3| True | False | |
| A peripheral can be connected to multiple computers | ||
The ID of a computer must be a letter |
||
| Whether the connexion is wired or wireless can be determined | ||
| Every computer must have a different model | ||
| A peripheral can be a mouse and a keyboard at the same time | ||
| A computer can be connected to multiple peripheral | ||
| A computer can be connected to another computer | ||
| A peripheral can be connected to another peripheral |
List the rows (i.e., C.2, X.1, or even “none”) deleted by the following statements:
DELETE FROM CONNEXION WHERE Computer = 'A';DELETE FROM COMPUTER WHERE ID = 'A';DELETE FROM PERIPHERAL WHERE ID = '15';DELETE FROM CONNEXION WHERE Computer <> 'A';Write a query that selects …(In parenthesis, the values returned in this set-up, but you have to be general.)
Id 12 (printer)ID of the computer whose model name contain "Apple" (A).ID of the computer connected to a keyboard (A)INSERT INTO CONNEXION VALUES ('B', '12'); is executed.
SQL)This problem is supposed to be a straightforward application of what we studied in class. Look back at Setting Up Your Work Environment if you feel like you are stuck before referencing this solution.
We simply log-in as indicated in the “Logging-in as testuser” section. Then we enter:
CREATE DATABASE HW_Address;
USE HW_Address;This creates the tables asked for in the problem.
Ommiting the Extra column, we have:
MariaDB [HW_Address]> DESC ADDRESS;
+------------+-------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+------------+-------------+------+-----+---------+
| StreetName | varchar(15) | NO | PRI | NULL |
| Number | int(11) | NO | PRI | NULL |
| Habitants | int(11) | YES | | NULL |
+------------+-------------+------+-----+---------+We add the foreign key, still omitting the Extra column:
MariaDB [HW_Address]> DESC ADDRESS;
+------------+-------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+------------+-------------+------+-----+---------+
| StreetName | varchar(15) | NO | PRI | NULL |
| Number | int(11) | NO | PRI | NULL |
| Habitants | int(11) | YES | MUL | NULL |
+------------+-------------+------+-----+---------+The only difference is the MUL value, which is a bit surprising: quoting https://dev.mysql.com/doc/refman/8.0/en/show-columns.html,
If
KeyisMUL, then the column is the first column of a nonunique index in which multiple occurrences of a given value are permitted within the column.
In other words, this does not carry any information about the fact that ADDRESS.Habitants is now a foreign key referencing NAME.ID. A way of displaying information about that foreign key is using SHOW CREATE TABLE:
MariaDB [HW_Address]> SHOW CREATE TABLE ADDRESS;
+---------+----------------------+
| Table | Create Table
+---------+----------------------+
| ADDRESS | CREATE TABLE `ADDRESS` (
`StreetName` varchar(15) NOT NULL,
`Number` int(11) NOT NULL,
`Habitants` int(11) DEFAULT NULL,
PRIMARY KEY (`StreetName`,`Number`),
KEY `Habitants` (`Habitants`),
CONSTRAINT `ADDRESS_ibfk_1` FOREIGN KEY (`Habitants`) REFERENCES `NAME` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+---------+----------------------+
1 row in set (0.01 sec)
To display the information back, we can use
SELECT * FROM NAME;We should notice that the ID attribute values lost their leading zeros.
This syntax is better for “bulk insertion” since it allows for us to write fewer commands and to focus on the data being inserted. However, if an error occurs, then nothing gets inserted.
SELECT ID FROM NAME WHERE FName = 'Samantha';
This is a command that violates the entity integrity constraint:
INSERT INTO NAME VALUES ('Maria', 'Kashi', NULL);The error message that it returns is:
ERROR 1048 (23000): Column 'ID' cannot be nullAnother way of violating the entity integrity constraint is:
INSERT INTO NAME VALUES ('Maria', 'Kashi', 80);The error message that it returns is:
ERROR 1062 (23000): Duplicate entry '80' for key 'PRIMARY'This is an UPDATE statement that violates the entity integrity constraint:
UPDATE ADDRESS SET Habitants = 340 WHERE Number = 120;The error message that it returns is:
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`HW_Address`.`ADDRESS`, CONSTRAINT `ADDRESS_ibfk_1` FOREIGN KEY (`Habitants`) REFERENCES `NAME` (`ID`))Here is the query that violates another type of constraint:
INSERT INTO NAME VALUE ('Hi');The error message that it returns is:
ERROR 1136 (21S01): Column count does not match value count at row 1The query statement violates the implicit constraint by trying to insert a row with fewer values than there are attributes in the table.
Another example of a statement that violates another type of constraint is:
INSERT INTO ADDRESS VALUES ('Maria', 'Random', 98);This is a violation of an explicit constraint, which is that the value must match the domain (datatype) of the attribute where it is inserted. However, MySQL and MariaDB do not return an error, they simply replace 'Random' with 0.
SQL)Here is how we created our table:
CREATE SCHEMA HW_REPETITION;
USE HW_REPETITION;
CREATE TABLE EXAMPLE(
X VARCHAR(15),
Y INT
);The command to add a tuple to our table is:
INSERT INTO EXAMPLE VALUES('Train', 4);If we execute this command twice, then SQL is OK with it, and inserts the same tuple twice:
SELECT * FROM EXAMPLE;Displays:
+-------+------+
| X | Y |
+-------+------+
| Train | 4 |
| Train | 4 |
+-------+------+This is an illustration of the fact that the data in a table in SQL is not a set, as opposed to a state in a relation in the relational model.
The command:
ALTER TABLE EXAMPLE ADD PRIMARY KEY (X);Should return:
ERROR 1062 (23000): Duplicate entry 'Train' for key 'PRIMARY'We tried to declare that X was a primary key, but SQL disagreed, since two rows have the same value for that attribute.
Once the table is empty, X now qualifies as a candidate key, and can now be made a primary key. SQL stops complaining and lets us assign it as a primary key.
After trying this insertion statement twice:
INSERT INTO EXAMPLE VALUES('Train', 4);SQL refuses to insert the tuple after the second attempt:
ERROR 1062 (23000): Duplicate entry 'Train' for key 'PRIMARY'Notice that this is exactly the same error message as before, when we tried to add the primary key while we had a duplicate row of tuples!
Removing the PRIMARY KEY constraint, SQL throws the following error message:
ERROR 1005 (HY000): Can't create table `HW_FK_test`.`SOURCE` (errno: 150 "Foreign key constraint is incorrectly formed")Replacing PRIMARY KEY with UNIQUE does not generate any error messages.
Replacing one of the VARCHAR(25) with CHAR(25) does not generate any error messages.
Replacing VARCHAR(25) with INT results in this error message:
ERROR 1005 (HY000): Can't create table `HW_FK_test`.`SOURCE` (errno: 150 "Foreign key constraint is incorrectly formed")Replacing one of the VARCHAR(25) with VARCHAR(15) does not generate any error messages.
The remarks become:
- The datatype of the foreign key has to be “compatible” with the datatype of the attribute to which we are referring.
- The target of the foreign key must be the primary key or have the
UNIQUEconstraint.
Ignoring the LECTURE relation, we have:
The code is straightforward:
CREATE TABLE HW_Lecture (
NAME VARCHAR(25),
Instructor VARCHAR(25),
Year YEAR (4),
Code CHAR(5),
PRIMARY KEY (Year, Code),
FOREIGN KEY (Instructor) REFERENCES PROF (LOGIN)
);
INSERT INTO HW_Lecture
VALUES (
'Intro to CS',
'caubert',
2017,
'1304'),
(
'Intro
to Algebra',
'perdos',
2017,
'1405'),
(
'Intro to
Cyber',
'aturing',
2017,
'1234');
However, this representation can not handle the following situations:
We come back to those short-coming in the “Reverse-Engineering” section, using more abstract tools (such as Entity Diagrams) that have not been introduced yet.
The statements are immediate:
HW_ProfExampleRevisitedRevisited.sqlWhat may be surprising is that the values for LectureCode and LectureYear are set to NULL in all the tuples.
We use UPDATE statements:
UPDATE
GRADE
SET LectureCode = '1304',
LectureYear = 2017
WHERE LOGIN = 'jrakesh'
AND Grade = '2.85';
UPDATE
GRADE
SET LectureCode = '1405',
LectureYear = 2017
WHERE LOGIN = 'svlatka'
OR (LOGIN = 'jrakesh'
AND Grade = '3.85');
UPDATE
GRADE
SET LectureCode = '1234',
LectureYear = 2017
WHERE LOGIN = 'aalyx'
OR LOGIN = 'cjoella';
We use SELECT statements:
SELECT LOGIN,
Grade
FROM GRADE
WHERE Lecturecode = '1304'
AND LectureYear = '2017';
SELECT DISTINCT Instructor
FROM HW_Lecture
WHERE Year = 2017;
SELECT Name,
Grade
FROM STUDENT,
GRADE
WHERE GRADE.LectureCode = 1405
AND STUDENT.Login = GRADE.Login;
SELECT Year
FROM HW_Lecture
WHERE Code = '1234';
SELECT Name
FROM HW_Lecture
WHERE Year IN (
SELECT Year
FROM HW_Lecture
WHERE CODE = '1234');
SELECT B.name
FROM STUDENT AS A,
STUDENT AS B
WHERE A.Name = 'Ava Alyx'
AND A.Registered > B.Registered;
SELECT COUNT(DISTINCT PROF.Name) AS 'Head Teaching This Year'
FROM HW_Lecture,
DEPARTMENT,
PROF
WHERE Year = 2017
AND Instructor = Head
AND Head = PROF.Login;
SQL coding)The code below includes the answers to all of the questions for this problem:
-- Question 1:
CREATE TABLE TRAIN (
Id VARCHAR(30) PRIMARY KEY, -- This line was changed.
Model VARCHAR(30),
ConstructionYear YEAR (4)
);
-- Question 2 :
CREATE TABLE CONDUCTOR (
Id VARCHAR(20),
NAME VARCHAR(20),
ExperienceLevel VARCHAR(20)
);
ALTER TABLE CONDUCTOR
ADD PRIMARY KEY (Id);
-- Question 3
CREATE TABLE ASSIGNED_TO (
TrainId VARCHAR(20),
ConductorId VARCHAR(20),
Day DATE,
PRIMARY KEY (TrainId, ConductorId),
FOREIGN KEY (TrainId) REFERENCES TRAIN (Id), -- This line was changed
FOREIGN KEY (ConductorId) REFERENCES CONDUCTOR (Id) -- This line was changed
);
-- Question 4:
/*
We insert more than one tuple, to make the SELECT statements that follow easier
to test and debug.
*/
INSERT INTO TRAIN
VALUES (
'K-13',
'SurfLiner',
2019),
(
'K-12',
'Regina',
2015);
INSERT INTO CONDUCTOR
VALUES (
'GP1029',
'Bill',
'Junior'),
(
'GP1030',
'Sandrine',
'Junior');
INSERT INTO ASSIGNED_TO
VALUES (
'K-13',
'GP1029',
DATE '2015/12/14'),
(
'K-12',
'GP1030',
'20120909');
-- Question 5:
UPDATE
CONDUCTOR
SET ExperienceLevel = 'Senior'
WHERE Id = 'GP1029';
-- Question 6:
-- 1.
SELECT Id
FROM TRAIN;
-- 2.
SELECT Name
FROM CONDUCTOR
WHERE ExperienceLevel = 'Senior';
-- 3.
SELECT ConstructionYear
FROM TRAIN
WHERE Model = 'SurfLiner'
OR Model = 'Regina';
-- 4.
SELECT ConductorId
FROM ASSIGNED_TO
WHERE TrainId = 'K-13'
AND Day = '2015/12/14';
-- 5.
SELECT Model
FROM TRAIN,
ASSIGNED_TO
WHERE ConductorID = 'GP1029'
AND TrainId = TRAIN.ID;
SQL statements for the COFFEE database)Solution to Question 1:
The answers to the rest of the questions are in the following code:
/* code/sql/HW_DBCoffee.sql */
-- Question 2:
START TRANSACTION;
INSERT INTO CUSTOMER
VALUES (
005,
'Bob Hill',
NULL,
001);
INSERT INTO COFFEE
VALUES (
002,
"Peru",
"Decaf",
3.00);
-- The following statement raises an error.
-- INSERT INTO PROVIDER
-- VALUES (NULL, "contact@localcof.com");
-- ERROR 1048 (23000) at line 68: Column
-- 'Name'
-- cannot
-- be
-- null
INSERT INTO SUPPLY
VALUES (
"Johns & Co.",
121);
-- The following statement raises an error.
-- -INSERT INTO SUPPLY
-- VALUES ("Coffee Unl.", 311, 221);
-- ERROR 1136 (21S01): Column count
-- doesn't
-- match
-- value
-- count at row 1
-- Rest the changes:
ROLLBACK;
-- Question 3:
START TRANSACTION;
UPDATE
CUSTOMER
SET FavCoffee = 001
WHERE CardNo = 001;
-- Rows matched: 1 Changed: 1 Warnings: 0
SELECT *
FROM CUSTOMER;
ROLLBACK;
START TRANSACTION;
UPDATE
COFFEE
SET TypeOfRoast = 'Decaf'
WHERE Origin = 'Brazil';
-- Rows matched: 2 Changed: 2 Warnings: 0
SELECT *
FROM COFFEE;
ROLLBACK;
START TRANSACTION;
UPDATE
PROVIDER
SET Name = 'Coffee Unlimited'
WHERE Name = 'Coffee Unl.';
-- Rows matched: 1 Changed: 1 Warnings: 0
SELECT *
FROM PROVIDER;
SELECT *
FROM SUPPLY;
ROLLBACK;
START TRANSACTION;
UPDATE
COFFEE
SET PricePerPound = 10.00
WHERE PricePerPound > 10.00;
-- Rows matched: 1 Changed: 1 Warnings: 0
SELECT *
FROM COFFEE;
ROLLBACK;
-- Question 4:
START TRANSACTION;
DELETE FROM CUSTOMER
WHERE Name LIKE '%S%';
-- Query OK, 2 rows affected (0.01 sec)
SELECT *
FROM CUSTOMER;
ROLLBACK;
START TRANSACTION;
DELETE FROM COFFEE
WHERE Ref = 001;
-- Query OK, 1 row affected (0.00 sec)
SELECT *
FROM COFFEE;
SELECT *
FROM SUPPLY;
ROLLBACK;
START TRANSACTION;
DELETE FROM SUPPLY
WHERE Provider = 'Coffee Unl.'
AND Coffee = '001';
-- Query OK, 1 row affected (0.00 sec)
SELECT *
FROM SUPPLY;
ROLLBACK;
START TRANSACTION;
DELETE FROM PROVIDER
WHERE Name = 'Johns & Co.';
-- Query OK, 1 row affected (0.00 sec)
SELECT *
FROM PROVIDER;
SELECT *
FROM SUPPLY;
ROLLBACK;
-- Question 5:
-- 1.
SELECT Origin
FROM COFFEE
WHERE TypeOfRoast = 'Dark';
-- 2.
SELECT FavCoffee
FROM CUSTOMER
WHERE Name LIKE 'Bob%';
-- 3.
SELECT Name
FROM PROVIDER
WHERE Email IS NULL;
-- 4.
SELECT COUNT(*)
FROM SUPPLY
WHERE Provider = 'Johns & Co.';
-- 5.
SELECT Provider
FROM COFFEE,
SUPPLY
WHERE TypeOfRoast = 'Dark'
AND Coffee = Ref;
SELECT EMPLOYEE.Name
FROM EMPLOYEE,
DEPARTMENT
WHERE DEPARTMENT.Name = "Storage"
AND EMPLOYEE.Department = DEPARTMENT.ID;
SELECT EMPLOYEE.Name
FROM EMPLOYEE,
DEPARTMENT
WHERE Hired <= ALL (
SELECT Hired
FROM EMPLOYEE
WHERE Hired IS NOT NULL
AND DEPARTMENT.Name = "Storage"
AND EMPLOYEE.Department = DEPARTMENT.ID)
AND DEPARTMENT.Name = "Storage"
AND EMPLOYEE.Department = DEPARTMENT.ID;
SELECT Model
FROM COMPUTER
WHERE ID = 'A';
SELECT TYPE
FROM PERIPHERAL
WHERE ID = '14';
SELECT Model
FROM PERIPHERAL
WHERE TYPE = 'printer';
SELECT Model
FROM PERIPHERAL
WHERE Model LIKE 'IBM%';
SELECT Model
FROM PERIPHERAL,
CONNEXION
WHERE Computer = 'A'
AND Peripheral = PERIPHERAL.ID;
SELECT COUNT(Computer)
FROM CONNEXION,
COMPUTER
WHERE Model = 'Apple IIc Plus'
AND Computer = COMPUTER.ID;
/* code/sql/HW_SocialMedia.sql */
-- … the title of all the videos ("My first video!", "My
-- second video!", "My vacations").
SELECT TITLE
FROM VIDEO;
-- … the release date of the video whose title is "My first
-- video!" ("2020-02-02").
SELECT Released
FROM VIDEO
WHERE Title = "My first video!";
-- … the ID of the account(s) where the "Name" attribute
-- was not given ("2").
SELECT ID
FROM ACCOUNT
WHERE Name IS NULL;
-- … the ID of the videos whose title contains the word
-- "video" ("10", "20").
SELECT ID
FROM VIDEO
WHERE TITLE LIKE "%video%";
-- or
SELECT ID
FROM VIDEO
WHERE Title REGEXP 'video';
-- … the number of thumbs up for the video with title "My
-- vacations" ("1").
SELECT COUNT(*)
FROM THUMBS_UP,
VIDEO
WHERE VIDEO.Title = "My vacations"
AND VIDEO.ID = THUMBS_UP.Video;
-- … the title of the oldest video ("My first video!").
SELECT Title
FROM VIDEO
WHERE Released <= ALL (
SELECT Released
FROM VIDEO);
-- or
SELECT Title
FROM VIDEO
WHERE Released = (
SELECT Min(Released)
FROM VIDEO);
-- or even
SELECT Title
FROM VIDEO
ORDER BY Released ASC
LIMIT 1;
-- … the names of the accounts who gave a thumbs up to the
-- video with id 30 ("Bob Ross").
SELECT Name
FROM ACCOUNT,
THUMBS_UP
WHERE THUMBS_UP.Video = 30
AND THUMBS_UP.Account = ACCOUNT.ID;
-- … the ID of the account with the greatest number of
-- subscribers ("2").
SELECT Subscribed
FROM SUBSCRIBE
GROUP BY Subscribed
ORDER BY COUNT(Subscriber) DESC
LIMIT 1;
The following solves all the issues with your friend’s code design. As quests only rarely provide a special item, we added a relation to avoid having a Special-item in the QUEST table since that would be NULL too often.
Here are possible ways of getting the required information:
The Title of all the books:
SELECT Title FROM BOOK;The distinct Name of the publishers.
SELECT DISTINCT Name FROM PUBLISHER;The Titles and Published dates of the books published since January 31, 2012.
SELECT Title, Published FROM BOOK
WHERE Published > DATE'20120131';The first and last names of the authors published by "Gallimard" (from any city).
SELECT FName, LName FROM AUTHOR, BOOK
WHERE PublisherName = "Gallimard"
AND Author = ID;The first and last names of the authors who were not published by an editor in "New-York".
SELECT FName, LName FROM AUTHOR, BOOK
WHERE NOT PublisherCity= "New-York"
AND Author = ID;The ID of the authors who published a book whose name starts with "Where".
SELECT Author FROM BOOK
WHERE Title LIKE 'Where%';The total number of pages in the database.
SELECT SUM(Pages) FROM BOOK;The number of pages in the longest book written by the author whose last name is "Wolve".
SELECT MAX(PAGES) FROM BOOK, AUTHOR
WHERE LName = "Wolve"
AND Author = ID; The title of the books published in the 19th century.
SELECT Title FROM BOOK
WHERE Published >= DATE'18010101'
AND Published <= DATE'19001231';UPDATE BOOK SET Title = "BANNED"
WHERE Author = 3;The pair (title, publication date) is the primary key in the BOOK table, so if the author whose ID is 3 has published more than one book at a particular date, then our update will be rejected, as applying it would result in violating the entity integrity constraint.
DELETE FROM BOOK WHERE Author = 3;
DELETE FROM AUTHOR WHERE ID = 3;Note that trying to delete the rows in the AUTHOR table before deleting the rows in the BOOK table could cause a referential integrity violation, since the BOOK table has a foreign key assigned to the AUTHOR table’s Id field.
CREATE TABLE AWARD(
Name VARCHAR(30),
Year DATE,
BookTitle VARCHAR(30),
BookPubDate DATE,
FOREIGN KEY (BookTitle, BookPubDate)
REFERENCES BOOK(Title, Published),
PRIMARY KEY (Name, Year)
);Note that there is no need to store the name of the author in this relation: this information can be recovered by looking in the BOOK table for the name of the author of the awarded book.
Note that having two attributes as the primary key makes the referencing of foreign keys more cumbersome.
Two of the flaws that come to mind are:
BOOK relation: two books with the same title cannot be published on the same day, which is a serious limitation. Using a primary key like ISBN would be much more appropriate.IS_THE_AUTHOR_OF and PUBLISHED_BY, that “maps” the book’s ISBN with author’s or editor’s primary key.The solution can be read from the following code:
/* code/sql/HW_Certificate.sql */
DROP SCHEMA IF EXISTS HW_Certificate;
CREATE SCHEMA HW_Certificate;
USE HW_Certificate;
/*
SN = Serial Number
CN = Common Name
CA = Certificate Authority
*/
CREATE TABLE ORGANIZATION (
SN VARCHAR(30) PRIMARY KEY,
CN VARCHAR(30)
);
CREATE TABLE CA (
SN VARCHAR(30) PRIMARY KEY,
CN VARCHAR(30),
Trusted BOOL
);
CREATE TABLE CERTIFICATE (
SN VARCHAR(30) PRIMARY KEY,
CN VARCHAR(30) NOT NULL,
Org VARCHAR(30) NOT NULL,
Issuer VARCHAR(30),
Valid_Since DATE,
Valid_Until DATE,
FOREIGN KEY (Org) REFERENCES ORGANIZATION (SN) ON DELETE CASCADE,
FOREIGN KEY (Issuer) REFERENCES CA (SN)
);
INSERT INTO ORGANIZATION
VALUES (
'01',
'Wikimedia Foundation'),
(
'02',
'Free
Software Foundation');
INSERT INTO CA
VALUES (
'A',
"Let's Encrypt",
TRUE),
(
'B',
'Shady Corp.',
FALSE),
(
'C',
'NewComer Ltd.',
NULL);
INSERT INTO CERTIFICATE
VALUES (
'a',
'*.wikimedia.org',
'01',
'A',
20180101,
20200101),
(
'b',
'*.fsf.org',
'02',
'A',
20180101,
20191010),
(
'c',
'*.shadytest.org',
'02',
'B',
20190101,
20200101),
(
'd',
'*.wikipedia.org',
'01',
'C',
20200101,
20220101);
-- CN of all certificates.
SELECT CN
FROM CERTIFICATE;
-- (*.wikimedia.org | *.fsf.org | *.shadytest.org |
-- *.wikipedia.org)
-- The SN of the organizations whose CN
-- contains
-- "Foundation"
SELECT SN
FROM ORGANIZATION
WHERE CN LIKE "%Foundation%";
-- (01 | 02)
-- The CN and expiration date of all the
-- certificates
-- that
-- expired (assuming we are the 6th of
-- December
-- 2019).
SELECT CN,
Valid_Until
FROM CERTIFICATE
WHERE Valid_Until < DATE '20191206';
-- (*.fsf.org, 2019-10-10)
-- The CN of the CA that are not trusted.
SELECT CN
FROM CA
WHERE Trusted IS NOT TRUE;
-- (Shady Corp. | NewComer Ltd.)
-- The CN of the certificates that are
-- signed
-- by
-- a
-- CA
-- that
-- is not trusted.
SELECT CERTIFICATE.CN
FROM CERTIFICATE,
CA
WHERE Trusted IS NOT TRUE
AND CA.SN = CERTIFICATE.Issuer;
-- (Shady Corp. | NewComer Ltd.)
-- The number of certificates signed by
-- the
-- CA
-- whose
-- CN
-- is
-- "Let's encrypt".
SELECT COUNT(CERTIFICATE.SN) AS "Number of certificates signed
by Let's encrypt"
FROM CERTIFICATE,
CA
WHERE CERTIFICATE.Issuer = CA.SN
AND CA.CN = "Let's encrypt";
-- (2)
-- A table listing the CN of the
-- organizations
-- along
-- with
-- the CN of their certificates.
SELECT ORGANIZATION.CN AS Organization,
CERTIFICATE.CN AS Certificate
FROM ORGANIZATION,
CERTIFICATE
WHERE CERTIFICATE.Org = ORGANIZATION.SN;
-- ( Wikimedia Foundation, *.wikimedia.org | Free Software
-- Foundation, *.fsf.org | Free Software
-- Foundation
-- ,
-- *.shadytest.org | Wikimedia Foundation ,
-- *.wikipedia.org
-- )
/*
DELETE FROM CA WHERE SN = 'A';
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`HW_Certificate`.`CERTIFICATE`, CONSTRAINT `CERTIFICATE_ibfk_2` FOREIGN KEY (`Issuer`) REFERENCES `CA` (`SN`))
=> Rejected, because an entry in CERTIFICATE references this tuple (referential integrity constraint).
UPDATE ORGANIZATION SET CN = "FSF" WHERE SN = '02';
Query OK, 1 row affected (0.008 sec)
Rows matched: 1 Changed: 1 Warnings: 0
=> Ok, change
('02', 'Free Software Foundation');
into
('02', 'FSF');
in ORGANIZATION
MariaDB [HW_Certificate]> UPDATE ORGANIZATION SET SN = "01" WHERE SN = '02';
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`HW_Certificate`.`CERTIFICATE`, CONSTRAINT `CERTIFICATE_ibfk_1` FOREIGN KEY (`Org`) REFERENCES `ORGANIZATION` (`SN`) ON DELETE CASCADE)
=> Rejected, because an entry in CERTIFICATE references this tuple (referential integrity constraint).
This query would have been rejected even if this tuple was not referenced, since it would have violated the entity integrity constraint.
DELETE FROM ORGANIZATION;
=> Deletes all the content of organization and of certificate.
*/The relational model for this code is:
The solution to the next questions can be read from the following code:
/* code/sql/HW_Work.sql */
/*
Determine if the following insertion statements would violate the the Entity integrity constraint,
the Referential integrity constraint, if there would be some Other kind of error, or if it would
result in uccessful insertion.
*/
START TRANSACTION;
-- We don't want to perform the actual insertions.
INSERT INTO EBOOK
VALUES (
0,
NULL,
20180101,
0);
/*
Query OK, 1 row affected (0.003 sec)
So, "Successful insertion".
*/
-- The following statement raises an error.
-- INSERT INTO AUTHOR VALUES ("Mary B.", "mb@fai.fr", NULL);
/*
ERROR 1136 (21S01): Column count doesn't match value count at row 1
So, "Other kind of error".
*/
-- The following statement raises an error.
-- INSERT INTO WORK VALUES ("My Life", "Claude A.");
/*
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails
(`HW_EXAM_1`.`WORK`, CONSTRAINT `WORK_ibfk_1` FOREIGN KEY (`Author`) REFERENCES `AUTHOR` (`Name`)
ON DELETE CASCADE ON UPDATE CASCADE)
So, "Referential integrity constraint"
*/
INSERT INTO BOOK
VALUES (
00000000,
NULL,
DATE '20001225',
90.9);
/*
Query OK, 1 row affected (0.000 sec)
So, "Successful insertion".
*/
-- The following statement raises an error.
-- INSERT INTO AUTHOR VALUES ("Virginia W.", "alt@isp.net");
/*
ERROR 1062 (23000): Duplicate entry 'Virginia W.' for key 'PRIMARY'
So, "Entity integrity constraint".
*/
ROLLBACK;
-- We go back to the previous state.
/*
List the rows (i.e., A.2, W.1, etc.) modified by the following statements
(be careful about the conditions on foreign keys!):
*/
START TRANSACTION;
-- We don't want to perform the following operations.
UPDATE
AUTHOR
SET Email = 'Deprecated'
WHERE Email LIKE '%isp.net';
/*
Query OK, 2 rows affected (0.010 sec)
Rows matched: 2 Changed: 2 Warnings: 0
This changed A.1 and A.2
*/
UPDATE
WORK
SET Title = "How to eat"
WHERE Title = "What to eat";
/*
Rows matched: 1 Changed: 1 Warnings: 0
SQL returns only the number of row changed in the WORK table,
but other rows have been changed as well.
This changed W.1, B.1, E.1.
*/
-- The following statement raises an error.
-- DELETE FROM WORK;
/*
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails
(`HW_EXAM_1`.`BOOK`, CONSTRAINT `BOOK_ibfk_1` FOREIGN KEY (`Work`) REFERENCES `WORK` (`Title`) ON UPDATE CASCADE)
Does not change any row.
*/
-- The following statement raises an error.
-- DELETE FROM AUTHOR WHERE Name = "Virginia W.";
/*
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails
(`HW_EXAM_1`.`BOOK`, CONSTRAINT `BOOK_ibfk_1` FOREIGN KEY (`Work`) REFERENCES `WORK` (`Title`) ON UPDATE CASCADE)
Does not change any row.
*/
ROLLBACK;
-- We go back to the previous state.
-- You can now assume that there is more data than
-- what we inserted, if that helps you. Write a
-- command that selects …
-- We insert some dummy values for this
-- next part.
INSERT INTO WORK
VALUES (
"My Life",
"Paul B."),
(
"What to eat, 2",
"Virginia W.");
INSERT INTO BOOK
VALUES (
15355627,
"My Life",
DATE '20180219',
15.00),
(
12912912,
"What to eat, 2",
DATE '20200101',
13);
INSERT INTO EBOOK
VALUES (
15150628,
"My Life",
DATE '20190215',
10.89),
(
42912912,
"What to eat, 2",
DATE '20200115',
12);
-- … the price of all the ebooks.
SELECT Price
FROM EBOOK;
-- … the (distinct) names of the authors who have authored
-- a piece of work.
SELECT DISTINCT Author
FROM WORK;
-- … the name of the authors using fai.fr for their email.
SELECT Name
FROM AUTHOR
WHERE Email LIKE '%fai.fr';
-- … the price of the ebooks published after 2018.
SELECT Price
FROM BOOK
WHERE Published >= 20180101;
/*
Note that
SELECT Price FROM BOOK WHERE Published > 2018;
would return all the prices, along with a warning:
Incorrect datetime value: '2018'
*/
-- … the price of the most expensive book.
SELECT MAX(Price)
FROM BOOK;
-- … the number of pieces of work written by the author
-- whose name is “Virginia W.”.
SELECT COUNT(*)
FROM WORK
WHERE WORK.Author = "Virginia W.";
-- … the email of the author who wrote the piece of work
-- called “My Life”.
SELECT Email
FROM AUTHOR,
WORK
WHERE WORK.Title = "My Life"
AND WORK.Author = AUTHOR.Name;
-- the isbn(s) of the book containing a work written by the
-- author whose email is "vw@isp.net".
SELECT ISBN
FROM BOOK,
WORK,
AUTHOR
WHERE AUTHOR.Email = "vw@isp.net"
AND WORK.Author = AUTHOR.Name
AND BOOK.Work = WORK.Title;
/*
Write a command that updates the title of all the pieces of work written by the author whose name is “Virginia W. to”BANNED".
Is there any reason for this command to be rejected by the system? If yes, explain which one.
*/
-- The following statement raises an error.
/*
UPDATE
WORK
SET
Title = "BANNED"
WHERE
Author = "Virginia W.";
*/
/*
Gives an error, since "Title" is the primary key in the WORK table, and Virginia W. has authored two pieces of work or more,
they are both given the title "BANNED", which violates the unicity of value in primary keys.
*/
-- Write one or multiple commands that would delete the work
-- whose title is “My Life”, as well as
-- all
-- of
-- the
-- books
-- and ebooks versions of it.
-- The following statement raises an
-- error.
-- DELETE FROM WORK
-- WHERE Title = "My Life";
/*
Fails
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails
(`HW_EXAM_1`.`BOOK`, CONSTRAINT `BOOK_ibfk_1` FOREIGN KEY (`Work`) REFERENCES `WORK` (`Title`) ON UPDATE CASCADE)
*/
-- We have to first delete the corresponding publications:
DELETE FROM BOOK
WHERE WORK = "My Life";
DELETE FROM EBOOK
WHERE WORK = "My Life";
-- And then we can delete the work:
DELETE FROM WORK
WHERE Title = "My Life";
/*
And, no, we cannot delete "simply" from multiple tables in one command.
Some workaround exists, cf. https://stackoverflow.com/q/1233451/ .
*/Finally, to answer the last question, here is a list of the possible limitations:
AUTHOR and WORK tables) is not a good idea: we cannot have two authors with the same name or two pieces of work with the same title!BOOK and the EBOOK tables are going to be the same, then we should probably have only one table called PUBLICATION with a boolean to indicate whenever the publication is digital or on paper.ON DELETE CASCADE and ON DELETE RESTRICT is not really justified and makes the tables harder to use. We should have used the same update policy on both tables./*
code/sql/HW_TEXTBOOK_AUTHORED_SOL.sql
*/
/*
EXERCISE 1
Write a command that updates the email address of 'Gaddis', 'Tony' to "tgaddis@pearson.com"
*/
UPDATE
AUTHOR
SET Email = "tgaddis@pearson.com"
WHERE LName = 'Gaddis'
AND FName = 'Tony';
/*
You can use
SELECT * FROM AUTHOR;
to check that the modification took place.
*/
/*
EXERCISE 2
Write a command that inserts the textbook of your choice in the
TEXTBOOK table. No value should be NULL, but you can invent
the values.
*/
INSERT INTO TEXTBOOK
VALUES (
'Fundamentals of Database Systems',
9780133970777,
165.89);
/*
You can use
SELECT * FROM TEXTBOOK;
to check that the insertion was correctly made.
*/
/*
EXERCISE 3
Write a command that makes 'Gaddis', 'Tony' the author of the
textbook you just added to our database.
*/
INSERT INTO AUTHORED
VALUES (
9780133970777,
'Gaddis',
'Tony');
/*
You can use
SELECT * FROM AUTHORED;
to check that the insertion was correctly made.
EXERCISE 4
Write a command that makes "0.01" becomes the
default value for the Price attribute of the
TEXTBOOK relation.
*/
ALTER TABLE TEXTBOOK
ALTER COLUMN Price SET DEFAULT 0.01;
/*
You can use
DESCRIBE TEXTBOOK;
to check that the Price attribute now has a default
value.
EXERCISE 5
Write a command that insert a textbook of
your choice in the TEXTBOOK table, with the
price set to the default value.
*/
INSERT INTO TEXTBOOK
VALUES (
'Proof Theory',
9780486490731,
DEFAULT);
/*
You can use
SELECT * FROM TEXTBOOK;
to check that the insertion was correctly made.
EXERCISE 6
Write a command that creates a table called EDITOR
with 3 attributes, "Name", "Address" and "Website".
The "Name" attribute should be the primary key.
Then, insert two tuples in the EDITOR table, one
should have the "Name" attribute set to "Pearson".
*/
CREATE TABLE EDITOR (
NAME VARCHAR(30) PRIMARY KEY,
Address VARCHAR(255),
Website VARCHAR(100)
);
INSERT INTO EDITOR
VALUES (
'Pearson',
NULL,
'http://pearsoned.com/'),
(
'Dover',
NULL,
'https://store.doverpublications.com/');
/*
You can use
DESCRIBE EDITOR;
to check that the table was actually created, and
SELECT * FROM EDITOR;
to check that the values were inserted.
EXERCISE 7
Write a command that creates a table called PUBLISHED
with 2 attributes, "Editor", and "Textbook".
The "Editor" attribute should references the EDITOR
table, and the "Textbook" attribute should reference
the TEXTBOOK table.
*/
CREATE TABLE PUBLISHED (
Editor VARCHAR(30),
FOREIGN KEY (Editor) REFERENCES EDITOR (NAME),
Textbook CHAR(13),
FOREIGN KEY (Textbook) REFERENCES TEXTBOOK (ISBN)
);
/*
You can use
DESCRIBE PUBLISHED;
to check that the table was actually created.
EXERCISE 8
Write a command that makes "Pearson" the editor of
the textbook whose ISBN is 9780133776744.
*/
INSERT INTO PUBLISHED
VALUES (
"Pearson",
9780133776744);
/*
You can use
SELECT * FROM PUBLISHED;
to check that the table was actually created.
EXERCISE 9
Answer the following short questions. In our model, as it is, …
Can an author have authored more than one textbook?
Yes.
Can a textbook have more than one author?
Yes.
Can a textbook without ISBN be inserted in the TEXTBOOK relation?
No, unless you create a "dummy" (fake) value for it,
like 0000000000000, but this value can be used only
once, since ISBN is the primary key.
Can the price of a textbook be negative?
Yes. We can actually test it:
INSERT INTO TEXTBOOK VALUES ("Test", 0000000000000, -1);
Can two author have the same first and last name?
No. The query:
INSERT INTO AUTHOR VALUES ('Smith', 'Bob', NULL), ('Smith', 'Bob', NULL);
returns
ERROR 1062 (23000): Duplicate entry 'Smith-Bob' for key 'PRIMARY'
Can two textbooks have the same title?
Yes, as long as they have different ISBN. The command
INSERT INTO TEXTBOOK VALUES ("Test", 0000000000001, NULL), ("Test", 0000000000002, NULL);
is processed just fine.
Can two editiors have the same address?
Yes. The command:
INSERT INTO EDITOR VALUES ("Test 1", "123 Main St.", NULL), ("Test 2", "123 Main St.", NULL);
is processed just fine.
/*
code/sql/HW_CapstoneSol.sql
*/
/*
I. Short Questions (6 pts)
Answer the following short questions based on the model implemented above.
You can simply answer "True" or "False", or justify your reasoning (e.g. with code).
*/
-- 1. Can a project uses multiple programming languages?
-- Yes.
-- 2. Can a student be the leader of multiple
-- projects?
-- Yes.
-- 3. Can multiple projects have the same code name?
-- Yes.
-- 4. Could Claude simply enter NULL for the value
-- of his pproject's code name, since he's undecided?
-- No.
-- 5. Can a project be created without project
-- leader?
-- No.
-- 6. Can we know who is working on a project
-- without being its leader?
-- No.
/*
II. Relational Model (6 pts.)
Draw the relational model corresponding to this code.
You can hand-draw it and join a scan or a picture, or simply hand me back the sheet where you drew it.
*/
/*
III. Simple Commands (8 pts.)
Below, you are asked to write commands that perform various actions.
Please, leave them uncommented, unless you can't write them correctly, in which case it's ok to leave them commented.
The first question is answered as an example.
*/
-- 0. Write a command that list all the names of the
-- programming languages.
SELECT Name
FROM PROGRAMMING_LANGUAGE;
-- 1. Write a command that insert a new student in the
-- STUDENT table.
-- (You should invent the values).
INSERT INTO STUDENT
VALUES (
"Bob",
"0987654321234",
NULL,
NULL);
-- 2. Write a command that updates the code name of the
-- project ("Undecided", "9999999999999") to "VR in
-- ER".
UPDATE
PROJECT
SET CodeName = "VR in ER"
WHERE CodeName = "Undecided"
AND Leader = "9999999999999";
-- 3. Write a command that updates the graduation year of the
-- student whose id is "0987654321098" to 2024, and
-- the semester to "Fall".
UPDATE
STUDENT
SET GraduationYear = 2024,
GraduationSemester = "Fall"
WHERE id = "0987654321098";
-- 4. Write a command that changes the STUDENT table to make
-- it impossible to enter NULL for the first name of
-- a student, without changing the primary key.
ALTER TABLE STUDENT MODIFY FName VARCHAR(50) NOT NULL;
-- 5. Write a command that changes the datatype of
-- GraduationYear to SMALLINT.
ALTER TABLE STUDENT MODIFY GraduationYear SMALLINT;
-- 6. Write a command that adds an attribute "ReleaseDate" to
-- the PROJECT table.
ALTER TABLE PROJECT
ADD COLUMN ReleaseDate DATE;
-- 6.bis If you managed to write the previous command
-- correctly, write a command that sets the release
-- date of the project ("Brick Break",
-- "0123456789100")
-- to
-- the 26th of November 2022.
UPDATE
PROJECT
SET ReleaseDate = DATE "20221126"
WHERE CodeName = "Brick Break"
AND Leader = "0123456789100";
-- 7. Write a command that makes it impossible for a student
-- to be the leader in more than one project
-- (This command should return an error)
-- ALTER TABLE PROJECT ADD UNIQUE (Leader);/* code/sql/HW_VaccineSol.sql */
/*
I. Short Questions (3 pts.)
Answer the following short questions. In our implementation…
1. … can two companies have exactly the same name?
No, as COMPANY.Name is the only attribute in the primary key of COMPANY.
2. … can two companies have the same website?
Yes, nothing prevents it.
3. … can a company not have a website?
Yes, the domain of COMPANY.Website is "VARCHAR(255)", without a constraint preventing it from being "NULL".
4. … can the same vaccine be manufactured by multiple companies?
No, as VACCINE.Manufacturer is an attribute in VACCINE that accepts only one value.
5. … can a vaccine not have a manufacturer?
No, as VACCINE.Manufacturer bears the "NOT NULL" constraint.
6. … can a disease being neither communicable nor not communicable?
Yes, as DISEASE.Communicable is of type "BOOL", it accepts the "NULL" value.
7. … can the same vaccine have different efficacies for different diseases?
Yes, the EFFICACY table has for primary key VaccineName and DiseaseName, which implies that the same vaccine can occur repeatedly as long as it is associated with different diseases.
*/
/*
II. Longer Questions (6 pts.)
Answer the following questions:
1. What does `CHECK (Website LIKE "https://*")` do?
It refrains any value not starting with "https://" to be inserted as a value for the COMPANY.Website attribute.
Note that in particular it forbids a website from not being secured (that is, http:// is not a valid protocol).
2. Why did we picked the `DECIMAl(5,2)` datatype?
It is the appropriate datatype to represent percentage values represented as ranging from 100.00 to 0.00.
The discussion at https://stackoverflow.com/a/2762376/ also highlights that percent can be represented as decimal(5,4) with a check to insure that the value will range between 1.0000 and 0.0000.
3. What is the benefit / are the benefits of having a separate EFFICACY table over having something like
CREATE TABLE VACCINE(
Name VARCHAR(50) PRIMARY KEY,
Manufacturer VARCHAR(50),
Disease VARCHAR(50),
Efficacy DECIMAl(5,2),
FOREIGN KEY (Manufacturer) REFERENCES COMPANY (Name)
);
?
This implementation does not allow to record that the same vaccine can have different efficacies for different diseases.
Stated differently, it forbids to represent vaccines efficient against multiple diseases faitfully.
*/
/*
III. Relational Model (6 pts.)
Draw the relational model corresponding to this code.
You can hand-draw it and join a scan or a picture, or simply hand me back a sheet.
*/
/*
IV. Simple Commands (5 pts.)
Below, you are asked to write commands that perform various actions.
Please, leave them uncommented, unless
- you can not write them correctly, but want to share your attempt,
- it is specified that it should return an error.
The first question is answered as an example.
*/
-- 0. Write a command that list the names of
-- all the diseases.
SELECT Name
FROM DISEASE;
-- 1. Write a command that insert "Pfizer" in the
-- COMPANY table (you can make up the website or look
-- it)
INSERT INTO COMPANY
VALUES (
"Pfizer",
"https://www.pfizer.com/");
-- 2. Write a command that insert the "Pfizer-BioNTech
-- COVID-19 Vaccine" in the VACCINE table, and a
-- command
-- that store the efficacy of that vaccine against
-- the "Coronavirus disease 2019" disease
-- ( you can make up the values or look them up).
INSERT INTO VACCINE
VALUES (
"Pfizer-BioNTech COVID-19 Vaccine",
"Pfizer");
INSERT INTO EFFICACY
VALUES (
"Coronavirus disease 2019",
"Pfizer-BioNTech COVID-19 Vaccine",
89);
-- 3. Write a command that updates the name of the
-- company "Moderna" to "Moderna, Inc." everywhere.
UPDATE
COMPANY
SET Name = "Moderna, Inc."
WHERE Name = "Moderna";
-- 4. Write a command that lists the name of all the
-- companies.
SELECT Name
FROM COMPANY;
-- 5. Write a command that deletes the "Coronavirus disease
-- 2019" entry from the DISEASE table (if only!).
/*
DELETE FROM DISEASE
WHERE Name = "Coronavirus disease 2019";
*/
-- This command should return an error. Explain it and leave
-- the command commented.
-- The "Coronavirus disease 2019" value in DISEASE.Name
-- is
-- refereed to by two entries in the EFFICACY table.
-- As the foreign key from EFFICACY.DiseaseName to
-- DISEASE.Name does not specify its policy "ON DELETE",
-- its
-- default behavior is to restrict deletion, causing the
-- error.
-- 6. Write two commands: one that adds "physiological"
-- to
-- the possible types of diseases, and one that
-- inserts
-- a physiological disease in the DISEASE table.
ALTER TABLE DISEASE MODIFY TYPE ENUM ("infectious",
"deficiency", "hereditary", "physiological");
INSERT INTO DISEASE
VALUES (
"Asthma",
FALSE,
"physiological");
-- 7 (difficult). Write a command that return the list of
-- all the companies that manufacture a
-- vaccine against "Coronavirus disease
-- 2019".
SELECT VACCINE.Manufacturer
FROM VACCINE,
EFFICACY
WHERE VACCINE.Name = EFFICACY.VaccineName
AND EFFICACY.DiseaseName = "Coronavirus disease 2019";
The file code/sql/HW_ResidencySol.sql contains the solution to the code part of this problem.
To violate the entity integrity constraint, it suffices to insert a tuple with NULL as a value for one of the attributes of a primary key or to insert a value that was already inserted.
Two examples are:
INSERT INTO PERSON VALUES ("Bob", "Ross", NULL, DATE"1942-10-29");which would return ERROR 1048 (23000): Column 'SSN' cannot be null.
INSERT INTO HOUSE VALUES ("123 Main St.", "green");which would return ERROR 1062 (23000): Duplicate entry '123 Main St.' for key 'PRIMARY'.
To violate the referential integrity constraint, it suffices to insert a tuple where the value for one of the attributes of a foreign key does not exist in the referenced table.
For instance,
INSERT INTO RESIDENCY VALUES ("999-99-9999", NULL, NULL, NULL);would return
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`HW_Residency_SOL`.`RESIDENCY`, CONSTRAINT `RESIDENCY_ibfk_1` FOREIGN KEY (`Person`) REFERENCES `PERSON` (`SSN`))Since there is no row in the PERSON table with the value "999-99-9999" for SSN.
The answers can be found in the following snippet:
/*
In the following we use transactions
to be able to simulate the "what if"
aspect of the questions: we will not
commit the changes we are testing,
and roll back on them before moving to
the next question.
*/
-- Exercise 4
-- List the rows (i.e., P.2, H.1, or even
-- “none”)
-- modified by the following statements:
START TRANSACTION;
UPDATE
HOUSE
SET COLOR = "green";
-- H.1, H.2 and H.3
ROLLBACK;
START TRANSACTION;
DELETE FROM RESIDENCY
WHERE House LIKE "1%";
-- R.1,and R.3
ROLLBACK;
START TRANSACTION;
DELETE FROM HOUSE
WHERE Address = "456 Second St.";
-- H.2, R.2 and R.4 (because of the foreign key).
ROLLBACK;
START TRANSACTION;
-- Commented, because it causes an error.
-- DELETE FROM PERSON
-- WHERE Birthdate = DATE "1990-02-11";
-- None, because of the foreign key and
-- the
-- referential
-- integrity constraint.
-- ERROR 1451 (23000): Cannot delete or
-- update
-- a
-- parent
-- row:
-- a foreign key constraint fails
-- (`HW_RESIDENCY_SOL`.`RESIDENCY`,
-- CONSTRAINT
-- `RESIDENCY_ibfk_1` FOREIGN KEY
-- (`Person`)
-- REFERENCES
-- `PERSON` (`SSN`))
ROLLBACK;
The answers can be found in the following snippet:
-- Exercise 5
/* Write a query that selects …
… the addresses of the houses in the system (11 Third St., 123 Main St., 456 Second St.).
*/
SELECT Address
FROM HOUSE;
-- … the SSN of the persons whose first name was not
-- entered in the system (000-00-0000).
SELECT SSN
FROM PERSON
WHERE FName IS NULL;
-- … all the different colors of houses (white, blue).
SELECT DISTINCT COLOR
FROM HOUSE;
-- … the address of the residency of James Baldwin (123
-- Main St.).
SELECT House
FROM RESIDENCY,
PERSON
WHERE PERSON.Fname = "James"
AND PERSON.LName = "Baldwin"
AND PERSON.SSN = RESIDENCY.Person;
-- … the first name of the oldest person in the database
-- (James).
SELECT FName
FROM PERSON
WHERE Birthdate = (
SELECT MIN(Birthdate)
FROM PERSON
WHERE Birthdate IS NOT NULL);
-- … Michael Keal’s principal residency address (123 Main
-- St.).
SELECT RESIDENCY.House
FROM RESIDENCY,
PERSON
WHERE PERSON.FName = "Michael"
AND PERSON.LName = "Keal"
AND PERSON.SSN = RESIDENCY.Person
AND RESIDENCY.PrincipalResidence = TRUE;
-- … the (distinct) first and last names of the homeowners
-- (Michael Keal, Mridula Warrier).
SELECT DISTINCT (PERSON.FName),
PERSON.LName
FROM PERSON,
RESIDENCY
WHERE RESIDENCY.Status = "own"
AND RESIDENCY.Person = PERSON.SSN;
-- cf comment at snippet homonyms
SELECT PERSON.FName,
PERSON.LName
FROM PERSON
WHERE SSN IN ( SELECT DISTINCT (RESIDENCY.Person)
FROM RESIDENCY
WHERE RESIDENCY.Status = "own");
-- … the SSN of the persons that have the same principal
-- residency as James Baldwin
-- (000-00-0001).
SELECT RoomMate.Person
FROM RESIDENCY AS James,
RESIDENCY AS RoomMate,
PERSON
WHERE PERSON.FName = "James"
AND PERSON.LName = "Baldwin"
AND PERSON.SSN = James.Person
AND James.House = RoomMate.House
AND NOT James.Person = RoomMate.Person
AND RoomMate.PrincipalResidence = TRUE;
Note that the query that returns the name of the homeowners can be improved.
-- If we have homonymns in our database,
-- e.g.
INSERT INTO PERSON
VALUES (
"A",
"B",
"000-00-0010",
NULL),
(
"A",
"B",
"000-00-0011",
NULL);
INSERT INTO HOUSE
VALUES (
"H",
NULL);
-- H.3
INSERT INTO RESIDENCY
VALUES (
"000-00-0010",
"H",
TRUE,
"own"),
(
"000-00-0011",
"H",
TRUE,
"own");
-- Then the query below fails, in the sense that it reports
-- the name "A, B" only once.
SELECT DISTINCT (PERSON.FName),
PERSON.LName
FROM PERSON,
RESIDENCY
WHERE RESIDENCY.Status = "own"
AND RESIDENCY.Person = PERSON.SSN;
-- A better (and not much more complicated) solution would
-- have been
SELECT PERSON.FName,
PERSON.LName
FROM PERSON
WHERE SSN IN ( SELECT DISTINCT (RESIDENCY.Person)
FROM RESIDENCY
WHERE RESIDENCY.Status = "own");
To update the SSN of "James Baldwin" to "000-00-0010", we could use:
UPDATE PERSON SET SSN = "000-00-0010" WHERE FName = "James" AND LName = "Baldwin";However, this command would be rejected because of the foreign key constraint. On UPDATE, the foreign key from RESIDENCY.Person to PERSON.SSN restricts by default. The error would be:
ERROR 1451 (23000) at line 75: Cannot delete or update a parent row: a foreign key constraint fails (`HW_Residency_SOL`.`RESIDENCY`, CONSTRAINT `RESIDENCY_ibfk_1` FOREIGN KEY (`Person`) REFERENCES `PERSON` (`SSN`))In our model, as it is currently,
Considering the given state for the RESIDENCY table, the following two are possible primary keys:
Person and PrincipalResidencePerson and HouseThe first key would not accomodate a person with multiple secondary residencies, which is not a good thing. The second key could make sense, since it would refrain a person from declaring the same address twice as their residency. The only case that could be hard to work around is if a person was trying to own multiple units at the same address; however, this is more an issue with the primary key of HOUSE than an issue with the primary key we suggested for RESIDENCY.
/* code/sql/HW_ScientificResearchSol.sql */
-- List the rows affected (updated or deleted) by the
-- following commands.
-- If no rows are affected because the command would
-- would
-- violate the entity integrity constraint, the
-- referential
-- integrity constraint, or if there would be some
-- other
-- kind
-- of error, please indicate it.
START TRANSACTION;
/*
UPDATE
SCIENTIST
SET SSN = "000000001"
WHERE Name = "Claire";
*/
-- ERROR 1062 (23000) at line 106: Duplicate entry '1'
-- for
-- key 'PRIMARY'
ROLLBACK;
START TRANSACTION;
UPDATE
FUNDINGAGENCY
SET Name = "NSF"
WHERE Name = "National Science Foundation";
SELECT *
FROM FUNDINGAGENCY;
-- FA. 1
SELECT *
FROM FUNDS;
-- F.1
ROLLBACK;
START TRANSACTION;
/*
DELETE FROM FUNDINGAGENCY
WHERE Name = "French-American Cultural Exchange";
*/
-- ERROR 1451 (23000): Cannot delete or update a parent row:
-- a foreign key constraint fails
-- (`HW_SCIENTIFIC_RESEARCH`.`FUNDS`, CONSTRAINT
-- `FUNDS_ibfk_1` FOREIGN KEY (`Agency`) REFERENCES
-- `FUNDINGAGENCY` (`Name`) ON UPDATE CASCADE)
ROLLBACK;
-- List the name of the funding agencies created after 2000
-- ("French-American Cultural Exchange")
SELECT Name
FROM FUNDINGAGENCY
WHERE Creation >= 2000;
-- List the code of the projects that contains the word
-- "Airplanes" ("AA", "BA")
SELECT CODE
FROM PROJECT
WHERE Name LIKE ("%Airplanes%");
-- List the number of hours scientists contributed to the
-- project "AA" (18)
SELECT SUM(Hours)
FROM CONTRIBUTESTO
WHERE Project = "AA";
-- List the code of the projects to which the scientist named
-- Sabine contributed ("AA", "BB")
SELECT Project
FROM CONTRIBUTESTO,
SCIENTIST
WHERE SCIENTIST.Name = "Sabine"
AND SCIENTIST.SSN = CONTRIBUTESTO.Scientist;
-- Give the name of the projects who benefited from federal
-- funds ("Advancing Airplanes")
SELECT PROJECT.Name
FROM PROJECT,
FUNDS,
FUNDINGAGENCY
WHERE FUNDINGAGENCY.Type = "Federal"
AND FUNDINGAGENCY.Name = FUNDS.Agency
AND FUNDS.Project = PROJECT.Code;
-- Give the name of the scientist who contributed to the same
-- project as Mike ("Sabine", "James")
SELECT DISTINCT (Fellow.Name) AS "Mike's fellow"
FROM SCIENTIST AS Mike,
SCIENTIST AS Fellow,
CONTRIBUTESTO AS A,
CONTRIBUTESTO AS B
WHERE Mike.Name = "Mike"
AND Mike.SSN = A.Scientist
AND A.Project = B.Project
AND B.Scientist = Fellow.SSN
AND NOT Fellow.Name = "Mike";
-- List the name of the projects that are not funded by an
-- agency ("Better Airplanes", "Better Buildings")
SELECT DISTINCT (PROJECT.Name)
FROM PROJECT,
FUNDS
WHERE NOT PROJECT.Code IN (
SELECT FUNDS.Project
FROM FUNDS);
-- Give the name of the scientist who contributed the most
-- (in terms of hours) to the project named
-- "Advancing
-- Airplanes" (Sabine)
SELECT SCIENTIST.Name
FROM SCIENTIST,
CONTRIBUTESTO
WHERE CONTRIBUTESTO.Hours >= (
SELECT MAX(Hours)
FROM CONTRIBUTESTO,
PROJECT
WHERE PROJECT.Name = "Advancing Airplanes"
AND PROJECT.Code = CONTRIBUTESTO.Project)
AND CONTRIBUTESTO.Scientist = SCIENTIST.SSN;
id with auto-increment as a primary key for the computers, rooms and phones without having to come up with new names all the times.SELECT query that returns this information whenever they need it.Nickname and the ConnectedTo attributes be the primary key would solve their issue, but could potentiall introduce a lot of reduncdancy. The best way is probably to have a separate table that list the connections. Since computers and phones are in two different tables, this creates an additional challenge, since we would need to have a “connection table for computers”, and a “connection table for phones”. We recommend actually merging those two tables into one, that would additionaly have an attribute to set if the device is a phone or a computer.A possible solution would consist in
OS relation with attributes such as manufacturer, architecture, last update, and an id attribute for the primary key,Computer and Phone into a single Device relation, which would contain an additional attribute Type to distinguish between computers and phones, and a foreign key to OS,Connection relation whose attributes would be foreign keys to Device and Printer,ComputerOrPhoneInIt attribute,/* code/sql/HW_ComputerVariationAdvancedSol.sql */
START TRANSACTION;
DELETE FROM CONNEXION
WHERE Computer = 'A';
SELECT *
FROM COMPUTER;
SELECT *
FROM PERIPHERAL;
SELECT *
FROM CONNEXION;
ROLLBACK;
START TRANSACTION;
DELETE FROM COMPUTER
WHERE ID = 'A';
SELECT *
FROM COMPUTER;
SELECT *
FROM PERIPHERAL;
SELECT *
FROM CONNEXION;
ROLLBACK;
START TRANSACTION;
DELETE FROM PERIPHERAL
WHERE ID = '15';
SELECT *
FROM COMPUTER;
SELECT *
FROM PERIPHERAL;
SELECT *
FROM CONNEXION;
ROLLBACK;
START TRANSACTION;
DELETE FROM CONNEXION
WHERE Computer <> 'A';
SELECT *
FROM COMPUTER;
SELECT *
FROM PERIPHERAL;
SELECT *
FROM CONNEXION;
ROLLBACK;
SELECT TYPE
FROM PERIPHERAL
WHERE ID = '12';
SELECT ID
FROM COMPUTER
WHERE Model LIKE '%Apple%';
SELECT COUNT(ID)
FROM COMPUTER;
SELECT DISTINCT (TYPE)
FROM PERIPHERAL;
SELECT CONNEXION.Computer
FROM CONNEXION,
PERIPHERAL
WHERE PERIPHERAL.Type = 'keyboard'
AND PERIPHERAL.ID = CONNEXION.Peripheral;
SELECT COMPUTER.Model
FROM CONNEXION,
PERIPHERAL,
COMPUTER
WHERE PERIPHERAL.Model = 'TP-10 Thermal Matrix'
AND PERIPHERAL.ID = CONNEXION.Peripheral
AND CONNEXION.Computer = COMPUTER.ID;
INSERT INTO CONNEXION
VALUES (
'B',
'12');
SELECT *
FROM COMPUTER;
SELECT *
FROM PERIPHERAL;
-- Note that the "LastConnexion" attribute has been updated.
SELECT *
FROM CONNEXION;
This part of the lecture covers significantly more material than the other, hence we give the details of the references below:
Previous relational models have mistakes and limitations:
HW_Lecture and Grade tables, is difficult and error-prone.We could go back and forth between relational models (~ logical level) and SQL implementations (~ physical level), but we will use even more high-level tools (~ conceptual level):
Entity Relationship Models (ER, static: DB)
Unified Modelling Diagrams (UML, dynamic: DB + software)
Enhanced Entity Relationship Models (EER, adds operations to ER)
| Feature | Conceptual | Logical | Physical |
|---|---|---|---|
| (Main) Audience | Business | Designer | Programmer |
| Entity Names | ✔ | ✔ | |
| Entity Relationships | ✔ | ✔ | |
| Attributes | (✔) | ✔ | |
| Cardinalities | ✔ | ✔ | |
| Primary Keys | ✔ | ✔ | |
| Foreign Keys | ✔ | ✔ | |
| Column Data types (Domain) | (✔) | ✔ | |
| Table Names | ✔ | ||
| Column Names | ✔ |
The conceptual data model is (in theory at least) independent of the choice of database technology.
Remember that in relational models, relations were representing entities (Student) and relationships (Majors_In). At the conceptual level, and more particularly in ER diagram, the distinction is made between entities and relationship.
Data is organized into entities (with attributes), relationships between entities (with attributes as well).
Entity A :
Some vocabulary:
Attributes can be
{…} = multi-valued
(…) = complex
For instance, one could
{Address(Street, Number, Apt, City, State, ZIP)},A key attribute is an attribute whose value is distinct for each entity in the entity set.
| If the attribute is …, | then… |
|---|---|
| composite | other attributes are connected to it |
| multi-valued | the box have double lines |
| derived | the box have dotted lines |
| a key | the name of the attribute is underlined |
In the following, we’ll focus on the relationship between the entities more than on the attributes of particular entities, so we’ll sometimes simply draw
leaving the attributes un-specified (but that does not mean that they all have to be atomic) or even just
but that does not mean that the entity type have no attribute!
E1, … En participate in R, e1, …, en participate in r1, n is the degree.
Note that we can have Entity Set 1 = Entity Set 2, in which case we say the relation is recursive17.
Naming convention:
Convenient, and sometimes mandatory, to give role names.
If we want to stress that we are considering only one aspect of an entity type (that is, a person is not only an employee, a company is not only an employer, but this aspect is crucial for the “EMPLOYS” relation):
We can also use it to make the “right-side” and the “left-side” of a recursive relationship explicit:
Finally, we will sometimes use “Role Name of Entity 1 : Role Name of Entity 2” as a notation for the relation between them. For instance, we can write “Employer:Employee” to denote the “EMPLOYS” relation, and we will also use this notation when the relationship is between different entities, and write e.g. “PERSON:POSITION” for the “OCCUPIES” relation.
Two constraints, called “structural constraints”, applies to relationship types: cardinality ratio and participation constraint. They both concerns the number of relationship instances an entity can participate in (which is different from the cardinality of a relationship type).
Maximum number of relationships instances that an entity can participate in.
For binary relations, can be 1 : 1, N : 1, 1 : N, or M : N. The 1 stands for “at most 1”, and the M, N, and P stand for “possibly more than 1”, or “no maximum”. In ER diagram, we do not count, and do not make the distinction between “at most 5” and “at most 10”, for instance18.
Possible examples include:
| Relation | Possible Ratio | Explanation |
|---|---|---|
| MENTOR : MENTEE | 1 : N | “A mentor can have multiple mentees, a mentee has at most one mentor.” |
| PERSON : SSN | 1 : 1 | “A person has one SSN, a SSN belongs to one person.” |
| COURSE : DEPARTMENT | N : 1 | “A course is offered by one department, a department can offer any number of courses.” |
| STUDENT : TEAM | M : N | “A student can participate in multiple team, a team can have multiple students.” |
We indicate the ratio on the edges:
Note that reflexive relations can have any ratio as well. An example of M : N recursive relation could be:
Minimum number of relationships instances that an entity can participant it, a.k.a. “minimum cardinality constraint.”
The participation can be total (a.k.a. existence dependency, the entity must be in that relationship at least once) or partial (the entity may or may not be in that relationship).
Total is drawn with a double line, partial is drawn with a single line:
This reads “a course must be offered by a department, but a department may or may not offer courses.”
Relationships can have attributes too. The typical example is a date attribute, but other examples include
Note that an attribute on a relationship type can be atomic or composite, single or multi-valued, stored or derived, but that it cannot be a key attribute (after all, there are no entity to identify!).
Note that there are some moving aspects here: atributes on 1 : 1, 1 : N, N : 1 relationships can be migrated (to the N side when there is one, or to either side where there is none).
For instance, imagine that every phone uses exactly (= “at most and at least”) one carrier, that a carrier can provide network to multiple phones, and that the average quality of the network is an attribute in this relationship:
Then each instance of the relation would be of the form (“Phone X”, “Carrier Y”, “9/10”) for some way of ranking the average quality from 0 to 10. Note that, from the fact that the relationship is N : 1, this means that there is only one tuple involving “Phone X”: this means that the average quality could actually be seen as a property of the phone, and hence be migrated as an attribute to the phone side:
Note that we could not migrate the “average phone quality” to the “Carrier” side: imagine if we had the instances (“Phone X”, “Carrier Y”, “9/10”) and (“Phone Z”, “Carrier Y”, “3/10”), then should the attribute of “Carrier Y” be “9/10” or “3/10”: we have no way of deciding based on this model. Whenever it is a good choice to migrate this attribute or not will depend on the requirement of the models, and it may not always be appropriate to migrate the attribute to the entity. In the case of 1 : 1 relationship, migrating the attribute to both sides (i.e., to both entities) would be a mistake, since it would introduce redundancy in your model.
As an exercise, you can look at the relationships TEACHING, MENTORING and EMITED_DRIVING_LICENCE that are listed above, and see if the attributes can be migrated or not, and if yes, on which side.
Of course, relationships can have a degree higher than two. An example of a ternary relation could be:
To determine cardinality ratio, one should fix all but one parameters, and wonder how many values of the remaining parameter can be in that relationship. Another wording for the same idea can be found in this thread.
Four our example, Customer Y and Bank Z could be in relationship with more than one account (hence the “N”). On the opposite, Customer Y and Account K would be in relationship with only one bank (hence the “1” on the bottom), and Bank Z and Account K would belong to only one customer (hence the “1” on the left).
Let us look at two other examples. First, assume we want to collect information about the treatment prescribed by physicians to patients, we could use a relationship like the following one:
Where
Now, if we want to store information about who is the president of a country during a term, we could get something like:
Note that this representation of the data assumes that a citizen cannot be the president of two different countries during the same term (the right 1), which could be debatable.
It is sometimes impossible to do without relations with arity greater than 2. For instance, consider the following two diagrams19:
You should realize that they convey different information. For instance, you can know for a fact that a person visit a bookshop only if they bought something in it, while the second diagram de-correlate the act of buying with the visit to a bookshop. Similarly, the second diagram could give you a hint that a person that owns a copy of a book Z and visits a bookshop X that sells it could also visit it, but you will not know that for sure.
An example of recursive ternary relation could be:
An example of relation of degree 4 could be:
The cardinality ratio are computed using the same method as described before.
There are actually two sorts of entity types:
Weak (or child) entity types are identified by identifying / owner type that is related to it, in conjunction with one attribute (the partial key). That relation is called identifying (or supporting) relationship, and weak entities have a total participation constraint. The partial key is an attribute, that, when paired with an entity with which they are in relation through their identifying relationship, allows to identify a particular entity.
Weak entities and identifying relationships have a double border, and partial key have a dotted underline, as follows:
The idea here is that we do not need to gather data about all the dependent in the world, or in isolation, but are interested in dependent only if they are related to en employee in our database. Just having the name of a dependent is not enough to identify them, but having their name and the SSN of the employee they are related to is enough. The identifying relation always have ratio 1 : M or 1 : 1: a weak entity cannot be related to more than one entity of the owner type, so that M : N ratio are not possible (cf. e.g. https://dba.stackexchange.com/q/17207). If you need to have, for instance, a dependant connected to multiple employees, then that means that your dependent entity should be strong, because it has an existence “of its own”.
You may wonder why we do not represent weak entities simply as (composite, multi-valued) attributes of their owner type. For instance, why would we use
instead of
? The answer depends whenever we need to have the ability to represent our weak entities (here, PET) as being in relationship with other entities (that can themselves be weak!), as follows:
This would be impossible if PET was an attribute of FRIEND! Whenever the pet entity type is involved in other relationships or not should help you in deciding which representation to choose.
Another example of weak entity whose owner is weak as well could be:
The idea being that the Health care provider cares about an insure only if they are covered by them, and that they care about the doula only if they are currently helping one of their insure.
Multiple notations have been used to represent the ratio and constraint on relationship.
In the following, we introduce two of them: the Min/Max and the Crow’s foot notations.
The two constraints can be written on the same side, and the N, M, P ratio can be replaced by actual number, providing more information.
For instance,
could be drawn as
meaning that
More generally, we have the following:
Extended (or Enhanced) ER Models (EER) have additionaly:
Closer to object-oriented programming.
It is possible to go from relational models to ER models, and sometimes needed: if you are given an implementation that seems poorly design, this can be a way of “backing up” and thinking about the (sometimes implicit) choices that were made during the implementation, to eventually correct them.
For instance, consider the code we studied in “A First Example”:
CREATE TABLE STORM (
NAME VARCHAR(25) PRIMARY KEY,
Kind ENUM ("Tropical
Storm", "Hurricane"),
WindSpeed INT,
Creation DATE
);
-- We can change the enumerated datatype:
ALTER TABLE STORM MODIFY Kind ENUM ("Tropical Storm",
"Hurricane", "Typhoon");
CREATE TABLE STATE (
NAME VARCHAR(25) UNIQUE,
Postal_abbr CHAR(2) PRIMARY KEY,
Affected_by VARCHAR(25),
FOREIGN KEY (Affected_by) REFERENCES STORM (NAME) ON
DELETE SET NULL ON UPDATE CASCADE
);
It corresponds to the following relational model:
which in turn corresponds to the following ER diagram:
Looking at this diagram made it obvious that our code has a flaw: a stom can affect more than one state! Turning the 1 on the left-hand side of the “AFFECTS” relationship into a M is immediate on the diagram, but, of course, mapping it back to a relational model, and then implementing it correctly, will require more work. In any case, if you had not noted already this flaw, reverse-engineering this code highlighted it quite clearly.
If we look back at Problem 3.5 (Revisiting the PROF table), we had already made a first step, since we converted the code into the following relational model:
Going a bit further, we could extrapolate just a little bit and get the following ER diagram:
As we noted in our solution to the second question, this model has several limitations. To list a few, this representation can not handle the following situations:
Looking at it as an ER diagram should help you in understanding why we have those flaws, and how they could be addressed, and “testing” the model should be made easier in its ER form than as SQL code.
We have to map all of the following:
| Entity | Strong, Weak |
| Attributes | Composite, Key, Atomic, Multi-valued, Partial Key |
| Relationships | Binary (1 : 1, N : 1, 1 : N, N : M), n-ary |
Using four tools: Relations, Attributes, Primary Keys, Foreign Keys.
We will use three techniques to represent some of the relationships, the foreign key approach, the merged relations approach and the cross-reference approach. They are detailed and illustrated after the algorithm, which goes as follows:
| # | is mapped to | |
|---|---|---|
| 1 | Strong Entity | Relation with all the simple attributes. Decompose complex (composite) attributes. Pick a key to be the PK, if it is composite, take its elements. |
| 2 | Weak Entity | Relation with all the simple attributes. Decompose complex attributes. Add as a foreign key the primary key of the relation corresponding to the owner entity type, and make it a primary key, in addition to the partial key of the weak entity. If the owner entity type is itself weak, start with it. |
| 3 | Binary 1 : 1 Relationship Types | Foreign Key, Merge Relations or Cross-Reference approach |
| 4 | Binary 1 : N Relationship Types | Foreign Key or Cross-Reference approach |
| 5 | Binary M : N Relationship Types | Cross-Reference approach |
| 6 | n-ary Relationship Types | Cross-Reference approach |
| 7 | Multi-valued Attributes | Create a new relation, add as a foreign key the primary key of the relation corresponding to the original strong entity type. Make all the attributes be the primary key. |
whose primary key is the foreign key to the relation corresponding to the entity.
NOT NULL constraint on the attribute that is not part of the primary key anymore.Every time a relationships have attributes, they are mapped to the resulting relation.
Let us look in more details at some of those steps. For strong entities, using steps 1 and 7, the following:
would give:
And note that if Serial was a complex attribute, we would just “unfold” it, or decompose it, and make all the resulting attributes the primary key of the relation. If one of the attribute was at the same time multi-valued and composite, as follows:
Then we would obtain:
For relationships, things are a bit more complicated. Consider the following:
Since it is a 1 : 1 relationship where one of the side has a partial constraint, we have the choice between two approaches. The foreign key approach would give:
Note that we could also have added the foreign key on the side of ENT.B, referencing the key of ENT.A. But since ENT.A has a total participation constraint, we know that the value of FK will always exist, whereas some entities in ENT.B may not be in relationship with an entity from ENT.A, creating the (nefast) need for NULL values.
For the same diagram, the cross-reference approach would give:
Similarly, note that, in MAPPING, KeyB, or KeyA and KeyB, would also be valid primary keys, but that it makes more sense to have KeyA being the primary key, since we know that ENT.A has a total participation constraint, but ENT.B does not.
If both participation constraints were total, as follows:
Then we could use the merged relations approach, and get:
We picked KeyA to be the primary key for the same reason as before. Note that merging the two entities into one relation also means that you have eventually to do some work on the relations that were referring to them.
Of course, if ENT.A and ENT.B are the same entity (that is, REL is recursive), we would get:
or
depending on the approach we chose.
Binary 1 : N and binary M : N relationships are dealt with in a similar way, using foreign key or cross-reference approaches. The most difficult part of the mapping is with n-ary relationships: we have to use cross-reference approaches, but determining the primary key is not an easy task. Consider the following20:
The arity constraints here can be rephrased as:
And note that there is no total participation constraint.
To reprent the RESERVES relationship, we need to create a relation with attributes referencing the primary key of MEMBER, the primary key of TIME_SLOT, and the primary key of EQUIPMENT. Making them all the primary key does not represent the fact that the same equipment cannot be booked twice during the same slot, nor that a member can book only one equipment per slot, but allows members to reserve a particular equipment at multiple time slots. To improve this situation, we can either
Both solutions enforce only some of the requirement expressed by the ER diagram.
| ER Model | Relational Model |
|---|---|
| Entity type | Entity relation |
| 1 : 1 or 1 : N relationship type | Foreign key (or relationship relation) |
| M : N relationship type | Relationship relation and two foreign keys |
| n-ary relationship type | Relationship relation and n foreign keys |
| Simple attribute | Attribute |
| Composite attribute | Set of simple component attributes |
| Multivalued attribute | Relation and foreign key |
| Value set | Domain |
| Key attribute | Primary key |
You can have a look at e.g. http://holowczak.com/converting-e-r-models-to-relational-models/ to get a slightly different explanation of this conversion, and additional pointers.
What makes a good database? At the logical (conceptual) and physical (implementation) levels. We will answer belowe this question broadly, and will then use the concept of functional dependency to capture some of those notions precisely, mathematically, and be able to detect issues preventing our database from meeting the usual goals.
In general, a good data base should:
SELECT and select-project-join easy)Normally, consistency will simply follows if those goals are met.
For ER diagrams, some of the usual techniques21 are:
1 relation corresponds to 1 entity or 1 relationship type
NULL to insert tuples, especially on a key attribute!
(Bad!) Example:
---------- (Login, Name, AdvisoryName, AdvisorOffice, Major, MajorHead)
-----------(Office, PhoneNumber, Building)NULL Should Be RareNULL has 3 meanings, wastes space, and makes join / nested projections harder.
Example:
STUDENT(Login, …, siblingEnrolled)Transform into “Emergency Contact in University” relation (bonus: allow multiple contacts).
Example with advisorOffice and Office: if we try to write a join to obtain the phone number of a student’s advisor, we will obtain all the phone.
MARKER(Owner, Color, OwnerOffice, Brand, BrandEmail)
TEACHER(Office, Name, Phone)Corrected to:
MARKER(Owner, Color, B͟r͟a͟n͟d͟)
TEACHER(Office, N͟a͟m͟e͟, Phone)
BRAND(N͟a͟m͟e͟, Email)Functional dependencies (FD) is a formal tool used to assess how “good” a database is, a property of the relation schema. Functional dependencies list the constraints between two sets of attributes from the database. For instance, if X and Y are (sets of) attributes, X → Y reads “X fixes Y”, and implies that the value(s) of Y is fixed by the value(s) of X.
“What should be.”
Let us list all the attributes of our previous example:
MARKER.Owner, MARKER.Color, MAKER.Brand, TEACHER.Office, TEACHER.Name,
TEACHER.Phone, BRAND.Name, BRAND.EmailThink about their dependencies, and list them:
TEACHER.Name → TEACHER.OfficeBRAND.Name → BRAND.EmailTEACHER.Office → TEACHER.NameTEACHER.Office → TEACHER.PhoneMAKER.Owner and MARKER.Color → MARKER.Brand ?“What is.”, can disprove some of the assumptions made previously, but should not add new dependencies based on it (they may be by chance!).
TEACHER.Office → TEACHER.Name does not hold, because teachers share offices?TEACHER.Name → MARKER.Brand and MARKER.Color seemed to be enforced by the state, but we should not add a functional dependency based on that: there are no “requirement” that a Teacher must always buy the same brand and color, this could simply true be by chance so far and should not be imposed to the teachers.A particular state cannot enforce a FD, but it can negate one.
Example:
| Att. 1 | Att. 2 | Att. 3 |
|---|---|---|
| Bob | 15 | Boston |
| Bob | 13 | Boston |
| Jane | 12 | Augusta |
| Emily | 12 | Augusta |
| May hold | Will not hold |
|---|---|
| Att. 2 → Att. 3 | Att1 → Att2 |
| Att. 3 → Att. 2 | Att. 3 → Att. 2 |
| Att. 1 → Att. 3 | Att. 2 → Att. 1 |
| {Att. 1, Att. 2} → Att. 3 | {Att. 3, Att. 2} → Att. 1 |
Or, more conveniently:
If an attribute is a foreign key to another, we will draw an arrow between relations:
Note that:
We will assume that the consequence of those axioms always hold (“closure under those rules”), but will generaly not write them explicitely, since they do not carry any new or additional information.
Remember superkey (not minimal key), key, candidate key, secondary key? We now have a formal definition.
In one particular relation R(A1,…,An),
Given a FD {A1, …, An} → Y,
A FD : X → Y is a transivive dependency if there exist a set of attribute B s.t.
There exists multiple normal forms: First, Second, Third, Fourth, Fifth normal form (“X”NF), … Stronger than the Third, there is the Boyce-Codd NF (BCNF), but we will focus on the first three, that are “cumulative”: to you satisfy N, a relation have to satisfy N − 1, N − 2, etc. The normal form of a relation is the highest normal form condition that it meets.
The domain of all attributes must be atomic (simple, indivisible): exclude multi-valued and composite attributes.
Sometimes, additional requirement that every relation has a primary key. We will take this requirement to be part of the definition of 1NF, but some authors take a relation to be in 1NF if it has at least candidate keys (i.e., multiple possible keys, but no primary key, which makes their definition more general, cf. (Elmasri and Navathe 2015, 14.4.1)). Hence, we will always assume that a primary key is given, and it will be underlined.
This essentially consists in
1NF + Every non-prime attribute is fully functionnaly dependent on the primary key.
For each attribute A of the relation whose primary key is A1, …, An:
becomes
Refinment: note that if more than one attribute depends of the same subset {A′1, …, A′k}, we will create two relations: that is useless, we could have created just one. For instance, considering
applying the algorithm would give (the incorrect, since a foreign key can not refer two attributes in two different tables)
whereas a more subtle algorithm would give
Note that in both cases, all the relations are in Second Normal Form, though (and valid if we ignore the foreign key issue discussed above).
Note also that, sometimes, removing the “original” relation may be preferable: cf. an example in Problem 4.33 (COFFEE relation: primary key and normal form).
Note also that if our primary key is a singleton (a set with only one element), then there is nothing to do, we are in 2NF as soon as we are in 1NF: every functional dependency from a single element is always full!
2NF + no non-prime attribute is transitively dependent on the primary key.
For each attribute A of the relation whose primary key is A1, …, An:
We can have a look at another example:
Note that {State, Driver_Licence_Num}, would be a valid primary key for this relation, and that adding it would make it a relation in 1NF.
As we can see, the name “Driver” is somehow counter-intuitive, since the relation also carries information about Governors. This relation is actually not in 2NF, because the FD {State, Driver_Licence_Num} → Governor is not fully functional. A possible way to fix it is to get:
As you can see, the 2NF helped us in separating properly the entities.
An example of a relation that is in 2NF but not in 3NF could be:
As we can see, all the non-prime attributes are fully functionally dependent from Login, which is our primary key. But, obviously, one of this dependecy is transitive, and breaks the 3NF. A way to fix it is:
As we can see, 3NF also helped us in separating properly the entities, in a slightly different way.
In conclusion, we can observe that every FD X → Y s.t. X is a proper subset of the primary key, or a non-prime attribute, is problematic. 2NF is a guarantee that every entity has its own relation, 3NF is a way to avoid data inconsistency.
One approach for analysis, design, implementation and deployment of databases and their applications. Databases interact with multiple softwares and users, we need a common language.
Unified Modeling Language is a standard:
Wide, powerful, but also intimidating.
You know UML from object-oriented programming language:
That is an example of a class diagram (with class name, attributes and operators, as well as a particular way to represent that a class extends another) , there are other types of diagrams, they are not unrelated! For instance, using communication diagrams, deployment diagrams, and state chart diagrams, you can collect the requirements needed to draw a class diagram! They each offer a viewpoint on a software that will help you in making sure the various pieces will fit together: it is a tool commonly used in software engineering, and useful in database design.
There are 14 different types of diagrams, divided between two categories: structural and behavioral.
(Source: https://commons.wikimedia.org/wiki/File:UML_diagrams_overview.svg)
They describe structural, or static, relationships between objects, softwares.
In this category also exist Composite structure diagram, Package diagram and Profile diagram.
They describe the behavioral, or dynamic, relationship, between components.
Then there is the sub-category of “Interaction diagrams”:
This sub-category also comprise Timing diagram and Interaction overview diagram.
Looking at the “COMPANY conceptual schema in UML class diagram notation”, and comparing it with the “ER schema diagram for the COMPANY database” from the textbook, can help you in writing your own “Rosetta Stone” between ER and UML diagram. Let us introduce some UML terminology for the class diagrams.
| UML | ER |
|---|---|
| Class | Entity Type |
| Class Name | Entity Name |
| Attributes | Attributes |
| Operations (or Method) | Sometimes Derived Attributes |
| Association | Relationship Type |
| Link | Relationship Instance |
| Multiplicities | Structural Constraint |
As well as for ER diagram, the domain (or data type) of the attributes is optional. A composite attribute in a ER diagram can be interpreted as a structured domain in a UML diagram (think of a struct), and a multi-valued attribute requires to create a new class.
Associations are, to some extend, more expressive than relationship types:
min .. max, with * for “no maximum”, and the following shorthands: * stands for 0..* and 1 stands for 1..1. An association with 1 on one side and * on the other (resp. 1 and 1, * and 1, * and *) is sometimes called “one-to-many” (resp., “one-to-one”, “many-to-one”, “many-to-many”). The notation in partially inverted w.r.t. ER diagrams:Additionally, associations can be “extended”, and they are not the only kind of relationship that can be expressed between two classes.
Qualified associations can be used for weak entities, but not only.
Some of those subtleties depend on your need, and are subjective, but are important tool to design properly a database, and relieving the programmer from the burden of figuring out many details.
Name the three high-level models we will be learning about in this class (expand the acronyms).
What could be the decomposition of an attribute used to store an email address? When could that be useful?
What would be the benefit of having a composite attribute “Phone” made of two attributes (Number and Description) being multi-valued? Answer this question, and draw the resulting attribute.
Draw the ER diagram for a “COMPUTER” entity that has one multivalued attribute “Operating_System”, a composite attribute “Devices” (decomposed into “Keyboard” and “Mouse”) and an “ID” key attribute.
Draw the ER diagram for a “CELLPHONE” entity that has a composite attribute “Plan” (decomposed into “Carrier” and “Price”), an “MIN” (Mobile Identification Number) key attribute, and a multi-valued “App_Installed” attribute.
Name one difference between a primary key in the relational model and a key attribute in the ER model.
What is a derived attribute? Give two examples and justify them.
Invent an entity type with at least one composite attribute and one atomic attribute, but no multi-valued attributes. Identify a possible key attribute and draw the entity type you obtained using the conventions we used in class.
What is the degree of a relationship type?
What is a self-referencing, or recursive, relationship type? Give two examples.
What does it mean for a binary relationship type “Owner” between entity types “Person” and “Computer” to have a cardinality ratio of M : N?
What are the two possible structural constraints on a relationship type?
Draw the diagram for a “VideoGame” entity that would allow to store the name of the game, the supported platform(s) and the release date. Then, add a recursive relationship on that entity called “Is the sequel of” and specify all the constraints.
Draw a diagram to represent a relationship type R between two entities types A and B such that:
Express the constraints represented in the following diagram in plain English.
What does it mean for a binary relationship type “is the Chair of” between entity types “Professor” and “Department” to have a cardinality ratio of 1:N? Would it make sense to be have a total participation constraint on one side, and if yes, on which side?
Express the constraints represented in the following diagram in plain English.
For the following binary relationships, suggest cardinality ratios based on the common-sense meaning of the entity types.
| Entity 1 | Cardinality Ratio | Entity 2 |
|---|---|---|
| STUDENT | : | MAJOR |
| CAR | : | TAG |
| INSTRUCTOR | : | LECTURE |
| INSTRUCTOR | : | OFFICE |
| COMPUTER | : | OPERATING_SYSTEM |
Give an example of a binary relationship type of cardinality 1 : N.
Give an example of a binary relationship type of cardinality N : 1 and draw the corresponding diagram (you do not have to include details on the participating entity types).
Draw an ER diagram with a single entity type, with two stored attributes, and one derived attribute. In your answer, it should be clear that the value for the derived attribute can always be obtained from the value(s) of the other attribute(s).
Draw an ER diagram expressing the total participation of an entity type “BURGER” in a binary relation “CONTAINS” with an entity type “INGREDIENT”. What would be the cardinality ratio of such a relation?
Under what condition(s) can an attribute of a binary relationship type be migrated to become an attribute of one of the participating entity types?
Consider the following diagram:
Consider the following diagram:
Suppose a “PRODUCES” relationship with an attribute “Amount” exists between a “PRODUCER” entity type and a “MOVIE” entity type, with ratio 1 : M. Migrate the “Amount” attribute to one of the entity types and draw the resulting diagram.
Suppose a “MEMBERSHIP relationship with an attribute”Level” (e.g., “silver”, “platinium”, etc.) exists between a “PERSON” entity type and a “CLUB” entity type, with ratio M : 1. Migrate the “Level” attribute to one of the entity types and draw the resulting diagram.
Assume with have three entity types, “Lecture Notes”, “Class” and “Professor.”
You can specify role names in your diagrams for added clarity, and remember to list all the constraints.
Can we always replace a ternary relationship with three binary relationships? Give an example.
What is the difference between an entity type and a weak entity type?
What is a partial key?
Why do weak entity type have a total participation constraint?
Invent a weak entity type, its identifying (owner) entity type and the identifiying (or supporting) relationship. Both entities should have (partial) key, and each should have at least one composite attribute.
Convert the following ER diagram into a relational model:
Convert the following ER diagram into a relational model:
What is insertion anomaly? Give an example.
What is deletion anomaly? Is it a desirable feature?
Why should we avoid attributes whose value will often be NULL? Can the usage of NULL be completely avoided?
Consider the following relation:
PROF(S͟S͟N͟, Name, Department, Bike_brand)
Why is it a poor design to have a “Bike_brand” attribute in such a relation? How should we store this information?
Consider the following relation:
STUDENT(S͟S͟N͟, Name, …, Sibling_On_Campus)
Why is it a poor design to have a “Sibling_On_Campus” attribute in such a relation? How should we store this information?
Consider the following relational database schema:
STUDENT(L͟o͟g͟i͟n͟, Name, …, Major, Major_Head)
DEPARTMENT(C͟o͟d͟e͟, Name, Major_Head)
Assuming that “Major” is a foreign key referencing “DEPARTMENT.Code”, what is the problem with that schema? How could you address it?
Why can we not infer a functional dependency automatically from a particular relation state?
Consider the relation R(A,B,C,D,E,F) and the following functional dependencies:
For each set of functional dependency, give a key for R. We want a key, so it has to be minimal.
Consider the relation R(A,B,C,D,E,F) and the following functional dependencies:
A → {D, E}, D → {B, F}, {B, E} → A, {A, C} → {B, D, F}, A → F
Answer the following:
What is a composite attribute in a ER diagram? Can a relational schema with composite attribute be in Second Normal Form?
Consider the relation R(A,B,C,D) and answer the following:
Consider the relation R(A,B,C,D,E,F) with candidate keys {A, B} and C. Remember that, in all generality, to be a prime attribute, you just need to be part of a possible candidate key. Answer the following:
Consider the relation R(A,B,C,D,E) and the following functional dependencies:
For each one, give one candidate key for R.
Consider the relation R(A,B,C,D,E) and answer the following:
Consider the relation R(A,B,C,D,E,F), and let {B, D} be the primary key, and have additionnaly the functional dependencies {A, D} → E, C → F. This relation is not in 3NF, can you tell why?
Consider the relation R(A,B,C,D) and answer the following:
Consider the relation R(A,B,C,D,E) and the functional dependencies {A, B} → C, B → D, C → E. Answer the following:
What are the two different categories of UML diagram?
Can a C++ developer working on Linux and a Java developer working on MacOS use the same class diagram as a basis to write their programs? Justify your answer.
What kind of diagram should we use if we want to …
Name two reasons why one would want to use a UML class diagram over an ER diagram to represent a conceptual schema.
Consider the following diagram:
Give the number of attributes for both classes, and suggest two operations for the class that does not have any. Discuss the multiplicities: why did the designer picked those values?
Briefly explain the difference between an aggregation and a composition association.
How is generalization (or inheritance) represented in a UML class diagram? Why is such a concept useful?
Convert the following ER diagram into a UML class diagram:
Convert the following UML class diagram into an ER diagram:
The three high-level models we will be learning about are the Unified Modeling Language, Entity Relationship, and Enhanced Entity–Relationship models.
A useful decomposition of an email address attribute could be: the username part before the @ sign, and the domain part afterwards (that could even be sub-divided between the domain name and its top-level domain). It might be useful to have statistics about the domains of the users or to sort the usernames by length, etc.
Having a “Phone” attribute being multi-valued would allow to store multiple phone numbers for the same entity. Typically, one would want to store a pair (Number, Description) for their office phone, their cell, etc. The resulting attribute would be drawn as follows:
There can be more than one key in the ER model, but it has to be made of a single attribute, whereas a primary key can be made of multiple attributes.
A derived attribute is an attribute whose value can be determined by the value of other attributes. For instance:
- The value of an “Age” attribute could be determined from the value of an “Date of birth” attribute and the current day.
- The value of a “State” attribute can be determined from the value of a “Zip code” attribute.
- The value of a “Body Mass Index” attribute could be calculated from the values of height and weight attributes.
- The value of an “Initials” attribute could be determined using the values of the “First Name”, “Middle Name”, and “Last Name” attributes.
The degree of a realationship type is the number of its participating entity types.
A self-referencing relationship type is where the same entity type participates more than once. On a SEATS entity type, it would be an attribute like “is to the left of” or on a PERSONS entity type, it would be and attribute like “is married to”.
The cardinality ratio on the binary relationship type “Owner” between the entity types “Person” and “Computer” means that a person can own multiple computers, and a computer can have multiple owners.
The two possible structural constraints on a relationship type are the cardinality ratio and participation constraints.
We would obtain the following diagram:
Note that the “M / N” part could be discussed: having 1 instead of M would mean that a videogame that is a sequel is the sequel of at most one videogame. This could make sense, but would forbid for instance to register “Battletoads & Double Dragon - The Ultimate Team” as the sequel of both Double Dragon and Battletoads. Having 1 instead of N would mean that every videogame has at most one sequel: it would prevent from registering both “Super Mario Land” and “Super Mario World” as the sequels of “Super Mario Bros. 3”.
However, note that all the participation constraints are partial: having a total participation constraint would mean (on the left side) that every game is a sequel, or (on the right side) that every game have at least one sequel, two statements that are obviously wrong.
We would obtain the following diagram:
A key opens only one door, and every key must open at least one door. A door can be opened by multiple keys, and some doors may not be opened by any key (think of doors that do not have a lock).
The binary relation type “is the Chair of” with a cardinality ratio of 1:N between entity types “Professor” and “Department” means that a department can have at most one professor as its chair, but that a professor can be the chair of multiple departments. It could make sense to require that every department has a chair, hence writing a double line between the Department entity and the “is the Chair of” relationship, but it would not make sense to have a total participation constraint on the side of the professor (which would mean that every professor has to be the chair of a department).
An operating system may be supported by many computers, but it is also possible that no computer supports it (think of an operating system in development, or developed for embeded devices). A computer must support at least one operating system and can support multiple operating systems.
| Entity 1 | Cardinality Ratio | Entity 2 | Explanation |
|---|---|---|---|
| STUDENT | N : 1 | MAJOR | “A student has one major, but multiple students can have the same major” |
| CAR | 1 : 1 | TAG | “A car has exactly one tag, a tag belongs to one particular car.” |
| INSTRUCTOR | 1 : N | LECTURE | “An instructor can teach multiple lecture, but a lecture is taught by only one person.” |
| INSTRUCTOR | 1 : N | OFFICE | “An instructor can have multiple office, but an office belongs to only one instructor” |
| COMPUTER | M : N | OPERATING_SYSTEM | “A computer can have multiple operating system, the same operating system can be installed on more than one computer.” |
Some of these choices are debatable (typically, almost any combination seems reasonable for the INSTRUCTOR : OFFICE relation).
A binary of relationship of SUPERVISOR as a recursive relationship on EMPLOYEE.
An attribute of a binary relationship type can be migrated to one of the participating entity types when the cardinality ratio is 1 : N, 1 : 1, or N : 1. It can be migrated “to the N side” or, if there is no N side, to either side. Note that for n-ary relationships, at least one ratio needs to be 1 for the attribute to be allowed to migrate (and “to the N side”, or, if there is no N side, to any side).
We could have the following:
We could have the following:
A possible example of ternary relationship is:
One example of two binary relationships could be:
A question like
“Who wrote the lecture notes X?”
could be answered with the binary relationships but not the ternary. Conversely, a question like
“What are the lecture notes refered to by Prof. X in their class Y?”
could not be answered using the binary relationships (since we do not know what classes are taught by Prof. X).
No, a ternary relationship cannot always be replaced by three binary relationship. For instance, if I have a “Travelling to” relationship between a “Person”, a “City” and a “Transport mode”, to represent the fact that a person is travelling to a city using a particular mode of transportation, there is no way I can convey the same information using binary relationships.
The weak entity type does not have a key attribute, it cannot be distinguised from the other weak entities based on a single attribute, for that we also need to know its relationship to some other entity type.
For a weak entity attribute, it is the attribute that can uniquely identify weak entites that are related to the same owner entity.
Otherwise, we could not identify entities in it without owner entity.
A possible solution is:
Note that the two composite attributes are “generic”, in the sense that you can re-use those examples easily.
A possible option is:
Note that “Stays_At” could also be a separate relation, with two attributes, “Address” and “Person”, linked to respectively PLACE.Address and PERSON.SSN, and both being the primary key of the relation.
A possible option is:
Note that to more faithfully represent the total participation constraints, one could add NOT NULL attributes to TakenBy, CustomerFName and CustomerLName in RESERVATION.
When you have to invent a primary key or add a lot of NULL value to be able to add a tuple. I want to add a room in my DB, but the only place where rooms are listed are as an attribute on a Instructor table, so I have to “fake” an instructor to add a room.
A delete anomaly exists when certain attributes are lost because of the deletion of other attributes. It is not desirable, since it can lead to the loss of information.
Because they waste space, they are ambiguous (N/A, or unknown, or not communicated?), and they make querries harder. No, it is necessary sometimes.
Because it will be NULL most of the time. In a separate relation, e.g. a “BIKE” relation, with two attributes, “Owner” and “Brand”, “Owner” being a foreign key referencing the SSN attribute of PROF.
Because it will be NULL most of the time, and because students could have more than one sibling on campus. In a separate relation, e.g. in a “EMERGENCY_CONTACT” relation, with two attributes, “Student” (refercing the SSN attribute of STUDENT), and “Contact”. If the emergency contacts are not related to the student, or if we want to preserve the fact that one student is a sibling to another, we can create another relation to store that information.
Major_Head will give update anomalies. By putting the Head of the department in the DEPARTMENT relation only, i.e., removing it from STUDENT.
Just because a coincidence exists (i.e., “in my data set, no android user is color-blind”) does not mean that it will always be true (i.e., “no color-blind person will ever use android”). Functional dependencies should come from a principled reasoning about the attributes, and not from the observation of the data.
A composite attribute is an attribute made of multiple attributes, like an “Address” attribute could be composed of the “sub”-attributes “Street”, “City”, “Zip” and “State. A relational schema needs a primary key and to have only atomic domains to be in first normal form, so, no, a relational schema with composite attributes can not be in second normal form.