Consider a flight information The following attributes exist
Solution
Restrictions on the database schema
- precludes certain undesirable properties from DB - NFs represent good DB design to be specified
First normal form (1NF)
- R is in 1NF if values in dom(A) are atomic for every attribute A in R: not multi-valued
- accurate expressions of FDs
- non-1NF relations force a complex DML to manipulate - is it reasonable to require 1NF?
First normal form : Normalizing the airline relation into NF :
airline-iNF(airline,fno , d_city, d_date,d_time,d_gate,dest_city,a_date,a_time,a_gate,avail_seats,pass_list)
For new applications of relational databases that would benefit from non-1NF representation, new NFs and criteria for good design are necessary
2NF and 3NF arise from trying to avoid update anomalies and redundancy
<ex> Assign
(Flight_no d_date a_gate d_gate) --------------------------------------------
112 1 May 11 2016 11 13
(Flight_no, d_date) is a key for the relation another FD: d_date
- update (Assign: 112, 3 May,12, 8) violates the FD
- to avoid the violation, the system must scan the relation, and update the Gate# everywhere the Flight# appears
--- we don’t want to do it
We want ...
(1) change only one tuple
(2) reduce (eliminate if possible) redundancy
In the example above, (Flight#, Gate#) is replicated in the relation, making the information redundant, and update complicated.
Solution: decomposition
Passign
Gassign
(Flight_number d_date a_date ------------------------------
112 1 May 11 13
Full dependency
Given a set of FDs F and a FD X Y in F+,
Y is fully dependent on X if Y is not functionally dependent on any proper subset of X.
Otherwise, Y is partially dependent on X.
<ex> FD={Flight Day Pilot Gate, Flight Gate} - Gate is partially dependent on (Flight Day)
- Pilot is fully dependent on (Flight Day)
Prime
Attribute A is prime in relation schema R
if A is a member of some candidate key of R. Otherwise, A is non-prime in R.
2NF
R is in 2NF w.r.t. a set of FDs F, if it is in 1NF and every non-prime attribute is fully dependent on every key of R
<ex> Assign is not in 2NF,
since Gate is partially dependent on (Flight Day). Passign and Gassign are in 2NF.
Requiring 2NF does not solve all the problems
Consider Student (S#, Name, Major-dept, Chair) with S# as key
- no partial dependency, but remaining anomalies
(1) To change Chair, all the tuples need to be changed
(2) If a department has no student, can it exist? Can it have a chair?
3NF and transitive dependency :
Why those problems are still there?
- intuitively, the chair is an attribute of
a department entity, not of a student entity
- we have S# Major-dept Chair Transitive dependency
An attribute A is transitively dependent on X ifforsomeattributeY,X Y Y A Y X A XY
<ex> R = (Flight, Day, Pilot-id, Name) FD = {FD P, P N, N P}
Name is transitively dependent on (Flight, Day) because FD P, P FD, P N, N FDP
3NF
R is in 3NF if no non-prime attribute of R is transitively dependent on any key of R.
--- no transitive dependency
In the example above, R is not in 3NF because of a transitive dependency between Name and (Flight Day)
Solution: decomposition
R1 = (Flight, Day, Pilot-id) R2 = (Pilot-id, Name)
Then R = {R1, R2} is in 3NF w.r.t. F
<Theorem> If R is in 3NF, then R is in 2NF.
Proof: If R is in 3NF but not in 2NF, then there must be a partial dependency X A, where A is a non-prime and X is a proper subset of some key Y.
ThenY X(Yisakey),X A(given),X Y(X Y), and A X (A is non-prime)
A is transitively dependent on Y
R is not in 3NF, a contradiction.
3NF allows two FDs X Y and Y A in the same relation onlyifX YandY X
--- both X and Y are keys
It leads to decomposing independent FDs into different relation schema, where they may be updated independently
This suggests that more joins will be performed at run-time
- system performance for such joins can be improved by physically storing such relations near to each other
1NF, 2NF, and 3NF are very popular
- other normal forms and more theory exist, but only those simpler ones have achieved widespread acceptance
DEcomposition into 3NF:
Given R and F
1. If there are no transitive dependencies in R, we are done
2. Otherwise, we have a transitive dependency on a key in R Let K be a key, Y R, A a non-prime attribute in R, suchthatK Y,Y K,Y A,andA KY
3. Let R1=R-A and R2=YA
K is a key for R1, Y is a key for R2
--- one transitive dependency is removed from R
4. Other attributes that are non-prime and R (KY)
that are dependent upon Y can be placed in R2 at the same time
--- speed up
R1 (r) R2 (r) = r (lossless join)
Let R1 and R2 be relation schema with R1 R2=X where X R1 (or R2)
Then for any relation r with schema R1R2
r= R1 (r) R2 (r)
In step 3 of decomposition, R1=R A, R2=YA R1 R2=Y
Y A (given)
Y YA (augmentation)
Hence Y R2
Decomposition cannot go on forever, since each time we decompose R, resulting schema R1 and R2 are smaller,
and no transitive dependency can exist with only two attributes
Decomposition is not unique
4NF : Fourth normal form (4NF) is a level of database normalization where there are no non-trivial multivalued dependencies other than a candidate key. It builds on the first three normal forms (1NF, 2NF and 3NF) and the Boyce-Codd Normal Form (BCNF).


