Вся предоставленная на этом сервере информация собрана нами из разных источников. Если Вам кажется, что публикация каких-то документов нарушает чьи-либо авторские права, сообщите нам об этом.
In the previous section
(Relational Data Model Formalities)
we defined the mathematical notion of
the relational model. Now we know how the data can be stored using a
relational data model but we do not know what to do with all these
tables to retrieve something from the database yet. For example somebody
could ask for the names of all suppliers that sell the part
'Screw'. Therefore two rather different kinds of notations for
expressing operations on relations have been defined:
The Relational Algebra which is an
algebraic notation,
where queries are expressed by applying specialized operators to the
relations.
The Relational Calculus which is a
logical notation,
where queries are expressed by formulating some logical restrictions
that the tuples in the answer must satisfy.
The Relational Algebra was introduced by
E. F. Codd in 1972. It consists of a set of operations on relations:
SELECT (σ): extracts tuples from
a relation that
satisfy a given restriction. Let R be a
table that contains an attribute
A.
σA=a(R) = {t ∈ R ∣ t(A) = a}
where t denotes a
tuple of R and t(A)
denotes the value of attribute A of
tuple t.
PROJECT (π): extracts specified
attributes (columns) from a
relation. Let R be a relation
that contains an attribute X.
πX(R) = {t(X) ∣ t ∈ R},
where t(X) denotes the value of
attribute X of tuple t.
PRODUCT (×): builds the Cartesian product of two
relations. Let R be a table with arity
k1 and let
S be a table with
arity k2.
R × S
is the set of all
k1
+ k2-tuples
whose first k1
components form a tuple in R and whose last
k2 components form a
tuple in S.
UNION (∪): builds the set-theoretic union of two
tables. Given the tables R and
S (both must have the same arity),
the union R ∪ S
is the set of tuples that are in R
or S or both.
INTERSECT (∩): builds the set-theoretic intersection of two
tables. Given the tables R and
S,
R ∪ S is the
set of tuples
that are in R and in
S.
We again require that R and
S have the
same arity.
DIFFERENCE (− or ∖): builds the set difference of
two tables. Let R and S
again be two tables with the same
arity. R - S
is the set of tuples in R but not in
S.
JOIN (∏): connects two tables by their common
attributes. Let R be a table with the
attributes A,B
and C and
let S be a table with the attributes
C,D
and E. There is one
attribute common to both relations,
the attribute C.
R ∏ S = πR.A,R.B,R.C,S.D,S.E(σR.C=S.C(R × S)).
What are we doing here? We first calculate the Cartesian
product
R × S.
Then we select those tuples whose values for the common
attribute C are equal
(σR.C = S.C).
Now we have a table
that contains the attribute C
two times and we correct this by
projecting out the duplicate column.
Example 2-2. An Inner Join
Let's have a look at the tables that are produced by evaluating the steps
necessary for a join.
Let the following two tables be given:
R A | B | C S C | D | E
---+---+--- ---+---+---
1 | 2 | 3 3 | a | b
4 | 5 | 6 6 | c | d
7 | 8 | 9
First we calculate the Cartesian product
R × S and
get:
R x S A | B | R.C | S.C | D | E
---+---+-----+-----+---+---
1 | 2 | 3 | 3 | a | b
1 | 2 | 3 | 6 | c | d
4 | 5 | 6 | 3 | a | b
4 | 5 | 6 | 6 | c | d
7 | 8 | 9 | 3 | a | b
7 | 8 | 9 | 6 | c | d
After the selection
σR.C=S.C(R × S)
we get:
A | B | R.C | S.C | D | E
---+---+-----+-----+---+---
1 | 2 | 3 | 3 | a | b
4 | 5 | 6 | 6 | c | d
To remove the duplicate column
S.C
we project it out by the following operation:
πR.A,R.B,R.C,S.D,S.E(σR.C=S.C(R × S))
and get:
A | B | C | D | E
---+---+---+---+---
1 | 2 | 3 | a | b
4 | 5 | 6 | c | d
DIVIDE (÷): Let R be a table
with the attributes A, B, C, and D and let
S be a table with the attributes
C and D.
Then we define the division as:
R ÷ S = {t ∣ ∀ ts ∈ S
∃ tr ∈ R
such that
tr(A,B)=t∧tr(C,D)=ts}
where
tr(x,y)
denotes a
tuple of table R that consists only of
the components x and y.
Note that the tuple t only consists of the
components A and
B of relation R.
Given the following tables
R A | B | C | D S C | D
---+---+---+--- ---+---
a | b | c | d c | d
a | b | e | f e | f
b | c | e | f
e | d | c | d
e | d | e | f
a | b | d | e
R ÷ S
is derived as
A | B
---+---
a | b
e | d
For a more detailed description and definition of the relational
algebra refer to [Ullman, 1988] or
[Date, 1994].
Example 2-3. A Query Using Relational Algebra
Recall that we formulated all those relational operators to be able to
retrieve data from the database. Let's return to our example from
the previous
section (Operations in the Relational Data Model)
where someone wanted to know the names of all
suppliers that sell the part Screw.
This question can be answered
using relational algebra by the following operation:
We call such an operation a query. If we evaluate the above query
against the our example tables
(The Suppliers and Parts Database)
we will obtain the following result:
The relational calculus is based on the
first order logic. There are
two variants of the relational calculus:
The Domain Relational Calculus
(DRC), where variables
stand for components (attributes) of the tuples.
The Tuple Relational Calculus
(TRC), where variables stand for tuples.
We want to discuss the tuple relational calculus only because it is
the one underlying the most relational languages. For a detailed
discussion on DRC (and also
TRC) see
Date, 1994
or
Ullman, 1988.
The queries used in TRC are of the following
form:
x(A) ∣ F(x)
where x is a tuple variable
A is a set of attributes and F is a
formula. The resulting relation consists of all tuples
t(A) that satisfy F(t).
The relational algebra and the relational calculus have the same
expressive power; i.e. all queries that
can be formulated using relational algebra can also be formulated
using the relational calculus and vice versa.
This was first proved by E. F. Codd in
1972. This proof is based on an algorithm ("Codd's reduction
algorithm") by which an arbitrary expression of the relational
calculus can be reduced to a semantically equivalent expression of
relational algebra. For a more detailed discussion on that refer to
Date, 1994
and
Ullman, 1988.
It is sometimes said that languages based on the relational calculus
are "higher level" or "more declarative" than languages based on
relational algebra because the algebra (partially) specifies the order
of operations while the calculus leaves it to a compiler or
interpreter to determine the most efficient order of evaluation.