1.2. Data models

A data model is a model that describes in an abstract way how data are represented in a business organization, an information system or a database management system - Wikipedia
[Section notes PDF 310Kb].

1.2.01. Introduction

Relational model
Abstract operations on relations

Set theoretic operations
Relational-specific operations

Basic algebra operations

Union, Intersection, Difference
Cross product

1.2.02. Relational model

Table as relation
Row as tuple

real world entity or relationship
fact

Column as attribute

Domain

Schema-state divide for Relational schema

The concept of a relation is abstract, therefore we have a number of different ways of visualising it.

1.2.03. Relation

Relation schema R(A₁, A₂, A₃.. A_n)

Design side
Assertion/declaration

Relation state

Data side
set of n-tuples

each one an ordered list of values

1NF: each value is atomic, no composite/multivalue

Data-design divide

1.2.04. Abstract operations

Database lifecycle

design, populate, evolve

Insert

tuple (a₁,a₂,a₃…a_n)

Delete

tuple (a₁,a₂,a₃…a_n)

Update (or modify)

tuple (a₁,a₂,a₃…a_n)
attribute to change, new value

Insert, update and delete

All the operations described in the next few sections are abstract. We're going to see how valuable they can be in processing real world data later.

1.2.05. Basic algebra

Two categories

Set theoretic operations

Union, Intersection etc.

Relational specific

Select, project and join

At this stage we're talking about set theoretical operators on the Relational model, not SQL instructions which confusingly have identical names and only similar behaviour.

1.2.06. Select operation

SELECT a subset of tuples from a relation

Uses selection condition

Evaluate each tuple to true of false
False tuples discarded

Sigma (s)
output = s_(cond)(input_relation)
Relation schema: R(output) = R(input_relation)
Commutative

1.2.07. Project operation

PROJECT a subset of attributes for all tuples from a relation

Pi (p)
p_{<attribute list>}(R)

If sublist is only non-key attributes

might get duplicates

Removes duplicates
Attribute list:sublist example

The result set of the operation is itself a relational. That output relation will contain the same number of rows as the input, however it may contain a different number of columns; fewer if a subset of attributes is projected; more if derived or aggregated attributes are included.

1.2.07. Sequences of operations

Select followed by projection
Area clipping: rows then columns
p_{<attr list>}
(s_{(select_cond)}(R))
Rename operation (r)

Renames attributes list2 from list1
r_{(new_attr_names)}(R)

Select, Project and composite abstract operations

1.2.08. Rename operation

Attribute renaming only

Cannot alter domain, or add/remove attr

Rename operation (r)

Renames attributes list2 from list1
r_{(new_attr_names)}(R)

Implicit renaming

Order dictated by relational schema

1.2.08. Set Theoretic

Binary operation: two relations

Sets of tuples
Union compatibility (same attributes)

Union (R u S)
Intersection (R n S)

Commutative (R u (S u T) = (R u S) u T)

Relation Union operation on compatible relations

1.2.08b. Set difference

Set difference

Non-commutative (R-S != S-R)

Set difference example

1.2.09. Cross product

Cartesian product of two relations
R x S
Also known as

Cross product
Cross join

Cross product diagram

Introduction to complexity

Computationally explosive

colour-coded cross product of two relations

1.2.10. Relational algebra/model notation

Relational schema R(A₁, A₂,…,A_n)
Relation state r or r(R)

Set of unordered tuples
r = {t₁, t₂,…,t_n}

Each n-tuple is an ordered list of values

t = <v₁, v₂,…,v_n>

i^th value in t = v_i called t[A_i]
r(R) subset of (dom(A₁) x dom(A₂)... x dom(A_n))

1.2.11. Constraints

Domain constraint

For all v in t of r(R)
v_i is an element of dom(A_i)

Entity constraint

K = SK_min
t[K] != null

Key constraint

Superkey SK as identifying subset of attributes
t₁[SK] != t₂[SK]

1.2.12. Referential integrity

Given two relations R₁ and R₂

R₁ contains a foreign key (FK) that references
A primary key (PK) in R₂

R₁ referencing relation, R₂ referenced relation

Shared domains: dom(FK) = dom(PK)
Foreign exists: t₁ in r(R₁), t₂ in r(R₂)

t₁[FK] = t₂[PK] || NULL

Database Training Course

1.2. Data models

1.2. Data models

1.2.01. Introduction

1.2.02. Relational model

1.2.03. Relation

1.2.04. Abstract operations

1.2.05. Basic algebra

1.2.06. Select operation

1.2.07. Project operation

1.2.07. Sequences of operations

1.2.08. Rename operation

1.2.08. Set Theoretic

1.2.08b. Set difference

1.2.09. Cross product

1.2.10. Relational algebra/model notation

1.2.11. Constraints

1.2.12. Referential integrity