Tuple means the number of lines

2.1 Database systems

2.1.1 Relational databases

In relational databases, the data is stored in tables or Relations saved. Each relation consists of a certain number of columns or Attributes as well as a set of lines or Tuples (Sentences). The number of attributes determines the Degree, the number of tuples the Cardinality a relation. Fig. 2-1 shows an example of two relations of degree 3 and cardinality 6 and 5, respectively.

Each attribute has a so-called domain (Domain of definition) which defines the permissible values. The description of a relation includes the name of the relation and its attributes as well as the associated domains. The expression of the relation is given by the set of tuples and corresponds to a subset of the Cartesian product over the attribute domains. The set property of relations means that no tuple may appear more than once and that there is no predetermined order among the tuples.

The relation model prescribes two model-inherent integrity conditions, the so-called Relational invariants. The primary key condition requires that for each relation an attribute or a combination of attributes as Primary key acts to uniquely identify tuples. In Fig. 2-1, for example, the attributes KTONR (relation ACCOUNT) and KNR (relation CUSTOMER) form suitable primary keys. The second constraint inherent in the model concerns so-called Foreign key, with which relationships between relations can be realized. In Fig. 2-1 the attribute KNR (customer number) in relation to ACCOUNT is such a foreign key with which the owner of an account is represented by a reference to the primary key of the relation CUSTOMER. The foreign key condition requires that the tuple referenced by a foreign key value exists in the database, i.e. that a corresponding primary key value must be defined in the referenced relation (referential integrity).

To describe relational queries, we mainly use the Relational algebra back. Relational algebra allows for a compact representation of operations and explicitly specifies the basic operators to be executed when processing the query. The operators of the relational algebra are set-oriented, since they in turn generate a relation from one or two input relation (s). In addition to general set-theoretic operators such as union (), Averaging (), Cartesian product () or difference are of particular interest in the relational operators selection (restriction), projection and join (join). With the selection


a line-by-line (horizontal) subset of a relation R is formed that contains all tuples that satisfy the selection predicate P. An example of this can be found in Fig. 2-2a. The projection


forms a column-wise (vertical) subset of the input relation, whereby all unspecified attributes are "filtered out" (example: Fig. 2-2b). Duplicates are also eliminated in order to preserve the quantity property of the result relation. The Composite () finally allows the linking of two relations that have attributes with matching domains. By far the most important composite type is Peer group (Equi-Join), in which the link is defined by an equality condition on the compound attributes. The equal connection between relation R with compound attribute r and relation S with compound attribute s can be defined as a selection on the Cartesian product between R and S:

The natural composite between two relations in which the compound attributes have the same name is defined as an identical compound between the relations, whereby the compound attributes appear only once in the result. Due to the naming convention, the join condition can be omitted with the natural join (R S). Fig. 2-2c shows an example of the natural bond.
The Semi-join between two relations R and S (R S) is defined as a join, the result of which only contains attributes of relation R:

Fig. 2-2d shows an example of a (natural) semi-join. Semi-joins have become particularly important in the context of distributed database systems. They allow the result set to be reduced compared to the normal join, which can be used to reduce the amount of data to be transferred (see Section 6.5.2).

Since the operators take relations as input and in turn generate relations, they can be combined with one another to form more complex queries. An example of this is shown in Fig. 2-2e.

In practice, DB queries are of course formulated in a more user-friendly and descriptive query language than relational algebra. Here, SQL has clearly prevailed. This query language is supported by almost all database systems and is standardized within the framework of ISO. The current standard (SQL2, SQL92) is very powerful and comprehensive compared to its previous versions and is currently only partially covered by existing implementations. For this book, only a small knowledge of SQL is required, as only a few examples are formulated in SQL, which should also be largely self-explanatory. For example, the query from Fig. 2-2e would be as follows in SQL:

Special features of SQL, which of course also have to be supported by multi-computer DBS, concern the calculation of so-called aggregate functions (MIN, MAX, SUM, COUNT, AVG) as well as the optional sorting of result sets. For an introduction to SQL, see the abundant literature; the current language scope of ISO-SQL is described e.g. in [DD92, MS92].