# SQL Excel: Dataflows, SQL, and Relational Algebra

Beneath the skin of most relational databases is an engine that is essentially a
dataflow engine. Dataflows focus on data and SQL focuses on data, so they are
natural allies.

Historically, though, SQL has a somewhat different theoretical foundation, one based
on mathematical set theory. This foundation is called relational algebra, an area
of mathematics that defines operations on unordered sets of tuples. A tuple is a lot
like a row, consisting of attribute-value pairs. The “attribute” is the column and
the “value” is the value of the column in the row. Relational algebra then includes
a number of operations on sets of tuples, such as union and intersection, joins and
projections, which are similar to the dataflow constructs just described.
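
The operations named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any database implements them: a tuple is modeled as a `frozenset` of attribute-value pairs, a relation is a Python `set` of such tuples, and the helper names (`make_tuple`, `project`, `natural_join`) and the sample data are hypothetical.

```python
# Illustrative sketch only: a tuple is a frozenset of (attribute, value) pairs,
# so it is hashable and has no column order; a relation is a set of tuples.

def make_tuple(**attrs):
    """Build one tuple from attribute=value pairs (hypothetical helper)."""
    return frozenset(attrs.items())

def project(relation, attributes):
    """Keep only the named attributes; duplicate results collapse automatically."""
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}

def natural_join(left, right):
    """Combine pairs of tuples that agree on every attribute they share."""
    result = set()
    for lt in left:
        for rt in right:
            ld, rd = dict(lt), dict(rt)
            shared = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in shared):
                result.add(frozenset({**ld, **rd}.items()))
    return result

# Made-up sample relations.
east = {make_tuple(state="NY", city="Albany"), make_tuple(state="MA", city="Boston")}
north = {make_tuple(state="MA", city="Boston"), make_tuple(state="ME", city="Augusta")}
regions = {
    make_tuple(state="NY", region="Mid-Atlantic"),
    make_tuple(state="MA", region="New England"),
}

union = east | north                  # the shared Boston tuple appears only once
intersection = east & north           # only the Boston tuple
states = project(union, {"state"})    # one tuple per distinct state
joined = natural_join(east, regions)  # matches tuples on the shared "state" attribute
```

Because relations here are sets, the union holds three tuples rather than four, and the projection onto `state` cannot produce repeated values.
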

The notion of using relational algebra to access data is credited to E. F. Codd, who,
while a researcher at IBM in 1970, wrote a paper called “A Relational Model of
Data for Large Shared Data Banks.” This paper became the basis of using
relational algebra for accessing data, eventually leading to the development of SQL
and modern relational databases.

A set of tuples is a lot like a table, but not quite. One difference between the two is
that a table can contain duplicate rows but a set of tuples cannot have duplicates.
A very important property of sets is that they have no ordering. Sets have no
concept of the first, second, and third elements unless another attribute defines
the ordering. To most people (or at least most people who are not immersed in set
theory), tables have a natural order, defined perhaps by a primary key or perhaps
by the sequence in which rows were originally loaded into the table.
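
The difference in how tables and sets treat duplicates can be seen in a small experiment. The sketch below assumes Python's built-in `sqlite3` module and an in-memory database; the `visits` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (state TEXT, city TEXT)")
rows = [("NY", "Albany"), ("MA", "Boston"), ("NY", "Albany")]
conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

# The table keeps both copies of the repeated row.
table_count = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]

# SELECT DISTINCT removes duplicates, giving set-like behavior.
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT state, city FROM visits)"
).fetchone()[0]

# A Python set of the same rows also drops the duplicate.
as_set = set(rows)
```

Here the table holds three rows, while both the `DISTINCT` query and the Python set hold two.
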

As a legacy of the history of relational algebra, SQL tables have no natural
ordering. The order of the results of a query is defined only when there is an
ORDER BY clause.
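
As a rough illustration of this point, the sketch below runs the same query with and without ORDER BY. It assumes Python's built-in `sqlite3` module; the `orders` table and its values are hypothetical. Only the ordered query has a result order that the code can rely on.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(3, 20.0), (1, 35.5), (2, 12.25)],
)

# Without ORDER BY, the database may return the rows in any order.
unordered = conn.execute("SELECT order_id FROM orders").fetchall()

# With ORDER BY, the order of the result is part of the query's definition.
ordered = conn.execute(
    "SELECT order_id FROM orders ORDER BY order_id"
).fetchall()

# Both queries return the same rows; only the second guarantees their sequence.
same_rows = sorted(unordered) == ordered
```

A given database may happen to return the unordered rows in insertion order, but that is an implementation detail and not something a query should depend on.
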