<table operator clause> ::= <where clause> | <over clause> | <remove clause> | <specify clause> | <add clause> | <rename clause> | <aggregate clause> | <quota clause> | <explode clause> | <adorn clause> | <redefine clause> | <union clause> | <intersect clause> | <minus clause> | <times clause> | <join clause> | <having clause> | <without clause> | <outer join clause>
Table Operators
1. Summary
This section covers the various table operators available in D4. The various inference mechanisms and updatability rules are covered for each of the operators.
ClosureAll D4 table operators are fully atomic and produce a new table value. This property is known as closure and allows the results of one table operator to be used as the argument to another. This allows operators within expressions to be nested as often as necessary, and in any order.
The following sections cover each table operator in detail. The sample database provided will be used throughout the examples in this section:
create table Employee { ID : Integer, Name : String, key { ID } }; insert table { row { 1 ID, "Joe" Name }, row { 2, "John" }, row { 3, "Josh" }, row { 4, "Jeff" }, row { 5, "Jake" }, row { 6, "Jeff" } } into Employee; create table Manager { Manager_ID : Integer, Employee_ID : Integer, key { Manager_ID, Employee_ID }, reference Manager_Employee { Manager_ID } references Employee { ID }, reference Employee_Employee { Employee_ID } references Employee { ID } }; insert table { row { 1 Manager_ID, 2 Employee_ID }, row { 1 Manager_ID, 3 Employee_ID }, row { 2 Manager_ID, 4 Employee_ID }, row { 2 Manager_ID, 6 Employee_ID }, row { 4 Manager_ID, 5 Employee_ID } } into Manager; create table EmployeePhone { ID : Integer, Phone : String, key { ID, Phone } }; EmployeePhone := table { row { 1 ID, "555-1234" Phone }, row { 1 ID, "555-1000" Phone }, row { 2 ID, "555-4321" Phone }, row { 4 ID, "555-4444" Phone }, row { 5 ID, "555-5678" Phone }, row { 10 ID, "411" Phone }, row { 12 ID, "911" Phone } };
2. Restriction
The where operator applies a condition to each row of the input. Only rows for which the condition evaluates to true appear in the result.
The where clause in D4 has the following syntax:
<where clause> ::= where <expression term> [<language modifiers>]
The expression specified by <expression term> must be boolean-valued. Within the expression, access to the values of the input row are available by column name.
The result of a where has the same columns, orders, references, and
metadata as the input table value. The restriction condition becomes a
constraint of the result. If the restriction condition uses an equality
test (=) against a key column, that column will be removed from the key
if the equality comparison is part of a conjunction of comparisons
(
Data modifications against views defined using where are propagated directly to the input table. The language modifier EnforcePredicate can be used to control whether or not the new row must satisfy the restriction condition. The language modifier can be set to "true" or "false". The default is "false".
The following example illustrates the use of the where operator:
select Employee where ID >= 4;
ID Name -- ---- 4 Jeff 5 Jake 6 Jeff
See Also
3. Projection
Projection allows a given set of columns to be removed from the result. There are two methods for specifying the projection list in D4, over, and remove. The over operator specifies the desired columns, while remove specifies the unwanted columns.
The over and remove clauses in D4 have the following syntax:
<over clause> ::= over "{"<column name commalist>"}" [<language modifiers>] <remove clause> ::= remove "{"<column name commalist>"}" [<language modifiers>] <column name> ::= <qualified identifier>
The result of a projection has only the columns specified. Only keys of the input which are completely included in the specified column list are keys of the result. If all keys are excluded by the projection, the key becomes all columns of the result, eliminating duplicates as necessary. Only orders of the input which are completely included in the specified column list are orders of the result. References of the input which are completely included in the specified column list are references of the result.
Data modifications against views defined using over or remove are accomplished by performing the corresponding modifications on the input table. An insert will be rejected if the projection has excluded columns which do not have a default defined.
The following example illustrates the use of the project clause:
select Employee over { Name };
Name ---- Jake Jeff Joe John Josh
The following query is equivalent to the above example but uses the remove clause instead:
select Employee remove { ID };
Name ---- Jake Jeff Joe John Josh
The following examples illustrate key inference in a projection:
select Employee where Name = "Jeff";
ID Name -- ---- 4 Jeff 6 Jeff
select Employee where Name = "Jeff" over { Name };
Name ---- Jeff
See Also
4. Extension
The add clause in D4 has the following syntax:
<add clause> ::= add "{"<ne named expression term commalist>"}" [<language modifiers>] <named expression term> ::= <expression term> <qualified identifier> <metadata>
Expressions within the <add clause> have access to the values of the current row by column name.
The result of an add has the same columns of the input, with the
additional columns as defined by
select Employee add { ID ID1 }
will have keys { ID }
and { ID1 }
.
Modifications to views defined using add are propagated to the input by removing the extended columns.
The following example illustrates the use of the add operator:
select Employee add { "Employee Name = " + Name NewName } over { NewName };
NewName -------------------- Employee Name = Jake Employee Name = Jeff Employee Name = Joe Employee Name = John Employee Name = Josh
See Also
5. Rename
The rename operator is used to rename columns in the result. There are two variations of the rename operator. One renames a specified set of columns, and the other renames all the columns by qualifying each column name with a given identifier.
The rename clause in D4 has the following syntax:
<rename clause> ::= rename ("{"<ne named column name commalist>"}" | (<qualified identifier> <metadata>)) [<language modifiers>] <named column name> ::= <column name> <column name> <column name> ::= <qualified identifier>
The result of a rename operator has the same columns as the input, with the names changed as specified. The keys, orders, and references are included with the names of the columns involved updated appropriately. The result has the same metadata as the input.
Data modifications against views defined using rename are accomplished by transforming the modifications as appropriate for the name changes.
The following examples illustrate the use of the rename operator.
select Employee rename { ID EmployeeID, Name FirstName };
EmployeeID FirstName ---------- --------- 1 Joe 2 John 3 Josh 4 Jeff 5 Jake 6 Jeff
select Employee rename X;
X.ID X.Name ---- ------ 1 Joe 2 John 3 Josh 4 Jeff 5 Jake 6 Jeff
See Also
6. Aggregate
The group, or aggregate table operator allows operations based on sets of rows to be computed and added to the result set. It should be noted that aggregation is not a primitive operator, as it can be expressed in terms of other operators. For example, the expression:
Employee group by { Name } add { Count() NameCount }
can also be expressed as:
Employee over { Name } rename X add { Count(Employee where Name = X.Name) NameCount }
The aggregate clause in D4 has the following syntax:
<aggregate clause> ::= group [by "{"<ne column name commalist>"}"] add "{"<ne named aggregate expression commalist>"}" [<language modifiers>] <named aggregate expression> ::= <aggregate expression> <column name> <aggregate expression> ::= <operator name> "(" [distinct] [<column name commalist>] [order by "{"<order column definition commalist>"}"] ")" <column name> ::= <qualified identifier> <operator name> ::= <qualified identifier>
The expression includes an optional
The
With the exception of the Count, All, and Any operators, all the system-provided aggregate operators return nil when invoked on an empty set.
The result of the group operation is a table with the columns
specified in the by clause and a column for each
Data modifications against views defined using group are accomplished by performing the modifications as though the expression were written longhand. In other words, the modifications are propagated through the equivalent projection and extension operators.
For complete descriptions of the aggregate operators available in D4, refer to Aggregate Operators in the Dataphor System Library Reference.
See Also
7. Quota
The return, or quota operator limits the result set to a given number of rows based on a specified order. Note that invocation of the return operator does not guarantee that the resulting set will have the given number of rows. There may be less, and there may be more, depending on the data involved, as explained below.
The quota clause in D4 has the following syntax:
<quota clause> ::= return <expression term> [by "{"<ne order column definition commalist>"}"] [<language modifiers>] <order column definition> ::= <column name> [sort <expression>] [asc | desc] [(include | exclude) nil] <column name> ::= <qualified identifier>
The expression specified by <expression term> must be integer-valued, and specifies the number of rows to be returned in the result set. The actual number of rows returned may be lower if the input does not have enough rows to fulfill the request.
Note that if the columns specified in the by clause do not completely include some key of the input, then the actual cardinality of the output may be more than the number specified by the return expression. This is because the result will include rows that have the same values for the columns specified in the by clause. If the by clause is omitted, the compiler will select a key of the input to be used as the by specifier for the operation.
If the quota operator specifies that a single row should be returned (and the compiler can make this determination at compile time, i.e. the return expression is literal and evaluates to 1), and the quota operation is performed by some key of the input, the key of the output is the empty key. Otherwise, every key of the input is also a key of the output.
The result of a return operator has the same columns, orders, references, and metadata as the input.
Data modifications against views defined using return are propagated directly to the input. The language modifier EnforcePredicate can be used to control whether or not the new row must satisfy the quota condition. The language modifier can be set to "true" or "false". The default is "false".
The following examples illustrate the use of the quota operator:
select Employee return 3 by { Name };
ID Name -- ---- 5 Jake 4 Jeff 6 Jeff 1 Joe
select Employee return 3 by { Name, ID };
ID Name -- ---- 5 Jake 4 Jeff 6 Jeff
select Employee return 3 by { Name desc };
ID Name -- ---- 3 Josh 2 John 1 Joe
See Also
8. Explode
The explode operator allows hierarchical queries to be expressed. Optional include specifications allow both the sequence within the hierarchy, and the level of the hierarchy to be included in the result set.
The explode clause in D4 has the following syntax:
<explode clause> ::= explode by <expression term> where <expression term> [order by "{"<order column definition commalist>"}"] [include level [<column name>] <metadata>] [include sequence [<column name>] <metadata>] [<language modifiers>]
The expressions specified in the by clause and the where clause must be boolean-valued. The by clause specifies the explode condition, and the where clause specifies the root condition. Within the explode condition, the values of the current parent row are accessible by the name of the column preceded by the parent keyword.
The optional order by specification provides a mechanism for describing the order in which rows will be processed in the explode.
The optional include level specification indicates that a column of
type System.Integer and named level, or
The optional include sequence specification indicates that a column of
type System.Integer and named sequence, or
Note that if level or sequence are included, the input to the explode is required to be well-ordered (ordered by at least a key). This requirement ensures that the operation is well-defined when used on graphs that may have multiple parents for a single node (networks vs. hierarchies).
The result of an explode operator has all the columns of the input plus any included columns. All the keys are preserved, plus the key for the sequence, if included. The orders, references, and metadata of the input are preserved.
Modifications to views defined using explode are propagated directly to the input.
The following example illustrates the use of the explode operator:
select Manager join (Employee rename { ID Manager_ID, Name Manager_Name }) join (Employee rename { ID Employee_ID, Name Employee_Name }) explode by Manager_ID = parent Employee_ID where Manager_ID = 1 include level include sequence;
Manager_ID Employee_ID Manager_Name Employee_Name level sequence ---------- ----------- ------------ ------------- ----- -------- 1 2 Joe John 1 1 2 4 John Jeff 2 2 4 5 Jeff Jake 3 3 2 6 John Jeff 2 4 1 3 Joe Josh 1 5
See Also
9. Adorn
The adorn clause in D4 has the following syntax:
<adorn clause> ::= adorn ["{"<ne adorn item definition commalist>"}"] <metadata> <alter metadata> [<language modifiers>] <adorn item definition> ::= <adorn column> | <key definition> | (alter key "{"<column name commalist>"}" <alter metadata>) | (drop key "{"<column name commalist>"}") | <order definition> | (alter order "{"<order column definition commalist>"}" <alter metadata>) | (drop order "{"<order column definition commalist>"}") | <row constraint definition> | <reference definition> | (alter reference <reference name> <alter metadata>) | (drop reference <reference name>) <adorn column> ::= <column name> <nilable definition> ["{"<ne column definition item commalist>"}"] <metadata> <alter metadata> <column name> ::= <qualified identifier>
The adorn operator allows the definition of column defaults, column constraints, column metadata, keys, orders, and constraints, as well as the ability to alter the metadata for derived references, and exclude inferred keys, orders, and references. Each of these constructs is declared exactly as they are in the corresponding DDL statements. Note that keys and references introduced by the adorn operator are only used as structural information in the result set and are not enforced in the resulting expression, or within the database. Other types of constraints introduced by the adorn operator, such as row and transition constraints, are enforced.
The result of an adorn operator has the same columns, keys, orders, references, and metadata as the input, with the additional structural and metadata information specified by the operator.
Data modifications against views defined using adorn are propagated directly to the input.
The following example illustrates the use of the adorn operator:
select Employee adorn { ID { default 0 } tags { Frontend.Width = "5" }, constraint IDValid ID > 5 tags { DAE.Message = "ID must be greater than 5" } };
ID Name -- ---- 6 Jeff
See Also
10. Redefine
The redefine operator is shorthand for an add-remove-rename operation. For example, the expression:
Employee redefine { ID := ID * 2 }
is equivalent to the following expression:
Employee add { ID * 2 X } remove { ID } rename { X ID }
The redefine clause in D4 has the following syntax:
<redefine clause> ::= redefine "{"<redefine column commalist>"}" [<language modifiers>] <redefine column> ::= <column name> := <expression term> <column name> ::= <qualified identifier>
Each column is redefined in terms of an expression. Values of the current row are accessible by name within the expression.
The result of a redefine operation has the same column names as the input, with the specified columns redefined as specified. If any redefined column participates in a key, order, or reference, that structure is no longer part of the result. If this results in the elimination of all the keys, the key of the result is all the columns of the result.
Data modifications against views defined using redefine are propagated as though the equivalent add-remove-rename expression had been used.
The following example illustrates the use of the redefine operator:
select Employee redefine { ID := ID * 2 };
ID Name -- ---- 2 Joe 4 John 6 Josh 8 Jeff 10 Jake 12 Jeff
11. Specify
The specify operator ({}
) is shorthand for an add-project-rename
operation. It allows the desired column list to be specified in a single
operation. The operation will only include extension, and rename if
necessary, but will always include a projection over the specified
column list.
The specify clause in D4 has the following syntax:
<specify clause> ::= "{"<optionally named expression commalist>"}" [<language modifiers>] <optionally named expression> ::= <expression> [<column name>] <column name> ::= <qualified identifier>
Each column specifies either a column from the source table, or an expression that will be evaluated in terms of the source table, to be included in the result set. If the column specifies an expression, it must also specify a name for the column in the result set. Otherwise, the column may optionally specify a new name for the column in the result set.
The following examples illustrate the use of the specify operator:
// Specify as projection select Employee { ID };
ID -- 1 2 3 4 5 6
// Specify as rename select Employee { ID Employee_ID, Name Employee_Name }
Employee_ID Employee_Name ----------- ------------- 1 Joe 2 John 3 Josh 4 Jeff 5 Jake 6 Jeff
// Specify as extension select Employee { ID, Name, ID + 1 NewID }
ID Name NewID -- ---- ----- 1 Joe 2 2 John 3 3 Josh 4 4 Jeff 5 5 Jake 6 6 Jeff 7
12. Union
The union operator allows the rows of two table values to be included in a single result set. If a given row appears in both inputs, it will only appear once in the result. In other words, the union operation eliminates duplicates.
The union clause in D4 has the following syntax:
<union clause> ::= union <expression term> [<language modifiers>]
The expression given by <expression term> must be table-valued. Both inputs to the union operation must be of the same table type.
The result of a union operation has the same type as the inputs. The key of the result is all columns of the table. The result has the orders, references, and metadata from both inputs.
Modifications to views defined using union are propagated to the inputs A and B as follows:
-
Insert
If the newly inserted row can be inserted into A or B, it is. Note that this implies that if the row can be inserted into both A and B it will appear in both. Only if the row cannot be inserted into either A or B will the insert be rejected.
-
Update
Update is treated as a delete of the original row, followed by an insert of the new row. If both operations can be performed against A or B, the operation is accepted. Note that this implies that if the row can be updated in both A and B, the update will be performed on both. Only if the modification fails on both A and B is the update rejected.
-
Delete
If the deleted row can be deleted from A or B, it is. Note that this implies that if the row can be deleted from both A and B it will be deleted from both. Only if the row cannot be delted from either A or B will the delete be rejected.
The following example illustrates the use of the union operator:
select (Employee where ID <= 4) union (Employee where ID >= 3);
ID Name -- ---- 1 Joe 2 John 3 Josh 4 Jeff 5 Jake 6 Jeff
See Also
13. Intersect
The intersect operator computes the intersection of two table values. If a given row appears in both inputs, it will appear in the result.
The intersect clause in D4 has the following syntax:
<intersect clause> ::= intersect <expression term> [<language modifiers>]
The expression given by <expression term> must be table-valued. Both inputs to the intersect operator must be of the same table type.
Because intersect is a special case of join, it has the same semantics for type inference and updatability. For this information, see join
The following example illustrates the use of the intersect operator:
select (Employee where ID <= 4) intersect (Employee where ID >= 3);
ID Name -- ---- 3 Josh 4 Jeff
See Also
Join | Times | Sytem.iJoin
14. Minus
The minus operator computes the difference of two table values. Only rows appearing in the first input and not the second will appear in the result.
The minus clause in D4 has the following syntax:
<minus clause> ::= minus <expression term> [<language modifiers>]
The expression given by <expression term> must be table-valued. Both inputs to the minus operator must be of the same table type.
The result of the minus operator has the same table type as both of the inputs. Keys, orders, references, and metadata are inferred from the first input.
Modifications to views defined using minus are propagated to the inputs A and B as follows:
-
Insert
If the newly inserted row can be inserted into B, it violates the predicate of the difference, and an error is returned. Otherwise, it is inserted into A.
-
Update
If the new version of the row can be inserted into B, it violates the predicate of the difference, and an error is returned. Otherwise, the update is performed against A.
-
Delete
The deleted row is deleted from A.
The following example illustrates the use of the minus operator:
select Employee minus (Employee where ID >= 3);
ID Name -- ---- 1 Joe 2 John
See Also
15. Times
The times operator computes the cartesian product (also called the cross join) of the inputs. For every row in the first input, a row appears in the result for every row in the second input that is the concatenation of both rows.
The times clause in D4 has the following syntax:
<times clause> ::= times <expression term> [<language modifiers>]
The expression given by <expression term> must be table-valued. Inputs must have no column names in common.
Because times is a special case of join, it has the same semantics for type inference and updatability. For this information, see join.
The following example illustrates the use of the times operator:
select Employee times (Employee rename { Name FirstName } over { FirstName });
ID Name FirstName -- ---- --------- 1 Joe Jake 1 Joe Jeff 1 Joe Joe 1 Joe John 1 Joe Josh 2 John Jake 2 John Jeff 2 John Joe 2 John John 2 John Josh 3 Josh Jake 3 Josh Jeff 3 Josh Joe 3 Josh John 3 Josh Josh 4 Jeff Jake 4 Jeff Jeff 4 Jeff Joe 4 Jeff John 4 Jeff Josh 5 Jake Jake 5 Jake Jeff 5 Jake Joe 5 Jake John 5 Jake Josh 6 Jeff Jake 6 Jeff Jeff 6 Jeff Joe 6 Jeff John 6 Jeff Josh
See Also
16. Join
The join operator computes the combination of two table values based on the matching rows for a given set of columns. There are two types of joins in D4, the natural join and the conditioned join.
The natural join simply takes two table values as input and uses the commonly named columns to perform the join. The conditioned join includes a by clause which specifies the join condition.
Note that the two forms of the join operator are equivalent in terms of expressive power. Both forms are included in D4 to allow for different user preferences. The natural join lends itself to a database design in which column names are unique across the database, while the conditioned join lends itself to a design in which column names are only unique within tables.
The join clause in D4 has the following syntax:
<join clause> ::= join <join specifier> [<language modifiers>] <join specifier> ::= <expression term> [by <expression term>]
The result of a join operator on inputs A and B having column sets X,
Y, and Z, where A has the columns { X, Y }
and B has the columns { Y,
Z }
has the columns { X, Y, Z }
. Y represents the columns common to
both inputs. Note that each of X, Y, and Z may be an empty set. For a
conditioned join, Y is required to be an empty set.
The body of the result has a row for each row in A that matches the values for the columns given in Y for natural joins, or the condition for conditioned joins, for each row in B. The join operator is equivalent to a cartesian product of the inputs, followed by a restriction using the join condition.
Based on the join condition and the key information of the join operands, the cardinality of the join can be determined. The cardinality of a given join is the relationship between the number of rows in the left input and the number of potentially matching rows in the right input. There are four possibilities:
- One-to-one
-
If the join columns completely include some key in both inputs, the join is one-to-one. In other words, there will only ever be one matching row in the right input for a given row in the left input, and vice versa.
- One-to-many
-
If the join columns completely include some key of the left input, but not of the right input, the join is one-to-many. In other words, there may be any number of rows in the right input for any given row in the left input.
- Many-to-one
-
If the join columns completely include some key of the right input, but not of the left input, the join is many-to-one. In other words, there may be any number of rows in the left input for any given row in the right input.
- Many-to-many
-
If the join columns do not completely include any key of the left or right inputs, the join is many-to-many. In other words, there may be any number of rows in the left input that match any number of rows in the right input, and vice versa.
The cardinality of a join effectively determines the cardinality of the result. In addition to aiding the Dataphor Server in access path selection and optimization tasks, this information is used to determine how keys and orders are inferred through the join operation:
- One-to-one
-
For one-to-one joins, every key of the left and right inputs is also a key of the output. In the case of natural joins, duplicate keys are eliminated, resulting in a single key in the output.
For one-to-one joins, every order of the left and right inputs is also an order of the output, with duplicate orders eliminated in the case of natural joins.
- One-to-many
-
For one-to-many joins, every key of the right input is also a key of the output, plus a key for each key of the right input with join columns replaced by their correlated columns from the left input.
For one-to-many joins, every order of the right input is also an order of the output, with duplicate orders eliminated in the case of natural joins.
- Many-to-one
-
For many-to-one joins, every key of the left input is also a key of the output, plus a key for each key of the left input with join columns replaced by their correlated columns from the right input.
For many-to-one joins, every order of the left input is also an order of the output, with duplicate orders eliminated in the case of natural joins.
- Many-to-many
-
For many-to-many joins, every key of the left input is combined with every key of the right output to form a key in the output. In other words, the output has the cartesian product of all keys of both inputs, with duplicates eliminated in the case of natural joins. In addition, for conditioned joins only one column from each join condition will appear in any given key.
No orders are inferred for many-to-many joins.
For all types of joins, every source or target reference from both inputs that does not completely include the join columns is a source or target reference of the output, respectively. In other words, references that include the common columns of the join will not be inferred.
Data modifications against views defined using join are supported by projecting the modifications over the columns of each input.
The following sections describe the natural and conditioned joins.
16.1. Natural Join
The natural join uses the commonly named columns, if any, in the inputs to perform the join. A natural join is by definition an equi-join. The commonly named columns are only included once in the result.
If the inputs have no column names in common, the natural join degrades to times. If the inputs have all column names in common, the natural join becomes an intersection. In both cases, type inference and updatability are unaffected. Note that if a natural join results in a times operation, the compiler will report a warning that a possibly incorrect times expression has been issued [1].
The following example illustrates the use of the natural join operator:
select Employee join EmployeePhone;
ID Name Phone -- ---- -------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 4 Jeff 555-4444 5 Jake 555-5678
16.2. Conditioned Join
The conditioned join allows the join condition to be specified as part of the join. The input tables must have no column names in common.
The expression specified within the by clause by <expression term> must be boolean-valued and must specify an equi-join. These requirements are enforced by the compiler. Within the expression, the current values of both rows are accessible by column name.
The following example illustrates the use of the conditioned join operator:
select (Employee rename E) join (EmployeePhone rename EP) by E.ID = EP.ID remove { EP.ID };
E.ID E.Name EP.Phone ------ -------- ----------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 4 Jeff 555-4444 5 Jake 555-5678
See Also
17. Outer Join
The result of an outer join is the same as a join, except that rows in one side for which no matching row was found are still included in the result set, with the columns for the side with no matching row in the input set to nil.
In addition to specifying a natural or conditioned outer join, the outer join may be left or right. Left indicates that all rows in the left input should be preserved, while right indicates that all rows in the right input should be preserved.
The outer join clause in D4 has the following syntax:
<outer join clause> ::= (left | right) (join | lookup) <join specifier> [include rowexists [<column name>] <metadata>] [<language modifiers>] <join specifier> ::= <expression term> [by <expression term>] <column name> ::= <qualified identifier>
The keys of the result are inferred from the keys of both inputs, and the cardinality of the join:
-
One-to-one:
-
One-to-many:
The result has all the keys of the right input, plus a corresponding key for each key of the right input that has columns in common with the join columns. The join columns in the corresponding keys are replaced with the corresponding join columns in the left input. If the join is left outer, the corresponding keys are inferred as sparse keys.
-
Many-to-one:
The result has all the keys of the left input, plus a corresponding key for each key of the left input that has columns in common with the join columns. The join columns in the corresponding keys are replaced with the corresponding join columns in the right input. If the join is right outer, the corresponding keys are inferred as sparse keys.
-
Many-to-many:
For each key in the first input, for each key in the second input, the result has a key that is the union of the columns in both keys. In other words, the result has the cartesian product of the keys of both inputs. In addition, for conditioned joins only one column from each join condition will appear in any given key.
The optional rowexists column is used to indicate whether a row is an actual join match, or is included in the result because the join is outer. This column is boolean-valued, and will be true whenever the row is a join match, and false otherwise. This column is updatable, and causes the insertion or deletion of a row in the outer input of the operation. Setting the rowexists column true causes a row to be inserted into the outer input, while setting the rowexists column false causes a row to be deleted from the outer input.
The lookup operator has the same semantics for retrieval, but modifications do not propagate through the lookup. For example, in a view defined by A left lookup B, an insert would be propagated to A, but not to B. The lookup operator is used primarily by client applications to control the update semantics involved in an expression.
The following examples illustrate the use of the outer join operators:
select Employee left join EmployeePhone;
ID Name Phone -- ---- ---------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 3 Josh <no value> 4 Jeff 555-4444 5 Jake 555-5678 6 Jeff <no value>
select Employee right join EmployeePhone;
ID Name Phone -- ---------- -------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 4 Jeff 555-4444 5 Jake 555-5678 10 <no value> 411 12 <no value> 911
select Employee left lookup EmployeePhone include rowexists Extended;
ID Name Extended Phone -- ---- -------- ---------- 1 Joe True 555-1000 1 Joe True 555-1234 2 John True 555-4321 3 Josh False <no value> 4 Jeff True 555-4444 5 Jake True 555-5678 6 Jeff False <no value>
create view EmpWithPhone Employee left lookup EmployeePhone; select EmpWithPhone;
ID Name Phone -- ---- ---------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 3 Josh <no value> 4 Jeff 555-4444 5 Jake 555-5678 6 Jeff <no value>
// will be ignored since phone is across lookup update EmpWithPhone set { Phone := "1701" }; select EmpWithPhone;
ID Name Phone -- ---- ---------- 1 Joe 555-1000 1 Joe 555-1234 2 John 555-4321 3 Josh <no value> 4 Jeff 555-4444 5 Jake 555-5678 6 Jeff <no value>
See Also
18. Semijoin
In addition to the traditional join, the D4 language includes a semijoin operator called having. Loosely speaking, the operator computes the set of rows from the left input for which a matching row exists in the right input. It is equivalent to a join, followed by a projection over the columns of the right input. For example, the following query expressed using having:
select Employee having (EmployeeDepartment { Employee_ID ID })
is equivalent to the following join-project:
select Employee join (EmployeeDepartment { Employee_ID ID, Department_ID }) over { ID, Name }
As with the join operator, there are two flavors of semijoin, the natural and conditioned semijoin. However, because the result set will always be projected over the columns of the left input, conditioned semijoins are not required to have unique column names. For example, the having above could be written:
select Employee having EmployeeDepartment by ID = Employee_ID
In addition, the semijoin condition can make use of the keywords left and right to distinguish potentially ambiguous column names, as in:
select Employee having EmployeeDepartment by left.ID = right.Employee_ID
The having clause in D4 has the following syntax:
<having clause> ::= having <join specifier> [<language modifiers>] <join specifier> ::= <expression term> [by <expression term>]
The result of a having operator on inputs A and B having column sets
X, Y, and Z, where A has the columns { X, Y }
and B has the columns {
Y, Z }
has the columns { X, Y }
. As with the join operator, Y
represents the columns common to both inputs, and each of X, Y, and Z
may be an empty set. Unlike the join, Y is not required to be empty for
a conditioned semijoin.
The body of the result has a row for each row in A that matches the values for the columns given in Y for natural joins, or the condition for conditioned joins, for any row in B. Note carefully the difference in cardinality from a join operator. The cardinality of the result of a having will always be less than or equal to the cardinality of the left input.
As stated above, the semijoin is equivalent to a join followed by a projection over the columns of the left input.
Another way of expressing the semijoin operation is with the exists operator in a restriction condition. For example:
select Employee where exists (EmployeeDepartment where Employee_ID = ID)
In other words, the semijoin operator is a restriction based on the existence of a matching row in the right input. Because of this, the rules governing inference and updatability for the having operator most closely resemble that of the where operator. Namely, all columns, keys, orders, references, and metadata of the left input are inferred for the result.
Data modifications against views defined using having are propagated directly to the left input. The language modifier EnforcePredicate can be used to control whether or not the new row must satisfy the semijoin condition. The language modifier can be set to "true" or "false", with "false" being the default.
See Also
19. Semiminus
The semiminus, or without, operator computes the set of rows from the left input for which a matching row does not exist in the right input. This operator is effectively the opposite of a semijoin, in that it is a restriction with a not exists in the condition. For example:
select Employee without (EmployeeDepartment { Employee_ID ID, Department_ID })
is equivalent to:
select Employee where not exists (EmployeeDepartment where Employee_ID = ID)
As with the join and having operators, there are two flavors of semiminus, the natural and conditioned semiminus. The result set will always be projected over the columns of the left input, so the conditioned semiminus is not required to have unique column names. For example, the without above could be written:
select Employee without EmployeeDepartment by ID = Employee_ID
In addition, the semiminus condition can make use of the keywords left and right to distinguish potentially ambiguous column names, as in:
select Employee without EmployeeDepartment by left.ID = right.Employee_ID
The without clause in D4 has the following syntax:
<without clause> ::= without <join specifier> [<language modifiers>] <join specifier> ::= <expression term> [by <expression term>]
The result of a without operator on inputs A and B having column sets
X, Y, and Z, where A has the columns { X, Y }
and B has the columns {
Y, Z }
has the columns { X, Y }
. As with the join operator, Y
represents the columns common to both inputs, and each of X, Y, and Z
may be an empty set. Unlike the join, Y is not required to be empty for
a conditioned semiminus.
The body of the result has a row for each row in A that has no matching row in B based on the values for the columns given in Y for natural joins, or the condition for conditioned joins. The cardinality of the result of a without will always be less than or equal to the cardinality of the left input.
Another way of expressing the semijoin operation is with the exists operator in a restriction condition. For example:
select Employee where exists (EmployeeDepartment where Employee_ID = ID)
As with the semijoin, the semiminus operator is basically a restriction of the left input. Because of this, the rules governing inference and updatability for the without operator most closely resemble that of the where operator. Namely, all columns, keys, orders, references, and metadata of the left input are inferred for the result.
Data modifications against views defined using without are propagated directly to the left input. The language modifier EnforcePredicate can be used to control whether or not the new row must satisfy the semijoin condition. The language modifier can be set to "true" or "false", with "false" being the default.
See Also
20. Comparison Operators
The following comparison operators are defined for tables:
-
iEqual returns true if the given table values are equal. iNotEqual returns true if the two table values are not equal.
Two table values are equal if they are of the same table type, and they have the same set of rows. For example:
begin var LTable1 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; var LTable2 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; if not(LTable1 = LTable2) then raise Error("Table values are not equal"); // equivalent formulation if not exists(LTable1 minus LTable2) and not exists (LTable2 minus LTable1) then raise Error("Table values are equal"); end;
-
iLess(<)
Returns true if the first table value is a proper subset of the second table value. A given table value A is a proper subset of some table value B if A and B are both of the same table type, and all the rows in A are in B, and there is at least one row in B that is not in A. For example:
begin var LTable1 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID } }; var LTable2 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; if not(LTable1 < LTable2) then raise Error("LTable1 is not a proper subset of LTable2"); // equivalent formulation if not exists (LTable1 minus LTable2) and exists (LTable2 minus LTable1) then raise Error("LTable1 is a proper subset of LTable2"); end;
-
iInclusiveLess(<=)
Returns true if the first table value is a subset, not necessarily proper, of the second table value. A given table value A is a subset of some table value B if A and B are both of the same table type, and all the rows in A are in B. For example:
begin var LTable1 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID } }; var LTable2 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; if not(LTable1 <= LTable2) then raise Error("LTable1 is not a subset of LTable2"); // equivalent formulation if not exists (LTable1 minus LTable2) then raise Error("LTable1 is a subset of LTable2"); end;
-
iGreater(>)
Returns true if the first table value is a proper superset of the second table value. A given table value A is a proper superset of some table value B if A and B are both of the same table type, and all the rows in B are in A, and there is at least one row in A that is not in B. For example:
begin var LTable1 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; var LTable2 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID } }; if not(LTable1 > LTable2) then raise Error("LTable1 is not a proper superset of LTable2"); // equivalent formulation if not exists (LTable2 minus LTable1) and exists (LTable1 minus LTable2) then raise Error("LTable1 is a proper superset of LTable2"); end;
-
iInclusiveGreater(>=)
Returns true if the first table value is a superset, not necessarily proper, of the second table value. A given table value A is a superset of some table value B if A and B are both of the same table type, and all the rows in B are in A. For example:
begin var LTable1 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID }, row { 7 ID } }; var LTable2 : table { ID : Integer } := table { row { 5 ID }, row { 6 ID } }; if not (LTable1 >= LTable2) then raise Error("LTable1 is not a superset of LTable2"); // equivalent formulation if not exists (LTable2 minus LTable1) then raise Error("LTable1 is a superset of LTable2"); end;
-
iCompare(?=)
The iCompare operator is defined in terms of the other comparison operators:
A ?= B = if A = B then 0 else if A < B then -1 else 1;