Lexical Elements

1. Summary

The lexical elements of the D4 language allow special symbols (or tokens) in an input string to be recognized. These tokens help the parser determine the syntax of a given statement. A complete BNF reference for the D4 lexical analyzer can be found in D4 BNF Language Diagrams.

2. Whitespace

The D4 language, like most computer languages, reserves certain characters as "whitespace". This means that the characters are only used to delimit tokens. The following D4 statements are syntactically equivalent:

X := 5 * 3;

X:= 5*      3       ;

X :=
5
*
3;

X:=5*3;

The following Unicode character values are considered whitespace in D4: 0x0009 (tab), 0x000a (line feed), 0x000b (vertical tab), 0x000c (form feed), 0x000d (carriage return), 0x0085, 0x2028, and 0x2029.
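As an illustration only, the following Python sketch tokenizes a tiny subset of D4 (identifiers, integers, and the symbols :=, * and ;) while skipping whitespace, and shows that the four statements above produce the same token stream. The names D4_WHITESPACE, TOKEN and tokens are hypothetical and are not part of Dataphor. The space character (0x0020) is not in the list above; the sketch assumes it is whitespace as well, because the examples rely on it.

```python
import re

# Code points listed in the text, plus U+0020 (space). Including the space
# is an assumption: the list omits it, but the examples depend on it.
D4_WHITESPACE = "\u0009\u000a\u000b\u000c\u000d\u0085\u2028\u2029\u0020"

# A deliberately tiny token set, just enough for the example statements.
TOKEN = re.compile(r":=|[A-Za-z_][A-Za-z_0-9]*|[0-9]+|[*;]")


def tokens(source):
    """Return the tokens in source, using whitespace only as a delimiter."""
    result = []
    pos = 0
    while pos < len(source):
        if source[pos] in D4_WHITESPACE:
            pos += 1
            continue
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        result.append(match.group())
        pos = match.end()
    return result


variants = [
    "X := 5 * 3;",
    "X:= 5*      3       ;",
    "X :=\n5\n*\n3;",
    "X:=5*3;",
]
```

Calling tokens on each entry of variants yields the same six tokens: X, :=, 5, *, 3 and ;.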

3. Comments

Comments are sections of text within D4 code that are ignored by the compiler. Block comments allow for multi-line annotations, whereas line comments are terminated by the line break. Block comments in D4 can be nested, which allows for sections of code to be easily and temporarily "commented out". The following are some examples of D4 comments:

/* This is a multi
   line... /* nested */ ...comment */

X := 5; // Assigns 5 to variable X
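Nesting means the lexer has to count how many block comments are open rather than stop at the first */. The following Python sketch shows one way to do that. It is not the Dataphor implementation, strip_comments is a hypothetical name, and for brevity it ignores string literals, so a // or /* inside a quoted string would be treated as a comment.

```python
def strip_comments(source):
    """Remove nested /* ... */ comments and // line comments from source."""
    out = []
    depth = 0  # number of block comments currently open
    i = 0
    n = len(source)
    while i < n:
        two = source[i:i + 2]
        if two == "/*":
            depth += 1
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1  # inside a block comment: discard the character
        elif two == "//":
            # Line comment: skip up to, but not including, the line break.
            while i < n and source[i] not in "\r\n":
                i += 1
        else:
            out.append(source[i])
            i += 1
    if depth > 0:
        raise ValueError("unterminated block comment")
    return "".join(out)
```

Applied to the two examples above, only the statement X := 5; and the surrounding line breaks remain.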

4. Keywords

A keyword is a special symbol used by the parser to delimit syntactic structure. For example, the keyword begin is used to delimit the beginning of a statement block. To avoid ambiguity while parsing, some keywords are also reserved words. Reserved words are keywords that cannot be used as identifiers because the parser would not be able to distinguish between an identifier and the keyword.

The following is a list of all keywords in D4. Keywords marked with an asterisk (*) are reserved words. The link provided for each keyword shows one possible use of it; there may be others. Use the help search to find all uses of a keyword.

*add

*adorn

all

*and

as

asc

by

div

do

end

for

if

*in

is

mod

modify

new

nil

not

of

old

on

or

origin

row

set

*source

*target

to

try

var

*xor

5. Symbols

The D4 language also includes several special symbols that are used by the parser to delimit syntactic structure. These include parentheses, brackets, operator symbols, and other characters that have specific meaning within statements of D4. None of these symbols may be used in identifier names.
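Several symbols in this section share a prefix (for example < and <=, or * and **). The text does not say how the D4 lexer chooses between them; a common approach is to take the longest symbol that matches. The following Python sketch assumes that rule. SYMBOLS is copied from the list in this section, and match_symbol is a hypothetical helper.

```python
# Symbols copied from the list in this section.
SYMBOLS = [
    "-", "$", "&", "(", ")", "*", "**", ",", ".", "/", ":", ":=", ";",
    "?=", "[", "]", "^", "{", "|", "}", "~", "+", "<", "<<", "<=", "<>",
    "=", ">", ">=", ">>",
]

# Try longer symbols first so that "<=" wins over "<" (assumed rule).
BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def match_symbol(text, pos=0):
    """Return the longest symbol starting at text[pos], or None."""
    for symbol in BY_LENGTH:
        if text.startswith(symbol, pos):
            return symbol
    return None
```

Under this assumption, match_symbol("<= 5") returns <= rather than <, and match_symbol("X := 5", 2) returns := rather than :.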

The following are parser-recognized symbols in D4.

-

$

&

(

)

*

**

,

.

/

:

:=

;

?=

[

]

^

{

|

}

~

+

<

<<

<=

<>

=

>

>=

>>

6. Parser Literals

A parser literal is a value that is directly understood by the lexer as a token. For example, the symbol 5 is a parser literal that represents the System.Integer value 5. The following types of parser literals are available in D4:

  • Boolean

  • Integer

  • Decimal

  • Money

  • String

Here are some examples of parser literals within D4:

"Welcome to the ""community"" website."
'"Hello," she said.'
135
332.12d
31415926535897932e-16
$40.00
true

6.1. Boolean

The boolean parser literal allows values of type System.Boolean to be represented directly within D4.

The boolean parser literal in D4 has the following syntax:

<boolean parser literal> ::=
    true | false

6.2. Integer

The integer parser literal allows values of type System.Integer to be represented directly within D4. Integer values can be specified as a base 10 number using decimal digits, or as a base 16 number using hex digits. Base 16 representations must be prefixed with the symbol 0x.
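As a rough illustration, the following Python sketch recognizes both forms and converts them to a number. INTEGER and parse_integer are hypothetical names. The sketch assumes that hex digits may be upper or lower case and that the prefix must be a lowercase 0x, which the text does not state explicitly.

```python
import re

# Base 10 digits, or 0x followed by hex digits (letter case is an assumption).
INTEGER = re.compile(r"0x[0-9A-Fa-f]+|[0-9]+")


def parse_integer(text):
    """Convert a D4-style integer literal to a Python int."""
    if INTEGER.fullmatch(text) is None:
        raise ValueError(f"not an integer literal: {text!r}")
    if text.startswith("0x"):
        return int(text[2:], 16)
    return int(text)
```

For example, parse_integer("135") gives 135 and parse_integer("0xFF") gives 255.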

The integer parser literal in D4 has the following syntax:

<integer parser literal> ::=
    <digit>{<digit>} | 0x<hexdigit>{<hexdigit>}

6.3. Decimal

The decimal parser literal allows values of type System.Decimal to be represented directly within D4. Note that a sequence of digits alone is interpreted as a value of type System.Integer, so a whole-number decimal value must carry the trailing d (for example, 135d).
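The distinction between the two numeric literal types can be sketched in Python as follows. This is one reading of the rules in this section, not the Dataphor lexer: a literal made only of digits (or a 0x hex number) is classified as System.Integer, and anything else that fits the decimal grammar given in this section is classified as System.Decimal. classify_number is a hypothetical name.

```python
import re

INTEGER = re.compile(r"[0-9]+|0x[0-9A-Fa-f]+")
# digits, optional fraction, optional exponent, optional trailing d
DECIMAL = re.compile(r"[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]*)?d?")


def classify_number(text):
    """Return the assumed D4 type name of a numeric literal, or None."""
    if INTEGER.fullmatch(text):
        return "System.Integer"  # digits alone are always an integer
    if DECIMAL.fullmatch(text):
        return "System.Decimal"
    return None
```

With this reading, 135 is an integer, while 135d, 332.12d and 31415926535897932e-16 are decimals.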

The decimal parser literal in D4 has the following syntax:

<decimal parser literal> ::=
    <digit>{<digit>}[.{<digit>}][(e|E)[+|-]{<digit>}][d]

6.4. Money

The money parser literal allows values of type System.Money to be represented directly within D4.

The money parser literal in D4 has the following syntax:

<money parser literal> ::=
    $<digit>{<digit>}[.{<digit>}]

The $ symbol only tells the compiler that the literal is of type System.Money. It does not specify the currency used (dollars, for example).
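A minimal Python sketch of the money grammar above follows. MONEY and parse_money are hypothetical names, and the use of Python's Decimal to hold the amount is purely illustrative; it says nothing about how System.Money is stored. The $ prefix is stripped as a type marker and no currency is attached to the result.

```python
import re
from decimal import Decimal

# $ followed by digits and an optional fractional part.
MONEY = re.compile(r"\$([0-9]+(?:\.[0-9]*)?)")


def parse_money(text):
    """Return the numeric amount of a D4-style money literal."""
    match = MONEY.fullmatch(text)
    if match is None:
        raise ValueError(f"not a money literal: {text!r}")
    return Decimal(match.group(1))
```

For example, parse_money("$40.00") returns the amount 40.00, while the text 40.00 without the leading $ is rejected.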

6.5. String

The string parser literal allows values of type System.String to be represented directly within D4. Either the straight single quote character (') or the straight double quote character (") can be used to delimit a string. Do not use curved quote characters. Within the string, the delimiting quote character is represented by doubling it.
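The doubling rule can be sketched in Python as follows. unquote is a hypothetical helper, not a Dataphor function; it takes the literal exactly as written in D4 source and returns the string value it denotes.

```python
def unquote(literal):
    """Return the value of a D4-style string literal, undoubling quotes."""
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError(f"not a string literal: {literal!r}")
    quote = literal[0]
    body = literal[1:-1]
    # Every delimiter character inside the body must be part of a doubled pair.
    if quote in body.replace(quote * 2, ""):
        raise ValueError(f"unescaped {quote} inside literal: {literal!r}")
    return body.replace(quote * 2, quote)
```

Given the literal "Welcome to the ""community"" website.", the function returns the text with single double quotes around community. A literal delimited by single quotes, such as '"Hello," she said.', needs no doubling for the double quotes it contains.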

The string parser literal in D4 has the following syntax:

<string parser literal> ::=
    ""{<character>}"" | '{<character>}'

7. Identifiers

Identifiers are user-defined names for catalog objects such as variables and types.

D4 identifiers have the following syntax:

<identifier> ::=
    _ | <letter> {_ | <letter> | <digit>}

Here is an example of a valid D4 identifier:

Customers
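The identifier grammar above can be approximated with a regular expression. The following Python sketch reads the grammar as "an underscore or letter, followed by any number of underscores, letters or digits" and assumes that a letter means an ASCII letter; neither point is spelled out in this section. IDENTIFIER and is_identifier are hypothetical names, and the check does not exclude reserved words.

```python
import re

# First character: underscore or letter; then underscores, letters, digits.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if text has the shape of an unqualified D4 identifier."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under these assumptions Customers and _Temp1 are accepted, while 1Customers and My.Customers (a qualified name) are not.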

7.1. Qualified Identifiers

The D4 language uses the concept of namespaces to allow identifiers to be named more completely, yet accessed more concisely. An identifier that utilizes namespaces is called a qualified identifier because it is prefixed by one or more identifiers called qualifiers.

Qualified identifiers have the following syntax:

<qualified identifier> ::=
    [.]{<identifier>.}<identifier>

Name resolution with qualified identifiers is based on the notion of name equivalence. A given name is equivalent to another name if and only if it is equal, case-sensitively, to some unqualified version of the name. Thus:

  • A is equivalent to A

  • A is equivalent to A.A and B.A, but not A.B

  • A.A is equivalent to A.A, but not B.A

When attempting to resolve a name reference against a list of names such as the set of columns in a table, if the reference is equivalent to more than one name in the list, the reference is considered ambiguous.
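The equivalence and ambiguity rules above can be sketched in Python as follows. This is an illustration under the stated definition, not the Dataphor resolver: is_equivalent and resolve are hypothetical names, and root-qualified references (a leading dot) are not handled.

```python
def is_equivalent(reference, name):
    """True if reference equals name with zero or more leading qualifiers removed."""
    ref_parts = reference.split(".")
    name_parts = name.split(".")
    if len(ref_parts) > len(name_parts):
        return False
    # Case-sensitive comparison of the trailing components of name.
    return name_parts[-len(ref_parts):] == ref_parts


def resolve(reference, names):
    """Return the single name in names that reference is equivalent to."""
    matches = [name for name in names if is_equivalent(reference, name)]
    if not matches:
        raise LookupError(f"unknown identifier: {reference}")
    if len(matches) > 1:
        raise LookupError(f"ambiguous identifier: {reference}")
    return matches[0]
```

For instance, against the names MyCompany.MyProduct.MyVariable and MyCompany.OtherProduct.MyVariable, the reference OtherProduct.MyVariable resolves to the second name, while the reference MyVariable matches both and is reported as ambiguous.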

The following is an example of a qualified identifier:

MyCompany.MyProduct.Customers

7.2. Disambiguating Identifiers

Unless ambiguous, schema objects can be accessed using their unqualified names. Names must be qualified only to the point where they are no longer ambiguous, but may be qualified more completely if desired.

The following example illustrates the use of namespaces in D4:

var MyCompany.MyProduct.MyVariable : Integer;
var MyCompany.OtherProduct.MyVariable : Integer;
MyVariable := 5;  // Error, MyVariable must be disambiguated
OtherProduct.MyVariable := 5; // Valid
MyCompany.MyProduct.MyVariable := 6;  // Also valid

The root of the namespace can be accessed using a dot qualifier with no preceding identifier as follows:

var .I : Integer;
.I := 5;

8. Case

D4 is a case-sensitive language, meaning that the symbols and identifiers read by the compiler are compared case-sensitively. In other words, the symbol A is different from the symbol a. The following code sample illustrates this behavior.

begin
    var I : Integer;
    I := Length("Relational"); // valid reference
    i := Length("Relational"); // unknown identifier
end;

Because D4 is case-sensitive, Alphora recommends the use of Pascal-casing for all identifiers. In Pascal-casing, the first letter of each word in the identifier is capitalized, for example PascalCasing. This gives identifiers a completely open identifier space, because all keywords are lower case. For example, value is not a valid identifier because it conflicts with the reserved word value, but Value is a valid identifier.

Note
Most SQL-based systems are case-insensitive, so be careful not to rely on casing for identifier resolution, as it could lead to problems when translating into the various dialects of SQL.
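One way to guard against this is to check a set of identifiers for names that differ only by case before they are translated to SQL. The following Python sketch is a hypothetical helper, not a Dataphor feature; it groups names by their lower-case form and reports every group with more than one member.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that are equal when compared case-insensitively."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, given Customers, customers and Orders, the helper reports Customers and customers as a colliding pair.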
