
3. Lexical Matters

The vocabulary of the Q programming language consists of notations for identifiers, integers, floating point numbers, character strings, comments, a few reserved words which may not be used as identifiers, and some special symbols which are used as operators and delimiters. Whitespace (blanks, tabs, newlines, form feeds) serves as a delimiter between adjacent symbols, but is otherwise ignored. Comments are treated like whitespace:

/* This is a comment ... */

Line-oriented comments in both Prolog style and BCPL/C++ style are also supported:

% this is a comment ...
// C++-style comment ...

Furthermore, lines beginning with the #! symbol denote a special type of comment which may be processed by the operating system's command shell and the Q programming tools. On UNIX systems, this (odd) feature allows you to execute Q scripts directly from the shell (by specifying the q program as a command language processor) and to include compiler and interpreter command line options in a script (see B.4 Running Scripts from the Shell).
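
The comment rules above can be illustrated with a small Python sketch. It is not the Q scanner: the function name strip_comments is hypothetical, and the sketch assumes that /* ... */ comments do not nest and that comment markers inside string constants are left alone, neither of which is stated above.

```python
def strip_comments(src):
    """Remove comments from Q source text (illustrative sketch only)."""
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        at_line_start = i == 0 or src[i - 1] == "\n"
        if ch == '"':
            # Copy a string constant verbatim, skipping over backslash escapes.
            j = i + 1
            while j < n and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        elif src.startswith("/*", i):
            # Block comment: treated like whitespace (assumed non-nesting).
            end = src.find("*/", i + 2)
            i = n if end < 0 else end + 2
            out.append(" ")
        elif (ch == "%" or src.startswith("//", i)
              or (at_line_start and src.startswith("#!", i))):
            # Line-oriented comment: skip to the end of line, keep the newline.
            end = src.find("\n", i)
            i = n if end < 0 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, strip_comments('a /* c */ b') yields 'a   b', with the comment replaced by a single blank.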

Identifiers are denoted by the usual sequences of letters (including `_') and digits, beginning with a letter. Upper- and lowercase letters are distinct. In the Q language, identifiers denote both function and variable symbols. As in Prolog, a capitalized identifier (such as X, Xmax and XMAX) indicates a variable symbol; all other identifiers denote function symbols (unless they are declared as "free" variables, see below). Unlike in Prolog, the underscore `_' counts as a lowercase letter, hence _MAX is a function symbol, not a variable. However, as an exception to the general rule, the identifier `_' does denote a variable symbol, the so-called anonymous variable.

Variables come in two flavours: bound variables, which occur on the left-hand side of an equation, and free variables, which occur only on the right-hand side and/or in the condition part of an equation. Identifiers may also be declared as free variables; see 5. Declarations. In this case, they may also start with a lowercase letter. Furthermore, all new symbols created interactively in the interpreter are treated as free variable symbols.

Both function and free variable identifiers may also be qualified with a module identifier prefix (cf. 4. Scripts and Modules), to specifically denote a symbol of the given module. Formally, the syntax of identifiers is described by the following grammatical rules:

identifier              : unqualified-identifier
                        | qualified-identifier
qualified-identifier    : unqualified-identifier '::'
                          unqualified-identifier
unqualified-identifier  : variable-identifier
                        | function-identifier
variable-identifier     : uppercase-letter {letter|digit}
                        | '_'
function-identifier     : lowercase-letter {letter|digit}
letter                  : uppercase-letter|lowercase-letter
uppercase-letter        : 'A'|...|'Z'
lowercase-letter        : 'a'|...|'z'|'_'
digit                   : '0'|...|'9'

(Please refer to A. Q Language Grammar, for a description of the BNF grammar notation used throughout this document.)
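
As an illustration of the identifier rules, here is a minimal Python sketch that classifies an identifier by its spelling alone. The name classify_identifier is hypothetical. The sketch assumes that a qualified identifier has the form module::name, and it ignores both free variable declarations and reserved words, so it only reflects the default capitalization rule.

```python
import re

# An unqualified identifier: a letter ('_' counts as a lowercase letter)
# followed by any number of letters and digits.
_UNQUALIFIED = r"[A-Za-z_][A-Za-z0-9_]*"
# Optional module prefix, assumed here to be written as  module::name.
_IDENT_RE = re.compile(rf"(?:({_UNQUALIFIED})::)?({_UNQUALIFIED})\Z")


def classify_identifier(text):
    """Return (module, name, kind), or None if text is not an identifier."""
    m = _IDENT_RE.match(text)
    if m is None:
        return None
    module, name = m.groups()
    if name == "_":
        kind = "variable"      # the anonymous variable
    elif name[0].isupper():
        kind = "variable"      # capitalized identifiers are variables
    else:
        kind = "function"      # includes names starting with '_'
    return (module, name, kind)
```

With this sketch, classify_identifier("Xmax") gives (None, "Xmax", "variable"), while classify_identifier("_MAX") gives (None, "_MAX", "function").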

The reserved words of the Q language are:

as        and       const     def       div       else      extern
if        import    in        include   mod       not       or
otherwise private   public    special   then      type      undef
var       where
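
A tool that processes Q source would typically check a word against this list before treating it as an identifier. A minimal Python sketch (the names RESERVED and is_reserved are hypothetical, and the comparison is assumed to be case-sensitive because upper- and lowercase are distinct):

```python
# The reserved words listed above.
RESERVED = frozenset("""
    as and const def div else extern
    if import in include mod not or
    otherwise private public special then type undef
    var where
""".split())


def is_reserved(word):
    """True if word is a reserved word and thus not usable as an identifier."""
    return word in RESERVED
```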

Signed decimal numeric constants are denoted by sequences of decimal digits and may contain a decimal point and/or a scaling factor. Integers can also be denoted in octal or hexadecimal, using the same syntax as in C:

number                  : ['-'] unsigned-number
unsigned-number         : '0' octdigitseq
                        | '0x' hexdigitseq
                        | '0X' hexdigitseq
                        | digitseq ['.' [digitseq]] [scalefact]
                        | [digitseq] '.' digitseq [scalefact]
digitseq                : digit {digit}
octdigitseq             : octdigit {octdigit}
hexdigitseq             : hexdigit {hexdigit}
scalefact               : 'E' ['-'] digitseq
                        | 'e' ['-'] digitseq
digit                   : '0'|...|'9'
octdigit                : '0'|...|'7'
hexdigit                : '0'|...|'9'|'a'|...|'f'|'A'|...|'F'

Simple digit sequences without decimal point and scaling factor are treated as integers; if the sequence starts with `0' or `0x'/`0X' then it denotes an integer in octal or hexadecimal base, respectively. Other numbers denote (decimal) floating point values. If a decimal point is present, it must be preceded or followed by at least one digit. Both the scaling factor and the number itself may be prefixed with a minus sign. (Syntactically, a minus sign in front of a number is an instance of unary minus, cf. 6. Expressions. However, when unary minus occurs in front of a number, it is interpreted as part of the number and denotes a negative value; see the remarks concerning unary minus in 6. Expressions.) Some examples:

0  -187326  0.0  -.05  3.1415e3  -1E-10  0177  0xaf  -0XFFFF
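
The number grammar can be checked against these examples with a short Python sketch. The function parse_number is hypothetical and only mirrors the grammar rules; it maps Q integers to Python int and floating point values to Python float. How a sequence such as 08 is treated is not stated above; the sketch follows the grammar literally and reads it as a decimal integer, which is an assumption.

```python
import re

_HEX = re.compile(r"0[xX][0-9a-fA-F]+\Z")
_OCT = re.compile(r"0[0-7]+\Z")
_DEC_INT = re.compile(r"[0-9]+\Z")
# digitseq ['.' [digitseq]] [scalefact]  |  [digitseq] '.' digitseq [scalefact]
_FLOAT = re.compile(
    r"(?:[0-9]+(?:\.[0-9]*)?|[0-9]*\.[0-9]+)(?:[eE]-?[0-9]+)?\Z"
)


def parse_number(text):
    """Convert a Q numeric constant to a Python int or float."""
    sign, body = 1, text
    if body.startswith("-"):
        sign, body = -1, body[1:]
    if _HEX.match(body):
        return sign * int(body[2:], 16)
    if _OCT.match(body):
        return sign * int(body, 8)
    if _DEC_INT.match(body):
        # Includes a lone "0"; also "08" and the like (assumption, see above).
        return sign * int(body, 10)
    if _FLOAT.match(body):
        return sign * float(body)
    raise ValueError(f"not a number: {text!r}")
```

For example, parse_number("0177") returns 127, parse_number("-0XFFFF") returns -65535, and parse_number("3.1415e3") returns 3141.5.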

String constants are written as character sequences enclosed in double quotes:

string                  : '"' {char} '"'
char                    : any character but newline and "

To include newlines, double quotes and non-printable characters in a string, the following escape sequences may be used:

\n                      newline
\r                      carriage return
\t                      tab
\b                      backspace
\f                      form feed
\"                      double quote
\\                      backslash

Furthermore, a character may also be denoted in the form \N, where N is the character number in decimal, hexadecimal or octal (using the same syntax as for unsigned integer values).

A string may be continued across lines by putting the \ character immediately before the end of the line, which causes the following newline character to be ignored.

Some examples:

""                      empty string
"A"                     single character string
"\27"                   escape character (ASCII 27)
"\033"                  same in octal
"\0x1b"                 same in hexadecimal
"a string"              multiple character string
"a \"quoted\" string"   include double quotes
"a line\n"              include newline
"a very \               continue across line end
long line\n"
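
The escape rules can be illustrated with a Python sketch that decodes the text between the double quotes. The function decode_string_body is hypothetical. The sketch assumes that a numeric escape \N extends over as many digits as possible, and that any other character after a backslash is an error; neither detail is stated above.

```python
import re

_SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
           '"': '"', "\\": "\\"}
# \N uses the syntax of unsigned integers: hexadecimal, octal or decimal.
_NUM = re.compile(r"(?P<hex>0[xX][0-9a-fA-F]+)|(?P<oct>0[0-7]+)|(?P<dec>[0-9]+)")


def decode_string_body(body):
    """Decode the escape sequences in the body of a Q string constant."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i]
        if nxt == "\n":
            i += 1  # line continuation: the newline is ignored
        elif nxt in _SIMPLE:
            out.append(_SIMPLE[nxt])
            i += 1
        else:
            m = _NUM.match(body, i)
            if m is None:
                raise ValueError(f"unknown escape: \\{nxt}")
            if m.lastgroup == "hex":
                code = int(m.group()[2:], 16)
            elif m.lastgroup == "oct":
                code = int(m.group(), 8)
            else:
                code = int(m.group(), 10)
            out.append(chr(code))
            i = m.end()
    return "".join(out)
```

Applied to the bodies of the three escape character examples above (\27, \033 and \0x1b), the sketch returns the same single character with code 27 in each case.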

Finally, as already mentioned, some special symbols are used as operators and delimiters:

~ , ; : | || < > = <= >= <> ++ + - * / \ ^ ! # @ ( ) [ ]
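
Several of these symbols are prefixes of others (for instance `<' and `<=', or `|' and `||'). The text above does not say how such cases are resolved; the following Python sketch assumes the common "longest match" convention. The names SYMBOLS and split_symbols are hypothetical.

```python
# The special symbols listed above.
SYMBOLS = ["~", ",", ";", ":", "|", "||", "<", ">", "=", "<=", ">=", "<>",
           "++", "+", "-", "*", "/", "\\", "^", "!", "#", "@",
           "(", ")", "[", "]"]
# Try longer symbols first so that "<=" is preferred over "<" (assumption).
_BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def split_symbols(text):
    """Split a run of special symbols and whitespace into symbol tokens."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for sym in _BY_LENGTH:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"not a special symbol: {text[i]!r}")
    return tokens
```

Under this assumption, split_symbols("<=<>++ +") yields the tokens <=, <>, ++ and +.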


This document was generated by Albert Gräf on October, 14 2003 using texi2html