[about document]
Peculiar kind of S-expressions specification document
[intended audience]
beginners in programming
S-expressions (Symbolic Expressions) are a fundamental concept in computer science and programming language theory. S-expressions are a simple, yet powerful notation for representing nested list data structures and code in a parenthesized form. They are commonly associated with the Lisp family of programming languages, where they serve both as a way to represent code and data uniformly.
S-expr is an S-expression parsing library. Beyond the usual treatment of atoms and lists, it makes some peculiar syntax decisions regarding strings, comments, and transposed blocks of content.
The general form of an S-expression is either:
- An atom (e.g., atom), or
- A list of S-expressions (e.g., (expr1 expr2 expr3)).
Lists can be nested, allowing for the representation of complex hierarchical structures. For example:
(eq (mul x x) (pow x 2))
This S-expression states that x multiplied by itself equals x raised to the power of 2.
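To make the atom-or-list structure concrete, here is a minimal Python sketch of a reader for plain S-expressions. It is only an illustration, not the S-expr library's implementation: the function names tokenize and parse are hypothetical, and the strings, comments, and transposed blocks described later are not handled.

```python
def tokenize(text):
    """Split text into parentheses and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text):
    """Parse one S-expression into nested Python lists of atom strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            # keep reading nested expressions until the matching ")"
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing closing parenthesis")
            pos += 1  # skip the ")"
            return items
        if token == ")":
            raise SyntaxError("unexpected closing parenthesis")
        return token  # an atom

    result = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input after expression")
    return result


print(parse("(eq (mul x x) (pow x 2))"))
# ['eq', ['mul', 'x', 'x'], ['pow', 'x', '2']]
```

Atoms come back as plain strings and lists as Python lists, so the nesting of the parentheses maps directly onto the nesting of the result.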
Although a great part of the power of S-expressions lies in their simplicity, let's introduce a few extensions in the hope of making the expressed code more readable, namely: strings, comments, and transposed blocks.
Strings in S-expr may be single-line or multi-line. Single-line strings are atoms enclosed in a pair of " symbols, as in the expression "this is a single-line string", and represent Unicode strings. Multi-line strings are enclosed between two runs of an odd number (greater than 1) of " symbols, in the following manner:
"""
this is a
multi-line
string
       """
Enclosed between a pair of """ symbols, a multi-line string is bounded by a rectangle that spans from the start of the first """ symbol to the end of the second """ symbol. When modifying the contents of a multi-line string, take care that the end of the second """ symbol always reaches at least as far right as the longest line in the string.
Notice that it is also possible to write expressions like:
(fst-atom """   trd-atom)
          00001
          00002
          00003
            """
where the expression stands for three atoms in a list.
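The rectangle rule can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the library's code: the function name extract_block is hypothetical, the content lines are assumed to sit directly beneath the opening delimiter, and the content itself is assumed not to contain the delimiter.

```python
def extract_block(lines, row, col, quote='"'):
    """Return (content_lines, closing_row) for a block opened at lines[row][col].

    Assumption: the opening run of quote characters marks the top-left corner
    of the rectangle and the closing run of the same length marks its
    bottom-right corner.
    """
    # measure the opening delimiter
    width = 0
    while col + width < len(lines[row]) and lines[row][col + width] == quote:
        width += 1
    if width < 3 or width % 2 == 0:
        raise SyntaxError("delimiter must be an odd run longer than 1")
    delim = quote * width

    # look for the closing delimiter on a later row, at or right of col
    for r in range(row + 1, len(lines)):
        start = lines[r].find(delim, col)
        if start != -1:
            right = start + width  # right edge of the rectangle
            content = [line[col:right].rstrip() for line in lines[row + 1:r]]
            return content, r
    raise SyntaxError("unterminated multi-line block")


lines = [
    '(fst-atom """   trd-atom)',
    '          00001',
    '          00002',
    '          00003',
    '            """',
]
print(extract_block(lines, 0, 10))
# (['00001', '00002', '00003'], 4)
```

Everything outside the rectangle, such as trd-atom on the first row, is left for the ordinary reader to process.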
Comments are ignored by the system and serve as notes for programmers reading the code. They are parsed just like strings, only using the / symbol instead of the " symbol. Thus, a single-line comment may be written as /this is a single-line comment/, and may appear repeatedly wherever whitespace is expected. An example of a multi-line comment may be:
///
this is a
multi-line
comment
       ///
Just like strings enclosed between a pair of """ symbols, multi-line comments are bounded by a rectangle between the start of the first /// symbol and the end of the second /// symbol. Notice that it is also possible to write expressions like:
///
00001 (
00002     fst-atom
00003     snd-atom
00004     trd-atom
00005 )
  ///
where the expression stands for three atoms in a list.
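Since single-line comments may appear wherever whitespace is expected, one simple way to handle them is to replace each one with a space before reading the expression. The Python sketch below illustrates this under stated assumptions: the function name strip_single_line_comments is hypothetical, only single-line strings and comments are considered, and no escape sequences are assumed, since none are defined here.

```python
def strip_single_line_comments(line):
    """Replace each /.../ comment with one space, leaving "..." strings intact."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            # copy a single-line string verbatim, including its quotes
            end = line.find('"', i + 1)
            if end == -1:
                raise SyntaxError("unterminated string")
            out.append(line[i:end + 1])
            i = end + 1
        elif ch == "/":
            # a comment counts as whitespace
            end = line.find("/", i + 1)
            if end == -1:
                raise SyntaxError("unterminated comment")
            out.append(" ")
            i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_single_line_comments('(a /note/ "x/y" b)').split())
# ['(a', '"x/y"', 'b)']
```

The / inside the string "x/y" is preserved because strings are copied before comments are looked for.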
Transposed blocks are something new and unusual in the world of S-expression parsing. They represent blocks of code in which rows and columns swap places, as if mirrored along the diagonal. Ordinary S-expressions span vertically and can indent their contents horizontally. Analogously, transposed blocks span horizontally and can indent their contents vertically, thus using two-dimensional space in a diagonally mirrored manner.
For example, the expression:
(eq (mul (a a)) (pow (a 2)))
may also be written with horizontal indentation as:
(
    eq
    (
        mul
        (
            a
            a
        )
    )
    (
        pow
        (
            a
            2
        )
    )
)
which, in turn, may be transposed using sequences of * symbols, as in the following example:
* (                               )
*   e (           ) (           )
*   q   m (     )     p (     )     *
        u   a a       o   a 2       *
        l             w             *
Transposed blocks are marked by runs of an odd number (greater than 1) of * symbols spanning vertically. These blocks are an attempt to tackle the unreadable sequences of parentheses that usually pile up at the end of expressions in S-expression code. If the optional vertical indentation in a transposed block follows the syntax tree of the expression, we may get a more readable version of the original expression. Notice that transposed blocks can be nested, in which case blocks at odd nesting depths are visually oriented vertically relative to blocks at even depths.
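The row-and-column swap itself can be illustrated with a short Python sketch. It only mirrors a block of text along its diagonal. The function name transpose is hypothetical, the example uses one space per indentation level, and the * markers are deliberately left out, so this is not the S-expr library's parser.

```python
from itertools import zip_longest


def transpose(text):
    """Swap rows and columns of a text block, padding short rows with spaces."""
    rows = text.split("\n")
    columns = zip_longest(*rows, fillvalue=" ")
    return "\n".join("".join(column).rstrip() for column in columns)


# the indented form of (eq (mul (a a)) (pow (a 2))), one space per level
indented = "\n".join([
    "(",
    " eq",
    " (",
    "  mul",
    "  (",
    "   a",
    "   a",
    "  )",
    " )",
    " (",
    "  pow",
    "  (",
    "   a",
    "   2",
    "  )",
    " )",
    ")",
])

transposed = transpose(indented)
print(transposed)
# transposing twice gives back the original block
print(transpose(transposed) == indented)
```

Each line of the indented form becomes one column of the transposed form, so horizontal indentation turns into vertical indentation.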
Along with transposed blocks, as a final add-on, let's introduce transposed lines, vertically enclosed within a pair of single * symbols:
*
t
w
i
s
t
*
This expression behaves just as if we had written twist horizontally, in a single line. Combining transposed lines with transposed blocks, we can get a more readable form of our transposed block from above:
* (                                                  )
*   *eq* (                   ) (                   )
*          *mul* (         )     *pow* (         )     *
                   *a* *a*               *a* *2*       *
                                                       *
Transposed lines within this block are written horizontally because, in a transposed block, rows and columns swap places.
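Outside a transposed block, a transposed line runs vertically, as in the twist example. A minimal Python sketch of reading such a line might look like this. The function name read_transposed_line is hypothetical, and the sketch assumes the closing * sits in the same column as the opening one.

```python
def read_transposed_line(lines, row, col):
    """Read the characters stacked vertically between two single '*' marks.

    Assumption: lines[row][col] is the opening '*' and the closing '*' is in
    the same column on a later row. Returns (text, closing_row).
    """
    if col >= len(lines[row]) or lines[row][col] != "*":
        raise SyntaxError("expected '*' at the start of a transposed line")
    chars = []
    for r in range(row + 1, len(lines)):
        # rows shorter than col are treated as if padded with spaces
        ch = lines[r][col] if col < len(lines[r]) else " "
        if ch == "*":
            return "".join(chars), r
        chars.append(ch)
    raise SyntaxError("unterminated transposed line")


column = ["*", "t", "w", "i", "s", "t", "*"]
print(read_transposed_line(column, 0, 0))
# ('twist', 6)
```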
We have informally defined the S-expr code format and introduced a somewhat peculiar way of treating strings and comments. We have also introduced a strange twist, transposed contents, in an attempt to cope with otherwise cumbersome sequences of closing parentheses. We tried to be consistent with these add-ons in order to keep an acceptable ratio between simplicity and usability. The resulting code format is a bit more complicated than in the usual Lisp languages, but we hope that the introduced complexity is justified by the readability of data expressed this way.