A different take on S-expressions

Original link: https://gist.github.com/tearflake/569db7fdc8b363b7d320ebfeef8ab503

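This article introduces S-expr, an S-expression parsing library aimed at beginners. S-expressions are the foundation of the Lisp languages; they use parentheses to represent nested lists and code. S-expr adds readability-oriented features on top of traditional S-expressions:

  • Strings: single-line strings are enclosed in `"` and multi-line strings in `"""`. Strings hold Unicode text, and multi-line strings are bounded by a rectangle.
  • Comments: like strings, single-line comments are delimited by `/` and multi-line comments by `///`.
  • Transposed blocks: marked with `*`, these blocks swap rows and columns, which allows vertical indentation as a way to deal with parenthesis-heavy expressions. Transposed lines (enclosed in single `*` symbols) further improve readability within these blocks.

S-expr aims to balance simplicity and usability, offering a code format that is more complicated than that of standard Lisp implementations but potentially more readable.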

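The Hacker News thread discusses a new proposal for S-expressions that introduces two-dimensional structure in order to improve readability. The original post suggests arranging code elements both vertically and horizontally to show the relationships between them visually.

Experienced Lisp programmers in the thread strongly criticized the idea. They argued that the proposed two-dimensional syntax violates core principles of S-expressions, such as whitespace insensitivity and linear parsing. In their view, the two-dimensional structure adds complexity and requires backtracking, which makes code harder to read and to parse.

The discussion also offered alternative formatting suggestions. Some commenters pointed to Racket's support for custom reader syntax, including two-dimensional structures. Others noted that Clojure already addresses some limitations of standard S-expressions through features such as vectors, hash maps, and keywords. Overall, the thread reflects a general consensus against the original proposal.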

Original text

[about document]
Peculiar kind of S-expressions specification document

[intended audience]
beginners in programming

S-expressions (Symbolic Expressions) are a fundamental concept in computer science and programming language theory. S-expressions are a simple, yet powerful notation for representing nested list data structures and code in a parenthesized form. They are commonly associated with the Lisp family of programming languages, where they serve both as a way to represent code and data uniformly.

S-expr is an S-expression parsing library. Besides the usual treatment of atoms and lists, it makes some peculiar syntax decisions regarding strings, comments, and transposed blocks of contents.

The general form of an S-expression is either:

  • An atom (e.g., atom), or
  • A list of S-expressions (e.g., (expr1 expr2 expr3)).

Lists can be nested, allowing for the representation of complex hierarchical structures. For example:

(eq (mul x x) (pow x 2))

This S-expression states that x multiplied by itself equals x raised to the power of 2.
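As a rough illustration of the atom-or-list definition above, here is a minimal Python sketch of a reader for plain S-expressions. It is not the S-expr library's implementation: the names `tokenize` and `parse` are made up for this example, and strings, comments, and transposed blocks are deliberately left out.

```python
def tokenize(text):
    """Split text into '(' , ')' and atom tokens (no strings or comments)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text):
    """Parse one S-expression into nested Python lists of atom strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            # A list is a sequence of S-expressions up to the matching ')'.
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing closing parenthesis")
            pos += 1  # consume ')'
            return items
        if tok == ")":
            raise SyntaxError("unexpected closing parenthesis")
        return tok  # anything else is an atom

    expr = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input after expression")
    return expr


tree = parse("(eq (mul x x) (pow x 2))")
print(tree)
```

In this sketch a list becomes a Python list and an atom stays a plain string, so the nesting of the printed value mirrors the nesting of the parentheses.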

3. strings, comments and transposed blocks

Although a great part of the power of S-expressions lies in their simplicity, let's introduce a few extensions in the hope of making the expressed code more readable, namely: strings, comments, and transposed blocks.

Strings in S-expr may be single-line or multi-line. Single-line strings are atoms enclosed in a pair of " symbols, as in the expression "this is a single-line string", and hold Unicode text. Multi-line strings are enclosed between runs of an odd number (greater than 1) of " symbols, in the following manner:

"""
this is a
multi-line
string
       """

Multi-line strings enclosed between a pair of """ symbols are bounded by a rectangle that spans from the start of the first """ symbol to the end of the second """ symbol. When modifying the contents of a multi-line string, take care that the end of the second """ symbol always sits at or to the right of the end of the longest line in the string.

Notice that it is also possible to write expressions like:

(fst-atom """   trd-atom)
          00001
          00002
          00003
            """

where the expression stands for three atoms in a list.
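The rectangle rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the library's algorithm: the helper `read_block` is hypothetical, it takes the position of the opening delimiter as given, it treats the first later row that has the delimiter at or to the right of that column as the closing one, and it strips trailing spaces from each content row.

```python
def read_block(lines, row, col, delim='"""'):
    """Return the rows of text inside the rectangle opened at (row, col).

    Assumption: the rectangle runs from the column where the opening
    delimiter starts to the column where the closing delimiter ends.
    """
    if not lines[row].startswith(delim, col):
        raise ValueError("no opening delimiter at (row, col)")
    for end_row in range(row + 1, len(lines)):
        tail = lines[end_row][col:]
        stripped = tail.lstrip(" ")
        if stripped.startswith(delim):
            # Exclusive right edge: where the closing delimiter ends.
            right = col + (len(tail) - len(stripped)) + len(delim)
            body = lines[row + 1:end_row]
            return [ln[col:right].rstrip() for ln in body]
    raise ValueError("unterminated block")


# The second example from the text; columns are spelled out explicitly.
source = [
    '(fst-atom """   trd-atom)',
    " " * 10 + "00001",
    " " * 10 + "00002",
    " " * 10 + "00003",
    " " * 12 + '"""',
]
print(read_block(source, 0, 10))
```

In this sketch, text that falls outside the rectangle is simply not part of the string, which is one way to see why the closing delimiter has to end at or beyond the longest content line.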

Comment expressions are ignored by the system; they serve as notes that help programmers read the code. They are parsed just like strings, only using the / symbol instead of the " symbol. Thus, a single-line comment may be written as /this is a single-line comment/, and comments may appear repeatedly wherever whitespace is expected. An example of a multi-line comment:

///
this is a
multi-line
comment
       ///

Just like strings enclosed between a pair of """ symbols, multi-line comments are bounded by a rectangle that spans from the start of the first /// symbol to the end of the second /// symbol. Notice that it is also possible to write expressions like:

///
00001 (
00002   fst-atom
00003   snd-atom
00004   trd-atom
00005 )
  ///

where the expression stands for three atoms in a list.
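To see why the example above still reads as a list of three atoms, the following Python sketch blanks out the comment rectangle and looks at what is left. The helper `blank_block` is hypothetical and rests on the same assumptions as the rectangle rule described for strings: the rectangle spans from the start of the opening /// to the end of the closing ///, and everything inside it, delimiter rows included, is ignored.

```python
def blank_block(lines, row, col, delim="///"):
    """Return a copy of lines with the rectangle opened at (row, col) blanked."""
    if not lines[row].startswith(delim, col):
        raise ValueError("no opening delimiter at (row, col)")
    for end_row in range(row + 1, len(lines)):
        tail = lines[end_row][col:]
        stripped = tail.lstrip(" ")
        if stripped.startswith(delim):
            # Exclusive right edge: where the closing delimiter ends.
            right = col + (len(tail) - len(stripped)) + len(delim)
            out = list(lines)
            for r in range(row, end_row + 1):
                line = out[r].ljust(right)
                # Overwrite only the columns inside the rectangle.
                out[r] = (line[:col] + " " * (right - col) + line[right:]).rstrip()
            return out
    raise ValueError("unterminated block")


numbered = [
    "///",
    "00001 (",
    "00002   fst-atom",
    "00003   snd-atom",
    "00004   trd-atom",
    "00005 )",
    "  ///",
]
remaining = blank_block(numbered, 0, 0)
print(" ".join(remaining).split())
```

Here the rectangle covers only the column of line numbers, so the parentheses and the three atoms to its right survive as ordinary code.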

Transposed blocks are something new and unusual in the world of S-expression parsing. They represent blocks of code in which rows and columns swap places, as if mirrored along the diagonal. Ordinary S-expressions that span vertically can indent their contents horizontally. Analogously, transposed blocks span horizontally and can indent their contents vertically, thus using two-dimensional space in a diagonally mirrored manner.

For example, the expression:

(eq (mul (a a)) (pow (a 2)))

may also be written with horizontal indentation as:

(
    eq
    (
        mul
        (
            a
            a
        )
    )
    (
        pow
        (
            a
            2
        )
    )
)

which, in turn, may be transposed using * symbol sequences as in the following example:

* (                               )  
*   e (           ) (           )    
*   q   m (     )     p (     )     *
        u   a a       o   a 2       *
        l             w             *

Transposed blocks are marked by an odd number (greater than 1) of * symbols stacked vertically. These blocks are an attempt to tackle the unreadable sequences of parentheses that usually pile up at the end of expressions in S-expression code. If the optional vertical indentation in a transposed block follows the syntax tree of the expression, we may get a more readable version of the original expression. Notice that transposed blocks can be nested; each nesting level swaps rows and columns once more, so the odd levels in the sequence are oriented perpendicular to the even ones.
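The row and column swap itself is easy to sketch in Python. The helper `transpose` below is hypothetical (it is not part of S-expr), and it assumes the * markers have already been stripped so that only the rectangle of text remains. The sample uses one space per indentation level rather than the wider spacing shown above.

```python
def transpose(rows):
    """Swap rows and columns of a block of text, padding rows with spaces."""
    width = max((len(r) for r in rows), default=0)
    padded = [r.ljust(width) for r in rows]
    return ["".join(column) for column in zip(*padded)]


# Part of the vertically spanned expression, indented horizontally.
vertical = [
    "(",
    " eq",
    " (",
    "  mul",
    "  (",
    "   a",
    "   a",
    "  )",
    " )",
    ")",
]

horizontal = transpose(vertical)
for line in horizontal:
    print(line)

# Transposing again restores the original orientation, so both layouts
# yield the same sequence of tokens.
restored = transpose(horizontal)
print(" ".join(restored).split() == " ".join(vertical).split())
```

Under this reading, a parser could handle a transposed block by transposing its rectangle and then reading the result as ordinary S-expression text.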

Along with transposed blocks, as a final add-on, let's introduce transposed lines, vertically enclosed within a pair of single * symbols:

*
t
w
i
s
t
*

This expression behaves just as if we had written twist horizontally, on a single line. Combined with transposed blocks, it gives a more readable form of our transposed block from above:

* (                                                  )  
*   *eq* (                   ) (                   )    
*          *mul* (         )     *pow* (         )     *
                   *a* *a*               *a* *2*       *
                                                       *

Transposed lines within this block are written horizontally because, in a transposed block, rows and columns swap places.
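A small Python sketch can make this interplay concrete. Both helpers are hypothetical and only illustrate the idea: `read_transposed_line` joins the characters found between two * markers in a column, and `transpose` swaps rows and columns of a rectangle of text.

```python
def transpose(rows):
    """Swap rows and columns of a block of text, padding rows with spaces."""
    width = max((len(r) for r in rows), default=0)
    padded = [r.ljust(width) for r in rows]
    return ["".join(column) for column in zip(*padded)]


def read_transposed_line(column):
    """Read a vertical run of one-character rows enclosed in '*' markers."""
    if len(column) < 2 or column[0] != "*" or column[-1] != "*":
        raise ValueError("a transposed line must start and end with '*'")
    return "".join(column[1:-1])


# Written vertically in ordinary code, the column reads as one atom.
print(read_transposed_line(["*", "t", "w", "i", "s", "t", "*"]))

# Inside a transposed block the same construct appears horizontally;
# transposing the block turns it back into a vertical column.
column = transpose(["*eq*"])
print(column)
print(read_transposed_line(column))
```

The design point is that a transposed line does not need a separate rule inside transposed blocks: once the block is transposed, the horizontal `*eq*` is an ordinary vertical transposed line.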

We informally defined the S-expr code format and introduced a somewhat peculiar way of treating strings and comments. We also introduced a strange twist, transposed contents, in an attempt to cope with the otherwise cumbersome sequences of closing parentheses. We tried to keep these add-ons consistent in order to maintain an acceptable ratio between simplicity and usability. The resulting code format is a bit more complicated than that of the usual Lisp languages, but we hope that the added complexity is justified by the readability of data expressed this way.
