Dokmelody

A federated system for programming and sharing information between individual users, groups and communities

21 Jun 2023

Devlog - week 9 of 33

I took advantage of the multi-paradigm nature of Racket and Lisp, for parsing Dok code.

I used Brag for parsing Dok. Then I converted the parsed tree to an Abstract Syntax Tree (AST) in the RACR format.

This is an example of Dok code to parse:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
data A [
  fun f::Int {
    return 0
  }

  variant ../B [
    fun f::Int -override {
      return 1
    }
  ]

  variant ../C [
    fun f::Int -override {
      return 2
    }
  ]
]

var b::A/B
assert b.f == 1

var a::A
assert a.f == 0

The grammar in Brag is rather easy to write

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
code: stmts NEWLINE*

stmts:
  NEWLINE+ (stmt NEWLINE+)+
| stmt

stmt:
  "return" expr
| "assert" expr
| type-decl
| var-decl

var-decl:
  "var" ID ("::" type-ref)? CMD-REST? expr?

type-decl:
  "data" TYPE type-params? CMD-REST? ("[" data-content "]")?

data-content:
  data-part ("," data-part)*
| NEWLINE+ (data-part NEWLINE+)*

data-part:
  slot-decl
| variant-decl
| fun-decl
| type-decl

; TO CONTINUE ...

The corresponding AST written in RACR formalism has a very strange syntax, so I take advantage of the flexibility of S-expressions in Lisp, and I wrote it in a Dok-like formalism

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51

(define-ast dok-spec
  '(Code
      ~ :: Stmt*)

  '(Stmt
      (Return ~ :: Expr)
      (Assert ~ :: Expr)
      (Data-decl
          type :: $String
          ~ :: Type-param*
          cmd-rest :: $String
          ~ :: Data-part*)
      (Var-decl
          id :: $String
          ~ :: Type-ref
          cmd-rest :: $String
          ~ :: Expr)
    )

  '(Data-part
      (Slot-decl
          id :: $String
          ~ :: Type-ref
          ~ :: Expr)
      (Variant-decl
          type-name :: $String
          cmd-rest :: $String
          ~ :: Data-part*)
      (Fun-decl
          id :: $String
          ~ :: Fun-param-decl*
          Result :: Type-ref
          cmd-rest :: $String
          ~ :: Stmt*)
      (Nested-data-decl
         ~ :: Stmt/Data-decl))

  '(Fun-param-decl
      id :: $String
      ~ :: Type-ref)

  '(Type-ref
      paths :: Type-path*
      variants :: $ListOfString)

  '(Type-path
      type :: $String
      ~ :: Type-param*)

; TO CONTINUE ...

Then I wrote a simple walk function that converts this to the specific RACR syntax.

Now I had to convert the Syntax Tree to the AST. They are not much different, so I specified a list of S-expressions rewriting rules. This is an example of the rewriting rules:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
(define (dok-syntax-tree->AST stx)
  ; ...
  (define (remove-variant-tokens xs)
    (filter (lambda (x) (not (and (string? x) (string=? x "/")))) xs))

  (define (remove-path-tokens xs)
    (filter (lambda (x) (not (and (string? x) (string=? x ".")))) xs))
  ; ...
  
 (define simplify
    (replace-all-repeated
      (list 'code xs ...) --> (list* 'Code (remove-extra-tokens xs))

      (list 'stmts stmts ...) --> (remove-extra-tokens stmts)

      (list 'data-part (list 'slot-decl "slot" x "::" tr)) --> (list 'Data-part/Slot-decl x tr)
      (list 'data-part (list 'variant-decl "variant" "../" variant "[" content "]")) --> (list 'Data-part/Variant-decl variant "" content)
      (list 'data-part (list 'fun-decl "fun" f "::" tr cmd-rest "{" stmts "}")) --> (list 'Data-part/Fun-decl f '() tr cmd-rest stmts)
      (list 'data-part (list 'fun-decl "fun" f "::" tr "{" stmts "}")) --> (list 'Data-part/Fun-decl f '() tr "" stmts)

      (list 'data-content cnt ...) --> (remove-extra-tokens cnt)

      (list 'stmt (list 'var-decl "var" x "::" tr)) --> (list 'Stmt/Var-decl x tr "" (list 'Expr/Default-value))
      (list 'stmt (list 'var-decl "var" x "::" tr cmd-rest)) --> (list 'Stmt/Var-decl x tr cmd-rest (list 'Expr/Default-value))
      (list 'stmt "assert" expr) --> (list 'Stmt/Assert expr)
      (list 'stmt "return" expr) --> (list 'Stmt/Return expr)

      (list 'stmt (list 'type-decl "data" type-name type-params "[" (list 'data-content xs ...) "]")) --> (list 'Stmt/Data-decl type-name type-params "" (remove-extra-tokens xs))
      (list 'stmt (list 'type-decl "data" type-name "[" (list 'data-content xs ...) "]")) --> (list 'Stmt/Data-decl type-name '() "" (remove-extra-tokens xs))

      ; ...
  ))

In Lisp, every time there is a task to solve, one can use the best approach/paradigm, and it is rather easy writing data/code transformations utilities for gluing all together, because it is all in the same S-expressions format.