Creating a Programming Language
This tutorial teaches how to create a simple programming language that uses s-expressions for its instructions. Here is an example snippet of the language:
(say (+ (* 4 7) 2))
This project contains the completed code if you don't want to go through the whole tutorial.
Prerequisites
You will need to create these Variables.
(i)
(accumulator)
(token)
Also create these Lists.
(tokens :: list)
(instructions :: list)
(operator stack :: list)
(stack :: list)
Tokenizer
The tokenizer (also known as a lexer) converts a program into a list of tokens, or bits of the original program like words, numbers, and parentheses. Шаблон:Note
define tokenize (program) set [i v] to [1] set [accumulator v] to [] delete all of [tokens v] repeat until <(i) > (length of (program))> set [token v] to (letter (i) of (program)) if <[() ] contains (token)> then // Make sure to include a space after "()" or it won't work correctly if <not <(accumulator) = []>> then add (accumulator) to [tokens v] set [accumulator v] to [] end if <[()] contains (token)> then add (token) to [tokens v] end else set [accumulator v] to (join (accumulator) (token)) end change [i v] by [1] end if <not <(accumulator) = []>> then add (accumulator) to [tokens v] end
Parser
The parser takes in the tokens and builds a parse tree, or in the case of this language, a simple form of assembly.
define parse delete all of [instructions v] delete all of [operator stack v] repeat until <(length of [tokens v]) = [0]> set [token v] to (item [1] of [tokens v]) delete [1] of [tokens v] if <(token) = [(]> then add (item [1] of [tokens v]) to [operator stack v] delete [1] of [tokens v] else if <(token) = [)]> then add (item (length of [operator stack v]) of [operator stack v]) to [instructions v] delete (length of [operator stack v]) of [operator stack v] else add [push] to [instructions v] add (token) to [instructions v] end end end if <(length of [operator stack v]) > [0]> then say [Unclosed parenthesis] for [2] seconds stop [all v] end
Evaluator
The last step is evaluating (also known as interpreting) the parsed instructions. Without this step, all we would have is a list of instructions which is pretty useless on its own.
define evaluate set [i v] to [1] delete all of [stack v] repeat until <(i) > (length of [instructions v])> run instruction (item (i) of [instructions v]) end define run instruction (operator) if <(operator) = [push]> then add (item ((i) + [1]) of [instructions v]) to [stack v] change [i v] by [2] stop [this script v] end if <(operator) = [say]> then say (item (length of [stack v]) of [stack v]) for [2] seconds change [i v] by [1] stop [this script v] end if <(operator) = [+]> then replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) + (item (length of [stack v]) of [stack v])) delete (length of [stack v]) of [stack v] change [i v] by [1] stop [this script v] end if <(operator) = [-]> then replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) - (item (length of [stack v]) of [stack v])) delete (length of [stack v]) of [stack v] change [i v] by [1] stop [this script v] end if <(operator) = [*]> then replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) * (item (length of [stack v]) of [stack v])) delete (length of [stack v]) of [stack v] change [i v] by [1] stop [this script v] end if <(operator) = [/]> then replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) / (item (length of [stack v]) of [stack v])) delete (length of [stack v]) of [stack v] change [i v] by [1] stop [this script v] end ... // You can add your own instructions to make the language even better!
Putting It All Together
when gf clicked tokenize [(say (+ (* 4 7) 2))] :: custom parse :: custom evaluate :: custom
When you run the program, the sprite should say "30".
Final Thoughts
While the language works, it could be greatly improved. Here are some things you could do to improve it:
- More functions, like
ask
andsay-for-seconds
. - Better error handling. Currently, the language doesn't care if you do something wrong, like
(say 1 2 3)
. - Control flow functions, like
if
,while
, andfor
. To implement these you would need to create some sort of jumping system to move around the program.
A project with all of the features above can be viewed here.