Creating a Programming Language

From Поле цифровой дидактики (Field of Digital Didactics)

This tutorial teaches how to create a simple programming language that uses s-expressions for its instructions. Here is an example snippet of the language:

(say (+ (* 4 7) 2))

This project contains the completed code if you don't want to go through the whole tutorial.

Prerequisites

You will need to create these variables:

  • i
  • accumulator
  • token

Also create these lists:

  • tokens
  • instructions
  • operatorstack
  • stack

Tokenizer

The tokenizer (also known as a lexer) converts a program into a list of tokens: small pieces of the original program such as words, numbers, and parentheses.

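define tokenize (program)
set [i v] to (1)
set [accumulator v] to []
delete all of [tokens v]
repeat until <(i) > (length of (program))>
    set [token v] to (letter (i) of (program))
    if <[() ] contains (token)> then // Make sure to include a space after "()" or it won't work correctly
        if <not <(accumulator) = []>> then
            add (accumulator) to [tokens v]
            set [accumulator v] to []
        end
        if <[()] contains (token)> then
            add (token) to [tokens v]
        end
    else
        set [accumulator v] to (join (accumulator) (token))
    end
    change [i v] by (1)
end
if <not <(accumulator) = []>> then
    add (accumulator) to [tokens v]
end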
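For readers who want to check the logic outside Scratch, the same tokenizing loop can be sketched in Python. This is only an illustrative translation, not part of the original project; the function name `tokenize` and the variable `accumulator` simply mirror the Scratch names.

```python
def tokenize(program: str) -> list[str]:
    """Split an s-expression program into parenthesis and word tokens."""
    tokens: list[str] = []
    accumulator = ""  # characters of the word currently being read
    for ch in program:
        if ch in "() ":  # a parenthesis or space ends the current word
            if accumulator:
                tokens.append(accumulator)
                accumulator = ""
            if ch in "()":  # parentheses become tokens; spaces are dropped
                tokens.append(ch)
        else:
            accumulator += ch
    if accumulator:  # flush a word that runs to the end of the program
        tokens.append(accumulator)
    return tokens


print(tokenize("(say (+ (* 4 7) 2))"))
# ['(', 'say', '(', '+', '(', '*', '4', '7', ')', '2', ')', ')']
```

As in the Scratch version, spaces only separate words and never end up in the token list.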

Parser

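The parser takes the tokens and builds a parse tree or, in the case of this language, a flat list of instructions resembling a simple form of assembly. For the example program, the instructions list ends up as push, 4, push, 7, *, push, 2, +, say.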

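define parse
delete all of [instructions v]
delete all of [operatorstack v]
repeat until <(length of [tokens v]) = (0)>
    set [token v] to (item (1) of [tokens v])
    delete (1) of [tokens v]
    if <(token) = [(]> then
        add (item (1) of [tokens v]) to [operatorstack v]
        delete (1) of [tokens v]
    else
        if <(token) = [)]> then
            add (item (length of [operatorstack v]) of [operatorstack v]) to [instructions v]
            delete (length of [operatorstack v]) of [operatorstack v]
        else
            add [push] to [instructions v]
            add (token) to [instructions v]
        end
    end
end
if <(length of [operatorstack v]) > (0)> then
    say [Unclosed parenthesis] for (2) seconds
    stop [all v]
end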
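The parsing step can be sketched in Python as well. This is a rough translation under the assumption that the tokens arrive as a list of strings; raising `SyntaxError` stands in for the Scratch "say ... and stop all" behaviour and is a choice made for this sketch only.

```python
def parse(tokens: list[str]) -> list[str]:
    """Turn a token list into a flat, postfix-ordered instruction list."""
    tokens = list(tokens)  # work on a copy so the caller's list is untouched
    instructions: list[str] = []
    operator_stack: list[str] = []
    while tokens:
        token = tokens.pop(0)
        if token == "(":
            # The word right after "(" is the operator; remember it for later.
            operator_stack.append(tokens.pop(0))
        elif token == ")":
            # The operands have been emitted, so the operator can follow them.
            instructions.append(operator_stack.pop())
        else:
            # Anything else is a literal operand.
            instructions.extend(["push", token])
    if operator_stack:
        raise SyntaxError("Unclosed parenthesis")
    return instructions


example = ["(", "say", "(", "+", "(", "*", "4", "7", ")", "2", ")", ")"]
print(parse(example))
# ['push', '4', 'push', '7', '*', 'push', '2', '+', 'say']
```

Like the Scratch script, this sketch does not guard against a stray ")" or a program ending in "("; in Python both cases raise an `IndexError`.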

Evaluator

The last step is evaluating (also known as interpreting) the parsed instructions. Without this step, all we would have is a list of instructions, which is pretty useless on its own.

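define evaluate
set [i v] to (1)
delete all of [stack v]
repeat until <(i) > (length of [instructions v])>
    run instruction (item (i) of [instructions v])
end

define run instruction (operator)
if <(operator) = [push]> then
    add (item ((i) + (1)) of [instructions v]) to [stack v]
    change [i v] by (2)
    stop [this script v]
end
if <(operator) = [say]> then
    say (item (length of [stack v]) of [stack v]) for (2) seconds
    change [i v] by (1)
    stop [this script v]
end
if <(operator) = [+]> then
    replace item ((length of [stack v]) - (1)) of [stack v] with ((item ((length of [stack v]) - (1)) of [stack v]) + (item (length of [stack v]) of [stack v]))
    delete (length of [stack v]) of [stack v]
    change [i v] by (1)
    stop [this script v]
end
if <(operator) = [-]> then
    replace item ((length of [stack v]) - (1)) of [stack v] with ((item ((length of [stack v]) - (1)) of [stack v]) - (item (length of [stack v]) of [stack v]))
    delete (length of [stack v]) of [stack v]
    change [i v] by (1)
    stop [this script v]
end
if <(operator) = [*]> then
    replace item ((length of [stack v]) - (1)) of [stack v] with ((item ((length of [stack v]) - (1)) of [stack v]) * (item (length of [stack v]) of [stack v]))
    delete (length of [stack v]) of [stack v]
    change [i v] by (1)
    stop [this script v]
end
if <(operator) = [/]> then
    replace item ((length of [stack v]) - (1)) of [stack v] with ((item ((length of [stack v]) - (1)) of [stack v]) / (item (length of [stack v]) of [stack v]))
    delete (length of [stack v]) of [stack v]
    change [i v] by (1)
    stop [this script v]
end
... // You can add your own instructions to make the language even better!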
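The evaluator is a small stack machine, which may be easier to follow in a compact Python sketch. Several details here are assumptions made for illustration: operands are converted with `float`, the `output` list stands in for the sprite's speech bubble, and unknown instructions raise a `ValueError` (the Scratch script leaves that case open).

```python
import operator

# Each arithmetic instruction maps to a two-argument Python function.
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(instructions: list[str]) -> list[str]:
    """Run the instruction list and return everything that was 'said'."""
    stack: list[float] = []
    output: list[str] = []  # stands in for the sprite's speech bubble
    i = 0  # index of the next instruction (0-based, unlike Scratch lists)
    while i < len(instructions):
        op = instructions[i]
        if op == "push":
            # "push" is followed by its operand, so it occupies two slots.
            stack.append(float(instructions[i + 1]))
            i += 2
        elif op == "say":
            # Report the top of the stack without removing it.
            output.append(format(stack[-1], "g"))
            i += 1
        elif op in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
            i += 1
        else:
            raise ValueError(f"Unknown instruction: {op}")
    return output


print(evaluate(["push", "4", "push", "7", "*", "push", "2", "+", "say"]))
# ['30']
```

Popping two values and pushing the result has the same effect as the Scratch blocks, which overwrite the second-to-last item and then delete the last one.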

Putting It All Together

when green flag clicked
tokenize [(say (+ (* 4 7) 2))]
parse
evaluate

When you run the program, the sprite should say "30".

Final Thoughts

While the language works, it could be greatly improved. Here are some things you could do to improve it:

  • More functions, like ask and say-for-seconds.
  • Better error handling. Currently, the language doesn't care if you do something wrong, like (say 1 2 3).
  • Control flow functions, like if, while, and for. To implement these you would need to create some sort of jumping system to move around the program.
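One possible shape for such a jumping system is sketched below in Python. The instruction names `jump` and `jump-if-zero` are hypothetical and not part of the tutorial's language; the idea is simply that a jump overwrites the instruction index `i` instead of advancing it. In this sketch `jump-if-zero` inspects the top of the stack without removing it, which is one design choice among several.

```python
def run(instructions: list[str]) -> list[str]:
    """Evaluate instructions, including two hypothetical jump instructions."""
    stack: list[int] = []
    said: list[str] = []
    i = 0  # program counter: index of the next instruction to run
    while i < len(instructions):
        op = instructions[i]
        if op == "push":
            stack.append(int(instructions[i + 1]))
            i += 2
        elif op == "say":
            said.append(str(stack[-1]))
            i += 1
        elif op == "-":
            right = stack.pop()
            stack[-1] -= right
            i += 1
        elif op == "jump":
            # Unconditional jump: continue at the given index.
            i = int(instructions[i + 1])
        elif op == "jump-if-zero":
            # Conditional jump: taken only when the top of the stack is 0.
            i = int(instructions[i + 1]) if stack[-1] == 0 else i + 2
        else:
            raise ValueError(f"Unknown instruction: {op}")
    return said


# Count down from 3: say the counter, subtract 1, stop once it reaches 0.
COUNTDOWN = [
    "push", "3",           # index 0: counter = 3
    "say",                 # index 2: loop start
    "push", "1",           # index 3
    "-",                   # index 5: counter = counter - 1
    "jump-if-zero", "10",  # index 6: leave the loop (10 is past the end)
    "jump", "2",           # index 8: back to the loop start
]
print(run(COUNTDOWN))
# ['3', '2', '1']
```

A parser for `if` or `while` would then have to emit these jumps with the right target indices, which is the harder part and is left out of this sketch.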

A project with all of the features above can be viewed here.