Creating a Programming Language

Материал из Поле цифровой дидактики

This tutorial teaches how to create a simple programming language that uses s-expressions for its instructions. Here is an example snippet of the language:

(say (+ (* 4 7) 2))

This project contains the completed code if you don't want to go through the whole tutorial.

Prerequisites

You will need to create these Variables.

  • (i)
  • (accumulator)
  • (token)

Also create these Lists.

  • (tokens :: list)
  • (instructions :: list)
  • (operator stack :: list)
  • (stack :: list)

Tokenizer

The tokenizer (also known as a lexer) converts a program into a list of tokens, or bits of the original program like words, numbers, and parentheses. Шаблон:Note

define tokenize (program)
set [i v] to [1]
set [accumulator v] to []
delete all of [tokens v]
repeat until <(i) > (length of (program))>
  set [token v] to (letter (i) of (program))
  if <[() ] contains (token)> then // Make sure to include a space after "()" or it won't work correctly
    if <not <(accumulator) = []>> then
	  add (accumulator) to [tokens v]
	  set [accumulator v] to []
	end
	if <[()] contains (token)> then
	  add (token) to [tokens v]
	end
  else
    set [accumulator v] to (join (accumulator) (token))
  end
  change [i v] by [1]
end
if <not <(accumulator) = []>> then
  add (accumulator) to [tokens v]
end

Parser

The parser takes in the tokens and builds a parse tree, or in the case of this language, a simple form of assembly.

define parse
delete all of [instructions v]
delete all of [operator stack v]
repeat until <(length of [tokens v]) = [0]>
  set [token v] to (item [1] of [tokens v])
  delete [1] of [tokens v]
  if <(token) = [(]> then
    add (item [1] of [tokens v]) to [operator stack v]
	delete [1] of [tokens v]
  else
    if <(token) = [)]> then
	  add (item (length of [operator stack v]) of [operator stack v]) to [instructions v]
	  delete (length of [operator stack v]) of [operator stack v]
	else
	  add [push] to [instructions v]
	  add (token) to [instructions v]
	end
  end
end
if <(length of [operator stack v]) > [0]> then
  say [Unclosed parenthesis] for [2] seconds
  stop [all v]
end

Evaluator

The last step is evaluating (also known as interpreting) the parsed instructions. Without this step, all we would have is a list of instructions which is pretty useless on its own.

define evaluate
set [i v] to [1]
delete all of [stack v]
repeat until <(i) > (length of [instructions v])>
  run instruction (item (i) of [instructions v])
end

define run instruction (operator)
if <(operator) = [push]> then
  add (item ((i) + [1]) of [instructions v]) to [stack v]
  change [i v] by [2]
  stop [this script v]
end
if <(operator) = [say]> then
  say (item (length of [stack v]) of [stack v]) for [2] seconds
  change [i v] by [1]
  stop [this script v]
end
if <(operator) = [+]> then
  replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) + (item (length of [stack v]) of [stack v]))
  delete (length of [stack v]) of [stack v]
  change [i v] by [1]
  stop [this script v]
end
if <(operator) = [-]> then
  replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) - (item (length of [stack v]) of [stack v]))
  delete (length of [stack v]) of [stack v]
  change [i v] by [1]
  stop [this script v]
end
if <(operator) = [*]> then
  replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) * (item (length of [stack v]) of [stack v]))
  delete (length of [stack v]) of [stack v]
  change [i v] by [1]
  stop [this script v]
end
if <(operator) = [/]> then
  replace item ((length of [stack v]) - [1]) of [stack v] with ((item ((length of [stack v]) - [1]) of [stack v]) / (item (length of [stack v]) of [stack v]))
  delete (length of [stack v]) of [stack v]
  change [i v] by [1]
  stop [this script v]
end
... // You can add your own instructions to make the language even better!

Putting It All Together

when gf clicked
tokenize [(say (+ (* 4 7) 2))] :: custom
parse :: custom
evaluate :: custom

When you run the program, the sprite should say "30".

Final Thoughts

While the language works, it could be greatly improved. Here are some things you could do to improve it:

  • More functions, like ask and say-for-seconds.
  • Better error handling. Currently, the language doesn't care if you do something wrong, like (say 1 2 3).
  • Control flow functions, like if, while, and for. To implement these you would need to create some sort of jumping system to move around the program.

A project with all of the features above can be viewed here.