Introduction

This book serves as a short introduction and reference to the Brain Aneurysm programming language v1.0, which I will refer to as basm throughout the book. This book will also touch on the companion cli tool of the same name. The basm transpiler is fully open-sourced at https://github.com/AtmolanderMimi/basm, if you encounter any issues while using it please report it there so that I can fix the problem!

Basm is a very simple, assembly-like esoteric programming language made to transpile into Brainfuck (aka bf for short). My goal in creating basm was to prove (although it had already been done before) that bf can be used to write any program a user can think of. Whilst bf is Turing-complete, it is rare to see that Turing-completeness actually exploited to write generic, everyday programs understandably so. Hopefully, by providing tooling to abstract over writing bf, it can become feasible to write anything in it. Making us finally free from the cage of that is this language with a funny name.

By aiming to be transpiled into bf, basm shares many similarities with bf. Certain pieces of logic, like instructions, are transpiled pretty much 1:1 into bf. We even allow writing raw bf in a basm source file. Basm does separate itself from bf where it matters however. I've summed up the pain points I've aimed to solve when writing bf programs:

Relative nature of memory
Lack of organization or code-reuse features
Difficulties with text handling

Relative Nature of Memory

In bf, you can't just say: add cell no.3 to cell no.2. You need to actually manage the tape pointer and most likely do mental arithmetic to calculate how many > and < you'll need to get to the cells you want. So, you'd better not mess up your bf tape pointer or you might end up with a silent bug killing both your program and any hope of creating anything productive with bf that may have inhabited you before. You may even end up in a situation where you don't know exactly where you are, because you conditionally moved the tape pointer. This can happen in a loop like this for example: [>].

You could say that bf dynamically addresses its memory by default. It says: get the cell -1 from me. Basm tries to solve this by having a static addressing system. This means that, in a basm source file, you can safely refer to the 3rd cell and it will always be the third cell no matter the operations you performed before. That's possible thanks to the tape pointer being automatically managed by the compiler. Also, because of the abstraction over raw bf operations, basm makes it impossible to reach a "relative state" without you wanting to. You cannot accidentally write [>]! You must implicitly use a RAW or ASUM operation with the intent of entering a relative state to enter a relative state. Reaching a relative state with these two operation is valid as it is needed to write things like dynamic array getting/setting. (Implementing these operations is really interesting and involves a lot of language features that I describe throghly in the Writing Relative Code chapter)

In brief, relative is when you don't know where the cell pointer is in the program. As a rule of thumb when programming in basm: relative bad (mostly), static good (mostly).

Lack of Organisation or Code-reuse Features

Bf has virtually nothing to help developers, maybe that's why it got so popular, maybe people like torture... Welp, I don't! So, I filled basm with plenty of ways to help you abstract over bf.

Firstly, in basm, the base building block of our program are not operators, but instead instructions. This gives the language a bigger indivisible piece of logic compared to bf. INCR 0 16; is much easier to conceptualize than ++++++++++++++++, don't you think?

Secondly, we have aliases with the ALIS operation, which allow you to give name to values, addresses or blocks of code. In this household (programming language), we call blocks of code "scopes".

Lastly, there are "meta-instructions" which are kinda similar to functions, but in implementation are moreso similar to macros. These meta-instructions, once defined can be used like any in-built instruction in the program. Since bf doesn't have a jump operator and data is separate from code, it's practically impossible to create functions. So, meta-instructions, as the name implies, are simply instructions made up of other lower-level instructions. Hence why they are more similar to macros than functions, they are all inlined when used.

Difficulties With Text Handleing

Bf works with text like any other programming language would, it stores one character by cell in Unicode format. The problem with that is that bf is not like any other programming language in pretty much all regards! First and foremost, there is no way to set a cell to a specific value directly. If you want the character 'c', then hold on tight, because you will need to type + 99 times! Basm implements both a character literal and a string literal, thus with basm you only need to type INCR 0 'c'; to set cell no.0 to the value of 'c'. There are also in-built operations for loading and printing strings, which is a normally a very tedious task in bf.

Quick Note on Code Formating Throughout This Book

This book will contain and promote how I prefer to write code in basm. Almost nothing in the language forces a specific formatting standard. For example, to save columns, I like to omit putting a level of tab in the upmost scope.

Like so:

[main] [
// my code here
INCR 0 12;
WHNE 0 0 [
    // but i raise for scopes after
    DECR 0 1;
];
]

There is nothing in the language forcing you to omit whitespaces, in fact there is nothing in the language dictating how you should use whitespaces (except for separating identifiers). So, keep in mind that whilst I like to follow certain naming and formatting standards, the flexible nature of basm allows you to write it however your heart desires!

Now that you know whether or not basm is for you, let's hop in!

Disclamer

Basic knowledge of bf is required to understand this book. That is, you need to know the operators do and little else. If you read up to here I'm guessing, you're probably fine on that.

Both the language and transpiler are my first foray into language development and are serving as my first real major project in Rust. There have been better implementations of bf transpilable languages which are both more efficient are easier to use. This project is only for my own personal pleasure as a creator and is not aimed at creating efficient bf programs. If you are seeking usable and well informed implementations of bf transpilers look here.

Installing the Cli

First, before getting into programming in basm, you'll need to install the basm transpiler. You can find compiled binaries of the basm cli tool in the releases section at https://github.com/AtmolanderMimi/basm or you can choose to compile the Rust code straight from source with cargo. (Don't be scared, despite being built 100% Rust, learning basm does not require any Rust knowledge)

Once installed you can use the basm tool through the terminal. It comes with a basm to bf transpiler, simple bf code optimizer and a bf interpreter. To transpile and run any basm file use the run subcommand like below:

basm run my-file.basm

For a more comprehensive list of capabilities, you can use the help subcommand or read the chapter on the basm cli tool.

Our First Program

Like any good programming language, you must start from the basics. The basics in this case being a "Hello, world!" program. Here is how you write one in basm:

[main] [
PSTR 0 "Hello, world!";
]

This source code file, once transpiled through the basm cli would compile down to this is bf:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
+++++++++++++++++++++++++++++.+++++++..
+++.[-]++++++++++++++++++++++++++++++++++++++++++++.------------.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
--------.+++.------.--------.
[-]+++++++++++++++++++++++++++++++++.[-]

This program uses cell 0 as a buffer to increment and output each character in the string "Hello, world!" consecutively. Quite simple isn't it? (compared to the bf at least)

Let's start dissecting what it all means. First, we begin at the top with the [main] decorator which indicates that the following scope will be a main field. There can only be one main field per program, and by the name you have probably already guessed what it is. It's the entry point of our program! Any instruction put inside the scope of the [main] field will be transpiled and then executed at runtime. In this example, the main field only held one instruction, which is PSTR 0 "Hello, world!".

Instructions Statement

"Ok, but what's an instruction in basm?", I hear you asking. An instruction is the basic building block of basm programs. They represent an action and are the smallest unit of logic in the programming language. You can think of them like functions in other programming languages: They take a set number of typed arguments and operate using these arguments. In this program, the operation was "print a string" using the cell at index 0. Here is the syntax for any instruction statement:

INSTRUCTION_NAME arg1 arg2 .. ;

INSTRUCTION_NAME is the name of the instruction. An instruction's name is not required to be fully capitalized, although it is the standard that I am going to use. arg1 and arg2 are the arguments to the instruction. Arguments are separated by whitespaces (don't put commas!). To end an instruction we need to put a semicolon, and we are done here. In the case of the PSTR instruction above:

PSTR was the name (meaning "print string"),
0 was the first argument and
"Hello, world!" was the second argument.

Scopes

As said prior, a [main] field is made up of a [main] decorator and a scope. So what is a scope? A scope is a collection of zero or more instructions or scopes. Scopes are denoted by matching square brackets which encapsulate their contents. They can be used to make fields and can serve as arguments to instructions, such as conditionals and loops. Scopes also serve as lifetimes for aliases. Meaning any alias defined within themselves are invalid outside of themselves, basically serving to limit the lifetime of aliases. (we will see this later in the Aliases chapter)

The scope syntax is very simple, just put square brackets around stuff! This would be a valid scope:

[]  // totally empty
    // (comments can be made by writing "//" which will make the rest of a line a comment)

.. and so would this third argument in this WHNE instruction:

WHNE 0 42 [
    INCR 0 1;
];

Like other languages, it is totally valid to write a scope within another scope as a statement. Doing this, inlines the contents of the scope, practically like you would have not written the scope contents in a scope. With what you know now, scope statements are not useful, but you will learn that they can be convenient for making local, temporary aliases and sublogic in parts of code. Note that scope aliases, which you will also see later in the chapter on aliases, can't be inlined like this.

Here is an example of using a scope as an expression: (only instruction need semicolons to terminate their statement, scope statements don't need it)

[main] [
// doing stuff
INCR 0 42;

// oof, the main scope is getting quite crowded let's get out
[
    OUT 0;
]
]

Scope statements (for inlining) and scope expressions (for instruction arguments) are written exactly the same. Their expression-ness is dependent on context. When in a scope, a scope is parsed as a statement. Whereas, when in an instruction argument (which is most of the time), a scope is an expression.

Note

From now on, all code will be implicitly contained in [main] unless otherwise noted for ease of reading (and writing ;)).

In-built Instructions

Any basm source file start with some predefined instructions called called in-built. These instructions are valid wherever and whenever. Basm has very reduced set of in-built instructions for simplicity. This means that any instruction that could be easily reproduced by combining two or more instructions was not included (with some few exceptions). For example, there is no multiplication instruction built in. I said the instruction set was small and, in fact, the set of built-in instructions is so reduced that I can show the 16 of them all here:

Arithmetic

Name	Arguments	Function
ZERO	addr	sets the value of`addr` to 0
INCR	addr, value	increments the value of the`addr` cell by `value`
DECR	addr, value	decrements the value of the`addr` cell by `value`
ADDP	addr1, addr2	adds`addr2` to `addr1`, the result is stored in `addr1` (in place)
SUBP	addr1, addr2	substract`addr2` from `addr1`, the result is stored in `addr1` (in place)
COPY	addr1, addr2, addr3	copies the value of`addr1` into `addr2` and `addr3`

Control Flow / Loop

Name	Arguments	Function
WHNE	addr, value, [scope]	while the value of`addr` cell is not equal to `value` runs the `[scope]`. `addr` is not consumed

I/O

Name	Arguments	Function
IN	addr	takes input from the user and sets it in`addr`, behaviour will vary between bf implementations
OUT	addr	outputs the value of`addr`, `addr` is not consumed
LSTR	start_addr, "str"	loads the string character by character into cells from the`start_addr` advancing forward
PSTR	addr, "str"	prints the string character by character using the cell`addr` as a buffer

Language / Compilation

Name	Arguments	Function
ALIS	ident, value or [scope]	creates an alias to a value or scope named`ident`. This instruction is purely abstraction
INLN	[scope]	inlines a scope
RAW	"str"	includes the string after transpilation, this can be used to include brainfuck operators
BBOX	addr	moves the tape pointer to`addr`
ASUM	addr	tells to compiler to assume that the tape pointer is at`addr`. If that assumption is wrong all cells accesses will be offset

You can find this table of instructions in the reference section of this book.

Types

Seeing this chart you might be a little confused due to how arguments in the Arguments column are formatted. If confusion lasts for more than eight hours contact a doctor (or continue reading).

Of course, there is a reason why arguments are denoted differently, it's because they don't have the same expected type! Each value in basm source code is typed and totally static, meaning that values within the source code can't get mutated at runtime. Only the value of cells in the tape can be mutated during execution. (basm is like one big set of macros, it would be for impossible for values to be mutable at runtime)

Basm has 3 types for its source code values being:

number denoted by nothing
scope denoted by being surrounded with [..]
string denoted by being surrounded with ".."

Instructions statements will only take arguments with types corresponding to the instruction's arguments' types. You cannot pass in a string where a number is expected, types will not coerce. Instructions in the built-in set have arguments which are ordered by their type and meaning. The way I like to order them follows this little diagram:

address numbers -> pure numbers -> strings -> scopes -> stack pointer number

Number Type

Numbers in basm are the most common type for literals and are probably going to be the type you use most as they are very versatile.

As you may have seen in the table above, numerics usually have one of two purposes. Either they serve as an address to a cell or they serve as a pure number. For example, INCR takes in the arguments addr and value which are both numeric, but their values hold two very different meanings. The value of addr (address) denotes the address of the cell to be operated on. Whereas the value of the passed in value (pure number) denotes by how much the cell at addr should be increased. Be sure to not mix numerics representing address and and numerics representing (pure) numbers as that is a surefire way to create bugs!

There is three ways to write a number literal:

Via a positive number, ex: 0, 42, 732
Via a character literal, ex: 'b', 'F'
Or by combining two number literals into an expression, ex: 3+'a'

Character literals' values are mapped to their value in Unicode. Expression are interesting as they allow modularity to arguments. If I have an array starting at index 3 and I want to zero the first 3 cells in that array, I could elegantly write it like so:

ZERO 3+0;
ZERO 3+1;
ZERO 3+2;

Another interesting property about expressions is that they themselves are number literals, meaning that we can chain them together like to.

// reminder that values in sources files are computed at compile-time
// expressions don't resolve at runtime!
//     75  - 32  - 1 = 42
INCR 0 'K' - ' ' - 1;
OUT 0;

The way expressions are built is that they take a "base" value and then a "modifier". It merges the two values (the base and modifier value) into one expression and repeats. Currently, modifiers have four types which are implemented.

Addition: +,
Substraction: -,
Multiplicaton: *,
Integer division (towards 0): /

Basm uses left to right priority, totally forgoing PEMDAS and any parentheses to denote priority. Here would be the implicit priority of that last example:

((75) - 32) - 1 = 42

A little note on integer division modifiers, basm doesn't do decimal numbers. This means that division are kinda weird as they can't result a decimal number. So, all divisions made are truncated (practically, the decimal digits are removed) towards 0. As a result, 5/3, which is in normal maths equal to 1.6666.., is equal to 1 in basm. This also introduces a loss of precision. 10/3*3 is not in fact 10, but 9. Since we execute from left to right we first compute 10/3 which is 3.3333.. That gets rounded to 3 and then we execute the multiplication modifier on our rounded 3, so we get 9!

The Difference Between Address Numbers and Pure Numbers

You may have noticed that I separate numbers into two "sub-types" based on their meaning. There are "address numbers" which are numeric values which represent addresses to cells and there are "pure numbers" which are numeric values which represent numeric values used for operations. Nothing stops you from using address numbers where pure numbers belong. This is also true the other way around.

This distinction is going to be important once we start naming these numeric values with aliases. You will see, I prefix address number aliases with A for "address" and pure value aliases with V for pure number "value".

In the built-in's' tables above, arguments named addr or addr_nth are address numbers, whereas arguments named value are pure numbers.

Scopes

Scopes, as I described in the last chapter, are collections of instructions and other scopes. You can pass them into instructions as arguments by simply writing them in their literal form (which means the same thing as expression here). Here's a reminder of what a scope literal looks like:

[
    // ...content here...
]

String

Strings as types are very limited in usage, they cannot be aliased and you cannot make a meta-instruction with a string arugment. This means, basically, that the string type is only used by in-built instructions (i.e: PSTR, LSTR and RAW). Whilst writing code, it is very rare to actually end up working with string anymore than loading them into memory or printing them out. Because of this, string support is currently fairly lacking. You cannot operate on source code strings in any meaningful way in basm 1.0. The only power you hold is to summon them through their literal form, which like many programming languages is simply text between two quotes ("). For example:

RAW "this will be included in the transpiled file!
";

Noticed how I needed to actually include a newline character in the source file for the string to contain it? This is because basm does not contain escapable characters like \n as of now. It also means that you can't write a string containing " (or a character literal with '). That's because the lexer, in the state it currently is, would freak out if I ever tried to implement escaped characters. (I really need to rewrite the lexer)

Assumptions Made by Instructions

All instructions both the ones you make and the ones which are built-in will hold some sort of assumptions.

In basm, instructions "consume" the values of the source cells unless otherwise noted. "consuming", in this context, means that the cell is set to 0. For example, when we add two numbers together using ADDP, the second cell, which is not an output, will inadvertently be zeroed due to the operation we made on it. Cells which hold the result of operations are of course not zeroed.

Apart from that, instructions also assume that address arguments don't have the same value (aka "alias", but that word has another meaning in basm). An instruction statement passed in with two of the same addresses have unspecified behaviour. Most often, this ends up locking the program in an endless loop, although it can also cause other undesirable behaviour.

Lastly, notably, instructions assume that the values of the cells at output addresses are 0. If that condition is not met, an instruction which "sets a cell to a value" would rather "increment a cell by that value".

Notable Instructions

Now that you know a bit more about what constitutes an instruction, let's look at some notable instructions which may not be immediately intuitive just by reading their description.

WHNE (While Not Equal)

Arguments

Name	Type	Description
addr	number	address of the cell being compared against the`value`
value	number	value that the cell at`addr` must reach to stop the loop
scope	scope	code to be executed in loop while the condition is not resolved

This instruction is vital, not only for writing good software in basm, but for writing any software that can be considered Turing-complete in general. WHNE is the only source of conditional execution and looping included within the whole pool of built-in instructions (excluding RAW, because it's weird). Thus, all sources of conditionality need to be derive from this. It is a sad day to be alive, but bf has no other primitive form of conditional than a while not equal to zero with the [ and ] operators. This instruction translates directly to a matching pair of [and] alongside some + and - so that we aren't restricted to equality with zero. As the name implies the behaviour is to:

Check if cell at addr is equal to value. If it is, exit
Execute the code in scope
Return to step 1

Unlike most other operations WHNE does not consume the value of the cell it checks, meaning you don't have to reinitialize the cell after each check.

Example: The fact that the cell is not consumed makes it easy to create something like a counter

// outputs values from 0-99
WHNE 0 100 [
    OUT 0;
    INCR 0 1;
];

COPY

Arguments

Name	Type	Description
addr1	number	address the source cell to be copied to`addr2` and `addr3`, the cell at `addr1` is consumed
addr2	number	address of a cell receiving a copy of the cell`addr1`
addr3	number	address of a cell receiving a copy of the cell`addr1`

COPY is a weird one which is needed because of the weird nature of the weird language which is bf (weirdly). There is no way to add two numbers together without one disappearing into the void, never to be seen again. This is why we require COPY. If we want operate on a value and still have a copy of it afterwards we will need to have the value twice. Once to use for the operation and the other to keep it alive. COPY was made for this, it takes in one cell and sets the value of two other cells to the value of the source addr1 cell, consuming it in the process. This behaviour is essential for building almost any program.

Example: a program outputting the double of the last number forever, or until the cell value limit is reached:

INCR 0 1;
WHNE 0 0 [
    OUT 0;
    COPY 0 2 3;
  
    ADDP 0 2;
    ADDP 0 3;
];

Now, most basm professionals don't want you to know this... but it's acceptable for the two output addresses of COPY to be the same. Doing so will copy the value of the cell twice in the same place, practically doubling it. Although I know it is safe in this circumstance to have two address arguments with the same value due to the fact that I kinda made the language, this is not a safe bet to take on ANY OTHER instruction! (pretty cool though)

ADDP (Add in Place)

Arguments

Name	Type	Description
addr1	number	cell to be added to`addr2`, output is stored here
addr2	number	cell to be added to`addr1`

You might be surprised to see ADDP in this list, after all it's an addition operation, how complicated can it be? Welp, not all that much, but it still does have a property which is not reflected by its name. That being the ability to act as a move instruction.

As long as the value at addr1 is 0, calling ADDP will simply move the contents of the cell at addr2 to addr1. This is because x + 0 = x, or in layman's terms: adding something to nothing will just return the initial something. And, since we happen to return that "initial something" to somewhere different from where we took it from, we effectively moved that "something". So, we can giveADDPa little pet name likeMOVE as it really serves as so sometimes.

Example: I want to copy the value of cell 0 to cell 1 without consuming it

INCR 0 42;

// reminder that we can't copy to 0 directly as that would make two pointers alias,
// which in turn would cause an infinite loop
// (no, the doubling trick with COPY was not about that)
COPY 0 1 2;

// we "move" 2 -> 0
ADDP 0 2;

RAW

Arguments

Name	Type	Description
str	string	a string to be included in the transpiled code at the appropriate location

RAW is a bit of a special case in that it can do the job of all the other instructions. If I truly wanted to reduce the instruction set to the maximum there would be one instruction and it would be this.

Now, including a string in the compiled output does not seem like it would be able to do much at first. Like, ok, we can put comments in the output? But that's not the main attraction of RAW. The real reason it is of interest is because you can smuggle ANY string into the compiled file. This means that it is possible to include operators like +, -, >, <, [, ], , and . allowing us to insert raw bf code directly in our basm source file. With this, we can harvest the full power (it being rather small still) of bf. This comes with one downside, tape pointer movement, which is usually completely handled by the compiler, and unmatched brackets are not checked within strings passed to RAW. An unmatched bracket will probably just match with a totally unrelated one, messing up jumps made by WHNE and messing with tape pointer position unintentionally is BAD. I won't go into much details why it is, but simply put, the compiler needs to know were the pointer is at all times (unless you plan for it, we'll see that in the relative-code chapter). Just writing RAW ">"; is enough to completely mess up your program by effectively offsetting all the cells by 1.

With RAW "">";, we can write dynamic code to our heart content without being limited by basm's static memory addresssing. The art of writing basm a program capable of harnessing the power of dynamic memory addresssing is one hard to come by these days, but in time (reading the chapter on relative code) you shall learn how to master it.

Example: Why would I want to use basm, when bf worked just fine for a "hello, world!"?

// be careful to not include bf operators in your text!
RAW "my hello world program:
";
RAW "++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.";

Which will give us exactly this output, as expected:

my hello world program:
++++++++[>++++[->++>+++>+++>+<<<<]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>->.+<.<.+++.------.--------.>>.>++.

ALIS (Alias)

Oh, wait... we have the whole next chapter just for that one!

Aliases

Aliases are by far the most common abstraction feature you are going to use while writing basm code, no questions asked. They allow to alias number values or scopes to identifiers to representing the value. You can think of aliases like variables in other programming languages, except they are purely immutable constants. Once an alias is defined, it can be used to substitute having the write the literals forms of values anywhere the alias is valid. Defining an alias is done via the ALIS "instruction" (ALIS can hardly be called an instruction as it does not compile into anything).

For example, if I have a cell located at 42 containing the index of an array, rather than having to memorize its location, I could simply create an alias to the address like Aindexwith the instructionALIS Aindex 42;.

The `ALIS` Instruction

At the exception of meta-instructions, the ALIS instruction is the only way to create aliases. Since I baited you in the last chapter, below is its nicely formatted specification!

Arguments

Name	Type	Description
alias_name	identifier	the name of the alias
value	numeric/scope	the value to be bound to the alias

(don't worry about the "identifier" type, it's not actually a type. Just that ALIS is a special boy and can take names as argument)

ALIS allows creating aliases in the current scope by binding numeric or scope expressions to an identifier. As an instruction, ALIS does not compile to anything, it is purely abstraction.

To define numeric aliases and then use the alias in place of any numeric value do as so:

ALIS Aindex 1;         // address of the index cell
ALIS Vshrob 2;         // some random value
ALIS Vvalue 40+Vshrob; // value of the index

ALIS Vshrob 33;        // overwriting the last Vshrob alias

INCR Aindex Vvalue;

Aliases are purely abstraction for writing the value of the bound expressions where the alias is. So, to help visualize, here is what the above program looks like after the aliases have been normalized out:

INCR 1 42;

Notice how Vvalue is replaced by the value of the expression passed into ALIS when the alias is created. This is why Vvalue got replaced by 42 (40+2) rather than 73 (40+33). The alias was bound to the value of the expression rat her than the expression itself, if it were bound to the expression changes to Vshrob would change Vvalue.

Alias Validity

Aliases are valid for the duration of the scope they are defined in. This means that an alias defined in a higher scope cannot reach a lower one. That alias behaviour is the main reason why you would want to insert a scope within another, to have operation specific aliases. Here is an example of alias validity over the program:

ALIS Vsome 1; 

// Vsome: 1
[
    // Vsome: 1
    ALIS Vsome 2;
    // Vsome: 2
]
// Vsome: 1

Whilst we are talking about this, might as well mention that, as in the example above, aliases can shadow each other and that not only through scope boundaries! They can also refer to the value they will shadow when defined. Here's an example of a fairly common pattern using aliases:

ALIS Vnum_cats 5;
// ... some operations ...
ALIS Vnum_cats Vnum_cats+3;

That example firsts set Vnum_cats to 5 then does all the operations between the two ALIS with 5 as Vnum_cats. Lastly, the Vnum_cats alias is shadowed by the second ALIS which sets the alias to the value of Vnum_cats+5. The compiler normalizes the expression and gets the result of 8 (5+3) which it sets to be the new Vnum_cats. Using this, we can reach the what would be expected of variables in other programming languages, but don't get it twisted, aliases are not variables. They are replaced by their value at compile time and are totally gone at runtime.

So, don't expect something like this to increment forever:

// THIS DOES NOT WORK!

ALIS Vincrement 0;
WHNE 0 1 [
    ALIS Vincrement Vincrement+1;
];

That example would completely fail as the alias in the looped scope doesn't persist over loops. It will get thrown out at the end of its scope like any other alias, as thus the value of Vincrement will always be 1 (0+1) in the looped scope.

Alias Types

There are currently two types which are supported by aliases: number and scope. Yes, this means you can't alias a string, sadly. So far, we have only made aliases of numeric values as these are the most common, but we can also create scope aliases. Aliases with different types won't overwrite each other, even if they have the same name. They are treated as completely different. This means you can have both a numeric alias and scope alias named my_alias at the same time with no issue. Basm will automatically use the alias with the fitting type when my_alias is used.

You create aliases of different types by matching identifiers to different expressions. If you give a numeric value to ALIS you will create a numeric alias, on the other hand if you give it a scope you will get a scope alias.

Scope Aliases

Now, scope aliases are interesting to talk about as they allow you to share and reuse code around your source file, unlike numeric aliases which are only about values. Right now, they are not really interesting, but as you'll learn about meta-instruction you will see one of most interesting use case, as a callback-like thingamabob.

Since there hasn't been an example of alias scopes yet, here is one:

// defining "my_scope" scope alias
ALIS my_scope [
	INCR 0 1;
	OUT 0;
];

// using my_scope with the scope identifier syntax
WHNE 0 128 [my_scope];

You have probably noticed a difference to how we use numeric aliases, being that we need to use a "scope identifier" to specify our scope alias. A scope identifier is simply an identifier (aka the name of the alias) wrapped by square brackets. This syntax tells the compiler to checks for the scope alias rather than the numeric alias (also it allows the person reading it to know that this is a scope alias being passed as an argument, which is cool).

If you don't use the scope identifier syntax you will get this error:

------------------ [ ERRORS ] ------------------
Error: from Ln 7, Col 12 in "/some/path/to/the/file.basm"
 → alias was not defined
[...] [main] [
ALIS my_scope [
        INCR 0 1;
        OUT 0;
];

WHNE 0 128 my_scope;
]
 [...]

(the second my_scope is highlighted and underscored in the terminal)

This means that the compiler tried to search for the numeric alias my_scope, which obviously doesn't exist, and failed.

There is one more thing I want to add to scope aliases and that's inlining them. Unlike scope literals you can't simply write the alias down to inline it! So, you will need to use an instruction I made specifically for that called INLN. It takes one scope argument and inlines it, as if you would have written a literal in the file at its place.

Here is an example displaying what I just talked about and an extra property that may not be obvious:

ALIS Vscale 7;
ALIS increment [
    INCR 0 Vscale;
];

// we repeat the scope three times
INLN [increment];
INLN [increment];
INLN [increment];

// then three more
ALIS Vscale 12;
INLN [increment];
INLN [increment];
INLN [increment];

OUT 0;

This example, when ran, outputs 42 (7*6) or the * character if your interpreter outputs as text. That may seem odd, after all we set Vscale to 12 halfway through which should give us 57 (7*3 + 12*3), right? Wrong! Whilst, you can include aliases within a scope alias, they will be immediately normalized (aka replaced) by their bound value. So, the increment alias actually looks more like this after being defined:

ALIS increment [
    // Vscale was replaced by its value
    INCR 0 7;
];

Calculating Fibonacci

Now that you've learnt the basic building blocks of the language, you are probably eager to do something useful with them, but like a kid with un-assorted Lego blocks it may not be as easy as it seems to build something coherent with them. This synthesis chapter is a walk-through for creating a program which calculates the nth number of the Fibonacci sequence. We will first start by creating the basic logic then add user input.

If you want to go ahead and try to build a fib program yourself before reading this explanation, I highly recommend it! Getting your hands dirty is the best way to learn and memorise quirks of the language (and any skill for that matter).

Building the Inputless Version

We are going to be implementing an iterative version of a Fibonacci calculator. (In opposition to the recursive one, because there is no recursion in basm)

If you have never built an iterative Fibonacci calculator in another language before, here is pseudo-code for how it typically looks like:

a = 0
b = 1

for i in [0; nth[:
    c = a + b
    a = b
    b = c

output a

What's nice here is that most of the features present in that pseudo-code example are present in basm. We can store values like a, b and c in cells and we of course have addition and output. There one pain point though, which is that unlike in the pseudo-code basm instructions consume their inputs. This means that we can't feasibly use b in c = a + b and a = b without having to copy it. That will unfortunately require writing a bit of copying logic boilerplate.

Here's the non-functional pseudo-code/basm hybrid with the pieces which we can easily replace:

// unlike in the example above, we need to set the value of the variables
ALIS Aa 0;              // a = 0 (all cells are 0 by default)
ALIS Ab 1;
INCR Ab 1;              // b = 1

// this for loop syntax is invalid in basm
for i in range [0; nth[:
    ALIS Ac 2;          // we define Ac here, because it is operation specific to the loop
    ALIS Atmp 3;

    COPY Ab Ac Atmp;    // c = b and tmp = b
    ADDP Ac Aa;         // c += a

    ADDP Aa Atmp;       // a = tmp (which is b)
    ADDP Ab Ac;         // b = c

OUT Aa;                 // output a

Notice how ADDP has the double purpose of both adding and moving values around? We can be sure that these moves are safe to do because the cells were consumed by prior instructions. An example of this is with Ab, we knew it was zeroed from the COPY instruction, so it was safe to use ADDP as a move to move Ac to Ab.

It is very important to keep track of which cell has been consumed when writing basm code. That, not only for being able to move cells to other cells without having to preemptively ZERO them, but also for what I like to call "scope hygeine". What it means is that a scope, once executed, will not leave non-zeroed cells. It is the duty of the scope to zero the cells it uses for operation. In the case of this example, the cells used for operation in the for loop scope are Ac and Atmp. Looking at the code we can see that they both get consumed before the end of the scope. Perfect! If you betray scope hygiene, you might end up trying to use a cell with an arbitrary value, whereas you expected all cells to be zero by default. By default, in basm, we assume all cells untouched by the current scope to be zero, this allows us to save operations having to ZERO all cells before using them.

Here is an example where not zeroing all cells before the end of the scope is problematic:

[
    // .. some operation ..
    ALIS Atmp 1;
    INCR Atmp 2;
    COPY Atmp 0 2;
]

ALIS Aval 0; // we expect *Aval == 0
INCR Aval '!';
OUT Aval; // we should get '!', but get '#'

(In this case the problem is very easy to spot, it is as the project size increases that bugs caused by these kinds of errors become harder and harder to spot)

For the last step in transforming our pseudo-code example into real basm code is to deal the for loop. There is no in-built for loop instruction, so we will have to manually handle our loop's exit condition. Normally other languages would do this for you, but being basm being low-level we will need to explicitly create and increase the index as well as check whether that index is out of the range.

ALIS Aa 0;
ALIS Ab 1;
INCR Ab 1;
// we add a cell to store the index
ALIS Ai 2;
// the index we want to reach (it's a constant for now)
ALIS Vdesired_index 11;

// this reads like: while (index != desired_index)
WHNE Ai Vdesired_index [
    // because of the extra cell used for the index we need to offset those by 1
    ALIS Ac 3; 
    ALIS Atmp 4;

    COPY Ab Ac Atmp;
    ADDP Ac Aa;

    ADDP Aa Atmp;
    ADDP Ab Ac;

    // increase the index for each iteration
    INCR Ai 1;
];

OUT Aa;

If you did everything well, now the program should output 89! (or Y, if it does output this you can use the -m flag on the basm interpreter to output cells as numbers) You can try to fiddle with the value of Vdesired_index to get different numbers from the Fibonacci sequence. Be weary though, the bf interpreter that comes with the basm cli defaults to using cells of unsigned 8 bit integers, meaning that numbers are limited from 0-255! To change the size of the cells, use -c 16 flag to get 16 bit unsigned cells.

Adding User Control

We will now be adding a very small amount of user input to our program: we'll let the user decide which number they want from the Fibonacci sequence. This may seem simple, and it is! Rather than increasing a cell until we reach the desired value, we can simply decrease the cell containing the desired value until we reach zero. It's a bit different from the last version thought since the desired value will be stored in a cell rather than be a constant.

ALIS Aa 0;
ALIS Ab 1;
INCR Ab 1;
// we set a cell to take user input (Value -> Address)
ALIS Adesired_index 2;
IN Adesired_index;

WHNE Adesired_index 0 [
    ALIS Ac 3;
    ALIS Atmp 4;

    COPY Ab Ac Atmp;
    ADDP Ac Aa;

    ADDP Aa Atmp;
    ADDP Ab Ac;

    // decrease the index for each iteration
    DECR Adesired_index 1;
];

OUT Aa;

Now when running the program, you will be prompted to input something which will be treated as the index. As you may have already guessed IN and OUT transpile directly to , and .. Make sure you use the -n flag while using the inbuilt basm interpreter to parse input as numbers rather than as characters. The prompt will repeat until you enter a valid number. That being until your input can parsed and contained by the cell. Input is special in bf as it is the only thing that can straight up overwrite cell data without caring about what it contained before. (it is also special in that its implementation varies a lot from interpreter to interpreter, but whatever)

Now you have a working interactive Fibonacci calculator! You check out the generated bf by printing it out via the -p flag and make my interpreter sweat by asking it anything above fib(30) (you will need the -c 32 flag for the result to be correct).

Meta-Instructions

Wow, only now are we onto meta-instructions! Meta-instruction are like functions in other programming languages, except that they aren't. They are actually just multiple instructions joined into one, thus the name meta-instruction. These composite instructions are expanded at compile time into their component built-in instructions. So, they are actually more similar to macros or always inlining functions.

The Meta Fields

To define a meta-instruction you need to use the second and only other field other than [main], which is the [@META arg1 arg2] field. A basm source file can have zero or more meta-instructions fields present in a file before the [main] one. This field allows you to create a meta instruction of the name META with the arguments following it, in the case of the example they would be two numeric arguments named arg1 and arg2.

Once defined a meta-instruction can be used anywhere a normal built-in instruction can, even in other meta-instruction bodies (excluding those who came before and itself of course).

Arguments

Meta-instructions can take zero or more arguments of numeric or scope type. To specify the arguments you want your meta-instruction to take in, you simply write they names (identifiers) in the field header:

[@META]: would take no arguments
[@META arg1 arg2]: would take two numeric arguments
[@META arg1 [scope]]: would take one numeric argument followed by a scope argument

Basm will automatically create aliases named after the arguments which are bound to the arguments passed in by the caller. This means that you can refer to arg1 like any other numeric alias in the scope of [@META arg1].

Naming

Meta-instructions cannot share the name of other instructions, must they be built-in's or user defined meta-instructions. Also, quick aside, while I prefer to keep the 4 capital letters naming convention for instruction names, names abide by the same restrictions as aliases and are thus not limited to 4 capital letters or less.

Examples

Meta-instructions are very practical to remove boilerplate. Since I decided to greatly reduce the amount of built-in instructions, you will probably need some to define some quality of life instructions. Thankfully, doing so is both easy on the meta-instruction definition end and in the usage end.

`SET`

If I want to set the value of cell by incrementing, but I know that it may be not consumed, then I would need to zero it before incrementing the cell, causing boilerplate. It would be much easier to read if both zeroing and incrementing the cell to a value would be the same instruction. Well, with meta-instructions you can implement that instruction yourself:

[@SET Acell Vval] [
ZERO Acell;
INCR Acell Vval;
]

[main] [
// instead of:
ZERO 0;
INCR 0 12;

// you can simply write:
SET 0 12;
]

At compile time, the SET instruction will be expanded and the aliases will be resolved, leading the main field looking something like this after meta-instruction inlining:

// instead of:
ZERO 0;
INCR 0 12;

// you can simply write:
ZERO 0;
INCR 0 12;

As you can see, using meta-instruction saves you time and brain!

`COPC` (Copy Conservatively)

It would have been useful in the Fibonacci program that we just wrote if we could have copied the value of Ab to Ac without having to explicitly use a temporary Atmp, right? We can made an instruction for that! This meta-instruction is called COPC for "copy conservatively". In this context, being conservative means that this instruction does not consume its inputs, or rather as we will see while implementing it, they are not consumed when the instruction ends.

[@COPC Asrc Adst sp] [
ALIS Atmp sp;
ALIS sp sp+1;

COPY Asrc Adst Atmp;
ADDP Asrc Atmp;
]

I am guessing you aren't too chocked for the most part by this implementation, we simply encapsulate the temporary shuffle in a meta-instruction, like we would encapsulate logic in any other language. Everything is normal to one exception, you are probably asking are yourself what sp is doing here and what it means.

We cannot simply use a static memory position for Atmp as we don't know which cells are free to be used when when COPC is called. Just using a static address for Atmp would probably just end up creating a situation where COPC and another operation are both using that static cell for something.

Instruction which need memory for their operation can include a spargument as the last one to ask for where it is safe to "allocate" cells for computation.spmeaning "stack pointer" it is an address pointing to the first free cell that the instruction can use for its operations. In the context of basm, allocating a cell is synonymous to "reserving for a cell for an operation". This is whyAtmpis defined fromsp. We know that cell including and after sp are free to use.

I personally like to always have an alias named sp, which increment as I allocate more cells, in all scopes. When I want to allocate a new cells, I create an alias for the address of all cells and then increase sp by each cell I've allocated. In the example above, this logic comes in. I create 1 alias for a temporary cell named Atmp and then increase sp by 1.

The sp alias also benefits greatly from alias lifetimes. When you increase the value of sp what you really do is create a new copy of sp that shadows the last one. This means that when you are done with a specific operation and the scope ends, sp will decrease naturally. This practically frees the cells that were allocated in the scope!

Using sp is especially useful in the [main] scope where you want to keep track of where you have not allocated yet. Allocating based on the stack pointer also allows you to seamlessly add a cell without having to offset all later cell pointers by one (like we had to do in the Fibonacci example).

Here is what Fibonacci would look like with COPC and the sp system:

[@COPC Asrc Adst sp] [
ALIS Atmp sp;
ALIS sp sp+1;

COPY Asrc Adst Atmp;
ADDP Asrc Atmp;
]

[main] [
ALIS Aa 0;
ALIS Ab 1;
INCR Ab 1;
ALIS Adesired_index 2;
IN Adesired_index;
ALIS sp 3; // set the sp to the next free cell

WHNE Adesired_index 0 [
    // "allocate" cells where the is free space
    // this allows us to easily add another cell before without having to touch this
    ALIS Ac sp;
    ALIS Atmp sp+1;
    ALIS sp sp+2; // update sp

    // we use sp here
    COPC Ab Ac sp;
    ADDP Ac Aa;

    ADDP Aa Ab;
    ADDP Ab Ac;

    DECR Adesired_index 1;
];
// aliases being bound by scope helps us here by giving us back
// our old sp from before the WHNE scope, basically freeing a part of the stack

OUT Aa;
]

TWIC (Twice)

There aren't many interesting examples of using scopes as arguments other than trying to make conditional execution, so I wanted to give an little simple example here before actually getting into conditionals, because those are a bit complicated. TWIC is a instruction that takes a scope and inlines it twice, causing it to execute twice.

[@TWIC [scope]] [
INLN [scope];
INLN [scope];
]

I mean, I have a whole chapter written about using scope as arguments for conditionals, and it's the next one... so let's read it!

Creating Custom Conditionals

When we have one type of conditional execution in a language, we can derive all the other types from it. Same for looping, if we have a looping element we can make any looping element. In the case of basm, that any conditional/looping element is WHNE which we can derive all conditionals and loops we desire from. In this chapter, we're going to first implement IFNE (If Not Equal), then from it IFEQ (If Equal).

First off, to make these meta-instruction conditionals we need to think of their arguments. We are going to want a cell address and a value so we can compare them together, a scope to be conditionally executed, and lastly, we are going to make use of the sp pattern to use some operation memory.

So, the argument table for all of our conditionals should look like this:

Name	Type	Description
addr	number	address of cell compared to`val`
val	number	value compared to cell at`addr`
[scp]	scope	code to be conditionally executed
sp	number	address to the next free cell

IFNE (If Not Equal)

IFNE is the easiest conditional to define if we consider that right now we only have WHNE. When you think about it, an if statement is simply a conditional loop that loops once. With that in mind, we'll want to purposefully make WHNE loop once while also keeping its comparason ability.

Here is how I would go about implementing a one iteration WHNE loop:

[@IFNE Aaddr Vval [scp] sp] [
    WHNE Aaddr Vval [
        INLN [scp];

        // force Aaddr to be equal to Vval
        ZERO Aaddr;
        INCR Aaddr Vval;
    ];
]

You will notice that this implementation works! But, there is some pretty big downsides that we will want to avoid.

We don't want these implementations to consume their inputs, so IFNE's can be chained. In this case this implementation sets Aaddrto the value ofVval, which is no good. If we want to chain IFNE's we will need to manually copy Aaddreach time we pass it in, which is tedious.

Apart from that, there should be no non-zero allocated cells when inlining the scope as it might corrupt the behaviour of the scope, which does not expectIFNE to allocate cells while it is running. The specifics of what I just said are important, we don't want non-zero allocated cells. What this means, is that we can allocate cells, but when we inline the scope argument, all of our cells should be zero. If all our cells are zero, it is as if, to the inlined scope, that there are no allocated cells. Allocation is only a concept, what matters is whether the cells are zero or not when the scope is inlined. This notion is important as it will allow the caller to use the same sp both in the scope and in the IFNE argument. If IFNE required a cell to be allocated when inlining the scope, then all the mentions of sp in the scope should be increased by 1.

So, let's solve these issues via extra cells that are granted by sp:

[@IFNE Aaddr Vval [scp] sp] [
ALIS Atmp1 sp;
ALIS Atmp2 sp+1;
ALIS sp sp+2;

// copy to Atmp1 so we can consume it
COPY Aaddr Atmp1 Atmp2;
ADDP Aaddr Atmp2;

WHNE Atmp1 Vval [
    // we don't care about Atmp1, it needs to be consumed before the scope
    ZERO Atmp1;

    // at this point, all of the values allocated in IFNE are zero,
    // so it's like we allocated nothing, scp can use the same cells as we just did without causing bugs
    INLN [scp];

    INCR Atmp1 Vval;
];

// cleanup
ZERO Atmp1;
]

Now let's try it and see if it works:

// .. add the IFNE definition

[main] [
    ALIS Aval 0;
    INCR Aval 42;
    ALIS sp 1;
  
    // this will print ..
    // (IFNE and PSTR can use the same sp)
    IFNE Aval 33 [
        PSTR sp "Aval is not equal to 33!";
    ] sp;

    // .. but not this
    IFNE Aval 42 [
        PSTR sp "Aval is not equal to 42!";
    ] sp;
]

Notice how both PSTR uses the same cells as IFNE? Yet, they don't collide as we purposefully zeroed all cells from the IFNE scope before running the scope argument! Try it yourself: Move the ZERO Atmp1; instruction after the inlining and see what happens when values are still non-zero.

IFEQ (If Equal)

IFEQ is very much linked to IFNE. IFEQ executes only when IFNE doesn't. We can make use of relation to easily derive IFEQ from a couple of IFNE's. Rather than copying the value around, this implementation will make use of a flag representing wheter or not IFNE was executed.

[@IFEQ Aaddr Vval [scp] sp] [
    ALIS Aflag sp;
    ALIS sp sp+1;
    ALIS Vnot_equal 1;

    IFNE Aaddr Vval [
        INCR Aflag Vnot_equal;
    ] sp;

    // this reads like "if is not equal to not equal"
    // both "not equals" cancel out and we get "if is equal"

    // once again all cells are zero here,
    // as the flag needs to be of 0 for this to execute
    IFNE Aflag Vnot_equal [scp] sp;

    // cleanup (only useful if it did not execute)
    ZERO Aflag;
]

Once again I encourage you to test it:

// .. IFEQ and IFNE definition here

[main] [
    ALIS Aval 0;
    INCR Aval 42;
    ALIS sp 1;
  
    // this will print ..
    IFEQ Aval 42 [
        PSTR sp "Aval is equal to 42!";
    ] sp;

    // .. but not this
    IFEQ Aval 60 [
        PSTR sp "Aval is equal to 60!";
    ] sp;
]

With that out of the way, it's going to be much faster to write actually useful programs and now that you know the trade, I am sure you will be able to create your own conditionals! (If you want to challenge yourself, try making a WHEQ)

Writing Relative Code

Onto the last remaining chip of knowledge that basm requires: Harvesting the Power of Relative Code! Remember at the very start of this book, where I said that every memory access is static unless otherwise stated? Well, this chapter is about stating you don't want static accessing and would much prefer write relative logic. Sometimes static addressing is either less efficient, harder or simply impossible to use for certain problems. In these cases you will need to reach for relative logic.

But, what is the difference between relative and static in basm exactly? By default, basm refers to cells in a static manner, meaning that an address will always refer to the same cell (e.g address 1 will always refer to the same cell). This kind of addressing is known as static. Relative means that an address will not always refer to the same cell, it will refer to a cell relative to a dynamic point. In most cases, that "relative point" is simply where your tape pointer is at a given moment.

Now, by default basm moves your tape pointer automatically in order to access specific addresses. Basm won't lose track of where the tape pointer is unless you explicitly tell it to do so. When you ask to operate on a cell, basm will have remembered where the tape pointer currently is, and it will then try to move the tape pointer by the delta between the current position and the desired position. The tape pointer position that is remembered by the compiler is also known as the assumed tape pointer position, as it can get temporarily invalidated when working with relative code. For example, if we have the tape pointer at position 3 and ask to reach cell 5, basm will automatically insert >>. There are three instruction made specifically for dealing with the tape pointer and be able to write relative code from it.

`BBOX` (Black Box)

Arguments

Name	Type	Description
addr	number	the address for the tape pointer to be moved to

BBOX is probably the most simple instruction in the whole set of built-in ones. Its one and only purpose is to guaranty the position of the tape pointer at a certain point in the program. All it does it move the tape pointer to addr and does nothing. This behaviour is very useful for other instruction that work off of the assumption that the tape pointer is positioned at a certain cell.

`ASUM` (Assume)

Arguments

Name	Type	Description
addr	number	the address overwriting the assumed tape pointer position

This instruction is tightly integrated with the compiler. What it does is that it overwrites the assumed tape pointer position with the specified addr. This can serve either to put the program into a relative state (where the assumed pointer position is wrong) or to set the assumed pointer position back to a valid value.

When combined with BBOX it can serve to offset the whole address field. Think about it, if the real tape pointer position is 1, but you tell the compiler it's 0 through ASUM, then the assumed pointer position will be offset by +1. Trying to access cell no.0 would lead to accessing cell no.1. When this happens, I like to call this program state the "relative state", all accesses are relative to 1. So, when this happens, all addresses become relative to the offset.

INCR 1 42;
OUT 1; // returns '*'

// offsets the pointer by +1 (so cells by -1)
BBOX 0;
ASUM 1;

OUT 0; // returns '*'

This may all see like too much pedantry for something as simple as offsetting all addresses by one. Yes, in this case it is, but that's not what makes relative code interesting. Relative programming in basm starts becoming truly interesting when you loop this offset. Once you start looping, your offset cannot be known at compile time. Looping offsets are even the main reason why anyone would want to write relative code as it allows to create "flyers". Flyers are an operation which is applied while moving forwards (or backwards) through the tape. The word "flyer"/"glider" comes from the glider in Conway's Game of Life where it is a self replicating structure moving through the board. In basm, flyers can include data cells which move with the shift in assumed pointer position. I think it is easier to understand them in bf:

[.>]

This is a basic flyer in bf. It outputs every cell until it reaches a cell with a value of 0. It may seem like it is useless at first, but it can serve to print strings in memory loaded via LSTR. This flyer takes advantage of the fact that strings are not constituted of arbitrary data and that they will not contain a zero cell (at least not until you the string's end).

We can easily create this pattern in basm with BBOX and ASUM:

// dynamic print string
// Astart is the first cell
// Aend is the 0 cell right after the last non-zero cell
[@DPSTR Astart Aend] [ // "Dynamic Print String"
    BBOX Astr;
    ASUM 0;

    WHNE 0 0 [
        OUT 0;

        // move forward
        BBOX 1;
        ASUM 0;
    ];

    // we set the pointer back to the valid assumption after we are done
    ASUM Aend;
]

(If you compile the above example with a [main] field with only DPSTR 0 0;, it will compile to exactly [.>])

Most notably, this instruction will return the assumed pointer position to a valid value after running. Logic making use of relative state should always use ASUM to set back the assumed pointer position to a valid state. Since all basm code relies on the fact that addressing is static, leaving it in a relative state will no doubt make your program bug!

`RAW`

Arguments

Name	Type	Description
str	string	inlines the string in the transpiled code, this can be used to include brainfuck operators

RAW is inherently relative by default, as any moves made in it is not considered by the compiler. As thus, writing RAW ">"; has the same effect using BBOX 1; ASUM 0; like before. You have access to the whole of bf, which means that you can very easily cause offsets and write flyers. While I prefer using the more basm'ic solution, RAW is a valid solution for your relative programming needs.

Next chapter will focus on applying relative knowledge to create a dynamically indexable array.

Dynamically Indexing

We are now here, all the content chapters are after us and we are on the last stretch to making the final project, being a bf interpreter written in basm. But, before being able to go into that directly, I'd like to isolate a part of that bigger program and built it from scratch prior. To make a bf interpreter we will need a dynamically indexable array.

Parking

Yes, I do say dynamically indexable array and not meta-instruction for dynamically indexing. While it would be possible to index any array dynamically without changing its layout, for the sake of conciseness, we will need to modify the layout of a basic array somewhat to allow for our dynamic get/set operations. To hold meta-instruction specific logic we'll reserve a little amount of memory before the actual array content. That memory region in the array is called the "parking".

Since we are going to use a flyer to move through the array, we need somewhere to create it before it starts moving. The reserved space is going to be used for flyer construction. Our implementation of a dynamically indexable array will need 4 cells zeroed cells for parking before the main contents of the array. Getting and setting will use these spaces in different ways.

`GETD` (Get Dynamic)

Arguments

Name	Type	Description
Aarray	number	the address of the start of the array (including the parking)
Aindex	number	the address of the cell to be gotten in the array
Adst	number	the address that will be used to store the gotten cell

As mentioned prior, we will be using a flyer with state to move through the array. The data it will need to hold are the index (aka the number of cells it needs to move), the return index (aka the number of cells it needs to move to get back to the parking) and, finally, when on the return, it will of course need to contain the gotten cell. We will also need to reserve an empty space (which we will call [swap]) to swap array content from the front to the back of our flyer.

Flyer Layout

A diagram of our flyer would look like so:

[swap].[return][index].[element] ->

And then on the return, once index is depleted:

<- [element].[return][cell].[swap]

[index] holds the distance left until the flyer reaches the index. It counts down for everytime the glider moves forward.

[return] holds og_index - [index]. Basically everytime [index] is decreased, we increase [return]. Since we slowly consume [index] to go forward, we need another counter to go backward. That counter comes in the form of the [return] cell to be able to return. In the best of worlds we would just move back until we reach a certain flag, rather than having to store that [return] index, but unlike strings, arrays can hold any arbitrary data. This means that we can't just move back until we meet a cell with a specific value, as that specific value could be contained in the array itself!

[cell] holds the value of the cell at the index. Once [index] is zeroed by the decrements, [cell] takes its place.

[swap] is the destination cell for [element]. Since array needs to be moved out of the way in order for the glider to move forward, an empty cell needs to be reserved for the element to be moved into. Note that [swap] needs to be there in the parking so that the cell no.0 can be moved out, although [[swap] is not part of the glider. The [swap]cell on the flyer should always be zero. By default,[swap]is included in our parking space and, as we move forward, it gets zeroed by the glider moving operation, so[swap] should always be zero If it is not, it will corrupt the array when flying through it as adding into a non-zero cell is no longer a move, like we expect, but an actual addition.

[element] represents the element of the array right in front of the flyer, it is not part of the flyer. It is the first cell of content of the array in the initial glider state. (Not much to add here...)

(Right now, not all parking cells are used, don't worry we will need all the 4 of them in ADDP)

Making our Flyer Move

Moving our gliders (yes, I will also call flyers gliders) around is going to be a 4 step operation:

End if [index] == 0, decrease the [index] by 1,
Move the cell in front of us ([element] to our back [swap])
Move the whole of the glider cells up by one
Offset the assumed pointer position by one and repeat

This procedure will shift all the cells before the cell at index to our back, this is fixed on the way back, where we do the same shifting to go backwards, effectively cancelling the shift.

A Simpler `INCD` (Increment Dynamic)

Before getting too much into the weeds of implementing this whole thing, let's just stand back a little and write the simplest part of it: a glider that can move to an dynamic index and increase the cell by 1.

That glider will be much simpler! We won't need [return] or [cell], making it a flyer with 1 cell of state.

For context, diagram of our flyer would look like so:

[swap].[index].[element] ->

We won't be able to move backwards without [return]. This also means that the assumed pointer position will be busted once the glider has done its thing. INCD as written here should be the last thing you do execute your program.

[@INCD Aarray Aindex] [
// we still assume the array has 4 cells of parking
// we load the parking like so: [empty][empty][swap][index]
ADDP Aarray+3 Aindex; // Aindex -> [index] 

// entering relative mode
// (since we won't have free unallocated space, we won't be able to use instructions with sp)
BBOX Aarray+2;
ASUM 0;

ALIS Aswap 0;
ALIS Aindex 1;
ALIS Aelement 2;

WHNE Aindex 0 [
    DECR Aindex 1; // step 1
    ADDP Aswap Aelement; // step 2

    ADDP Aindex+1 Aindex; // step 3

    // step 4
    // (it's very important do this before the end of the scope,
    // because we need Aindex to point to the actual index cell so that the WHNE can check it)
    BBOX 1;
    ASUM 0;
];

// we are here, this means that we moved all we need
// the cell we want is right in front of us at Aelement
INCR Aelement 1;

// normally this is here we would come back to unshift all the array
// and fix the assumed tape pointer position, but we can't since we didn't store a return index!
]

Let's try it by giving it an array with some already initialized values!

// add the INCD definition here...

[main] [
ALIS Aindex 0;
INCR Aindex 3;

ALIS Aarray 1;
ALIS Aarray_contents Aarray+4;
INCR Aarray_contents 1;
INCR Aarray_contents+1 2;
INCR Aarray_contents+2 3;
INCR Aarray_contents+3 4; // the one that will be increased
INCR Aarray_contents+4 5;
INCR Aarray_contents+5 6;

INCD Aarray Aindex;
]

If we use the -d flag to dump the memory after execution we can see the program state:

-- TAPE STATE NUMERIC --
0: 0
1: 0
2: 0
3: 1
4: 2
5: 3
6: 0
7: 0
8: 5
9: 5
10: 6

Good news, the cell containing 4 was successfully increased to 5! Bad news, we can see the effects of the array shift: every element before the one we operated on got offset by 2. Notably, you can also see the parking (cells 1 to 4) has been partially overwritten by the shifted elements of the array and the cells 6 and 7, being those of our flyer, being completely zeroed as expected.

Don't expect to write these kind of meta-instruction right on the first try, there is always going to be little typos and unforeseen behaviour! Myself, while writing this example, had made many typos and oversights on the first try.

Back to `GETD`

Alright, enough dilly dallying now! Let's get to actually implementing GETD with what we have learnt.

The first notable difference of GETD from what we just wrote is the memory layout, we already saw what we needed extensively in the layout section. Just keep in mind that our parking layout should look like this:

[empty][swap][return][index]

Just adding the [return] cell (and later the [cell] cell) to our logic isn't too difficult:

[@GETD Aarray Aindex Adst] [
ADDP Aarray+3 Aindex; // Aindex -> [index] 

// switch +2 -> +1 because our flyer is bigger
BBOX Aarray+1;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1; // add return
ALIS Aindex 2;
ALIS Aelement 3;

WHNE Aindex 0 [
    DECR Aindex 1; // step 1
    INCR Areturn 1;

    ADDP Aswap Aelement; // step 2

    ADDP Aindex+1 Aindex; // step 3
    // it's important to correctly order the moves,
    // so that your glider doesn't overlap itself
    ADDP Areturn+1 Areturn;

    // step 4
    BBOX 1;
    ASUM 0;
];

// we arrived!
// we don't have a [index] anymore, but we have a [cell] at its place
ALIS Acell Aindex;

ADDP Acell Aelement; // taking the element

// not done yet...

We are not done here! The second notable difference from the INCD is going to be that we want to go back both to un-shift the cells and reset the assumed pointer position to the correct value. If we don't do both of these things, GETD becomes practically the last valid instruction in the program. To do that we will have to copy the code of going forward and modify it to go back. It is as simple as making our loop read from [return]instead of[index] and reordering the flyer state moves.

// ... rest of GETD implementation

// since we move back, we'll swap [element] and [swap]
ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1; // step 1

    // we now have <- [element][return][cell][swap]
    // we move back so -1 instead of +1
    ADDP Areturn-1 Areturn; // step 3
    ADDP Acell-1 Acell;

    // step 4
    // we inverted BBOX and ASUM values so we can go back
    BBOX 0;
    ASUM 1;

    // step 2 (because of off by 1 reasons, we need it after the move)
    ADDP Aswap Aelement;
];

// we are back at parking here and since we know where parking is,
// we can set the assumed pointer back to a valid value!
BBOX Aelement;
ASUM Aarray+1;

ADDP Adst Aarray+3;
]

We can now try the same [main] as with INCD before, but with INCD Aarray Aindex; changed to GETD Aarray Aindex 0;:

-- TAPE STATE NUMERIC --
0: 4
1: 0
2: 0
3: 0
4: 0
5: 1
6: 2
7: 3
8: 0
9: 5
10: 6

That's exactly what we wanted, the 3rd element of the array has been gotten and moved to cell 0. Notice how none of the array is shifted anymore because of the backtracking we did, perfect!

`ADDD` (Add Dynamic)

Arguments

Name	Type	Description
Aarray	number	the address of the start of the array (including the parking)
Aindex	number	the address of the cell to be gotten in the array
Asrc	number	the address of the cell that will be added to the dynamic one

Now that we are done with GETD, we can move onto ADDD. This meta-instruction is going to serve as our setter, we won't call it SETD though, because it will not zero the cell it adds to, making it more of an add than a set. This means that pretty much all the rules that we saw for moving with ADDP's will apply to this, except now in a dynamic fashion.

Flyer Layout

The interesting (and annoying) part of the ADDD glider, is that rather than carrying a cell value on the return trip, we do it on the going trip. This means we can't take advantage of the index being zeroed in the return trip to store the cell value where index stood. So, sadly, this requires us to make the flyer have 3 reserved data cells at the start (plus the swap, maxing out our 4 spots of parking).

A diagram of our flyer would look like so:

[swap].[return][index][cell].[element] ->

And then on the return, once index is depleted:

<- [element].[return][empty][empty].[swap]

It's going to be important to keep the [empty] cells empty, so that we properly un-shift all the cells on the return.

Implementing

ADDD is very similar to GETD outside of the flyer layout. The only notable difference between the two being that the flyer has 3 data cells, and that rather than taking a cell, it gives one of its cells. Other than that, you can probably take GETD and make an ADDD out of it with ease like we made GETD from our knowledge of INCD (despite being similar in name, ADDD is very different from our arguably incomplete INCD implementation).

With that said, this means that a ADDD implementation should not be foreign to you. So, here is its definition:

[@ADDD Aarray Aindex Asrc] [
// this layout forward: [swap].[return][index][cell].[element]
ADDP Aarray+3 Asrc;   // Asrc   -> [cell]
ADDP Aarray+2 Aindex; // Aindex -> [index] 

// we'll remove the +1, because we'll use the full parking
BBOX Aarray;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1;
ALIS Aindex 2;
ALIS Acell 3; // new!
ALIS Aelement 4; // .. which offset this by one

WHNE Aindex 0 [
    DECR Aindex 1; // step 1
    INCR Areturn 1;

    ADDP Aswap Aelement; // step 2

    ADDP Acell+1 Acell; // step 3 (also, new Acell move!)
    ADDP Aindex+1 Aindex;
    ADDP Areturn+1 Areturn;

    // step 4
    BBOX 1;
    ASUM 0;
];

// we arrived!
// adding the element out of the flyer
ADDP Aelement Acell;

ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1; // step 1

    // we now have <- [element][return][empty][empty][swap]
    // (we won't need to move both empty cells)
    ADDP Areturn-1 Areturn; // step 3

    // step 4
    BBOX 0;
    ASUM 1;

    ADDP Aswap Aelement;
];

BBOX Aelement;
ASUM Aarray; // removed the +1, again
]

Running the following program: ..

// .. ADDD definition here

[main] [
ALIS Aindex 0;
INCR Aindex 3;
// we need to allocate one more cell
ALIS Acell 1;
INCR Acell 40;

ALIS Aarray 2;
ALIS Aarray_contents Aarray+4;
INCR Aarray_contents 1;
INCR Aarray_contents+1 2;
INCR Aarray_contents+2 3;
INCR Aarray_contents+3 4;
INCR Aarray_contents+4 5;
INCR Aarray_contents+5 6;

ADDD Aarray Aindex Acell;
]

.. would expectedly give us this result:

-- TAPE STATE NUMERIC --
0: 0
1: 0
2: 0
3: 0
4: 0
5: 0
6: 1
7: 2
8: 3
9: 44
10: 5
11: 6

With that out of the way, we are ready to tackle the last chapter of this book! In the next chapter, we will use what we just wrote to handle the memory tape array and instruction array reading/setting of our bf interpreter. So, don't throw away these meta-instructions just yet!

Note

These dynamic get/set implementation can probably be improved by you! Due to the nature of flyers (specifically moving the front element to the back swap), using long arrays/a lot of dynamic array addresing is not good for performance. If you can implement something while forgoing using these GETD/ADDP, prefer the unglided way.

Furthermore, there is a way to make dynamic indexing for regular arrays without parking. This would require to move some cells out of the way to create a temporary parking and then set them back after the instruction is done. The issue with this, and why this is not the version used here, is because it is a bit more complex. You need to check wether the index refers to one of the cells that you are going to move away for example.

Synthesis: Making a Bf Interpreter

This chapter focuses entirely on the conception process of a bf interpreter in basm. This interpreter will be able in turn to transpile to bf, to make a bf interpreter in bf! It is highly recommended to have read all other chapters before digging into this one.

Meta-instructions "header"

While working on the logic of the interpreter, I will use meta-instruction which we already defined earlier (notably GETD and ADDD). Make sure to add these meta-instructions definitions before any other code that uses them:

// sets a value to a specific value by zeroing it before writing
[@SET Aaddr Vval] [
ZERO Aaddr;
INCR Aaddr Vval;
]

// copies the content of a cell whilst keeping the source (conservatively)
[@COPC Asrc Adst sp] [
	ALIS Atmp sp;
	COPY Asrc Adst Atmp;
	ADDP Asrc Atmp;
]

// if not equal conditional
[@IFNE Aaddr Vval [scp] sp] [
ALIS Atmp1 sp;
ALIS Atmp2 sp+1;
ALIS sp sp+2;

COPY Aaddr Atmp1 Atmp2;
ADDP Aaddr Atmp2;

WHNE Atmp1 Vval [
    ZERO Atmp1;

    INLN [scp];

    INCR Atmp1 Vval;
];

ZERO Atmp1;
]

// if equal conditional
[@IFEQ Aaddr Vval [scp] sp] [
    ALIS Aflag sp;
    ALIS sp sp+1;
    ALIS Vnot_equal 1;

    IFNE Aaddr Vval [
        INCR Aflag Vnot_equal;
    ] sp;

    IFNE Aflag Vnot_equal [scp] sp;
    ZERO Aflag;
]

// moves the value of the cell at Aarray[value of Aindex] to Adst
[@GETD Aarray Aindex Adst] [
ADDP Aarray+3 Aindex; 

BBOX Aarray+1;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1;
ALIS Aindex 2;
ALIS Aelement 3;

WHNE Aindex 0 [
    DECR Aindex 1
    INCR Areturn 1;

    ADDP Aswap Aelement;

    ADDP Aindex+1 Aindex;
    ADDP Areturn+1 Areturn;

    BBOX 1;
    ASUM 0;
];

ALIS Acell Aindex;

ADDP Acell Aelement;

ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1;

    ADDP Areturn-1 Areturn;
    ADDP Acell-1 Acell;

    BBOX 0;
    ASUM 1;

    ADDP Aswap Aelement;
];

BBOX Aelement;
ASUM Aarray+1;

ADDP Adst Aarray+3;
]

// adds the value of the cell at Asrc to the Aarray[value of Aindex] cell
[@ADDD Aarray Aindex Asrc] [
ADDP Aarray+3 Asrc;
ADDP Aarray+2 Aindex;

BBOX Aarray;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1;
ALIS Aindex 2;
ALIS Acell 3;
ALIS Aelement 4;

WHNE Aindex 0 [
    DECR Aindex 1;
    INCR Areturn 1;

    ADDP Aswap Aelement;

    ADDP Acell+1 Acell;
    ADDP Aindex+1 Aindex;
    ADDP Areturn+1 Areturn;

    BBOX 1;
    ASUM 0;
];

ADDP Aelement Acell;

ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1;

    ADDP Areturn-1 Areturn;

    BBOX 0;
    ASUM 1;

    ADDP Aswap Aelement;
];

BBOX Aelement;
ASUM Aarray;
]

Memory Layout

Our the memory of our bf interpreter's tape consist of 3 parts:

Generic memory for operations, aka operating memory
Instruction array (with parking)
Tape array (with parking)

In memory the sections should look like this:

[operation][instruction][tape]

The tape array will be allowed to grow infinitely, as there is no ranges of allocated allocated cells after it. This is unlike the instruction array, which is stuck in-between the operation and tape array, meaning that we will need to reserve it a static size so that it does not overwrite the tape array's cells. Since we can't expand infinitely before the 0'th cell, as using cells before the 0'th is invalid bf (not the one we are going to interpret, but the one our program is made of), we cannot grow infinitely backwards, only forwards.

Taking User Input

We will want the user to be able to input any bf program they find and paste it directly into the bf input without any processing on the part of the user. We'll also need a way for the user to tell us that they are done inputting the program, for this we can check for a specific character like ! to stop instruction input.

To do this your first though might be to have a counter representing the number of characters inputted. When there is an input, use ADDD to set it to the index of counter and then increase counter by 1. Repeat until we reach !.

That is fine, with nothing more. There are some issues, mostly performance (both compute and size), which arise from it. We'll need all the space we can take, and I assure you that any performance gain will be very appreciated.

First, using this method, we will end up storing any non-operator characters in our instruction array. This both takes up space for no reason and slows down GETD and ADDD (the more element in the array, the less efficient they become).

Second, we don't take advantage of the limited nature of the operators. The set of characters which represent an operator is very restricted. When storing a + with this method, we will actually store the + character, being 43. Remember, most operations in bf take O(n) time, where n: value of the cell Whereas in most languages, operations are in O(1) time, aka constant time. This means that, if we can reduce the values representing operators, this will speed up all operations on operators including indexing them, since array indexing is just shifting them around (shifting once is O(n))

To address these issues, we will discard all non-operators characters and we will map all operators to the lowest values possible. We can also use 0 to denote the start/end of the array contents. Since all cells are zero by default, we won't need to add 0's manually, which is quite nifty. (You could also store the array length to know when you reach the end of the contents, but that will not be possible with our implementation) Here is the mapping:

start / end -> 0
`+` -> 1
`-` -> 2
`>` -> 3
`<` -> 4
`[` -> 5
`]` -> 6
`,` -> 7
`.` -> 8

There is one more issue, although rather small compared to the two first ones. That being, ADDD is not made for this at all! Every time we would add a new operator, we would need to travel through all the other elements of the array we already added before. It's not that bad, since instruction input is a one time process, but this is the best example where GETD and ADDD like meta-instruction should be avoided. So we are going to work around having to use them by making a dedicated instruction array filling flyer.

Writing The Flyer

Instead of having a flyer go out for each operator through the use of ADDD, we can simply write a specialized flyer to fill the array. This input flyer will be able to take advantage of the context to have zero cells of state! That's right, we don't care about where we are or where we are going, so no index needed. Furthermore, since the array it will be constructing does not contain arbitrary data, we can guaranty that certain values will not show up (like 0). So, we can return without having to store a return cell. For the return, we will look for 0, which is not mapped to any instructions, but is present in the parking.

I'll write this logic into a program specific meta-instruction. A program specific meta-instruction is a meta-instruction which is only useful within the context of the program. Custom instructions like COPC are not program specific, because they are generic enough to be used in many circumstances. This is not the case with what we will write. When writing program specific meta-instructions, I like to give them snake case names, like aliases. Don't worry meta-instruction names cannot overwrite your aliases.


[@init_input Aarray] [
BBOX Aarray+4;
ASUM 0;

ALIS Aflag 1;
ALIS Vappended 1;
ALIS Vexit 2;

ALIS sp 2;
// we can still use sp, since we know the cells will be zeroed

WHNE Aflag Vexit [
    IN 0;

    // -- mapping operators --
    IFEQ 0 '+' [
        SET 0 1;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '-' [
        SET 0 2;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '>' [
        SET 0 3;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '<' [
        SET 0 4;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '[' [
        SET 0 5;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 ']' [
        SET 0 6;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 ',' [
        SET 0 7;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '.' [
        SET 0 8;
        INCR Aflag Vappended;
    ] sp;

    // check for end
    IFEQ 0 '!' [
        ZERO 0;
        SET Aflag Vexit;
    ] sp;

    // if we added something, we need to move up one
    IFEQ Aflag Vappended [
        ZERO Aflag;

        BBOX 1;
        ASUM 0;
    ] sp;

    // we don't clear the last character, even if it was invalid,
    // as the next IN will overwrite the cell
];

// NEVER forget cleanup
ZERO Aflag;

// -- going back --
// because the last char will be our zeroed '!', we need to offset by at least 1
// if we don't want to read 0 directly
BBOX 0;
ASUM 1;

// moves back until we reached the zeroed cells of the parking
WHNE 0 0 [
    BBOX 0;
    ASUM 1;
];

ASUM Aarray+3;
]

As you can see we once again use a flag for control flow. In this case our flag can have three states:

0: when nothing has happened
Vappend: when a operation is appended to the array
Vexit: when the '!' character is detected

This flag doesn't count as the flyer having a state properly speaking. It's more like operation memory than state, since we won't carry that flag along for each glider shift.

We also, despite having a flyer, use sp. We can safely do this as we can assume that the un-visited parts of the array are still zeroed. Also, our input safely overwrites the flag when getting user input, meaning that our code will always consume our flag, leaving no trash on the array. Once again, taking advantage of safe array assumptions saves us from using ADDD/GETD which operate on no assumption and are much less efficient.

Memory Layout in `[main]`

First and foremost, before defining the main logic we'll define the cells for our program to use.

ALIS Voperating_memory 16;
ALIS Aprog_pointer 0;
ALIS Amem_pointer 1;
ALIS Aflag 2;
ALIS Vexit 1;
ALIS Vbracket 2;
ALIS Aoperator 3;
ALIS Acell 4;
ALIS Abracket_jump 5;
ALIS sp 6;

ALIS Vprog_array_cap 256;
ALIS Aprog Voperating_memory;
// we add 4 to take into account the parking of the program array
ALIS Amem Aprog+Vprog_array_cap+4;

There's alot to unpack here. Here are the most notable ones:

`Voperating_memory`

This alias defines how many cells are reserved for general computation (that being all but the arrays). The reserved space granted by Voperating_memory is required for us to store cells like Aflag. It is also required for meta-instruction using sp so they can have zeroed cells. Otherwise, without free zeroed cells, meta-instructions would reach into the program array, which would be bad for program state.

`Aflag`

Aflag is once again a flag cell. It can have one of these values:

0: nothing happened
Vexit: program is done and should exit
Vbracket: a bracket has jumped, so Aprog_pointer should be updated to Abracket_jump (this needs to be done after all the other logic, which is why it is stored in the flag)

`Aoperator` and `Acell`

Those two aliased cells will hold both the current operator and current memory cell respectively. Since we can't edit cells directly in their array, we need to keep them in a static place in memory to read/write them.

`Aprog`

This alias defines the start of the program array (at its parking). The way it is defined is that it comes straight after the operating memory segment, meaning that its address can be defined as the length of the operating memory segment! That means that the starting position of the instruction array is relative to the end of the reserved operation cells range. This would cause an off by 1 error if basm started indexing cells from 1, but it indexes from 0, so we are good!

Aprog also comes with Vprog_array_cap aka the capacity of the program array. We don't inherently need it to define the array, but we will need to define how far away we want the memory array to be from the program array. Effectively, choosing how much space we allocated to the program array. In this example, I set 256 as the array capacity, because that is the max that you can index with an unsigned 8 bit cells. If you want to interpret bigger programs, you can increase this capacity, but beware you will also need to use cells of more than 8 bits to be able to index the instructions!

`Amem`

All that I specified about Aprog applies to Amem. The value is the start of the array (at its parking). While the memory array is technically not limited by memory, since it cannot overwrite other memory, it is still practically limited by the (un)efficiency of indexing and the maximum value storable in the cells, limiting the number of indexable cells.

`Abracket_jump`

Is used to store the address of the bracket operation matching the current bracket operation. (Aka, the [ which is linked to the current ] or vice versa) The bracket jumping implementation will not set the program pointer directly to it when running the bracket specific logic, since the program needs to do some cleanup before updating the Aprog_pointer. That cleanup includes setting the current operation back in Aoperatorback in the array. To do that we need to know the index of that operator, which is always just the index inAprog_pointer. This is why we don't change the operator pointer right away as we need to store it to return Aoperator. You are probably going to understand it better when you'll see the code managing this for yourself.

Writing `[main]` Logic

The bread and butter of our logic is of course going to be a loop. We'll loop over every operator in the program until we reach the end, applying the effects of them as we go. To be more precise here's the broken down version of what the loop does:

Get the operator at Aprog_pointer
Execute the operator specific logic
Set the operator back
Increase Aprog_pointer

Notice how getting the memory cell is not part of the loop, and neither will it be part of operator specific logic. This is because it would be incredibly wasteful to get and set the cell in the tape for every operator since the cell mostly remains over operators. Only > and < change the cell index, we will set and get a new cell only when these operators appear. Think about interpreting the very common pattern +++. If we were to get the cell, increase it by one and then put it back 3 times performance would be horrendous! So, to combat this, Acell functions more like a cell "cache" then its Aoperator counterpart. We only get/set the memory cell when moving the tape pointer with > and <.

With all that said, let's start implementing the easy parts of the logic, that being all operators, but the brackets:

// .. add the header here
// .. define init_input here

[main] [
ALIS Voperating_memory 16;
ALIS Aprog_pointer 0;
ALIS Amem_pointer 1;
ALIS Aflag 2;
ALIS Vexit 1;
ALIS Vbracket 2;
ALIS Aoperator 3;
ALIS Acell 4;
ALIS Abracket_jump 5;
ALIS sp 6;

ALIS Vprog_array_cap 256;
ALIS Aprog Voperating_memory;
ALIS Amem Aprog+Vprog_array_cap+4;

// get program from user
init_input Aprog;

WHNE Aflag Vexit [
    ALIS Atmp sp;
    ALIS sp sp+1;

    // 1. load in operator
    COPC Aprog_pointer Atmp sp;
    GETD Aprog Atmp Aoperator;

    // 2. operator specific logic
    // + + + + +
    IFEQ Aoperator 1 [
        INCR Acell 1;
    ] sp;

    // - - - - -
    IFEQ Aoperator 2 [
        DECR Acell 1;
    ] sp;

    // > > > > >
    IFEQ Aoperator 3 [
        // store the old cell
        COPC Amem_pointer Atmp sp;
        ADDD Amem Atmp Acell;

        // get the new cell
        INCR Amem_pointer 1;
        COPC Amem_pointer Atmp sp;
        GETD Amem Atmp Acell;
    ] sp;

    // < < < < <
    IFEQ Aoperator 4 [
        // store the old cell
        COPC Amem_pointer Atmp sp;
        ADDD Amem Atmp Acell;

        // get the new cell
        // NOTE: this DECR may underflow the tape pointer
        DECR Amem_pointer 1;
        COPC Amem_pointer Atmp sp;
        GETD Amem Atmp Acell;
    ] sp;

    // TODO: implement '[' and ']'

    // , , , , ,
    IFEQ Aoperator 7 [
        IN Acell;
    ] sp;

    // . . . . .
    IFEQ Aoperator 8 [
        OUT Acell;
    ] sp;

    // check for the end
    IFEQ Aoperator 0 [
        INCR Aflag Vexit;
    ] sp;

    // set the operator back
    COPC Aprog_pointer Atmp sp;
    ADDD Aprog Atmp Aoperator;

    // NOTE: we'll add something here later... (ominous)

    // increase program pointer
    INCR Aprog_pointer 1;
];
]

With that that's seemingly most of the bf interpreter done already! (with the exception of [ and ]) You can take it easy from here we're already on the final stretch. With this stripped down bf interpreter, we don't have access to looping or conditionals, but we have enough to print some basic characters. Try the following code: (don't forget to use ! to end program input!)

outputs the answer to life the universe and everything
++++++++++++++++++++++++++++++++++++++++++++++++++++.--.

The "Needlessly" Complicated Part

You may not have understood the part about Abracket_jump and Vbracket for Aflag. It's time where we need to use them... Welp, the bracket jumping is logic not actually that bad, it's just that it is the most complicated part of this interpreter by far. If you equate complexity with badness, like me, this is the worst part, but it ain't that bad. Don't treat, my implementation is going to be the simplest one possible.

When we get a [ and we point to 0 or when we have ] and point to a non-zero value, we follow this procedure to find the index of the opposing matching bracket:

Initiate a counter at 1
Initiate a search pointer at the current program pointer
Add/Subtract the search pointer by 1
Get the operator at the search pointer
If the operator is a bracket and it is the same as the original, add 1 to the counter. Else, if the operator is a bracket and the bracket is opposite, remove 1 to the counter
Set the operator back at the search pointer
If the counter is non-zero, repeat from number 3

With this procedure, our search counter should always end on the matching bracket! Or, the program will loop forever/read oob if there is an unmatched bracket, but most bf code doesn't contain unmatched brackets. So, this is not a issue by one bit. Once we have completed the procedure the searched search counter, equates to the position of the matching bracket. We don't want to apply it right away for reasons stated before. Instead, we will opt into setting the flag to Vbracket, so we can apply it after having set the operator back.

Add these IFEQ instructions to apply bracket logic:

// [ [ [ [ [
IFEQ Aoperator 5 [
    IFEQ Acell 0 [
        ALIS Acounter sp;
        INCR Acounter 1; // 1.
        ALIS Asearch_op sp+1;
        ALIS sp sp+2;

        // 2.
        // we don't need to define our search counter here,
        // we can use Abracket_jump directly
        COPC Aprog_pointer Abracket_jump sp;
        WHNE Acounter 0 [
            // 3.
            INCR Abracket_jump 1;

            // 4.
            COPC Abracket_jump Atmp sp;
            GETD Aprog Atmp Asearch_op;

            // 5.
            // if op == '['
            IFEQ Asearch_op 5 [
                INCR Acounter 1;
            ] sp;
            // if op == ']'
            IFEQ Asearch_op 6 [
                DECR Acounter 1;
            ] sp;

            // 6.
            COPC Abracket_jump Atmp sp;
            ADDD Aprog Atmp Asearch_op;
        ]; // 7.

        INCR Aflag Vbracket;
    ] sp;
] sp;

// ] ] ] ] ]
IFEQ Aoperator 6 [
    IFNE Acell 0 [
        ALIS Acounter sp;
        INCR Acounter 1; // 1.
        ALIS Asearch_op sp+1;
        ALIS sp sp+2;
  
        // 2.
        COPC Aprog_pointer Abracket_jump sp;
        WHNE Acounter 0 [
            // 3.
            DECR Abracket_jump 1;
  
            // 4.
            COPC Abracket_jump Atmp sp;
            GETD Aprog Atmp Asearch_op;
  
            // 5.
            // if op == '['
            IFEQ Asearch_op 5 [
                DECR Acounter 1;
            ] sp;
            // if op == ']'
            IFEQ Asearch_op 6 [
                INCR Acounter 1;
            ] sp;
  
            // 6.
            COPC Abracket_jump Atmp sp;
            ADDD Aprog Atmp Asearch_op;
        ]; // 7.

        INCR Aflag Vbracket;
    ] sp;
] sp;

And the extra conditional to check the flag and apply the bracket jump: (you should add it at my ominous `NOTE

// if there was a jump, set the pointer to the matching bracket
// (having the pointer increment after is intentional)
IFEQ Aflag Vbracket [
    ZERO Aprog_pointer;
    ADDP Aprog_pointer Abracket_jump;
    ZERO Aflag;
] sp;

Whew... that, was, all. So now, with this enormous blob of bracket logic written, we finally have a 100% functional bf interpreter in basm!!!

Try it out with this fancier "Hello Word!" example: (taken from Wikipedia)

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

What to Improve

You now own a bf interpreter, problem: it's slooooooooow... I highly encourage you to keep on working on this code to get a better handle on the logic that goes behind it. You can probably improve your knowledge of the language overall just by reviewing the interpreter. In implementing tweaks, you will find your own preferred way to approach writing basm code and find how you typically make mistakes. Making mistakes is part of learning, don't forget that!

The bf interpreter may have seemed like an easy project if you just copied and pasted the code provided in this book, but let me tell you it isn't! Myself, despite having already written an extremely similar interpreter in basm and despite basm being a language I created, has struggled with many bugs which were not mentioned in this book for brevity and also for my ego a little. For example, as I was rewriting every meta-instruction from scratch for the book, I made a mistake in the original IFEQ definition. I forgot to zero the flag. This issue caught up with me when writing this very chapter on the interpreter. I didn't suspect that IFEQ, which I had written a long while ago and worked seeming well, would pollute cells. Now, with my experience and the fact that I already had a working IFEQ implementation before writing the book version, you would expect it to be easy to spot that mistake. It wasn't, especially considering this was not the only mistake I had made on the interpreter.

I won't go over how to implement patches to improve the interpreter beyond what is shown above, but I can point out areas that you can thinker with if you crave performance.

Bulking Operators

Adjacent operators of the same type can be merged into one single 2 cell wide item on the program array. For example, you could lay bulked operator on the array with this layout: [operator][recurrence]. So, [+][+][+][+] could be represented as [+][4] saving both space and the time to get those cells. (Saving space also comes with saving time, as the bigger the array is, the longer it will take to index)

You could also store the precomputed index of the matching partner for brackets instead of recurrence. That would save iterating over the program array every time a bracket jumps.

While at storing items of two cells wide in an indexable array, you could also implement indexable arrays of 2 cell big elements. Firstly, that would save you having to do two gets/sets for each element. Secondly, you could index by element rather than by cell, increasing the length of programs that can be stored with 8 bit cells.

Flipping The Program Array

Currently memory is layed-out as such:

[operating memory].[prog parking][prog content].[mem parking][mem content]

This is perfectly fine if you don't care what the transpiled bf looks like. If you do care about what the compiled bf looks like you should be worried.

The compiled basm (with optimization) looks like this:

>>>>>>>>>>>>>>>>>>>>>--[++<,[->>>+>+<<<<]>>>>[-<<<<+>>>>]<-------------------------------------------[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]+>>>][-]---------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]++>>>][-]--------------------------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]+++>>>][-]------------------------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]++++>>>][-]-------------------------------------------------------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]+++++>>>][-]---------------------------------------------------------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]++++++>>>][-]--------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]+++++++>>>][-]----------------------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<+<[-]++++++++>>>][-]---------------------------------<[-]<<[->>>+>+<<<<]>>>>[-<<<<+>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<[-]>[-]++>>][-]-<[-]<[->>+>+<<<]>>>[-<<<+>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<[-]>>>][-]<[-]<--][-]<<[<]<<<<<<<<<<<<<<<<<-[+<<[->>>>>>+>+<<<<<<<]>>>>>>[->>>>>>>>>>>>>+<<<<<<<<<<<<<]>[-<<<<<<<+>>>>>>>]>>>>>>>>>>>>[->[-<<<+>>>]<[->+<]<+[->+<]>>]>[-<+>]<<[-[-<+>]>[-<+>]<<<[->>>+<<<]>]>[-<<<<<<<<<<<<<<<<+>>>>>>>>>>>>>>>>]<<<<<<<<<<<<<<<<[->>>>>+>+<<<<<<]>>>>>>[-<<<<<<+>>>>>>]<-[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<+>>>>][-]--<[-]<<<<[->>>>>+>+<<<<<<]>>>>>>[-<<<<<<+>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<->>>>][-]---<[-]<<<<[->>>>>+>+<<<<<<]>>>>>>[-<<<<<<+>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<[->>>>>+>+<<<<<<]>>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>[-<<<<<<+>>>>>>]>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>[->>[-<<<<+>>>>]<[->+<]<[->+<]<+[->+<]>>]>[->+<]<<[-[-<+>]<<[->>>>+<<<<]>]<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+[->>>>>+>+<<<<<<]>>>>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>[-<<<<<<+>>>>>>]>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>[->[-<<<+>>>]<[->+<]<+[->+<]>>]>[-<+>]<<[-[-<+>]>[-<+>]<<<[->>>+<<<]>]>[-<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<][-]----<[-]<<<<[->>>>>+>+<<<<<<]>>>>>>[-<<<<<<+>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<[->>>>>+>+<<<<<<]>>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>[-<<<<<<+>>>>>>]>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>[->>[-<<<<+>>>>]<[->+<]<[->+<]<+[->+<]>>]>[->+<]<<[-[-<+>]<<[->>>>+<<<<]>]<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<-[->>>>>+>+<<<<<<]>>>>>[->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]>[-<<<<<<+>>>>>>]>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>[->[-<<<+>>>]<[->+<]<+[->+<]>>]>[-<+>]<<[-[-<+>]>[-<+>]<<<[->>>+<<<]>]>[-<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<][-]<[-]<<<<[->>>>>>>+>+<<<<<<<<]>>>>>>>>[-<<<<<<<<+>>>>>>>>]<-----[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<[->>>>>>+>+<<<<<<<]>>>>>>>[-<<<<<<<+>>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<+<<<<<<<[->>>>>+>>>>+<<<<<<<<<]>>>>>>>>>[-<<<<<<<<<+>>>>>>>>>]<<[<<+[->+>>>+<<<<]>[->>>>>>>>>>>>>+<<<<<<<<<<<<<]>>>[-<<<<+>>>>]>>>>>>>>>>[->[-<<<+>>>]<[->+<]<+[->+<]>>]>[-<+>]<<[-[-<+>]>[-<+>]<<<[->>>+<<<]>]>[-<<<<<<<<<<<+>>>>>>>>>>>]<<<<<<<<<<<[->>+>+<<<]>>>[-<<<+>>>]<-----[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<+>>>][-]------<[-]<[->>+>+<<<]>>>[-<<<+>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<->>>][-]<[-]<[->>>>>>>>>>>+<<<<<<<<<<<]<<<[->+>>>+<<<<]>[->>>>>>>>>>>>+<<<<<<<<<<<<]>>>[-<<<<+>>>>]>>>>>>>>>[->>[-<<<<+>>>>]<[->+<]<[->+<]<+[->+<]>>]>[->+<]<<[-[-<+>]<<[->>>>+<<<<]>]<<<<<<<<<<]<<<<<++>>>>>>>>][-]<[-]>][-]<[-]<<<<<<[->>>>>>>>>+>+<<<<<<<<<<]>>>>>>>>>>[-<<<<<<<<<<+>>>>>>>>>>]<------[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<<[->>>>>>>+>+<<<<<<<<]>>>>>>>>[-<<<<<<<<+>>>>>>>>]<[[-]<<+<<<<<<<<<[->>>>>+>>>>>>+<<<<<<<<<<<]>>>>>>>>>>>[-<<<<<<<<<<<+>>>>>>>>>>>]<<[<<<<-[->+>>>>>+<<<<<<]>[->>>>>>>>>>>>>+<<<<<<<<<<<<<]>>>>>[-<<<<<<+>>>>>>]>>>>>>>>[->[-<<<+>>>]<[->+<]<+[->+<]>>]>[-<+>]<<[-[-<+>]>[-<+>]<<<[->>>+<<<]>]>[-<<<<<<<<<+>>>>>>>>>]<<<<<<<<<[->>+>+<<<]>>>[-<<<+>>>]<-----[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<->>>][-]------<[-]<[->>+>+<<<]>>>[-<<<+>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<+>>>][-]<[-]<[->>>>>>>>>+<<<<<<<<<]<<<<<[->+>>>>>+<<<<<<]>[->>>>>>>>>>>>+<<<<<<<<<<<<]>>>>>[-<<<<<<+>>>>>>]>>>>>>>[->>[-<<<<+>>>>]<[->+<]<[->+<]<+[->+<]>>]>[->+<]<<[-[-<+>]<<[->>>>+<<<<]>]<<<<<<<<]<<<<<<<++>>>>>>>>>][-]>][-]-------<[-]<<<<<<<<[->>>>>>>>>+>+<<<<<<<<<<]>>>>>>>>>>[-<<<<<<<<<<+>>>>>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<<,>>>>>>>>][-]--------<[-]<<<<<<<<[->>>>>>>>>+>+<<<<<<<<<<]>>>>>>>>>>[-<<<<<<<<<<+>>>>>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<<.>>>>>>>>][-]<[-]<<<<<<<<[->>>>>>>>>+>+<<<<<<<<<<]>>>>>>>>>>[-<<<<<<<<<<+>>>>>>>>>>]<[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<<<<+>>>>>>>>>>][-]<[-]<<<<<<<<<<<[->>>>>>+>>>>>+<<<<<<<<<<<]>>>[->>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<]>>>[->>>>>>>>>>>>+<<<<<<<<<<<<]>>>>>[-<<<<<<<<<<<+>>>>>>>>>>>]>>>>>>>[->>[-<<<<+>>>>]<[->+<]<[->+<]<+[->+<]>>]>[->+<]<<[-[-<+>]<<[->>>>+<<<<]>]<<<<<<<<<<<<<<<[->>>>>>>>>>+>+<<<<<<<<<<<]>>>>>>>>>>>[-<<<<<<<<<<<+>>>>>>>>>>>]<--[[-]<+>][-]<[->+>+<<]>>[-<<+>>]<-[[-]<<<<<<<<<<<<[-]>>[-]>>>[-<<<<<+>>>>>]>>>>>>>][-]<[-]<<<<<<<<<<<+>>-]+

This is 11567 operators. Large spans of them being moves of the tape pointer. This inefficiency is caused by the fact that everytime we want to do something, like move a value from the operating memory to the memory parking, we need to move across the whole the program array.

We can fix this by moving around the memory layout a little bit:

[prog content][prog parking].[operating memory].[mem parking][mem content]

Note the program array being both before the operating memory AND being flipped. (If we had kept the parking before the content, we would still need to jump over all of the content of the program array, except, this time, instead of when it would be to access the memory, it would be to access the program) That new layout saves us the jumps over the program array, but still keeps the program array capacity limit.

This is much better in concept, but is quite annoying in execution. You won't be able to use the same meta-instructions to get/set the program and memory arrays due the fact that the array is flipped. This means two new whole instructions that will need debugging. It also causes a huge amount of code duplication, since you probably just need to change the assumed pointer increments by decrements and decrements by increments for the most part. If you make it, please don't show it to "Don't Repeat Yourself" (aka DRY) people!

While this would probably save many operations, it still would not be enough to cram the interpreter in itself without a little finagling. In the advent you would want to make this bf interpreter run itself, you would need to find a way to move from 0 to the operating memory segment while using > less than Vprog_array_cap to move over the program array. Not being able to do so would mean that your interpreter is always bigger than its program array. This is of course something you want to avoid if you were to want to run it on itself.

You can fix this issue with the usage of flyer, but I will leave this for you to solve.

Get a Better Bf Interpreter / Use a Compiler

Finally, if you want better performance, don't use the bf interpreter built-in with the basm cli or, generally, just don't use an interpreter and get a compiler. While I believe the built-in interpreter is capable of running an helping somewhat with debugging via its dump feature (with flag -d), it is not the best for performance in no way.

If you want to get just the transpiled basm code, use basm compile rather than basm run. You can then use that transpiled bf output in another bf interpreter / compiler. (What I mean by "compiler", is something that compiles to machine code)

Conclusion

You've now learnt possibly one of the most useless languages of the world, should I congratulate you? Yes! (but you should really find other things to do with your time)

If you still have unanswered questions, want to contribute to this book or want to contribute to the basm programming language head over to the github repository.

Full Code

For your copy-pasting convinience, here is the full file for the bf interpreter as implemented in this book:

// sets a value to a specific value by zeroing it before writing
[@SET Aaddr Vval] [
ZERO Aaddr;
INCR Aaddr Vval;
]

// copies the content of a cell whilst keeping the source (conservatively)
[@COPC Asrc Adst sp] [
	ALIS Atmp sp;
	COPY Asrc Adst Atmp;
	ADDP Asrc Atmp;
]

// if not equal conditional
[@IFNE Aaddr Vval [scp] sp] [
ALIS Atmp1 sp;
ALIS Atmp2 sp+1;
ALIS sp sp+2;

COPY Aaddr Atmp1 Atmp2;
ADDP Aaddr Atmp2;

WHNE Atmp1 Vval [
    ZERO Atmp1;

    INLN [scp];

    INCR Atmp1 Vval;
];

ZERO Atmp1;
]

// if equal conditional
[@IFEQ Aaddr Vval [scp] sp] [
    ALIS Aflag sp;
    ALIS sp sp+1;
    ALIS Vnot_equal 1;

    IFNE Aaddr Vval [
        INCR Aflag Vnot_equal;
    ] sp;

    IFNE Aflag Vnot_equal [scp] sp;
    ZERO Aflag;
]

// moves the value of the cell at Aarray[value of Aindex] to Adst
[@GETD Aarray Aindex Adst] [
ADDP Aarray+3 Aindex; 

BBOX Aarray+1;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1;
ALIS Aindex 2;
ALIS Aelement 3;

WHNE Aindex 0 [
    DECR Aindex 1;
    INCR Areturn 1;

    ADDP Aswap Aelement;

    ADDP Aindex+1 Aindex;
    ADDP Areturn+1 Areturn;

    BBOX 1;
    ASUM 0;
];

ALIS Acell Aindex;

ADDP Acell Aelement;

ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1;

    ADDP Areturn-1 Areturn;
    ADDP Acell-1 Acell;

    BBOX 0;
    ASUM 1;

    ADDP Aswap Aelement;
];

BBOX Aelement;
ASUM Aarray+1;

ADDP Adst Aarray+3;
]

// adds the value of the cell at Asrc to the Aarray[value of Aindex] cell
[@ADDD Aarray Aindex Asrc] [
ADDP Aarray+3 Asrc;
ADDP Aarray+2 Aindex;

BBOX Aarray;
ASUM 0;

ALIS Aswap 0;
ALIS Areturn 1;
ALIS Aindex 2;
ALIS Acell 3;
ALIS Aelement 4;

WHNE Aindex 0 [
    DECR Aindex 1;
    INCR Areturn 1;

    ADDP Aswap Aelement;

    ADDP Acell+1 Acell;
    ADDP Aindex+1 Aindex;
    ADDP Areturn+1 Areturn;

    BBOX 1;
    ASUM 0;
];

ADDP Aelement Acell;

ALIS Aswap Aelement;
ALIS Aelement 0;

WHNE Areturn 0 [
    DECR Areturn 1;

    ADDP Areturn-1 Areturn;

    BBOX 0;
    ASUM 1;

    ADDP Aswap Aelement;
];

BBOX Aelement;
ASUM Aarray;
]

[@init_input Aarray] [
BBOX Aarray+4;
ASUM 0;

ALIS Aflag 1;
ALIS Vappended 1;
ALIS Vexit 2;

ALIS sp 2;
// we can still use sp, since we know the cells will be zeroed

WHNE Aflag Vexit [
    IN 0;

    // -- mapping operators --
    IFEQ 0 '+' [
        SET 0 1;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '-' [
        SET 0 2;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '>' [
        SET 0 3;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '<' [
        SET 0 4;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '[' [
        SET 0 5;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 ']' [
        SET 0 6;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 ',' [
        SET 0 7;
        INCR Aflag Vappended;
    ] sp;
    IFEQ 0 '.' [
        SET 0 8;
        INCR Aflag Vappended;
    ] sp;

    // check for end
    IFEQ 0 '!' [
        ZERO 0;
        SET Aflag Vexit;
    ] sp;

    // if we added something, we need to move up one
    IFEQ Aflag Vappended [
        ZERO Aflag;

        BBOX 1;
        ASUM 0;
    ] sp;

    // we don't clear the last character, even if it was invalid,
    // as the next IN will overwrite the cell
];

// NEVER forget cleanup
ZERO Aflag;

// -- going back --
// because the last char will be '!', we need to offset by at least 1
BBOX 0;
ASUM 1;

// moves back until we reached the zeroed cells of the parking
WHNE 0 0 [
    BBOX 0;
    ASUM 1;
];

ASUM Aarray+3;
]

[main] [
ALIS Voperating_memory 16;
ALIS Aprog_pointer 0;
ALIS Amem_pointer 1;
ALIS Aflag 2;
ALIS Vexit 1;
ALIS Vbracket 2;
ALIS Aoperator 3;
ALIS Acell 4;
ALIS Abracket_jump 5;
ALIS sp 6;

ALIS Vprog_array_cap 256;
ALIS Aprog Voperating_memory;
ALIS Amem Aprog+Vprog_array_cap+4;

// get program from user
init_input Aprog;

WHNE Aflag Vexit [
    ALIS Atmp sp;
    ALIS sp sp+1;

    // 1. load in operator
    COPC Aprog_pointer Atmp sp;
    GETD Aprog Atmp Aoperator;

    // 2. operator specific logic
    // + + + + +
    IFEQ Aoperator 1 [
        INCR Acell 1;
    ] sp;

    // - - - - -
    IFEQ Aoperator 2 [
        DECR Acell 1;
    ] sp;

    // > > > > >
    IFEQ Aoperator 3 [
        // store the old cell
        COPC Amem_pointer Atmp sp;
        ADDD Amem Atmp Acell;

        // get the new cell
        INCR Amem_pointer 1;
        COPC Amem_pointer Atmp sp;
        GETD Amem Atmp Acell;
    ] sp;

    // < < < < <
    IFEQ Aoperator 4 [
        // store the old cell
        COPC Amem_pointer Atmp sp;
        ADDD Amem Atmp Acell;

        // get the new cell
        // NOTE: this DECR may underflow the tape pointer
        DECR Amem_pointer 1;
        COPC Amem_pointer Atmp sp;
        GETD Amem Atmp Acell;
    ] sp;

    // [ [ [ [ [
IFEQ Aoperator 5 [
    IFEQ Acell 0 [
        ALIS Acounter sp;
        INCR Acounter 1; // 1.
        ALIS Asearch_op sp+1;
        ALIS sp sp+2;

        // 2.
        // we don't need to define our search counter here,
        // we can use Abracket_jump directly
        COPC Aprog_pointer Abracket_jump sp;
        WHNE Acounter 0 [
            // 3.
            INCR Abracket_jump 1;

            // 4.
            COPC Abracket_jump Atmp sp;
            GETD Aprog Atmp Asearch_op;

            // 5.
            // if op == '['
            IFEQ Asearch_op 5 [
                INCR Acounter 1;
            ] sp;
            // if op == ']'
            IFEQ Asearch_op 6 [
                DECR Acounter 1;
            ] sp;

            // 6.
            COPC Abracket_jump Atmp sp;
            ADDD Aprog Atmp Asearch_op;
        ]; // 7.

        INCR Aflag Vbracket;
    ] sp;
] sp;

// ] ] ] ] ]
IFEQ Aoperator 6 [
    IFNE Acell 0 [
        ALIS Acounter sp;
        INCR Acounter 1; // 1.
        ALIS Asearch_op sp+1;
        ALIS sp sp+2;
    
        // 2.
        COPC Aprog_pointer Abracket_jump sp;
        WHNE Acounter 0 [
            // 3.
            DECR Abracket_jump 1;
    
            // 4.
            COPC Abracket_jump Atmp sp;
            GETD Aprog Atmp Asearch_op;
    
            // 5.
            // if op == '['
            IFEQ Asearch_op 5 [
                DECR Acounter 1;
            ] sp;
            // if op == ']'
            IFEQ Asearch_op 6 [
                INCR Acounter 1;
            ] sp;
    
            // 6.
            COPC Abracket_jump Atmp sp;
            ADDD Aprog Atmp Asearch_op;
        ]; // 7.

        INCR Aflag Vbracket;
    ] sp;
] sp;

    // , , , , ,
    IFEQ Aoperator 7 [
        IN Acell;
    ] sp;

    // . . . . .
    IFEQ Aoperator 8 [
        OUT Acell;
    ] sp;

    // check for the end
    IFEQ Aoperator 0 [
        INCR Aflag Vexit;
    ] sp;

    // set the operator back
    COPC Aprog_pointer Atmp sp;
    ADDD Aprog Atmp Aoperator;

    IFEQ Aflag Vbracket [
        ZERO Aprog_pointer;
        ADDP Aprog_pointer Abracket_jump;
        ZERO Aflag;
    ] sp;

    // increase program pointer
    INCR Aprog_pointer 1;
];
]

Addendum - [setup] scope

After having fully written this book and as I started working on an improved version of the current bf interpreter (the naive version of which is the one discussed in this very book), I finally caved to implement something which had been bugging my mind for a long time: a way to have global aliases. Such a feature always was on the list of things to introduce into the language had I had enough dedication and time, and as things went forth and as I made, quite honestly, fairly simple programs I thought that such a thing would not be required for v1.0. The full breath of consequences of this retrospectively earnestly stupid decision of mine only came down onto me as finally undertook a true "project of size", which was as previously stated the improved version of the bf interpreter in basm. Such an undertaking required to refactor and modify many values which were repeated all over, like the ids associated with each bf operator and the size of dynamically readable arrays.

So, to solve this I added one more field: the [setup] field. Such a field had always been in my notes, only it is now that I see that this field is truely required to make a workable language.

What is it?

The [setup] field is very similar to [main] in function. It's scope contains zero or more scopes or instructions. Any code included within [setup] will be put right before [main]. There can only be one [setup] field. Right about now this field sounds pretty useless: "Just another place to put code in..." Well, not really. Where [setup] shines is in it's two unique characteristics:

It is normalized before any other fields, and
All aliases contained within the lowest scope are available from anywhere in the file

"It is normalized before any other fields"

This statement may not seem like too much, but it's ramifications are immense. At least if you can understand it. In the case of basm, normalizing means replacing aliases by their value. This means that meta-instructions are not normalized before [setup]. Then logically, meta-instructions cannot be used in [setup]. To be more precise, meta-instructions are not declared when [setup] is normalized, so if there are any meta-instructions they will raise an error stating that they are undefined. Despite the fact that they may be defined elsewhere in the file, they are not defined at the time of [setup] normalization. A meta-instruction cannot be inlined as it is not defined!

This clause is limits quite a lot your ability to put logic within this field, however it is required for the second much, more empowering characteristic, of [setup].

"All aliases contained within the lowest scope are available from anywhere in the file"

This does what it says on the tin. Any scope, including most importantly the scopes of meta-instructions, can access aliases defined in [setup] unless that alias has been shadowed in scopes lower than the current one. That behaviour allows the creation of aliases available through the whole program, practically creating global aliases. Although, the wording on this is a little clumsy. "anywhere in the file", in this context, means that they are available in any scope other than the ones contained within [setup]. Of course, [setup] could not reference an alias which it has not yet created.

Here's an example of the global aliases at work:

[setup] [
// the idiomatic way to name global aliases is to prefix with "G"
ALIS GVfrob 42;
ALIS GVdefault 10;

// any built-in instruction can be inserted here,
// but it is not recommended to do so for anything other than value setting
]

[@FROB Acell] [
    INCR Acell GVfrob;
]

[main] [
    ALIS Acell 0;
    INCR Acell GVdefault;
    FROB Acell;

    OUT Acell; // should return 52


    // overwriting the global (only affects this scope and subscopes)
    ALIS GVdefault 5;
    // this will overwrite the global in this scope, but not in the meta
    // since meta-instructions aliases are independent from the caller's
    ALIS GVfrob 0;

    ALIS Acell 1;
    INCR Acell GVdefault;
    FROB Acell;

    OUT Acell; // should return 47 (despite the fact that we set GVfrob to 0 in main)
]

Built-in Instructions

Here is the list of all built-in instructions.

Name	Arguments	Function
ZERO	addr	sets the value of`addr` to 0
INCR	addr, value	increments the value of the`addr` cell by `value`
DECR	addr, value	decrements the value of the`addr` cell by `value`
ADDP	addr1, addr2	adds`addr2` to `addr1`, the result is stored in `addr1` (in place)
SUBP	addr1, addr2	subtract`addr2` from `addr1`, the result is stored in `addr1` (in place)
COPY	addr1, addr2, addr3	copies the value of`addr1` into `addr2` and `addr3`
WHNE	addr, value, [scope]	while the value of`addr` cell is not equal to `value` runs the `[scope]`. `addr` is not consumed
IN	addr	takes input from the user and sets it in`addr`, behaviour will vary between bf implementations
OUT	addr	outputs the value of`addr`, `addr` is not consumed
LSTR	start_addr, "str"	loads the string character by character into cells from the`start_addr` advancing forward
PSTR	addr, "str"	prints the string character by character using the cell`addr` as a buffer
ALIS	ident, value or [scope]	creates an alias to a value or scope named`ident`. This instruction is purely abstraction
INLN	[scope]	inlines a scope
RAW	"str"	includes the string after transpilation, this can be used to include brainfuck operators
BBOX	addr	moves the tape pointer to`addr`
ASUM	addr	tells to compiler to assume that the tape pointer is at`addr`. If that assumption is wrong all cells accesses will be offset

Keywords

Here is a list of keywords specific to basm and their meaning.

allocated: a cell is said to be "allocated" when it is reserved/used by an operation
dynamic/relative: an access to memory which the address is not known to the compiler at compile time. It can vary over executions.
address number: a value of numeric type in the basm source code file, which represents the address of a cell.
pure number: a value of numeric type in the basm source code file, which represents a number.
static: an access to memory which the address is known to the compiler at compile time. It does not vary over executions.
inlining: expanding the code referred to in the caller's body, all meta-instructions inline in basm.

Cli Capabilities

Alongside the language itself, I'd like to include some information about the transpiler I wrote for it, because, of course, there's no point in writing a book about a compilerless language.

The only (probably forever) basm compiler is available on github under the retrospectively confusing name basm. This tool doesn't just include a compiler, it also comes with a bf interpreter to run transpiled basm directly.

Currently, the cli has two subcommands: compile and run. Both of them are similar in that they both take in a file path and have some flags in common.

`compile`

The compile subcommand is the simplest way to use the basm cli. Similar to tool like gcc, it will create a file in the current working directory named like the one passed in with the extension replaced with .bf. File name and output path can be specified with the -o flag. If compilation fails, error information will be printed to the terminal.

Flags

Shorthand	Longhand	Description
`-o`	`--out <OUT>`	The path to put the compiled file
`-p`	`--show`	Print the transpiled bf script
`-u`	`--unoptimized`	Skips the use of the inbuilt brainfuck optimizer
`-h`	`--help`	Print help

`run`

This command compiles then run the specified basm source file. You can use the bf interpreter directly if you use this command the -r flag, which specifies that the file is a bf source file, not a basm source file. Unlike compile, this does not create a file containing the compiled bf. If the compilation fails, error information will be printed to the terminal.

Flags

Shorthand	Longhand	Description
`-c`	`--cell-size <CELL_SIZE>`	Sets the size of cells in bits (only 8, 16 and 32) [default: 8]
`-i`	`--signed`	Sets the cells as signed containing signed numbers
`-a`	`--abort-overflow`	Aborts the execution of the program when a cell over/under-flows
`-t`	`--tape-limit <TAPE_LIMIT>`	Limits the lenght of the tape in cells, aborts execution if it is reached
`-n`	`--number-input`	Treats the input as integer numbers rather than characters
`-m`	`--number-output`	Treats the output as integer numbers rather than characters
`-s`	`--single-input`	Removes the bufferring of the unused parts of inputs to provide to them to later bf inputs
`-r`	`--raw`	Interprets the file as brainfuck, skips the compiling process
`-p`	`--show`	Print the transpiled brainfuck
`-u`	`--unoptimized`	Skips the use of the inbuilt brainfuck optimizer
`-d`	`--dump`	Dump the tape and tape pointer position to terminal once the program ends (includes by erroring out)
`-h`	`--help`	Print help

Bf Optimizations

basm applies some basic optimizations to the bf resulting from the transpilation process by default. It can merge operators to reduce redundant use (e.g: +++- would turn into ++) and it can reorder operations so that less tape pointer moves are used. Overall, the purpose of the built in optimizer is to reduce the number of operators in compiled scripts. By default, basm doesn't remove redundant operations and is generally quite stupid. For example, an instruction like INCR 12 0;, which does nothing, would still move the tape pointer to the 12th cell. This gets optimized out.

All optimization are made to have no impact on the resulting program, at a few exceptions:

Willingfully moving the tape pointer before 0 will be optimized out if there is a cancelling move afterwards (e.g: <<<<<<<<<<<<<<<>>>>>>>>>>>>>>>)
Any pointer moves that don't lead to other operations (at the end of the program) are removed

If you suspect that optimizations are messing with your program, you can disable them with the -u flag.

A Quick Introduction and Reference to Brain Aneurysm