ref: d15b6a7213e876047859dc6add85255829a4f6c4
parent: 82f94a1ad3d38b62ea08351190ec237646c52a17
parent: 575cd5a2034d9bfe97fc2275df76631383fa0898
author: Ori Bernstein <[email protected]>
date: Wed Aug 8 20:15:23 EDT 2012
Merge branch 'master' of git+ssh://mimir.eigenstate.org/git/ori/mc2
--- a/doc/lang.txt
+++ b/doc/lang.txt
@@ -1,7 +1,24 @@
The Myrddin Programming Language
- Aug 2012
+ Jul 2012
Ori Bernstein
+TABLE OF CONTENTS:
+
+ 1. OVERVIEW
+ 2. LEXICAL CONVENTIONS
+ 3. SYNTAX
+ 3.1. Declarations
+ 3.2. Literal Values
+ 3.3. Control Constructs and Blocks
+ 3.4. Data Types
+ 3.5. Packages and Uses
+ 3.6. Expressions
+ 4. TYPE SYSTEM
+ 5. TOOLCHAIN
+ 6. EXAMPLES
+ 7. STYLE GUIDE
+ 8. FUTURE DIRECTIONS
+
1. OVERVIEW:
Myrddin is designed to be a simple, low level programming
@@ -60,8 +77,77 @@
Literals are a direct representation of a data object within the
source of the program. There are several literals implemented
- within the Myrddin language:
+ within the Myrddin language. These are fully described in sectio
+3. SYNTAX OVERVIEW:
+
+ Myrddin syntax will likely have a familiar-but-strange taste
+ to many people. Many of the concepts and constructions will be
+ similar to those present in C, but different.
+
+ 3.1. Declarations:
+
+ A declaration consists of a declaration class (i.e., one
+ of 'const', 'var', or 'generic'), followed by a declaration
+ name, optionally followed by a type and assignment. One thing
+ you may note is that unlike most other languages, there is no
+ special function declaration syntax. Instead, a function is
+ declared like any other value: by assigning its name to a
+ constant or variable.
+
+ const: Declares a constant value, which may not be
+ modified at run time. Constants must have
+ initializers defined.
+ var: Declares a variable value. This value may be
+ assigned to, copied from, and
+ generic: Declares a specializable value. This value
+ has the same restrictions as a const, but
+ taking its address is not defined. The type
+ parameters for a generic must be explicitly
+ named in the declaration in order for their
+ substitution to be allowed.
+
+ In addition, there is one modifier allowed on declarations:
+ 'extern'. Extern declarations are used to declare symbols from
+ another module which cannot be provided via the 'use' mechanism.
+ Typical uses would be to expose a function written in assembly. They
+ can also be used as a workaround for external dependencies.
+
+ Examples:
+
+ Declares a constant with a value 123. The type is not defined,
+ and will be inferred.
+
+ const x = 123
+
+ Declares a variable with no value and no type defined. The
+ value can be assigned later (and must be assigned before use),
+ and the type will be inferred.
+
+ var y
+
+ Declares a generic with type '@a', and assigns it the value
+ 'blah'. Every place that 'z' is used, it will be specialized,
+ and the type parameter '@a' will be substituted.
+
+ generic z : @a = blah
+
+ Declares a function f with and without type inference. Both
+ forms are equivalent. 'f' takes two parameters, both of type
+ int, and returns their sum plus the local 'c' (42) as an int.
+
+ const f = {a, b
+ var c : int = 42
+ -> a + b + c
+ }
+
+ const f : (a : int, b : int -> int) = {a : int, b : int -> int
+ var c : int = 42
+ -> a + b + c
+ }
+
+ 3.2. Literal Values:
+
Integers literals are a sequence of digits, beginning with a
digit and possibly separated by underscores. They are of a
generic type, and can be used where any numeric type is
@@ -141,69 +227,57 @@
eg: (1,), (1,'b',"three")
-3. SYNTAX OVERVIEW:
+ 3.3. Control Constructs and Blocks:
+
+ if for
+ while match
+ goto
- Myrddin syntax will likely have a familiar-but-strange taste
- to many people. Many of the concepts and constructions will be
- similar to those present in C, but different.
+ The control statements in Myrddin are similar to those in many other
+ popular languages, and with the exception of 'match', there should
+ be no surprises to a user of any of the Algol derived languages.
+ Where a truth value is required, any type with the builtin trait
+ 'tctest' can be used in all of these.
- 3.1: Declarations:
+ Blocks are the "carriers of code" in Myrddin programs. They consist
+ of a series of expressions, typically ending with a ';;', although the
+ function-level block ends at the function's '}', and in if
+ statements, an 'elif' may terminate a block. They can contain any
+ number of declarations, expressions, control constructs, and empty
+ lines. Every control statement example below will (and, in fact,
+ must) have a block attached to the control statement.
+
+ If statements branch one way or the other depending on the truth
+ value of their argument. The truth statement is separated from the
+ block body
- A declaration consists of a declaration class (ie, one
- of 'const', 'var', or 'generic'), followed by a declaration
- name, optionally followed by a type and assignment. One thing
- you may note is that unlike most other languages, there is no
- special function declaration syntax. Instead, a function is
- declared like any other value: By assigning its name to a
- constant or variable.
+ if true
+ std.put("The program always gets here")
+ elif elephant != mouse
+ std.put("...eh.")
+ else
+ std.put("The program never gets here")
+ ;;
- const: Declares a constant value, which may not be
- modified at run time. Constants must have
- initializers defined.
- var: Declares a variable value. This value may be
- assigned to, copied from, and
- generic: Declares a specializable value. This value
- has the same restricitions as a const, but
- taking its address is not defined. The type
- parameters for a generic must be explicitly
- named in the declaration in order for their
- substitution to be allowed.
+ For statements begin with an initializer, followed by a test
+ condition, followed by an increment action. For statements run the
+ initializer once before the loop is run, the test on each
+ iteration through the loop before the body, and the increment on
+ each iteration after the body. If the loop is broken out of early
+ (for example, by a goto), the final increment will not be run. The
+ syntax is as follows:
- Examples:
+ for init; test; increment
+ blockbody()
+ ;;
- Declare a constant with a value 123. The type is not defined,
- and will be inferred.
+ While loops are equivalent to for loops with empty initializers
+ and increments. They run the test on every iteration of the loop,
+ and
- const x = 123
-
- Declares a variable with no value and no type defined. The
- value can be assigned later (and must be assigned before use),
- and the type will be inferred.
+
+ 3.4. Data Types:
- var y
-
- Declares a generic with type '@a', and assigns it the value
- 'blah'. Every place that 'z' is used, it will be specialized,
- and the type parameter '@a' will be substituted.
-
- generic z : @a = blah
-
- Declares a function f with and without type inference. Both
- forms are equivalent. 'f' takes two parameters, both of type
- int, and returns their sum as an int
-
- const f = {a, b
- var c : int = 42
- -> a + b + c
- }
-
- const f : (a : int, b : int -> int) = {a : int, b : int -> int
- var c : int = 42
- -> a + b + c
- }
-
- 3.2: Data Types:
-
The language defines a number of built in primitive types. These
are not keywords, and in fact live in a separate namespace from
the variable names. Yes, this does mean that you could, if you want,
@@ -213,7 +287,7 @@
must be explicitly cast if you want to convert, and the casts must
be of compatible types, as will be described later.
- 3.2.1. Primitive types:
+ 3.4.1. Primitive types:
void
bool char
@@ -248,7 +322,7 @@
var y : float32 declare y as a 32 bit float
- 3.2.2. Composite types:
+ 3.4.2. Composite types:
pointer
slice array
@@ -272,7 +346,7 @@
foo[123] type: array of 123 foo
foo[,] type: slice of foo
- 3.2.3. Aggregate types:
+ 3.4.3. Aggregate types:
tuple struct
union
@@ -290,6 +364,8 @@
(a keyword prefixed with a '`' (backtick)) indicating their
current contents, and a type to hold. They are declared by
placing the keyword 'union' before a list of tag-type pairs.
+ They may also omit the type, in which case, the tag is
+ sufficient to determine which option was selected.
[int, int, char] a tuple of 2 ints and a char
@@ -304,7 +380,7 @@
;;
- 3.2.4. Magic types:
+ 3.4.4. Magic types:
tyvar typaram
tyname
@@ -333,19 +409,176 @@
@foo creates a type parameter
named '@foo'.
- 3.2.5. Traits:
+ 3.4.5. Traits:
- 3.3: Control Constructs:
- 3.4: Packages and Uses:
- 3.5: Expressions
+ 3.5. Packages and Uses:
-4. TYPES:
+ pkg use
+ There are two keywords for the module system. 'use' is the simpler
+ of the two, and has two cases:
+
+ use syspkg
+ use "localfile"
+
+ The unquoted form searches all system include paths for 'syspkg'
+ and imports it into the namespace. By convention, the namespace
+ defined by 'syspkg' is 'syspkg', and is unique and unmerged. This
+ is not enforced, however. Typical usage of unquoted names is to
+ import a library that already exists.
+
+ The quoted form searches the local directory for "localfile". By
+ convention, the package it imports does not match the name
+ "localfile", but instead is used as part of the definition of the
+ importer's package. This is a confusing description.
+
+ A typical use of a quoted import is to allow splitting one package
+ into multiple files. In order to support this behavior, if a package
+ is defined in the current file, and a use statement imports a
+ package with the same namespace, the two namespaces are merged.
+
+ The 'pkg' keyword allows you to define a (partial) package by
+ listing the symbols and types for export. For example,
+
+ pkg mypkg =
+ type mytype
+
+ const Myconst : int = 42
+ const myfunc : (v : int -> bool)
+ ;;
+
+ declares a package "mypkg", which defines three exports, "mytype",
+ "Myconst", and "myfunc". The values may be defined in the 'pkg'
+ specification itself, but it is preferred to implement them in the
+ body of the code, so that the export list stays short and easy
+ to scan.
+
+ 3.6. Expressions:
+
+ Myrddin expressions are relatively similar to expressions in C. The
+ operators are listed below in order of precedence, and a short
+ summary of what they do is given. For the sake of clarity,
+ 'x' will stand in for any expression composed entirely of
+ subexpressions with higher precedence than the current
+ operator. 'e' will stand in for any expression. Unless marked
+ otherwise, expressions are left associative.
+
+ BUG: There are too many precedence levels.
+
+
+ Precedence 0: (*ok, not really operators)
+ (,,,) Tuple Construction
+ (e) Grouping
+ name Bare names
+ literal Values
+
+ Precedence 1:
+ x.name Member lookup
+ x++ Postincrement
+ x-- Postdecrement
+ x[e] Index
+ x[from,to] Slice
+
+ Precedence 2:
+ ++x Preincrement
+ --x Predecrement
+ *x Dereference
+ &x Address
+ !x Logical negation
+ ~x Bitwise negation
+ +x Positive (no operation)
+ -x Negate x
+
+ Precedence 3:
+ x << x Shift left
+ x >> x Shift right
+
+ Precedence 4:
+ x * x Multiply
+ x / x Divide
+ x % x Modulo
+
+ Precedence 5:
+ x + x Add
+ x - x Subtract
+
+ Precedence 6:
+ x & y Bitwise and
+
+ Precedence 7:
+ x | y Bitwise or
+ x ^ y Bitwise xor
+
+ Precedence 8:
+ `Name x Union construction
+
+ Precedence 9:
+ x castto(type) Cast expression
+
+ Precedence 10:
+ x == x Equality
+ x != x Inequality
+ x > x Greater than
+ x >= x Greater than or equal to
+ x < x Less than
+ x <= x Less than or equal to
+
+ Precedence 11:
+ x && x Logical and
+
+ Precedence 12:
+ x || x Logical or
+
+ Precedence 13:
+ x = x Assign Right assoc
+ x += x Fused add/assign Right assoc
+ x -= x Fused sub/assign Right assoc
+ x *= x Fused mul/assign Right assoc
+ x /= x Fused div/assign Right assoc
+ x %= x Fused mod/assign Right assoc
+ x |= x Fused or/assign Right assoc
+ x ^= x Fused xor/assign Right assoc
+ x &= x Fused and/assign Right assoc
+ x <<= x Fused shl/assign Right assoc
+ x >>= x Fused shr/assign Right assoc
+
+ Precedence 14:
+ -> x Return expression
+
+ All expressions on integers act on two's complement values which wrap
+ on overflow. Right shift expressions fill with the sign bit on
+ signed types, and fill with zeros on unsigned types.
+
+4. TYPE SYSTEM:
+
+ The Myrddin type system is similar to the Hindley-Milner system;
+ however, types are not implicitly generalized. Instead, type
+ schemes (type parameters, in Myrddin lingo) must be explicitly provided
+ in the declarations. For purposes of brevity, instead of specifying type
+ rules for every operator, we group operators which behave identically
+ from the type system perspective into a small set of classes:
+
+ Binop:
+ + -
+
+5. TOOLCHAIN:
+
+ The toolchain used is inspired by the Plan 9 toolchain in name. There
+ is currently one compiler for x64, called '6m'. This compiler outputs
+ standard ELF .o files, and supports these options:
+
+ 6m [-h] [-I path] [-o outfile] [-d[dbgopts]] inputs
+ -I path Add 'path' to use search path
+ -o Output to outfile
+
+
5. EXAMPLES:
-6. GRAMMAR:
+6. STYLE GUIDE:
-7. FUTURE DIRECTIONS:
+7. GRAMMAR:
+
+8. FUTURE DIRECTIONS:
BUGS:
--- a/parse/gram.y
+++ b/parse/gram.y
@@ -422,14 +422,14 @@
| castexpr
;
+
+cmpop : Teq | Tgt | Tlt | Tge | Tle | Tne ;
+
castexpr: unionexpr Tcast Toparen type Tcparen
{$$ = mkexpr($1->line, Ocast, $1, NULL);
$$->expr.type = $4;}
| unionexpr
;
-
-
-cmpop : Teq | Tgt | Tlt | Tge | Tle | Tne ;
unionexpr
: Ttick name borexpr