Examples
Following are examples of some of the interesting features of the language. Given
that the language description is hundreds of pages long, this is of necessity just
a taste of what the language does and how it works. As well, many features do not
work well for small examples: they need much larger examples to be convincing.
Hello World
We start with the basic "Hello World" program, which is required for all programming
language examples:
println ("Hello World");
People with Java experience will notice that println doesn't need prefixing with "System.out". This is a user-selected option that can be applied
to any name the user chooses. There are a few defaults in the "standard" profile
(which can be overridden). Here is the declaration that makes the above happen:
use System.out.(print, println, printf);
The user can declare their own short-hand forms. Where the short-hand is not appropriate,
fully qualified names can be used, as in:
System.err.println ("Goodbye Cruel World");
XML Processing
Here is an XML example that converts an XML document to a JSON form:
program xmltree;
parseXml (System.in);
choose xmlElementNode {
print (" [\"%s\" {" (*.qName));
for a = *.attributes, comma = "" then "," do
print ("%s\"%s\":\"%s\"" (comma, a.qName, a.allValues));
print ("}");
processChildren;
print ("]");
}
So:
<section id="X"><title>Title Text</title><p>Para 1.</p></section>
is converted to:
["section", {"id":"X"}, ["title", {}, "Title Text"], ["p", {}, "Para 1."]]
An XML processing program is prefixed by a program directive that indicates what kind of XML processing is to be done, in this case,
XML tree processing, as in the original release of XSLT.
A markup processing rule is heralded by a two-word prefix: choose, indicating it is a markup processing rule and, in the above case, xmlElementNode indicating what is to be recognized by the rule.
More XML Processing
Here is another XML processing program, using serial parsing this time:
program xmlserial;
sendToMatch (System.in);
choose xmlElement ("section") {
processChildren;
}
choose xmlElement ("title") when *.parent..is ("section") {
print (".section ");
processChildren;
println ();
}
choose xmlElement ("p") {
print (".para ");
processChildren;
println ();
}
About markup processing:
-
Serial processing uses different names (xmlElement for a serially processed XML element rather than xmlElementNode for a tree processed XML element) to recognize what is being processed. Among
other things, this separates the two types of processing, and allows both to be
used within one program.
-
Markup processing rules can be defined to have a "name" on which they can be chosen,
as well as a when (or unless) condition based on other markup component and contextual properties.
-
The "conditional selection" operator ".." (two dots instead of the single dot used for the usual
field or method selection) is not an error when the base value is null; instead it returns false (or null or 0 as appropriate). For example, in this case, the when condition is false if there is no "parent" or if the parent is not named "section".
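For readers coming from Java, the effect of ".." can be approximated with a null-safe lookup. The following is only a rough Java sketch of the idea, not Bobbee's implementation; the Element class and the parentIs helper are hypothetical names introduced for illustration.

```java
import java.util.Optional;

public class ConditionalSelect {
    // Hypothetical stand-in for a markup element with an optional parent.
    public static final class Element {
        final String name;
        final Element parent; // null for the root element

        public Element(String name, Element parent) {
            this.name = name;
            this.parent = parent;
        }

        public boolean is(String candidate) {
            return name.equals(candidate);
        }
    }

    // Approximates "*.parent..is (name)": false when there is no parent,
    // instead of a NullPointerException.
    public static boolean parentIs(Element element, String name) {
        return Optional.ofNullable(element.parent)
                .map(p -> p.is(name))
                .orElse(false);
    }
}
```

With a "title" element whose parent is "section", parentIs returns true; for the parentless "section" element itself it returns false rather than failing.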
Even More XML Processing
Serial and tree markup processing can be combined. For example, when processing a
potentially large document that is best done with serial processing, there may be
subcomponents, like tables, that require the added functionality of tree parsing.
As a simple example:
program xmlserial, xmltree, textpatterns;
parseXmlSerially ();
choose xmlElement ("doc") {
println (".startdoc"); processChildren; println (".enddoc");
}
choose xmlElement ("p") :- processChildren;
choose xmlElement ("table") :- xmlElementNode (*.captureTree ());
choose xmlElementNode ("table") {
println (".starttable"); processChildren; println (".endtable");
}
choose xmlElementNode ("tr") {
println (".startrow"); processChildren; println (".endrow");
}
choose xmlElementNode ("td") {
println (".tableitem "); processChildren;
}
choose xmlText :-
println (*.data) unless *.data matches ([" \t\n"]* & -|);
choose xmlTextNode :-
println (*.data) unless *.data matches ([" \t\n"]* & -|);
The "Node" rules do tree processing, and the non-"Node" rules, serial processing.
"captureTree" captures the current element (in this case "<table>") as a tree data
structure, and invoking "xmlElementNode" passes the tree to the tree processing rules.
XML Processing Without Rules: Serial Processing
A markup parser can be invoked without using rule-based processing. The parser is an iterator
over the serial processing item type it returns, in this case XmlItem:
import bj.xml.*;
for item = XmlSerialParser.parse () do
println ("%s: %s" (name of item, item));
With the following input:
<doc><p>The text.</p></doc>
this program outputs the following:
bj.xml.XmlStartDocumentItem: <?xml version="1.0"?>
bj.xml.XmlStartTagItem: <doc>
bj.xml.XmlStartTagItem: <p>
bj.xml.XmlTextItem: The text.
bj.xml.XmlEndTagItem: </p>
bj.xml.XmlEndTagItem: </doc>
bj.xml.XmlEndDocumentItem: <!-- end document -->
The above sample program illustrates two minor but useful features of Bobbee:
-
A string can be used as if it were a method, in which case the string is treated as
a "format string", with the arguments as the arguments of the formatting.
-
The prefix operator "name of" returns the name of the type of its argument.
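As a point of comparison, the same "parser as an iterator of items" idea exists in Java's standard StAX API, where a reader is advanced one event at a time. The sketch below is a Java analogue only and does not use Bobbee's bj.xml library; the class name SerialItemsSketch and the label strings are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SerialItemsSketch {
    // Walks the document serially and records one line per start tag,
    // end tag and text item, in document order.
    public static List<String> items(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask for adjacent character data to be delivered as one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("start tag: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("end tag: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        out.add("text: " + reader.getText());
                        break;
                    default:
                        break; // other event kinds are ignored in this sketch
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```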
XML Processing Without Rules: Tree Processing
As well as direct access to a markup parser's serial parser as an iterator, its tree
parser can be used to return a tree-structured data structure:
import bj.xml.*;
println (XmlTreeParser.parse ());
With input as:
<?xml version="1.0"?><doc><p>The text.</p></doc>
this program produces the same output as its input.
However, because "XmlTreeParser.parse" returns a data structure, the result can be
manipulated before it is output. For example, if one wants to sort a table on some
of its fields prior to outputting information from it, the sorting can be done directly
on that data structure.
Text Matching Rules
For a taste of text pattern matching rules, here is a program that converts every word
so that its first letter is upper case and all its other letters are lower case.
It has two rules: the first recognizes a word (a string of letters, including apostrophes
and dashes) and does the conversion, and the second copies out everything else:
program textpatterns;
sendToMatch (System.in);
match uLetter => "first" & (uLetter | ["'-"])* => "rest" {
print (upper => "first" + lower => "rest");
}
match \uLetter+ => "other" {
print (=> "other");
}
The "=>" operator is used in two ways (in the same way that "+" can be used in various
ways):
-
in the match rules, as an infix operator assigning what was matched by the first argument pattern
to the pattern variable named by its second argument, and
-
in the body statements, as a prefix operator retrieving the value named by its argument.
"upper" and "lower" are prefix operators that respectively upper-case and
lower-case their argument, which in this example is text captured by a text pattern
match. ("length", a similar prefix operator not used here, returns the length of its argument.) And "+" joins two string values.
One major feature in the above example is starting the program with a program declaration. In this example, it names the text pattern features that are used in
the program. (More about program later.)
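For comparison, the behaviour of the two rules can be approximated in plain Java with java.util.regex. This is a sketch under the assumption that a "word" is a letter followed by letters, apostrophes or dashes, as described above; the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Capitalizer {
    // Group 1: the first letter. Group 2: the remaining letters,
    // apostrophes and dashes of the same word.
    private static final Pattern WORD =
            Pattern.compile("(\\p{L})([\\p{L}'-]*)");

    public static String capitalizeWords(String text) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy everything between words unchanged (the second rule).
            out.append(text, last, m.start());
            // Upper-case the first letter, lower-case the rest (the first rule).
            out.append(m.group(1).toUpperCase(Locale.ROOT));
            out.append(m.group(2).toLowerCase(Locale.ROOT));
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

The explicit bookkeeping of "last" does by hand what the second match rule does declaratively in the rule-based version.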
Using Regular Expressions
Here is the previous text matching rules example using Regular Expressions instead
of text pattern matching expressions:
program textpatterns;
sendToMatch (System.in);
match `\=(\p{L})([\p{L}\p{N}]*)` {
print (upper * [1] + lower * [2]);
}
match `\=([^\p{L}])` {
print (* [1]);
}
When using regular expressions:
-
` (backquote or grave) is used to quote regular expressions rather than the string
quoting ",
-
"\=" suppresses the string parser from looking at following "\" characters, so that
they can be left in the form expected in regular expression syntax,
-
"*" in indexed or unindexed form identifies the captured parts of the most recent
regular expression pattern match.
"\=" helps make strings a bit easier to read. The alternative to using "\=" is to
double "\" characters as is used in Java text patterns:
match `(\\p{L})([\\p{L}\\p{N}]*)` {...}
"\=" is like Java's "\Q" except that:
-
there is no equivalent to "\E" to turn it off, and
-
any "\" following "\=" does not need doubling as is the case for "\Q". It is this
latter provision that is the primary advantage of using "\=".
A string can be split into parts, and the effect of "\=" and "\Q" is limited to one
part:
print ("\=\\" "\\");
prints three backslash characters: two because "\=" means that neither of the following
"\"'s are considered escapes, and one more because "\=" doesn't affect the second
part of the string, where a first "\" is needed to escape the second one. The two-part
string in this example is a single string, not two strings "joined" together.
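To make the comparison with Java concrete: in Java the doubling of "\" happens in the string literal, while "\Q" and "\E" are interpreted later by the regular expression engine. The following sketch shows both effects using only the standard library; the class, field and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class BackslashSketch {
    // Six backslashes in the source text: each doubled pair becomes one
    // character, so this string holds three backslash characters.
    public static final String THREE = "\\\\" + "\\";

    // The literal "\\p{L}" reaches the regex engine as \p{L},
    // which matches any single letter.
    public static boolean isLetter(String s) {
        return Pattern.matches("\\p{L}", s);
    }

    // Between \Q and \E the engine treats \p{L} as plain text, so this
    // matches only the five characters \p{L} themselves.
    public static boolean isLiteralClassText(String s) {
        return Pattern.matches("\\Q\\p{L}\\E", s);
    }
}
```

Note that even inside "\Q...\E" the Java source still needs the doubled backslashes, which is the inconvenience the "\=" form is described as avoiding.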
Defining Operators
In the previous example, a variety of operators are used. Operators can be defined
in a Bobbee program or library. An operator definition gives the operator's name, its
argument names and types, the type of its result and finally the code that produces
the result. For example:
operator length (arg1 : String) : integer :- arg1.length ();
operator lower (arg1 : String) : String :- arg1.toLowerCase ();
operator upper (arg1 : String) : String :- arg1.toUpperCase ();
operator (arg1 : String) + (arg2 : String) : String :- arg1.concat (arg2);
These declarations define the "length", "lower", "upper" and "+" operators as they apply to "String"
values. These operators are defined in a "functional" style, with ":-" meaning that
what follows is the returned value of the operator. Operators and methods can be defined
in this functional style, or have a "body" containing "return" statements.
Operators need a "precedence" defined for them, which is what determines how tightly
they bind. For example, one wants "*" (for multiplication) to bind more tightly than
"+" (for addition). For these operators, again defined in a Bobbee program or library:
operator length 311;
operator lower 311;
operator upper 311;
operator 240 + 241;
To improve program performance, Bobbee provides a mechanism for eliding the call to the "length" operator definition, by
telling the compiler what to call instead, by use of a "@Builtin" annotation:
@Builtin ("VCALL1:public:java.lang.String.length:()I")
operator length (arg1 : String) : javaInt :- arg1.length ();
@Builtin ("VCALL1:public:java.lang.String.toLowerCase:()Ljava/lang/String;")
operator lower (arg1 : String) : String :- arg1.toLowerCase ();
@Builtin ("VCALL1:public:java.lang.String.toUpperCase:()Ljava/lang/String;")
operator upper (arg1 : String) : String :- arg1.toUpperCase ();
@Builtin ("VCALL2:public:java.lang.String.concat:(Ljava/lang/String;)Ljava/lang/String;")
operator (arg1 : String) + (arg2 : String) : String :- arg1.concat (arg2);
(The "length" operator actually returns a "javaInt" value rather than whatever Bobbee's "integer" type may be implemented as. This reflects what the underlying Java library
method returns.)
The "length", "lower", "upper" and "+" operators are defined in a class (bj.lang.Operators).
To allow them to be used in a user's program without class qualification, Bobbee provides a means for declaring which values, methods and operators can be used unqualified.
For these operators this is:
use bj.lang.Operators.{length, lower, upper, +};
(Different overloadings of an operator or method can have a "use" from different defining
classes.)
All operators, even plain old arithmetic "+" and "*", are defined by these mechanisms.
The primary motivation for including operator definitions in Bobbee is to allow libraries to be implemented in Bobbee itself, rather than requiring them to be "built in". All of Bobbee's markup language and text pattern matching support, and much of its other functionality,
is implemented in libraries written in Bobbee using these mechanisms.
Procedural And Functional Methods
A classic method example is the "factorial" function. In Bobbee there are a number of ways of defining it, each illustrating a different style of implementation.
First, here's the traditional recursive functional form:
def factorial (n : integer) : integer :-
if n == 0 then 1 else n * factorial (n - 1);
Here's the same thing in an iterative procedural form:
def factorial (n : integer) : integer {
var f : integer = 1;
for i = 2 to n do
f *= i;
return f;
}
":-" indicates that the (returned) value of the method or other form is given as an
expression. return provides a method result in a procedural manner. Both methods and operators can
be defined in either a functional or procedural form. For some things one form is
best. For others the other.
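For readers who want to line the two forms up against Java, a direct transliteration might look like the following. This is only a sketch: the class and method names are invented here, and Java's long stands in for the 64-bit integer type.

```java
public class FactorialSketch {
    // Functional form: the result is a single expression.
    public static long factorialRecursive(long n) {
        return n == 0 ? 1 : n * factorialRecursive(n - 1);
    }

    // Procedural form: accumulate the product in a loop, then return it.
    public static long factorialIterative(long n) {
        long f = 1;
        for (long i = 2; i <= n; i++) {
            f *= i;
        }
        return f;
    }
}
```

Both forms assume a non-negative argument, and both overflow silently beyond 20!, which is the largest factorial that fits in 64 bits.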
Numbers and Strings
Bobbee uses two kinds of numbers: integer (64-bit integer) and real (64-bit floating-point number). The idea is that given that computer memories are way
bigger than they were not long ago, there's no real need to be careful with storage
sizes the way there used to be. This makes programming a bit easier.
var n : integer = 0;
# declare "n" to be an integer, and initialize it to zero.
One can access Java type numbers using special names: javaByte, javaShort, javaInt, javaLong, javaFloat and javaDouble. The next section has an example of using this feature.
Bobbee supports two kinds of strings (and characters):
-
String (or java.lang.String, spelled with upper-case "S") uses the Java 16-bit character encoding, with two characters
for encoding larger character values, and
-
string (spelled lower-case) uses a 21-bit character encoding, meaning that every character
occupies exactly one position in the string.
Most importantly, for string, the length of a string is the actual number of characters in the string, unlike
the case for Java strings, where the length may be more than the number of characters.
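The difference is easy to see on the Java side. In the sketch below (names chosen for illustration), a two-character text containing one character outside the 16-bit range reports a length of three, while counting code points gives the number of actual characters.

```java
public class StringLengthSketch {
    // The letter 'a' followed by U+1F600, which Java stores as a
    // surrogate pair of two 16-bit units.
    public static final String SAMPLE = "a\uD83D\uDE00";

    // Number of 16-bit units, which is what java.lang.String.length reports.
    public static int utf16Length(String s) {
        return s.length();
    }

    // Number of actual characters (Unicode code points).
    public static int characterCount(String s) {
        return s.codePointCount(0, s.length());
    }
}
```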
By default, all types, including numbers, characters and boolean values, are implemented
as objects. There's a lot to be said for making everything an object: it makes the
language more uniform in its use. And there's not as much cost as one might think:
for example, calls to the various "print" methods convert all their arguments to objects.
On the other hand, the language can handle non-object forms of these types.
Classes
Here is a simple example of a class that implements a subclass of OutputStream, that just puts what is written to it in a buffer, which then becomes the class's
"toString" value:
class BufferedOutput : OutputStream {
def this () {}
def this (this.buffer) {}
def write (b : javaInt) : void :-
write (new javaByte [] {b}, 0, 1);
def write (b : javaByte []) : void :- write (b, 0, length b);
def write (b : javaByte [], off, len : javaInt) : void :-
buffer += new String (b, off, len);
def close () : void {}
def flush () : void {}
def toString () : String :- buffer;
private:
var buffer : String = "";
}
This class illustrates a number of features of the language:
-
Initializers are defined using def this, rather than using the class's name.
-
The types javaInt and javaByte are used for compatibility of the extended class's redefined methods.
-
String is used as the return type of "toString", because it extends the String-returning method of type Object.
-
The default scope of a class's properties is public, and a group of properties (only "buffer" in this example) can be prefixed by a single
statement of scope.
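For comparison, a rough Java counterpart of the class above is sketched below. It is not a translation produced by any Bobbee tool. One deliberate difference is an assumption of this sketch: bytes are decoded as ISO-8859-1 so that each byte maps to exactly one character, rather than relying on the platform default encoding.

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BufferedOutput extends OutputStream {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void write(int b) {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b) {
        write(b, 0, b.length);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        // Assumption: one byte per character (ISO-8859-1).
        buffer.append(new String(b, off, len, StandardCharsets.ISO_8859_1));
    }

    @Override
    public String toString() {
        return buffer.toString();
    }
}
```

The constructor is named after the class, every member needs its own visibility keyword, and the primitive types int and byte appear directly: these are the points the feature list above contrasts.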
Parameterized Classes
The "generic type" or "parameterized type" feature of many current languages is very
useful in the definition and use of classes whose subcomponents can be of many
different types. For example, the following is a useful way of defining a variable-sized
list of string values, that can have values added to it or removed from it:
val stringList : ArrayList<string>;
As well, the very useful Iterable and Iterator types can be used with a specified type that is returned when using those types.
There's an iterator example later that illustrates this feature.
Parameterized classes (interfaces and enums) are defined using type parameters following
the name of the class, interface or enum, as in this class that defines a very general
implementation of value pairs:
class Pair<H,T> {
val head : H;
val tail : T;
def this (this.head, this.tail);
def toString () : String :- "Pair(%s,%s)" (head, tail);
}
Used like:
var myPair : Pair<string,Integer>;
myPair = new Pair<string,Integer> ("third", 3);
This simple, class-level parameterized type syntax is supported, but there's no corresponding
support for parameterizing methods and the "super-type" relation.
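For readers comparing against Java generics, the class-level form corresponds to the following sketch. It is a Java analogue written for illustration, with String.valueOf standing in for the "%s" formatting.

```java
public class Pair<H, T> {
    public final H head;
    public final T tail;

    public Pair(H head, T tail) {
        this.head = head;
        this.tail = tail;
    }

    @Override
    public String toString() {
        return "Pair(" + String.valueOf(head) + "," + String.valueOf(tail) + ")";
    }
}
```

A value created as new Pair<String, Integer>("third", 3) then has a head of type String and a tail of type Integer, checked at compile time.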
Synchronous Pipes and Streams
Bobbee supports synchronous threads -- which are called "coroutines" in other languages.
Synchronous threads coexist as other threads do, but are implemented so that
only one is running at any one time. This means that no synchronization is required
for one thread to use properties of another such thread. There are two kinds of synchronous
threads: object-passing pipes and text streams.
Synchronous pipes and text streams are useful when doing context-sensitive data processing,
in which one's position in the original input is significant for downstream processing.
Object-passing pipes are useful for things such as parsing and processing markup languages
and other data encodings in parallel. Text streams are useful where what is passed
is a text stream.
Pipes
An object-passing synchronous pipe passes objects of a particular type from one thread
to another, pausing the sending pipe until the receiving pipe has used the passed
value and requires another value, and pausing the receiving pipe when it requires
another value until the sending pipe has a value to send to it. A single SynchronousPipe is created to communicate between two processes:
val pipe = new SynchronousPipe<string> ();
The pipe is initialized to wait for something to be written ("put") to it by one process:
pipe.put ("Start");
Once a value is written, the writing process is suspended until another process retrieves
the value. That other process can wait for something to be written to the pipe, and
get the passed value when it is written:
val nextValue : string = pipe.get ();
This reading process continues until it does another "get", at which time the reading
process is suspended and the writing process is resumed so that it can produce another
value.
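The put/get hand-off described above resembles what java.util.concurrent.SynchronousQueue offers in Java, and the sketch below uses it to show the alternation. It is an analogy only: ordinary Java threads really do run concurrently, so this captures the blocking hand-off but not the "only one runs at a time" guarantee of synchronous threads. The class name PipeSketch and the three values are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

public class PipeSketch {
    public static List<String> run() {
        SynchronousQueue<String> pipe = new SynchronousQueue<>();
        List<String> received = new ArrayList<>();

        // Writing side: each put blocks until the reader takes the value.
        Thread writer = new Thread(() -> {
            try {
                for (String value : new String[] {"Start", "Middle", "End"}) {
                    pipe.put(value);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        // Reading side: each take blocks until the writer has a value.
        try {
            for (int i = 0; i < 3; i++) {
                received.add(pipe.take());
            }
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```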
Text Streams
A variation on the synchronous pipe is the SynchronousStream, which instead of passing objects between synchronous threads, passes a stream of
text. Text-communicating coroutines are created by calling one of two static methods
of the SynchronousStream type, as in:
local System.out = outputTo (theOtherRegime);
or as in:
local System.in = inputFrom (theOtherRegime);
where "theOtherRegime" is a java.lang.Runnable class which is started out by setting its "standard output" to return text to the
current thread (for "outputTo") or which is started out by setting its "standard in"
to send text to the current thread (for "inputFrom"). In both cases, synchronization
is kept by only allowing one of the two threads to be active at any one time.
Locally Scoped Names
A couple of features have been revived from 1960s languages.
The local prefix says that the value of the given name is to be restored on exit from the current
local scope, no matter how it is exited:
local depth += 1;
means save away the current value of "depth", increment its value for use in the local
scope, and restore its saved value at the end of the scope. Even if the local scope
exits with a throw or an error, the restoring will happen. This reduces the corruption of data when
scopes are exited in strange ways.
local can be used with qualified names as well. The following temporarily rebinds "System.out"
but ensures it is restored for later use:
local System.out = new PrintStream ("myoutputfile.txt");
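In a language without such a prefix, the same save-and-restore discipline is usually written by hand with try/finally. The Java sketch below shows what "local depth += 1" is described as doing; the class, the nested method and its fail flag are hypothetical and exist only to exercise both the normal and the exceptional exit.

```java
public class LocalScopeSketch {
    public static int depth = 0;

    // Returns the depth seen inside the scope; optionally exits by throwing.
    public static int nested(boolean fail) {
        int saved = depth;      // save away the current value
        depth += 1;             // new value for use inside the scope
        try {
            if (fail) {
                throw new IllegalStateException("scope exited abnormally");
            }
            return depth;
        } finally {
            depth = saved;      // restored however the scope is exited
        }
    }
}
```

After either kind of exit, depth is back to its value from before the call.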
Selecting
select is the Bobbee language version of "switch" in other languages. It has a number of features beyond
the selecting of numeric and string values. Two forms of select are of special interest. Pattern matching can be done using select with match parts rather than case parts:
def upperize (x : string) : string {
var result : string = "";
select x {
match uLetter => "first" & (uLetter | ["'-"])* => "rest":
result += upper => "first" + lower => "rest";
match \uLetter+ => "other":
result += => "other";
}
return result;
}
A value can be selected based on its type:
def show (x : Object) : string {
select x {
case (y : String) : return "String: \"%s\"" (y);
case (y : Long) : return "Number: %d" (y);
default: return "Other: %s" (x);
}
}
The type selecting statement does three things of use:
-
it identifies the type of the argument,
-
it binds the argument value to a new name with the identified type, and
-
it supports identifying and binding to multiple types.
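The three points above correspond, in classic Java, to an instanceof test followed by a cast to a new name, repeated per type. The sketch below is a Java rendering of the "show" method for comparison; string concatenation is used in place of the "%s" and "%d" formatting.

```java
public class TypeSelectSketch {
    public static String show(Object x) {
        if (x instanceof String) {
            String y = (String) x;   // identify the type, then bind a typed name
            return "String: \"" + y + "\"";
        } else if (x instanceof Long) {
            Long y = (Long) x;
            return "Number: " + y;
        }
        return "Other: " + x;        // the default branch
    }
}
```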
Iterators
One can define iterators in Bobbee as methods, rather than as classes as in Java. For example, the following method
creates an Iterator that returns all the space separated words in a passed-in string:
def splitWords (sentence : string) : Iterator<string> {
selectAll sentence {
match [" \t\n"]* & \[" \t\n"]+ => "word":
yield => "word";
}
}
This example illustrates the use of the selectAll statement, which extends the select statement by looping over a string, array, Iterator or other collection, by selecting
parts of the string or components of the collection on each iteration of the selectAll.
Iterators are used extensively in the implementation of the language's markup libraries,
where the serial parsers are all iterators of their item types.
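To show what the method form saves, here is roughly what the same iterator looks like when written as a Java class, with the position in the string tracked by hand. It is a sketch under the assumption that words are separated by spaces, tabs and newlines, as in the pattern above; the class name WordIterator is hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class WordIterator implements Iterator<String> {
    private final String sentence;
    private int pos = 0;

    public WordIterator(String sentence) {
        this.sentence = sentence;
        skipSeparators();
    }

    private static boolean isSeparator(char c) {
        return c == ' ' || c == '\t' || c == '\n';
    }

    private void skipSeparators() {
        while (pos < sentence.length() && isSeparator(sentence.charAt(pos))) {
            pos++;
        }
    }

    @Override
    public boolean hasNext() {
        return pos < sentence.length();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int start = pos;
        while (pos < sentence.length() && !isSeparator(sentence.charAt(pos))) {
            pos++;
        }
        String word = sentence.substring(start, pos);
        skipSeparators(); // leave pos at the next word, or at the end
        return word;
    }
}
```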
Patchable Print Streams
Patchable print streams allow later-found information, such as chapter numbers, to
be used earlier in an output document, even when using serial processing.
with pps = new PatchablePrintStream () do {
with local System.out = pps do processDocument ();
pps.emit ();
}
Within whatever "processDocument" does, the current print output is bound to the "patchable"
stream. This means "marks" can be written and defined. The "pps.emit ()" writes
out the result to the current output, which in the example is the "System.out" outside
of its binding to the patchable print stream. "Marks" are written to the patchable
stream using "writePrintMark". In the following example, a "<ref>" element's "id"
attribute value is used as a mark:
choose xmlElement ("ref") {
writePrintMark (* ["id"].value);
}
"Marks" are defined by assigning values to items in the PatchablePrintStream value.
In the following example, a copy of the title text is bound to the mark value given
by the chapter title's "id" attribute value, if it has one:
choose xmlElement ("title") when *.parent.is ("chapter") {
with title = new ByteArrayOutputStream () do {
print ("<H2>");
with local System.out = new PrintStream (title) do
processChildren;
with id = *.parent ["id"] do
if id != null then {
print ("<A NAME=\"%s\"></A>" (id.value));
pps [id] = "<A HREF=\"#%s\">%s</A>" (id.value, title);
}
println ("%s</H2>" (title));
}
}
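The underlying idea can be sketched independently of Bobbee: output is held as a sequence of parts, some of which are named marks, and the marks are resolved only when the result is emitted. The Java class below is a minimal illustration of that idea and not the PatchablePrintStream implementation. All of its names are hypothetical, and treating an undefined mark as empty text is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PatchableOutputSketch {
    // A part is either literal text or the name of a mark to resolve later.
    private static final class Part {
        final String text;
        final boolean isMark;

        Part(String text, boolean isMark) {
            this.text = text;
            this.isMark = isMark;
        }
    }

    private final List<Part> parts = new ArrayList<>();
    private final Map<String, String> marks = new HashMap<>();

    public void print(String text) {
        parts.add(new Part(text, false));
    }

    // Reserve a place in the output for a value that may not be known yet.
    public void writeMark(String id) {
        parts.add(new Part(id, true));
    }

    // Supply the value for a mark, possibly long after it was written.
    public void define(String id, String value) {
        marks.put(id, value);
    }

    // Produce the final text; undefined marks become empty text.
    public String emit() {
        StringBuilder out = new StringBuilder();
        for (Part p : parts) {
            out.append(p.isMark ? marks.getOrDefault(p.text, "") : p.text);
        }
        return out.toString();
    }
}
```

A reference written early in the document can thus show a chapter title that is only encountered later in the same serial pass.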
Program Profiles
Program features are defined by one or more "profile" files. There is one that defines
the basic features of the language that is always used (the "standard profile"), and
others imported at the start of a program using the program directive, as in:
program xmlserial;
Profiles define defaults for:
-
what classes and interfaces are used,
-
what short-hand names are available,
-
the meaning of rules,
-
what operators are used, where they are defined, and what their precedences are,
-
what methods are available, and
-
the types of differently quoted strings.
The user can define their own profile files, and override the "standard profile".
More than one profile can be declared for a program. For example, the following says
that the program can use text patterns, serial XML processing, tree XML processing
and tree JSON processing:
program textpatterns, xmlserial, xmltree, jsontree;
Defining XML Processing
As an example of a profile file, here is how XML serial processing is defined:
"Use XML serial profile.";
import bj.xml.*;
def choose xmlComment (XmlCommentItem) default {}
def choose xmlDataEntity (XmlDataEntityItem)
choose (*.entity.name)
default {System.err.println
("ERROR: No rule for entity \"%s\"!" (*.entity.toRef));}
def choose xmlDocument (XmlStartDocumentItem ... XmlEndDocumentItem)
default {processChildren;}
def choose xmlDtd (XmlStartDtdItem ... XmlEndDtdItem) default {processChildren;}
def choose xmlElement (XmlStartTagItem ... XmlEndTagItem)
choose (*.uri default null) : (*.localName)
default {System.err.println
("ERROR: No rule for element \"%s\"!" (*.element.name));
processChildren;}
def choose xmlError (XmlErrorItem)
default {System.err.println ("ERROR: %s" (*.message));}
def choose xmlProcessingInstruction (XmlProcessingInstructionItem)
choose (*.target) default {}
def choose xmlText (XmlTextItem) default {print (*.data);}
def choose xmlTextEntity (XmlStartTextEntityItem ... XmlEndTextEntityItem)
choose (*.entity.name) default {processChildren;}
def parseXmlFileSerially (systemId : String, options : String = "") : void :-
parseXmlSerially (XmlSerialParser.parseFile (systemId, options));
def parseXmlSerially (data : String, options : String = "") : void :-
parseXmlSerially (XmlSerialParser.parse (data, options));
def parseXmlSerially (in : InputStream = System.in,
options : String = "") : void :-
parseXmlSerially (XmlSerialParser.parse (in, options));
def parseXmlSerially (buffer : CharSequence, options : String = "") : void :-
parseXmlSerially (XmlSerialParser.parse (buffer, options));
def parseXmlSerially (source : Readable, options : String = "") : void :-
parseXmlSerially (XmlSerialParser.parse (source, options));
def parseXmlSerially (parser : XmlSerialParser) {
val iterator : Iterator<XmlItem> = parser.iterator ();
for item = iterator do
$processAllChildren (item, iterator);
}
This definition for a serial XML parser consists of:
-
a text string value to be displayed by the compiler: "Use XML serial profile.",
-
the appropriate import directives, identifying where the markup language parser and other used facilities
can be found (in this case the contents of the "bj.xml" library),
-
definitions of the various choose rules that are appropriate for processing the components of the markup language (xmlComment for comments found during serial XML processing, and so on), and
-
definitions of the various methods that initiate parsing (parseXmlSerially and parseXmlFileSerially).
The choose definitions consist of:
-
the name of the defined rule (xmlComment etc.),
-
the name of the class or classes (for elements XmlStartTagItem and XmlEndTagItem, for example) that this rule recognizes,
-
the property of that class that is the "name" recognized by the rule, using choose (the local name of an element, for example), and
-
the default behaviour for the rule, which can be to do nothing (for comments), to issue an error
message (for an unhandled element), or to do something sensible (such as copying text
to the output, for text).