April 26, 2015

0.5.3: Faster compilation and extended vector! support

The main point of this minor release is to speed up compilation by introducing a new way for the compiler to store the Red values required for constructing the environment during the runtime library startup.

Introducing Redbin

Red already provides two text-oriented serialization formats, following the base Rebol principles. With Redbin, a binary format joins them. Here are the serialization formats now available in Red, with some pros and cons:
  • MOLD format
    • provides a default readable text format, very close to the source code version
    • cannot properly encode many values
  • MOLD/ALL format
    • can encode series offsets
    • can natively encode some values whose literal forms rely on words (none, true/false, objects, ...)
    • human-readable, but not always nice-looking
  • Redbin format
    • can encode any value accurately
    • supports words binding
    • can encode contexts efficiently
    • supports cycles in blocks
    • can encode name/value pairs in any context
    • extremely fast loading time
    • very small storage space used when compressed
    • not human-readable
So far, the existing environment source code (mostly block values) was converted to pure Red/System construction code. That was pretty simple and straightforward to implement, but it generated thousands of extra lines of code, slowing down the native compilation process. The right solution was to introduce a new binary serialization format for Red values called Redbin (heavily inspired by Carl's REBin proposal).

Redbin's specification focuses on optimizing the loading time of encoded values, by making their stored representation very close to their memory representation, bypassing the parsing and validation stages. Moreover, the Redbin payload is compressed using the Crush algorithm (that Qtxie ported to Red/System), which features one of the fastest decompressors around while having a general compression ratio very close to the deflate algorithm (though its compression speed is about an order of magnitude slower). This perfectly fits the needs of our Redbin use case.

So the gains compared to pre-0.5.3 versions are:
  • compilation time of an empty Red program is ~40% faster!
  • generated executable of an empty Red program is about 100KB smaller (only 278KB on Windows now).
  • faster startup time, as the Redbin decoding process is much faster than the previous Red-stack-oriented construction approach.
Those benefits also extend to user code: your static series will be saved in Redbin format as well.

The Redbin format is currently emitted by the compiler and decoded by the Red runtime, but there is no encoder yet in the runtime that would allow user code to emit Redbin. We will provide that support in a future version; it is not a high priority for now. A "compact" version of the encoding format will also be added, so that Redbin can also be a good choice for remote data exchange.

Compilation from Rebol console

For those using the Red toolchain from the Rebol2 console, a new rc function is introduced to avoid reloading the toolchain on each run. A typical session looks like this:

    >> do %red.r
    >> rc "-c tests\demo.red"

    -=== Red Compiler 0.5.3 ===-

    Compiling /C/Dev/Red/tests/demo.red ...
    ...compilation time : 416 ms

    Compiling to native code...
    Script: "Red/System PE/COFF format emitter" (none)
    ...compilation time : 12022 ms
    ...linking time     : 646 ms
    ...output file size : 284160 bytes
    ...output file      : C:\Dev\Red\demo.exe
 
    >> call/output "demo.exe" s: make string! 10'000
    == 0
    >> print s

      RedRed              d
      d     d             e
      e     e             R
      R     R   edR    dR d
      d     d  d   R  R  Re
      edRedR   e   d  d   R
      R   e    RedR   e   d
      d    e   d      R   e
      e    R   e   d  d  dR
      R     R   edR    dR d

Collation tables

Since 0.5.2, Red provides collation tables for more accurate case folding support. Those tables can now be accessed by users using these paths:
    system/locale/collation/upper-to-lower
    system/locale/collation/lower-to-upper
Each of these tables is a vector of char! values which can be freely modified and extended by users in order to cope with specific local rules for case folding. For example, in French, the uppercase of the letter é can be E or É. French speakers are divided about which one should be used, and in some cases it is simply a typographical constraint. By default, Red will uppercase é as É, but this can easily be changed if required. Here is how:

    uppercase "éléphant"
    == "ÉLÉPHANT"
    
    table: system/locale/collation/lower-to-upper
    foreach [lower upper] "àAéEèEêEôOûUùUîIçC" [table/:lower: upper]

    uppercase "éléphant"
    == "ELEPHANT"

Extended Vector! datatype

The vector! datatype now supports more actions and can store more datatypes with different bit-sizes. integer! and char! values can be stored as 8-, 16- or 32-bit values, and float! values as 32- or 64-bit values. Several syntactic forms are accepted for creating a vector:
    make vector! <slots>
    make vector! [<values>]
    make vector! [<type> <bit-size> [<values>]]
    make vector! [<type> <bit-size> <slots> [<values>]]

    <slots>    : number of slots to preallocate (32-bit slots by default)
    <values>   : sequence of values of same datatype
    <type>     : name of accepted datatype: integer! | char! | float!
    <bit-size> : 8 | 16 | 32 for integer! and char!, 32 | 64 for float!
The type of the vector elements can be inferred from the provided values, so it can be omitted (unless you need to force a bit-size different from the default one for those values). If a value with a bit-size greater than that of the vector elements is inserted in the vector, it will be truncated to the bit-size of the vector.

For example, creating a vector that contains 1000 32-bit integer values:
    make vector! 1000
Or if you want to specify the bit-size of the vector element:
    make vector! [char! 16 1000]
    make vector! [float! 64 1000]
You can also initialize a vector from a block as below:
    make vector! [1.1 2.2 3.3 4.4]
Again you can also specify the bit-size of the vector element:
    make vector! [integer! 8 [1 2 3 4]]
For integer! and char! vectors, you can use all math and bitwise operators now.
    x: make vector! [1 2 3 4]
    y: make vector! [2 3 4 5]
    x + y
    == make vector! [3 5 7 9]
In case of different bit-sizes, the resulting vector will use the highest bit-size. If a math operation produces a result that does not fit the bit-size, the result is currently truncated to the bit-size (using an AND operation). The ability to read and change the bit-size of a vector will be added in future releases.
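As a small illustration of the truncation rule, here is a sketch that multiplies two 8-bit integer vectors so that the result (300) cannot fit in 8 bits. The word names are only for illustration. Since 300 AND 255 is 44, the single slot of the result is expected to hold 44, but the exact molded output is not shown here as it may differ between versions.

```red
Red []

;-- two 8-bit integer vectors, using the [<type> <bit-size> [<values>]] form
a: make vector! [integer! 8 [100]]
b: make vector! [integer! 8 [3]]

;-- 100 * 3 = 300 does not fit in 8 bits, so the slot of the result
;-- should be truncated to the vector's bit-size (300 AND 255 = 44)
probe a * b
```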

The following actions are added to vector! datatype: clear, copy, poke, remove, reverse, take, sort, find, select, add, subtract, multiply, divide, remainder, and, or, xor.

The vector! implementation is not yet final. Some of its actions will be optimized for better performance and, in the future, will rely on SIMD for even faster operations. Multidimensional support will be implemented as a new matrix! datatype in the near future, inheriting from vector!, so the additional code required will be kept minimal.

Bugfixing

This was a short-term release, but we managed to fix a few bugs anyway.

What's next

Another minor release will follow with many runtime library additions and new toolchain improvements. See the planned features for 0.5.4 on our Trello board.

The 0.6.0 release will also most probably be split in two milestones, one for GUI and another for Android support.

In the meantime, enjoy this new release! :-)

April 4, 2015

0.5.2: Case folding and hash! support

This is a minor release mainly motivated by the need to fix some annoying issues and regressions we have encountered in the last release:

  • the help function was displaying an error when used with no arguments, preventing newcomers from seeing the general help information
  • the console pre-compilation issue with timezones was back.

Some significant new features managed to sneak into this release too, along with some bugfixes.

Case folding

Red now provides uppercase and lowercase natives and, more generally, better support for Unicode-aware case folding. The Red runtime library now contains a general one-to-one mapping table for case folding that should cover most user needs.
    red>> uppercase "hello"
    == "HELLO"
    red>> uppercase/part "hello" 1
    == "Hello"
    red>> uppercase "français"
    == "FRANÇAIS"
    red>> uppercase "éléphant"
    == "ÉLÉPHANT"
    red>> lowercase "CameL"
    == "camel"
This applies also to words, so now case insensitivity is Unicode-aware in Red:
    red>> É: 123
    == 123
    red>> é
    == 123
    red>> "éléphant" = "ÉLÉPHANT"
    == true
    red>> "éléphant" == "ÉLÉPHANT"
    == false
For special cases, we will expose, in a future release, the collation table we use internally, so that anyone can provide a customized version that is a better fit for some local special rules or usages. For example, some lower case characters (such as "ß") actually map to two or more upper case code points ("SS" in this case). So in Red, by default, you will get:
    red>> lowercase "ß"
    == ß
    red>> uppercase "ß"
    == ß
You can read more about our plans for full Unicode support on the wiki.

Hash datatype

The new hash! datatype works exactly the same way as in Rebol2. It provides a block-like interface but with fast lookups for most values (block series can be stored in hash! too, but they will not be hashed, so no faster access). It is a very flexible container for any kind of hashed tables (not only associative arrays) while keeping the handy navigational abilities of blocks. The underlying hashing function is a custom implementation of the MurmurHash3 algorithm. Some usage examples:
    red>> list: make hash! [a 123 "hello" b c 789]
    == make hash! [a 123 "hello" b c 789]
    red>> list/c
    == 789
    red>> find list 'b
    == make hash! [b c 789]
    red>> dict: make hash! [a 123 b 456 c 789]
    == make hash! [a 123 b 456 c 789]
    red>> select dict 'c
    == 789
    red>> dict: make hash! [2 123 4 456 6 2 8 789]
    == make hash! [2 123 4 456 6 2 8 789]
    red>> select/skip dict 2 2
    == 123

A map! datatype (strictly associative array) should also be provided in the next release, though we are still investigating some of its features and use-case scenarios before deciding to release it officially.

Good news also about our Mac build server: a new one was kindly provided by Will (thanks a lot for that).

Our next release should mainly feature Redbin format support for the Red compiler, providing much faster compilation times and smaller generated binaries.

Enjoy! :-)

March 15, 2015

0.5.1: New console and errors support

This new release brings many new features, improvements and some bugfixes that will make Red more usable, especially for newcomers. The initial intent for this release was just to replace the existing console implementation, but it looked like the right time to also finally implement proper general error handling support.

New console engine

The old console code we had been using for the Red REPL was never meant to last that long but, as usual in software development, temporary solutions tend to become more permanent than planned. Still, the old console code really needed a replacement, mainly for:
  • removing the dependency on libreadline and libhistory, which were creating too many issues on the different Unix platforms and had become troublesome for many newcomers.
  • having a finer-grained control over keystrokes on text input, in order to implement convenient features like word completion.
  • having a bigger platform-independent part, so that we can add any kind of backends, like GUI ones, without duplicating too much code.
So, the new console code gets rid of third-party libraries and runs only on what the OS provides. The new features are:
  • built-in history, accessible from system/console/history
  • customizable prompt from system/console/prompt
  • word and object path completion using TAB key
  • ESC key support for interrupting a multi-line input
Other notable console-related improvements:
  • about function now also returns the build timestamp.
  • what function now has a more readable output.
  • Console output speed on Windows is now very fast, thanks to the patch provided by Oldes for buffered output.
The console code is not in its final form yet. It needs to be even more modular and wrapped in a port! abstraction in the future.

Errors support

Red now supports first class errors as the error! datatype. They can be user-created or produced by the system. The error definitions are stored in the system/catalog/errors object.
    red>> help system/catalog/errors
    `system/catalog/errors` is an object! of value:
        throw            object!   [code type break return throw continue]
        note             object!   [code type no-load]
        syntax           object!   [code type invalid missing no-header no-rs-h...
        script           object!   [code type no-value need-value not-defined n...
        math             object!   [code type zero-divide overflow positive]
        access           object!   [code type]
        user             object!   [code type message]
        internal         object!   [code type bad-path not-here no-memory stack...

User errors can be created using the make action followed by an integer error code or a block containing the category and error name:
    red>> make error! 402
    *** Math error: attempt to divide by zero
    *** Where: ???

    red>> make error! [math zero-divide]
    *** Math error: attempt to divide by zero
    *** Where: ???

These examples display an error message because the error value is the returned value. We still need to implement a full exception handling mechanism using throw/catch natives in order to enable raising user errors that can interrupt the code flow. The error throwing sub-system is implemented and used by the Red runtime and interpreter, just not exposed to the user yet.

Errors can be trapped using the try native. An error! value will be returned if an error was generated and can be tested using the error? function.
    red>> a: 0 if error? err: try [1 / a][print "divide by zero"]
    divide by zero
    red>> probe err
    make error! [
        code: none
        type: 'math
        id: 'zero-divide
        arg1: none
        arg2: none
        arg3: none
        near: none
        where: '/
        stack: 3121680
    ]
    *** Math error: attempt to divide by zero
    *** Where: /
Currently the console will display errors if they are the last value. That behavior will be improved once the exception system for Red is in place.
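To show how try and error? can be combined in user code, here is a minimal sketch of a function that returns none instead of an error when a division fails. The safe-divide name is hypothetical and used only for this illustration.

```red
Red []

;-- hypothetical helper: returns 100 / divisor, or none when the
;-- division raises an error
safe-divide: func [divisor [integer!] /local result][
    either error? result: try [100 / divisor][none][result]
]

probe safe-divide 4      ;-- 25
probe safe-divide 0      ;-- none (the zero-divide error was trapped)
```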

When displayed from compiled programs, errors provide call stack information to make it easier to locate the source code where the error originated. For example:
    Red []
    
    print mold 3 / 0
will produce the following error once compiled and run:
    *** Math error: attempt to divide by zero
    *** Where: /
    *** Stack: print mold /

SORT action

Sorting data is now supported in Red, in a polymorphic way as in Rebol. The sort action is very versatile and useful. Let's start from a basic example:
    scores: [2 3 1 9 4 8]
    sort scores
    == [1 2 3 4 8 9]
As you can see, sort modifies the argument series. You can keep the series unchanged by using copy when passing it as argument:
    str: "CgBbefacdA"
    sort copy str
    == "aABbCcdefg"
    sort/case copy str
    == "ABCabcdefg"
    str
    == "CgBbefacdA"
By default, sorting is not sensitive to character cases, but you can make it sensitive with the /case refinement.

You can use the /skip refinement to specify a record size, so that the series is sorted as a sequence of fixed-size records (ordered by their first element by default):
    name-ages: [
        "Larry" 45
        "Curly" 50
        "Mo" 42
    ]
    sort/skip name-ages 2
    == ["Curly" 50 "Larry" 45 "Mo" 42]
The /compare refinement can be used to specify how to perform the comparison. (It does not yet support block! as argument)
    names: [
        "Larry"
        "Curly"
        "Mo"
    ]
    sort/compare names func [a b] [a > b]
    == ["Mo" "Larry" "Curly"]
Combining it with the /skip refinement, you can do some complex sorting tasks. For example, an integer passed to /compare selects the record field to sort by:
    name-ages: [
        "Larry" 45
        "Curly" 50
        "Mo" 42
    ]
    sort/skip/compare copy name-ages 2 2    ;-- sort by 2nd column
    == ["Mo" 42 "Larry" 45 "Curly" 50]
The /all refinement will force the entire record to be passed to the compare function. This is useful if you need to compare one or more fields of a record while also doing a skip operation. In the following example, sorting is done by the second column, in descending order:
    sort/skip/compare/all name-ages 2 func [a b][a/2 > b/2]
    == ["Curly" 50 "Larry" 45 "Mo" 42]
Sort uses Quicksort as its default sorting algorithm. Quicksort is very fast, but it is an unstable sorting algorithm. If you need stable sorting, just add the /stable refinement and a merge sort will be used instead.
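Here is a small sketch of where /stable matters, using made-up records in which several entries share the same key. With a stable sort, records with equal keys should keep their original relative order, so "a" is expected to stay before "c" and "b" before "d" in the result.

```red
Red []

;-- [name score] records; "a" and "c" share score 1, "b" and "d" share score 2
records: ["b" 2 "a" 1 "c" 1 "d" 2]

;-- sort by the 2nd field of each 2-element record, asking for a stable sort
probe sort/skip/compare/stable copy records 2 2
```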

New datatypes

A couple of new datatypes were added in this release, mostly because of internal needs in Red runtime to support the new features.

The typeset! datatype has been fully implemented, and is on par with the Rebol3 version. A typeset! value is a set of datatypes stored in a compact array of bits (up to 96 bits). Datatype lookups are very fast in typesets and they are mostly used internally for runtime type-checking support. The following actions are supported on typeset! values: make, form, mold, and, or, xor, complement, clear, find, insert, append, length?. Comparison operators are also supported.

A preliminary implementation of the vector! datatype is also part of this release. A vector! value is a series of number values of the same datatype. The internal implementation uses a more compact memory storage format than a block! would, while, on the surface, behaving the same way as other series. Only 32-bit integer values can be stored in vectors for now. The following actions are supported by vector! values: make, form, mold, at, back, head, head?, index?, insert, append, length?, next, pick, skip, tail, tail?. The implementation will be completed in future releases.

Runtime type checking support

It has finally been implemented, as proper error handling support is now available. So from this release on, function argument types will be checked against the function specification and non-conforming cases will result in an error. Return value type-checking will be added later.

The type-checking might break some existing Red code that was silently letting invalid arguments pass, so check your code with this new release before upgrading.
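As a quick illustration of argument type-checking, the sketch below declares a function (the double name is hypothetical) that only accepts integer! values, then traps the error raised when it is called with a string.

```red
Red []

;-- hypothetical function whose spec only accepts integer! arguments
double: func [n [integer!]][n * 2]

print double 21                     ;-- 42
print error? try [double "21"]      ;-- true: the type mismatch raised an error
```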

The compiler does not do any type checking yet, that will be added at a later stage (though, don't expect too much from it, unless you annotate with types every function exhaustively).

Also notice that the runtime type-checking implementation is making the Red interpreter a little bit faster, thanks to a new optimized way to handle function specification blocks (an optimized spec block is cached after first call, resulting in much faster processing time afterwards).

Red/System improvements

Exception handling has been improved by introducing the catch statement, which allows catching exceptions using an integer filtering value. Here is a simple example in the global context:

    Red/System []

    catch 100 [
        print "hello"
        throw 10
        print "<hidden>"
    ]
    print " world"
will output
    hello world

The integer argument for catch intercepts only exceptions with a lower value, providing a simple, but efficient filtering system.
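To make the filtering rule more concrete, here is a sketch in Red/System, assuming that catch blocks can be nested. Following the rule above, the inner catch (filter 5) should let the exception 10 pass through, while the outer one (filter 100) should intercept it, so the program is expected to print "ab".

```red
Red/System []

catch 100 [
    catch 5 [
        print "a"
        throw 10                ;-- 10 is not lower than 5: not intercepted here
        print "<hidden>"
    ]
    print "<hidden too>"        ;-- skipped, the exception is still propagating
]
print "b"                       ;-- 10 is lower than 100: intercepted by the outer catch
```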

In addition to that, uncaught exceptions now properly report a runtime error instead of passing silently. This new enhanced low-level exception system supports the new higher-level Red error handling system.

A couple of new compiler directives have also been added in order to strengthen the interfacing with the Red layer:
    #get <path>
The #get directive returns a red-value! pointer to a value referred to by a Red object path. This is used internally in the runtime to conveniently access the Red system object content from Red/System code. This directive will be extended in the future to also access words from the Red global context.

    #in <path> <word>
The #in directive returns a red-word! pointer to a Red word bound to the object context referred to by path.

What's next?

In addition to many minor pending improvements, we will be working on a minor release that will introduce the Redbin format to accurately serialize Red values in binary form. The Redbin format will be used to make the compilation process much faster, as it currently slows down pretty quickly as the Red-level environment code size grows.

Enjoy this new release! :-)

January 15, 2015

Dream big, work hard and make it happen!

Today is a big day for the Red team and all the Red followers.

After four years of hard work on building our dream tool, I announce today the creation of a new company, Fullstack Technologies. The company has raised $500,000 from InnovationWorks and GeekFounders, Chinese VC early-stage investors. This money will help us fuel the launch of Red this year, and spread it everywhere, especially in the mobile market. The mission of Fullstack Technologies is to provide to individual developers and corporations, a simpler and much more productive software creation solution, reducing drastically both costs and development time.

The company has its headquarters in Zhongguancun, Beijing (China's "Silicon Valley"). I am CEO now (once more) and Xie Qingtian (qtxie) is joining the company as tech lead. We plan to recruit more people in the coming months.

I want to use this opportunity to express my deepest gratitude to all the people in the community who have helped me, contributed to Red and supported my work with donations since the beginning. I simply could not have made it so far without you.

Also, even if the funding needs are now covered, the Red community still has a major role to play in helping us build the best possible tool and spreading it. Fullstack Technologies intends to work closely with the community, by providing some contracted jobs, bounties and sponsoring for helping spread Red locally.

People who closely follow my work on Red know that I have a strong vision and big ambitions for it. Now I have the means to make all that happen! This journey is getting even more exciting. :-)



December 24, 2014

Objects implementation notes

I would like to share some notes about how some of the object features were implemented in the Red compiler. As these are probably the most complex parts in the Red toolchain right now, I thought that it would be worth documenting them a bit, as it can be useful to current and future code contributors.

Shadow objects and functions

Reminder: the Red toolchain is currently written entirely in Rebol.

During the work on object support in the Red compiler, I realized that I could leverage the proximity of Red with Rebol much deeper than before, in order to more easily map some Red constructs directly to Rebol ones. That's how I came up with the "shadow" objects concept (later extended to functions too).

It is pretty simple in fact: each time a Red object is processed from the source code, an equivalent, minimized object is created by the compiler in memory and connected to a tree of existing objects in order to match the definitional scoping used in the Red code.

Here is an example Red source code with two nested objects:
    Red []
    
    a: object [
        n: 123
        b: object [
            inc: func [value][value + n]
        ]
    ]
Once processed by the compiler, the following shadow objects are created in memory:
    objects: [
        a object [
            n: integer!
            b: object [
                inc: function!
            ]
        ]
    ]
But it does not just stop there: the body of the Red object is bound (using Rebol's bind native) to the internal Rebol object, in such a way that the definitional scoping order is preserved. So the Red code is directly linked to the Rebol shadow objects in memory. The same procedure (including the binding of Red code to Rebol objects) is applied to all compiled functions, whose context is represented as a nested Rebol object in the compiler's memory.

If you get where I am heading, yes, that means that resolving the context of any of the words contained in a Red object/function body becomes as simple as calling Rebol's bind? native on the word value. (Remember that Red source code is converted to a tree of blocks before being compiled). The bind? native will return one of the Rebol objects, which can then be used as a key in a hashtable to retrieve all the associated metadata.

I wish I had come up with that simple method when I was implementing namespaces support for Red/System. I think that I will rework that part in Red/System in the future, reusing the same approach in order to reduce compilation times (namespaces compilation overhead is significant in Red/System, roughly taking 20% of the compilation time).

Choosing Rebol as the bootstrapping language for Red shows its unique advantages here.


Dynamic invocation

Processing path values is really difficult in Red (as it would be in Rebol if it had a compiler). The main issue can be visualized easily in this simple example:
    foo: func [o][o/a 123]
Now if you put yourself in the shoes of the compiler, what code would you generate for o/a ?... Could be a block access, could be a function call with /a as refinement, could be an object path accessing a field, could be an object path calling the function a defined in the object. All these cases would require a different code output, and the compiler has no way to accurately guess which one it is in the general case. Moreover, foo can be called with different datatypes as argument, and the compiled code still needs to account for that...
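To see the ambiguity from the user side, here is a sketch calling the same foo with three different kinds of argument. The values in the comments are what the interpreter is expected to return; the point is that one and the same o/a expression has to cover all of them.

```red
Red []

foo: func [o][o/a 123]

print foo [a 1]                          ;-- block access: o/a is 1, then 123 is returned
print foo object [a: 5]                  ;-- field access: o/a is 5, then 123 is returned
print foo object [a: func [x][x + 1]]    ;-- function call: a consumes 123 and returns 124
```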

One method could be to generate different code paths for each case listed above. As you can guess, this would quickly become very expensive to manage for expressions with multiple paths, as the number of possible combinations would explode.

Another, very simple solution, would be to defer that code evaluation to the interpreter, but as you cannot know where the expression ends, the whole function (or at least the root level of the function) would need to be passed to the interpreter. Not a satisfying solution performance-wise.

The solution currently implemented in the Red compiler for such cases is a form of "dynamic invocation". If you go through all the cases, they can actually be sorted into only two categories:

a) access to a passive value
b) function invocation

Only at runtime can you know which category the o/a path belongs to (even worse, the category can change at each foo function call!). The issue is that the compiler generates code that evaluates Red expressions as stack manipulations (not the native stack, but a high-level Red stack), so the compiler needs to know which category it is, so it can:
  • create the right corresponding stack frames.
  • consume the right number of arguments in case of a function invocation.
Basically, the generated Red/System code for the foo function would be (omitting prolog/epilog parts for clarity):

For a) case:
    stack/mark-native ~eval-path 
    stack/push ~o
    word/push ~a 
    actions/eval-path* false 
    stack/unwind
    integer/push 123
    stack/reset
For b) case (with /a being a refinement):
    stack/mark-func ~o
    integer/push 123
    logic/push true               ;-- refinement /a set to TRUE
    f_o
    stack/unwind
As you can see, the moment where the integer value 123 is pushed on stack for processing is very different in both cases. In case a), it is outside of the o/a stack frame, in case b), it is part of it. So what should the compiler do then...looks unsolvable?

Actually some stack tricks can help solve it. This is how the compiler handles it now:
  • The stack can either overwrite new expressions (default) or accumulate them.
  • At each level of a path evaluation, a check for a function result is applied. When a function is detected, it is pushed on stack and a new stack frame is opened to gather the required arguments. Such function is named a "dynamic call" in this context.
  • Some stack primitives (like stack/reset) are modified to not only support the overwriting/accumulative modes, but also check if the arity for the pending dynamic call has been fulfilled, and when appropriate, run the deferred function call, clean up the stack and revert to the default overwriting mode.
This is the code currently produced by the Red compiler for o/a 123:
    stack/push ~o
    either stack/func? [stack/push-call path388 0 0 null] [
        stack/mark-native ~eval-path
        stack/push stack/arguments - 1      ;-- pushes ~o
        word/push ~a
        actions/eval-path* false
        stack/unwind-part
        either stack/func? [
            stack/push-call path388 1 0 null
        ][
            stack/adjust
        ]
    ]
    integer/push 123
    stack/reset
This generated code, with the help of the dual-mode stack, can support evaluation of o/a whatever value the path refers to (passive or function). stack/func? here checks if the stack top entry is a function or not. There is a little performance impact, but it is not significant, especially in respect to the high flexibility it brings.

So far so good. What happens now if the path is used as argument of a function call:
    foo: func [o][probe o/a 123]
The outer stack frame that probe will create then becomes problematic, because it will close just after o/a, preventing it from fetching any arguments when o/a is a function call...so back to the drawing board? Fortunately not, we can apply the same transformation to the wrapping call and defer it until its arguments have been fully evaluated. This is the resulting code:
    f_~path389: func [/local pos] [
        pos: stack/arguments 
        stack/mark-func ~probe 
        stack/push pos 
        f_probe 
        stack/unwind
    ] 

    stack/defer-call ~probe as-integer :f_~path389 1 null
    
    stack/push ~o
    either stack/func? [stack/push-call path388 0 0 null] [
        stack/mark-native ~eval-path
        stack/push stack/arguments - 1
        word/push ~a
        actions/eval-path* false
        stack/unwind-part
        either stack/func? [
            stack/push-call path388 1 0 null
        ][
            stack/adjust
        ]
    ]
    ------------| "probe o/a"
    integer/push 123
    stack/reset
As you can see, it gets more hairy, but still manageable. The outer stack frame is externalized (into another Red/System function), so it can be called later, once the nested expressions are evaluated.

That said, dynamic calls still need a bit more work in order to support routine! calls and refinements for wrapping calls. Those features will be added in the next releases. Also, the gain in flexibility makes the compiler more short-sighted when a particular structure is expected, like for control flow keywords requiring blocks. I don't yet see how this dynamic call approach could support such use-cases in a more user-friendly way.

But another feature can come to the rescue, the upcoming #alias directive proposed in the previous blog post. As long as the user is willing to use this new directive, it would simply avoid these dynamic constructions, by providing enough information to the compiler to statically determine what kind of value paths are referring to, resulting in much shorter and faster code, without the short-sightedness issue.

This is the kind of problem I had to solve during object implementation and why it took much longer than planned initially.

Hope this deeper look inside the compiler's guts is not too scary. ;-) Now, back to coding for next release!

And, by the way, Merry Christmas to all Red followers. :-)

December 22, 2014

0.5.0: Objects support

We are bumping the version number up higher as we are bringing a new foundational layer and important construct to Red: object! datatype and contexts support.

Supporting objects in the Red interpreter is relatively easy and straightforward. But adding those features in the compiler has proven to be more complex than expected, especially for access-path support, paths being especially tricky to process, given their highly dynamic nature. Though, I have pushed Red beyond the edges I was planning to stop at for objects support, and the result so far is really exciting!

Object model

Just a short reminder mainly intended for newcomers. Red implements the same object concept as Rebol, called prototype-based objects. Creating new objects is done by cloning existing objects or the base object! value. During the creation process, existing field values can be modified and new fields can be added. It is a very simple and efficient model to encapsulate your Red code. There is also a lot to say about words binding and contexts, but that topic is too long for this blog entry; we will address it in the future documentation.

Object creation

The syntax for creating a new object is:
    make object! <spec>
 
    <spec>: specification block
Shorter alternative syntaxes (just handy shortcuts):
    object  <spec>
    context <spec>
The specification block can contain any valid Red code. Words set at the root level of that block will be collected and will constitute the new object's fields.

Example:
    make object! [a: 123]
    
    object [a: 123 b: "hello"]
    
    c: context [
       list: []     
       push: func [value][append list value]
    ]
You can put any valid code into a specification block; it will be evaluated during the object construction, and only then.

Example:
    probe obj: object [
        a: 123
        print b: "hello"
        c: mold 3 + 4
    ]
will output:
    hello
    make object! [
        a: 123
        b: "hello"
        c: "7"
    ]
Objects can also be nested easily:
    obj: object [
        a: 123
        b: object [
            c: "hello"
            d: object [
                data: none
            ]
        ]
    ]

Another way to create an object is to use the copy action, which does not require a specification block and simply clones the object. Existing functions will be re-bound to the new object.

Syntax:
    copy <object>
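
As a small sketch of what the re-binding means in practice (the object and field names below are made up for illustration), a function copied along with its object should act on the clone's fields, not on the original's:

```red
a: object [
    n: 1
    bump: does [n: n + 1]    ; refers to the field n of its own object
]

b: copy a                    ; plain clone, no specification block
b/bump                       ; increments n through the clone's function

print a/n                    ; value held by the original object
print b/n                    ; value held by the clone
```

If the functions are re-bound as described above, only the clone's n is affected by the call to b/bump.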
Object access paths

In order to access object fields, the common path syntax is used (words separated by a slash character). Each word (or expression) in a path is evaluated in the context given by the left side of the path. Evaluation of a word referring to a function will result in invoking the function, with its optional refinements.
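
Here is a minimal sketch of a path that invokes a function with an optional refinement. The greeter object and its /loud refinement are hypothetical names chosen for this illustration only.

```red
greeter: object [
    name: "Red"
    hello: func [/loud][
        ; append works on a copy, so the name field itself is left untouched
        print either loud [append copy name "!"][name]
    ]
]

greeter/hello          ; plain call through the path, prints Red
greeter/hello/loud     ; same function with the refinement, prints Red!
```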

Example:
    book: object [
        title: author: none
        show: does [print [mold title "was written by" author]]
    ]

    book/title: "The Time Machine"
    book/author: "H.G.Wells"
    print book/title
    book/show
will output:
    The Time Machine
    "The Time Machine" was written by H.G.Wells
SELF reference

A special keyword named self has been reserved when self-referencing the object is required.

Example:
    book: object [
        title: author: none
        list-fields: does [words-of self]
    ]
    book/list-fields
will output:
    [title author list-fields]
Object inheritance

While cloning produces exact replicas of the prototype object, it is also possible to extend it in the process, using the make action.

Syntax:
    make <prototype> <spec>

    <prototype> : object that will be cloned and extended
    <spec>      : specification block
Example:
    a: object [value: 123]
    
    c: make a [
        increment: does [value: value + 1]
    ]
    
    print c/increment
    print c/increment
will output:
    124
    125
It is also possible to use another object as the <spec> argument. In that case, both objects are merged to form a new one. The second object takes priority when both objects share the same field names.
 
Example:
    a: object [
        value: 123
        show: does [print value]
    ]
    b: object [value: 99]
    
    c: make a b
    c/show
will output:
    99
Detecting changes in objects

Sometimes, it can be very useful to detect changes in an object. Red allows you to achieve that by defining a function in the object that will be called just after a word is set. This event is generated only when words are set using a path access (so inside the object, you can set words safely). This is just a first incursion into the realm of metaobject protocols; we will extend that support in the future.

In order to catch the changes, you just need to implement the following function in your object:
    on-change*: func [word [word!] old new][...]
    
    word : field name that was just affected by a change
    old  : value referred to by the word just before the change
    new  : new value referred to by the word
It is allowed to overwrite the word just changed if required. You can directly set the field name or use set:
    set word <value>
Example:
    book: object [
        title: author: year: none
  
        on-change*: func [word old new /local msg][
            if all [
                word = 'year
                msg: case [
                    new >  2014 ["space-time anomaly detected!"]
                    new < -3000 ["papyrus scrolls not allowed!"]
                ]
            ][
                print ["Error:" msg]
            ]
        ]
    ]

    book/title: "Moby-Dick"
    book/year: -4000
will output:
    Error: papyrus scrolls not allowed!
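
The example above only reports the change. As a sketch of the overwrite rule mentioned earlier (the counter object and the clamping policy are made up for illustration), the handler could also replace an unwanted value using set:

```red
counter: object [
    value: 0

    on-change*: func [word old new][
        ; reject negative values by overwriting the field that was just set
        if all [word = 'value  new < 0][
            set word 0
        ]
    ]
]

counter/value: -5        ; set through a path, so on-change* is called
probe counter/value      ; shows the value kept after the handler ran
```

If overwriting behaves as described, the field should end up holding 0 instead of -5.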
Extended actions and natives for objects

You can use set on an object to set all fields at the same time. get on an object will return a block of all the field values. get can also be used on a get-path!.

Example:
    obj: object [a: 123 b: "hello"]
    probe get obj
    set obj none
    ?? obj
    set obj [hello 0]
    ?? obj
    probe :obj/a
will output:
    [123 "hello"]
    obj: make object! [
        a: none
        b: none
    ]
    obj: make object! [
        a: 'hello
        b: 0
    ]
    hello

The find action gives you a simple way to check for a field name in an object. If found, it returns true, else none.

The select action does the same check as find, but returns the field value for the matched word.

    obj: object [a: 123]
    probe find obj 'a
    probe select obj 'a
    probe find obj 'hello
will output:
    true
    123
    none
The in native will allow you to bind a word to a target context:
    a: 0
    obj: object [a: 123]
    probe a
    probe get in obj 'a
will output:
    0
    123

The bind native is also available, but is neither completely finished nor tested.

Reflectors

Some reflective functions are provided to more easily access an object's internal structure.

  • words-of returns a block of field names.
  • values-of returns a block of field values.
  • body-of returns the object's content in a block form.

Example:
     a: object [a: 123 b: "hello"]
     probe words-of a
     probe values-of a
     probe body-of a
will output:
    [a b]
    [123 "hello"]
    [a: 123 b: "hello"]
SYSTEM object

The system object is a special object used to hold many values required by the runtime library. You can explore it using the new extended help function, that now accepts object paths.
red>> help system
`system` is an object! of value:
    version          string!   0.5.0
    build            string!   21-Dec-2014/19:27:05+8:00
    words            function! Return a block of global words available
    platform         function! Return a word identifying the operating system
    catalog          object!   [datatypes actions natives errors]
    state            object!   [interpreted? last-error]
    modules          block!    []
    codecs           object!   []
    schemes          object!   []
    ports            object!   []
    locale           object!   [language language* locale locale* months da...
    options          object!   [boot home path script args do-arg debug sec...
    script           object!   [title header parent path args]
    standard         object!   [header]
    view             object!   [screen event-port metrics]
    lexer            object!   [make-number make-float make-hexa make-char ...
Note: not all system fields are yet defined or used.

Future evolutions

As this release already took a lot of time, some of the planned features are postponed to future releases. Here are a few of them.

Sometimes, it is convenient to be able to add fields to an object in-place, without having to recreate it, losing lexical binding information in the process. To achieve that, a new extend native will be added, working as originally intended in Rebol3.

In order to help the Red compiler produce shorter and faster code, a new #alias compilation directive will be introduced. This directive will allow users to turn an object definition into a "virtual" type that can be used in type spec blocks. For example:
    #alias book!: object [
        title: author: year: none
        banner: does [form reduce [author "wrote" title "in" year]]
    ]
    
    display: func [b [book!]][
        print b/banner
    ]
This addition would not only permit finer-grained type checking for arguments, but also help the user better document their code.

Another possible change will be in the output mold produces for an object. Currently such output starts with "make object!"; this might be changed to just "object", in order to be shorter and easier to read, in addition to being more consistent with the way function! values are molded.

Fixed issues

In order to make this release happen as quickly as possible, we have not fixed all the open tickets that were planned to be fixed in this release, but we still managed to fix a few of them. The other pending tickets will be fixed in the upcoming minor releases.

I should also mention that 537 new tests were added to cover objects features. The coverage is already good, but we probably need more of them to cover edge cases.

That's all for this blog article! :-)

I will publish another blog entry about additional information regarding the implementation strategy used by the compiler for supporting contexts and object paths.

As we have almost completed other significant features during the last months, you should expect new minor releases happening very quickly in the next weeks. They will include:

  • New cross-platform console engine written entirely in Red (no dependencies).
  • New Android toolchain for creating APK files 100% ported to Red (no dependencies).
  • Full error and exceptions support at Red level.
  • Redbin initial implementation (not started yet).

Also, the work for 0.6.0 has started already (GUI support), even if it's at the prototype stage right now. I plan to release a first minimal version in the next few weeks (we will extend it step by step until 1.0).

Hope the waiting for the new release was worth it. ;-)

August 3, 2014

0.4.3: Floating point support

After a long time having only partial floating point support in Red/System, it comes now to Red with a broader support, thanks to the terrific work from Qtxie and Oldes, who managed to push Red further while I was busy moving from Europe to China (though, this might be the topic for another blog entry, as requested by many of you).

The new float! datatype implements IEEE-754 64-bit floating point format. It is available with most of the usual math functions support:

  • infix operators: +, -, *, /, **.
  • prefix base functions: add, subtract, multiply, divide, power.
  • trigonometric functions:  cosine, sine, tangent, arcsine, arccosine, arctangent.
  • other math functions: log-2, log-10, log-e, exp, square-root, round

Note that these trigonometric functions take arguments in degrees; a /radians refinement is provided for input values in radians. However, this can result in extra verbosity in long math expressions that use only radians, like:
((sine/radians b) * (cosine/radians c)) + ((cosine/radians b) * (sine/radians c))
Some radians-oriented shortcuts to these functions are also provided for convenience: cos, sin, tan, arcsin, arccos, arctan. So the above expression becomes:
((sin b) * (cos c)) + ((cos b) * (sin c))
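
The expression above is the expansion of sin(b + c), so it can be checked against a direct call. This is a small sketch with arbitrarily chosen angles; b and c are just illustrative values.

```red
b: pi / 6
c: pi / 3

; direct form: the infix + is evaluated first, so this is sin (b + c)
print sin b + c

; expanded form using the radians-oriented shortcuts
print ((sin b) * (cos c)) + ((cos b) * (sin c))
```

Since b + c equals pi / 2 here, both lines should print a value at or very close to 1.0, subject to the usual floating point rounding.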
Here are some code examples from Red console:
red>> 1.23456
== 1.23456
red>> 1e10
== 10000000000.0
red>> 1.23 * 2
== 2.46
red>> 1.23 * 2.0
== 2.46
red>> to integer! 1.23 * 2.0
== 2
red>> cos pi
== -1.0
red>> sin pi
== 0.0
red>> cos pi / 2
== 0.0
red>> cos pi / 3
== 0.5
red>> cosine/radians pi / 3
== 0.5
red>> cosine 60
== 0.5
red>> .1 + .2 + .3
== 0.6
red>> .1 + .2 + .3 = .6
== true
red>> .1 + .2 + .3 - .6
== 1.110223024625157e-16
red>> float? load "0.1"
== true
red>> to float! 1
== 1.0
red>> 1 = to integer! to float! 1
== true
As you can see, Red tries to give you meaningful outputs even when the result is not exact, but this approach has its limits too. Qtxie has partially ported the dtoa() functions to Red/System; however, the overhead of the additional code (20-40KB once compiled) is quite costly given how tiny our runtime library currently is (~350KB). So, for now, that implementation is provided as an optional library for Red/System, and will be modularized for Red once modules are supported.

IEEE-754 special values

You might know that standard floating point format supports a few extra special values that are meant to make some calculation possible in edge cases. Those are also supported natively by Red, with the following literal formats:
Not a Number (NaN)        :  1.#NaN
Positive Infinity (+INF)  : +1.#INF (or just 1.#INF)
Negative Infinity (-INF)  : -1.#INF
Positive Zero             : +0.0 (or just 0.0)
Negative Zero             : -0.0
These values are mostly intended for scientific calculations, so you usually do not have to worry about them. They can be produced as results of some math operations on floats, but by default, an error will be thrown instead.

In case you need to operate with maximum precision and get the special float values as results instead of errors, a couple of flags are available for that through the system special access. The syntax is:
system/float-options [spec]

[spec]: block of flags (word! | set-word!) with values (logic! | word!)
Valid flags are:

  • pretty?: enables pretty printing of float numbers when very close to an integer value (default: true)
  • full?: enables math operations on float special values (default: false)

 Examples:
red>> 4.000000000000001e32
== 4.0e32
red>> system/float-options [pretty?: no]
red>> 4.000000000000001e32
== 4.000000000000001e32
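
The full? flag can be sketched the same way. This is only an illustration based on the syntax described above for this release; the exact values printed are not shown here because they depend on how the special values are molded.

```red
; ask for special float values as results instead of errors
system/float-options [full?: yes]

probe 1.0 / 0.0          ; with full? enabled, expected to yield an infinity value
probe NaN? 0.0 / 0.0     ; NaN? tests whether the result of the division is a NaN
```

With the default setting (full?: false), the same divisions are expected to throw an error, as explained earlier.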
Armhf support

So far, Red supported only the armel ABI for ARM backends. With this release, we now fully support the armhf ABI too, through a specific compilation option that can be found in the new RPi compilation target (intended mainly for the default OS on RaspberryPi). The main difference between these ABIs is the way float values are passed as arguments to functions: armel requires passing them on the stack, while armhf requires passing them through FPU registers.

Other changes

  • url! datatype preliminary support: all actions are working, but no path access support yet.
  • New actions: reverse, random, swap, take, to(*), trim
  • New natives: same?, NaN?
  • New mezzanines: float?, routine?
  • Red/System FPU direct access through system/fpu/* options.
  • Help command now displays full help on routines too.
  • Many bug fixes and a few wishes granted.

(*) to is currently limited to integer/float/string conversions only.

What's next?

After the digression in the floating point lands, we go back to our main roadmap, so in the next releases, expect (in no particular order):

  • GUI support for Android / Windows platforms
  • Improved toolchain for Android APK generation
  • Object compilation support
  • New console engine
  • Error! datatype and exceptions handling
  • Typeset! and other new datatypes
  • Redbin format specification and implementation for the compiler
  • Improved compiler performance

Thanks for your patience and support during these last months, we are now back to our cruise development speed, so expect faster changes until the end of the year.