April 26, 2015

0.5.3: Faster compilation and extended vector! support

The main point of this minor release is to speed up compilation time by introducing a new way for the compiler to store Red values required for constructing the environment during the runtime library startup.

Introducing Redbin

Red already provides two text-oriented serialization formats, following the base Rebol principles. Here are the available serialization formats now in Red with some pros/cons:
  • MOLD format
    • provides a default readable text format, very close to the source code version
    • cannot properly encode many values
  • MOLD/ALL format
    • can encode series offsets
    • some values with literal forms that rely on words that can be natively encoded (none, true/false, objects, ...)
    • human-readable, but not always nice-looking
  • Redbin format
    • can encode any value accurately
    • supports words binding
    • can encode contexts efficiently
    • supports cycles in blocks
    • can encoded name/value pairs in any context
    • extremely fast loading time
    • very small storage space used when compressed
    • non human-readable
So far, the existing environment source code (mostly block values) was converted to pure Red/System construction code which was pretty simple and straightforward to implement, but was generating thousands of extra lines of code, slowing down the native compilation process. The right solution for that was to introduce a new binary serialization format for Red values called Redbin (very inspired by Carl's REBin proposal).

Redbin's specification focuses on optimizing the loading time of encoded values, by making their stored representation very close to their memory representation, bypassing the parsing and validation stages. Moreover, the Redbin payload is compressed using the Crush algorithm (that Qtxie ported to Red/System), which features one of the fastest decompressors around while having a general compression ratio very close to the deflate algorithm (but compression speed is about an order of magnitude slower). This fits perfectly the needs for our Redbin use-case.

So the gains compared to pre-0.5.3 version are:
  • compilation time of empty Red program is ~40% faster!
  • generated executable of empty Red program is about 100KB smaller (278KB only on Windows now).
  • faster startup time, as the Redbin decoding process is much faster than the previous Red-stack-oriented construction approach.
Those benefits also extend to user code, your static series will be saved in Redbin format as well.

Redbin format is currently emitted by the compiler and decoded by the Red runtime, but there is no encoder yet in the runtime that would allow user code to emit Redbin format. We will provide that support in a future version, it is not high priority for now. A "compact" version of the encoding format will also be added, so that Redbin can also be a good choice for remote data exchange.

Compilation from Rebol console

For those using Red toolchain from Rebol2 console, a new rc function is introduced to avoid reloading the toolchain on each run. Typical session looks like this:

    >> do %red.r
    >> rc "-c tests\demo.red"

    -=== Red Compiler 0.5.3 ===-

    Compiling /C/Dev/Red/tests/demo.red ...
    ...compilation time : 416 ms

    Compiling to native code...
    Script: "Red/System PE/COFF format emitter" (none)
    ...compilation time : 12022 ms
    ...linking time     : 646 ms
    ...output file size : 284160 bytes
    ...output file      : C:\Dev\Red\demo.exe
 
    >> call/output "demo.exe" s: make string! 10'000
    == 0
    >> print s

      RedRed              d
      d     d             e
      e     e             R
      R     R   edR    dR d
      d     d  d   R  R  Re
      edRedR   e   d  d   R
      R   e    RedR   e   d
      d    e   d      R   e
      e    R   e   d  d  dR
      R     R   edR    dR d

Collation tables

Since 0.5.2, Red provides collation tables for more accurate case folding support. Those tables can now be accessed by users using these paths:
    system/locale/collation/upper-to-lower
    system/locale/collation/lower-to-upper
Each of these tables is a vector of char! values which can be freely modified and extended by users in order to cope with some specific local rules for case folding. For example, in French language, the uppercase of letter é can be E or É. There is a divide among French people about which one should be used and in some cases, it can just be a typographical constraint. By default, Red will uppercase é as É, but this can be easily changed if required, here is how:

    uppercase "éléphant"
    == "ÉLÉPHANT"
    
    table: system/locale/collation/lower-to-upper
    foreach [lower upper] "àAéEèEêEôOûUùUîIçC" [table/:lower: upper]

    uppercase "éléphant"
    == "ELEPHANT"

Extended Vector! datatype

Vector! datatype now supports more actions and can store more datatypes with different bit-sizes. For integer! and char! values, you can store them as 8, 16 or 32 bits values. For float!, it is 32 or 64 bits. Several syntactic forms are accepted for creating a vector:
    make vector! <slots>
    make vector! [<values>]
    make vector! [<type> <bit-size> [<values>]]
    make vector! [<type> <bit-size> <slots> [<values>]]

    <slots>    : number of slots to preallocate (32-bit slots by default)
    <values>   : sequence of values of same datatype
    <type>     : name of accepted datatype: integer! | char! | float!
    <bit-size> : 8 | 16 | 32 for integer! and char!, 32 | 64 for float!
The type of the vector elements can be inferred from the provided values, so it can be omitted (unless you need to force a bit-size different from the values default one). If a value with a bit-size greater than the vector elements one, is inserted in the vector, it will be truncated to the bit-size of the vector.

For example, creating a vector that contains 1000 32-bit integer values:
    make vector! 1000
Or if you want to specify the bit-size of the vector element:
    make vector! [char! 16 1000]
    make vector! [float! 64 1000]
You can also initialize a vector from a block as below:
    make vector! [1.1 2.2 3.3 4.4]
Again you can also specify the bit-size of the vector element:
    make vector! [integer! 8 [1 2 3 4]]
For integer! and char! vectors, you can use all math and bitwise operators now.
    x: make vector! [1 2 3 4]
    y: make vector! [2 3 4 5]
    x + y
    == make vector! [3 5 7 9]
In case of different bit-sizes, the resulting vector will be using the highest bit-size. If a math operation is producing a result that does not fit the bit-size, the result is currently truncated to the bit-size (using a AND operation). Ability to read and change the bit-size of a vector will be added in future releases.

The following actions are added to vector! datatype: clear, copy, poke, remove, reverse, take, sort, find, select, add, subtract, multiply, divide, remainder, and, or, xor.

The vector! implementation is not yet final, some of its actions will get optimized for better performances and, in future, rely on SIMD for even faster operations. For multidimensional support, it will be implemented as a new matrix! datatype in the near future, inheriting from vector!, so the additional code required will be kept minimal.

Bugfixing

This was a short-term release, but we managed to fix a few bugs anyway.

What's next

Another minor release will follow with many runtime library additions and new toolchain improvements. See the planned features for 0.5.4 on our Trello board.

The 0.6.0 release will also most probably be split in two milestones, one for GUI and another for Android support.

In the meantime, enjoy this new release! :-)

4 comments:

  1. Lot's of nice stuff inside. I especially value Redbin, as it is another example, where Red goes into the new territory, in comparison to Rebol. Looking forward to the precompiled runtime coming in the next release :-)

    ReplyDelete
  2. Nice Nenad! I appreciate the new vector! and the announced matrix! types. Very useful for image processing. Great!
    François

    ReplyDelete
  3. Congratulations! And Thank You! I feel like I've been holding my breath for the past month or so, following your work and anticipating the release. It looks to be a huge move forward based upon features. Just got it running on Mac and I'm looking forward to the play. Thanks again to the whole team for all the hard work! REALLY excited about the upcoming GUI release; it will soon be time to start holding my breath again! :-0 --gnat

    ReplyDelete
    Replies
    1. Thanks, glad you like our work, the upcoming features will be worth the wait. ;-)

      Delete

Fork me on GitHub