May 31, 2011

Short look on last three months

Three months have passed since Red has gone public. It seems short, but a lot happened since then and I wanted to sum up the work accomplished for the people interested in Red that don't have the time to follow the progress daily on every channel.

  • Red/System is implemented at 98%
  • ~2600 unit tests were created (thanks to Peter WA Wood)
  • Red/System language specification draft added, now close to 95% of completion
  • Code base almost doubled: 60KB to 110KB (~3200 LOC)
  • 164 commits on Github, 44 by contributors
  • 4600+ page views on Github's Red repository

Main new features
  • New datatypes: byte!, logic!, pointer!
  • Linux port (thanks to Andreas Bolka)
  • Syllable port (thanks to Andreas and Kaj)
  • MacOS X port in progress

  • 0MQ library support (Kaj de Vos)
  • Quick-Test framework (Peter WA Wood)
  • 3 contributors on Github, 22 followers
  • 65 issues opened on Github, 53 closed (thanks to Rudolf Meijer)
  • 8 blog posts, 10'000+ page views
  • #red_lang: 130 tweets, 39 followers
  • Mailing-list: 32 members, 45 topics, 150 posts
  • IRC: 5 people permanently logged

Last but not least, we got an article on OSNews two days ago that boosted the page views on (+3'000) and Github's repo (+1'000).

Thanks for all people supporting Red, one way or another. This is a very good start and a big incentive for me to work even harder on the next steps.

May 25, 2011

Red meeting on 28/05 at Lille

I will present Red language and the Cheyenne Web Server at Lille (France) on May, 28 to a group of REBOL and Red users. The presentations will be done in french, but if some none-french speakers want to come, I will try to do it both in french and english.

Detailed information to get there are available in this thread from RebelBB forum. See you there on Saturday!

May 14, 2011

Red/System compiler overview

Source Navigation

As requested by several users, I am giving a little more insights on the Red/System compiler inner workings and a map for navigating in the source code.

Current Red/System source tree:
%compiler.r ; Main compiler code, loads everything else
%emitter.r ; Target code emitter abstract layer
%linker.r ; Format files loader
%rsc.r ; Compiler's front-end for standalone usage

formats/ ; Contains all supported executable formats
%PE.r ; Windows PE/COFF file format emitter
%ELF.r ; UNIX ELF file format emitter

library/ ; Third-party libraries

runtime/ ; Contains all runtime definitions
%common.reds ; Cross-platform definitions
%win32.r ; Windows-specific bindings
%linux.r ; Linux-specific bindings

targets/ ; Contains target execution unit code emitters
%target-class.r ; Base utility class for emitters
%IA32.r ; Intel IA32 code emitter

tests/ ; Unit tests

Once the compiler code is loaded in memory, the objects hierarchy looks like:
 system/words/          ; global REBOL context

system-dialect/ ; main object
loader/ ; preprocessor object
process ; preprocessor entry point function
compiler/ ; compiler object
compile ; compiler entry point function

emitter/ ; code emitter object
compiler/ ; short-cut reference to compiler object
target/ ; reference the target code emitter object
compiler/ ; short-cut reference to compiler object

linker/ ; executable file emitter
PE/ ; Windows PE/COFF format emitter object
ELF/ ; UNIX ELF format emitter object

Note: the linker file formats are currently statically loaded, this will be probably changed to a dynamic loading model.

Compilation Phases

The compilation is a process that goes through several phases to transform a textual source code to an executable file. Here is an overview of the process:

1) Source loading

This is a preparatory phase that would convert the text source code to its memory representation (close to an AST). This is achieved in 3 steps:
  1. source is preprocessed in its text form to make the syntax REBOL-compatible
  2. source is converted to a tree of nested blocks using the REBOL's LOAD function
  3. source is postprocessed to interpret some compiler directives (like #include and #define)
2) Compilation

The compiler walks through the source tree using a recursive descent parser. It attempts to match keywords and values with its internal rules and emits:
  • the corresponding machine code in the code buffer
  • the global variables and complex data (c-string! and struct! literals) in the data buffer
The internal entry point function for the compilation is compiler/comp-dialect. All the compiler/comp-* functions are used to recursively analyze the source code and each one matches a specific semantic rule (or set of rules) from the Red/System language specification.

The production of native code is direct, there is no intermediary representation, machine code is generated as soon as a language statement or expression is matched. This is the simplest way to do it, but code cannot be efficiently optimised without a proper Intermediate Representation (IR). When Red/System will be rewritten in Red, a simple IR will be introduced to enable the full range of possible code optimisations.

As you know, a Red/System program entry point is at the beginning of the source code. During the compilation, the source code in the global context is compiled first and all functions are collected and compiled after all global code. So the generated code is always organised the same way:
  • global code (including proper program finalization)
  • function 1
  • function 2
  • ...
The results of the compilation process are:
  • a global symbol table
  • a machine code buffer
  • a global data buffer
  • a list of functions from external libraries (usually these are OS API mappings)
The compiler is able to process one or several source files this way before entering the linking phase.

3) Linking

The linking process goal is to create an executable file that could be loaded by the target platform. So the executable file needs to conform to the target ABI for that, like PE for Windows or ELF for Linux.

During the linking, the global symbol table is used to "patch" the code and data buffer (see linker/resolve-symbol-refs), inserting the final memory address for the pointed resources (variable, function, global data). The different parts to assemble are grouped into so-called "sections", that can be themselves be grouped into "segments" (as, e.g., in ELF).

Finally, some headers describing the file and its sections/segments are inserted to complete the file. The file is then written down on disk, marking the end of the whole process.

Static linking of external libraries (*.lib, *.a,...) will be added at some point in the future (when the need for such feature will appear).

I hope this short description gives you a better picture on how Red/System compiler works, even if it is probably obvious for the most experienced readers. Feel free to ask for more in the comments, or better, on the Google Groups mailing-list.

Fork me on GitHub