April 4, 2015

0.5.2: Case folding and hash! support

This is a minor release, mainly motivated by the need to fix some annoying issues and regressions encountered in the last release:

  • the help function displayed an error when called with no arguments, preventing newcomers from seeing the general help information;
  • the console pre-compilation issue with timezones was back.

Some significant new features managed to sneak into this release too, along with some bugfixes.

Case folding

Red now provides uppercase and lowercase natives and, more generally, better support for Unicode-aware case folding. The Red runtime library now contains a general one-to-one mapping table for case folding that should cover most user needs.
    red>> uppercase "hello"
    == "HELLO"
    red>> uppercase/part "hello" 1
    == "Hello"
    red>> uppercase "français"
    == "FRANÇAIS"
    red>> uppercase "éléphant"
    == "ÉLÉPHANT"
    red>> lowercase "CameL"
    == "camel"
This also applies to words, so case insensitivity is now Unicode-aware in Red:
    red>> É: 123
    == 123
    red>> é
    == 123
    red>> "éléphant" = "ÉLÉPHANT"
    == true
    red>> "éléphant" == "ÉLÉPHANT"
    == false
For special cases, a future release will expose the collation table we use internally, so that anyone can provide a customized version that is a better fit for local rules or usages. For example, some lowercase characters (such as "ß") actually map to two or more uppercase code points ("SS" in this case). The built-in table is one-to-one, so such characters are left unchanged, and by default in Red you will get:
    red>> lowercase "ß"
    == "ß"
    red>> uppercase "ß"
    == "ß"
You can read more about our plans for full Unicode support on the wiki.
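
To illustrate why a one-to-one table leaves "ß" unchanged, here is a minimal sketch in Python rather than Red. It is not Red's implementation: the function name `upper_one_to_one` is hypothetical, and it borrows Python's built-in Unicode case data, keeping the original character whenever the full uppercase mapping expands to more than one code point.

```python
def upper_one_to_one(text):
    """Uppercase text using only one-to-one case mappings (illustrative)."""
    result = []
    for ch in text:
        mapped = ch.upper()  # full Unicode mapping, e.g. "ß" -> "SS"
        # A one-to-one table has no slot for a two-character result,
        # so such characters are kept as they are.
        result.append(mapped if len(mapped) == 1 else ch)
    return "".join(result)


# upper_one_to_one("français") returns "FRANÇAIS"
# upper_one_to_one("ß") returns "ß", whereas "ß".upper() returns "SS"
```

A customizable table, as planned for a later release, would be the place to override the entries that this kind of mapping cannot express.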

Hash datatype

The new hash! datatype works exactly the same way as in Rebol2. It provides a block-like interface with fast lookups for most values (block series can be stored in a hash! too, but they are not hashed, so looking them up is no faster). It is a very flexible container for any kind of hash table (not only associative arrays) that keeps the handy navigational abilities of blocks. The underlying hashing function is a custom implementation of the MurmurHash3 algorithm. Some usage examples:
    red>> list: make hash! [a 123 "hello" b c 789]
    == make hash! [a 123 "hello" b c 789]
    red>> list/c
    == 789
    red>> find list 'b
    == make hash! [b c 789]
    red>> dict: make hash! [a 123 b 456 c 789]
    == make hash! [a 123 b 456 c 789]
    red>> select dict 'c
    == 789
    red>> dict: make hash! [2 123 4 456 6 2 8 789]
    == make hash! [2 123 4 456 6 2 8 789]
    red>> select/skip dict 2 2
    == 123
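
The block-plus-index idea behind these examples can be modelled in a few lines. The following is a toy sketch in Python, not Red's implementation: the `HashedBlock` class and its method names are hypothetical, a Python dict stands in for the MurmurHash3-based table, values are compared case-sensitively (unlike Red), and the skip variant simply scans record heads instead of using the index.

```python
class HashedBlock:
    """Toy model of a block with a value -> first-position index."""

    def __init__(self, values):
        self.values = list(values)
        self.index = {}
        for pos, value in enumerate(self.values):
            # Remember only the first position of each value,
            # matching what a forward search would return.
            self.index.setdefault(value, pos)

    def find(self, value):
        """Return the series from the first match onward, or None."""
        pos = self.index.get(value)
        return None if pos is None else self.values[pos:]

    def select(self, value, skip=1):
        """Return the value following the match, or None.

        With skip > 1 the data is treated as fixed-size records and
        only the first field of each record is compared.
        """
        if skip == 1:
            pos = self.index.get(value)
        else:
            heads = range(0, len(self.values), skip)
            pos = next((i for i in heads if self.values[i] == value), None)
        if pos is None or pos + 1 >= len(self.values):
            return None
        return self.values[pos + 1]


records = HashedBlock([2, 123, 4, 456, 6, 2, 8, 789])
print(records.select(2, skip=2))    # 123
print(records.select(456, skip=2))  # None: 456 is not a record head
print(records.select(6, skip=2))    # 2
```

The skip case shows why the last console example returns 123: with records of size 2, only the values at positions 1, 3, 5 and 7 of the series are candidates for a match.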

A map! datatype (a strictly associative array) should also be provided in the next release, though we are still investigating some of its features and use-case scenarios before deciding to release it officially.

There is also good news about our Mac build server: a new one was kindly provided by Will (thanks a lot for that).

Our next release should mainly feature Redbin format support in the Red compiler, providing much faster compilation times and smaller generated binaries.

Enjoy! :-)

14 comments:

  1. Why not `set-block-properties block [hashed]` or something (pick your notation here), and have it still be something you can pass to something that is `func [blk [block!]] [...]`? Having a separate HASH! datatype creates a user-facing difficulty where a conscientious author would write `func [blk [block! hash!]] [...]` every time. HASH! should be a property of a block, not an independent datatype. :-(

    Replies
    1. We will eventually merge the two datatypes once we decide on the semantics of the construction syntax, as such a block property needs to be reflected, once serialized, in an accurate and elegant way.

  2. Great stuff Nenad! Congratulations on another quick release.

  3. Excellent, accented characters for words!

    Replies
    1. It also works with languages other than French. ;-) In the next release, users will be able to modify/extend the conversion table to cover specific needs.

  4. Agreed with Gregg; the hash! type especially is a surprise. I don't get how it works from the examples: in the final example I expected the result to be 8.

    Replies
    1. The /SKIP refinement treats the series as records of fixed size (= 2 in the example). You can get that info from the console:

      red>> help select

  5. @Anonymous: The /SKIP refinement is not an action like 'NEXT which would move along the series. It is information to the 'SELECT function to treat the data as being records with two "fields" in each record.

    In this case, 'SELECT will match only every second word against the supplied criteria. For example:

    red>> select/skip dict 456 2
    == none

    red>> select/skip dict 6 2
    == 2

  6. Naive question: how should one report issues if (s)he doesn't have a GitHub account? Is it okay to post a comment on this site? Maybe you could add a comment section to the Contributions page for this purpose? Thank you.

    By the way, there are some broken links at the bottom of the Contributions page:
    - ImageMagick binding: the link should be https://github.com/red/red/blob/master/system/library/lib-iMagick.reds
    - OpenCV binding: should be https://github.com/ldci/OpenCV-red
    - DAQmxBase binding: should be https://github.com/ldci/NI/tree/master/Red

    Replies
    1. As Peter answered, there are other places, like the mailing list, our Facebook page or the Stackoverflow chat group. We don't encourage anonymous posts, as they usually leave a door open to spamming. Thanks for the newer links; the page has now been updated with them.

  7. Whilst it is best to report issues on github, you can also tell us about issues on the Red Language Google Group / Mailing List (https://groups.google.com/forum/#!forum/red-lang).

    Posting to the Google Group does require registration as otherwise we will be overrun with spam.

    There is also a Red Development Chat on Stackoverflow.com, which not only requires registration but also a Stack Overflow "reputation" before you can post (their rules, not ours).

    Thanks for letting us know about the broken links.

  8. I came across another Red language. Did anybody know that the Red name was already taken?
    http://iment.com/maida/computer/redref/toc.htm#toca.

    wet_wet_wet

    Replies
    1. Do you really think I didn't do extensive research before picking the "Red" name?

      From the RED [1979] documents: "The RED language [Nestor and Van Deusen 79, Brosgol 79] was designed as a candidate for the Department of Defense common high-order Language, Ada, in Phase 2 of the language design competition.[...] No Implementation of RED, beyond the original translator delivered with the language design, has been planned [Davis and Levine 79]."

      Basically, it's just a specification from 1979 with no implementation.
