Perl 6: Giving with One Hand and Taking with the Other

I’ve been programming Perl full-time for almost two years now. The reason is my new job, where 99% of our back-end code base is Perl. Before that, I had not written a single line of Perl, so the language has been a new experience for me. I think Perl is actually better than its reputation. It is pleasant to write most of the time. I also love to see and understand how many cool features in other languages were inspired by Perl. Nevertheless, it’s undeniable that the language has pitfalls and design problems.

This year, during FOSDEM, Larry Wall announced that a first version of Perl 6 will be released before Christmas. For those not familiar with Perl, please note that it’s not exactly the successor of Perl 5, but rather a completely new and different language. Perl and Perl 6 are called sister languages. The idea of Perl 6 started fifteen years ago, when Larry Wall and others decided to design and implement a new language from scratch. The idea was to keep Perl’s essence while fixing all of its design issues and quirks.

I decided to give Perl 6 a try, mostly because I was curious to see if it actually fixed the problems of Perl 5. I haven’t started a serious project or anything; I’ve just been playing with the language and solving some Project Euler problems. Also, I ran my tests with a pre-release version of Perl 6 (Rakudo Star with MoarVM version 2015.01), so some things are not implemented and some others are not properly documented. If any Perl 6 dev ever reads this, please forgive me if something is wrong. I’ll be happy to be corrected.

Both Perl 5 and Perl 6 are pretty big languages. Their motto is “There is more than one way to do it” (TIMTOWTDI), so there are tons of features, many of them just different ways to do the same things. I will only discuss a few of them. In particular, I will discuss how Perl 6 addresses what I consider the biggest problems of Perl 5: context and references.

First Big Problem: Context Variable Behavior

Context is something unique to Perl, as far as I know. It’s a way to mimic spoken language, where the meaning of some words depends on the context they’re used in. Perl 5 has three contexts: scalar, list and void. So, for example, a function might do and return something completely different depending on the context of the call:

my $x = do_something(); # Scalar context: returns one thing.
print do_something();   # List context: might return something else.
do_something();         # Void context: a third possibility.

When implementing a function, you can use the special wantarray construct to know in which context the function is being called. For example:

sub do_something {
   if (wantarray) {
       # this will run if called in List context
   } elsif (defined wantarray) {
       # this will run if called in Scalar context
   } else {
       # this will run if called in Void context
   }
}

The reason I consider context one of the biggest problems of Perl 5 is that, unlike other bad “features”, it’s pretty much impossible to avoid. The concept is so fundamental to the language and its core libraries that you just have to learn to live with it. You have to be very careful about how you call your functions all the time, otherwise you might introduce bugs or even security issues.

Unfortunately, there doesn’t seem to be a convention on what functions should return in each context. For example, the built-in function keys returns the list of keys of a hash table / dictionary when called in list context, but returns the number of keys when called in scalar context. However, the splice function removes elements from an array and returns them in list context, but returns only the last removed element in scalar context. Similarly, the regexp match operator returns a Boolean indicating whether the string matches in scalar context, but returns the matching groups in list context (only if groups were defined). And so on.

All this makes Perl hard to learn, because you have to read and memorize the documentation of all the functions and their behaviors in different contexts. Things are rarely intuitive, and I’ve seen developers with many years of Perl experience bitten by this from time to time.

Second Big Problem: References

The other big problem with Perl, in my opinion, is references. More specifically, array and hash references. To explain why these are a problem, I first have to describe lists. Lists are a language construct in Perl that can be used to initialize data structures, pass arguments to functions or assign variables. A list is basically a group of expressions separated by commas. For example:

my @array = (1, 2, 3);       # Initialize an array.
my ($one, $two) = (1, 2);    # Assign variables.
join('/', 'home', 'manuel'); # Arguments to a function.

One important thing about lists is that they flatten all the inner lists. For example, the following lines are equivalent:

my @array = (1, 2, 3, 4);
my @array = (1, (2, (3, 4)));

The flattening also happens if you use arrays as expressions, for example:

my @end = (3, 4);
my @array = (1, 2, @end); # same as (1, 2, 3, 4);

List flattening is actually a nice feature: it allows a lot of cool things, such as destructuring in assignments and function arguments. However, it has a problem: it makes it difficult to create nested data structures such as arrays of arrays. To fix this, Perl introduced the concept of references. References are a sort of high-level pointer. Instead of storing the actual data, a reference stores a pseudo-address which points to the data. The important aspect of references, in relation to the problem described above, is that they’re scalars, and as such they can be stored as a single element of an array. So you can’t have arrays of arrays in Perl because of the flattening feature, but you can have arrays of array references, which are a good substitute. For example:

my @a = (1, 2);
my @b = (3, 4);
my @parts = (\@a, \@b); # Two elements. Doesn't flatten.

Perl also provides a nice syntax for defining array references, so the code above could be better written as:

my @parts = ([1, 2], [3, 4]); # an array of two arrayrefs.

Or you can write it as a single array reference and the syntax is very similar to other programming languages:

my $arrayref = [ [1, 2], [3, 4] ];

A similar thing happens with hashes/dictionaries, which are also constructed with lists. To have nested hashes you have to use hash references.

The problem with array references is that they behave completely differently from real arrays, in both list and scalar contexts. So every time you want to do something with them, you have to know if you’re dealing with a real array or a reference. Just like context, this problem is unavoidable in Perl: you need references every time you need nested arrays, but you can’t use only references, because most built-in functions that operate on collections accept real arrays instead.

To illustrate this problem, let’s assume that you’re using an API with a get_employees() function, and you want to print the name of each employee. Your code is different depending on whether the function returns an array or an arrayref:

# Array version:
my @employees = get_employees();
for my $employee (@employees) {
    print $employee->name;
}

# Arrayref version:
my $employees = get_employees();
for my $employee (@{$employees}) {
    print $employee->name;
}

So for every function returning a collection, you have to read the documentation and memorize if they return a reference or an array. Some libraries help you with this, for example, DBI (the database interface for Perl) adds a suffix to some functions indicating which kind of value they return:

@row = $sth->fetchrow_array();
$row = $sth->fetchrow_arrayref();

Other libraries make use of the wantarray special function to return an array when in list context or a reference when in scalar context. This sometimes helps, but it also creates even more confusion because it’s not a widespread idiom, so you have to check the documentation carefully anyway.

To make things worse, Perl built-in functions usually work in a way that’s completely counterintuitive for people used to all this list/array logic. For example, the push function can be used to add one or more elements to an array:

my @array = (1, 2, 3);
push(@array, 4, 5, 6); # will be (1, 2, 3, 4, 5, 6)

This looks perfectly fine. However, if you were paying attention to how lists work in Perl, you know that arguments passed to a function are lists, and lists flatten. So, applying the flattening logic to the push function, these two lines should be equivalent:

push(@array, 4, 5, 6);
push(1, 2, 3, 4, 5, 6);

So how does Perl know when the array part of the arguments ends and the elements part starts? If you use the list/array logic, to make this work, the first element should be a reference. That way, you will know that the array to push into is the first element of the list. You’d use it like this:

push(\@array, 4, 5, 6);

But that’s not how the built-in push function works in Perl. So, how does it work? A very experienced Perl developer once told me that, in general, I should not assume that built-in functions behave like regular functions; sometimes they use some magic. However, there is a feature that explains this: function prototypes. These are small specs added to a function, specifying what each argument is. They allow you to override the flattening logic. For example, if you want to have a function that behaves like push, you have to use:

sub my_push(\@@) {
    my ($array, @elements) = @_;
    push @$array, @elements;
}

One more thing to look for when reading API documentation. Fortunately, it’s not common that libraries abuse this feature.

Perl 6 to the Rescue?

Let’s finally talk about Perl 6. The good news is that it addresses these issues. Although Perl 6 still has a concept of context, it’s completely different from what Perl 5 uses. In practical terms this means that there is no wantarray function (yay!). Instead, context flows outwards. That means that functions just return objects that know how to behave in different scenarios, using methods for it. For example, if you want to represent an object as a string, you implement a Str method. If you want it as a number, you implement a Numeric method, and so on. That’s not much different from what other languages do; it’s a well-known pattern.

Perl 6 also drops references. Everything is an object now. In fact, you can assign arrays to scalar variables and use them in almost the same way as regular arrays:

$ ./perl6
> my @array = 1, 2, 3, 4;
1 2 3 4
> @array.elems # returns the number of elements in the array.
4
> @array[0] # element access
1
> @array[0] = 10 # assigning an element
10
> say @array;
10 2 3 4
>
> # scalars work in the same way:
> my $scalar = @array;
10 2 3 4
> $scalar.elems
4
> $scalar[0]
10
> $scalar[0] = 1
1
> say $scalar
1 2 3 4

So, if arrays behave like objects, and you can assign them to scalar variables and use them in the same way, why does Perl 6 still use sigils (the symbol before the variable name) to differentiate arrays from scalars? The answer is that, unfortunately, we still have context in Perl 6. And we still have flattening lists, and everything is mixed in a very confusing cocktail. Let’s start with the context part. A list, just as in Perl 5, is a series of items separated by commas. When assigning a list to something, the value will be different depending on the context:

my @array = 1, 2, 3; # list context assigns all the items
my $array = 1, 2, 3; # item context: assigns only the first item

Lists also flatten, so these two arrays are the same:

my @array1 = 1, 2, 3, 4;   # four elements
my @array2 = 1, 2, (3, 4); # also four elements

Because there are no references in Perl 6, if you want to have nested arrays, you have to explicitly ask for item context:

my @array3 = 1, 2, $(3, 4); # three elements
my @array4 = 1, 2, [3, 4];  # brackets also work. TIMTOWTDI

So far, this sounds reasonable. But there is one problem: in addition to lists, Perl 6 introduces another construct called parcels, which stands for Parenthesis Cells. Just like lists, they can have elements separated by commas, but they behave differently: while lists flatten, parcels don’t. The fact that both constructs are dangerously similar creates a lot of confusion.

my $a = 1, 2, 3, 4;       # $a is a single integer = 1.
my $b = (1, 2, 3, 4);     # $b is a parcel with 4 elements
my $c = ((1, 2), (3, 4)); # $c is a parcel with 2 elements

my @a = 1, 2, 3, 4;       # @a is an array with four elements
my @b = (1, 2, 3, 4);     # @b is also an array with 4 elements
my @c = ((1, 2), (3, 4)); # @c is also an array with 4 elements

How does Perl 6 know whether a series of things surrounded by parentheses and delimited by commas is a list or a parcel? It depends on the context, in this case the sigil of the variable being assigned. Note that in scalar context, the parentheses surrounding the expression are fundamental to determine whether the value assigned is a parcel or just the first element of the list. You might think the key to recognizing parcels is the parentheses; after all, they’re called Parenthesis Cells for a reason. But this is not the case. First, in list context parentheses are pretty much ignored. And even in scalar context, some things can be parcels even if they don’t have parentheses. For example, the value returned by a function:

sub my_function { 1, 2, 3 }
my $a = my_function(); # $a is a parcel with three elements.

Sometimes it’s harder to determine if a list or a parcel is going to be used, because you don’t have a sigil to determine it. For example, when you do something like this:

((1, 2), (3, 4))[0] # returns 1 2

In this case Perl 6 assumes it’s a parcel, hence the items are not flattened. Same thing seems to happen when trying to call methods:

((1, 2), (3, 4)).elems # returns 2

However, sometimes it seems to take the value as a list:

((1, 2), (3, 4)).map({ say $_ }) # flattens and prints four lines.

I don’t know how this is possible at all, since parcels don’t even have a map method. And I haven’t figured out how Perl 6 could interpret this as a list. I suspect this is either a bug or an exception hard-coded in Rakudo (there are a few of those).

Do you think that’s all? Of course not. Perl 6 has a third construct for comma-separated items: Captures. These are used for function arguments. The rules for flattening a Capture are variable: they depend on the Signature object associated with it. Each case is unique. I’m not going to describe how Captures work; they’re really complex. You can read the documentation if you’re curious.

Conclusion

Perl 6 is definitely an improvement over Perl 5 in many areas. It would require many posts to describe all the nice fixes in design. However, my feeling is that while some issues have been fixed, new quirks have been introduced. Of course, Perl 6 has not been released yet and some of these things might change. Also, I have not tried all the features to have a solid opinion on it. I think I’ll take a look again in a year when version 6.0.0 is out there.

Update:

Thanks to Reddit, I just found out that most of the Perl 6 issues mentioned here are going to be fixed. More specifically, the fix covers the flattening of lists and eliminates parcels. This is called The Great List Refactor and it’s supposed to land before the final release of the language.

A powerful unused feature of Python: function annotations.

Something I’ve always missed when using Python (and dynamically typed languages in general) is nice tooling support. C# and Java have powerful IDEs that can improve your productivity significantly. Some people say that IDEs are a language smell. I disagree, IDEs are a truly valuable tool and the “nice language or IDE” statement is a false dilemma.

The problem with dynamically typed languages is that it’s impossible for the IDE to infer things about some parts of your code. For example, if you start typing this:

def myfunction(a, b):
...

It’s impossible for the editor to give you any hint about a or b.

I’ve been playing with Dart and TypeScript recently. These are languages that compile to JavaScript, and both try to improve tooling support. They’re interesting because, despite being dynamically typed languages, both implement optional type annotations, which have no purpose other than aiding editors and IDEs. Let me show you a simple example of how this can be seriously useful. Consider the following JavaScript code:

function findTitle(title) {
	var titleElement = document.getElementById('title-' + title);
	return title;
}

var t = findTitle('mytitle');
t.innerHTML = 'New title';

The code has a small error that is not very easy to notice. Now let’s see the TypeScript Web Editor with the same code adding a single type annotation to findTitle:

[Screenshot: the TypeScript Web Editor highlighting the error]

TypeScript found an error. By knowing that title is a string, it knows that findTitle is returning a string too, and therefore t is a string, and strings don’t have an innerHTML property.

Early error detection is one advantage of good tooling support. Another interesting thing is accurate code completion. With good code completion you don’t have to browse huge API docs looking for what you need. You can discover the API while you type, and use automatic refactoring tools without worrying about breaking code.

[Screenshot: code completion in the TypeScript Web Editor]

Anders Hejlsberg’s introduction video to TypeScript contains more interesting details about how annotations are really useful.

While playing with TypeScript I couldn’t stop thinking how cool it would be to have something like that in Python. Then I realized that Python had syntax for annotations years before TypeScript or Dart were even planned. PEP 3107 introduced function annotations in Python. Here is a small example:

def greet(name: str, age: int) -> str:
    print('Hello {0}, you are {1} years old'.format(name, age))

Here I annotated the greet function with the types of each argument and of the return value. Python annotations are completely optional, and if you don’t do anything with them, they’re just ignored. However, with a little magic, it’s possible to tell Python to check types at run-time:

>>> @typechecked
... def greet(name: str, age: int) -> str:
...     print('Hello {0}, you are {1} years old'.format(name, age))
...
>>> greet(1, 28)
Traceback (most recent call last):
    ...
TypeError: Incorrect type for "name"
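As a rough illustration, a decorator like typechecked could be sketched as follows. This is not the actual library code, only a minimal version that assumes annotations are plain classes and ignores everything else. Note that this greet returns its message instead of printing it, so that the str return annotation holds.

```python
import functools
import inspect


def typechecked(func):
    """Check annotated arguments and the return value on every call."""
    signature = inspect.signature(func)
    annotations = func.__annotations__
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            # *args and **kwargs are skipped to keep the sketch short.
            if signature.parameters[name].kind in variadic:
                continue
            expected = annotations.get(name)
            # Only plain classes are checked; other annotations are ignored.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError('Incorrect type for "{0}"'.format(name))
        result = func(*args, **kwargs)
        expected = annotations.get('return')
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError('Incorrect return type')
        return result

    return wrapper


@typechecked
def greet(name: str, age: int) -> str:
    return 'Hello {0}, you are {1} years old'.format(name, age)
```

Calling greet(1, 28) with this sketch raises a TypeError mentioning "name", while greet('Ana', 28) returns the greeting normally.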

Run-time type checking is not very useful though. However, a static analyzer could use that information to report errors as soon as you type. Also, IDEs and code completion libraries such as Jedi could use that information to provide nice completion tips just like TypeScript does.
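To make the static analysis idea a bit more concrete, here is a toy sketch built on the standard ast module. It is only an assumption of how such a tool might start, not how any real analyzer works: it handles positional literal arguments and annotations that are simple names, and nothing else. The function name find_literal_type_errors and the SOURCE sample are made up for the example.

```python
import ast

# Hypothetical module source to analyze, reusing the greet example.
SOURCE = '''
def greet(name: str, age: int) -> str:
    return 'Hello {0}, you are {1} years old'.format(name, age)

greet(1, 28)
greet('Ana', 'old')
greet('Ana', 28)
'''


def find_literal_type_errors(source):
    """Report calls whose literal arguments contradict simple annotations."""
    tree = ast.parse(source)

    # Map each function name to its (parameter name, annotated type name) pairs.
    declared = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            declared[node.name] = [
                (arg.arg,
                 arg.annotation.id if isinstance(arg.annotation, ast.Name) else None)
                for arg in node.args.args
            ]

    # Compare literal positional arguments against the declared type names.
    errors = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        params = declared.get(node.func.id, [])
        for (name, type_name), arg in zip(params, node.args):
            if type_name is None or not isinstance(arg, ast.Constant):
                continue
            if type(arg.value).__name__ != type_name:
                errors.append((node.lineno, name, type_name))
    return sorted(errors)


errors = find_literal_type_errors(SOURCE)
```

The code is never executed, only parsed, so the two bad calls are reported without running greet at all. A real tool would also need type inference for variables, keyword arguments and imported names.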

Some people might say that this makes the language too verbose. People using dynamic languages often want concise code. But in practice, if you take a look at any medium to large Python project or library, chances are that you’ll find something like this:

def attach_volume(self, volume_id, instance_id, device):
    """
    Attach an EBS volume to an EC2 instance.

    :type volume_id: str
    :param volume_id: The ID of the EBS volume to be attached.

    :type instance_id: str
    :param instance_id: The ID of the EC2 instance to which it will
                        be attached.

    :type device: str
    :param device: The device on the instance through which the
                   volume will be exposted (e.g. /dev/sdh)

    :rtype: bool
    :return: True if successful
    """
    params = {'InstanceId': instance_id,
              'VolumeId': volume_id,
              'Device': device}
    return self.get_status('AttachVolume', params, verb='POST')

I took this code from the boto library, which annotates functions using docstrings and Sphinx. It’s a very common way of annotating public APIs. However, this method has some drawbacks: first, it’s really verbose and you repeat yourself a lot writing code like this; second, it’s harder to parse because there are different docstring formats (Sphinx, epydoc, pydoctor), so editors don’t offer code completion or early error checking; third, it’s very easy to make mistakes that leave the docstrings and the code out of sync. In this particular example, if you ever run this function, you’ll notice that it returns a string, not a bool as the annotation suggests.

Google Closure uses a similar docstring approach for type annotations in Javascript.

So, if people are already writing verbose docstrings to annotate functions, why not just use real function annotations? They’re completely optional and you don’t have to use them for non-public APIs or small scripts. They’re more concise, easier to process and easier to verify. Function annotations are only available on Python 3, you might say, but there are some approaches to emulate them in Python 2.x using decorators and it’s still way better than docstrings.
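One possible shape for such an emulation is sketched below. The annotate decorator is a made-up name for this example, not an existing library: it simply stores the metadata in the same __annotations__ attribute that Python 3 fills in, without using any Python 3 syntax.

```python
def annotate(returns=None, **arg_types):
    """Attach annotation metadata without using Python 3 syntax.

    Assumption of this sketch: no parameter of the decorated function
    is itself called "returns".
    """
    def decorator(func):
        metadata = dict(arg_types)
        if returns is not None:
            metadata['return'] = returns
        # Same attribute that Python 3 populates from real annotations.
        func.__annotations__ = metadata
        return func
    return decorator


@annotate(name=str, age=int, returns=str)
def greet(name, age):
    return 'Hello {0}, you are {1} years old'.format(name, age)
```

Tools that read __annotations__ would then see the same information whether it came from the decorator or from real annotation syntax.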

An interesting thing about Python annotations is that they don’t have to be types. In fact, you can use any Python expression as a function annotation. This opens the possibilities for a lot of interesting applications: typechecking, auto documentation, language bridges, object mappers, adaptation, design by contract, etc.
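A small sketch of what this looks like in practice: the positive predicate and the scale function are invented for the example. Python stores whatever the expressions evaluate to and does nothing else with them.

```python
def positive(value):
    """A predicate used here as an annotation instead of a type."""
    return value > 0


def scale(factor: positive, unit: 'a unit label' = 'px') -> 'formatted string':
    return '{0}{1}'.format(factor, unit)


# Annotations are stored on the function, untouched, for tools to read.
notes = scale.__annotations__
```

Here notes['factor'] is the positive function itself and notes['unit'] is just a string; giving them a meaning is entirely up to whatever tool reads them.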

The typelanguage library defines a whole language for communicating types. This language can be used with just string annotations. For example:

def get_keys(a_dict: '{str: int}') -> '[str]':
    ...

The downside of this flexibility is that it causes some confusion in the community about how annotations should be used. A recent discussion in the Python-ideas mailing list unveiled this problem.

Personally, I would love to see this feature used more in the Python community. It has a lot of potential. I started a small library to work with type annotations. It implements the typechecked decorator described before, and some other useful things like structural interfaces, unions and logic predicates that can be used as function annotations. It’s still very immature, but I would like to improve it in the future by adding function overloading and other features. A detailed description of the library probably deserves a post of its own. I would also love to hack Jedi to add some basic support for auto-completion based on annotations.

Aaron Swartz

More than a month has passed since Aaron Swartz died. Aaron was an activist who devoted much of his life to defending our rights. His death was truly regrettable.

I wanted to pay a small tribute to Aaron. I decided to translate and subtitle his moving talk “How we stopped SOPA” at F2C 2012:

I did the translation little by little in my spare time. If anyone finds any error, please let me know.

Here is the English transcript, which I also made.

Here is the Spanish subtitle file.

Here is the original video.

Ludum Dare 25

Last week I participated in Ludum Dare, one of the most popular game making competitions out there. The idea is to write a game in 48 hours. You have to create everything in those 48 hours. That includes graphics, sounds and code. This time the theme was “you are the villain”. I tried to participate before but failed to finish anything. This time my primary goal was to finish a game, even if it was very simple. I decided to write something between Space Invaders and Galaxian, where you actually played the aliens. I also decided to mix in some tower defence elements. I had a lot of fun writing this game, even though in the end it was boring and buggy. Next time it will be better for sure.

For the game code I used Dart and a very immature library I’ve been working on. The result wasn’t very good. The controls were poor and the game is not very fun. It also has some ugly bugs. Writing a game in 48 hours is really hard, harder than I initially thought. I was new to these tools and that made everything harder too. For graphics I used GIMP and Inkscape.

Here is a brief summary of my experience.

What went right:

  • I finished! That’s the best thing!
  • I made something simple.
  • I started using very simple graphics and decided to improve them later only if there was time.
  • I could come up with a design pretty quickly, this allowed me to spend more time on coding and creating graphics.
  • I created a simple plan and was able to follow it on time.

What went wrong:

  • Game mechanics and controls. The controls didn’t fit quite right with the game. The game mechanics could have been improved.
  • No sound :( I didn’t have time for it.
  • Final game had some bugs because I tweaked the controls at the last minute.
  • I had bugs with other browsers that I didn’t detect until the last minute.
  • Sunday was significantly less productive than Saturday and Friday night. I was really tired and took a lot of breaks. Probably because I had a lot of work the previous week. I will try to take a rest before the compo next time.
  • I’m not an expert with the tools. That slowed me down with the code. And the engine I wrote is still very immature.

What I learned:

  • Playing and rating games is as much fun as writing one, or even more. I love to see such an explosion of creativity!
  • The community rocks! Thanks for everything.

Ludum Dare was an incredibly fun experience. I won’t miss the next one!

Beyond JavaScript part 2: Dart and TypeScript

I was glad to see the release of Microsoft TypeScript last week. After Google with Dart, it’s nice to see one more big player trying to create new languages for client side web development.

I’ve been playing with Dart for a while and TypeScript really impressed me. In terms of syntax, I feel that TS got some bits much better than Dart. Anders Hejlsberg has a true talent for language design. Some things I like about TypeScript:

  • Full interoperability with the JavaScript world. This is both ways: from JS to TS and vice-versa. There is a huge ecosystem of code available for JS.
  • Better syntax. For example: type annotations are much more flexible, and they look nicer. Interfaces are better too, they cover all the cases and there is no need for ugly constructs such as “typedef” in Dart.
  • They offer support for private things, both in classes and modules. Although, this is only useful at compile time.
  • The web playground is really cool. It has auto completion, error highlighting and side by side compilation. It even has nice key bindings, almost like a good IDE.
  • The Visual Studio support and the online playground showed an amazing type inference engine. I have not seen that with Dart.
  • The module system looks better: it’s possible to explicitly import only the things you need from a module. I like that.

As a side note, I really like the way Microsoft is approaching open source with this project. They have open sourced a lot of things in the past, but this time it feels different. They used an Apache license and added a node.js package, and Chrome and MongoDB were used in the demo. It shows an MS less afraid of interoperating with competing open source products and more interested in truly participating in the community process.

Dart, on the other hand, is a more ambitious project in my opinion, although many of the cool promised features are not really there yet. For example: mirrors and tree shaking.

There are some things that I think Dart got better than TypeScript:

  • It really fixes all the insanity of JavaScript: it has sane equality operators, real arrays, real hash maps, sane comparisons, sane scope, lexical “this” and many more things. TypeScript doesn’t fix any of these problems.
  • More features: operator overloading, string formatting, for-in loops, better collections, isolates, annotations, generics.
  • It improves the DOM interface. This is one of my favorite features.
  • Multiplatform IDE. Visual Studio is cool, but I don’t want to use Windows.

Dart also provides a new VM. This is interesting because it allows optimizations based on type inference, direct debugging and other cool things. However, I think it’s very unlikely that other browsers ever implement the Dart VM. Dart2js will be the only option for a long time.

Another thing I like about Dart is how fast the project moves. Almost every week you see language changes and improvements for the IDE. I wonder if TypeScript is going to be as dynamic.

I’m currently working on a small personal project written in Dart. I would like to play with TypeScript but I don’t want to use Visual Studio. I think some traction is needed before support for other IDEs and editors appears. I guess I have to wait.

CoffeeScript: less typing, bad readability

I’ve used CoffeeScript for a few months now. Coming from Python, I felt that CoffeeScript was more concise than JavaScript, so I decided to use it for a few small projects. Initially, it was a nice experience, but then I gradually realized that, while writing CoffeeScript code was very pleasant, reading it wasn’t. I started to notice that it was hard to read my own code a few months later. It was even harder to read other people’s code. I often found myself reading the translated JavaScript code to understand a line or two of CoffeeScript. I concluded that CoffeeScript was a language designed for writability at the cost of readability: easier to write, but harder to read.

The roots of CoffeeScript readability problems are two principles applied to the design of the language:

  • Implicit is better than explicit
  • There is more than one way to do it

1. Implicit is better than explicit.

Implicit or optional tokens in a programming language usually bring readability problems. For example, in C-like languages, you can omit curly brackets after a conditional expression if you only have one statement:

if (condition)
    action();

But what happens if we add a new statement:

if (condition)
    action();
    action2();

Now let’s take a look at a classic problem associated with implicit semicolon insertion in Javascript:

function foo() {
  return
    {
      foo: 1
    }
}

Both examples show cases where, at first glance, the code looks like it’s doing something, but after looking more carefully it’s doing something completely different. Even if you know the rules, it’s easy to fall into this trap if you’re an unwary reader. That’s a readability problem.

CoffeeScript introduces multiple implicit or optional tokens that create a lot of situations like these ones. And that’s something you can easily see in real code. For example, take this statement:

action(true, {
   option1: 1,
   option2: 2
})

In CoffeeScript, you can omit the parentheses, the curly brackets and the commas. They’re optional. So you can rewrite the statement above as this:

action true
   option1: 1
   option2: 2

Problems with optional parentheses

Take a look at these two snippets. Each CoffeeScript snippet is followed by the resulting JavaScript:

CoffeeScript:
doSomething () ->  'hello'

JavaScript:
doSomething(function() {
  return 'hello';
});

CoffeeScript:
doSomething() ->  'hello'

JavaScript:
doSomething()(function() {
  return 'hello';
});

Both statements do completely different things, although they look very similar. The first one takes the space after the function name and applies implicit parentheses to the function call, taking the function as a single parameter. The second one interprets the parentheses as a function call with no arguments and applies implicit parentheses on that result. Note that in CoffeeScript parentheses are also optional in function definitions with no arguments. That means that the following two statements have exactly the same meaning:

x = -> 'result'
x = () -> 'result'

Something curious about the rules used by CoffeeScript for implicit parentheses is that the case for function calling is exactly the opposite of the case for function definition. In function calling you can omit parentheses except when the function takes no arguments, whereas in function definition you can omit parentheses only when the function has no arguments.

Now let’s take a look at an interesting case of how implicit parentheses make things harder to read. This is a small snippet taken directly from the CoffeeScript source code:

action = (token, i) ->
      @tokens.splice i, 0, @generate 'CALL_END', ')', token[2]

The @tokens.splice function call has five elements separated by commas. At first glance you might think that the function is taking five arguments, but if you read carefully, you will notice that one of the arguments is another function call: @generate. The last two elements are arguments for @generate, not for @tokens.splice. A more readable way of writing this would have been:

action = (token, i) ->
      @tokens.splice i, 0, @generate('CALL_END', ')', token[2])
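
The same grouping can be checked in plain JavaScript. This is only a sketch: the token list and the generate helper are hypothetical stand-ins, not the real ones from the CoffeeScript source.

```javascript
// Hypothetical token list; each token is a [tag, value, line] triple.
const tokens = [['IDENTIFIER', 'f', 1], ['NUMBER', '42', 1]];

// Hypothetical generate helper that just builds a token.
function generate(tag, value, line) {
  return [tag, value, line];
}

const token = tokens[1];
const i = 2;

// splice receives exactly three arguments here: the start index, the delete
// count and one item to insert. The other three values belong to generate.
tokens.splice(i, 0, generate('CALL_END', ')', token[2]));
```

After the call, the list has one extra token at position i, which is the value returned by generate.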

Problems with optional commas

In CoffeeScript you can omit the commas that separate function arguments if you put each argument on a new line. For example, the following two statements are equivalent:

moveTo 10, 20, 10
moveTo 10,
  20
  10

The comma after the first argument is mandatory, except if the next argument is an object definition:

moveTo(10, {key: value})

moveTo 10
  key: value

Also, if you’re not using explicit parentheses, indentation is important, but alignment is not. Take a look at these examples, each followed by the resulting JavaScript:

CoffeeScript:

doSomething 1,
  2
  3
  4

JavaScript:

doSomething(1, 2, 3, 4);

CoffeeScript:

doSomething 1,
2
3
4

JavaScript:

doSomething(1, 2);
3;
4;

CoffeeScript:

doSomething 1,
  2
    3
   4

JavaScript:

doSomething(1, 2, 3, 4);

CoffeeScript:

doSomething(1,
2
3
4)

JavaScript:

doSomething(1, 2, 3, 4);

You’re not safe from indentation problems if you use parentheses. For example:

CoffeeScript:

doSomething (->
'hello'), 1

JavaScript:

doSomething((function() {}, 'hello'), 1);

CoffeeScript:

doSomething (->
  'hello'), 1

JavaScript:

doSomething((function() {
  return 'hello';
}), 1);

In the first case, the line break after the function definition is replaced by an implicit comma, and the parentheses seem to be ignored.
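
It is worth spelling out what the first JavaScript output actually does at run time. Inside the inner parentheses the comma is JavaScript’s comma operator, which evaluates both operands and yields the last one. The following sketch uses a hypothetical doSomething that simply returns what it received:

```javascript
// Hypothetical doSomething that simply returns what it received.
function doSomething(first, second) {
  return { first, second };
}

// Output for the unindented version: the inner parentheses contain a comma
// expression, which evaluates to its last operand, the string 'hello'.
const unindented = doSomething((function () {}, 'hello'), 1);

// Output for the indented version: the first argument is the function.
const indented = doSomething((function () {
  return 'hello';
}), 1);

console.log(typeof unindented.first); // string
console.log(typeof indented.first);   // function
```

So a missing indentation does not produce a syntax error. It silently changes the first argument from a function into a string.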

Problems with optional curly brackets

Suppose that you have a function that takes two objects and another value as arguments:

action({key: value}, {option: value}, otherValue)

If you omit the curly brackets, you might think you get the same result:

action(key: value, option: value, otherValue)

However, in this case CoffeeScript will take the first comma as a separator for object properties instead of a separator for arguments. The second comma, however, is taken as an argument separator because what follows it is not an explicit key-value pair. The code will be translated to the following JavaScript:

action({key: value, option: value}, otherValue);
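
The practical consequence is that the function receives a different number of arguments. A minimal sketch, assuming a hypothetical action that just reports how many arguments it got (value and otherValue are placeholder strings):

```javascript
// Hypothetical action that reports how many arguments it received.
function action(...args) {
  return args.length;
}

// Placeholder values, only for illustration.
const value = 'v';
const otherValue = 'other';

// What the author intended: two objects plus another value.
const withBraces = action({ key: value }, { option: value }, otherValue);

// What CoffeeScript generates when the curly brackets are omitted:
// the two key-value pairs are merged into a single object.
const withoutBraces = action({ key: value, option: value }, otherValue);

console.log(withBraces, withoutBraces); // 3 2
```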

Something curious here is that in CoffeeScript, explicit key-value pairs are optional in object definitions, but only if you use explicit curly brackets. That means that you can write something like this, shown with the resulting JavaScript:

CoffeeScript:

x = {
  key1
  key2
  key3: value3
}

JavaScript:

x = {
  key1: key1,
  key2: key2,
  key3: value3
};

2. There is more than one way to do it (TIMTOWTDI)

In CoffeeScript TIMTOWTDI is a strong principle. For example, instead of just having the true and false keywords for boolean values, you can also use yes and no, or on and off.

Also, you can write a simple conditional statement in multiple ways:

x = 1 if y != 0

if y != 0
  x = 1

x = 1 unless y == 0

unless y == 0
  x = 1

All four statements above do exactly the same thing.

The problem with having multiple ways of doing one thing is that the language ends up with too many idioms. This makes code harder to read because a programmer trying to understand a piece of code must be familiar with all those idioms.

When we combine multiple idioms with implicit stuff and the fact that everything is an expression, the result is a bomb for readability. Here are a few examples taken directly from CoffeeScript’s source code.

Fancy for loop

  break for [tag], i in @tokens when tag isnt 'TERMINATOR'
  @tokens.splice 0, i if i

This code deletes leading newlines from the list of tokens. The for loop is just a “cool” one-liner to write this:

  for [tag], i in @tokens
    if tag isnt 'TERMINATOR'
      break
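
For comparison, here is a hand-written JavaScript sketch of what the original two-line snippet (the for loop plus the splice) does. It is not the compiler’s actual output, and the token list is a hypothetical sample.

```javascript
// Hypothetical token list: each token is a [tag, value] pair.
const tokens = [
  ['TERMINATOR', '\n'],
  ['TERMINATOR', '\n'],
  ['IDENTIFIER', 'x'],
  ['TERMINATOR', '\n'],
];

// Find the index of the first token that is not a TERMINATOR.
let i = 0;
for (; i < tokens.length; i++) {
  const [tag] = tokens[i];
  if (tag !== 'TERMINATOR') {
    break;
  }
}

// Remove the leading TERMINATOR tokens, if there are any.
if (i) {
  tokens.splice(0, i);
}
```

Written this way, it is clear that the loop exists only to compute i, and that the break happens at the first token that is not a terminator.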

Tricky while

i += block.call this, token, i, tokens while token = tokens[i]

In CoffeeScript everything is an expression. In the code above, is the while expression an argument of block.call, or is it acting as a while loop for the whole statement? When we translate it to JavaScript, this is what we get:

while (token = tokens[i]) {
  i += block.call(this, token, i, tokens);
}

Much easier to read in my opinion. Also, note that the while expression is using an assignment operator instead of a comparison one. That adds 10 points to the readability bomb.
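
One way to defuse the assignment inside the condition is to compare the assigned value explicitly. The sketch below assumes a hypothetical token list and a hypothetical block callback that returns how many positions to advance. Note that it is not strictly equivalent to the original loop: it stops only on undefined, not on any falsy token.

```javascript
// Hypothetical tokens and callback, only for illustration.
const tokens = ['a', 'b', 'c', 'd'];
const visited = [];

// Hypothetical block: records the token and advances by one position.
function block(token, i, tokens) {
  visited.push(token);
  return 1;
}

let i = 0;
let token;

// The assignment is still there, but the comparison makes it explicit
// that the loop stops when tokens[i] is past the end of the list.
while ((token = tokens[i]) !== undefined) {
  // null is passed as the receiver because this sketch has no object context.
  i += block.call(null, token, i, tokens);
}
```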

Tricky if

@detectEnd i + 1, condition, action if token[0] is 'CALL_START'

Here is a similar example, but this time, we’re using an if statement. As in
the previous example, the if here is acting over the whole statement:

if (token[0] === 'CALL_START') {
  this.detectEnd(i + 1, condition, action);
}

But what happens if we add an else to the if?

@detectEnd i + 1, condition, action if token[0] is 'CALL_START' else false

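Now the if no longer applies to the whole statement. It is parsed as an expression and passed as an argument to action, which becomes an implicit function call inside the arguments of @detectEnd: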

this.detectEnd(i + 1, condition, action(token[0] === 'CALL_START' ? void 0 : false));

Fancy redefinition

mainModule.moduleCache and= {}

This code clears the module cache only if its current value is truthy (not null or anything else falsy). This could have been written this way:

if mainModule.moduleCache
  mainModule.moduleCache = {}

But short and original code is much cooler. This is a good example of how TIMTOWTDI kills readability.
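
The behaviour of and= can be checked with a small JavaScript sketch. The objects below are hypothetical stand-ins for mainModule, and clearCache is a name introduced only for this example.

```javascript
// Hypothetical objects standing in for mainModule.
const withCache = { moduleCache: { a: 1 } };
const withoutCache = { moduleCache: null };

// Hypothetical helper with the same effect as the CoffeeScript `and=`:
// assign a fresh object only if the current value is truthy.
function clearCache(mainModule) {
  if (mainModule.moduleCache) {
    mainModule.moduleCache = {};
  }
}

clearCache(withCache);
clearCache(withoutCache);

console.log(Object.keys(withCache.moduleCache).length); // 0
console.log(withoutCache.moduleCache);                  // null
```

The existing cache is replaced by an empty object, while the null cache is left untouched.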

Nested made flat

js = (parser.parse lexer.tokenize code).compile options

In this example we see how a nested chain of calls looks flat thanks to the magic of implicit parentheses. The code translates to the following JavaScript:

js = (parser.parse(lexer.tokenize(code))).compile(options);

When the nested calls are explicit, the code becomes easier to read.

Conclusion

Of course readability is a very subjective topic. The problems described here might not apply to you if you come from a different background. I come from Python, C# and C++. But if you come from Ruby or Perl, you might think these are not problems but actually cool features.

I think that readability is more important than writability for a programming language. Code is usually written once, but read many times. Given that CoffeeScript doesn’t fix any of the fundamental problems of JavaScript, but damages readability, I decided not to use it anymore.

Update:

Another interesting post with some other readability problems in CoffeeScript: http://ryanflorence.com/2011/case-against-coffeescript/

Beyond Javascript: Coffeescript

During this year I’ve introduced myself to the world of front-end web development. I’ve never been a fan of the web as a development platform, but I have to admit that the web seems to be the unavoidable platform of the future. During my adventures with web development, I had to deal with JavaScript, of course. My opinion about this language is not different from almost everyone else’s: it’s a language with good intentions which ended up being not so good. No wonder it is the most “WTF” language on Stack Overflow. After some months of continuously writing JavaScript, I got used to it.

In parallel with my daily JS programming, I’ve been looking for alternatives. These come mostly in the form of languages that compile to JavaScript. I decided to try CoffeeScript after watching a cool video titled Better JS with CoffeeScript by Sam Stephenson from 37signals. CS is a nice little language inspired by Ruby, Python and others. I used CS for some small toy projects. It’s very cool and it has a strong community.

Probably the most striking feature of CoffeeScript is that it’s just JavaScript. There are almost no semantic changes between them. The difference is purely aesthetic. This has some interesting advantages: 1. Debugging is not a problem because CS can be compiled to human-readable JS. 2. CS can easily interoperate with any existing JS code. The main disadvantage is that CS fixes none of the fundamental problems of JS.

I feel that writing CS is much better than writing JS. However, it took me a while to realize that reading CS is most of the time harder than reading JS. I realized this when reading my own code a few months later. My conclusion is that CoffeeScript’s design has a strong focus on writability, but not on readability. There are two factors that contribute to this in my opinion: 1. The design has a preference for implicit stuff. Some important tokens such as parentheses, curly brackets, commas and others are optional. This leads to ambiguity in the code that must be resolved by precedence rules and creates code that is very hard to read. 2. The language adopts Perl’s motto: “There’s more than one way to do it”. This ends up in code with too many different idioms, making things hard to read. I personally prefer Python’s motto: “There should be one, and preferably only one, obvious way to do it”.

A detailed description of all the readability problems in CoffeeScript deserves its own post. I’ll leave that for later. Meanwhile, I decided not to code in CS anymore. I don’t really see any value in it. I’ll go with plain JavaScript when necessary, and I’m also exploring new alternatives such as Google Dart and ClojureScript.