Crowbar central

A blog about the ooc language, with source code examples, updates on the state of the llama (our beloved mascot), and random thoughts by nddrylliog, the benevolent dictator for life of ooc.

The opinions expressed in this blog are not necessarily those of the whole ooc community.

Nov 14

oc vs rock - the state of the Llama

So, I’ve been working on a new ooc compiler in my spare time, with the end-of-the-road goal to replace rock completely.

This post should address a few concerns/questions some of you have regarding this new project.

Q: Why a new compiler? Isn’t rock fine?

No. rock’s not fine. It’s awful. If you disagree, it’s because we have done a good job at patching it up for most bugs encountered.

Don’t get me wrong: rock is still better than many, many compilers out there. For some reason, it seems that sucky languages have good compilers and good languages have sucky compilers (example: Objective-C/LLVM vs Scala/scalac).

Q: What are rock’s visible drawbacks?

  • It doesn’t scale well. It’s faster than most for small projects (esp. if you have a custom, small sdk), but for large projects (ie. itself) it becomes tiresome
  • Related to above: error reporting is of variable quality. In complex situations with hundreds of modules, sometimes irrelevant errors are reported first.
  • Partial recompilation is not 100% reliable. Sometimes C compilation errors disappear when cleaning cache files.
  • Every day, new ways are found to make the compile

These problems are well-known, and they are the result of bad design decisions.

Q: What’s wrong with rock’s internal design?

  • The whole resolving process is flawed. Looping is less than optimal, and many unnecessary tree walks happen. Most resolve() methods have a lot of noise, and sometimes it’s not exactly clear what the best thing to do is when we’re waiting on something to resolve.
  • The AST confuses ooc and C. Some AST nodes (such as StructLiteral) don’t even have a direct mapping to ooc code, but they’re used regardless because it would be too hard to do everything in the backend without an additional C AST.

Those are the most important points. Other areas could be improved as well (parallelizing, using a proper command line arguments lib, etc.)

Q: Why not just fix rock then?

First off, because these changes are too big. Changing the whole way resolving is done means having to rewrite 50% of rock without being able to iteratively test anything.

There’s more than that: originally, rock was written so that j/ooc could compile it. In other words, with many limitations. By writing the same functionality again from scratch, I am able to use higher-level ooc idioms that are well-supported by current compilers (generics, closures with ACS, properties, improved match)

This often means less code that is more readable, with fewer bugs, and at least the same level of functionality.

Q: How is oc’s design better?

This could probably fill a book, so I’m not gonna go into too much detail, but here are a few nice features:

  • Parallel parsing of files - on my dual core, the parsing of 200 files is up to 1.8x faster in oc than in rock (both use nagaqueen).
  • Resolving using coroutines rather than dumb loops: this allows completely out-of-order resolution, with only one tree walk per module.
  • Better error reporting: because coroutines are used rather than looping “in hope of something to happen”, errors are reported much sooner. When running rock and oc on the same 3000-line file, it takes rock ~3.161s and oc ~0.178s to report the undefined symbol used at the end. This is almost an 18x speedup.
  • There’s a pure ooc AST, allowing for inference, error checking, optimization, etc. and the C backend uses a C AST (good for partial recompilation). This also means alternative backends are easier to implement because the ooc AST they’ll be dealing with won’t be full of C-isms.
  • Parallel generation and compilation of C files, as soon as they are resolved. This means that even if the resolving process is single-threaded, some modules are resolved very quickly and then compiled while other modules are still resolving.
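
To make the coroutine idea a bit more concrete, here’s a minimal sketch in Python (not ooc, and definitely not oc’s actual code). It assumes a toy model where a module is just a list of ("define", name) and ("use", name) nodes; resolve_module, resolve_all and that node format are all made up for illustration. Each module is walked exactly once: when it needs a symbol that isn’t there yet, it suspends instead of restarting, and it is resumed at the same node once the symbol shows up.

```python
from collections import deque

def resolve_module(nodes, symbols):
    """Walk a module's nodes once; yield a symbol name whenever we must wait for it."""
    for kind, sym in nodes:
        if kind == "define":
            symbols.add(sym)
        elif kind == "use":
            while sym not in symbols:
                yield sym  # suspend here, resume later at this very node

def resolve_all(modules):
    """Hypothetical scheduler: run tasks until none of them can make progress."""
    symbols = set()
    ready = deque((name, resolve_module(nodes, symbols))
                  for name, nodes in modules.items())
    waiting = []   # (module name, task, symbol it is waiting for)
    resolved = []
    while ready:
        name, task = ready.popleft()
        try:
            waiting.append((name, task, next(task)))
        except StopIteration:
            resolved.append(name)  # the walk reached the end of the module
        # Wake up every task whose missing symbol has been defined meanwhile.
        still_blocked = []
        for wname, wtask, wsym in waiting:
            if wsym in symbols:
                ready.append((wname, wtask))
            else:
                still_blocked.append((wname, wtask, wsym))
        waiting = still_blocked
    # Nothing is runnable anymore: whatever is still waiting can never resolve.
    errors = [f"{name}: undefined symbol '{sym}'" for name, _, sym in waiting]
    return resolved, errors

modules = {
    "a": [("use", "B"), ("define", "A")],
    "b": [("define", "B"), ("use", "A")],
    "c": [("use", "Nope")],
}
resolved, errors = resolve_all(modules)
```

In this toy run, modules a and b depend on each other and still resolve out of order, while c is reported with an undefined symbol as soon as nothing else can run, with no extra passes “in hope of something to happen”.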

And there are many other internal details that make it a lot more pleasant to work with than rock. Bugfixes in oc are usually improving the design. Bugfixes in rock are usually patches because changing the design would be too costly.

Q: Sounds great! Can I start using oc now?

No, you can’t. Like all good vaporware, it’s wicked cool but still far from supporting everything rock does.

It’s useless to make a list of what it supports, because the point is that most of the architecture is there and it’s not hard to add support for language features.

You can, however, contribute to its development. It’s on github, you can fork and ask what you can do to help. Alternative backends (targeting other high-level languages) would be a good easy first contribution.

Q: Last question: What about us poor devils/rock users?

I’m still a rock user as well - since oc obviously isn’t self-hosting yet. It means rock gets the usual love, bugs get fixed, features get added - no worries.

That’s it for today! Thanks for reading. I hope most of your questions have been answered; if not, there’ll be other posts, or you can hit me up on Twitter (@nddrylliog).


Sep 28

Error reporting improvements - parser generator tricks

Several of you had noticed that rock’s error reporting was particularly sucky these last few weeks. I’m happy to announce that it has been fixed and even improved, as you can see:

Which, surely you agree, is much much nicer to work with.

(Disclaimer: the code below is greg syntax, not ooc code)

Turns out the breakage was my fault. I had changed things like

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN
OPEN_PAREN = '('
CLOS_PAREN = ')'

To

FunctionCall = Name '(' Args ')'

It didn’t break parsing (ie. produced exactly the same AST) but it broke error reporting. Why?

The original peg/leg had no error reporting facility. Nor did greg, _why’s re-entrant fork of it. So in my fork, I added some error reporting facility.

peg/leg/greg allow you to specify blocks of C code inside braces {}, which are run when a rule is fully matched, such as this:

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN { onCall() }

So in the same spirit, I added ~{}, which is called when a rule is *not* matched.

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN ~{ onError() } { onCall()}

In that example, onError() will be called if the closing parenthesis isn’t matched.

But wait, isn’t that illogical? Why choose the closing parenthesis to test for errors? After all, we could do something like:

FunctionCall = Name OPEN_PAREN (Expr (COMMA Expr ~{ onError() })*) CLOS_PAREN

Ie. if Expr is not matched (malformed expression) it would throw an error. But that won’t work. Why? Because it’s alright for Expr not to be matched. ‘*’ means ‘0 or more’. So if it’s not matched, the parser generated by greg will just assume that there are no more Expr to be matched and that the parent rule ()* is fully matched and then it’ll move on to the next rule, ie. the ‘)’.

Now we see why it makes sense to test for closing delimiters such as ‘)’, ‘}’, ‘]’. If there is indeed a malformed expression, then Expr will no longer match and it’ll stop matching ()*, and when trying to match the closing parenthesis it will fail! And that’s when we want to throw an error.
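
Here’s a small hand-written sketch of that behaviour, in Python rather than greg. It is not what greg generates: parse_call, match and ParseError are hypothetical names, Expr is simplified to a run of digits, and whitespace is not handled. The point is only to show that a failed Expr inside the repeated group is a perfectly legal “zero more items”, so the one place where failing really means “syntax error” is the closing parenthesis.

```python
import re

class ParseError(Exception):
    """Raised only where a failed match cannot be a normal 'no more items'."""

def match(pattern, text, pos):
    """Return the end offset if `pattern` matches at `pos`, else None."""
    found = re.compile(pattern).match(text, pos)
    return found.end() if found else None

def parse_call(text):
    """FunctionCall = Name '(' (Expr (',' Expr)*)? ')'   with Expr = [0-9]+"""
    pos = match(r"[A-Za-z_]\w*", text, 0)
    pos = match(r"\(", text, pos) if pos is not None else None
    if pos is None:
        return None  # not a call at all: a plain non-match, not an error
    args = []
    end = match(r"[0-9]+", text, pos)
    if end is not None:
        args.append(text[pos:end])
        pos = end
        while True:  # the (',' Expr)* part
            comma = match(r",", text, pos)
            end = match(r"[0-9]+", text, comma) if comma is not None else None
            if end is None:
                break  # legal for '*': zero more items, pos stays before the comma
            args.append(text[comma:end])
            pos = end
    if match(r"\)", text, pos) is None:
        # The only spot where a failed match really means "syntax error".
        raise ParseError(f"expected ')' at offset {pos}")
    return args
```

With this sketch, parse_call("f(1,2)") returns ["1", "2"], while parse_call("f(1,@)") quietly gives up on the malformed second argument and only then raises ParseError, at the position where the ‘)’ should have been.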

But as I said the only thing I did was to change from

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN ~{ /* … */ } { /* … */ }

To

FunctionCall = Name '(' Args ')' ~{ /* … */ } { /* … */ }

And the explanation is very simple: ~{} only works after named rules. ‘(’ is a string rule and thus ~{} is simply ignored. This is a little limitation of my greg error reporting solution that I had totally forgotten. It might be considered a bug; if anyone else uses greg and wants that particular issue fixed, you know where GitHub is.

I hope you enjoyed this little journey into the world of parser generators. It’s an exciting domain and I’ve only scratched the surface. If you’re interested in developing your own toy language in ooc with greg, decadence is a nice tutorial I was corrupted into writing.


Sep 22

.fr ooc talks - OSDC & JDLL, 9 and 16 october

After EmergingLangs’ incredible success, I now have two more occasions to get embarrassed in front of people talking about our awesome llamaness.

On the 9-10th of October 2010, OSDC.fr will happen. It’s huge, it’s all open-source developers in arms, and it’s in Paris (the capital of Love, as everyone knows). There are even English talks! So fly to Paris for a magical weekend.

The OSDC, aka Open Source Developer Conference is historically more about dynamic languages such as Ruby, Perl, Python - but their invitation for an ooc talk clearly shows their will to broaden their horizon. May a thousand llamas express their gratitude to them.

One week later, on the 14-15-16th of October 2010, the beautiful city of Lyon will host the twelfth edition of the JDLL (Journées du Logiciel Libre - Free Software Days).

I had already given an ooc talk last year, sharing a timeslot with Patrice Ferlet, aka Metal3D (who was talking about a PHP compiler), and they didn’t throw rocks at me so I thought I’d try again this year. Apparently they accepted it, so I’ll be there, for another 40 minutes of bliss.

For all ooc users of France, these are your unique occasions to bug me with your personal issues before, during, and after the talks! I’ll be 100% available for discussion.


Aug 27
I <3 the ooc community.



Aug 14

How to make a hacker hate his keyboard

Documentation, standardization. I almost ripped out my hand when writing those two words.

And still, that’s what is happening right now. Tired of asking questions on IRC about the ins and outs of the ooc language? Wish you could study it all by yourself in a dark corner of your room without feeling silly for asking questions that were asked a thousand times before?

Well, here you go: http://github.com/nddrylliog/the-ooc-language I’ve begun writing a ‘book’ that explains the features of ooc chapter by chapter. Some chapters are mostly about syntax and easy to comprehend (ie. properties) and some go more deeply into the crusty stuff (ie. covers vs classes) - don’t be intimidated if you have to read it more than once.

Now for the next shocking revelation: do you think the ooc community is fragmented? Are you afraid of the many SDKs lying around? Do you wake up covered in sweat crying “NOooooOOO! Not Tango/Phobos all over again!” and staring at your poster of Walter Bright, breathing slowly to calm down?

Then fear no more: http://github.com/ooc-lang/ooc-std Mark Fayngersh, aka the man known for making ooc-lang.org look good for 20 dollars and a few rubys (thanks Mark!) has decided it was about time that the ooc SDK was standardized. This is an incremental effort - you can contribute, and your best bet is to come discuss it with us on IRC (#ooc-lang channel, on Freenode). Thanks in advance for all your contributions.

In other news, I’ve been working on.. fixing bugs, implementing generic inlining (still experimental), pure ooc coroutines (ucontext-based, still debugging it), building to shared libraries and true partial recompilation (ie. faster compile times), and tons of other cool stuff. Man, I love working on ooc. And I love you all. Until the next time, enjoy!


Jul 22

Channels and Coroutines in ooc

Hey there, tumblr’ing directly from OSCON, watching Mr. Shapiro talk about BitC.

So I watched Rob Pike’s talk (of Limbo/Go fame) about CSP, and Steve Dekorte (of Io fame) told me: “I wonder if it couldn’t be done simply as a library for an existing language”

So, I hereby present to you ooc-channels. It steals its API straight from the Go language.

The go() function takes a closure and allows you to start a Coroutine doing stuff. Since the closure can access the outer context, you can access…

Channels! Which are created using the make() function, which takes a type (really, a class) as an argument. No special syntax. Just goodness.

And two operator overloads to make this nice: sending a message is: channel << message, and receiving from a channel is: !channel

Here’s a small example:

chan := make(Int)
go(|| for(i in 0..100) { chan << i })
go(|| while(true) { i := !chan; i toString() println() })

Nice, right? There’s a more complete ‘sieve’ example in the repo, again totally a rip-off from the Go website examples (which is pretty convenient).

There are two implementations in ooc-channels currently: ThreadedChannel, which starts one thread for every go() call, and CoroutineChannel, which uses one OS thread and one coroutine for every go() call with a trivial coroutine scheduler that runs on every send/recv call to Channel (ie. <<, !).
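
For the curious, here’s roughly what the “one OS thread, one coroutine per go() call, scheduler runs on every send/recv” idea can look like, sketched with Python generators. This is not the ooc-channels code: Channel, send, recv and run are hypothetical names, and the channel here is an unbounded queue (so a send never blocks), which is a simplification.

```python
from collections import deque

class Channel:
    """Toy channel: an unbounded FIFO, so only receivers can ever block."""
    def __init__(self):
        self.queue = deque()

def send(chan, value):
    return ("send", chan, value)   # request handed to the scheduler via yield

def recv(chan):
    return ("recv", chan, None)

def run(*tasks):
    """Round-robin scheduler: every yield (a send or recv) is a switch point."""
    queue = deque()
    for task in tasks:
        try:
            queue.append((task, next(task)))  # run up to the first channel op
        except StopIteration:
            pass
    idle = 0  # consecutive tasks that could not make progress
    while queue and idle < len(queue):
        task, (op, chan, value) = queue.popleft()
        if op == "recv" and not chan.queue:
            queue.append((task, (op, chan, value)))  # still blocked, retry later
            idle += 1
            continue
        idle = 0
        result = None
        if op == "send":
            chan.queue.append(value)
        else:
            result = chan.queue.popleft()
        try:
            queue.append((task, task.send(result)))  # resume until the next op
        except StopIteration:
            pass  # the coroutine finished

def producer(chan, count):
    for i in range(count):
        yield send(chan, i)

def consumer(chan, out):
    while True:
        value = yield recv(chan)
        out.append(value)

chan, received = Channel(), []
run(producer(chan, 5), consumer(chan, received))
print(received)  # prints [0, 1, 2, 3, 4]
```

The scheduler stops once every remaining coroutine is stuck on an empty channel, which in this example is the consumer waiting forever after the producer is done.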

The ThreadedChannel implementation works fully in parallel, but stalls at ~2600 concurrent channels on my machine (probably because of the max number of threads imposed by the OS - could be tweaked, but still insane).

On the other hand, the CoroutineChannel hits hard limits of the GC itself! With a Boehm GC compiled with -DLARGE_MODEL, I have been able to launch 86’000 concurrent channels before it dies because there are too many root sets (look up the implementation if you want more details). Again, can be tweaked, but there’s probably a better solution (e.g. using the GC_push_roots hook, which I don’t know how to do.)

The implementation is obviously not of the quality of the Go language (although I haven’t verified that) but it’s fun to see what one can come up with in 24h =)


