Erlang is slow: and other rubbish

20 May 2013

How and Why We Switched from Erlang to Python tells an intern’s tale, from whose perspective it runs like this: we used Erlang, “No one on our team is an Erlang expert” (despite “how crucial this service is to our product”!), and also would you please suspend brain activity while I make some performance claims.

Hold your horses. The good decision to rewrite critical services in a language they actually know aside, let’s look at their notes on perf:

Another thing to think about is the JSON library to use. Erlang is historically bad at string processing, and it turns out that string processing is very frequently the limiting factor in networked systems because you have to serialize data every time you want to transfer it. There’s not a lot of documentation online about mochijson’s performance, but switching to Python I knew that simplejson is written in C, and performs roughly 10x better than the default json library.


Let’s distill these claims:

  • Erlang is historically bad at string processing
  • string processing is very frequently the limiting factor in networked systems
  • simplejson is written in C
  • simplejson performs 10x better than the default json library

Further down:

I went into Mixpanel thinking Erlang was a really cool and fast language but after spending a significant amount of time … I understand how important code clarity and maintainability is.

Thus by implication?

  • Erlang is not a really cool and fast language
  • Erlang is somehow not conducive to code clarity and maintainability

This whole paragraph is just a mess and I can’t quote it selectively without losing some of its slimy essence.

Again, in full:

I’ve learned a lot about how to scale a real service in the couple of weeks I’ve been here. I went into Mixpanel thinking Erlang was a really cool and fast language, but after spending a significant amount of time sorting through a real implementation, I understand how important code clarity and maintainability is. Scalability is as much about being able to think through your code as it is about systems-level optimizations.

By your own admission, no-one on your team is an Erlang expert; you “have trouble debugging downtime and performance problems”. Plenty of Erlang users don’t, which suggests the problem with your team’s command of the environment is severe. Similarly, you earlier mention the “right way” to do something in Erlang, and immediately comment that your code didn’t do that at all – never mind that the “right way” mentioned was wrong.

Yikes.

So why does the word “Erlang” feature in the above-quoted paragraph at all?

There’s no reason to expect either code clarity or maintainability of a service developed over 2 years without an engineer skilled in the environment overseeing the architecture.

I didn’t say Erlang in that sentence, and yet it has greater explanatory power than the intern’s claim for the same phenomenon.

I suspect their explanation is more controversial, however, and it’s easier to make these claims than arrive at the conclusion that the team’s own shortcomings were responsible for the technical debt accrued – and it makes for a better article. I choose my explanation:

  • Erlang is somehow not conducive to code clarity and maintainability: there is not even anecdotal support in the article for this claim

That leaves 5 claims.

Let’s note an important confounding factor: the article is from August 2011. The state of Python and Erlang, and of the libraries for both, has changed since.


As an aside: it’s easy to think that the performance claims they do indirectly make are incidental (and not essential) to the article.

But remove them, and note there’s not really an article any more: just a prologue about mapping some concepts from an old codebase to a new one, and ... an epilogue espousing the virtues of code clarity and maintainability.

Ain’t nobody gonna argue with that, but, as noted above, just that alone does not a “How and Why We Switched from Erlang to Python” blog post make.


Let’s now dig into it – this won’t be much of an article without benchmarks either. Unlike them, I’m actually comparing things against each other; their decision to give benchmarks on the new system but not on the old is baffling at best.

I compared 4 Python JSON implementations and 3 Erlang ones:

  • json (built-in in Python 2.7.4)
  • simplejson 3.3.0 from PyPI, both with and without C extensions
  • ujson 1.30 from PyPI
  • mochijson and mochijson2 from mochiweb
  • jiffy

simplejson is what the intern picked for their rewrite. mochijson is what they were using before.

All versions are current at time of writing.

Testing method:

  • read 5.9MB of JSON from disk into memory
  • start benchmark timer
  • parse JSON from memory 10 times, each time doing some minimal verification that the parse was successful
  • force a garbage collect
  • stop our benchmark timer
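For readers who want to try this against their own data, here is a minimal Python sketch of that procedure. It is not the benchmark code from the repository mentioned below: the `bench` and `make_sample` names and the small synthetic payload are assumptions made for illustration, and it uses `time.perf_counter`, which needs Python 3 rather than the 2.7.4 used for the numbers in this post.

```python
import gc
import json
import time


def bench(loads, text, runs=10):
    """Return the milliseconds taken to parse `text` with `loads`, `runs` times.

    Follows the steps listed above: the input is already in memory, each
    parse gets a minimal sanity check, and a garbage collection is forced
    before the timer is stopped.
    """
    start = time.perf_counter()
    for _ in range(runs):
        parsed = loads(text)
        # Minimal verification: the top-level value is a non-empty list.
        if not isinstance(parsed, list) or not parsed:
            raise ValueError("parse did not produce the expected document")
    gc.collect()
    return (time.perf_counter() - start) * 1000.0


def make_sample(n_records=1000):
    """Build a synthetic JSON document (hypothetical stand-in for the 5.9MB file)."""
    records = [
        {"id": i, "name": "user%d" % i, "tags": ["a", "b"], "score": i / 3.0}
        for i in range(n_records)
    ]
    return json.dumps(records)


if __name__ == "__main__":
    text = make_sample()
    print("json: %.1fms" % bench(json.loads, text))
```

Any other parser with a `loads`-style function (simplejson and ujson both expose one) can be passed in place of `json.loads`, which keeps the timing loop identical across libraries.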

The code is available on my GitHub account, and instructions to reproduce are found there.

Here are the results:

  • ujson: 1,160ms
  • jiffy: 1,271ms
  • simplejson (with C): 1,561ms
  • json: 2,378ms
  • mochijson2: 8,692ms
  • mochijson: 11,111ms
  • simplejson (no C): 16,805ms

ujson wins! jiffy a close second! simplejson a close third! These results are the average of three runs each, but I did many more runs in testing the benchmark code and can say the variance was quite low.

So:

  • simplejson performs 10x better than the default json library: this doesn’t appear to be the case now. It may have been the case in 2011, depending on what the default json library was back then.
  • Erlang is not a really cool and fast language: in this particular example the best Erlang library is on par with both of the best Python libraries – all three C-boosted, of course – and the best pure Erlang library runs in about half the time of the apparently-best pure Python one. (json is C-boosted, so it doesn’t count as pure Python.)
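The ratios behind those two verdicts can be recomputed from the timings listed in the results. A small sketch, where the `results_ms` dictionary simply restates the figures reported above and the `speedup` helper is a name made up here:

```python
# Timings in milliseconds, copied from the results listed in this post.
results_ms = {
    "ujson": 1160,
    "jiffy": 1271,
    "simplejson (with C)": 1561,
    "json": 2378,
    "mochijson2": 8692,
    "mochijson": 11111,
    "simplejson (no C)": 16805,
}


def speedup(slower, faster, results=results_ms):
    """How many times faster `faster` ran than `slower` (ratio of elapsed times)."""
    return results[slower] / results[faster]


if __name__ == "__main__":
    # The "10x" claim: C-boosted simplejson against the built-in json.
    print("simplejson vs json: %.2fx" % speedup("json", "simplejson (with C)"))
    # Best pure Erlang library against the pure Python one.
    print("mochijson2 vs simplejson (no C): %.2fx"
          % speedup("simplejson (no C)", "mochijson2"))
```

On these numbers, C-boosted simplejson comes out roughly 1.5x faster than json, nowhere near 10x, and mochijson2 roughly 1.9x faster than simplejson without its C extension.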

That leaves us with these claims unrefuted:

  • Erlang is historically bad at string processing
  • string processing is very frequently the limiting factor in networked systems
  • simplejson is written in C

Erlang’s historical performance is somewhat irrelevant, but the claim stands nevertheless.

No evidence was advanced for the second claim. There is no way to determine whether faster string processing was responsible for any improvement in their benchmarks; we don’t even know if the benchmarks improved, because we were only given one set (!). And since the rewrite changed the entire system and not just string processing, a before-and-after comparison would prove nothing anyway, especially given the proficiency gap. Hence:

  • string processing is very frequently the limiting factor in networked systems: maybe, maybe not, but picking the right library makes a big difference!

I mean, jeez; going by the numbers above (1,561ms down to 1,160ms), they could cut their string processing time (and thus the “limiting factor”?) by roughly a quarter if they switched from simplejson to ujson!

As for the third claim, if I don’t nitpick, it stands. Kinda.


Why did I feel the need to write this up?

I saw the article pop up on Hacker News today, 2 years after its publication. In fact, I’d seen the article not long after it was originally published, and I know it was on HN back then too. I don’t care about the fact that it was reposted; more that it was written at all.

It’s exactly the sort of useless bullshit that seems to fill a quarter of HN’s front page at any given stage: articles with titles like “Why I Don’t Use Vim”; “Why You Should Learn Vim”; “The Reason You Need To Understand System F If You Want To Ever Amount To Anything”; “Stop Thinking. Just Write.”; “How We Saved 95% Of Our Datacentre Costs By Rewriting Everything In Ada”; etc. etc. etc.

It’s edgy shit that grabs your attention and way oversells one point of view, at the expense of reason. This comment thread started out by acknowledging this trend, but was ruined by someone not catching the pattern being invoked. Usually there’s a nice novel point to be made somewhere if you can read the moderate subtext between the bold claims.

Unfortunately this article had no such point, and so turned out to be 100% garbage. But still, some people will read that article and go away thinking their prejudices about Erlang have been confirmed by someone who’s been in battle and seen it for themselves.

And really, this isn’t about Erlang. I don’t care what language or environment you use. Your coding philosophies are irrelevant, if you deliver – and Mixpanel are delivering, and clearly made the right choice to get rid of some technical debt there.

But don’t then try to pin any part of the responsibility for that debt on your tools, and especially not with such flawed logic.