Scala vs C vs Java: Cycles Speed

While reading how to calculate Pi using akka actors (http://doc.akka.io/docs/akka/1.3.1/intro/getting-started-first-scala.html), I wondered: how much faster would plain C code be at the same task?

So here is my simple Scala code. It doesn’t use akka, however; it is as simple as it can be.

$ cat PiCalculator.scala
object PiCalculator extends App {
   val startTime = System.currentTimeMillis
   val start = 0
   val nrOfElements = 1000 * 1000 * 100
   var acc: Double = 0
   // until: exclusive upper bound, so i runs over 0 .. nrOfElements - 1
   for (i <- start until nrOfElements)
        acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1)
   println((System.currentTimeMillis - startTime) + " ms")
   println(acc)
}
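
For reference, the loop evaluates a partial sum of the Leibniz series for Pi. A short restatement in formulas, where N is my own symbol for the number of terms summed (it does not appear in the code):

```latex
\pi = 4\sum_{i=0}^{\infty}\frac{(-1)^{i}}{2i+1},
\qquad
1 - 2\,(i \bmod 2) = (-1)^{i} \quad (i \ge 0),
\qquad
\mathrm{acc}_N = \sum_{i=0}^{N-1}\frac{4\,\bigl(1 - 2\,(i \bmod 2)\bigr)}{2i+1}.
```

The series converges slowly: the truncation error is roughly 1/N for large N, so with about 10^8 terms only the first seven decimal places or so can be expected to match Pi.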

Let us compile it:

$ scalac PiCalculator.scala
$ ls *.class
PiCalculator$$anonfun$1.class  PiCalculator.class  PiCalculator$.class  PiCalculator$delayedInit$body.class

The body of the for loop is implemented as the apply function in the PiCalculator$$anonfun$1 class:

$ javap 'PiCalculator$$anonfun$1'
Compiled from "PiCalculator.scala"
public final class PiCalculator$$anonfun$1 extends scala.runtime.AbstractFunction1$mcVI$sp implements scala.Serializable{
    public static final long serialVersionUID;
    public static {};
    public final void apply(int);
    public void apply$mcVI$sp(int);
    public final java.lang.Object apply(java.lang.Object);
    public PiCalculator$$anonfun$1();
}

So it seems that the JVM has to call a virtual method on an anonymous function object 100 million times.
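
To make the shape of that call pattern concrete, here is a rough Java sketch. It is a hand-written illustration, not actual scalac output: a Scala for over a Range desugars to a foreach call that takes a function object, and here java.util.function.IntConsumer stands in for Scala's function class. The class and method names are made up for this example.

```java
import java.util.function.IntConsumer;

// Hypothetical illustration of "one call into a function object per iteration".
class ClosureLoopSketch {
    // Stand-in for Range.foreach: calls body.accept(i) for start <= i < end.
    static void foreach(int start, int end, IntConsumer body) {
        for (int i = start; i < end; ++i) {
            body.accept(i);
        }
    }

    // Sums the series through a function object, one accept() call per term.
    static double viaClosure(int n) {
        double[] acc = {0.0}; // held in an array so the lambda can update it
        foreach(0, n, i -> acc[0] += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1));
        return acc[0];
    }

    // The same sum written as a plain hand-written loop.
    static double viaPlainLoop(int n) {
        double acc = 0.0;
        for (int i = 0; i < n; ++i) {
            acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1);
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 1000 * 1000;
        System.out.println(viaClosure(n));
        System.out.println(viaPlainLoop(n));
    }
}
```

Both methods add the same terms in the same order, so they return the same value; the only difference is the extra call per iteration, which is what a JIT compiler may be able to inline away.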

Let’s run the test:

$ java -classpath :/usr/share/scala/lib/scala-library.jar PiCalculator
1297 ms
3.141592643589326

And now the C equivalent:

$ cat PiCalculator.c
#include <stdio.h>
#include <time.h>

int main() {
   clock_t clock_start;
   int nrOfElements = 1000 * 1000 * 100;
   int start = 0;
   double acc = 0;
   register int i;
   clock_start = clock();
   for (i = start; i < nrOfElements; ++i)
        acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1);
   printf("%d ms\n", (int)((clock() - clock_start) / (CLOCKS_PER_SEC / 1000)));
   printf("%.8lf\n", acc);
   return 0;
}

The generated assembly is simple :)

$ gcc -S PiCalculator.c
$ wc PiCalculator.s
  95  242 1618 PiCalculator.s

Test it:

$ ./a.out
1240 ms
3.14159264

Surprise! The JVM’s JIT compilation and optimization are miraculous! OK, let’s disable the JIT compiler:

$ java -Djava.compiler=NONE -classpath :/usr/share/scala/lib/scala-library.jar PiCalculator
15517 ms
3.141592643589326

Now Scala is more than 12 times slower than C.
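
When comparing runs with and without the JIT, it can help to confirm which mode the JVM is really in. As far as I know, -Xint is the documented HotSpot option for interpreter-only execution, and whether -Djava.compiler=NONE is honored depends on the JVM version. The sketch below is only an assumption-laden helper: the class name is made up, and the java.vm.info property is JVM-specific (on HotSpot it typically mentions "mixed mode" or "interpreted mode").

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports what the running JVM says about its JIT.
class JitModeSketch {
    static String describe() {
        // JVM-specific property; may be absent on non-HotSpot JVMs.
        String vmInfo = System.getProperty("java.vm.info", "unknown");
        // Documented to be null if the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        String jitName = (jit == null) ? "none reported" : jit.getName();
        return "vm.info=" + vmInfo + ", compiler=" + jitName;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with the same flags as the benchmark shows whether the flag changed the execution mode.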

But what about… Java?

$ cat PiCalculator.java
class PiCalculator {
    public static void main(String []args) {
        long startTime = System.currentTimeMillis();
        int start = 0;
        int nrOfElements = 1000 * 1000 * 100;
        double acc = 0;
        for (int i = start; i < nrOfElements; ++i) {
            acc += (4.0 * (1 - (i % 2) * 2)) / (2 * i + 1);
        }
        System.out.println((System.currentTimeMillis() - startTime) + " ms");
        System.out.println(acc);
    }
}
$ javac PiCalculator.java
$ java PiCalculator
901 ms
3.141592643589326

Wow! It’s faster than C! Let’s turn off the JIT compiler:

$ java -Djava.compiler=NONE PiCalculator
3766 ms
3.141592643589326

It’s 4 times faster than Scala without JIT compilation and only 3 times slower than C. Even gcc’s -O3 or -Os cannot help us…
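
A single timed run inside main measures a mix of interpreted execution, JIT compilation, and compiled code. One way to see how much of the time is warm-up is to repeat the measurement several times in the same JVM. The following is a minimal sketch with made-up names, not a rigorous benchmark harness:

```java
// A minimal warm-up sketch (class and method names are illustrative only).
class WarmupSketch {
    // Partial sum of the series with n terms, same expression as in the post.
    static double leibniz(int n) {
        double acc = 0.0;
        for (int i = 0; i < n; ++i) {
            acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1);
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 1000 * 1000 * 100;
        // Early rounds may include interpretation and compilation time;
        // later rounds mostly run already-compiled code.
        for (int round = 1; round <= 5; ++round) {
            long t0 = System.nanoTime();
            double pi = leibniz(n);
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.println("round " + round + ": " + ms + " ms, acc = " + pi);
        }
    }
}
```

Printing the result in every round also keeps the computation from being discarded as unused.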

But why was our code so naive? We can replace

    acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1);

with

    acc += (i % 2 == 0 ? 4.0 : -4.0) / (2 * i + 1);

Compile the C version and run it…

$ ./a.out
890 ms
3.14159264
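
The rewrite does not change the result: for non-negative i both forms produce exactly the same term, because multiplying 4.0 by 1 or -1 is exact. A small Java check of that claim, with illustrative names:

```java
// Hypothetical check that the two ways of writing the term agree.
class SignTermSketch {
    // Original form: sign computed arithmetically from i % 2.
    static double arithmeticTerm(int i) {
        return 4.0 * (1 - (i % 2) * 2) / (2 * i + 1);
    }

    // Rewritten form: sign chosen with a conditional.
    static double ternaryTerm(int i) {
        return (i % 2 == 0 ? 4.0 : -4.0) / (2 * i + 1);
    }

    // True if both forms give identical doubles for 0 <= i < n.
    static boolean sameForFirst(int n) {
        for (int i = 0; i < n; ++i) {
            if (arithmeticTerm(i) != ternaryTerm(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameForFirst(1000 * 1000)); // prints true
    }
}
```

The equivalence only holds for i >= 0: in C and Java, i % 2 is -1 for negative odd i, which would break the arithmetic form.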

That’s it! When you write in Scala, your code may be translated into bulky, brain-dead constructs with anonymous classes, but the Java JIT compiler is so good that it makes the code as efficient as it would be in C.

One response to “Scala vs C vs Java: Cycles Speed”

  1. With tail recursion I get a timing of 473 ms for Scala.
    For comparison, I get 910 ms for the C code and 475 ms for the Java code.

    Scala code:

    object PiCalculator extends App {
      val startTime = System.currentTimeMillis
      //val start = 0
      val nrOfElements = 1000 * 1000 * 100
      def calcPi(i: Int, acc: Double): Double = {
        if (i == 0)
          acc + 4.0
        else
          calcPi(i - 1, acc + (4.0 * (1 - (i % 2) * 2) / (2 * i + 1)))
      }
      val pi = calcPi(nrOfElements, 0)
      println((System.currentTimeMillis - startTime) + " ms")
      println(pi)
    }
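
The Scala compiler turns a self-recursive tail call like this into a loop, so no function object is called per iteration. A rough hand-written Java equivalent of calcPi might look like the sketch below; it is my own illustration of the idea, not decompiled compiler output, and the class name is made up.

```java
// Hypothetical loop form of the tail-recursive calcPi from the comment.
class TailRecursionSketch {
    static double calcPi(int i, double acc) {
        while (true) {
            if (i == 0) {
                return acc + 4.0; // the i = 0 term of the series is 4.0
            }
            // The recursive call becomes an update of the two parameters.
            acc = acc + (4.0 * (1 - (i % 2) * 2) / (2 * i + 1));
            i = i - 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(calcPi(1000 * 1000 * 100, 0));
    }
}
```

Note that this version adds the terms for i = nrOfElements down to 0, that is, one term more than a loop with an exclusive upper bound and in reverse order, so the printed value differs slightly from the one in the post.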
