[H-GEN] Which is better?

Russell Stuart russell at stuart.id.au
Thu Apr 29 21:58:59 EDT 2004


On Thu, 2004-04-29 at 17:44, Andrae Muys wrote:
> I boggle that you have the chutzpah to suggest that the very subject of 
> our disagreement is 'not contentious'.  Doubly so given I have already 
> posted one article stating the exact opposite.

Yes.  In retrospect it was not the brightest thing to say.

> while correct, has *nothing* to do with the use of heap allocation 
> inside the loop!  In the same manner, suggesting that doing extensive 
> memory-copying, or string-parsing (as per your two choices of example) 
> can be expensive provides *no* information regarding the relative cost 
> of a single heap allocation.[2]

OK.  Do I detect some common ground here?  I thought you were opposed to
doing any optimisation as you write the code.  Perhaps it's just where
we place the bar that differs.

1.  At the bottom level, we both agree that Harry's for loop
    optimisation is not a coding "template" worth adopting.

2.  If I see memory allocation / object creation happening, I
    start to ask myself whether it is really necessary.  You don't.

3.  We both agree that if we see the equivalent of
    object.sleep(10 * 1000), in the middle of what looks to
    be a commonly executed piece of code (eg in an inner
    loop), we would stop coding and think for a bit.

> > It does seem from the reactions I get on this list that to suggest
> > someone should write reasonably optimal code is not politically correct
> > in programming circles.
> 
> Ahh, so we finally reduce to ad hominem attacks.  This isn't a matter of 

No, it's not.  It is an observation about the posts on this thread so
far.  Up to this point, I took it that everyone who has posted to the
thread more than once is opposed to optimising code as you write it. 
Apart from me that is.  I was beginning to feel lonely.

> Java is designed and implemented to encourage and 
> support the extensive use of heap allocation.  As a direct and 
> deliberate result of this design decision on the part of Gosling and 
> Steele, heap allocation simply is not sufficiently expensive to bother 
> with until a profiler flags it.

I don't agree with "Java is designed and implemented to encourage ..". 
As far as I recall, that is not the case.  Java was designed to be a
much safer language to write code in than what existed at the time
(assembler, C, etc).  In order to achieve that they had to ensure you
couldn't corrupt memory or damage the integrity of the VM.  Garbage
collection was one thing they needed to achieve that.

If you can find some comments by Java's designers, written around the
time its name changed from Oak to Java, that back up your statement, I
would like to see them.

> [2] I have attached a benchmark demonstrating this.  My measurements 
> indicate that a favourable comparison between string-append and heap 
> allocation shows a difference >5x.  This is without any intervening 
> operations polluting the cache.  I don't currently have the profiling 
> tools available to do a better job; but I would expect a fairer 
> comparison to favour heap allocation even more.

Egads!  What machine did you run it on?  When I ran it under 1.4.1, and
under 1.5.0 on RH 8.9 I got this:
  rstuart at ras:~$ /usr/java/j2sdk1.5.0/bin/java -cp . Looping
  Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
All very odd.  It looks like no more than 30Mb is being allocated, on a
machine that has 512Mb of physical memory.  On my machine, a
malloc(2Gb) succeeds.

Still, with a slightly modified version of your program I get similar
results.  Which is to say, memory allocation seems to be faster than
copying 16 bytes in Java.  In fact not just faster, but 5 times faster! 
To say that I find that amazing is an understatement - even if the GC is
not doing anything.  It almost defies belief.

I couldn't resist trying to find out why that may be so.  The original
figures I get are:
  Time spent in StringConcat = 318
  Time spent in MemoryAlloc = 117
  Time spent in StringConcat = 333
  Time spent in MemoryAlloc = 103
I then replaced the line: "str = new String(arg);" with "str = arg;",
and re-ran it.  The idea was to isolate the overhead of the test
harness.  The result was:
  Time spent in StringConcat = 310
  Time spent in MemoryAlloc = 32
  Time spent in StringConcat = 333
  Time spent in MemoryAlloc = 31
So, it appears the "new String()" operation takes about 80 somethings. 
Since the test is repeated 1M times, and the units are msecs, the answer
is 80 nanoseconds.  That equates to roughly 150-300 instructions on my
laptop.  Very impressive.  On the other hand it is taking about three
times that to do the equivalent of "strcpy" 16 characters.  Not so
impressive.  Neither is the fact that the VM kept running out of
memory.  I never did get it to work under 1.5.0 without reducing "loop"
by a factor of 10.
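For reference, here is a minimal sketch of the kind of harness described
above.  The original Looping program isn't reproduced in this thread, so
the class and method names below are my own invention; only the shape of
the test (1M iterations, "new String(arg)" versus "str = arg") comes
from the discussion:

```java
// AllocTiming: a hypothetical reconstruction of the harness discussed
// above.  It times a loop of "new String(arg)" against a loop of plain
// reference assignment; the difference approximates the per-iteration
// cost of the allocation-plus-copy.
public class AllocTiming {
    static final int LOOP = 1000000;   // 1M iterations, as in the text

    static long timeNewString(String arg) {
        String str = null;
        long start = System.currentTimeMillis();
        for (int i = 0; i < LOOP; i++) {
            str = new String(arg);     // heap allocation plus a 16-char copy
        }
        long elapsed = System.currentTimeMillis() - start;
        if (str == null) throw new AssertionError();  // keep str live
        return elapsed;
    }

    static long timeAssign(String arg) {
        String str = null;
        long start = System.currentTimeMillis();
        for (int i = 0; i < LOOP; i++) {
            str = arg;                 // test-harness overhead only
        }
        long elapsed = System.currentTimeMillis() - start;
        if (str == null) throw new AssertionError();
        return elapsed;
    }

    public static void main(String[] args) {
        String arg = "0123456789abcdef";   // 16 characters
        System.out.println("new String(arg): " + timeNewString(arg) + " ms");
        System.out.println("str = arg:       " + timeAssign(arg) + " ms");
    }
}
```

Subtracting the second time from the first and dividing by the loop
count gives the per-allocation cost; since the timer has millisecond
granularity, the loop has to be large for the figure to mean anything.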

As I ran these tests, I became increasingly sceptical of Adrian's claims
about the effectiveness of Java's JIT.  I couldn't see any signs of it
"kicking in" anywhere.  That was until I tried to time how long a method
call takes.  There is no doubt about it: something is aggressively
inlining small functions in the latest JVMs, even doing it for
recursive calls.  1.4 and 1.5 are a factor of _30_ faster than the same
code under 1.2.  When I did manage to fool it, it appears a method call
takes 10-20ns.  In other words memory allocation is about 2-4 times
slower than a method call with no arguments.
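The shape of the inlining test is roughly this (again, the names are
hypothetical; the original code isn't in the thread).  On a JIT that
inlines bump(), the two loops should time about the same, which is why
fooling the inliner takes effort:

```java
// CallTiming: sketch of the method-call timing test.  Compares a loop
// that goes through a tiny method with a loop doing the same statement
// inline; a JIT that inlines bump() makes the two nearly identical.
public class CallTiming {
    static final int LOOP = 10000000;  // 10M iterations
    static int counter;

    static void bump() { counter += 1; }   // trivially inlinable

    public static void main(String[] args) {
        long t0 = System.currentTimeMillis();
        for (int i = 0; i < LOOP; i++) bump();
        long viaCall = System.currentTimeMillis() - t0;

        t0 = System.currentTimeMillis();
        for (int i = 0; i < LOOP; i++) counter += 1;
        long inline = System.currentTimeMillis() - t0;

        System.out.println("via method call: " + viaCall + " ms");
        System.out.println("inline:          " + inline + " ms");
    }
}
```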

All said and done your little program has made a powerful impression on
me.  After playing with it, it appears that the "new" operator takes
about 40ns on my machine.  "i += 1" takes about 8ns (also not so
impressive).  So a new is equivalent to about 5 small Java statements. 
Not worth worrying about in all but the most critical paths.

This explains why StringBuffer.append() is slow in comparison.  It
executes quite a few Java statements, including a couple of method
calls, whereas the String constructor executes two small Java
statements.  It takes quite a while to execute all those statements.
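To see why an append costs several statements, here is a toy sketch of
the work a growable buffer roughly has to do per call.  This is my own
illustration, not the real library source:

```java
// MiniBuffer: an illustrative (not real) growable string buffer,
// showing the capacity check, occasional reallocation-and-copy, and
// character copy that every append must perform.
public class MiniBuffer {
    private char[] value = new char[16];
    private int count;

    public MiniBuffer append(String s) {
        int len = s.length();
        if (count + len > value.length) {          // capacity check
            char[] bigger = new char[(value.length + len) * 2];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;                        // reallocate and copy
        }
        s.getChars(0, len, value, count);          // copy the new chars in
        count += len;
        return this;
    }

    public String toString() {
        return new String(value, 0, count);        // final copy out
    }
}
```

Compare that with the String(String) constructor, which only has to
allocate once and copy a known, fixed number of characters.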

All this leads me to a different view of Java from what I had before.

  1.  Previously, I believed that straight line Java code ran about as
      fast as C.  I got this impression from background reading - it is
      not an uncommon claim.  It is wrong.  Straight-line Java appears
      to be 5-10 times slower than C.

  2.  Previously, I believed that heap allocation was expensive under
      Java, speed wise.  It was true for C, and I couldn't see why it
      wouldn't be true for Java.  This is wrong.  The speed of the new
      operator is comparable to other operators in Java.  Thinking about
      it, this is not because heap allocation is much faster in Java.
      It is because everything else is slower.

  3.  The one exception to all this is that small method calls in Java
      (or at least on JVM's after 1.2) are free.  The more I think about
      this, the more important it seems.  OO programs, and particularly
      Java ones, seem to have lots of small methods.  It means this
      style of programming has essentially zero cost.  Marvellous!

Well, I guess I must thank you for forcing me to look at my beliefs more
closely, Andrae.  I use Java and C# frequently now, and evidently my
understanding of them had some large gaps.

On the issue of the speed of manual versus automatic memory allocation,
we continue to differ.  I continue to believe that while malloc and
modern garbage collectors are comparable in resource requirements[1],
the ability of a C/C++ program to do stack-based allocation for most
objects blows away a program that is forced into heap-based allocation
for almost everything.

--

[1] A year or two ago I read that paper you posted a reference to
    earlier.  It seemed a reasonable paper to me.  The gut feeling I 
    came away with was that, very roughly:

       2 * (malloc resource usage) == (gc resource usage)

    where resource usage is defined as (memory usage * time).  The
    one advantage of GC is that you get to bias that equation to
    favour time or memory usage, something which is not easily done
    when doing things manually.  On the other hand manual memory
    allocation can be made to work in "real time", something which GC
    doesn't do well.

    But this is only a very rough guide.  Different programs exhibit
    very different behaviour, so much so that manual heap management
    will sometimes be less frugal than GC.  With that in mind, the
    best you can say is that they are comparable.

    If someone can demonstrate that this is no longer the case, I
    would very much like to hear it.  I regard keeping abreast of
    these things as an important part of my profession.





More information about the General mailing list