Re: Modified but not modified

From: Holger Hoffstätte (holge..izards.de)
Date: Tue Feb 11 2003 - 21:31:30 EST

    OK guys, you asked for it, I can't sleep and the bottle of Copperidge
    Cabernet Sauvignon next to me is quite good, so here goes. :-)

    Craig Miskell wrote:
    > Timings are as follows
    > Type      Current code   "Equals", same obj   Equals alt obj
    > String    6594           7634                 16164
    > Double    6793           7972                 51223
    > Float     6794           7963                 48169
    > Integer   6794           7973                 17367
    > Date      6792           7971                 9883
    > Boolean   6791           7967                 17124

    Are you by any chance running JDK 1.3.1? There's no way I can reproduce
    the first column with the client VM on Windows with my 2.4 GHz machine.
    JDK 1.4.1 is known to have slower HashMap performance, and it shows.

    Anyway, here are some more results with your test. I was busy. Your test
    was basically OK (allocation-free was my main concern) but I added some
    more loops & waits before the main one in order to give HotSpot a chance
    to get comfy. Btw, measuring with OptimizeIt was pointless since the
    profiling overhead ate up all the finer performance differences of the
    server VM. In the end it told me that writeProperties() was the
    bottleneck. Duh.
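
    For illustration, the warm-up described above could be sketched roughly
    like this. This is not Craig's test: the class name, the writeProperty()
    stand-in, the round counts and the sleep time are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical harness; none of these names come from Cayenne or from the
// original test. It only shows the "extra loops & waits before the main
// run" idea.
class WarmupSketch {
    static final Map<String, Object> values = new HashMap<String, Object>();

    // Stand-in for the code under test: one allocation-free map write.
    static void writeProperty(String name, Object value) {
        values.put(name, value);
    }

    // Runs the workload once and returns elapsed wall-clock milliseconds.
    static long timeRun(int iterations, Object value) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            writeProperty("attr", value);
        }
        return System.currentTimeMillis() - start;
    }

    // Executes a few discarded warm-up rounds with pauses in between,
    // then returns the time of the one measured round.
    static long measure(int warmupRounds, int iterations, Object value) {
        for (int i = 0; i < warmupRounds; i++) {
            timeRun(iterations, value);          // result thrown away
            try {
                Thread.sleep(100);               // let background compilation settle
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return timeRun(iterations, value);
    }

    public static void main(String[] args) {
        Object value = "some value";
        System.out.println("Value class " + value.getClass().getName()
                + " takes " + measure(3, 1000000, value));
    }
}
```

    The value object is created once outside the loop, so the timed section
    stays allocation-free.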

    Test environment: P4 2.4GHz, j2sdk 1.4.1_01

    1) client VM, original writeProps():
    Value class java.lang.String takes 10000
    Value class java.lang.Double takes 10000
    Value class java.lang.Float takes 10016
    Value class java.lang.Integer takes 10203
    Value class java.util.Date takes 10390

    I attribute this solely to the JDK 1.4 performance degradation in
    HashMap, which has been noted elsewhere too.

    2) server VM, original writeProps():
    Value class java.lang.String takes 3719
    Value class java.lang.Double takes 3765
    Value class java.lang.Float takes 3703
    Value class java.lang.Integer takes 3719
    Value class java.util.Date takes 3703
    Value class java.lang.Boolean takes 3719

    Now look who's cookin'! This is what we should measure against, and we
    will.

    3) client VM, CM writeProp, alternating values:
    Value class java.lang.String takes 24047
    Value class java.lang.Double takes 44390
    Value class java.lang.Float takes 45422
    Value class java.lang.Integer takes 24094
    Value class java.util.Date takes 10781
    Value class java.lang.Boolean takes 24390

    This is Craig's version with a custom implementation of writeProperty()
    that checks for identity and then equals(). It also carries the cost of a
    Map.get() hit to retrieve the old object so that it can be compared.
    This hit accounts for approx. 25% of the cost of writeProperty().
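
    A minimal sketch of such a writeProperty(), assuming a plain HashMap for
    the properties and two made-up state constants. This is not Craig's
    actual code or the real CayenneDataObject, just the shape of the check:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a data object; field and constant names are
// assumptions made for this example.
class DataObjectSketch {
    static final int COMMITTED = 0;
    static final int MODIFIED = 1;

    final Map<String, Object> props = new HashMap<String, Object>();
    int persistenceState = COMMITTED;

    void writeProperty(String name, Object newValue) {
        Object oldValue = props.get(name);      // the extra Map.get() hit

        if (oldValue == newValue) {
            return;                             // identity: same object (or both null)
        }
        if (oldValue != null && oldValue.equals(newValue)) {
            return;                             // equality: different object, same value
        }

        // Only a real change marks the object as modified.
        if (persistenceState == COMMITTED) {
            persistenceState = MODIFIED;
        }
        props.put(name, newValue);
    }
}
```

    Writing an equal but distinct value leaves the state untouched; only a
    value that fails both checks pays for the put().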

    4) server VM, CM writeProp, alternating values:
    Value class java.lang.String takes 9187
    Value class java.lang.Double takes 34602
    Value class java.lang.Float takes 31687
    Value class java.lang.Integer takes 9047
    Value class java.util.Date takes 4391
    Value class java.lang.Boolean takes 9047

    Already a different picture compared to 1/2: while -server still yields a
    huge speedup, the equals() calls come into full effect. Double/Float
    require a native hit (the double value is bitfiddled into a long, the
    float value into an int, by the VM). This is still a very expensive
    operation, and it shows. Frankly I'm surprised by this since 1.4 was
    supposed to speed up JNI calls by a huge margin; I guess it was even
    worse before. The consistently good comparison speed of Date is a
    complete mystery to me: like the other value classes, its equals() does
    an instanceof check and compares two primitives. The only difference is
    that the Numbers use an if/then and Date uses a combined (&&) condition,
    but this is not the reason. I wrote my own Boolean class and checked
    exactly that - no difference. Very interesting, very weird.
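
    For reference, the documented contracts of Double.equals() and
    Date.equals() can be written out by hand as below. The class and method
    names are made up, and this only shows what gets compared, not how any
    particular JDK implements it or where the time goes:

```java
import java.util.Date;

// Hand-written equivalents of the documented equals() contracts.
// Illustrative only; not copied from the JDK sources.
class EqualsCostSketch {

    // Double.equals(): true iff both values have the same
    // Double.doubleToLongBits() representation.
    static boolean doubleEquals(Double a, Object b) {
        return (b instanceof Double)
                && Double.doubleToLongBits(a.doubleValue())
                   == Double.doubleToLongBits(((Double) b).doubleValue());
    }

    // Date.equals(): true iff both dates have the same getTime() value.
    static boolean dateEquals(Date a, Object b) {
        return (b instanceof Date) && a.getTime() == ((Date) b).getTime();
    }
}
```

    The bit-level comparison is why two NaN Doubles are equal to each other
    while 0.0 and -0.0 are not, unlike with the == operator on primitives.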

    5) server VM, CM writeProp, no (if persistenceState..):
    Value class java.lang.String takes 11188
    Value class java.lang.Double takes 36046
    Value class java.lang.Float takes 32688
    Value class java.lang.Integer takes 9312
    Value class java.util.Date takes 4500
    Value class java.lang.Boolean takes 9391

    Just for kicks I tried to remove the if/then clause that checks for the
    object's persistence state and simply always set it. The numbers are
    slightly worse than in 4). Apparently, even at 2.4 GHz writing instance
    variables carries a certain cost; it probably throws the 1st level cache
    off for the write. Good to know that branch prediction actually works.

    6) server VM, custom HMap, orig. writeProp, same values, no identity check
    in put():
    Value class java.lang.String takes 5750
    Value class java.lang.Double takes 28922
    Value class java.lang.Float takes 21187
    Value class java.lang.Integer takes 5656
    Value class java.util.Date takes 5750
    Value class java.lang.Boolean takes 5594

    I realized that obviously paying for a Map.get(), equals() and a Map.put()
    is not very efficient. So I jumped into the lair of the JDK dragon,
    modified put() to do the comparison, and ventured on.

    7) server VM, custom HMap, orig. writeProp, same values, identity check in
    put():
    Value class java.lang.String takes 4781
    Value class java.lang.Double takes 4922
    Value class java.lang.Float takes 4875
    Value class java.lang.Integer takes 4906
    Value class java.util.Date takes 4860
    Value class java.lang.Boolean takes 4875

    I found the sword. +1 for me :-)

    8) server VM, custom HMap, orig. writeProp, alternating values, identity
    check in put():
    Value class java.lang.String takes 5906
    Value class java.lang.Double takes 30672
    Value class java.lang.Float takes 28484
    Value class java.lang.Integer takes 5813
    Value class java.util.Date takes 5562
    Value class java.lang.Boolean takes 5735

    Unfortunately, on my way out I found that even I can't fight the
    Double/Float native hit. +1 for the dragon. Good thing that Double/Float
    are pretty rare beasts.
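
    For the record, the idea behind the modified put() might look roughly
    like the sketch below. This is not the patched JDK HashMap used in runs
    6 to 8; it is a tiny fixed-size map with invented names that only shows
    where the identity and equality checks sit:

```java
// Minimal chained hash map whose put() reports whether the stored value
// actually changed. Hypothetical illustration: String keys only, keys must
// be non-null, no resizing, no removal.
class ComparingMapSketch {

    private static final class Entry {
        final String key;
        Object value;
        final Entry next;

        Entry(String key, Object value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] buckets = new Entry[16];

    private int indexFor(String key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    // Stores the value and returns true only if it differs from the old one.
    boolean put(String key, Object value) {
        int index = indexFor(key);
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                Object old = e.value;
                if (old == value) {
                    return false;               // identity check: nothing to do
                }
                if (old != null && old.equals(value)) {
                    return false;               // equal value: keep the old object
                }
                e.value = value;
                return true;
            }
        }
        buckets[index] = new Entry(key, value, buckets[index]);
        return true;                            // new key counts as a change
    }

    Object get(String key) {
        for (Entry e = buckets[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

    With this shape the caller gets lookup, comparison and store in a single
    bucket walk, and can use the boolean result to decide whether the object
    needs to be marked as modified.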

       THE END

    Well, not really. I guess what we can see is that a custom Map
    implementation for CayenneDataObject might be a Good Thing and that we
    could consider this at a later time. It wouldn't affect any outside
    clients (maybe except for serialization), but could not only solve the
    equality problem 'correctly' but also be helpful for DataObject
    population: we could do something faster than repeatedly calling
    writeProperty() for each individual attribute.

    Oh, and I'd like to talk to anybody who actually modifies a hundred
    million+ in-memory DataObjects in a single sweep (without a single page
    fault) so that all this makes sense. I mean, I have pretty strange ideas
    (did I mention this vision I had the other night about integrating Cayenne
    with Gemstone's GemFire distributed shared memory cache?) but..like Ali G.
    sez: keep it real!

    With a vote for a simple equality check in writeProperty() and tongue
    firmly in wine glass,
    Holger


