On Sat, Dec 12, 2009 at 1:23 AM, Aristedes Maniatis <ar..aniatis.org> wrote:
> On 12/12/09 3:54 PM, Michael Gentry wrote:
>
>> I created a brand new DC in each thread, got the system time, ran the
>> query (and possibly committed a change), then got the system time
>> again to compute the difference.
>
> OK. So even though you aren't directly counting the time spent creating
> contexts, since your test is threaded you are implicitly counting that time.
> Say (I don't know) that one ORM locks around creating new contexts; then you
> are incorporating that into your timing results.
Good point.
> The problem with a test like this is that it doesn't always reproduce how
> this works in real life. Would you be creating lots of new contexts
> concurrently in your real application?
I'm well aware that the test isn't exactly realistic; I was just a
little stunned at how it actually fared. We most likely will not create
a lot of new contexts concurrently, but we were trying to stress
it for a more worst-case, spike-type situation.
>> So the numbers are the time it takes
>> the thread to complete, for the most part (I kick them off at the same
>> time -- at least as closely as I can).
>
> You should be recording and reporting the start and end times of each thread
> in order to understand if locking is causing some to wait. If you graphed
> them on a little Gantt-style chart, then that would become apparent. If
> you've ever used Firebug's network analysis, you'll know what I mean. When
> did the thread start? When did you get a new context returned? How long did
> the fetch take? Then the commit?
I can make that change on Monday and try again.
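Something like this, maybe -- a rough sketch, with the field names and
output format mine rather than from the actual test:

    // Record a timestamp at each phase (thread start, new context,
    // fetch, commit) so the threads can be laid out on a Gantt-style
    // chart afterwards.
    public void run() {
        long spawned = System.currentTimeMillis();
        DataContext dc = DataContext.createDataContext();
        long contextReady = System.currentTimeMillis();

        // exp/query built as in the pseudocode at the end of this mail
        List<Idea> list = dc.performQuery(query);
        long fetched = System.currentTimeMillis();

        list.get(0).setName(UUID.randomUUID().toString());
        dc.commitChanges();
        long committed = System.currentTimeMillis();

        System.out.println(Thread.currentThread().getName()
                + " start=" + spawned
                + " newContext=" + (contextReady - spawned) + "ms"
                + " fetch=" + (fetched - contextReady) + "ms"
                + " commit=" + (committed - fetched) + "ms");
    }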
>> My guess is it is
>> locking/concurrency, but I didn't have time to delve into it any
>> deeper (was the end of the day). When I ran RetrieveIndividual20 with
>> 5 threads instead of 20 the results were MUCH closer to the Hibernate
>> values (sometimes better). (Keep in mind both ORMs had a limit of 5
>> open DB channels.)
>
> There are so many things going on here, it is hard to isolate the
> differences. Perhaps you need to fire up a profiler and see where it is
> spending most of its time. Is there a big impact going from a thread count
> equal to or less than the number of db channels to one more than that?
Good idea. I'll try that one, too.
>>> * There is so much variance between runs that something is odd here. What
>>> are you doing between runs to clear the cache? How are you dealing with
>>> variability at the database? Is the database running on another server
>>> which
>>> is unloaded and isn't itself interfering with the results?
>>
>> I ran them several times, and each run gave similar numbers. The ones
>> in the spreadsheet are just from the last run I did, but they were all
>> similar from run to run.
>
> Ah, I thought these were the results of lots of runs. But you are showing
> the time from each thread in a single run. In that case the wild variance is
> definitely the result of some sort of locking. Can you compare CPU usage
> between the two ORMs? That would give you some idea of whether the
> time is spent waiting in a lock or doing real work.
I didn't think to watch the CPU, but I'll try that also on Monday.
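One cheap comparison I could add: wall-clock time vs. per-thread CPU
time via ThreadMXBean. This is just a sketch, and it assumes the JVM
supports thread CPU timing (isCurrentThreadCpuTimeSupported() is true):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long wallStart = System.currentTimeMillis();
    long cpuStart = mx.getCurrentThreadCpuTime(); // nanoseconds
    // ... run the fetch and commit here ...
    long cpuMs = (mx.getCurrentThreadCpuTime() - cpuStart) / 1000000;
    long wallMs = System.currentTimeMillis() - wallStart;
    // High wall time with low CPU time suggests waiting on a lock.
    System.out.println("wall=" + wallMs + "ms cpu=" + cpuMs + "ms");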
>> The DB was on a different server, which is
>> another reason I did several runs. I expected some variance, but
>> the general pattern was the same. I don't have the QueryLogger
>> results here, but the DB times were significantly faster, so the time
>> really was being spent on the local Java side.
>>
>>> * The huge variability makes me think that memory pressures might be
>>> important. Are you seeing the JVM garbage collect during the middle of
>>> some
>>> runs? Can you run jstat and see if that is interfering?
>>
>> Didn't have time to get into GC. The machine I was on had about 5 GB
>> of RAM free and I had allocated plenty to the JVM. Also, fetching and
>> updating 20 records in the DB shouldn't be that memory intensive.
>
> That isn't really the point. No matter how much RAM you have, Java may
> still perform GC, depending on your test application, how much stack/heap
> you gave to the JVM, and how long the application has been running.
Well, this test application hadn't been running very long. I did give
it more RAM than the default, but I can try tweaking it again.
>>> * Are you able to pick the worst case (update) and time the various
>>> parts:
>>> create context, database UPDATE, commit finished... In your tests are you
>>> timing creating the context or is that outside the timing loop?
>>
>> I was creating the DC outside of the timing.
>
> Sort of, but not really. If you want to eliminate that from your
> measurement, you need to spawn 20 threads, have them all create contexts,
> fetch the data and then sleep. Then all at once you can wake them to perform
> an update concurrently and you'll be measuring only the update.
That's something else I can try, too.
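Sketching your suggestion with a CountDownLatch (the names are mine;
import java.util.concurrent.CountDownLatch):

    // Shared gate, created in the main thread:
    final CountDownLatch go = new CountDownLatch(1);

    // In each worker's run(): create the context, fetch, and stage the
    // change up front, then park on the latch so only the commit is timed.
    DataContext dc = DataContext.createDataContext();
    List<Idea> list = dc.performQuery(query);
    list.get(0).setName(UUID.randomUUID().toString());
    try {
        go.await(); // sleep until the main thread releases everyone
    } catch (InterruptedException e) {
        return;
    }
    long start = System.currentTimeMillis();
    dc.commitChanges();
    System.out.println(System.currentTimeMillis() - start);

    // In the main thread, once all 20 workers have fetched and are
    // waiting (a second "ready" latch could signal that):
    go.countDown();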
> None of this is much use to tell your team how well Cayenne might work in
> your application in the real world since the tests don't really resemble how
> most ORMs are used. However if you are trying to isolate a specific thing to
> measure, it might be useful.
I completely understand the example isn't as real-world as desired.
We were just trying to get a quick gauge. (We spent less than a day on
the Hibernate and Cayenne examples combined.) That one particular
example was so far off (over 10x slower than) the Hibernate version
that it was disturbing, so I thought I'd ask for ideas (which you've
given me -- thanks!).
I suspect that if, instead of 20 concurrent threads, I used 20
sequential threads, the whole thing would run significantly quicker,
but that's something else to try on Monday.
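Something like this in the driver (a sketch; the constructor args are
elided, and the surrounding method would need to declare
InterruptedException for join()):

    // Run the same workers one at a time instead of all at once.
    for (int i = 0; i < 20; i++) {
        Thread t = new UpdateIdea20(/* ... */);
        t.start();
        t.join(); // wait for this worker before starting the next
    }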
FWIW, here is the pseudocode for the test:
UpdateIdea20's run() method (subclass of Thread):

    // The context is created outside the timed section.
    DataContext dc = DataContext.createDataContext();
    long start = System.currentTimeMillis();
    Expression exp = ExpressionFactory.matchExp(key, value);
    SelectQuery query = new SelectQuery(Idea.class, exp);
    List<Idea> list = dc.performQuery(query);
    Idea idea = list.get(0);
    idea.setName(UUID.randomUUID().toString()); // any random string will do
    dc.commitChanges();
    long end = System.currentTimeMillis();
    System.out.println(end - start);
Thanks again!
mrg