Arndt,
Holger already commented on this and I agree with everything he said.
I personally tried using Cayenne for batch processing (BTW, this was an
evaluation for the future conversion of a big wireless messaging system
from TopLink to Cayenne). It was a very specific scenario: reading
millions of rows from Oracle, processing the info from each row and
adding the results to an object holding the totals (the business rules
were actually pretty complex, so a simple GROUP BY didn't work), and
then saving the resulting report objects (tens or hundreds, not too
many) to a different set of tables. It worked pretty well, but this was
optimized for read, not write.
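For reference, the read side of that run looked roughly like the sketch
below. The entity and column names are made up, and I am writing the
calls from memory (performIteratedQuery() handing back a ResultIterator
of data rows), so treat it as an illustration, not the exact code:

import java.util.Map;

import org.objectstyle.cayenne.access.DataContext;
import org.objectstyle.cayenne.access.ResultIterator;
import org.objectstyle.cayenne.query.SelectQuery;

public class ReportRun {

    // Streams account rows and accumulates a total without registering a
    // DataObject per source row, so memory stays flat over millions of rows.
    // "Account" and the "AMOUNT" column are invented names for illustration.
    public double sumAccounts(DataContext context) throws Exception {
        double total = 0.0;

        ResultIterator it = context.performIteratedQuery(new SelectQuery("Account"));
        try {
            while (it.hasNextRow()) {
                // rows come back as plain data row Maps, not full DataObjects
                Map row = (Map) it.nextDataRow();
                total += ((Number) row.get("AMOUNT")).doubleValue();
            }
        }
        finally {
            it.close();
        }

        // the handful of resulting report objects can then be registered
        // with the context and saved via commitChanges() as usual
        return total;
    }
}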
As Holger mentioned, we are now adding batch commits using the JDBC
batch feature - a big step in optimizing writes. Reusing
PreparedStatements is part of this effort as well. Though we already
have the backend for this implemented, it is not used by default just
yet, and the public API is still being refactored.
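To make it concrete, at the JDBC level the new backend boils down to
something like this - one PreparedStatement reused for all rows, with
addBatch()/executeBatch() flushing inserts in chunks. This is just a
sketch of the plain JDBC feature we are wrapping, not the actual Cayenne
code, and the table and columns are made up:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class JdbcBatchSketch {

    // Inserts many rows through a single reused PreparedStatement,
    // flushing every batchSize rows with JDBC batching.
    // REPORT_LINE and its columns are invented; assumes auto-commit is off.
    public static void insertReportLines(Connection con, double[] totals, int batchSize)
            throws SQLException {

        PreparedStatement ps =
            con.prepareStatement("INSERT INTO REPORT_LINE (ID, TOTAL) VALUES (?, ?)");
        try {
            for (int i = 0; i < totals.length; i++) {
                ps.setInt(1, i + 1);
                ps.setDouble(2, totals[i]);
                ps.addBatch();

                // one round trip to the database per batchSize rows
                if ((i + 1) % batchSize == 0) {
                    ps.executeBatch();
                }
            }

            // flush the last partial batch and commit
            ps.executeBatch();
            con.commit();
        }
        finally {
            ps.close();
        }
    }
}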
Since you are bringing up an interesting scenario for this new feature,
could you describe the flow some more? Are the objects mostly created in
memory and then saved? How big a transactional scope do you need? I
mean, you don't plan to keep millions of uncommitted objects in memory
at once? Or do you, and you simply write them via batch one by one and
commit after that?
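If it is the latter, I imagine the chunked flow would look something
like the sketch below - create a chunk of objects, commit it, then start
a fresh DataContext so the committed objects can be garbage-collected.
Again, this is only an illustration with made-up entity and property
names, not a finished API:

import org.objectstyle.cayenne.CayenneDataObject;
import org.objectstyle.cayenne.access.DataContext;

public class ChunkedWriteSketch {

    // Creates a large number of objects in chunks, committing each chunk and
    // then starting a fresh DataContext so the committed objects can be
    // garbage-collected instead of piling up in a single ObjectStore.
    // "Account" and its "name" property are invented for illustration.
    public static void writeInChunks(int totalRows, int chunkSize) {
        DataContext context = DataContext.createDataContext();

        for (int i = 0; i < totalRows; i++) {
            CayenneDataObject o =
                (CayenneDataObject) context.createAndRegisterNewObject("Account");
            o.writeProperty("name", "account-" + i);

            if ((i + 1) % chunkSize == 0) {
                context.commitChanges();

                // drop the context (and its ObjectStore) to release the chunk;
                // master data would have to live in a separate, longer-lived
                // context or be re-fetched
                context = DataContext.createDataContext();
            }
        }

        // commit whatever is left in the final partial chunk
        context.commitChanges();
    }
}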
Andrus
Arndt Nrenschede wrote:
> Hi,
>
> I just had a deeper look at cayenne trying
> to find out whether we could use it for
> our project (a large back-office system).
>
> I got through the examples quickly and it
> was really nice to see that everything worked
> as expected (alpha-6).
>
> So, for the interactive part of our system
> I'm sure cayenne would do the job.
>
> For the offline processing, however, we
> have high volumes (millions) and tight
> performance constraints, so
> we cannot deviate much from plain jdbc's
> performance.
>
> The features needed for that would be
> (and I think they are not implemented,
> or at least I didn't find them):
>
> - support for JDBC-batch updates during
> "commitChanges"
>
> - re-use of PreparedStatements
>
> - more detailed control of what happens
> to the identity-map (Object-Store) in
> "commitChanges"
> The behaviour we need is to fully
> release the volume data (e.g. accounts),
> thus allowing them to be garbage-collected,
> while keeping the master-data
> (e.g. fund-properties) linked to the
> identity-map.
> (would require something like nested
> DataContexts - or TopLink's "unitOfWork")
>
> Could some of the gurus tell me if you
> have plans in that direction, or if
> I just missed something?
>
> thanx in advance,
>
> Arndt
>
> PS: I also evaluated TopLink, but that failed
> because they support batch-writing, but messed
> it up, and also because their concept of
> data-comparisons (to find out what changed)
> when committing a unitOfWork turned out to
> be a cpu-time grave :-(
>
>