Hi Andrus,
meanwhile I tried the batch-commit feature in
the current CVS - it seems to work as expected (good job!).
(It took me some time to figure out
that I have to use "commit" instead
of "commitChanges" in order to make use
of it :-) )
> Hi Arndt,
>
> Let's see how Cayenne can address different issues here.
>
> 1. Reading. In fact Cayenne is already optimized pretty well for batch
> reading:
>
> http://objectstyle.org/cayenne/userguide/perform/index.html#iterator
>
> Using these features instead of raw JDBC has an obvious advantage of
> reusing all the mapping info you created.
I understand that, but to get the pipe clean and the last bottleneck
out of the way, I also have to do the loading of relations in a
batch-like fashion. On the SQL level that means, e.g. for a ToMany
relation, replacing the per-object query:
SELECT ... FROM painting t0 WHERE t0.gallery_id = ?
with the corresponding query for a set of objects simultaneously:
SELECT ... FROM painting t0 WHERE t0.gallery_id IN ( ?, ?, ?, ...)
That's a big boost (at least on Oracle) in the
case where "ToMany" usually means "to-very-few" (1,2,3,...),
as in our case.
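
A minimal sketch of how the batched IN query above could be assembled by hand.
InClauseBatcher and its methods are hypothetical names, not part of Cayenne,
and "SELECT *" stands in for the elided column list. The chunking is there
because Oracle rejects IN lists with more than 1000 expressions (ORA-01795).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Cayenne: builds batched IN queries.
final class InClauseBatcher {
    // Oracle limit on the number of expressions in one IN list.
    static final int MAX_IN_SIZE = 1000;

    /** Builds a SELECT with `count` JDBC placeholders in the IN list. */
    static String buildSql(String table, String fkColumn, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(count, "?"));
        return "SELECT * FROM " + table + " t0 WHERE t0." + fkColumn
                + " IN (" + placeholders + ")";
    }

    /** Splits the key list into chunks of at most chunkSize elements. */
    static <T> List<List<T>> chunk(List<T> ids, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(i, end)));
        }
        return chunks;
    }
}
```

Each chunk would then be bound to one PreparedStatement, so N parent objects
cost roughly N / MAX_IN_SIZE round trips instead of N.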
I have no clear idea how such a feature could be
integrated into the API; one possibility would be
to have a method on DataContext:
resolveRelationsForObjects( List dataObjects, String relName );
to trigger the relation-resolving for a list of
objects of the same type.
(I understand that this is nothing for your 1.0 version,
but maybe you can give me a hint how to add it as
a quick&dirty patch...)
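
A rough sketch of the in-memory half of such a resolving step: once a single
IN query has returned the rows for all parents, they are distributed to their
owners by foreign key. Gallery, Painting and RelationResolver are hypothetical
stand-ins for illustration only; they are not Cayenne classes and this is not
how DataContext is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mapped "to-many" target object.
final class Painting {
    final int id;
    final int galleryId; // foreign key to the owning gallery

    Painting(int id, int galleryId) {
        this.id = id;
        this.galleryId = galleryId;
    }
}

// Hypothetical stand-in for the relationship owner.
final class Gallery {
    final int id;
    final List<Painting> paintings = new ArrayList<>();

    Gallery(int id) {
        this.id = id;
    }
}

final class RelationResolver {
    /** Assigns rows fetched by one batched query to their owning galleries. */
    static void resolve(List<Gallery> galleries, List<Painting> fetched) {
        Map<Integer, Gallery> byId = new HashMap<>();
        for (Gallery g : galleries) {
            g.paintings.clear(); // relationship is rebuilt from the fetched rows
            byId.put(g.id, g);
        }
        for (Painting p : fetched) {
            Gallery owner = byId.get(p.galleryId);
            if (owner != null) { // rows for unknown owners are ignored
                owner.paintings.add(p);
            }
        }
    }
}
```

Galleries with no matching rows end up with an empty list, which is the
resolved state for "no related objects" rather than an unresolved fault.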
>
> 2. Batch commits. We discussed that - it should be done by Beta.
See above.
> 3. Maintaining low memory footprint. As mentioned earlier,
> simply throwing away the whole DataContext after each
> commit will not be a good solution, since you mentioned
> around 10000 objects that are shared between the batches.
> So this is the area that will need special handling in
> Cayenne. I can see a few ways to handle that:
>
> a. complete custom handling of ObjectStore cleanup
> after commit. You can create custom code to remove
> some entities from cache, and to preserve some other.
>
> b. generic solution: having a special "shared" context
> (EOF people, think EOSharedEditingContext), which is
> not a *parent*, but rather a *peer* of any other
> DataContexts. SharedDataContext will probably be
> read-only (but doesn't have to be). Its important
> property is that all objects it contains are "shared"
> and can be accessed from other DataContexts by reference
> (not by copy like TopLink UnitOfWork does) as if they
> were local. It also means that local objects can have
> relationships to objects in the SharedDataContext (but
> not the other way around).
>
> With this you can simply throw away an instance of
> DataContext after each commit, creating a new one
> (DataContext by itself is very lightweight, before
> its cache gets filled in). At the same time "shared"
> DataContext will stay around, so you won't need to
> refetch reusable data, and memory footprint will
> stay constant.
>
> I really like (b) - the idea of cleanly separating
> "configuration" immutable objects from the objects
> being modified, but still maintaining the same object
> graph. Unfortunately this is not planned for 1.0 and
> will probably be included in the later releases.
I agree that having 2 dedicated "levels" of data-contexts
(shared-config/worker-context) would be sufficient, but
I cannot see that this is really simpler than a real
recursive solution of nested data-contexts. The API
would simply be:
DataContext subContext = ctxt.createSubContext();
...
subContext.query(...);
...
subContext.commit(...);
subContext = null;
I've no real idea of how difficult that is to implement,
but I can try to dig into it...
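
To make the nested-context idea concrete, here is a toy model under heavy
assumptions: SketchContext is a hypothetical class, objects are reduced to
key/value pairs, and nothing here reflects the real DataContext internals.
It only shows the two properties discussed above: a sub-context reads through
to its parent, and a commit pushes local changes one level up so that the
sub-context can be thrown away afterwards.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of nested contexts; hypothetical, not the Cayenne DataContext API.
final class SketchContext {
    private final SketchContext parent; // null for the root context
    private final Map<String, Object> local = new HashMap<>();

    SketchContext(SketchContext parent) {
        this.parent = parent;
    }

    SketchContext createSubContext() {
        return new SketchContext(this);
    }

    /** Looks in the local cache first, then falls back to the parent chain. */
    Object get(String key) {
        if (local.containsKey(key)) {
            return local.get(key);
        }
        return parent != null ? parent.get(key) : null;
    }

    void put(String key, Object value) {
        local.put(key, value);
    }

    /** Pushes local changes to the parent and empties the local cache. */
    void commit() {
        if (parent == null) {
            return; // a real root context would write to the database here
        }
        parent.local.putAll(local);
        local.clear();
    }

    int localSize() {
        return local.size();
    }
}
```

In this model the long-lived shared objects sit in the root, each batch works
in a fresh sub-context, and the sub-context's memory is released after commit.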
with best regards,
Arndt
--
Dr. Arndt Brenschede
DIAMOS AG
Innovapark Am Limespark 2
65843 Sulzbach
Tel.: +49 (0) 61 96 - 65 06 - 134
Fax: +49 (0) 61 96 - 65 06 - 100
mobile: +49 (0) 151 151 36 134
mailto:arndt.brensched..iamos.com
http://www.diamos.com