Re: high-volume offline processing using cayenne?

From: Arndt Brenschede (a..iamos.de)
Date: Sat Mar 01 2003 - 06:52:34 EST

    Hi Andrus,

    I got it running!

    (The proof-of-concept for the
    high volume processing)

    For the two problems

    - prefetching relations and
    - unregistering volume objects

    I did quick&dirty patches, which I append as a zip file.

    > This is such an important feature that we couldn't overlook it :-). It is
    > called "prefetching":
    >
    > http://objectstyle.org/cayenne/api/cayenne/org/objectstyle/cayenne/query/SelectQuery.html#addPrefetch(java.lang.String)
    >
    > I haven't advertised it much (and haven't mentioned it in the user guide)
    > since there are a few serious bugs in implementation. We will work on
    > fixing them either now or while in Beta.

    Ok, for the prefetching I tried the idea of explicitly calling
    a method to tell what I want to prefetch. I put these
    2 methods (for the toMany and the toOne) in a class
    PrefetchHelper (->see attachment).

      PrefetchHelper.resolveToOneRelations(
                       DataContext context,
                       List objects,
                       String relName
                     );

      PrefetchHelper.resolveToManyRelations(
                       DataContext context,
                       List objects,
                       String relName
                     );

    This is possibly less elegant than attaching the
    prefetch info to the query, but very flexible.
    (How can an iterated query deal with prefetches?)

    With these two methods I can e.g. collect 100
    objects from an iterated query and then prefetch
    their relations, which resolves into
    SELECT ... WHERE GALLERY_ID IN (?, ?, ?, ?, ... )

    (For the toMany, it works like that, but that doesn't
    work for compound keys. For the toOne, I used
    QueryHelper.selectQueryForIds(oids), which resolves
    into "... ( GALLERY_ID = ? ) OR ( GALLERY_ID = ? ) OR ... ",
    which is frightening when you see the query, but
    similar in performance.)
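
    The two query shapes mentioned above can be illustrated with a
    minimal sketch. It only builds the placeholder strings and splits
    a list into batches of e.g. 100; it does not use Cayenne's query
    API. The class name BatchClauseSketch and its methods are
    hypothetical and are not part of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: plain string building, not Cayenne's query API.
// All names in this class are hypothetical.
final class BatchClauseSketch {

    // Splits a list into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            result.add(new ArrayList<>(items.subList(i, end)));
        }
        return result;
    }

    // Builds "COLUMN IN (?, ?, ...)" with one placeholder per key.
    // This form only works for single-column keys.
    static String inClause(String column, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        return column + " IN ("
                + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }

    // Builds "( COLUMN = ? ) OR ( COLUMN = ? ) OR ..." with one term per key.
    static String orClause(String column, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        return String.join(" OR ",
                Collections.nCopies(count, "( " + column + " = ? )"));
    }
}
```

    For a batch of three keys, inClause("GALLERY_ID", 3) gives
    "GALLERY_ID IN (?, ?, ?)", while orClause produces one
    parenthesized term per key joined by OR.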

    >>I agree that having 2 dedicated "levels" of data-contexts
    >>(shared-config/worker-context) would be sufficient, but
    >>I cannot see that this is really simpler than a real
    >>recursive solution of nested data-contexts. The api
    >>would simply be:
    >>
    >>DataContext subContext = ctxt.createSubContext();
    >>...
    >>subContext .query(...);
    >>...
    >>subContext .commit(...);
    >>subContext = null;
    >>
    >>I've no real idea of how difficult that is to implement,
    >>but I can try to dig into it...
    >
    > Let us know!

    For keeping track of the objects in the context,
    I now tried the simpler solution of putting a
    second map into ObjectStore. (->see patch)
    The 2 methods

    - ObjectStore.startTrackingNewObjects();
    - ObjectStore.unregisterNewObjects();

    can be used to start filling this map
    and to unregister the objects collected
    (and the snapshots! see the bug-report...).
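
    The second-map idea can be sketched roughly as follows. This is a
    toy stand-in, not Cayenne's ObjectStore and not the attached
    patch: the class TrackingStoreSketch, its register method and its
    counters are hypothetical. It also assumes that tracking stops
    once the collected objects have been unregistered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy store; not Cayenne's ObjectStore.
final class TrackingStoreSketch {
    private final Map<Object, Object> objects = new HashMap<>();
    private final Map<Object, Object> snapshots = new HashMap<>();
    // The "second map": null while tracking is switched off.
    private Map<Object, Object> newObjects;

    // Registers an object and its snapshot under the given id.
    void register(Object id, Object object, Object snapshot) {
        boolean isNew = !objects.containsKey(id);
        objects.put(id, object);
        snapshots.put(id, snapshot);
        // Only objects first seen while tracking is on are collected.
        if (newObjects != null && isNew) {
            newObjects.put(id, object);
        }
    }

    // Starts collecting every object registered from now on.
    void startTrackingNewObjects() {
        newObjects = new HashMap<>();
    }

    // Removes the collected objects and their snapshots, then stops
    // tracking (assumption). Returns the number of objects removed.
    int unregisterNewObjects() {
        if (newObjects == null) {
            return 0;
        }
        int count = newObjects.size();
        for (Object id : newObjects.keySet()) {
            objects.remove(id);
            snapshots.remove(id);
        }
        newObjects = null;
        return count;
    }

    int objectCount() {
        return objects.size();
    }

    int snapshotCount() {
        return snapshots.size();
    }
}
```

    Objects registered before startTrackingNewObjects() (for example
    shared configuration data) stay in the store, while each batch of
    volume objects can be dropped after it has been processed.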

    That basically does it.

    With these 2 patches I get a streamlined
    processing pipeline with a throughput of many
    objects per millisecond, which is then no
    longer database-bound, but limited by
    the Java processing.

    best regards,

    Arndt




