Object Graph Refreshing: design ideas?

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Fri Dec 19 2003 - 23:48:12 EST

    One of the main 1.1 features is a distributed cache with propagation of
    changes between ObjectStores, both within the same VM and across VMs
    (http://objectstyle.org/jira/secure/ViewIssue.jspa?key=CAY-30). It is
    pretty much finished, and already works within the current design
    limitations. It has all the bells and whistles, including distributed
    updates configuration in the Modeler, delegate methods, etc.

    Now the limitations... This is something I wanted to discuss and maybe
    get input and design ideas.

    The shared cache (DataRowStore) stores DataRows (duh), i.e. maps of db
    values. Multiple ObjectStores contain DataObjects with relationships
    and share a single DataRowStore per Cayenne DataDomain. A typical
    scenario of synchronizing ObjectStores:

    1. DataContext A just finished successful commit, and asks its
    ObjectStore to post-process affected objects (e.g. change their state
    to COMMITTED)
    2. ObjectStore A does that, and pushes newly committed snapshots to the
    underlying (shared) DataRowStore.
    3. DataRowStore puts new snapshots in the cache, and generates an event
    notifying all listeners of the change. The event contains the "diffs"
    of all modified snapshots and ids of deleted ones.
    4. ObjectStore B receives the event and updates some of its objects
    whose snapshots have changed.
    5. Remote event is received in another VM by its own DataRowStore,
    ... Items 3-5 are repeated in each receiving VM...
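
    The local part of this flow (steps 2-4) can be sketched roughly as
    below. This is only an illustration under stated assumptions: RowStore,
    ContextStore, ChangeEvent and Listener are hypothetical stand-ins, not
    the actual DataRowStore, ObjectStore or SnapshotEvent classes, and rows
    are modeled as plain maps keyed by a string id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only: these are NOT the real Cayenne classes.
class SnapshotSyncSketch {

    // Event with diffs of modified snapshots and ids of deleted ones (step 3).
    static final class ChangeEvent {
        final Object source;
        final Map<String, Map<String, Object>> diffs;
        final Set<String> deletedIds;

        ChangeEvent(Object source, Map<String, Map<String, Object>> diffs, Set<String> deletedIds) {
            this.source = source;
            this.diffs = diffs;
            this.deletedIds = deletedIds;
        }
    }

    interface Listener {
        void snapshotsChanged(ChangeEvent event);
    }

    // Stand-in for the shared row cache: object id -> map of db values.
    static final class RowStore {
        final Map<String, Map<String, Object>> snapshots = new HashMap<>();
        final List<Listener> listeners = new ArrayList<>();

        // Steps 2-3: cache the committed rows, then notify all listeners.
        void processCommit(Object source, Map<String, Map<String, Object>> committed, Set<String> deletedIds) {
            Map<String, Map<String, Object>> diffs = new HashMap<>();
            for (Map.Entry<String, Map<String, Object>> row : committed.entrySet()) {
                Map<String, Object> old = snapshots.get(row.getKey());
                Map<String, Object> diff = new HashMap<>();
                for (Map.Entry<String, Object> col : row.getValue().entrySet()) {
                    if (old == null || !Objects.equals(old.get(col.getKey()), col.getValue())) {
                        diff.put(col.getKey(), col.getValue());
                    }
                }
                snapshots.put(row.getKey(), new HashMap<>(row.getValue()));
                if (!diff.isEmpty()) {
                    diffs.put(row.getKey(), diff);
                }
            }
            for (String id : deletedIds) {
                snapshots.remove(id);
            }
            ChangeEvent event = new ChangeEvent(source, diffs, deletedIds);
            for (Listener listener : listeners) {
                listener.snapshotsChanged(event);
            }
        }
    }

    // Stand-in for a per-context object store holding its own object state.
    static final class ContextStore implements Listener {
        final Map<String, Map<String, Object>> objects = new HashMap<>();

        // Step 4: merge diffs into objects this store already holds.
        @Override
        public void snapshotsChanged(ChangeEvent event) {
            if (event.source == this) {
                return; // the committing store is already up to date
            }
            for (Map.Entry<String, Map<String, Object>> diff : event.diffs.entrySet()) {
                Map<String, Object> object = objects.get(diff.getKey());
                if (object != null) {
                    object.putAll(diff.getValue());
                }
            }
            for (String id : event.deletedIds) {
                objects.remove(id);
            }
        }
    }
}
```

    Note that the event in this sketch carries column values only, which
    is exactly why it works for attribute changes and says nothing about
    relationship lists.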

    So far so good... But there is an obvious structural mismatch between
    the data stored at the two levels. It can be easily reconciled when
    going down the stack (ObjectStore -> DataRowStore): having a DataObject
    we can always convert it to a DataRow representation, compare with
    cache, and build an "update" query or something. Going the other way
    (DataRowStore -> ObjectStore) is more painful. Merging the changes to
    the (potentially different) object graph is not as trivial. Currently
    the main problem I ran into is that there is no reliable way to refresh
    a to-many relationship list when an object was added to or deleted from
    it. SnapshotEvent simply doesn't have enough information (since
    removing an object from to-many doesn't modify the source in a DB
    sense). Though I only tried it with to-many, I can see problems with
    any type of "indirect" relationships (by "indirect" I mean that
    creating or destroying a relationship will not result in DB row update
    of the relationship SOURCE). This includes one-to-many, one-to-one, and
    all kinds of flattened relationships.
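
    A small example of why the snapshot diff is empty for the relationship
    source. The ARTIST and PAINTING rows and their column names are made up
    for illustration, and diff() is a hypothetical helper, not Cayenne
    code. Removing a painting from an artist's list only changes the
    foreign key in the PAINTING row:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration; table and column names are assumptions.
class IndirectRelationshipSketch {

    // Returns the columns whose values differ between two row snapshots.
    static Map<String, Object> diff(Map<String, Object> before, Map<String, Object> after) {
        Map<String, Object> changed = new HashMap<>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            if (!Objects.equals(before.get(e.getKey()), e.getValue())) {
                changed.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // The source of the to-many relationship: its row never changes.
        Map<String, Object> artistBefore = new HashMap<>();
        artistBefore.put("ARTIST_ID", 1);
        artistBefore.put("NAME", "A");
        Map<String, Object> artistAfter = new HashMap<>(artistBefore);

        // The target row holds the foreign key, so only it is updated.
        Map<String, Object> paintingBefore = new HashMap<>();
        paintingBefore.put("PAINTING_ID", 10);
        paintingBefore.put("ARTIST_ID", 1);
        Map<String, Object> paintingAfter = new HashMap<>(paintingBefore);
        paintingAfter.put("ARTIST_ID", null);

        System.out.println(diff(artistBefore, artistAfter));     // prints {}
        System.out.println(diff(paintingBefore, paintingAfter)); // prints {ARTIST_ID=null}
    }
}
```

    A receiver that holds the artist with a resolved paintings list sees no
    diff for the artist, so nothing tells it that the list is stale.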

    I am looking for solutions... So far I see the following paths:

    1. "Kill Them All, Let God Sort Them Out":

    Currently sources of "indirect" relationships are marked as "MODIFIED"
    and stay this way until ContextCommit figures out that there is nothing
    to do in the DB, and changes them back to COMMITTED. We don't know
    anything about the nature of modification to these objects. But we can
    simply include their Ids in the event, and let the receiver invalidate
    these objects, so that on the next access they will be refetched,
    including all relationships. This is the simplest solution and
    should work within the current design frame (I haven't tried it yet
    though, so there may be limitations). But it may result in some serious
    overhead refetching lots of invalidated objects.
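
    A rough sketch of what the receiving side of option 1 could look like.
    Everything here is hypothetical: ReceivingStore, Entry, the INVALIDATED
    state and the refetch function are illustrative names, and the fetch
    is a plain function standing in for a database query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of invalidate-and-refetch; not Cayenne API.
class InvalidationSketch {

    enum State { COMMITTED, INVALIDATED }

    // An object with one resolved to-many relationship (target ids only).
    static final class Entry {
        State state = State.COMMITTED;
        final List<String> toManyTargetIds;

        Entry(List<String> toManyTargetIds) {
            this.toManyTargetIds = new ArrayList<>(toManyTargetIds);
        }
    }

    static final class ReceivingStore {
        final Map<String, Entry> objects = new HashMap<>();
        final Function<String, List<String>> refetch; // stand-in for a db query
        int refetchCount = 0;

        ReceivingStore(Function<String, List<String>> refetch) {
            this.refetch = refetch;
        }

        // Called with the ids of "indirectly" modified sources from the event.
        void invalidate(Set<String> ids) {
            for (String id : ids) {
                Entry entry = objects.get(id);
                if (entry != null) {
                    entry.state = State.INVALIDATED;
                }
            }
        }

        // On access, an invalidated object is refetched with its relationship.
        List<String> targetsOf(String id) {
            Entry entry = objects.get(id);
            if (entry == null) {
                return null; // not registered in this store
            }
            if (entry.state == State.INVALIDATED) {
                entry = new Entry(refetch.apply(id));
                objects.put(id, entry);
                refetchCount++; // this is the overhead mentioned above
            }
            return entry.toManyTargetIds;
        }
    }
}
```

    The cost is paid lazily, once per invalidated object that is actually
    accessed again, but every such access means a round trip to the DB.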

    2. Track individual relationship modifications:

    This is a more subtle approach. Instead of invalidating whole trees
    of objects, we can pinpoint the place where an object was removed or
    added and act accordingly. I started investigating this path, but
    quickly discovered that tracking all these things is a great pain.
    First, DataRows and DataRowStore have no notion of relationships, but
    if we want this to work, we will have to pass such information through
    the DataRowStore, stretching its responsibilities (most likely this
    means a big redesign - something I am trying not to think of). Second,
    we need to track these changes throughout the commit cycle in the
    DataContext/ObjectStore, instead of doing simple comparisons with
    snapshots. Third, the serialized event size will grow, making remote
    notifications less attractive.
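
    For comparison, a sketch of the kind of extra information option 2
    would have to carry in the event. ArcChange and apply() are
    hypothetical names introduced for illustration; nothing like this
    exists in the current event, and how it would pass through the
    DataRowStore is exactly the open design question.

```java
import java.io.Serializable;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-relationship change tracking.
class RelationshipDiffSketch {

    enum Kind { ADDED, REMOVED }

    // One relationship change; serializable so it could travel to other VMs.
    static final class ArcChange implements Serializable {
        private static final long serialVersionUID = 1L;

        final String sourceId;
        final String relationship;
        final String targetId;
        final Kind kind;

        ArcChange(String sourceId, String relationship, String targetId, Kind kind) {
            this.sourceId = sourceId;
            this.relationship = relationship;
            this.targetId = targetId;
            this.kind = kind;
        }
    }

    // graph: source id -> relationship name -> resolved target ids.
    // Relationships that are not resolved in this graph are left alone.
    static void apply(Map<String, Map<String, List<String>>> graph, List<ArcChange> changes) {
        for (ArcChange change : changes) {
            Map<String, List<String>> relationships = graph.get(change.sourceId);
            List<String> targets = relationships == null ? null : relationships.get(change.relationship);
            if (targets == null) {
                continue; // source or relationship not loaded on this side
            }
            if (change.kind == Kind.ADDED) {
                if (!targets.contains(change.targetId)) {
                    targets.add(change.targetId);
                }
            } else {
                targets.remove(change.targetId);
            }
        }
    }
}
```

    Each added or removed object becomes one more record in the
    serialized event, on top of the snapshot diffs.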

    These are the choices that I see... Ideas? Comments?

    Andrus


