Re: How to map/model files?

From: Aristedes Maniatis (ar..sh.com.au)
Date: Sat Mar 21 2009 - 20:51:55 EDT

    On 22/03/2009, at 9:41 AM, Joseph Schmidt wrote:

    > I mean that in the case of a file system based approach, where the
    > metadata is in the database, the operations on files are not guarded
    > by transactions, nor can they be processed in the same Cayenne
    > transaction as the metadata of those files. So when problems
    > occur during editing, deleting, etc., the system is not consistent
    > and can't really be rolled back.

    I don't see this as too hard. Save files with a filename as the SHA
    hash of the file contents, so every change to the file results in a
    new file on disk. Then you have perfect rollback ability, depending on
    exactly how you decide to implement the metadata store. Revision
    history, rollbacks, audit trail, ACID are all possible if the metadata
    is considered the authoritative way to access data in the file system.
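The content-addressed layout described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ContentStore` is a hypothetical helper (not part of Cayenne), SHA-256 is picked as the "SHA hash", and the returned hash is what the metadata row in the database would hold.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Cayenne: stores each blob under the hex
// SHA-256 of its contents, so an existing file is never overwritten.
final class ContentStore {
    private final Path root;

    ContentStore(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    // Streams the input to a temp file while hashing it, then moves the temp
    // file to its content-derived name. Returns the hash for the metadata row.
    String put(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256"); // assumed algorithm
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        Path tmp = Files.createTempFile(root, "incoming-", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                digest.update(buf, 0, n);
                out.write(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        Path target = root.resolve(hex.toString());
        if (Files.exists(target)) {
            Files.delete(tmp); // identical content is already stored
        } else {
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        }
        return hex.toString();
    }

    // Maps a hash taken from the metadata back to the file on disk.
    Path resolve(String hash) {
        return root.resolve(hash);
    }
}
```

With this shape, an "edit" writes a new file and then updates the hash column inside the normal database transaction. If that transaction rolls back, the metadata still points at the old hash and the old file is untouched. Orphaned files would need a separate cleanup pass, which this sketch leaves out.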

    > Well, I hoped Cayenne had some solution for such a common problem,
    > e.g. to allow handling those "files" as simple persistent entities
    > (something similar to extended types), like all the other entities
    > (since they are in relationships) :).

    I know some frameworks like EOF have the ability to store data in XML
    files, but I've not heard of anything that does quite what you are
    asking for. On the other hand, this sounds like a great little
    project. A lightweight framework which makes it simple to relate
    objects from outside the database (such as files) back to metadata
    stored in Cayenne managed tables. If done in a clean reusable way, I
    could see this as a useful addition to Cayenne.

    Joseph, if you were to take up such a task, there would be people here
    who could help guide you through the tricky spots.

    >> * do you wish to end up with these blobs in memory or can you avoid
    >> that by streaming directly from disk
    > Of course: as little memory as possible :).

    Well, that is like saying you'd like world peace. An admirable goal,
    but some clearer objectives are needed. If you can't afford to
    materialise the data as a Java object, you'll need to stream it in
    some way. That suggests a completely different approach to using
    Cayenne to manipulate your large data blobs.

    >> (eg. through Apache httpd without touching Java)
    > I need a pure Java solution, no httpd, ajp connectors etc. :(.

    Purely within Java, I'd suggest a framework like Jetty which has the
    ability to stream large objects from disk without gobbling lots of
    memory. You will need to bypass Cayenne since otherwise your data will
    be loaded into memory by the JDBC adapter and then turned into an
    ObjectEntity by Cayenne.
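As a rough sketch of what streaming from disk looks like in plain Java: the file is copied through a small fixed buffer, so memory use stays constant regardless of file size. `FileStreamer` is a hypothetical name. In a servlet container such as Jetty the `OutputStream` would typically be the one returned by `response.getOutputStream()`, and the path would come from the metadata looked up through Cayenne; both are assumptions about how the pieces would be wired together.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: copies a file to an output stream in 8 KB chunks
// without ever holding the whole file in memory.
final class FileStreamer {
    // Returns the number of bytes written to out.
    static long stream(Path file, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        out.flush();
        return total;
    }
}
```

Only the metadata lookup goes through the ORM here; the blob itself never becomes a Java object.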

    >> * how will your database cope?
    >> * how will your database backup processes cope?
    > Ideally I wouldn't care much about the database (that's why I'm trying
    > to adopt an ORM as a layer over it :) ).

    Well, at some point you still need to care about the database. Cayenne
    isn't trying to save you from knowing about the database, but rather
    from hardcoding assumptions about it in your Java code. You still need
    to understand the limitations of where you are putting your data.

    > E.g. this https://issues.apache.org/jira/browse/CAY-155
    > would allow one to care less about the DB :).

    I just read that task and don't understand it. Is it just a duplicate
    of CAY-762 to add drawing tools? Or is this something else?

    > Also, an ANT task similar to Cayenne http://cayenne.apache.org/doc/cdataport.html
    > but to hide this entire backup/restore mess, would be ideal e.g.
    > "cbackup/crestore" :).

    Maybe. But database specific backup tools give you a lot of advantages
    you can never get in code:

    * atomic snapshots (that is, completely consistent backups)
    * incremental backups
    * replication (master-master or master-slave)

    >> * how will you manage the files on the file system? Will people
    >> touch them without going through your code and therefore break the
    >> links between the metadata and the files?
    > Most web applications are on some remote server, so users can access them
    > only through the web UI. Unfortunately this is not as good as if
    > they were part of the Cayenne DataContext transaction.

    Ari

    -------------------------->
    ish
    http://www.ish.com.au
    Level 1, 30 Wilson Street Newtown 2042 Australia
    phone +61 2 9550 5001 fax +61 2 9550 4001
    GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A


