Re: Blobs in the DataContext

From: MGargan..scholar.com
Date: Mon May 24 2010 - 15:42:56 UTC

    Hi Andrew,

            These be flame war words. :-p Actually, I would normally
    totally agree with both your and Aristedes' recommendations; unfortunately,
    the architecture I inherited for this product bills the BLOBs in the DB as
    a sort of "feature" that I don't see changing anytime soon. Currently we
    are using Hibernate and it handles these situations without a problem
    (although it causes more problems with many-to-manys), and even Cayenne
    seems to handle this the first time for the largest file I'm uploading.
    After multiple files, however, the memory does not seem to be released.
    I haven't thrown this under the profiler yet, which I will try, but the
    code isn't all that complicated, and I did a code review and don't see
    anywhere that I'm holding a reference to the data. I also saw some other
    posts this weekend from others getting OutOfMemoryExceptions, and one of
    the recommendations was to create a new context. If that is the case,
    then I'm assuming something is getting cached, or new objects created in
    the context are not destroyed after a commit. This is more what I'm trying
    to find out... what happens in the Cayenne internals when you create a
    new object in the context, commit it, and continue using the same context?
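
    For reference, here is a minimal sketch of the "fresh context per upload"
    idea mentioned above, assuming a Cayenne 3.0-style API and a hypothetical
    FileBlob entity with "name" and "data" (byte[]) attributes; all names are
    illustrative and not taken from the actual project.

        import java.util.Collections;
        import org.apache.cayenne.access.DataContext;

        public class BlobUploader {

            public void storeFile(String name, byte[] compressedBytes) {
                // Use a short-lived context so the committed object (and its
                // large byte[] snapshot) is not kept registered in a
                // long-lived context between uploads.
                DataContext context = DataContext.createDataContext();

                FileBlob blob = context.newObject(FileBlob.class); // hypothetical entity
                blob.setName(name);
                blob.setData(compressedBytes);

                context.commitChanges();

                // Optionally drop the committed snapshot explicitly so the
                // byte[] becomes collectable even if the context lingers.
                context.invalidateObjects(Collections.singletonList(blob));
            }
        }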

    BTW Andrew, I remember you and your LEWOstuff from WOWODC... very cool
    "stuff"! :)

    Thanks.
    -Mike

    From: Andrew Lindesay <ap..indesay.co.nz>
    To: use..ayenne.apache.org
    Date: 05/22/2010 02:43 AM
    Subject: Re: Blobs in the DataContext

    Hi there Mike;

    I'd have to agree with Ari there; small BLOBs (usually in a sub-table)
    work fine with an object-relational mapping system like Cayenne, but
    trying to use an object-relational technology for big BLOBs is generally
    troublesome owing to the cost of shifting those big hunks of data around
    and the memory they gobble up.

    Some database products offer "streaming" to and from BLOBs, which is one
    workaround for these problems: it means you can, at least in theory, avoid
    holding the whole hunk of data in memory at once.
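
    With plain JDBC, for example, the streaming variant looks roughly like
    this (a sketch only; the table and column names are made up):

        import java.io.InputStream;
        import java.sql.Connection;
        import java.sql.PreparedStatement;

        // Stream a large file into a BLOB column without first materialising
        // it as a byte[]. Table and column names are illustrative.
        public class BlobStreaming {

            public static void writeBlob(Connection con, long id,
                    InputStream in, long length) throws Exception {
                PreparedStatement ps = con.prepareStatement(
                        "UPDATE file_store SET data = ? WHERE id = ?");
                try {
                    ps.setBinaryStream(1, in, length); // driver pulls from the stream
                    ps.setLong(2, id);
                    ps.executeUpdate();
                } finally {
                    ps.close();
                }
            }
        }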

    Some time ago I had to work cross-database with BLOBs of arbitrary size
    and ran into some of these troubles. For this reason I wrote a system
    which lays down a whole series of smaller BLOBs that are then linked by a
    header table holding some very basic meta-data, such as a "short unique
    code", to link into the object-relational world. It's non-transactional
    across the whole data, but special handling is generally required to deal
    with large data sets anyway. In Java, I then have an input/output stream
    writing to and reading from this data structure. There are some other
    advantages to this system, such as being able to do "out of order" writes
    to the stream.

    That is actually part of my "lestuff" project, which is open-source, so
    you are welcome to use it if you would like; drop me a note and I'll give
    you some pointers. Otherwise, maybe this gives you some ideas.
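
    To give a flavour of the chunking idea described above (this is not
    lestuff itself, just a rough sketch against a made-up pair of tables,
    blob_header and blob_chunk):

        import java.io.InputStream;
        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.util.UUID;

        // Write a stream as a series of fixed-size chunk rows, tied together
        // by a header row carrying a short unique code that the ORM side can
        // reference. All table and column names are made up.
        public class ChunkedBlobWriter {

            public static String write(Connection con, InputStream in)
                    throws Exception {
                String code = UUID.randomUUID().toString().substring(0, 8);

                PreparedStatement header = con.prepareStatement(
                        "INSERT INTO blob_header (code) VALUES (?)");
                header.setString(1, code);
                header.executeUpdate();
                header.close();

                PreparedStatement chunk = con.prepareStatement(
                        "INSERT INTO blob_chunk (code, seq, data) VALUES (?, ?, ?)");
                try {
                    byte[] buffer = new byte[256 * 1024]; // 256 KB per chunk
                    int seq = 0;
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        byte[] piece = new byte[read];
                        System.arraycopy(buffer, 0, piece, 0, read);
                        chunk.setString(1, code);
                        chunk.setInt(2, seq++);
                        chunk.setBytes(3, piece);
                        chunk.executeUpdate();
                    }
                } finally {
                    chunk.close();
                }
                return code; // the "short unique code" linking into the ORM world
            }
        }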

    Regards;

    > I'm using cayenne to store large files in BLOBs as a process runs.
    > The first step of the process is storing large files (~ 600MB) and they
    > are ending up in the DB just fine, then we run some tasks and get some
    > output files, and then store the large output files (~ 500MB) to the DB.

    ...
    > note: I'm also compressing the stream in memory as I'm adding it to the
    > byte[], but still... it works for the input files. also, each of these

    ___
    Andrew Lindesay
    www.silvereye.co.nz


