Re: Blobs in the DataContext

From: MGargan..scholar.com
Date: Mon May 24 2010 - 15:42:56 UTC

    Hi Andrew,

            These be flame war words. :-p Actually, I would normally
    totally agree with both your and Aristedes' recommendations; unfortunately,
    the architecture I inherited for this product bills the BLOBs in the DB as
    a sort of "feature" that I don't see changing anytime soon. Currently we
    are using Hibernate and it handles these situations without a problem
    (although it causes more problems with many-to-manys), and even Cayenne
    seems to handle this the first time for the largest file I'm uploading.
    After multiple files, however, the memory does not seem to be released.
    I haven't thrown this under the profiler yet, which I will try, but the
    code isn't all that complicated, and I did a code review and don't see
    anywhere that I'm holding a reference to the data. I also saw some other
    posts this weekend from others getting OutOfMemoryExceptions, and one of
    the recommendations was to create a new context. If that is the case,
    then I'm assuming something is getting cached, or new objects created in
    the context are not destroyed after a commit. This is more what I'm trying
    to find out... what happens in the Cayenne internals when you create a
    new object in the context, commit it, and continue using the same context?
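
    For reference, here is a minimal sketch of the "fresh context per upload"
    idea mentioned above, assuming a Cayenne 3.0-style API and a hypothetical
    FileBlob entity with "name" and "data" (byte[]) attributes; all names are
    illustrative and not taken from the actual project.

        import java.util.Collections;
        import org.apache.cayenne.access.DataContext;

        public class BlobUploader {

            public void storeFile(String name, byte[] compressedBytes) {
                // Use a short-lived context so the committed object (and its
                // large byte[] snapshot) is not kept registered in a
                // long-lived context between uploads.
                DataContext context = DataContext.createDataContext();

                FileBlob blob = context.newObject(FileBlob.class); // hypothetical entity
                blob.setName(name);
                blob.setData(compressedBytes);

                context.commitChanges();

                // Optionally drop the committed snapshot explicitly so the
                // byte[] becomes collectable even if the context lingers.
                context.invalidateObjects(Collections.singletonList(blob));
            }
        }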

    BTW Andrew, I remember you and your LEWOstuff from WOWODC... very cool
    "stuff"! :)

    Thanks.
    -Mike

    From: Andrew Lindesay <ap..indesay.co.nz>
    To: use..ayenne.apache.org
    Date: 05/22/2010 02:43 AM
    Subject: Re: Blobs in the DataContext

    Hi there Mike;

    I'd have to agree with Ari there; small BLOBs (usually in a sub-table)
    work fine with an object-relational mapping system like Cayenne, but
    trying to use an object-relational technology for big BLOBs is generally
    troublesome owing to the cost of shifting those big hunks of data around
    and the memory they gobble up.

    Some database products offer "streaming" to and from BLOBs, which is one
    workaround for these problems: it means you can, at least in theory, avoid
    holding the whole hunk of data in memory at once.
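
    With plain JDBC, for example, the streaming variant looks roughly like
    this (a sketch only; the table and column names are made up):

        import java.io.InputStream;
        import java.sql.Connection;
        import java.sql.PreparedStatement;

        // Stream a large file into a BLOB column without first materialising
        // it as a byte[]. Table and column names are illustrative.
        public class BlobStreaming {

            public static void writeBlob(Connection con, long id,
                    InputStream in, long length) throws Exception {
                PreparedStatement ps = con.prepareStatement(
                        "UPDATE file_store SET data = ? WHERE id = ?");
                try {
                    ps.setBinaryStream(1, in, length); // driver pulls from the stream
                    ps.setLong(2, id);
                    ps.executeUpdate();
                } finally {
                    ps.close();
                }
            }
        }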

    Some time ago I had to work cross-database with BLOBs of arbitrary size
    and ran into some of these troubles. For this reason I wrote a system
    which lays down a whole series of smaller BLOBs that are then linked by a
    header table holding some very basic meta-data, such as a "short unique
    code", to link into the object-relational world. It's non-transactional
    across the whole data, but special handling is generally required to deal
    with large data sets anyway. In Java, I then have an input/output stream
    writing to and reading from this data structure. There are some other
    advantages to this system, such as being able to do "out of order" writes
    to the stream.

    That is actually part of my "lestuff" project, which is open-source, so
    you are welcome to use it if you would like; drop me a note and I'll give
    you some pointers. Otherwise, maybe this gives you some ideas.
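
    To give a flavour of the chunking idea described above (this is not
    lestuff itself, just a rough sketch against a made-up pair of tables,
    blob_header and blob_chunk):

        import java.io.InputStream;
        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.util.UUID;

        // Write a stream as a series of fixed-size chunk rows, tied together
        // by a header row carrying a short unique code that the ORM side can
        // reference. All table and column names are made up.
        public class ChunkedBlobWriter {

            public static String write(Connection con, InputStream in)
                    throws Exception {
                String code = UUID.randomUUID().toString().substring(0, 8);

                PreparedStatement header = con.prepareStatement(
                        "INSERT INTO blob_header (code) VALUES (?)");
                header.setString(1, code);
                header.executeUpdate();
                header.close();

                PreparedStatement chunk = con.prepareStatement(
                        "INSERT INTO blob_chunk (code, seq, data) VALUES (?, ?, ?)");
                try {
                    byte[] buffer = new byte[256 * 1024]; // 256 KB per chunk
                    int seq = 0;
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        byte[] piece = new byte[read];
                        System.arraycopy(buffer, 0, piece, 0, read);
                        chunk.setString(1, code);
                        chunk.setInt(2, seq++);
                        chunk.setBytes(3, piece);
                        chunk.executeUpdate();
                    }
                } finally {
                    chunk.close();
                }
                return code; // the "short unique code" linking into the ORM world
            }
        }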

    Regards;

    > I'm using cayenne to store large files in BLOBs as a process runs.
    > The first step of the process is storing large files (~ 600MB) and they
    > are ending up in the DB just fine, then we run some tasks and get some
    > output files, and then store the large output files (~ 500MB) to the DB.

    ...
    > note: I'm also compressing the stream in memory as I'm adding it to the
    > byte[], but still... it works for the input files. also, each of these

    ___
    Andrew Lindesay
    www.silvereye.co.nz


