Hi,
Of course I don't want to load the whole thing into memory.
I want to run the query and use an iterator to go through the results.
With paging, the JDBC driver can deliver the results in chunks, which
prevents the whole result set from being loaded into memory.
I was trying to accomplish the same thing with Cayenne, but clearly
without success.
So I'm going to fall back to Cayenne's iterated query or even plain JDBC.
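
For illustration only, here is a minimal sketch of the approach described in
the quoted message below: iterate over the rows of a single left join and
build the master/detail relation by hand, keeping only one master object in
memory at a time. The class name JoinRowGrouper, the Master type and the
MASTER_ID / DETAIL_ID column names are hypothetical. The rows are plain maps
standing in for data rows, and the input is assumed to be ordered by
MASTER_ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical helper: groups flat left-join rows into master objects
// while streaming, so finished masters can be garbage collected.
final class JoinRowGrouper {

    // Hypothetical master object that holds the ids of its detail rows.
    static final class Master {
        public final Object id;
        public final List<Object> details = new ArrayList<>();

        Master(Object id) {
            this.id = id;
        }
    }

    // Consumes rows ordered by MASTER_ID and passes each completed master
    // to the callback. Only the master being built is kept in memory.
    static void group(Iterator<Map<String, Object>> rows, Consumer<Master> callback) {
        Master current = null;
        while (rows.hasNext()) {
            Map<String, Object> row = rows.next();
            Object masterId = row.get("MASTER_ID");
            if (current == null || !Objects.equals(current.id, masterId)) {
                if (current != null) {
                    callback.accept(current); // previous master is complete
                }
                current = new Master(masterId);
            }
            // A left join yields a null detail id for masters without details.
            Object detailId = row.get("DETAIL_ID");
            if (detailId != null) {
                current.details.add(detailId);
            }
        }
        if (current != null) {
            callback.accept(current); // emit the last master
        }
    }

    // Builds one stand-in data row (HashMap because the detail id may be null).
    static Map<String, Object> row(Object masterId, Object detailId) {
        Map<String, Object> r = new HashMap<>();
        r.put("MASTER_ID", masterId);
        r.put("DETAIL_ID", detailId);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row(1, 10));
        rows.add(row(1, 11));
        rows.add(row(2, null));
        group(rows.iterator(), m -> System.out.println(m.id + " -> " + m.details));
    }
}
```

In a real run the iterator would wrap whatever the iterated query or the
JDBC result set hands back, and the query would need an ORDER BY on the
master key for the grouping to be correct.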
tx
Hans
Michael Gentry wrote:
> I'm not exactly sure what you are trying to accomplish, but could you
> use plain SQL to do the job (run it from an SQL prompt)? That's the
> approach I normally take when I have to do updates to large amounts of
> data. Especially for a one-off task or something ill-suited to Java
> code. Even if you were using raw JDBC (no ORM) and tried to pull back
> 2.5 million records it would be difficult. I don't know the size of
> the data record you are using, but if it is even 1k (not an
> unreasonable size) it would require 2.5 GB of RAM just to hold the
> records.
>
> mrg
>
>
> On Fri, Nov 13, 2009 at 10:20 AM, Hans Pikkemaat
> <h.pikkemaa..si-solutions.nl> wrote:
>
>> Hi,
>>
>> That was the initial approach I tried. The problem with it is that I cannot
>> manually create relations between objects constructed from data rows. This
>> means that when I access the detail table through the relation, it will
>> execute a query to get the details from the database.
>>
>> If I have 100 main records, it runs 100 queries to get all the details.
>> This does not perform well. I need to run one query that does a left join
>> and gets all the data in one go.
>>
>> But I totally agree with you that an ORM is too much overhead. I don't need
>> caching or anything like that. Actually, I'm trying to prevent it from
>> caching the records. I'm working on a solution now that uses the iterated
>> query, which returns data rows from which I construct new objects and the
>> relationships between them myself.
>>
>> tx
>>
>> Hans
>>
>>
>> Michael Gentry wrote:
>>
>>> Not just Cayenne, Hans. No ORM efficiently handles the scale you are
>>> talking about. You need to find a way to break your query down into
>>> smaller chunks to process. What you are doing might be workable with
>>> 50k records, but not 2.5m. Find a way to break your query down into
>>> smaller units to process or explore what Andrus suggested with
>>> ResultIterator:
>>>
>>> http://cayenne.apache.org/doc/iterating-through-data-rows.html
>>>
>>> If you can loop over one record at a time and process it (thereby
>>> letting the garbage collector clean out the ones you have processed)
>>> then your memory usage should be somewhat stable and manageable, even
>>> if the initial query time takes a while.
>>>
>>> mrg
>>>
>>>
>>> On Fri, Nov 13, 2009 at 7:09 AM, Hans Pikkemaat
>>> <h.pikkemaa..si-solutions.nl> wrote:
>>>
>>>
>>>> Anyway, my conclusion is indeed: don't use Cayenne for large query
>>>> processing.
>>>>
>>>>
>>>
>>
>>
>
>
This archive was generated by hypermail 2.0.0 : Fri Nov 13 2009 - 10:50:16 EST