Re: Object Caching

From: Michael Gentry (mgentr..asslight.net)
Date: Fri Nov 13 2009 - 11:05:45 EST


    Hi Hans,

    Even using a paginated query in Cayenne, it would've eventually pulled
    everything into memory. The paginated query is really designed to be
    used in a UI where the user is going to see a limited amount of data
    and probably not page too many times. The iterated query is the best
    approach for trying to process a large number of records in Cayenne.
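
    As a rough sketch of that idea (not Cayenne API): assume the rows of a
    single left join arrive one at a time, sorted by the main record's key.
    A plain java.util.Iterator of maps stands in for the iterated query
    here, and the MASTER_ID / DETAIL_ID column names and the class name are
    made up for illustration. Only the group currently being assembled is
    referenced, so groups already handled can be garbage collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;

public class StreamedJoinGrouping {

    /**
     * Walks the rows of a main/detail left join (sorted by main id) and hands
     * each main id plus its detail ids to the handler. Nothing is kept once a
     * group has been handed over. Column names are hypothetical.
     */
    public static void forEachMaster(Iterator<Map<String, Object>> rows,
                                     BiConsumer<Object, List<Object>> handler) {
        Object currentId = null;
        List<Object> details = null; // null until the first row is seen

        while (rows.hasNext()) {
            Map<String, Object> row = rows.next();
            Object masterId = row.get("MASTER_ID");

            // A change of main id means the previous group is complete.
            if (details == null || !Objects.equals(masterId, currentId)) {
                if (details != null) {
                    handler.accept(currentId, details);
                }
                currentId = masterId;
                details = new ArrayList<>();
            }

            // A left join yields a null detail for a main record without details.
            Object detailId = row.get("DETAIL_ID");
            if (detailId != null) {
                details.add(detailId);
            }
        }

        // Flush the last group, if there was any input at all.
        if (details != null) {
            handler.accept(currentId, details);
        }
    }

    /** Builds one stand-in data row. */
    public static Map<String, Object> row(Object masterId, Object detailId) {
        Map<String, Object> r = new HashMap<>();
        r.put("MASTER_ID", masterId);
        r.put("DETAIL_ID", detailId);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = Arrays.asList(
                row(1, 10), row(1, 11), row(2, null), row(3, 30));
        forEachMaster(rows.iterator(),
                (id, details) -> System.out.println(id + " -> " + details));
    }
}
```

    With a real iterated query the loop would read from the result iterator
    instead, and the iterator should be closed in a finally block so the
    underlying connection is released.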

    Good luck!

    mrg

    On Fri, Nov 13, 2009 at 10:49 AM, Hans Pikkemaat
    <h.pikkemaa..si-solutions.nl> wrote:
    > Hi,
    >
    > Of course I don't want to load the whole thing into memory.
    > I want to run the query and use an iterator to go through the results.
    > With paging, the JDBC driver can produce chunks, which prevents
    > the whole result set from being loaded into memory.
    >
    > I was trying to apply this same principle with Cayenne, but clearly
    > without success.
    >
    > So I'm going to fall back to the Cayenne iterated query or even JDBC.
    >
    > tx
    >
    > Hans
    >
    >
    > Michael Gentry wrote:
    >>
    >> I'm not exactly sure what you are trying to accomplish, but could you
    >> use plain SQL to do the job (run it from an SQL prompt)?  That's the
    >> approach I normally take when I have to do updates to large amounts of
    >> data.  Especially for a one-off task or something ill-suited to Java
    >> code.  Even if you were using raw JDBC (no ORM) and tried to pull back
    >> 2.5 million records it would be difficult.  I don't know the size of
    >> the data record you are using, but if it is even 1k (not an
    >> unreasonable size) it would require 2.5 GB of RAM just to hold the
    >> records.
    >>
    >> mrg
    >>
    >>
    >> On Fri, Nov 13, 2009 at 10:20 AM, Hans Pikkemaat
    >> <h.pikkemaa..si-solutions.nl> wrote:
    >>
    >>>
    >>> Hi,
    >>>
    >>> That was the initial approach I tried. The problem with it is that I
    >>> cannot manually create relations between objects constructed from data
    >>> rows. This means that when I access the detail table through the
    >>> relation, it executes a query to get the details from the database.
    >>>
    >>> If I have 100 main records it runs 100 queries to get all the details.
    >>> This does not perform well. I need to run one query that does a left
    >>> join and gets all the data in one go.
    >>>
    >>> But I totally agree with you that ORM is too much overhead. I don't need
    >>> caching or anything like that. Actually, I'm trying to prevent it from
    >>> caching the records. I'm now working on a solution that uses the
    >>> iterated query, which returns data rows, and I construct new objects
    >>> and the relationships between them myself.
    >>>
    >>> tx
    >>>
    >>> Hans
    >>>
    >>>
    >>> Michael Gentry wrote:
    >>>
    >>>>
    >>>> Not just Cayenne, Hans.  No ORM efficiently handles the scale you are
    >>>> talking about.  You need to find a way to break your query down into
    >>>> smaller chunks to process.  What you are doing might be workable with
    >>>> 50k records, but not 2.5m.  Find a way to break your query down into
    >>>> smaller units to process or explore what Andrus suggested with
    >>>> ResultIterator:
    >>>>
    >>>> http://cayenne.apache.org/doc/iterating-through-data-rows.html
    >>>>
    >>>> If you can loop over one record at a time and process it (thereby
    >>>> letting the garbage collector clean out the ones you have processed)
    >>>> then your memory usage should be somewhat stable and manageable, even
    >>>> if the initial query time takes a while.
    >>>>
    >>>> mrg
    >>>>
    >>>>
    >>>> On Fri, Nov 13, 2009 at 7:09 AM, Hans Pikkemaat
    >>>> <h.pikkemaa..si-solutions.nl> wrote:
    >>>>
    >>>>
    >>>>>
    >>>>> Anyway, my conclusion is indeed: don't use Cayenne for large query
    >>>>> processing.
    >>>>>
    >>>>>
    >>>>
    >>>>
    >>>
    >>>
    >>
    >>
    >
    >
    >


