RE: raw dataport

From: McDaniel, Joe R. (joe.mcdanie..gc.com)
Date: Wed Jun 14 2006 - 08:07:54 EDT


    For many databases, a bulk LOAD is far faster than row-by-row INSERTs.
    For MySQL, for instance, one can use LOAD DATA LOCAL INFILE, where the
    file being loaded sits on the client rather than on the database server,
    even over JDBC. JDBC also allows for batching, which can help, but the
    LOAD approach is still faster. (If you cannot load from the client for
    whatever reason, you may still have the option of using a shared drive
    that gives both the database server and the "client" "local" access to
    the file.)
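
    To make the two options concrete, here is a minimal sketch, not a
    tested recipe. It assumes the remote load meant here is MySQL's
    LOAD DATA LOCAL INFILE, issued through a plain JDBC Statement, and
    that the driver and server both permit local-infile loading. The
    class name BulkLoadSketch, its method names, and the tab-separated
    file layout are made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class BulkLoadSketch {

    /** Builds a LOAD DATA LOCAL INFILE statement for a tab-separated file on the client. */
    static String loadDataSql(String clientFilePath, String table) {
        // Escape backslashes and single quotes so the path is a valid SQL string literal.
        // The table name is assumed to come from trusted code, not from user input.
        String path = clientFilePath.replace("\\", "\\\\").replace("'", "\\'");
        return "LOAD DATA LOCAL INFILE '" + path + "' INTO TABLE " + table
                + " FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'";
    }

    /** Option 1: let the server bulk-load a file that the JDBC client streams to it. */
    static void load(Connection mysql, String clientFilePath, String table) throws SQLException {
        try (Statement st = mysql.createStatement()) {
            st.execute(loadDataSql(clientFilePath, table));
        }
    }

    /** Option 2: plain JDBC batching; returns the number of batches sent. */
    static int insertBatched(Connection c, String insertSql, List<Object[]> rows, int batchSize)
            throws SQLException {
        int batches = 0;
        try (PreparedStatement ps = c.prepareStatement(insertSql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    pending = 0;
                    batches++;
                }
            }
            if (pending > 0) { // flush the final, partial batch
                ps.executeBatch();
                batches++;
            }
        }
        return batches;
    }
}
```

    Batching works against any JDBC driver, while the LOAD statement is
    MySQL-specific and needs the data written out to a file first.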

    Best,

    Joe

    -----Original Message-----
    From: Tore Halset [mailto:halse..vv.ntnu.no]
    Sent: Wednesday, June 14, 2006 1:48 AM
    To: cayenne-use..ncubator.apache.org
    Subject: raw dataport

    Hello.

    Has anyone got DataPort to work on huge databases with lots of rows and
    lots of blobs/clobs? I had problems porting over one of our databases
    yesterday. One of the tables has ~12M rows with clobs. Even though
    INSERT_BATCH_SIZE is 1000, it would just go on forever without
    committing the first 1000 rows. It would also happily throw
    OutOfMemoryErrors.

    I ended up writing a new DataPort.processInsert that uses the model to
    create plain JDBC SQL statements. I also changed the partial-commit
    algorithm to commit based on the number of bytes read/written since the
    previous commit instead of the number of rows.
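
    The byte-based commit idea can be sketched roughly as follows. This
    is not the actual DataPort.processInsert change, only an
    illustration in plain JDBC. The names ByteBudgetPort, CommitBudget,
    copy and estimateSize are hypothetical, and the size estimate (two
    bytes per character, eight bytes for other values) is a crude
    assumption.

```java
import java.sql.Blob;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class ByteBudgetPort {

    /** Tracks bytes written since the last commit (hypothetical helper). */
    static final class CommitBudget {
        private final long maxBytes;
        private long pending;

        CommitBudget(long maxBytes) {
            this.maxBytes = maxBytes;
        }

        /** Records one row; returns true and resets when the budget is reached. */
        boolean add(long rowBytes) {
            pending += rowBytes;
            if (pending >= maxBytes) {
                pending = 0;
                return true;
            }
            return false;
        }
    }

    /** Copies rows one at a time, committing whenever roughly maxBytes were written. */
    static long copy(Connection src, Connection dst, String selectSql, String insertSql,
                     long maxBytes) throws SQLException {
        dst.setAutoCommit(false);
        CommitBudget budget = new CommitBudget(maxBytes);
        long rows = 0;
        try (PreparedStatement select = src.prepareStatement(selectSql);
             ResultSet rs = select.executeQuery();
             PreparedStatement insert = dst.prepareStatement(insertSql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                long rowBytes = 0;
                for (int i = 1; i <= columns; i++) {
                    // Assumption: the target driver accepts the value the source
                    // driver returns; LOB columns may need explicit streaming instead.
                    Object value = rs.getObject(i);
                    insert.setObject(i, value);
                    rowBytes += estimateSize(value);
                }
                insert.executeUpdate();
                rows++;
                if (budget.add(rowBytes)) {
                    dst.commit();
                }
            }
            dst.commit(); // commit whatever remains after the last row
        }
        return rows;
    }

    /** Rough per-value size estimate in bytes. */
    static long estimateSize(Object value) throws SQLException {
        if (value == null) return 0;
        if (value instanceof byte[]) return ((byte[]) value).length;
        if (value instanceof String) return 2L * ((String) value).length();
        if (value instanceof Clob) return 2L * ((Clob) value).length();
        if (value instanceof Blob) return ((Blob) value).length();
        return 8;
    }
}
```

    With a byte budget, a table of small rows commits rarely while a
    table of large clobs commits after only a few rows, so the amount
    of uncommitted data stays roughly constant.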

    After the change, DataPort would port anything without problems :) The
    17GB MS SQL Database got over to PostgreSQL on my old PowerBook in a few
    hours without any memory problems.

    So, what do you think? Am I using the current DataPort incorrectly?
    Should this feature replace the current DataPort, be enabled with a
    raw flag, or perhaps be available as a new Ant task? It is at least
    useful for me :) After 1.2, of course.

    Regards,
      - Tore.


