Re: dataport: memory usage

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Fri May 14 2004 - 08:58:56 EDT


    Hi Tore,

    I will probably patch the dataport example with this code; eventually,
    though, we will need to implement streaming of blobs and clobs.
    Reading them completely into memory is definitely not appropriate in
    many cases, as your example shows.
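    For example, with plain JDBC a blob can be copied between connections
    in a streaming fashion instead of being read into one large byte[].
    A rough sketch (the table and column names are only illustrative and
    are not part of DataPort):

        import java.io.InputStream;
        import java.sql.Blob;
        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.sql.Statement;

        public class BlobStreamCopy {

            // Copies blob rows from a source to a destination connection,
            // streaming each value instead of materializing it in memory.
            public static void copyBlobs(Connection src, Connection dst)
                    throws Exception {
                Statement read = src.createStatement();
                ResultSet rs = read.executeQuery(
                        "SELECT id, data FROM attachment");
                PreparedStatement write = dst.prepareStatement(
                        "INSERT INTO attachment (id, data) VALUES (?, ?)");

                while (rs.next()) {
                    write.setInt(1, rs.getInt(1));

                    // Ask the driver for a stream; a locator-based driver
                    // fetches the bytes incrementally on executeUpdate.
                    Blob blob = rs.getBlob(2);
                    InputStream in = blob.getBinaryStream();
                    write.setBinaryStream(2, in, (int) blob.length());
                    write.executeUpdate();
                    in.close();
                }

                rs.close();
                read.close();
                write.close();
            }
        }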

    Andrus

    On May 14, 2004, at 6:45 AM, Tore Halset wrote:
    > On Jan 9, 2004, at 13:42, Tore Halset wrote:
    >
    >> I am using dataport to move data from MS SQL Server to PostgreSQL,
    >> and it really is the perfect tool!
    >>
    >> The small problems that I got around:
    >> 1. The database has two tables that have blob columns. No single
    >> blob is larger than 5 MB, but the tables can have ~5k rows. This led
    >> to a java.lang.OutOfMemoryError, but giving Java more memory helps a
    >> lot:
    >> % ANT_OPTS="-Xmx512M" ant
    >
    > Setting the insert batch size to 1 for entities that have a blob
    > attribute fixed this memory problem. I added the following method to
    > DataPort and used it to determine the batch size in the method named
    > processInsert.
    >
    > private int getInsertBatchSize(DbEntity entity) {
    >     // Entities with a binary column are inserted one row at a time,
    >     // so at most one blob is held in memory per batch.
    >     Iterator attIt = entity.getAttributes().iterator();
    >     while (attIt.hasNext()) {
    >         DbAttribute dbAttribute = (DbAttribute) attIt.next();
    >         int type = dbAttribute.getType();
    >         if (type == Types.BLOB
    >                 || type == Types.VARBINARY
    >                 || type == Types.LONGVARBINARY) {
    >             return 1;
    >         }
    >     }
    >     // No binary columns: use the normal batch size.
    >     return INSERT_BATCH_SIZE;
    > }
    >
    > - Tore.
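
    For illustration, the helper above would plug into the batching loop
    in processInsert roughly like this (a paraphrased sketch; names other
    than getInsertBatchSize are made up and are not Cayenne's actual
    source):

        // hypothetical sketch of the insert loop in DataPort.processInsert
        int batchSize = getInsertBatchSize(entity);
        List batch = new ArrayList();
        while (rows.hasNext()) {
            batch.add(rows.next());
            // flush every batchSize rows; blob entities flush row by row
            if (batch.size() >= batchSize) {
                insertBatch(entity, batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            insertBatch(entity, batch); // flush the remainder
        }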


