Re: Optimize cascade deletes

From: Giulio Cesare Solaroli (giulio.cesar..mail.com)
Date: Mon Oct 01 2007 - 11:30:51 EDT

  • Next message: Michael Gentry: "Re: Optimize cascade deletes"

    Hello everybody,

    today I have being able to do some more tests and I have found out
    that the problem seems to be caused by a "fragmentation" of the data
    on the Postgresql table space.

    I have created a copy of the table where delete statements where very
    slow (using the "create table .... as select from ..." syntax) and
    matching the structure of the new copy to the original one (index and
    constraint wise); in this new table the performance of the delete
    statements where from 20 to 100 times faster that in the original
    table. :-(

    This rules out Cayenne as the culprit, but leave me wondering how to
    avoid this degradation of performances on Postgresql; but this is
    probably not the right place where to start.

    Thank you everybody for your attention and for the very useful
    suggestions or pointers that greatly helped me in understanding this
    problem.

    Best regards,

    Giulio Cesare

    On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
    > Hello,
    >
    > I have collected a few more data, as suggested by Aristedes.
    >
    > On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
    > > On 9/23/07, Aristedes Maniatis <ar..aniatis.org> wrote:
    > > >
    > > > On 23/09/2007, at 5:38 PM, Giulio Cesare Solaroli wrote:
    > > >
    > > > > log4j.logger.org.apache.cayenne=ERROR
    > > > > log4j.logger.org.apache.cayenne.access.QueryLogger=DEBUG
    > > > >
    > > > >
    > > > > Is there anything going on on the Cayenne code between the last [batch
    > > > > bind:xxx] log and the "updated" log?
    > > >
    > > > Well, you could always try:
    > > >
    > > > log4j.logger.org.apache.cayenne=DEBUG
    > >
    > > Argh. I should have thought about this myself. :-(
    >
    > Even with full log enabled I got the same exact log, so it looks like
    > all the time is really spent on the DB. And the next logs confirm
    > this.
    >
    >
    > > > > Is all this time spent on the DB only?
    > > >
    > > > What does turning on log_min_duration_statement tell you?
    >
    > I have being able to run PostgreSQL with logging enabled only on my
    > Parallels instance, as I can not start/stop the db on the deployment
    > server; this will make the absolute times not that relevant, but I
    > hope we can understand something more anyway.
    >
    > First of all, there is a huge difference between the deletion time of
    > the "cascade" records in different tables.
    >
    > [...]
    > LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > LOG: duration: 0.021 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > [...]
    >
    > To delete rows on the USRCNNRQS (aka UserConnectionRequest) table, it
    > averages at bout 0.020 ms.
    >
    >
    > For another table (USRCNN aka UserConnection) the times range from
    > 0.145 to 0.060, for later settling for 0.040 ms:
    >
    > [...]
    > LOG: duration: 0.090 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > LOG: duration: 0.071 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > LOG: duration: 0.067 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > [...]
    >
    >
    > It later took 43 seconds (!!) to parse and execute the deletion of a
    > single record, ...
    > [...]
    > LOG: statement: PREPARE S_5 AS DELETE FROM clipperz.RCR WHERE ID_RCR = $1
    > LOG: statement: <BIND>
    > LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > clipperz.RCR WHERE ID_RCR = $1]
    > LOG: duration: 42.998 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
    > LOG: statement: <BIND>
    > LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > clipperz.RCR WHERE ID_RCR = $1]
    > [...]
    >
    > ...later spotting time as low as 0.042 for the execution of the same query:
    > [...]
    > LOG: duration: 0.042 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
    > LOG: statement: <BIND>
    > LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > clipperz.RCR WHERE ID_RCR = $1]
    > [...]
    >
    > But as I may understand this different timing may be due to the
    > interaction of the Virtual machine with the real server, what is
    > puzzling me is that the deletion from the latest table had a
    > "constant" time of about 2 seconds each:
    > [...]
    > LOG: duration: 1929.043 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > LOG: duration: 1991.311 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > LOG: duration: 2078.249 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > LOG: duration: 2015.047 ms statement: EXECUTE <unnamed> [PREPARE:
    > DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > [...]
    >
    >
    > If nobody as something else to suggest, I will try to get back the the
    > DB schema to see if there is some constraint/index that is slowing
    > down the deletion of these records.
    >
    > Thank you very much for your attention.
    >
    > Best regards,
    >
    > Giulio Cesare
    >



    This archive was generated by hypermail 2.0.0 : Mon Oct 01 2007 - 11:31:25 EDT