Re: Optimize cascade deletes

From: Giulio Cesare Solaroli (giulio.cesar..mail.com)
Date: Tue Oct 02 2007 - 03:02:55 EDT

  • Next message: pinkybum_nabble: "To dep PK"

    Hello Tore,

    On 10/2/07, Tore Halset <halse..vv.ntnu.no> wrote:
    > Hello.
    >
    > This does not sound like a cayenne issue any more.

    Definitely.

    > Here is what I would do to get going.
    >
    > * Remove the extra OS layer if you are still using it. Was it
    > Parallells?

    Parallels could alter the absolute values of the operation timing; but
    I have see an consistent difference between 20 and 100 times faster on
    the newly created table; this make me think that it is not a problem
    bound to the usage of a virtual machine.

    > * Update to latest PostgreSQL and JDBC driver.

    I may try to update to the latest PostgreSQL version, but it is not a
    JDBC issue as I got the same problem issuing the query right from
    psql.

    > * Use vacuum analyze and autovacuum

    I have already tried this.

    > * Find slow queries and use "explain <query>"

    The slow query is a simple delete on a table accessed using the
    primary key. The explain plan reports an index scan, as expected.

    > * Make sure PostgreSQL are set up with enough memory.

    The current instance i probably running with a very low memory
    footprint, but if a copy of the same table on the same PostgreSQL
    instance is running the same query 20 to 100 times faster, there is
    something else to fix before tuning the memory.

    > * Ask on pgsql-performanc..ostgresql.org

    Great pointer. Thanks!!!! :-)

    Best regards,

    Giulio Cesare

    > On Oct 1, 2007, at 19:05 , Giulio Cesare Solaroli wrote:
    >
    > > Yes,
    > >
    > > I have vacuumed it and also rebuild the indexes to no avail. :-(
    > >
    > > I am not a PostgreSQL expert (and not even an expert DBA in general),
    > > so it is very possible that I am doing something very wrong.
    > >
    > > This is both a good news (the application should be fine, and I don't
    > > need to tune it much further) and bad news (I need to learn how to
    > > tune a PostgreSQL db, or find someone that could help).
    > >
    > > Best regards,
    > >
    > > Giulio Cesare
    > >
    > >
    > >
    > >
    > > On 10/1/07, Michael Gentry <blacknex..mail.com> wrote:
    > >> Have you vacuumed the original (slow) table?
    > >>
    > >> /dev/mrg
    > >>
    > >> On 10/1/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
    > >>> Hello everybody,
    > >>>
    > >>> today I have being able to do some more tests and I have found out
    > >>> that the problem seems to be caused by a "fragmentation" of the data
    > >>> on the Postgresql table space.
    > >>>
    > >>> I have created a copy of the table where delete statements where
    > >>> very
    > >>> slow (using the "create table .... as select from ..." syntax) and
    > >>> matching the structure of the new copy to the original one (index
    > >>> and
    > >>> constraint wise); in this new table the performance of the delete
    > >>> statements where from 20 to 100 times faster that in the original
    > >>> table. :-(
    > >>>
    > >>> This rules out Cayenne as the culprit, but leave me wondering how to
    > >>> avoid this degradation of performances on Postgresql; but this is
    > >>> probably not the right place where to start.
    > >>>
    > >>> Thank you everybody for your attention and for the very useful
    > >>> suggestions or pointers that greatly helped me in understanding this
    > >>> problem.
    > >>>
    > >>> Best regards,
    > >>>
    > >>> Giulio Cesare
    > >>>
    > >>>
    > >>>
    > >>>
    > >>>
    > >>>
    > >>>
    > >>> On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
    > >>>> Hello,
    > >>>>
    > >>>> I have collected a few more data, as suggested by Aristedes.
    > >>>>
    > >>>> On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
    > >>>>> On 9/23/07, Aristedes Maniatis <ar..aniatis.org> wrote:
    > >>>>>>
    > >>>>>> On 23/09/2007, at 5:38 PM, Giulio Cesare Solaroli wrote:
    > >>>>>>
    > >>>>>>> log4j.logger.org.apache.cayenne=ERROR
    > >>>>>>> log4j.logger.org.apache.cayenne.access.QueryLogger=DEBUG
    > >>>>>>>
    > >>>>>>>
    > >>>>>>> Is there anything going on on the Cayenne code between the
    > >>>>>>> last [batch
    > >>>>>>> bind:xxx] log and the "updated" log?
    > >>>>>>
    > >>>>>> Well, you could always try:
    > >>>>>>
    > >>>>>> log4j.logger.org.apache.cayenne=DEBUG
    > >>>>>
    > >>>>> Argh. I should have thought about this myself. :-(
    > >>>>
    > >>>> Even with full log enabled I got the same exact log, so it looks
    > >>>> like
    > >>>> all the time is really spent on the DB. And the next logs confirm
    > >>>> this.
    > >>>>
    > >>>>
    > >>>>>>> Is all this time spent on the DB only?
    > >>>>>>
    > >>>>>> What does turning on log_min_duration_statement tell you?
    > >>>>
    > >>>> I have being able to run PostgreSQL with logging enabled only on my
    > >>>> Parallels instance, as I can not start/stop the db on the
    > >>>> deployment
    > >>>> server; this will make the absolute times not that relevant, but I
    > >>>> hope we can understand something more anyway.
    > >>>>
    > >>>> First of all, there is a huge difference between the deletion
    > >>>> time of
    > >>>> the "cascade" records in different tables.
    > >>>>
    > >>>> [...]
    > >>>> LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > >>>> LOG: duration: 0.021 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > >>>> LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
    > >>>> [...]
    > >>>>
    > >>>> To delete rows on the USRCNNRQS (aka UserConnectionRequest)
    > >>>> table, it
    > >>>> averages at bout 0.020 ms.
    > >>>>
    > >>>>
    > >>>> For another table (USRCNN aka UserConnection) the times range from
    > >>>> 0.145 to 0.060, for later settling for 0.040 ms:
    > >>>>
    > >>>> [...]
    > >>>> LOG: duration: 0.090 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > >>>> LOG: duration: 0.071 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > >>>> LOG: duration: 0.067 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
    > >>>> [...]
    > >>>>
    > >>>>
    > >>>> It later took 43 seconds (!!) to parse and execute the deletion
    > >>>> of a
    > >>>> single record, ...
    > >>>> [...]
    > >>>> LOG: statement: PREPARE S_5 AS DELETE FROM clipperz.RCR WHERE
    > >>>> ID_RCR = $1
    > >>>> LOG: statement: <BIND>
    > >>>> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > >>>> clipperz.RCR WHERE ID_RCR = $1]
    > >>>> LOG: duration: 42.998 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
    > >>>> LOG: statement: <BIND>
    > >>>> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > >>>> clipperz.RCR WHERE ID_RCR = $1]
    > >>>> [...]
    > >>>>
    > >>>> ...later spotting time as low as 0.042 for the execution of the
    > >>>> same query:
    > >>>> [...]
    > >>>> LOG: duration: 0.042 ms statement: EXECUTE <unnamed> [PREPARE:
    > >>>> DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
    > >>>> LOG: statement: <BIND>
    > >>>> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
    > >>>> clipperz.RCR WHERE ID_RCR = $1]
    > >>>> [...]
    > >>>>
    > >>>> But as I may understand this different timing may be due to the
    > >>>> interaction of the Virtual machine with the real server, what is
    > >>>> puzzling me is that the deletion from the latest table had a
    > >>>> "constant" time of about 2 seconds each:
    > >>>> [...]
    > >>>> LOG: duration: 1929.043 ms statement: EXECUTE <unnamed>
    > >>>> [PREPARE:
    > >>>> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > >>>> LOG: duration: 1991.311 ms statement: EXECUTE <unnamed>
    > >>>> [PREPARE:
    > >>>> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > >>>> LOG: duration: 2078.249 ms statement: EXECUTE <unnamed>
    > >>>> [PREPARE:
    > >>>> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > >>>> LOG: duration: 2015.047 ms statement: EXECUTE <unnamed>
    > >>>> [PREPARE:
    > >>>> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
    > >>>> [...]
    > >>>>
    > >>>>
    > >>>> If nobody as something else to suggest, I will try to get back
    > >>>> the the
    > >>>> DB schema to see if there is some constraint/index that is slowing
    > >>>> down the deletion of these records.
    > >>>>
    > >>>> Thank you very much for your attention.
    > >>>>
    > >>>> Best regards,
    > >>>>
    > >>>> Giulio Cesare
    > >>>>
    > >>>
    > >>
    > >
    >
    >



    This archive was generated by hypermail 2.0.0 : Tue Oct 02 2007 - 03:03:26 EDT