Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagesql
SELECT * FROM fragment WHERE anchor_id = ? AND xpath IN (?,?,?,?,?,?...,?)
SELECT * FROM fragment WHERE anchor_id = ? AND xpath IN (?,?,?,?,?,?...,?)
SELECT * FROM fragment WHERE anchor_id = ? AND xpath IN (?,?,?,?)

Observations

Performance testing is

...

inconclusive (so far)

Testing with a batch size of 1000 (not 32000) showed a performance drop of 30% at 20000 items. More testing is needed to see if this applies with batch size of 32000, or if the test involved bad luck.

Advantage: Solution is easy to understand

...

Testing so far has shown no noticeable performance difference with previous proposal.

Observations

Performance is unchanged

Testing shows there was no increase or decrease in performance compared to master branch after arrays were used.

Disadvantage: extra dependency

...

Code Block
languagejava
    @Modifying
    @Query(value = "DELETE FROM fragment f WHERE anchor_id = :anchorId AND xpath LIKE ANY (:xpathPatterns)", nativeQuery = true)
    void deleteByAnchorIdAndXpathLikeAny(@Param("anchorId") int anchorId, @Param("xpathPatterns") String[] xpathPatterns);

    default void deleteByAnchorIdAndXpathLikeAny(int anchorId, Collection<String> xpathPatterns) {
        deleteByAnchorIdAndXpathLikeAny(anchorId, xpathPatterns.toArray(new String[0]));
    }

Comparison

  • Testing shows both solutions have almost identical performance - and the same performance as master branch before the changesMore testing is needed to confirm if list partitioning drops performance with large datasets.
  • Solution #1 is explicit using 32K batches, so easy to understand what it does
  • Solution #2 has shorter code, but not very obvious to the reader what it does
  • Solution #2 simplifies code in the cases when we need Postgres array support.
  • It's possible to use both solutions, where most appropriate.