Skip to end of metadata
Go to start of metadata

You are viewing an old version of this content. View the current version.

Compare with Current View Version History

« Previous Version 2 Next »

Background

Repository queries taking collection parameters are limited to 32K collection items.
For example, given a FragmentRepository method:

List<FragmentEntity> findAllByAnchorIdAndXpathIn(int anchorId, Collection<String> xpath);

This would throw an exception if supplied with a collection of 32,767 xpaths.

The reason is that JPA/Hibernate queries would translate the command to the following SQL:

SELECT * FROM fragment WHERE anchor_id = ? AND xpath IN (?,?,?,?...,?)

Each collection item (xpath) is supplied as an SQL query parameter. Hibernate with Postgres has a limit of 32,767 query parameters per SQL statement, so an exception will be thrown with too many.

CPS-1574 is to remove those limits (and write tests to verify).

Proposal #1: List partitioning

This solution partitions the input collection into lists of 32,000 and calls the original repository method, collecting the results:

List<FragmentEntity> findAllByAnchorIdAndXpathIn(int anchorId, Collection<String> xpaths);

default List<FragmentEntity> findAllByAnchorIdAndXpathInBatch(final int anchorId, final List<String> xpaths) {
    List<FragmentEntity> results = new ArrayList<>(xpaths.size());
    for (final Collection<String> xpathsBatch : Lists.partition(xpaths, 32000)) {
        results.addAll(findAllByAnchorIdAndXpathIn(anchorId, xpathsBatch));
    }
    return results;
}

In addition, if we wish to avoid the overhead of adding the results to a collection when not needed, we may directly call the original when the collection size is less than 32,000:

List<FragmentEntity> findAllByAnchorIdAndXpathIn(int anchorId, Collection<String> xpaths);

default List<FragmentEntity> findAllByAnchorIdAndXpathInBatch(final int anchorId, final List<String> xpaths) {
    if (xpaths.size() < 32000) {
        return findAllByAnchorIdAndXpathIn(anchorId, xpathsBatch);
    }
    List<FragmentEntity> results = new ArrayList<>(xpaths.size());
    for (final Collection<String> xpathsBatch : Lists.partition(xpaths, 32000)) {
        results.addAll(findAllByAnchorIdAndXpathIn(anchorId, xpathsBatch));
    }
    return results;
}

Proposal #2: Postgres arrays

This solution uses a third-party library to add PostGres array support to queries.

Comparison


  • No labels