MAIN

EN

Cartesian product in parallel streams

Multithreading • Combinatorics • Direct product
05.10.2021

Consider an algorithm for obtaining a Cartesian product using Java Streams. This solution is similar to three nested for loops, with the difference that the outer loop is replaced with a stream for the convenience of subsequent parallelization. We’ll use the reduce method with three parameters: identity, accumulator and combiner.

Algorithm with three nested loops: Cartesian product of sets.

The incoming data is an arbitrary number of lists and their elements.

a = [A1,B1,C1,D1]
b = [A2,B2,C2]
c = [A3,B3]
d = [A4]

The number of possible combinations is the product of the number of elements in all lists:

4 * 3 * 2 * 1 = 24

We get a stream of lists Stream<List<T>> from the input data and call the reduce method:

  • identity — in our case it is a list of lists List<List<T>>, filled with one empty value. We will use it as an intermediate result and as a final result.

  • accumulator — for each step of reduction, we define a method with two parameters: intermediate result and current list. We supplement the intermediate result with elements from the current list.

  • combiner — is used in parallel mode, combines the results of the work of streams, obtains the Cartesian product of the results.

The step-by-step process of accumulating a Cartesian product looks like this:

result0: [[]]
  list1: [A1,B1,C1,D1]
--------
result1: [[A1],[B1],[C1],[D1]]
  list2: [A2,B2,C2]
--------
result2: [[A1,A2],[A1,B2],[A1,C2],[B1,A2],[B1,B2],...[D1,C2]]
  list3: [A3,B3]
--------
result3: [[A1,A2,A3],[A1,A2,B3],[A1,B2,A3],[A1,B2,B3],[A1,C2,A3],...[D1,C2,B3]]
  list4: [A4]
--------
result4: [[A1,A2,A3,A4],[A1,A2,B3,A4],[A1,B2,A3,A4],[A1,B2,B3,A4],...[D1,C2,B3,A4]]

Combinations by columns #

Number of combinations: 24
[A1, A2, A3, A4] [B1, A2, A3, A4] [C1, A2, A3, A4] [D1, A2, A3, A4]
[A1, A2, B3, A4] [B1, A2, B3, A4] [C1, A2, B3, A4] [D1, A2, B3, A4]
[A1, B2, A3, A4] [B1, B2, A3, A4] [C1, B2, A3, A4] [D1, B2, A3, A4]
[A1, B2, B3, A4] [B1, B2, B3, A4] [C1, B2, B3, A4] [D1, B2, B3, A4]
[A1, C2, A3, A4] [B1, C2, A3, A4] [C1, C2, A3, A4] [D1, C2, A3, A4]
[A1, C2, B3, A4] [B1, C2, B3, A4] [C1, C2, B3, A4] [D1, C2, B3, A4]

Combinations of elements in parallel streams #

In parallel mode, the speed of the algorithm increases when multiplying a large number of small lists, for example, 20 lists of 2 elements or 15 lists of 3 elements. The computation time reduces by one and a half to two times. In other cases, the computation time is about the same as for three nested for loops.

/**
 * @param lists an arbitrary number of lists
 * @param <T>   the type of elements in lists
 * @return the Cartesian product of the passed lists
 */
public static <T> List<List<T>> cartesianProduct(List<List<T>> lists) {
    // incoming data is not null
    if (lists == null) return Collections.emptyList();
    // stream of lists Stream<List<T>>
    return lists.stream()
        // enable parallel mode
        .parallel()
        // discard null and empty lists
        .filter(list -> list != null && list.size() > 0)
        // reduce the stream of lists into one list, get a Cartesian product
        // reduce(identity, accumulator, combiner)
        .reduce( // intermediate result, contains one empty value
            Collections.singletonList(Collections.emptyList()),
            // bypass the received lists and supplement the intermediate result
            // with their elements, at each step obtain a new intermediate result
            (result, list) -> {
                // next intermediate result
                List<List<T>> nResult = new ArrayList<>(result.size() * list.size());
                // rows of the current intermediate result
                for (List<T> row : result) {
                    // elements of the current list
                    for (T el : list) {
                        // a new row for the next intermediate result
                        List<T> nRow = new ArrayList<>(row.size() + 1);
                        // add the current row
                        nRow.addAll(row);
                        // add the current element
                        nRow.add(el);
                        // add to the next intermediate result
                        nResult.add(nRow);
                    }
                }
                // pass to the next iteration
                return nResult;
            },
            // is used in parallel mode, combines the results of the work
            // of streams, obtains the Cartesian product of the results
            (result1, result2) -> {
                // combined result
                List<List<T>> result = new ArrayList<>(result1.size() * result2.size());
                // bypass the results
                for (List<T> comb1 : result1) {
                    for (List<T> comb2 : result2) {
                        // add up the combinations
                        List<T> comb = new ArrayList<>(comb1.size() + comb2.size());
                        comb.addAll(comb1);
                        comb.addAll(comb2);
                        result.add(comb);
                    }
                }
                return result;
            });
}
// start the program and output the result
public static void main(String[] args) {
    // an arbitrary number of lists and their elements
    List<String> a = Arrays.asList("A1", "B1", "C1", "D1");
    List<String> b = Arrays.asList("A2", "B2", "C2");
    List<String> c = Arrays.asList("A3", "B3");
    List<String> d = Collections.singletonList("A4");
    // cartesian product
    List<List<String>> cp = cartesianProduct(Arrays.asList(a, b, c, d));
    // output
    System.out.println("Number of combinations: " + cp.size());
    // combinations by columns
    int rows = 6;
    IntStream.range(0, rows).forEach(i -> System.out.println(
            IntStream.range(0, cp.size())
                    .filter(j -> j % rows == i)
                    .mapToObj(j -> cp.get(j).toString())
                    .collect(Collectors.joining(" "))));
}

The output for this code see above: combinations by columns.

© Golovin G.G., Code with comments, translation from Russian, 2021

Upward