Describe the enhancement requested

In the chunked sort kernels (for ChunkedArray and Table), the most expensive step can be the recursive merge of sorted chunks after each individual chunk has been sorted.

Currently, this merge step resolves chunked indices on every value access, so chunk resolution is computed O(n * log2(k)) times (where n is the input length and k is the number of chunks).

Instead, we could compute chunked indices once, after sorting the individual chunks. The merge step would then involve no chunk resolution at all, only direct accesses through ResolvedChunks.
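A minimal sketch of the idea, using illustrative names (`ResolvedIndex`, `Resolve`, `ResolveAll` are hypothetical, not Arrow's actual API): each logical index is resolved into a (chunk, in-chunk offset) pair once, up front, so the merge comparisons can read values directly instead of performing a binary search per access.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Arrow's ResolvedChunk: a (chunk, offset) pair.
struct ResolvedIndex {
  int32_t chunk;     // which chunk the logical index falls in
  int64_t in_chunk;  // offset within that chunk
};

// offsets[i] is the logical start of chunk i; offsets.back() is the total
// length. One resolution is a binary search over the chunk offsets.
ResolvedIndex Resolve(const std::vector<int64_t>& offsets, int64_t logical) {
  // upper_bound - 1 locates the chunk containing `logical`.
  auto it = std::upper_bound(offsets.begin(), offsets.end(), logical);
  int32_t chunk = static_cast<int32_t>(it - offsets.begin()) - 1;
  return {chunk, logical - offsets[chunk]};
}

// The proposed change: resolve the whole batch of sorted indices once,
// instead of calling Resolve inside every merge comparison.
std::vector<ResolvedIndex> ResolveAll(const std::vector<int64_t>& offsets,
                                      const std::vector<int64_t>& indices) {
  std::vector<ResolvedIndex> out;
  out.reserve(indices.size());
  for (int64_t idx : indices) out.push_back(Resolve(offsets, idx));
  return out;
}
```

With pre-resolution, the O(log2 k) binary search runs once per index rather than once per access during the O(log2 k) merge levels.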
Component(s)
C++
There are potential drawbacks, though:

- the size taken by those temporary ResolvedChunks is twice the size of the indices, hence a bigger CPU cache footprint
- there has to be a final "reverse resolution" step that converts the sorted ResolvedChunks back into absolute indices... or we maintain those absolute indices alongside the ResolvedChunks, which implies an even bigger cache footprint

Experimenting will tell whether this can be beneficial.
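The "reverse resolution" step mentioned above could look like the following sketch (again with illustrative names; `Unresolve` and the `ResolvedIndex` struct are assumptions, not Arrow's API): each (chunk, in-chunk offset) pair is mapped back to an absolute index by adding the chunk's logical start offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Arrow's ResolvedChunk: a (chunk, offset) pair.
// Note it holds two fields, which is why its footprint is roughly twice
// that of a plain index array.
struct ResolvedIndex {
  int32_t chunk;
  int64_t in_chunk;
};

// Convert merged (chunk, in-chunk) pairs back into absolute indices.
// chunk_starts[i] is the logical start offset of chunk i.
std::vector<int64_t> Unresolve(const std::vector<int64_t>& chunk_starts,
                               const std::vector<ResolvedIndex>& resolved) {
  std::vector<int64_t> out;
  out.reserve(resolved.size());
  for (const auto& r : resolved) {
    out.push_back(chunk_starts[r.chunk] + r.in_chunk);
  }
  return out;
}
```

This final pass is linear in the number of indices, so the question is whether the per-access savings during the merge outweigh the extra pass and the larger working set.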