[CELEBORN-2221][CIP-14] Support writing with compression in C++ client#3575
[CELEBORN-2221][CIP-14] Support writing with compression in C++ client#3575afterincomparableyum wants to merge 5 commits intoapache:mainfrom
Conversation
6abae43 to
17e891a
Compare
Integrate existing compression infrastructure (LZ4 and ZSTD) into the C++ client write path. This enables compression during pushData operations, matching the functionality available in the Java client. Changes: - Add compression support to ShuffleClientImpl: * Add shuffleCompressionEnabled_ flag and compressor_ member * Initialize compressor from CelebornConf in constructor * Compress data in pushData() when compression is enabled * Use compressed size for batchBytesSize tracking - Configuration integration: * Read compression codec from celeborn.client.shuffle.compression.codec * Read ZSTD compression level from celeborn.client.shuffle.compression.zstd.level * Default to NONE (compression disabled) - Retry/revive support: * Retry path correctly uses pre-compressed body buffer * No re-compression needed during retries - Testing: * Add CompressorFactoryTest for factory pattern and config integration * Add compression config tests to CelebornConfTest * Test offset compression support for both LZ4 and ZSTD
17e891a to
d9be058
Compare
|
@HolyLow @SteNicholas @FMX @RexXiong Could you please help review this PR? Appreciate your help in improving this as needed! |
|
@afterincomparableyum, please fix the failure of CI which is caused by format. |
|
@SteNicholas I have fixed the clang format issues. |
There was a problem hiding this comment.
Pull request overview
Integrates existing compression (LZ4/ZSTD) into the C++ client shuffle write path so pushData() can send compressed payloads driven by CelebornConf.
Changes:
- Add optional compression in
ShuffleClientImpl::pushData()and initialize a compressor from config. - Extend C++ conf tests for shuffle compression codec and ZSTD level settings.
- Add unit tests for compressor factory/config integration and wire them into the C++ test target.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| cpp/celeborn/conf/tests/CelebornConfTest.cpp | Adds default/override assertions for shuffle compression codec and ZSTD level. |
| cpp/celeborn/client/tests/CompressorFactoryTest.cpp | Adds tests for compressor creation from conf and “offset” compression behavior. |
| cpp/celeborn/client/tests/CMakeLists.txt | Registers the new compressor factory test in the test executable. |
| cpp/celeborn/client/ShuffleClient.h | Adds compressor-related members to ShuffleClientImpl. |
| cpp/celeborn/client/ShuffleClient.cpp | Initializes compressor from conf and compresses payload in pushData(). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
9bfc4f6 to
36d3a41
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@afterincomparableyum, could you take a look at the review comment of Copilot? |
d396eb0 to
3977b3d
Compare
|
@SteNicholas I took a look at the comments of Copilot and made changes as needed. |
Integrate existing compression infrastructure (LZ4 and ZSTD) into the C++ client write path. This enables compression during pushData operations, matching the functionality available in the Java client.
Changes:
Add compression support to ShuffleClientImpl:
Configuration integration:
Retry/revive support:
Testing:
How was this patch tested?
Unit Tests, as well as compiling code