NN clusterizer: Improve filling kernel speed #14510
davidrohr merged 17 commits into AliceO2Group:dev
Please consider the following formatting changes to AliceO2Group#14510
Error while checking build/O2/fullCI_slc9 for 560f77f at 2025-07-14 02:40: Full log here.
@drohr, the build error seems unrelated, and I am not sure what exactly it would fail at... The maximum value of uint8 would be 255, if I am not mistaken.
I don't think the error is unrelated. It tells you that you are accessing
Indeed, sorry for the noise. Not sure how I didn't spot that.
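For reference, the uint8 range mentioned above can be checked with a small standalone snippet. This is only a generic illustration of the 255 limit and of what happens when a wider value is narrowed. It is not the failing code from the CI log, and the helper names `fitsInU8`, `narrowToU8` and `checkedU8` are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// The largest value a uint8_t can hold is 255.
static_assert(std::numeric_limits<uint8_t>::max() == 255, "uint8_t tops out at 255");

// Hypothetical helper: true if the value can be stored in a uint8_t unchanged.
inline bool fitsInU8(int32_t value)
{
  return value >= 0 && value <= std::numeric_limits<uint8_t>::max();
}

// Hypothetical helper: unchecked narrowing. Conversion to an unsigned type is
// well defined and wraps modulo 256, so 256 becomes 0 and 257 becomes 1.
inline uint8_t narrowToU8(int32_t value)
{
  return static_cast<uint8_t>(value);
}

// Hypothetical helper: narrowing that asserts in debug builds when the value
// would not survive the conversion.
inline uint8_t checkedU8(int32_t value)
{
  assert(fitsInU8(value));
  return static_cast<uint8_t>(value);
}
```

An index that can exceed 255 would therefore silently wrap if it were stored in a uint8_t, which is why a wider type such as int32_t is the safer default for offsets.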
…r coalesced access
… for-loop over row dimension as access is somewhat coalesced too
These are the best improvements I could find so far. @drohr, can you merge for now once the CI is green, please?
Error while checking build/O2/fullCI_slc9 for 069a7e9 at 2025-07-17 16:09: Full log here.
Please ping @davidrohr and not @drohr over and over. We might share the same nice name, but we are not the same person :-)
@davidrohr Ready to merge once the CI is green. Not sure what is currently going on with the build container, but it builds locally.
Error while checking build/O2/fullCI_slc9 for 4949b55 at 2025-07-18 02:58: Full log here.
Error while checking build/O2/fullCI_slc9 for f4729fd at 2025-07-18 10:26: Full log here.
* First version of lookup tables
* Simplifying computations + bug-fixes
* Fixes for indexing and offsets
* Adjusting CPU kernel
* Please consider the following formatting changes
* Fix for row-number access
* Please consider the following formatting changes
* Improve kernel speed by ~15%. Next test: for-loop in pad direction for coalesced access
* Improving kernel speed by 30% compared to original version. Next try: for-loop over row dimension as access is somewhat coalesced too
* Please consider the following formatting changes
* Minor improvements for MC handling
* Beautifications to trigger the CI
* Compile-fix
* Fix int32_t error in fullCI build
---------
Co-authored-by: ALICE Action Bot <alibuild@cern.ch>
Improves GPU kernel speed by ~30%
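
The commit messages above mention looping in the pad direction to get coalesced access. As a rough sketch of why the loop direction matters, the snippet below computes the address stride between neighbouring threads for two different thread-to-index mappings. The `Window` struct, the memory layout (pad as the fastest-varying dimension) and all function names are assumptions made for illustration. They are not taken from the O2 clusterizer code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical geometry of the input window filled by the kernel.
struct Window {
  int32_t rows;
  int32_t pads;
  int32_t times;
};

// Assumed layout: row is the slowest dimension, pad the fastest one.
inline std::size_t flatIndex(const Window& w, int32_t row, int32_t time, int32_t pad)
{
  assert(row >= 0 && row < w.rows);
  assert(time >= 0 && time < w.times);
  assert(pad >= 0 && pad < w.pads);
  return (static_cast<std::size_t>(row) * w.times + time) * w.pads + pad;
}

// Distance in elements between the addresses touched by neighbouring threads
// when consecutive thread ids map to consecutive pads (needs pads >= 2).
inline std::size_t strideOverPads(const Window& w)
{
  return flatIndex(w, 0, 0, 1) - flatIndex(w, 0, 0, 0);
}

// Same distance when consecutive thread ids map to consecutive rows
// (needs rows >= 2).
inline std::size_t strideOverRows(const Window& w)
{
  return flatIndex(w, 1, 0, 0) - flatIndex(w, 0, 0, 0);
}
```

With this assumed layout, mapping threads to pads gives a stride of one element, so neighbouring threads read neighbouring addresses and the hardware can combine them into few memory transactions. Mapping threads to rows gives a stride of `times * pads` elements, which is much less favourable.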