@jack-berg

Looking at the recent #8000, something jumped out at me: gauges were allocating a seemingly large amount of memory!

Note: the benchmarks shown in the #8000 description were generated using the benchmark build command java -jar *-jmh.jar .., which only emits the ops/s summary figure. If you instead run ./gradlew :sdk:all:jmh -PjmhIncludeSingleClass=MetricRecordBenchmark, you get more detailed output that includes B/op.

All the other metrics have tiny B/op figures near zero thanks to our work to reduce allocations. Meanwhile, GAUGE_LAST_VALUE is reporting figures like 229945 B/op. This isn't as bad as it looks: each benchmark op is actually 10 * 1024 distinct record operations, so the allocation per record operation is 229945 / (10 * 1024) ≈ 22 B. Much better, but still not good.
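As a quick sanity check of that normalization, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration; only the 229945.518 B/op figure and the 10 * 1024 records per op come from the description above.

```java
public class PerRecordAllocation {
  // Converts JMH's gc.alloc.rate.norm (bytes per benchmark op) into bytes per
  // individual record call, given how many record calls one benchmark op performs.
  static double bytesPerRecord(double bytesPerBenchmarkOp, int recordsPerBenchmarkOp) {
    return bytesPerBenchmarkOp / recordsPerBenchmarkOp;
  }

  public static void main(String[] args) {
    int recordsPerOp = 10 * 1024; // each benchmark op performs this many record operations
    double perRecord = bytesPerRecord(229945.518, recordsPerOp);
    System.out.printf("%.2f B per record operation%n", perRecord);
  }
}
```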

I tracked it down to very old code in LongLastValueAggregator doing boxing between long and Long. Interestingly, we managed to solve this problem for DoubleLastValueAggregator a while back, but neglected to apply the solution to longs.
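To illustrate why boxing shows up as allocation, here is a small standalone sketch. It is not taken from the SDK; `BoxingDemo` and `box` are hypothetical names. It relies only on the documented behavior of `Long.valueOf`, which autoboxing compiles to: values from -128 to 127 are always cached, while other values may be freshly allocated.

```java
public class BoxingDemo {
  // Returning a long as a Long makes the compiler insert Long.valueOf(value).
  static Long box(long value) {
    return value;
  }

  public static void main(String[] args) {
    Long small1 = box(127L);
    Long small2 = box(127L);
    // Long.valueOf is specified to cache -128..127, so both refer to the same object.
    System.out.println(small1 == small2);

    Long big1 = box(1_000_000L);
    Long big2 = box(1_000_000L);
    // Outside the cached range a new Long is typically allocated per call
    // (the spec permits but does not require extra caching), so identity
    // usually differs even though the values are equal.
    System.out.println(big1 == big2);
    System.out.println(big1.equals(big2));
  }
}
```

Storing a recorded measurement in a field of type `Long` (or in an `AtomicReference<Long>`) goes through the same conversion on every record call, which is where a per-record allocation can come from.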

The fix is straightforward and mirrors the logic of DoubleLastValueAggregator.

Here's the before and after, with allocations approaching zero.

Before:

Benchmark                                          (aggregationTemporality)  (cardinality)  (instrumentTypeAndAggregation)   Mode  Cnt       Score      Error   Units
MetricRecordBenchmark.threads1                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5   12179.255 ±   42.202   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                          DELTA              1                GAUGE_LAST_VALUE  thrpt    5    2670.078 ±    8.627  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5  229945.518 ±    8.162    B/op
MetricRecordBenchmark.threads1:gc.count                               DELTA              1                GAUGE_LAST_VALUE  thrpt    5      30.000             counts
MetricRecordBenchmark.threads1                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5    9180.458 ±  245.010   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                          DELTA            100                GAUGE_LAST_VALUE  thrpt    5    2013.665 ±   52.958  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5  230066.007 ±   10.797    B/op
MetricRecordBenchmark.threads1:gc.count                               DELTA            100                GAUGE_LAST_VALUE  thrpt    5      22.000             counts
MetricRecordBenchmark.threads1                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   12823.809 ±  295.055   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                     CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5    2811.348 ±   65.588  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5  229945.443 ±    7.769    B/op
MetricRecordBenchmark.threads1:gc.count                          CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5      32.000             counts
MetricRecordBenchmark.threads1                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   10551.743 ±  567.672   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                     CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5    2314.450 ±  124.525  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5  230065.756 ±    9.472    B/op
MetricRecordBenchmark.threads1:gc.count                          CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5      26.000             counts
MetricRecordBenchmark.threads4                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5    1070.239 ±  539.133   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                          DELTA              1                GAUGE_LAST_VALUE  thrpt    5     234.467 ±  116.851  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5  229962.212 ±   79.697    B/op
MetricRecordBenchmark.threads4:gc.count                               DELTA              1                GAUGE_LAST_VALUE  thrpt    5       3.000             counts
MetricRecordBenchmark.threads4                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5    1313.746 ± 1090.433   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                          DELTA            100                GAUGE_LAST_VALUE  thrpt    5     287.944 ±  237.647  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5  230079.173 ±   66.513    B/op
MetricRecordBenchmark.threads4:gc.count                               DELTA            100                GAUGE_LAST_VALUE  thrpt    5       3.000             counts
MetricRecordBenchmark.threads4                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5    2742.159 ± 1213.808   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                     CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5     600.356 ±  264.526  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5  229951.342 ±   33.438    B/op
MetricRecordBenchmark.threads4:gc.count                          CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5       7.000             counts
MetricRecordBenchmark.threads4                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5    5098.889 ± 3121.738   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                     CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5    1117.715 ±  682.070  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5  230067.848 ±   16.629    B/op
MetricRecordBenchmark.threads4:gc.count                          CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5      13.000             counts

After:

Benchmark                                          (aggregationTemporality)  (cardinality)  (instrumentTypeAndAggregation)   Mode  Cnt       Score      Error   Units
MetricRecordBenchmark.threads1                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5   3510.788 ± 170.842   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                          DELTA              1                GAUGE_LAST_VALUE  thrpt    5      0.017 ±   0.094  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5      5.197 ±  27.900    B/op
MetricRecordBenchmark.threads1:gc.count                               DELTA              1                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads1                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5   2940.213 ±  94.036   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                          DELTA            100                GAUGE_LAST_VALUE  thrpt    5      0.017 ±   0.094  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5      6.260 ±  33.812    B/op
MetricRecordBenchmark.threads1:gc.count                               DELTA            100                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads1                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   3524.276 ±  62.724   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                     CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5      0.017 ±   0.094  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5      5.213 ±  28.144    B/op
MetricRecordBenchmark.threads1:gc.count                          CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads1                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   2829.248 ±  56.034   ops/s
MetricRecordBenchmark.threads1:gc.alloc.rate                     CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5      0.017 ±   0.094  MB/sec
MetricRecordBenchmark.threads1:gc.alloc.rate.norm                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5      6.479 ±  34.905    B/op
MetricRecordBenchmark.threads1:gc.count                          CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads4                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5   1356.267 ± 223.910   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                          DELTA              1                GAUGE_LAST_VALUE  thrpt    5      0.020 ±   0.094  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5     15.139 ±  71.777    B/op
MetricRecordBenchmark.threads4:gc.count                               DELTA              1                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads4                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5   1251.027 ± 429.411   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                          DELTA            100                GAUGE_LAST_VALUE  thrpt    5      0.020 ±   0.094  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5     16.109 ±  74.731    B/op
MetricRecordBenchmark.threads4:gc.count                               DELTA            100                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads4                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   1123.497 ± 645.130   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                     CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5      0.020 ±   0.095  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5     16.639 ±  68.509    B/op
MetricRecordBenchmark.threads4:gc.count                          CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts
MetricRecordBenchmark.threads4                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   1587.276 ± 289.479   ops/s
MetricRecordBenchmark.threads4:gc.alloc.rate                     CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5      0.020 ±   0.094  MB/sec
MetricRecordBenchmark.threads4:gc.alloc.rate.norm                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5     13.217 ±  64.311    B/op
MetricRecordBenchmark.threads4:gc.count                          CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5        ≈ 0            counts

@jack-berg jack-berg requested a review from a team as a code owner January 27, 2026 15:22
codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.17%. Comparing base (8dc1e03) to head (5abea94).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8017      +/-   ##
============================================
+ Coverage     90.16%   90.17%   +0.01%     
- Complexity     7483     7484       +1     
============================================
  Files           836      836              
  Lines         22562    22562              
  Branches       2237     2237              
============================================
+ Hits          20343    20346       +3     
+ Misses         1515     1513       -2     
+ Partials        704      703       -1     

Copilot AI left a comment

Pull request overview

This PR eliminates memory allocations in LongLastValueAggregator by replacing boxed Long objects with primitive long values wrapped in AtomicLong. This optimization mirrors a fix previously applied to DoubleLastValueAggregator in PR #7264.

Changes:

  • Replaced AtomicReference<Long> with AtomicReference<AtomicLong> to avoid boxing allocations
  • Added AtomicLong value field to hold the actual long value
  • Updated doRecordLong and doAggregateThenMaybeResetLongs to work with the new structure
  • Reduced memory allocations from ~230 KB per benchmark op (about 22 B per record operation) to near zero in benchmarks
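The structure described in the list above can be sketched roughly as follows. This is a simplified standalone illustration, not the actual LongLastValueAggregator code: the class name `LongLastValueSketch`, the method names, and the `fallback` parameter are assumptions made for the example, and it glosses over the finer concurrency details of the real aggregator.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

public class LongLastValueSketch {
  // Single reusable holder for the primitive value, allocated once per instance.
  private final AtomicLong value = new AtomicLong();
  // null means "nothing recorded since the last reset"; otherwise it points at `value`.
  private final AtomicReference<AtomicLong> current = new AtomicReference<>(null);

  /** Records a measurement without boxing: a primitive write plus a compare-and-set. */
  public void record(long newValue) {
    value.set(newValue);
    current.compareAndSet(null, value);
  }

  /** Returns the last recorded value, or fallback if none; optionally clears the state. */
  public long aggregate(boolean reset, long fallback) {
    AtomicLong holder = reset ? current.getAndSet(null) : current.get();
    return holder == null ? fallback : holder.get();
  }

  public static void main(String[] args) {
    LongLastValueSketch sketch = new LongLastValueSketch();
    sketch.record(41);
    sketch.record(42);
    System.out.println(sketch.aggregate(true, -1)); // prints 42
    System.out.println(sketch.aggregate(true, -1)); // prints -1, state was reset
  }
}
```

The point of the design is that the reference only ever toggles between null and one preallocated holder, so recording a value never creates a new object.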
