Skip to content

Conversation

@jack-berg
Copy link
Member

Split our DefaultSynchronousMetricStorage into delta and cumulative implementations. This serves two purposes:

  1. Makes it easier to understand. Slightly anyway. With the record and collect paths dependent on each of the combinations of MemoryMode and AggregationTemporality, there's a lot if (aggregationTemporality == x & memoryMode = y) {..} style special handling. Its hard to keep it all straight when its all mashed together.
  2. Delta temporality requires additional tracking overhead that cumulative does not. I have perf improvements planned for cumulative, but simply by untangling cumulative from delta, the perf under contention improves.

Before (From #8000):



Benchmark                       (aggregationTemporality)  (cardinality)  (instrumentTypeAndAggregation)   Mode  Cnt      Score     Error  Units
MetricRecordBenchmark.threads1                     DELTA              1                     COUNTER_SUM  thrpt    5  13414.208 ± 243.504  ops/s
MetricRecordBenchmark.threads1                     DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5  12276.148 ± 105.900  ops/s
MetricRecordBenchmark.threads1                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5  10896.580 ± 705.898  ops/s
MetricRecordBenchmark.threads1                     DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5   6642.787 ± 674.574  ops/s
MetricRecordBenchmark.threads1                     DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3651.887 ± 304.134  ops/s
MetricRecordBenchmark.threads1                     DELTA            100                     COUNTER_SUM  thrpt    5   8359.025 ± 777.598  ops/s
MetricRecordBenchmark.threads1                     DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5   9247.253 ± 423.551  ops/s
MetricRecordBenchmark.threads1                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5   9165.700 ± 143.755  ops/s
MetricRecordBenchmark.threads1                     DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5   7300.896 ± 684.395  ops/s
MetricRecordBenchmark.threads1                     DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3858.246 ±  34.989  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1                     COUNTER_SUM  thrpt    5  12433.135 ± 148.315  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5  13341.423 ± 242.611  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5  10628.592 ± 101.145  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5   6895.783 ± 740.681  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   4087.396 ± 435.895  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100                     COUNTER_SUM  thrpt    5  10402.076 ± 240.933  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5   9199.368 ± 107.627  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   9056.580 ± 297.773  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5   7475.743 ± 979.090  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3836.227 ± 131.765  ops/s
MetricRecordBenchmark.threads4                     DELTA              1                     COUNTER_SUM  thrpt    5   1577.822 ± 219.796  ops/s
MetricRecordBenchmark.threads4                     DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5   1615.582 ± 335.284  ops/s
MetricRecordBenchmark.threads4                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5   1208.008 ± 165.999  ops/s
MetricRecordBenchmark.threads4                     DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5    904.243 ±  22.615  ops/s
MetricRecordBenchmark.threads4                     DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    869.229 ±  31.214  ops/s
MetricRecordBenchmark.threads4                     DELTA            100                     COUNTER_SUM  thrpt    5   1725.486 ± 240.360  ops/s
MetricRecordBenchmark.threads4                     DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5   1422.319 ± 594.337  ops/s
MetricRecordBenchmark.threads4                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5   1560.890 ± 654.561  ops/s
MetricRecordBenchmark.threads4                     DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5   1587.582 ± 458.715  ops/s
MetricRecordBenchmark.threads4                     DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   1688.229 ± 181.653  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1                     COUNTER_SUM  thrpt    5   1540.747 ± 137.303  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5   1429.698 ± 220.415  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   1215.367 ± 546.045  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5   1237.215 ±  18.528  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    837.980 ±  23.871  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100                     COUNTER_SUM  thrpt    5   1602.628 ± 813.536  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5   1717.663 ± 577.817  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   1565.824 ± 298.550  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5   1352.174 ± 594.439  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   1465.394 ± 313.072  ops/s

After:

Benchmark                                          (aggregationTemporality)  (cardinality)  (instrumentTypeAndAggregation)   Mode  Cnt       Score      Error   Units
MetricRecordBenchmark.threads1                                        DELTA              1                     COUNTER_SUM  thrpt    5   12394.689 ±   78.935   ops/s
MetricRecordBenchmark.threads1                                        DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5   12191.909 ±  924.900   ops/s
MetricRecordBenchmark.threads1                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5   12179.255 ±   42.202   ops/s
MetricRecordBenchmark.threads1                                        DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5    6591.959 ±  310.505   ops/s
MetricRecordBenchmark.threads1                                        DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    3833.659 ±   92.551   ops/s
MetricRecordBenchmark.threads1                                        DELTA            100                     COUNTER_SUM  thrpt    5    8328.165 ±  197.753   ops/s
MetricRecordBenchmark.threads1                                        DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5    9304.038 ±  107.084   ops/s
MetricRecordBenchmark.threads1                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5    9180.458 ±  245.010   ops/s
MetricRecordBenchmark.threads1                                        DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5    7301.764 ±  460.243   ops/s
MetricRecordBenchmark.threads1                                        DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    4056.783 ±  110.216   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE              1                     COUNTER_SUM  thrpt    5   14943.538 ±  521.012   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5   15711.561 ±  925.512   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   12823.809 ±  295.055   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5    8688.402 ±  400.953   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    4219.736 ±  173.475   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE            100                     COUNTER_SUM  thrpt    5    8285.499 ±  202.633   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5    9312.007 ±  273.076   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   10551.743 ±  567.672   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5    7034.936 ±  221.518   ops/s
MetricRecordBenchmark.threads1                                   CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    3933.954 ±  129.073   ops/s
MetricRecordBenchmark.threads4                                        DELTA              1                     COUNTER_SUM  thrpt    5    1358.914 ±  316.886   ops/s
MetricRecordBenchmark.threads4                                        DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5    1385.987 ±  391.289   ops/s
MetricRecordBenchmark.threads4                                        DELTA              1                GAUGE_LAST_VALUE  thrpt    5    1070.239 ±  539.133   ops/s
MetricRecordBenchmark.threads4                                        DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5    1312.886 ±   17.189   ops/s
MetricRecordBenchmark.threads4                                        DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    1007.867 ±   35.051   ops/s
MetricRecordBenchmark.threads4                                        DELTA            100                     COUNTER_SUM  thrpt    5    1273.409 ±  767.265   ops/s
MetricRecordBenchmark.threads4                                        DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5    1383.256 ±  766.709   ops/s
MetricRecordBenchmark.threads4                                        DELTA            100                GAUGE_LAST_VALUE  thrpt    5    1313.746 ± 1090.433   ops/s
MetricRecordBenchmark.threads4                                        DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5    1646.800 ±  618.065   ops/s
MetricRecordBenchmark.threads4                                        DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    1645.053 ±  235.244   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE              1                     COUNTER_SUM  thrpt    5    5706.951 ±  625.129   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5    4763.731 ± 2624.753   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5    2742.159 ± 1213.808   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5    1490.623 ±   33.851   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    1631.722 ±  225.171   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE            100                     COUNTER_SUM  thrpt    5    6201.092 ± 5854.249   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5    5497.660 ± 5484.849   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5    5098.889 ± 3121.738   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5    5615.150 ± 1449.110   ops/s
MetricRecordBenchmark.threads4                                   CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    4640.358 ± 2528.810   ops/s

I calculate ~3x improvement for cumulative instruments under contention (i.e. 4 thread cases) and more modest ~1.2x improvement for cumulative instruments without contention (i.e. 1 thread cases).

@jack-berg jack-berg requested a review from a team as a code owner January 26, 2026 23:18
aggregatorHolder.activeRecordingThreads.addAndGet(-2);
}
} while (true);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the key work that cumulative instruments get to avoid.

On each collection, delta instruments substitute out the ConcurrentHashMap which holds AggregatorHandle instances corresponding to each series. In order to prevent lost write scenarios of the following sequence, delta requires additional coordination between collection and recording threads:

  • T1: record thread grabs instance 1 of ConcurrentHashMap
  • T2: collect thread obtains instance 1 of ConcurrentHashMap, and collects from all series
  • T3: collect thread updates the ConcurrentHashMap to a new instance 2
  • T4: record thread records to instance 1 of ConcurrentHashMap, which is never seen by any current or future collect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In prom client java, we have 2 sets of variables (for the map in this case) and switch between them using an atomic. I wonder if we could apply this pattern here as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm familiar with how the prometheus works and will be incrementally applying techniques where useful / applicable - I'm trying to optimize the perf over there as well. The differences between prom and otel boil down to otel needing to accommodate all combinations of memory mode (immutable, reusable) and temporality (cumulative, delta), where prom client java only has the equivalent of immutable memory mode and cumulative temporality.

@codecov
Copy link

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 94.68085% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.18%. Comparing base (d6e7e6e) to head (9726fbd).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...nternal/state/DefaultSynchronousMetricStorage.java 94.62% 1 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               main    #8015   +/-   ##
=========================================
  Coverage     90.17%   90.18%           
+ Complexity     7479     7456   -23     
=========================================
  Files           834      834           
  Lines         22540    22569   +29     
  Branches       2236     2241    +5     
=========================================
+ Hits          20326    20353   +27     
+ Misses         1511     1510    -1     
- Partials        703      706    +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors synchronous metric storage to separate cumulative vs. delta aggregation temporality implementations, aiming to simplify logic and improve performance under contention (especially for cumulative).

Changes:

  • Replaced monolithic DefaultSynchronousMetricStorage implementation with distinct delta and cumulative implementations selected via a factory method.
  • Updated SynchronousMetricStorage.create to delegate to the new DefaultSynchronousMetricStorage.create factory.
  • Updated unit tests to construct storage via the factory and adjust internal assertions accordingly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/DefaultSynchronousMetricStorage.java Splits storage logic into delta vs. cumulative implementations and introduces a factory constructor path.
sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/SynchronousMetricStorage.java Switches to using the new storage factory rather than directly instantiating the default storage.
sdk/metrics/src/test/java/io/opentelemetry/sdk/metrics/internal/state/SynchronousMetricStorageTest.java Updates tests to use the new factory and adapts assertions around pooling/reset behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +215 to +217
assertThat(storage)
.extracting("aggregatorHandlePool", as(collection(AggregatorHandle.class)))
.hasSize(0);
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These assertions reach into a private implementation detail via reflective field extraction ("aggregatorHandlePool"). This is brittle (renames/refactors won’t be caught by the compiler) and leads to repeated boilerplate throughout the test. Consider reintroducing a small test hook on DefaultSynchronousMetricStorage (e.g., a package-private/VisibleForTesting method that returns the pool size or an optional pool) so tests can assert pooling behavior without relying on reflection, or alternatively assert pooling indirectly via createHandle() call counts and remove the pool-size checks.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aggregatorHandlePool field now only exists on the delta storage, and so tests would need to cast to the storage to delta prior to invoking the visible-for-testing method. I don't want to move aggregatorHandlePool down to the base class DefaultSynchronousMetricStorage because its not a concern for cumulative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants