Skip to content

Fix: resolve intermittent system-test timeouts in common.ts#2722

Open
thiyaguk09 wants to merge 2 commits intomainfrom
fix/system-test-flake-common
Open

Fix: resolve intermittent system-test timeouts in common.ts#2722
thiyaguk09 wants to merge 2 commits intomainfrom
fix/system-test-flake-common

Conversation

@thiyaguk09
Copy link
Contributor

Description

This PR resolves intermittent flakiness and timeouts in the Service system tests within common.ts.

Key Changes:

  • Dynamic Ports: Refactored mock servers to use port 0 (ephemeral ports) instead of hardcoded 8118 to prevent EADDRINUSE collisions in CI runners.
  • Retry Logic: Increased the test timeout to 90s to accommodate exponential backoff delays and CI environment latency.
  • Lifecycle Management: Wrapped assertions in try/catch to ensure mockServer.close() is always called, preventing socket leaks.
  • Async Fix: Corrected a race condition in the non-responsive host test where done() was being called before the retry cycle completed.

Impact

Reduces "Exit Status 1" build failures in Kokoro CI, improving the stability of the continuous integration pipeline for the Node.js Storage library.

Testing

  • System Tests: Verified locally via npm run system-test.
  • Telemetry: Inspected retry cycles using NODE_DEBUG=http to confirm the library correctly registers all 4 attempts before concluding.
  • Cleanup: Confirmed no orphaned processes or open sockets remain after test completion.

Additional Information

The original 60s timeout was frequently exceeded during the 4-attempt retry cycle when the CI runner was under heavy load. The move to dynamic ports is the primary structural fix to prevent cross-test interference.

Checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease
  • Appropriate docs were updated
  • Appropriate comments were added, particularly in complex areas or places that require background
  • No new warnings or issues will be generated from this change

Fixes #2647

- Implement dynamic port allocation (port 0) for mock servers to avoid
EADDRINUSE collisions in CI.
- Increase timeout to 90s for retry logic to accommodate environmental
latency.
- Ensure mock servers are explicitly closed in both success and failure
blocks.
- Fix async race condition in 'non-responsive hosts' test by moving
done() into the callback.
@thiyaguk09 thiyaguk09 requested review from a team as code owners February 6, 2026 12:09
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: storage Issues related to the googleapis/nodejs-storage API. labels Feb 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the googleapis/nodejs-storage API. size: m Pull request size is medium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Common Service: should retry a request failed

1 participant