Add test and analysis for issue #396: Duplicate Python Values#399
Add test and analysis for issue #396: Duplicate Python Values#399saulshanabrook wants to merge 2 commits intomainfrom
Conversation
- Created test_issue_396_duplicate_pyobjects.py with test cases demonstrating expected behavior for PyObject deduplication - Added comprehensive analysis document (ISSUE_396_ANALYSIS.md) with: - Problem root cause analysis - 6 different solution approaches evaluated - Recommendation: Object Identity Cache (minimal changes, good performance) - Implementation plan and next steps Tests currently pass as cloudpickle is deterministic in simple cases. The issue may manifest in more complex scenarios with classes in __main__.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Pull request overview
This pull request adds comprehensive test coverage and analysis documentation for issue #396, which concerns duplicate Python object nodes in the e-graph when cloudpickle produces non-deterministic byte sequences for the same object.
Changes:
- Added test file with three test cases demonstrating the duplicate PyObject issue
- Created detailed analysis document evaluating 6 potential solutions with recommendation
- Tests are designed to detect cloudpickle determinism issues with classes, nested classes, and closures
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| python/tests/test_issue_396_duplicate_pyobjects.py | Adds three test cases that verify PyObject deduplication behavior for same-instance objects, nested classes, and closures |
| ISSUE_396_ANALYSIS.md | Comprehensive analysis document detailing the root cause, 6 potential solutions, recommendation for Object Identity Cache approach, and implementation plan |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| # Define a simple class in the test module (acts like __main__) | ||
| class MyClass: | ||
| def __init__(self, value) -> None: |
There was a problem hiding this comment.
The __init__ method should have a return type annotation of -> None for consistency with other test files in this codebase. This is a standard Python typing convention for constructors.
|
|
||
| class OuterClass: | ||
| class InnerClass: | ||
| def __init__(self, x) -> None: |
There was a problem hiding this comment.
The __init__ method should have a return type annotation of -> None for consistency with other test files in this codebase. This is a standard Python typing convention for constructors.
| pickled2 = cloudpickle.dumps(obj) | ||
|
|
||
| # Report on pickle determinism | ||
| print("\nCloudpickle determinism check:") |
There was a problem hiding this comment.
Unnecessary f-string: This string literal does not contain any interpolation expressions and does not need to be an f-string.
| print(" ⚠ Cloudpickle produced DIFFERENT bytes for the same object!") | ||
| else: | ||
| print(" ✓ Cloudpickle is deterministic in this case") |
There was a problem hiding this comment.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| print(" ✗ E-graph treats objects as DIFFERENT - duplicates detected!") | ||
| print(f" Error: {e}") |
There was a problem hiding this comment.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| print("✗ Nested objects are NOT equal - duplicates detected!") | ||
| pytest.fail(f"Duplicate PyObject nodes for nested class: {e}") |
There was a problem hiding this comment.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| except Exception as e: | ||
| print("✗ Closures are NOT equal - duplicates detected!") |
There was a problem hiding this comment.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| import cloudpickle | ||
| import pytest | ||
|
|
||
| from egglog import * |
expected behavior for PyObject deduplication
Tests currently pass as cloudpickle is deterministic in simple cases.
The issue may manifest in more complex scenarios with classes in main.