[BugFix] Fix the bug when boolean comparison condition is simplified to field #5071
Signed-off-by: Songkan Tang <songkant@amazon.com>
Summary by CodeRabbit
Walkthrough
Convert boolean field predicates earlier in Calcite traversal and predicate analysis to emit exact boolean term or negated-term queries; add unit and integration tests, a REST YAML test, and expected explain-plan YAMLs covering boolean pushdown cases.
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant Planner as CalcitePlanner
participant Rex as CalciteRexNodeVisitor
participant Analyzer as PredicateAnalyzer
participant QExpr as QueryExpression
participant DSL as OpenSearch DSL
Client->>Planner: submit SQL with boolean predicate
Planner->>Rex: translate Rex nodes (compare / NOT)
Rex->>Planner: rewrite != / NOT -> IS_NOT_* when applicable
Planner->>Analyzer: analyzeExpression(filter)
Analyzer->>Analyzer: detect NamedFieldExpression.isBooleanType()
Analyzer->>QExpr: convert boolean field -> isTrue()/isFalse()/isNotTrue()/isNotFalse()
QExpr->>DSL: emit TermQuery or must_not TermQuery (combined with query_string)
DSL-->>Planner: return pushed-down DSL
Planner-->>Client: explain/execute with pushed-down boolean term
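The last two steps of the diagram can be sketched in plain Java. This is only an illustration under assumptions: `BooleanTermSketch`, `BoolOp`, `term`, `mustNot`, and `toDsl` are hypothetical names, the DSL is built as strings rather than with the OpenSearch query builders, and the mapping of each operator to a term or must_not term query is inferred from the walkthrough, not copied from the PR.

```java
import java.util.Locale;

// Hypothetical sketch: maps a boolean-field operator to an OpenSearch-style DSL string.
final class BooleanTermSketch {
  enum BoolOp { IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE }

  // Exact-match term query on a boolean field.
  static String term(String field, boolean value) {
    return String.format(Locale.ROOT, "{\"term\":{\"%s\":{\"value\":%b}}}", field, value);
  }

  // Wraps a query in bool/must_not, which also matches null and missing fields.
  static String mustNot(String inner) {
    return "{\"bool\":{\"must_not\":[" + inner + "]}}";
  }

  // Assumed mapping: positive operators become term queries, negated ones become must_not.
  static String toDsl(String field, BoolOp op) {
    switch (op) {
      case IS_TRUE:
        return term(field, true);
      case IS_FALSE:
        return term(field, false);
      case IS_NOT_TRUE:
        return mustNot(term(field, true));
      case IS_NOT_FALSE:
        return mustNot(term(field, false));
      default:
        throw new IllegalArgumentException("Unsupported operator: " + op);
    }
  }

  public static void main(String[] args) {
    for (BoolOp op : BoolOp.values()) {
      System.out.println(op + " -> " + toDsl("is_internal", op));
    }
  }
}
```

Keeping the negated forms as must_not wrappers, instead of flipping the term value, is what preserves rows whose field is null or missing.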
Estimated code review effort: 4 (Complex), ~45 minutes
  Content-Type: 'application/json'
ppl:
  body:
    query: source=test-boolean | where is_internal=true | fields name
Use the failing query from #5054 here: source=test url=http | where is_internal=true
// Handle NOT(IS_TRUE(boolean_field)) - convert to term query with false value
// This covers cases where IS_TRUE was explicitly applied
if (expr instanceof SimpleQueryExpression simpleExpr && simpleExpr.isBooleanFieldIsTrue()) {
  return QueryExpression.create(simpleExpr.rel).isFalse();
}
- (NOT boolean_field = true) returns rows where the field is anything other than true, including null and missing fields,
- but boolean_field = false returns only rows where the field has the value false.
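The difference the reviewer points out can be checked with a small truth table. The sketch below is an illustration only: `ThreeValuedSketch` and its methods are hypothetical names, and a Java `null` stands in for a null or missing field value.

```java
// Hypothetical sketch: three-valued boolean tests, with null modelling a null or missing field.
final class ThreeValuedSketch {
  static boolean isTrue(Boolean v) {
    return Boolean.TRUE.equals(v);
  }

  static boolean isFalse(Boolean v) {
    return Boolean.FALSE.equals(v);
  }

  // NOT(IS_TRUE(v)): holds for false and for null.
  static boolean isNotTrue(Boolean v) {
    return !isTrue(v);
  }

  // NOT(IS_FALSE(v)): holds for true and for null.
  static boolean isNotFalse(Boolean v) {
    return !isFalse(v);
  }

  public static void main(String[] args) {
    Boolean[] values = {Boolean.TRUE, Boolean.FALSE, null};
    for (Boolean v : values) {
      System.out.println(
          v + ": isFalse=" + isFalse(v) + ", isNotTrue=" + isNotTrue(v));
    }
  }
}
```

For a null input, `isNotTrue` holds while `isFalse` does not, so rewriting NOT(IS_TRUE(field)) as a plain false-valued term query would drop rows.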
// generate a term query with value true.
// When called on an already-evaluated predicate (builder already set),
// return as-is.
if (builder == null) {
Is it possible to override the isTrue and not APIs on NamedFieldExpression instead of changing SimpleQueryExpression?
In `@integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml`:
- Around line 1-15: The test uses an index named "test" and currently doesn't
clean it up; update the YAML to ensure index isolation by adding explicit delete
steps for the "test" index in both the setup and teardown blocks (or replace
"test" with a generated unique name), e.g., add a do: delete index action before
the test runs and another delete after the test completes so the index cannot
leak state or conflict with other tests; reference the existing setup/teardown
blocks and the index name "test" when making these changes.
if (operand instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
  return booleanOp.apply(QueryExpression.create(namedField));
}
// IS_TRUE on a predicate (already evaluated QueryExpression) is allowed
Will other operations such as IS_FALSE, IS_NOT_TRUE, and IS_NOT_FALSE also apply to QueryExpression?
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-626: The boolean postfix handling currently calls
QueryExpression.isFalse()/isNotFalse()/isNotTrue() which overwrite any existing
builder; change the branch that handles operand instanceof QueryExpression qe to
avoid calling those mutators and instead preserve/wrap the existing builder: for
IS_TRUE/IS_NOT_FALSE return qe as-is, and for IS_FALSE/IS_NOT_TRUE return a
negated form of qe (implement QueryExpression.negate() or wrap qe.getBuilder()
into a BoolQuery with mustNot) so predicates like (age > 30) IS FALSE are
expressed by negating the existing predicate builder rather than replacing it
with a term query; keep the existing boolean-field handling
(NamedFieldExpression) unchanged.
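The suggestion above can be sketched as follows. This is a simplified illustration under assumptions: `NegatePredicateSketch`, `Kind`, and `applyToPredicate` are hypothetical names, the already-built predicate is represented as a DSL string instead of a query builder, and the null handling of the wrapped predicate is not modelled.

```java
// Hypothetical sketch: apply a boolean postfix operator to an already-built predicate
// by keeping or negating it, instead of replacing it with a term query.
final class NegatePredicateSketch {
  enum Kind { IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE }

  static String applyToPredicate(String existingDsl, Kind kind) {
    switch (kind) {
      case IS_TRUE:
      case IS_NOT_FALSE:
        // The existing predicate already expresses the condition; return it unchanged.
        return existingDsl;
      case IS_FALSE:
      case IS_NOT_TRUE:
        // Negate the existing predicate by wrapping it in bool/must_not.
        return "{\"bool\":{\"must_not\":[" + existingDsl + "]}}";
      default:
        throw new IllegalArgumentException("Unsupported kind: " + kind);
    }
  }

  public static void main(String[] args) {
    String ageOver30 = "{\"range\":{\"age\":{\"gt\":30}}}";
    System.out.println(applyToPredicate(ageOver30, Kind.IS_TRUE));
    System.out.println(applyToPredicate(ageOver30, Kind.IS_FALSE));
  }
}
```

With this shape, (age > 30) IS FALSE keeps the range query and negates it, so the original predicate is never overwritten.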
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-625: The boolean operator handling in PredicateAnalyzer
currently calls booleanOp.apply(qe) for any QueryExpression operand, but
CompoundQueryExpression does not implement isTrue/isNotTrue and thus calling
those methods throws; update the branch that handles operand instanceof
QueryExpression to detect CompoundQueryExpression (or other predicate
QueryExpression subclasses) and directly route IS_TRUE and IS_NOT_TRUE to the
predicate-handling path instead of invoking qe.isTrue/ qe.isNotTrue;
specifically, inside the if (operand instanceof QueryExpression qe) block check
if qe is a CompoundQueryExpression (or predicate-type) and for call.getKind() ==
IS_TRUE / IS_NOT_TRUE return the appropriate predicate query (the same output
produced for NamedFieldExpression boolean predicates) or otherwise fall back to
booleanOp.apply(qe) for supported QueryExpression implementations.
🧹 Nitpick comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)
229-235: Consider recording analyzed nodes for the top-level boolean-field shortcut.
This keeps analyzedNodes consistent with the tryAnalyzeOperand path and improves downstream partial-pushdown bookkeeping.
🔧 Suggested tweak
- if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
-   return QueryExpression.create(namedField).isTrue();
- }
+ if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
+   QueryExpression qe = QueryExpression.create(namedField).isTrue();
+   qe.updateAnalyzedNodes(expression);
+   return qe;
+ }
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java`:
- Around line 2651-2681: Add a new integration test method alongside
testFilterBooleanFieldOnlyTrue/OnlyFalse/OnlyNotTrue named e.g.,
testFilterBooleanFieldIsNull that calls enabledOnlyWhenPushdownIsEnabled(),
builds a query against TEST_INDEX_BANK_WITH_NULL_VALUES using the project's
NULL-check syntax (e.g., "source=%s | where isnull(male) | fields firstname"),
calls explainQueryYaml(query), loads the expected plan file (e.g.,
"explain_filter_boolean_only_null.yaml") and asserts with
assertYamlEqualsIgnoreId(expected, result) to validate null/missing boolean
handling in the boolean pushdown path.
@Test
public void testFilterBooleanFieldOnlyTrue() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single boolean filter without query_string
  String query =
      StringUtils.format("source=%s | where male = true | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_true.yaml");
  assertYamlEqualsIgnoreId(expected, result);
}

@Test
public void testFilterBooleanFieldOnlyFalse() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single boolean filter with false value without query_string
  String query =
      StringUtils.format("source=%s | where male = false | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_false.yaml");
  assertYamlEqualsIgnoreId(expected, result);
}

@Test
public void testFilterBooleanFieldOnlyNotTrue() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single NOT boolean filter without query_string
  String query =
      StringUtils.format("source=%s | where NOT male = true | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_not_true.yaml");
  assertYamlEqualsIgnoreId(expected, result);
Add an explicit NULL/missing boolean-field explain case.
Current cases cover true/false/not-true but don’t explicitly exercise a NULL/missing boolean input scenario. Please add an explain test (e.g., using TEST_INDEX_BANK_WITH_NULL_VALUES with the project’s preferred NULL syntax like isnull(male)) to validate null handling for the new boolean pushdown behavior.
As per coding guidelines, **/+(*Test|*IT).java: NULL input tests must be included for all new functions.
…o field (#5071)
* Fix the bug when boolean comparison condition is simplified to field
* Update tests and cover more cases
* Correct the logic of not boolean comparison
* Add missing IS_FALSE RexNode translation
* Remove unnecessary boolean expression conversion
* Fix spotless check
* Refactor PredicateAnalyzer logic a bit
* Add more strict not expression match for field
* Fix spotless check and flaky test
* Cover more cases for IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE
* Complement the truth tests for expressions
* Fix logic
* Fix spotless check
* Add additional boolean filter only pushdown explain test cases
Signed-off-by: Songkan Tang <songkant@amazon.com>
(cherry picked from commit 624f5e6)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Fix the bug discovered in #5054. See root cause description in #5054 (comment)
Related Issues
Resolves #5054
Check List
--signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.