[BugFix] Fix the bug when boolean comparison condition is simplifed to field#5071

Merged
qianheng-aws merged 17 commits into opensearch-project:main from songkant-aws:boolean-comparison-expression-fix on Feb 6, 2026

Conversation

songkant-aws (Contributor) commented Jan 26, 2026

Description

Fix the bug discovered in #5054. See root cause description in #5054 (comment)

Related Issues

Resolves #5054

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Songkan Tang <songkant@amazon.com>

coderabbitai bot (Contributor) commented Jan 26, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Boolean comparisons (TRUE/FALSE, NOT, !=) now generate correct term/mustNot filters for pushdown, preserving null/missing semantics and improving aggregation filter behavior.
  • Tests

    • Added Calcite-focused explain tests and expected-plan fixtures covering query_string, true/false, NOT and != pushdown scenarios.
    • Extended YAML integration tests to enable the Calcite plugin during setup/teardown and validate boolean pushdown cases with concise length-based assertions.

Walkthrough

Convert boolean field predicates earlier in Calcite traversal and predicate analysis to emit exact boolean term or negated-term queries; add unit and integration tests, a REST YAML test, and expected explain-plan YAMLs covering boolean pushdown cases.
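
As a rough illustration of the "exact boolean term or negated-term queries" mentioned above, the sketch below maps the four truth tests to OpenSearch query DSL strings. The class `BooleanFilterDsl`, its enum, and the exact JSON layout are hypothetical and chosen only for illustration; the PR builds these queries through the plugin's own `QueryExpression` helpers, whose output may differ in detail.

```java
/** Hypothetical sketch, not the PR's code: boolean truth tests rendered as query DSL JSON. */
final class BooleanFilterDsl {
  enum TruthTest { IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE }

  // A term query matches only documents whose field holds exactly this value.
  static String term(String field, boolean value) {
    return "{\"term\":{\"" + field + "\":{\"value\":" + value + "}}}";
  }

  // must_not matches every document the inner query does not match,
  // which includes documents where the field is null or missing.
  static String mustNot(String inner) {
    return "{\"bool\":{\"must_not\":[" + inner + "]}}";
  }

  static String toDsl(String field, TruthTest test) {
    switch (test) {
      case IS_TRUE:
        return term(field, true);
      case IS_FALSE:
        return term(field, false);
      case IS_NOT_TRUE:
        return mustNot(term(field, true));
      case IS_NOT_FALSE:
        return mustNot(term(field, false));
      default:
        throw new IllegalArgumentException("unsupported test: " + test);
    }
  }

  public static void main(String[] args) {
    for (TruthTest t : TruthTest.values()) {
      System.out.println(t + " -> " + toDsl("is_internal", t));
    }
  }
}
```

The point of keeping IS_FALSE and IS_NOT_TRUE separate is that only the negated form also matches documents with no value for the field.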

Changes

Cohort / File(s), with summary of changes:

  • Predicate analysis & boolean helpers
    File(s): opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
    Summary: Detect boolean-typed NamedFieldExpression; short-circuit boolean operands into boolean-specific QueryExpression methods (isTrue, isFalse, isNotTrue, isNotFalse); extend postfix support (IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE).

  • Calcite boolean rewrites
    File(s): core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java
    Summary: Add private helpers to detect boolean field comparisons and rewrite NOT / != patterns into IS_NOT_TRUE/IS_NOT_FALSE forms during Rex traversal.

  • Opensearch unit & aggregate tests
    File(s): opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java, opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java
    Summary: Add boolean field to test schema and assert IS_TRUE produces TermQuery and combined conditions produce BoolQuery; update aggregate test expectation to use term filter for boolean true.

  • Calcite explain integration tests
    File(s): integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
    Summary: Add nine explain tests covering boolean pushdown cases (query_string interactions, TRUE/'TRUE', false, NOT/!= variants) against expected YAML plans.

  • Calcite expected explain plans
    File(s): integ-test/src/test/resources/expectedOutput/calcite/... (explain_filter_query_string_with_boolean.yaml, explain_filter_query_string_with_boolean_false.yaml, explain_filter_query_string_with_boolean_not_true.yaml, explain_filter_boolean_only_true.yaml, explain_filter_boolean_only_false.yaml, explain_filter_boolean_only_not_true.yaml)
    Summary: Add expected logical and physical explain plan YAMLs showing pushed-down boolean term filters (must / must_not) combined with query_string in PushDownContext/OpenSearchRequestBuilder.

  • YAML REST integration tests
    File(s): integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml
    Summary: Add REST test that creates an index with a boolean field, bulk-inserts docs, toggles Calcite plugin in hooks, and asserts counts for is_internal=true/false and NOT variants.

  • Test expectation tweaks
    File(s): integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4866.yml
    Summary: Simplify expectations to length-based assertions for datarows and adjust schema/total matches.
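
The "Calcite boolean rewrites" row can be pictured with a small stand-alone sketch. Everything here is hypothetical: the PR works on Calcite RexNode objects inside CalciteRexNodeVisitor, while this example uses a toy expression tree (`Field`, `Literal`, `Not`, and so on) and assumes the mapping implied by the summary, namely that `field != true` and `NOT (field = true)` become IS_NOT_TRUE(field), and the `false` variants become IS_NOT_FALSE(field).

```java
/** Hypothetical mini expression tree; the real code rewrites Calcite RexNode instances. */
final class BooleanRewriteSketch {
  interface Expr {}
  record Field(String name) implements Expr {}
  record Literal(boolean value) implements Expr {}
  record Equals(Expr left, Expr right) implements Expr {}
  record NotEquals(Expr left, Expr right) implements Expr {}
  record Not(Expr operand) implements Expr {}
  record IsNotTrue(Expr operand) implements Expr {}
  record IsNotFalse(Expr operand) implements Expr {}

  // Negating a comparison against `true` yields IS_NOT_TRUE; against `false`, IS_NOT_FALSE.
  static Expr negatedTruthTest(Expr field, boolean literal) {
    return literal ? new IsNotTrue(field) : new IsNotFalse(field);
  }

  static Expr rewrite(Expr e) {
    // field != literal
    if (e instanceof NotEquals ne
        && ne.left() instanceof Field f
        && ne.right() instanceof Literal l) {
      return negatedTruthTest(f, l.value());
    }
    // NOT (field = literal)
    if (e instanceof Not n
        && n.operand() instanceof Equals eq
        && eq.left() instanceof Field f
        && eq.right() instanceof Literal l) {
      return negatedTruthTest(f, l.value());
    }
    // Anything else is left untouched.
    return e;
  }

  public static void main(String[] args) {
    Expr male = new Field("male");
    System.out.println(rewrite(new Not(new Equals(male, new Literal(true)))));
    System.out.println(rewrite(new NotEquals(male, new Literal(false))));
  }
}
```

Only the exact shapes "field compared with a boolean literal" are rewritten in this sketch; other expressions pass through unchanged, mirroring the "when applicable" wording in the sequence diagram.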

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Planner as CalcitePlanner
    participant Rex as CalciteRexNodeVisitor
    participant Analyzer as PredicateAnalyzer
    participant QExpr as QueryExpression
    participant DSL as OpenSearch DSL

    Client->>Planner: submit SQL with boolean predicate
    Planner->>Rex: translate Rex nodes (compare / NOT)
    Rex->>Planner: rewrite != / NOT -> IS_NOT_* when applicable
    Planner->>Analyzer: analyzeExpression(filter)
    Analyzer->>Analyzer: detect NamedFieldExpression.isBooleanType()
    Analyzer->>QExpr: convert boolean field -> isTrue()/isFalse()/isNotTrue()/isNotFalse()
    QExpr->>DSL: emit TermQuery or must_not TermQuery (combined with query_string)
    DSL-->>Planner: return pushed-down DSL
    Planner-->>Client: explain/execute with pushed-down boolean term

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

testing

Suggested reviewers

  • penghuo
  • ps48
  • yuancu
  • kavithacm
  • derek-ho
  • joshuali925
  • anirudha

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 16.28%, which is insufficient. The required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title directly addresses the main bug fix for boolean field comparison simplification, which is the core focus of all code changes across multiple files in this PR.
  • Description check (✅ Passed): The description clearly relates to the changeset by referencing issue #5054, explaining the bug fix purpose, and documenting testing completion and DCO compliance.


Signed-off-by: Songkan Tang <songkant@amazon.com>
penghuo added the bugFix and PPL (Piped processing language) labels Jan 26, 2026
    Content-Type: 'application/json'
  ppl:
    body:
      query: source=test-boolean | where is_internal=true | fields name
Collaborator

Use the failing query from #5054 here: source=test url=http | where is_internal=true

Comment on lines 582 to 586
// Handle NOT(IS_TRUE(boolean_field)) - convert to term query with false value
// This covers cases where IS_TRUE was explicitly applied
if (expr instanceof SimpleQueryExpression simpleExpr && simpleExpr.isBooleanFieldIsTrue()) {
  return QueryExpression.create(simpleExpr.rel).isFalse();
}
Collaborator

  • (NOT boolean_field = true) returns documents whose field is false, null, or missing,
  • but boolean_field = false returns only documents whose field has the value false.
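
The distinction raised in this comment can be checked with a small truth table. The sketch below is illustrative only: `BooleanTruthTests` is a hypothetical helper, and it assumes a Java `null` stands in for both an explicit null and a missing field.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of the PR: truth tests over a nullable boolean. */
final class BooleanTruthTests {
  // Assumption: null models both an explicit null and a missing field.
  static boolean isTrue(Boolean v) {
    return Boolean.TRUE.equals(v);
  }

  static boolean isFalse(Boolean v) {
    return Boolean.FALSE.equals(v);
  }

  // IS NOT TRUE accepts false and null; IS FALSE accepts only false.
  static boolean isNotTrue(Boolean v) {
    return !isTrue(v);
  }

  static boolean isNotFalse(Boolean v) {
    return !isFalse(v);
  }

  public static void main(String[] args) {
    List<Boolean> values = Arrays.asList(true, false, null);
    for (Boolean v : values) {
      System.out.printf(
          "%-5s IS_TRUE=%-5s IS_FALSE=%-5s IS_NOT_TRUE=%-5s IS_NOT_FALSE=%s%n",
          v, isTrue(v), isFalse(v), isNotTrue(v), isNotFalse(v));
    }
  }
}
```

For a null input, IS_FALSE is false while IS_NOT_TRUE is true, which is why rewriting NOT(IS_TRUE(field)) to a plain term query on false would drop the null and missing documents.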

// generate a term query with value true.
// When called on an already-evaluated predicate (builder already set),
// return as-is.
if (builder == null) {
Collaborator

Is it possible to override isTrue and not API for NamedFieldExpression instead of changing SimpleQueryExpression?

Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml`:
- Around line 1-15: The test uses an index named "test" and currently doesn't
clean it up; update the YAML to ensure index isolation by adding explicit delete
steps for the "test" index in both the setup and teardown blocks (or replace
"test" with a generated unique name), e.g., add a do: delete index action before
the test runs and another delete after the test completes so the index cannot
leak state or conflict with other tests; reference the existing setup/teardown
blocks and the index name "test" when making these changes.

Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
if (operand instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
  return booleanOp.apply(QueryExpression.create(namedField));
}
// IS_TRUE on a predicate (already evaluated QueryExpression) is allowed
Collaborator

Will the other operations, such as IS_FALSE, IS_NOT_TRUE, and IS_NOT_FALSE, also apply to QueryExpression?

Signed-off-by: Songkan Tang <songkant@amazon.com>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-626: The boolean postfix handling currently calls
QueryExpression.isFalse()/isNotFalse()/isNotTrue() which overwrite any existing
builder; change the branch that handles operand instanceof QueryExpression qe to
avoid calling those mutators and instead preserve/wrap the existing builder: for
IS_TRUE/IS_NOT_FALSE return qe as-is, and for IS_FALSE/IS_NOT_TRUE return a
negated form of qe (implement QueryExpression.negate() or wrap qe.getBuilder()
into a BoolQuery with mustNot) so predicates like (age > 30) IS FALSE are
expressed by negating the existing predicate builder rather than replacing it
with a term query; keep the existing boolean-field handling
(NamedFieldExpression) unchanged.
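
The suggestion above can be sketched as follows. This is a hypothetical stand-alone example, not the merged implementation: `PredicatePostfixSketch` and its string-based queries are stand-ins for QueryExpression and its builder, and it assumes, as the comment does, that an already-evaluated predicate is treated as two-valued (a document either matches it or does not).

```java
/** Hypothetical sketch of the review suggestion; strings stand in for query builders. */
final class PredicatePostfixSketch {
  enum Postfix { IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE }

  // Wraps the existing predicate instead of replacing it with a term query.
  static String apply(Postfix op, String predicateDsl) {
    switch (op) {
      case IS_TRUE:
      case IS_NOT_FALSE:
        // The predicate already expresses "matches": return it as-is.
        return predicateDsl;
      case IS_FALSE:
      case IS_NOT_TRUE:
        // Negate by wrapping the existing predicate in bool/must_not.
        return "{\"bool\":{\"must_not\":[" + predicateDsl + "]}}";
      default:
        throw new IllegalArgumentException("unsupported postfix: " + op);
    }
  }

  public static void main(String[] args) {
    String ageOver30 = "{\"range\":{\"age\":{\"gt\":30}}}";
    // (age > 30) IS FALSE keeps the range query and negates it.
    System.out.println(apply(Postfix.IS_FALSE, ageOver30));
  }
}
```

The design point is that the range predicate survives inside the negation, so `(age > 30) IS FALSE` is never silently turned into a boolean term query.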

Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-625: The boolean operator handling in PredicateAnalyzer
currently calls booleanOp.apply(qe) for any QueryExpression operand, but
CompoundQueryExpression does not implement isTrue/isNotTrue and thus calling
those methods throws; update the branch that handles operand instanceof
QueryExpression to detect CompoundQueryExpression (or other predicate
QueryExpression subclasses) and directly route IS_TRUE and IS_NOT_TRUE to the
predicate-handling path instead of invoking qe.isTrue/ qe.isNotTrue;
specifically, inside the if (operand instanceof QueryExpression qe) block check
if qe is a CompoundQueryExpression (or predicate-type) and for call.getKind() ==
IS_TRUE / IS_NOT_TRUE return the appropriate predicate query (the same output
produced for NamedFieldExpression boolean predicates) or otherwise fall back to
booleanOp.apply(qe) for supported QueryExpression implementations.
🧹 Nitpick comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)

229-235: Consider recording analyzed nodes for the top-level boolean-field shortcut.

This keeps analyzedNodes consistent with the tryAnalyzeOperand path and improves downstream partial-pushdown bookkeeping.

🔧 Suggested tweak
-      if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
-        return QueryExpression.create(namedField).isTrue();
-      }
+      if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
+        QueryExpression qe = QueryExpression.create(namedField).isTrue();
+        qe.updateAnalyzedNodes(expression);
+        return qe;
+      }

penghuo previously approved these changes Feb 4, 2026
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java`:
- Around line 2651-2681: Add a new integration test method alongside
testFilterBooleanFieldOnlyTrue/OnlyFalse/OnlyNotTrue named e.g.,
testFilterBooleanFieldIsNull that calls enabledOnlyWhenPushdownIsEnabled(),
builds a query against TEST_INDEX_BANK_WITH_NULL_VALUES using the project's
NULL-check syntax (e.g., "source=%s | where isnull(male) | fields firstname"),
calls explainQueryYaml(query), loads the expected plan file (e.g.,
"explain_filter_boolean_only_null.yaml") and asserts with
assertYamlEqualsIgnoreId(expected, result) to validate null/missing boolean
handling in the boolean pushdown path.

Comment on lines +2651 to +2681
@Test
public void testFilterBooleanFieldOnlyTrue() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single boolean filter without query_string
  String query =
      StringUtils.format("source=%s | where male = true | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_true.yaml");
  assertYamlEqualsIgnoreId(expected, result);
}

@Test
public void testFilterBooleanFieldOnlyFalse() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single boolean filter with false value without query_string
  String query =
      StringUtils.format("source=%s | where male = false | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_false.yaml");
  assertYamlEqualsIgnoreId(expected, result);
}

@Test
public void testFilterBooleanFieldOnlyNotTrue() throws IOException {
  enabledOnlyWhenPushdownIsEnabled();
  // Test single NOT boolean filter without query_string
  String query =
      StringUtils.format("source=%s | where NOT male = true | fields firstname", TEST_INDEX_BANK);
  var result = explainQueryYaml(query);
  String expected = loadExpectedPlan("explain_filter_boolean_only_not_true.yaml");
  assertYamlEqualsIgnoreId(expected, result);
}
Contributor

⚠️ Potential issue | 🟡 Minor

Add an explicit NULL/missing boolean-field explain case.
Current cases cover true/false/not-true but don’t explicitly exercise a NULL/missing boolean input scenario. Please add an explain test (e.g., using TEST_INDEX_BANK_WITH_NULL_VALUES with the project’s preferred NULL syntax like isnull(male)) to validate null handling for the new boolean pushdown behavior.

As per coding guidelines, **/+(*Test|*IT).java: NULL input tests must be included for all new functions.


@qianheng-aws qianheng-aws merged commit 624f5e6 into opensearch-project:main Feb 6, 2026
37 of 38 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Feb 6, 2026
…o field (#5071)

* Fix the bug when boolean comparison condition is simplifed to field

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Update tests and cover more cases

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Correct the logic of not boolean comparison

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Add missing IS_FALSE RexNode translation

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Remove unnecessary boolean expression conversion

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix spotless check

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Refactor PredicateAnalyzer logic a bit

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Add more strict not expression match for field

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix spotless check and flaky test

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Cover more cases for IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Complement the truth tests for expressions

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix logic

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix spotless check

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Add additional boolean filter only pushdown explain test cases

Signed-off-by: Songkan Tang <songkant@amazon.com>

---------

Signed-off-by: Songkan Tang <songkant@amazon.com>
(cherry picked from commit 624f5e6)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Labels

backport 2.19-dev, bugFix, PPL (Piped processing language)

Development

Successfully merging this pull request may close these issues.

[BUG] PPL where command does not work as expected.

3 participants