[Coral-Incremental] Incremental plan generation for RelNode incremental rewrite #520

Open
wants to merge 24 commits into
base: master


@yyy1000 yyy1000 commented Jul 29, 2024

What changes are proposed in this pull request, and why are they necessary?

Introduction: This PR introduces incremental plan generation for Coral-Incremental. Given a RelNode (a logical plan), the feature generates multiple incremental plans that differ in how many sub-queries are computed incrementally. The number of generated plans matches the number of sub-queries in the SQL query. Each logical plan within a generated plan updates the materialized view with its results.

For example, in a multi-table join, we can choose between batch and incremental execution stages. Consider the following three-table join:

               LogicalProject#8
                    |
               LogicalJoin#7
               /        \
       LogicalProject#4   TableScan#5
               |
        LogicalJoin#3
             /   \
   TableScan#0  TableScan#1

We could generate plans as follows:

Incremental: Both joins are executed incrementally.
Part-Batch, Part-Incremental: The first join is executed incrementally, and the second join is executed in batch mode.
Batch: Both joins are executed in batch mode.

Each generated plan is represented as a List<RelNode>, where each RelNode indicates whether its corresponding sub-query is executed incrementally or in batch mode.
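
As a rough illustration of the three options listed above, the sketch below enumerates execution-mode assignments for a chain of joins. It assumes that plans differ only in a split point (the first k joins run incrementally, the rest in batch mode), which is inferred from the example and is not confirmed as the PR's algorithm. `PlanModeSketch`, `ExecMode`, and `enumerate` are hypothetical names and are not Coral or Calcite APIs.

```java
import java.util.ArrayList;
import java.util.List;

class PlanModeSketch {
  // Hypothetical stand-in for how one sub-query is executed; not a Coral type.
  enum ExecMode { INCREMENTAL, BATCH }

  // Assumption: joins are ordered bottom-up, and each plan picks a split point k
  // so that the first k joins are incremental and the remaining joins are batch.
  static List<List<ExecMode>> enumerate(int joinCount) {
    List<List<ExecMode>> plans = new ArrayList<>();
    for (int k = joinCount; k >= 0; k--) {
      List<ExecMode> plan = new ArrayList<>();
      for (int i = 0; i < joinCount; i++) {
        plan.add(i < k ? ExecMode.INCREMENTAL : ExecMode.BATCH);
      }
      plans.add(plan);
    }
    return plans;
  }

  public static void main(String[] args) {
    // Two joins, as in the three-table example above.
    for (List<ExecMode> plan : enumerate(2)) {
      System.out.println(plan);
    }
    // prints:
    // [INCREMENTAL, INCREMENTAL]
    // [INCREMENTAL, BATCH]
    // [BATCH, BATCH]
  }
}
```

In the actual PR each entry would be a RelNode rather than an enum value; the enum only makes the incremental-versus-batch choice per join visible.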

Some important changes:

  1. A new RelNodeGenerationTransformer that rewrites incremental-format RelNodes by materializing the join sub-queries and combines them to generate different complete plans.
  2. Added two helper classes within RelNodeGenerationTransformer:
    findJoinNeedsProject: Converts all RelNodes into a uniform format for future incremental plan generation.
    uniformFormat: Ensures consistent formatting of RelNodes.
  3. A convertRelPrev method that converts the tables of a RelNode into their '_prev' form, distinguishing them from the original tables.
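
To make the '_prev' conversion in item 3 more concrete, here is a minimal sketch on a toy plan tree. It is not the PR's convertRelPrev implementation: the `Node` class, the helper names, and the table names `t0` and `t1` are hypothetical, and appending `_prev` as a suffix to each scanned table name is an assumption about what the '_prev' format looks like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrevRenameSketch {
  // Toy plan node (hypothetical, not a Calcite RelNode):
  // a scan carries a table name, any other operator carries inputs.
  static final class Node {
    final String op;
    final String table; // non-null only for table scans
    final List<Node> inputs;

    Node(String op, String table, List<Node> inputs) {
      this.op = op;
      this.table = table;
      this.inputs = inputs;
    }
  }

  static Node scan(String table) {
    return new Node("TableScan", table, new ArrayList<>());
  }

  static Node op(String name, Node... inputs) {
    return new Node(name, null, Arrays.asList(inputs));
  }

  // Returns a copy of the tree in which every scanned table is renamed to
  // "<table>_prev" (assumed suffix form); the original tree is left untouched.
  static Node toPrev(Node node) {
    if (node.table != null) {
      return scan(node.table + "_prev");
    }
    List<Node> rewritten = new ArrayList<>();
    for (Node input : node.inputs) {
      rewritten.add(toPrev(input));
    }
    return new Node(node.op, null, rewritten);
  }

  // Renders the tree as nested text, e.g. Join(t0, t1).
  static String render(Node node) {
    if (node.table != null) {
      return node.table;
    }
    List<String> parts = new ArrayList<>();
    for (Node input : node.inputs) {
      parts.add(render(input));
    }
    return node.op + "(" + String.join(", ", parts) + ")";
  }

  public static void main(String[] args) {
    Node plan = op("Project", op("Join", scan("t0"), scan("t1")));
    System.out.println(render(plan));         // Project(Join(t0, t1))
    System.out.println(render(toPrev(plan))); // Project(Join(t0_prev, t1_prev))
  }
}
```

Returning a rewritten copy rather than mutating in place keeps the original and the '_prev' versions of a sub-query available side by side, which is what distinguishing the two requires.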

How was this patch tested?

Unit test

@yyy1000 yyy1000 marked this pull request as draft July 29, 2024 21:11
@yyy1000 yyy1000 changed the title from "Plan generation" to "[Coral-Incremental] Incremental plan generation for RelNode incremental rewrite" Jul 30, 2024
@yyy1000 yyy1000 marked this pull request as ready for review July 30, 2024 21:09
Comment on lines 313 to 314
* - For other RelNodes: recursively processing their inputs to ensure uniformity.
* <p>
Contributor

Not clear what this achieves.

Author

Does the example below show what it's doing?

Contributor

I am referring to "to ensure uniformity". The purpose of the whole method is to ensure uniformity, so attributing it only to "other RelNodes" does not make sense.

@wmoustafa
Contributor

Overall, I think the PR could benefit from more extensive documentation. The current documentation on the individual methods is a good start. Each method's documentation should be expanded with a concrete running example that uses actual table names in the plan and shows the input and output. The same running example could then illustrate each method's responsibility throughout the transformation process. Adding table names to your current example would already make the documentation much clearer and more concrete.

3 participants