"rowtime Attributes Must Not Be In The Input Rows Of A Regular Join" Despite Using Interval Join, But Only With Event Timestamp
Solution 1:
Joins between two regular tables in SQL are always expressed in the same way, using FROM a, b or a JOIN b.
However, Flink provides two types of join operators under the hood for the same syntax. One is the interval join, which requires time attributes to relate both tables to each other based on time. The other is the regular SQL join, which is implemented in a generic way as you know it from databases.
Interval joins are just a streaming optimization to keep the state size low during runtime and produce no updates in the result. The regular SQL join operator can produce the same result as an interval join in the end, but with higher maintenance costs.
In order to distinguish between interval join and regular join, the optimizer searches for a predicate in the WHERE clause that works on time attributes. For the interval join, the output can always contain two rowtime attributes for outer temporal operations (downstream temporal operators), because both rowtime attributes are still aligned with the underlying watermarking system. This means that, for example, an outer window or another interval join could work with the time attribute again.
However, the implementation of interval joins has some shortcomings that are known and covered in FLINK-10211. Because of this design flaw, we cannot distinguish between an interval join and a regular join at certain locations. Thus, we need to assume that the regular join could be an interval join, and we cannot cast the time attribute to TIMESTAMP for users automatically. Instead, we currently forbid time attributes in the output for regular joins.
At some point this limitation will hopefully be gone. Until then, a user has two possibilities:
1. Don't use a regular join on tables that contain a time attribute. You can also just project it away with a nested SELECT clause or do a CAST before joining.
2. Cast the time attribute to a regular timestamp using CAST(col AS TIMESTAMP) in the SELECT clause. It will be pushed down into the join operation.
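The two possibilities above can be sketched in Flink SQL as follows. This is only an illustration: the tables Orders and Shipments and all column names are hypothetical, and order_time and ship_time are assumed to be declared as rowtime attributes. Exact planner behavior may differ between Flink versions.

```sql
-- Hypothetical tables (not from the original question):
--   Orders(order_id, order_time)    order_time assumed to be a rowtime attribute
--   Shipments(order_id, ship_time)  ship_time assumed to be a rowtime attribute

-- Possibility 1: cast the time attributes in nested SELECTs before joining,
-- so neither input of the regular join carries a time attribute.
SELECT o.order_id, o.order_ts, s.ship_ts
FROM (
  SELECT order_id, CAST(order_time AS TIMESTAMP) AS order_ts
  FROM Orders
) o
JOIN (
  SELECT order_id, CAST(ship_time AS TIMESTAMP) AS ship_ts
  FROM Shipments
) s
ON o.order_id = s.order_id;

-- Possibility 2: cast the time attributes in the outer SELECT clause.
-- According to the answer, the cast is pushed down into the join operation.
SELECT
  o.order_id,
  CAST(o.order_time AS TIMESTAMP) AS order_ts,
  CAST(s.ship_time AS TIMESTAMP) AS ship_ts
FROM Orders o
JOIN Shipments s
ON o.order_id = s.order_id;
```

In both variants the resulting timestamp columns are ordinary TIMESTAMP values, so they can no longer be used as time attributes by downstream windows or interval joins.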
Your exception indicates that you are using a regular join. Interval joins need a range to operate (even if it is only 1 ms). They don't support equality.
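For comparison, a join that the optimizer should recognize as an interval join bounds one time attribute by a range around the other. The following is a minimal sketch with hypothetical table and column names (Orders, Shipments, order_time, ship_time), assuming both time columns are rowtime attributes. The four-hour bound is an arbitrary choice for illustration.

```sql
-- Hypothetical tables; order_time and ship_time are assumed rowtime attributes.
-- The BETWEEN predicate gives the join a time range to operate on,
-- instead of a plain equality on the two timestamps.
SELECT o.order_id, o.order_time, s.ship_time
FROM Orders o, Shipments s
WHERE o.order_id = s.order_id
  AND s.ship_time BETWEEN o.order_time AND o.order_time + INTERVAL '4' HOUR;
```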