Bug #45640: optimizer bug produces wrong results
Grouping by a subquery in a query with a distinct aggregate function lead to a wrong result (wrong and unordered grouping values). There are two related problems: 1) The query like this: SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c FROM t1 GROUP BY aa returned wrong result, because the outer reference "t1.a" in the subquery was substituted with the Item_ref item. The Item_ref item obtains data from the result_field object that refreshes once after the end of each group. This data is not applicable to filesort since filesort() doesn't care about groups (and doesn't update result_field objects with copy_fields() and so on). Also that data is not applicable to group separation algorithm: end_send_group() checks every record with test_if_group_changed() that evaluates Item_ref items, but it refreshes those Item_ref-s only after the end of group, that is a vicious circle and the grouped column values in the output are shifted. Fix: if a) we grouping by a subquery and b) that subquery has outer references to FROM list of the grouping query, then we substitute these outer references with Item_direct_ref like references under aggregate functions: Item_direct_ref obtains data directly from the current record. 2) The query with a non-trivial grouping expression like: SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c FROM t1 GROUP BY aa+0 also returned wrong result, since JOIN::exec() substitutes references to top-level aliases in SELECT list with Item_copy caching items. Item_copy items have same refreshing policy as Item_ref items, so the whole groping expression with Item_copy inside returns wrong result in filesort() and end_send_group(). Fix: include aliased items into GROUP BY item tree instead of Item_ref references to them.
Showing
Please register or sign in to comment