Commit ede2e0d2 authored by igor@rurik.mysql.com's avatar igor@rurik.mysql.com

item_sum.h, item.cc:

  Post review corrections in comments.
parent ad53d720
......@@ -1084,7 +1084,7 @@ void Item_name_const::print(String *str)
ref_pointer_array Pointer to array of reference fields
fields All fields in select
ref Pointer to item
skip_registered <=> the function must skipped for registered SUM items
skip_registered <=> function be must skipped for registered SUM items
NOTES
This is from split_sum_func2() for items that should be split
......
......@@ -30,7 +30,7 @@
GENERAL NOTES
A set function can be used not in any position where an expression is
A set function cannot be used in certain positions where expressions are
accepted. There are some quite explicable restrictions for the usage of
set functions.
......@@ -60,36 +60,37 @@
for each group defined in the main query, not for groups of the subquery.
The problem of finding the query where to aggregate a particular
set function is not so simple is it seems to be.
set function is not so simple as it seems to be.
In the query:
SELECT t1.a FROM t1 GROUP BY t1.a
HAVING t1.a > ALL(SELECT t2.c FROM t2 GROUP BY t2.c
HAVING SUM(t1.a) < t2.c)
the set function can be evaluated for both outer and inner selects.
If we evaluate SUM(t.a) for the outer query then we get the value of t.a
If we evaluate SUM(t1.a) for the outer query then we get the value of t1.a
multiplied by the cardinality of a group in table t1. In this case
in each correlated subquery SUM(t1.a) is used as a constant. But we also
can evaluate SUM(t.a) for the inner query. In this case t.a will be a
can evaluate SUM(t1.a) for the inner query. In this case t1.a will be a
constant for each correlated subquery and summation is performed
for each group of table t2.
(Here it makes sense to remind that the query
SELECT c FROM t GROUP BY a HAVING SUM(1) < a
is quite legal in our SQL).
So depending on to what query we assign the set function we
So depending on what query we assign the set function to we
can get different result sets.
The general rule to detect the query where a set function is to be
evaluated can be formulated as follows.
Consider a set function S(E) where E is an expression with occurrences
of column references C1, ..., CN. Resolve these references against
subqueries whose subexpression the set function S(E) is. Let Q be the
most inner subquery of those subqueries. (It should be noted here that S(E)
in no way can be evaluated in the subquery embedding the subquery Q.)
If S(E) is used in a construct of Q where set function are allowed then
of column references C1, ..., CN. Resolve these column references against
subqueries that contain the set function S(E). Let Q be the innermost
subquery of those subqueries. (It should be noted here that S(E)
in no way can be evaluated in the subquery embedding the subquery Q,
otherwise S(E) would refer to at least one unbound column reference)
If S(E) is used in a construct of Q where set functions are allowed then
we evaluate S(E) in Q.
Otherwise we look for a most inner subquery containing S(E) of those where
Otherwise we look for a innermost subquery containing S(E) of those where
usage of S(E) is allowed.
Let's demonstrate how this rule is applied to the following queries.
......@@ -99,7 +100,7 @@
HAVING t2.b > ALL(SELECT t3.c FROM t3 GROUP BY t3.c
HAVING SUM(t1.a+t2.b) < t3.c))
For this query the set function SUM(t1.a+t2.b) depends on t1.a and t2.b
with t1.a defined in the most outer query, and t2.b defined for its
with t1.a defined in the outermost query, and t2.b defined for its
subquery. The set function is in the HAVING clause of the subquery and can
be evaluated in this subquery.
......@@ -110,8 +111,8 @@
Here the set function SUM(t1.a+t2.b)is in the WHERE clause of the second
subquery - the most upper subquery where t1.a and t2.b are defined.
If we evaluate the function in this subquery we violate the context rules.
So we evaluate the function in the third subquery where it is used under
the HAVING clause.
So we evaluate the function in the third subquery (over table t3) where it
is used under the HAVING clause.
3. SELECT t1.a FROM t1 GROUP BY t1.a
HAVING t1.a > ALL(SELECT t2.b FROM t2
......@@ -123,7 +124,7 @@
Mostly set functions cannot be nested. In the query
SELECT t1.a from t1 GROUP BY t1.a HAVING AVG(SUM(t1.b)) > 20
the expression SUM(b) is not acceptable, though it is under a HAVING clause.
It is acceptable in the query:
Yet it is acceptable in the query:
SELECT t.1 FROM t1 GROUP BY t1.a HAVING SUM(t1.b) > 20.
An argument of a set function does not have to be a reference to a table
......@@ -138,7 +139,7 @@
SELECT t1.a FROM t1 GROUP BY t1.a
HAVING t1.a IN (SELECT t2.c FROM t2 GROUP BY t2.c
HAVING AVG(t2.c+SUM(t1.b)) > 20)
is still clear too. For a group of the rows with the same t1.a values we
is still clear. For a group of the rows with the same t1.a values we
calculate the value of SUM(t1.b). This value 's' is substituted in the
the subquery:
SELECT t2.c FROM t2 GROUP BY t2.c HAVING AVG(t2.c+s)
......@@ -153,11 +154,11 @@
IMPLEMENTATION NOTES
Three methods were added to the class to check the constraints specified
in the previous section. This methods utilize several new members.
in the previous section. These methods utilize several new members.
The field 'nest_level' contains the number of the level for the subquery
containing the set function. The main SELECT is of level 0, its subqueries
are of levels 1, the subqueries of the latters are of level 2 and so on.
are of levels 1, the subqueries of the latter are of level 2 and so on.
The field 'aggr_level' is to contain the nest level of the subquery
where the set function is aggregated.
......@@ -174,7 +175,7 @@
HAVING t2.b > ALL(SELECT t3.c FROM t3 GROUP BY t3.c
HAVING SUM(t1.a+t2.b) < t3.c))
the value of max_arg_level is equal to 1 since t1.a is bound in the main
query, and t2.b is bound by the first subquery whose nest level 1.
query, and t2.b is bound by the first subquery whose nest level is 1.
Obviously a set function cannot be aggregated in the subquery whose
nest level is less than max_arg_level. (Yet it can be aggregated in the
subqueries whose nest level is greater than max_arg_level.)
......@@ -192,11 +193,11 @@
for set function s0 if s1 is not calculated in any subquery
within s0.
A set function that as a subexpression in an argument of another set
function refers to the latter via the field 'in_sum_func'.
A set function that is used as a subexpression in an argument of another
set function refers to the latter via the field 'in_sum_func'.
The condition imposed on the usage of set functions are checked when
we traverse query subexpressions with the help of recursive method
we traverse query subexpressions with the help of the recursive method
fix_fields. When we apply this method to an object of the class
Item_sum, first, on the descent, we call the method init_sum_func_check
that initialize members used at checking. Then, on the ascent, we
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment