View Single Post

  #10 (permalink)  
Old 05-15-2007
Paul Lautman
 
Posts: n/a
Default Re: Q: Sorting by the sum of columns?

Thomas Gagne wrote:
> ZeldorBlat wrote:
>> On May 15, 6:57 am, Captain Paralytic <paul_laut...@yahoo.com> wrote:
>>
>>> <snip>
>>> Sorry for the delay in answering this one. My source is the manual.
>>> On the pagehttp://dev.mysql.com/doc/refman/5.0/en/select.html
>>> Where it says:
>>>
>>> "Columns selected for output can be referred to in ORDER BY and
>>> GROUP BY clauses using column names, column aliases, or column
>>> positions."
>>>
>>> And of course it defines an Order By statement as:
>>>
>>> ORDER BY {col_name | expr | position}
>>>
>>> It does not say that columns selected for output can be referred to
>>> by using an expression that happens to match the expression that
>>> created the data for that column. Thus, unless you use one of the
>>> methods to refer to the output column, you are left with an
>>> expression that will have to be evaluated.
>>>

>>
>> Ok, but nowhere does that say that the expression will or will not be
>> internally evaluated a second time. If the database is smart enough
>> to recognize when you have an expression in the group by clause that
>> is not in the select list (in which case it complains), I would
>> expect that it is smart enough to match an expression in the select
>> list to an expression in the order by clause.
>>
>>

> It's barely worth discussing. As focused as database companies are on
> performance I doubt any would re-evaluate any expression that
> lexically matches another either before or after compilation.
>
> Referring to column aliases is a syntactic feature, not a performance
> feature. It's a good thing that MySQL has column aliases. I remember
> that being one of my favorite features of Teradata's SQL.
>
> Regardless, the answer to the question is unlikely to be found in the
> manuals. To know for sure the answer would either have to come from a
> MySQL engineer or someone else familiar with MySQL's parser, compiler,
> evaluator, or looking through the code.
>
> However, as confident as Mr. Lautman is of the "facts," there are none
> supporting the /assumption/ the ORDER BY clause' re-evaluates
> duplicate expressions. Though I have no "facts" supporting my
> assumptions, I'm equally confident that considering SQL databases
> maturity, the amount of engineering given them, the papers published
> on optimization strategies, the competition for paying licensees, and
> the constant pressure to improve performance, that such an issue as
> duplicate expression evaluation is unlikely to have gone unnoticed
> for 30+ years.


The manual tells you when an output column will be referenced. If it would
be referenced in all cases then the writers would not have bothered to list
the only 2 that applied. The manual doesn't tell you everything that does
not apply. For instance it tells you that the SELECT syntax starts with
SELECT. It doesn't tell you that you cannot spell SELECT as SELETC or any
other way.

ZeldorBlat's mention of the GROUP BY clause is totally unrelated and in fact
wrong. It is indeed perfectly possible to have an expression in the GROUP BY
clause that is not in the SELECT list. And it does not "complain" as he
states. Instead it works! I just tried it.

Regardless of what you say about optimisation strategies, the developers of
databases do, as I have said before, assume a bit of intelligence in the
people who use them. They have stated quite clearly in the manual when data
in a column can be referenced. The would not bother, no matter how many
papers had been written to try to optimise a query for something like this
when they have already given and documented the 2 correct ways to do it.


Reply With Quote