This is a discussion on Q: Sorting by the sum of columns? within the MySQL Database forums, part of the Database Forums category; Hi Folks! I'm a newbie MySQL programmer. While doing my project I have faced following problem: I have five ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Hi Folks!
I'm a newbie MySQL programmer. While doing my project I have faced following problem: I have five columns in my table (indexed 'id' and four with numerical values). I would like to sum the numerical values into a kind of virtual column and sort the table accordingly to the decreasing values of the sum (of the virual column). I have searched in numerous tutorials and FAQs but have not found description of even similar situation. Please help Best regards, Genz |
|
|||
|
On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote:
> Hi Folks! > > I'm a newbie MySQL programmer. While doing my project I have faced > following problem: > > I have five columns in my table (indexed 'id' and four with numerical > values). I would like to sum the numerical values into a kind of > virtual column and sort the table accordingly to the decreasing values > of the sum (of the virual column). > > I have searched in numerous tutorials and FAQs but have not found > description of even similar situation. > > Please help > > Best regards, > > Genz select id, (a + b + c + d) theSum from someTable order by (a + b + c + d) desc |
|
|||
|
ZeldorBlat wrote:
> On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: >> Hi Folks! >> >> I'm a newbie MySQL programmer. While doing my project I have faced >> following problem: >> >> I have five columns in my table (indexed 'id' and four with numerical >> values). I would like to sum the numerical values into a kind of >> virtual column and sort the table accordingly to the decreasing >> values of the sum (of the virual column). >> >> I have searched in numerous tutorials and FAQs but have not found >> description of even similar situation. >> >> Please help >> >> Best regards, >> >> Genz > > select id, (a + b + c + d) theSum > from someTable > order by (a + b + c + d) desc That seems needlessly wasted on resources, by summing the columns twice. Surely either of these 2 would be better: SELECT `id`, (`a` + `b` + `c` + `d`) `theSum` FROM `someTable` ORDER BY `theSum` or SELECT `id`, (`a` + `b` + `c` + `d`) `theSum` FROM `someTable` ORDER BY 2 |
|
|||
|
On May 12, 3:53 pm, "Paul Lautman" <paul.laut...@btinternet.com>
wrote: > ZeldorBlat wrote: > > On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: > >> Hi Folks! > > >> I'm a newbie MySQL programmer. While doing my project I have faced > >> following problem: > > >> I have five columns in my table (indexed 'id' and four with numerical > >> values). I would like to sum the numerical values into a kind of > >> virtual column and sort the table accordingly to the decreasing > >> values of the sum (of the virual column). > > >> I have searched in numerous tutorials and FAQs but have not found > >> description of even similar situation. > > >> Please help > > >> Best regards, > > >> Genz > > > select id, (a + b + c + d) theSum > > from someTable > > order by (a + b + c + d) desc > > That seems needlessly wasted on resources, by summing the columns twice. > Surely either of these 2 would be better: > > SELECT > `id`, > (`a` + `b` + `c` + `d`) `theSum` > FROM `someTable` > ORDER BY `theSum` > > or > > SELECT > `id`, > (`a` + `b` + `c` + `d`) `theSum` > FROM `someTable` > ORDER BY 2 Do you know for a fact that MySQL will internally calculate the sums twice? If the database engine is properly optimized I would guess that it won't. |
|
|||
|
ZeldorBlat wrote:
> On May 12, 3:53 pm, "Paul Lautman" <paul.laut...@btinternet.com> > wrote: >> ZeldorBlat wrote: >>> On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: >>>> Hi Folks! >> >>>> I'm a newbie MySQL programmer. While doing my project I have faced >>>> following problem: >> >>>> I have five columns in my table (indexed 'id' and four with >>>> numerical values). I would like to sum the numerical values into a >>>> kind of virtual column and sort the table accordingly to the >>>> decreasing values of the sum (of the virual column). >> >>>> I have searched in numerous tutorials and FAQs but have not found >>>> description of even similar situation. >> >>>> Please help >> >>>> Best regards, >> >>>> Genz >> >>> select id, (a + b + c + d) theSum >>> from someTable >>> order by (a + b + c + d) desc >> >> That seems needlessly wasted on resources, by summing the columns >> twice. Surely either of these 2 would be better: >> >> SELECT >> `id`, >> (`a` + `b` + `c` + `d`) `theSum` >> FROM `someTable` >> ORDER BY `theSum` >> >> or >> >> SELECT >> `id`, >> (`a` + `b` + `c` + `d`) `theSum` >> FROM `someTable` >> ORDER BY 2 > > Do you know for a fact that MySQL will internally calculate the sums > twice? If the database engine is properly optimized I would guess > that it won't. Yes I do know that for a fact. No matter how well the database engine is likely to be optimised, it is not going to bother tryjng to recognise that two distinct pieces of calculation happen to have the same fields. That is why you are able to refer to previously calculated fields. Optimisers assume some level of intelligence from the programmer. |
|
|||
|
On May 13, 9:42 am, "Paul Lautman" <paul.laut...@btinternet.com>
wrote: > ZeldorBlat wrote: > > On May 12, 3:53 pm, "Paul Lautman" <paul.laut...@btinternet.com> > > wrote: > >> ZeldorBlat wrote: > >>> On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: > >>>> Hi Folks! > > >>>> I'm a newbie MySQL programmer. While doing my project I have faced > >>>> following problem: > > >>>> I have five columns in my table (indexed 'id' and four with > >>>> numerical values). I would like to sum the numerical values into a > >>>> kind of virtual column and sort the table accordingly to the > >>>> decreasing values of the sum (of the virual column). > > >>>> I have searched in numerous tutorials and FAQs but have not found > >>>> description of even similar situation. > > >>>> Please help > > >>>> Best regards, > > >>>> Genz > > >>> select id, (a + b + c + d) theSum > >>> from someTable > >>> order by (a + b + c + d) desc > > >> That seems needlessly wasted on resources, by summing the columns > >> twice. Surely either of these 2 would be better: > > >> SELECT > >> `id`, > >> (`a` + `b` + `c` + `d`) `theSum` > >> FROM `someTable` > >> ORDER BY `theSum` > > >> or > > >> SELECT > >> `id`, > >> (`a` + `b` + `c` + `d`) `theSum` > >> FROM `someTable` > >> ORDER BY 2 > > > Do you know for a fact that MySQL will internally calculate the sums > > twice? If the database engine is properly optimized I would guess > > that it won't. > > Yes I do know that for a fact. No matter how well the database engine is > likely to be optimised, it is not going to bother tryjng to recognise that > two distinct pieces of calculation happen to have the same fields. That is > why you are able to refer to previously calculated fields. Optimisers assume > some level of intelligence from the programmer. I was unable to find that documented anywhere. Do you have a source? I'm curious now... |
|
|||
|
On 13 May, 22:41, ZeldorBlat <zeldorb...@gmail.com> wrote:
> On May 13, 9:42 am, "Paul Lautman" <paul.laut...@btinternet.com> > wrote: > > > > > > > ZeldorBlat wrote: > > > On May 12, 3:53 pm, "Paul Lautman" <paul.laut...@btinternet.com> > > > wrote: > > >> ZeldorBlat wrote: > > >>> On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: > > >>>> Hi Folks! > > > >>>> I'm a newbie MySQL programmer. While doing my project I have faced > > >>>> following problem: > > > >>>> I have five columns in my table (indexed 'id' and four with > > >>>> numerical values). I would like to sum the numerical values into a > > >>>> kind of virtual column and sort the table accordingly to the > > >>>> decreasing values of the sum (of the virual column). > > > >>>> I have searched in numerous tutorials and FAQs but have not found > > >>>> description of even similar situation. > > > >>>> Please help > > > >>>> Best regards, > > > >>>> Genz > > > >>> select id, (a + b + c + d) theSum > > >>> from someTable > > >>> order by (a + b + c + d) desc > > > >> That seems needlessly wasted on resources, by summing the columns > > >> twice. Surely either of these 2 would be better: > > > >> SELECT > > >> `id`, > > >> (`a` + `b` + `c` + `d`) `theSum` > > >> FROM `someTable` > > >> ORDER BY `theSum` > > > >> or > > > >> SELECT > > >> `id`, > > >> (`a` + `b` + `c` + `d`) `theSum` > > >> FROM `someTable` > > >> ORDER BY 2 > > > > Do you know for a fact that MySQL will internally calculate the sums > > > twice? If the database engine is properly optimized I would guess > > > that it won't. > > > Yes I do know that for a fact. No matter how well the database engine is > > likely to be optimised, it is not going to bother tryjng to recognise that > > two distinct pieces of calculation happen to have the same fields. That is > > why you are able to refer to previously calculated fields. Optimisers assume > > some level of intelligence from the programmer. > > I was unable to find that documented anywhere. Do you have a source? > I'm curious now...- Hide quoted text - > > - Show quoted text - Sorry for the delay in answering this one. My source is the manual. On the page http://dev.mysql.com/doc/refman/5.0/en/select.html Where it says: "Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions." And of course it defines an Order By statement as: ORDER BY {col_name | expr | position} It does not say that columns selected for output can be referred to by using an expression that happens to match the expression that created the data for that column. Thus, unless you use one of the methods to refer to the output column, you are left with an expression that will have to be evaluated. |
|
|||
|
On May 15, 6:57 am, Captain Paralytic <paul_laut...@yahoo.com> wrote:
> On 13 May, 22:41, ZeldorBlat <zeldorb...@gmail.com> wrote: > > > > > On May 13, 9:42 am, "Paul Lautman" <paul.laut...@btinternet.com> > > wrote: > > > > ZeldorBlat wrote: > > > > On May 12, 3:53 pm, "Paul Lautman" <paul.laut...@btinternet.com> > > > > wrote: > > > >> ZeldorBlat wrote: > > > >>> On May 12, 3:44 pm, genzeryk <genze...@gmail.com> wrote: > > > >>>> Hi Folks! > > > > >>>> I'm a newbie MySQL programmer. While doing my project I have faced > > > >>>> following problem: > > > > >>>> I have five columns in my table (indexed 'id' and four with > > > >>>> numerical values). I would like to sum the numerical values into a > > > >>>> kind of virtual column and sort the table accordingly to the > > > >>>> decreasing values of the sum (of the virual column). > > > > >>>> I have searched in numerous tutorials and FAQs but have not found > > > >>>> description of even similar situation. > > > > >>>> Please help > > > > >>>> Best regards, > > > > >>>> Genz > > > > >>> select id, (a + b + c + d) theSum > > > >>> from someTable > > > >>> order by (a + b + c + d) desc > > > > >> That seems needlessly wasted on resources, by summing the columns > > > >> twice. Surely either of these 2 would be better: > > > > >> SELECT > > > >> `id`, > > > >> (`a` + `b` + `c` + `d`) `theSum` > > > >> FROM `someTable` > > > >> ORDER BY `theSum` > > > > >> or > > > > >> SELECT > > > >> `id`, > > > >> (`a` + `b` + `c` + `d`) `theSum` > > > >> FROM `someTable` > > > >> ORDER BY 2 > > > > > Do you know for a fact that MySQL will internally calculate the sums > > > > twice? If the database engine is properly optimized I would guess > > > > that it won't. > > > > Yes I do know that for a fact. No matter how well the database engine is > > > likely to be optimised, it is not going to bother tryjng to recognise that > > > two distinct pieces of calculation happen to have the same fields. That is > > > why you are able to refer to previously calculated fields. Optimisers assume > > > some level of intelligence from the programmer. > > > I was unable to find that documented anywhere. Do you have a source? > > I'm curious now...- Hide quoted text - > > > - Show quoted text - > > Sorry for the delay in answering this one. My source is the manual. > On the pagehttp://dev.mysql.com/doc/refman/5.0/en/select.html > Where it says: > > "Columns selected for output can be referred to in ORDER BY and GROUP > BY clauses using column names, column aliases, or column positions." > > And of course it defines an Order By statement as: > > ORDER BY {col_name | expr | position} > > It does not say that columns selected for output can be referred to by > using an expression that happens to match the expression that created > the data for that column. Thus, unless you use one of the methods to > refer to the output column, you are left with an expression that will > have to be evaluated. Ok, but nowhere does that say that the expression will or will not be internally evaluated a second time. If the database is smart enough to recognize when you have an expression in the group by clause that is not in the select list (in which case it complains), I would expect that it is smart enough to match an expression in the select list to an expression in the order by clause. |
|
|||
|
ZeldorBlat wrote:
> On May 15, 6:57 am, Captain Paralytic <paul_laut...@yahoo.com> wrote: > >> <snip> >> Sorry for the delay in answering this one. My source is the manual. >> On the pagehttp://dev.mysql.com/doc/refman/5.0/en/select.html >> Where it says: >> >> "Columns selected for output can be referred to in ORDER BY and GROUP >> BY clauses using column names, column aliases, or column positions." >> >> And of course it defines an Order By statement as: >> >> ORDER BY {col_name | expr | position} >> >> It does not say that columns selected for output can be referred to by >> using an expression that happens to match the expression that created >> the data for that column. Thus, unless you use one of the methods to >> refer to the output column, you are left with an expression that will >> have to be evaluated. >> > > Ok, but nowhere does that say that the expression will or will not be > internally evaluated a second time. If the database is smart enough > to recognize when you have an expression in the group by clause that > is not in the select list (in which case it complains), I would expect > that it is smart enough to match an expression in the select list to > an expression in the order by clause. > > It's barely worth discussing. As focused as database companies are on performance I doubt any would re-evaluate any expression that lexically matches another either before or after compilation. Referring to column aliases is a syntactic feature, not a performance feature. It's a good thing that MySQL has column aliases. I remember that being one of my favorite features of Teradata's SQL. Regardless, the answer to the question is unlikely to be found in the manuals. To know for sure the answer would either have to come from a MySQL engineer or someone else familiar with MySQL's parser, compiler, evaluator, or looking through the code. However, as confident as Mr. Lautman is of the "facts," there are none supporting the /assumption/ the ORDER BY clause' re-evaluates duplicate expressions. Though I have no "facts" supporting my assumptions, I'm equally confident that considering SQL databases maturity, the amount of engineering given them, the papers published on optimization strategies, the competition for paying licensees, and the constant pressure to improve performance, that such an issue as duplicate expression evaluation is unlikely to have gone unnoticed for 30+ years. -- Visit <http://blogs.instreamfinancial.com/anything.php> to read my rants on technology and the finance industry. |
|
|||
|
Thomas Gagne wrote:
> ZeldorBlat wrote: >> On May 15, 6:57 am, Captain Paralytic <paul_laut...@yahoo.com> wrote: >> >>> <snip> >>> Sorry for the delay in answering this one. My source is the manual. >>> On the pagehttp://dev.mysql.com/doc/refman/5.0/en/select.html >>> Where it says: >>> >>> "Columns selected for output can be referred to in ORDER BY and >>> GROUP BY clauses using column names, column aliases, or column >>> positions." >>> >>> And of course it defines an Order By statement as: >>> >>> ORDER BY {col_name | expr | position} >>> >>> It does not say that columns selected for output can be referred to >>> by using an expression that happens to match the expression that >>> created the data for that column. Thus, unless you use one of the >>> methods to refer to the output column, you are left with an >>> expression that will have to be evaluated. >>> >> >> Ok, but nowhere does that say that the expression will or will not be >> internally evaluated a second time. If the database is smart enough >> to recognize when you have an expression in the group by clause that >> is not in the select list (in which case it complains), I would >> expect that it is smart enough to match an expression in the select >> list to an expression in the order by clause. >> >> > It's barely worth discussing. As focused as database companies are on > performance I doubt any would re-evaluate any expression that > lexically matches another either before or after compilation. > > Referring to column aliases is a syntactic feature, not a performance > feature. It's a good thing that MySQL has column aliases. I remember > that being one of my favorite features of Teradata's SQL. > > Regardless, the answer to the question is unlikely to be found in the > manuals. To know for sure the answer would either have to come from a > MySQL engineer or someone else familiar with MySQL's parser, compiler, > evaluator, or looking through the code. > > However, as confident as Mr. Lautman is of the "facts," there are none > supporting the /assumption/ the ORDER BY clause' re-evaluates > duplicate expressions. Though I have no "facts" supporting my > assumptions, I'm equally confident that considering SQL databases > maturity, the amount of engineering given them, the papers published > on optimization strategies, the competition for paying licensees, and > the constant pressure to improve performance, that such an issue as > duplicate expression evaluation is unlikely to have gone unnoticed > for 30+ years. The manual tells you when an output column will be referenced. If it would be referenced in all cases then the writers would not have bothered to list the only 2 that applied. The manual doesn't tell you everything that does not apply. For instance it tells you that the SELECT syntax starts with SELECT. It doesn't tell you that you cannot spell SELECT as SELETC or any other way. ZeldorBlat's mention of the GROUP BY clause is totally unrelated and in fact wrong. It is indeed perfectly possible to have an expression in the GROUP BY clause that is not in the SELECT list. And it does not "complain" as he states. Instead it works! I just tried it. Regardless of what you say about optimisation strategies, the developers of databases do, as I have said before, assume a bit of intelligence in the people who use them. They have stated quite clearly in the manual when data in a column can be referenced. The would not bother, no matter how many papers had been written to try to optimise a query for something like this when they have already given and documented the 2 correct ways to do it. |
![]() |
| Thread Tools | |
| Display Modes | |
|
|