This is a discussion on Selecting a fixed number of entries for each day in range within the MySQL Database forums, part of the Database Forums category; I have a table with a DATE column, and I want to make a SELECT that returns N random entries ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
I have a table with a DATE column, and I want to make a SELECT that returns
N random entries from this table for each day in a given range. The result I want looks like this: date some data ---------------------------------- 2008/01/01 Entry 16 2008/01/01 Entry 32 2008/01/01 Entry 13 2008/01/02 Entry 2 2008/01/02 Entry 35 2008/01/02 Entry 54 .... Is there a faster way to do this than running 30 SELECTs? I've posted this about a year ago, but couldn't find a way to do it other than a loop. I now need a more complex query (entries are weighted) and I'm still looking for a good solution. Thanks a lot! -- Bruno Barberi Gnecco <brunobg_at_users.sourceforge.net> Five is a sufficiently close approximation to infinity. -- Robert Firth |
|
|||
|
Bruno Barberi Gnecco wrote:
> I have a table with a DATE column, and I want to make a SELECT that returns > N random entries from this table for each day in a given range. The result I want > looks like this: > date some data > ---------------------------------- > 2008/01/01 Entry 16 > 2008/01/01 Entry 32 > 2008/01/01 Entry 13 > 2008/01/02 Entry 2 > 2008/01/02 Entry 35 > 2008/01/02 Entry 54 > .... > Is there a faster way to do this than running 30 SELECTs? > I've posted this about a year ago, but couldn't find a way to do it other > than a loop. I now need a more complex query (entries are weighted) and I'm still > looking for a good solution. > Thanks a lot! \ Reading the Docs would save you a whole lot of time - as would: google mysql random rand how about looking at the RAND() function select field from table order by RAND() limit 10; |
|
|||
|
On Thu, 06 Mar 2008 02:01:22 +0100, Michael Austin
<maustin@firstdbasource.com> wrote: > Bruno Barberi Gnecco wrote: > >> I have a table with a DATE column, and I want to make a SELECT that >> returns >> N random entries from this table for each day in a given range. The >> result I > want >> looks like this: > >> date some data >> ---------------------------------- >> 2008/01/01 Entry 16 >> 2008/01/01 Entry 32 >> 2008/01/01 Entry 13 >> 2008/01/02 Entry 2 >> 2008/01/02 Entry 35 >> 2008/01/02 Entry 54 >> .... > >> Is there a faster way to do this than running 30 SELECTs? > >> I've posted this about a year ago, but couldn't find a way to do it >> other >> than a loop. I now need a more complex query (entries are weighted) and >> I'm > still >> looking for a good solution. > >> Thanks a lot! > > Reading the Docs would save you a whole lot of time - as would: > google mysql random rand > > how about looking at the RAND() function > > select field from table order by RAND() limit 10; Reading the question instead would have saved you this post: the question is how to return N random entries _for_each_day_ in range, in one query. (N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008 - 03-01-2008 etc.) Not an easy task IMO. Even the standard having a 'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance), and joining the table or possibly a subquery against it does not yield the desired results as you can't limit a join as far as I know. If I really needed this, I'd probably opt for the same thing I suspect OP is doing now: query the database for each date with the WHERE date = <date> ORDER BY RAND() LIMIT N, and display those after one another. Making those seperate queries a UNION is not really faster or more convenient. If one query is really needed, but no real extra data save for some primary key 'id', this could be one: SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length; SELECT date SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND() SEPARATOR ','), 0, (3 * @my_id_length) + 3 - 1)) as 'ids' FROM tablename WHERE date BETWEEN <start_date> AND <end_date> GROUP BY date ORDER BY date ....and intepret a string like ' 123,7245, 6' in the calling script/application. -- Rik Wasmus |
|
|||
|
On 6 Mar, 09:43, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
> On Thu, 06 Mar 2008 02:01:22 +0100, Michael Austin > > > > <maus...@firstdbasource.com> wrote: > > Bruno Barberi Gnecco wrote: > > >> I have a table with a DATE column, and I want to make a SELECT that > >> returns > >> N random entries from this table for each day in a given range. The > >> result I > > want > >> looks like this: > > >> date some data > >> ---------------------------------- > >> 2008/01/01 Entry 16 > >> 2008/01/01 Entry 32 > >> 2008/01/01 Entry 13 > >> 2008/01/02 Entry 2 > >> 2008/01/02 Entry 35 > >> 2008/01/02 Entry 54 > >> .... > > >> Is there a faster way to do this than running 30 SELECTs? > > >> I've posted this about a year ago, but couldn't find a way to do it > >> other > >> than a loop. I now need a more complex query (entries are weighted) and > >> I'm > > still > >> looking for a good solution. > > >> Thanks a lot! > > > Reading the Docs would save you a whole lot of time - as would: > > google mysql random rand > > > how about looking at the RAND() function > > > select field from table order by RAND() limit 10; > > Reading the question instead would have saved you this post: the question > is how to return N random entries _for_each_day_ in range, in one query. > (N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008 > - 03-01-2008 etc.) Not an easy task IMO. Even the standard having a > 'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance), > and joining the table or possibly a subquery against it does not yield the > desired results as you can't limit a join as far as I know. > > If I really needed this, I'd probably opt for the same thing I suspect OP > is doing now: query the database for each date with the WHERE date = > <date> ORDER BY RAND() LIMIT N, and display those after one another. > Making those seperate queries a UNION is not really faster or more > convenient. > > If one query is really needed, but no real extra data save for some > primary key 'id', this could be one: > > SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length; > SELECT > date > SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND() > SEPARATOR ','), > 0, > (3 * @my_id_length) + 3 - 1)) as 'ids' > FROM tablename > WHERE date BETWEEN <start_date> AND <end_date> > GROUP BY date > ORDER BY date > > ...and intepret a string like ' 123,7245, 6' in the calling > script/application. > -- > Rik Wasmus Wouldn't the "top n of a group" using random numbers to define the "top" work? http://thenoyes.com/littlenoise/?p=36 |
|
|||
|
Captain Paralytic wrote:
>>>> I have a table with a DATE column, and I want to make a SELECT that >>>>returns >>>>N random entries from this table for each day in a given range. The >>>>result I >>> >>>want >>> >>>>looks like this: >> >>>>date some data >>>>---------------------------------- >>>>2008/01/01 Entry 16 >>>>2008/01/01 Entry 32 >>>>2008/01/01 Entry 13 >>>>2008/01/02 Entry 2 >>>>2008/01/02 Entry 35 >>>>2008/01/02 Entry 54 >>>>.... >> >>>> Is there a faster way to do this than running 30 SELECTs? >> >>>> I've posted this about a year ago, but couldn't find a way to do it >>>>other >>>>than a loop. I now need a more complex query (entries are weighted) and >>>>I'm >>> >>>still >>> >>>>looking for a good solution. >> >>>> Thanks a lot! >> >>>Reading the Docs would save you a whole lot of time - as would: >>>google mysql random rand >> >>>how about looking at the RAND() function >> >>>select field from table order by RAND() limit 10; >> >>Reading the question instead would have saved you this post: the question >>is how to return N random entries _for_each_day_ in range, in one query. >>(N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008 >>- 03-01-2008 etc.) Not an easy task IMO. Even the standard having a >>'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance), >>and joining the table or possibly a subquery against it does not yield the >>desired results as you can't limit a join as far as I know. >> >>If I really needed this, I'd probably opt for the same thing I suspect OP >>is doing now: query the database for each date with the WHERE date = >><date> ORDER BY RAND() LIMIT N, and display those after one another. >>Making those seperate queries a UNION is not really faster or more >>convenient. >> >>If one query is really needed, but no real extra data save for some >>primary key 'id', this could be one: >> >>SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length; >>SELECT >> date >> SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND() >>SEPARATOR ','), >> 0, >> (3 * @my_id_length) + 3 - 1)) as 'ids' >> FROM tablename >>WHERE date BETWEEN <start_date> AND <end_date> >>GROUP BY date >>ORDER BY date >> >>...and intepret a string like ' 123,7245, 6' in the calling >>script/application. >>-- >>Rik Wasmus > > > Wouldn't the "top n of a group" using random numbers to define the > "top" work? > > http://thenoyes.com/littlenoise/?p=36 Thank you, this seems a very neat solution. It's also easy to add weights to it, indexing by weight*RAND(). I don't know if it will be efficient if the tables are large, however, since you do need to generate a temporary table such as (SELECT original.id, RAND() AS rand FROM original). So I think (correct me if I'm wrong) that this should behave in performance essentially like a ORDER BY RAND(). Is there some way to optimize it that I'm not seeing? I suppose the only way to optimize would to precompute the random mapping table and add a WHERE random < @randomnumber condition (not feasible in my case, alas). -- Bruno Barberi Gnecco <brunobg_at_users.sourceforge.net> MONDAY: In Christian countries, the day after the football game. -- Ambrose Bierce |