Selecting a fixed number of entries for each day in range

This is a discussion on Selecting a fixed number of entries for each day in range within the MySQL Database forums, part of the Database Forums category; I have a table with a DATE column, and I want to make a SELECT that returns N random entries ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-05-2008
Bruno Barberi Gnecco
 
Posts: n/a
Default Selecting a fixed number of entries for each day in range

I have a table with a DATE column, and I want to make a SELECT that returns
N random entries from this table for each day in a given range. The result I want
looks like this:

date some data
----------------------------------
2008/01/01 Entry 16
2008/01/01 Entry 32
2008/01/01 Entry 13
2008/01/02 Entry 2
2008/01/02 Entry 35
2008/01/02 Entry 54
....

Is there a faster way to do this than running 30 SELECTs?

I've posted this about a year ago, but couldn't find a way to do it other
than a loop. I now need a more complex query (entries are weighted) and I'm still
looking for a good solution.

Thanks a lot!

--
Bruno Barberi Gnecco <brunobg_at_users.sourceforge.net>
Five is a sufficiently close approximation to infinity.
-- Robert Firth
Reply With Quote
  #2 (permalink)  
Old 03-06-2008
Michael Austin
 
Posts: n/a
Default Re: Selecting a fixed number of entries for each day in range

Bruno Barberi Gnecco wrote:

> I have a table with a DATE column, and I want to make a SELECT that returns
> N random entries from this table for each day in a given range. The result I

want
> looks like this:


> date some data
> ----------------------------------
> 2008/01/01 Entry 16
> 2008/01/01 Entry 32
> 2008/01/01 Entry 13
> 2008/01/02 Entry 2
> 2008/01/02 Entry 35
> 2008/01/02 Entry 54
> ....


> Is there a faster way to do this than running 30 SELECTs?


> I've posted this about a year ago, but couldn't find a way to do it other
> than a loop. I now need a more complex query (entries are weighted) and I'm

still
> looking for a good solution.


> Thanks a lot!


\

Reading the Docs would save you a whole lot of time - as would:
google mysql random rand

how about looking at the RAND() function

select field from table order by RAND() limit 10;



Reply With Quote
  #3 (permalink)  
Old 03-06-2008
Rik Wasmus
 
Posts: n/a
Default Re: Selecting a fixed number of entries for each day in range

On Thu, 06 Mar 2008 02:01:22 +0100, Michael Austin
<maustin@firstdbasource.com> wrote:

> Bruno Barberi Gnecco wrote:
>
>> I have a table with a DATE column, and I want to make a SELECT that
>> returns
>> N random entries from this table for each day in a given range. The
>> result I

> want
>> looks like this:

>
>> date some data
>> ----------------------------------
>> 2008/01/01 Entry 16
>> 2008/01/01 Entry 32
>> 2008/01/01 Entry 13
>> 2008/01/02 Entry 2
>> 2008/01/02 Entry 35
>> 2008/01/02 Entry 54
>> ....

>
>> Is there a faster way to do this than running 30 SELECTs?

>
>> I've posted this about a year ago, but couldn't find a way to do it
>> other
>> than a loop. I now need a more complex query (entries are weighted) and
>> I'm

> still
>> looking for a good solution.

>
>> Thanks a lot!

>
> Reading the Docs would save you a whole lot of time - as would:
> google mysql random rand
>
> how about looking at the RAND() function
>
> select field from table order by RAND() limit 10;


Reading the question instead would have saved you this post: the question
is how to return N random entries _for_each_day_ in range, in one query.
(N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008
- 03-01-2008 etc.) Not an easy task IMO. Even the standard having a
'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance),
and joining the table or possibly a subquery against it does not yield the
desired results as you can't limit a join as far as I know.

If I really needed this, I'd probably opt for the same thing I suspect OP
is doing now: query the database for each date with the WHERE date =
<date> ORDER BY RAND() LIMIT N, and display those after one another.
Making those seperate queries a UNION is not really faster or more
convenient.

If one query is really needed, but no real extra data save for some
primary key 'id', this could be one:

SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length;
SELECT
date
SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND()
SEPARATOR ','),
0,
(3 * @my_id_length) + 3 - 1)) as 'ids'
FROM tablename
WHERE date BETWEEN <start_date> AND <end_date>
GROUP BY date
ORDER BY date

....and intepret a string like ' 123,7245, 6' in the calling
script/application.
--
Rik Wasmus
Reply With Quote
  #4 (permalink)  
Old 03-06-2008
Captain Paralytic
 
Posts: n/a
Default Re: Selecting a fixed number of entries for each day in range

On 6 Mar, 09:43, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
> On Thu, 06 Mar 2008 02:01:22 +0100, Michael Austin
>
>
>
> <maus...@firstdbasource.com> wrote:
> > Bruno Barberi Gnecco wrote:

>
> >> I have a table with a DATE column, and I want to make a SELECT that
> >> returns
> >> N random entries from this table for each day in a given range. The
> >> result I

> > want
> >> looks like this:

>
> >> date some data
> >> ----------------------------------
> >> 2008/01/01 Entry 16
> >> 2008/01/01 Entry 32
> >> 2008/01/01 Entry 13
> >> 2008/01/02 Entry 2
> >> 2008/01/02 Entry 35
> >> 2008/01/02 Entry 54
> >> ....

>
> >> Is there a faster way to do this than running 30 SELECTs?

>
> >> I've posted this about a year ago, but couldn't find a way to do it
> >> other
> >> than a loop. I now need a more complex query (entries are weighted) and
> >> I'm

> > still
> >> looking for a good solution.

>
> >> Thanks a lot!

>
> > Reading the Docs would save you a whole lot of time - as would:
> > google mysql random rand

>
> > how about looking at the RAND() function

>
> > select field from table order by RAND() limit 10;

>
> Reading the question instead would have saved you this post: the question
> is how to return N random entries _for_each_day_ in range, in one query.
> (N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008
> - 03-01-2008 etc.) Not an easy task IMO. Even the standard having a
> 'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance),
> and joining the table or possibly a subquery against it does not yield the
> desired results as you can't limit a join as far as I know.
>
> If I really needed this, I'd probably opt for the same thing I suspect OP
> is doing now: query the database for each date with the WHERE date =
> <date> ORDER BY RAND() LIMIT N, and display those after one another.
> Making those seperate queries a UNION is not really faster or more
> convenient.
>
> If one query is really needed, but no real extra data save for some
> primary key 'id', this could be one:
>
> SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length;
> SELECT
> date
> SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND()
> SEPARATOR ','),
> 0,
> (3 * @my_id_length) + 3 - 1)) as 'ids'
> FROM tablename
> WHERE date BETWEEN <start_date> AND <end_date>
> GROUP BY date
> ORDER BY date
>
> ...and intepret a string like ' 123,7245, 6' in the calling
> script/application.
> --
> Rik Wasmus


Wouldn't the "top n of a group" using random numbers to define the
"top" work?

http://thenoyes.com/littlenoise/?p=36
Reply With Quote
  #5 (permalink)  
Old 03-07-2008
Bruno Barberi Gnecco
 
Posts: n/a
Default Re: Selecting a fixed number of entries for each day in range

Captain Paralytic wrote:

>>>> I have a table with a DATE column, and I want to make a SELECT that
>>>>returns
>>>>N random entries from this table for each day in a given range. The
>>>>result I
>>>
>>>want
>>>
>>>>looks like this:

>>
>>>>date some data
>>>>----------------------------------
>>>>2008/01/01 Entry 16
>>>>2008/01/01 Entry 32
>>>>2008/01/01 Entry 13
>>>>2008/01/02 Entry 2
>>>>2008/01/02 Entry 35
>>>>2008/01/02 Entry 54
>>>>....

>>
>>>> Is there a faster way to do this than running 30 SELECTs?

>>
>>>> I've posted this about a year ago, but couldn't find a way to do it
>>>>other
>>>>than a loop. I now need a more complex query (entries are weighted) and
>>>>I'm
>>>
>>>still
>>>
>>>>looking for a good solution.

>>
>>>> Thanks a lot!

>>
>>>Reading the Docs would save you a whole lot of time - as would:
>>>google mysql random rand

>>
>>>how about looking at the RAND() function

>>
>>>select field from table order by RAND() limit 10;

>>
>>Reading the question instead would have saved you this post: the question
>>is how to return N random entries _for_each_day_ in range, in one query.
>>(N = 3 would yield 6 results for 01-01-2008 - 02-01-2008, 9 for 01-01-2008
>>- 03-01-2008 etc.) Not an easy task IMO. Even the standard having a
>>'dates' table (having dates from 01-01-1970 - 01-01-2031 for instance),
>>and joining the table or possibly a subquery against it does not yield the
>>desired results as you can't limit a join as far as I know.
>>
>>If I really needed this, I'd probably opt for the same thing I suspect OP
>>is doing now: query the database for each date with the WHERE date =
>><date> ORDER BY RAND() LIMIT N, and display those after one another.
>>Making those seperate queries a UNION is not really faster or more
>>convenient.
>>
>>If one query is really needed, but no real extra data save for some
>>primary key 'id', this could be one:
>>
>>SELECT FLOOR(LOG10(MAX(id)) + 1 FROM tablename INTO @my_id_length;
>>SELECT
>> date
>> SUBSTRING(GROUP_CONCAT(LPAD(id,@my_id_length,' ') ORDER BY RAND()
>>SEPARATOR ','),
>> 0,
>> (3 * @my_id_length) + 3 - 1)) as 'ids'
>> FROM tablename
>>WHERE date BETWEEN <start_date> AND <end_date>
>>GROUP BY date
>>ORDER BY date
>>
>>...and intepret a string like ' 123,7245, 6' in the calling
>>script/application.
>>--
>>Rik Wasmus

>
>
> Wouldn't the "top n of a group" using random numbers to define the
> "top" work?
>
> http://thenoyes.com/littlenoise/?p=36


Thank you, this seems a very neat solution. It's also easy to
add weights to it, indexing by weight*RAND().

I don't know if it will be efficient if the tables are
large, however, since you do need to generate a temporary table
such as (SELECT original.id, RAND() AS rand FROM original). So
I think (correct me if I'm wrong) that this should behave in
performance essentially like a ORDER BY RAND(). Is there some
way to optimize it that I'm not seeing?

I suppose the only way to optimize would to precompute
the random mapping table and add a WHERE random < @randomnumber
condition (not feasible in my case, alas).

--
Bruno Barberi Gnecco <brunobg_at_users.sourceforge.net>
MONDAY:
In Christian countries, the day after the football game.
-- Ambrose Bierce
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 04:27 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0