This is a discussion on Implementing an history with a limited number of entries with MySQL within the MySQL Database forums, part of the Database Forums category; Hello All, I'm searching for the faster way (performance-wise) to create the following structure with mysql: I've ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Hello All,
I'm searching for the faster way (performance-wise) to create the following structure with mysql: I've a table where I take an history of things that happen in a system. This table contains just some integer and a creation timestamp in UNIX time Every time there is a new event to add to the history I just INSERT it The problem is that I want this history to contain at MAX 10000 elements. Older elements should be removed in some way. What's the faster way to take the history limited in length? This is my best try so far. I take two max-length instead of just 10000, that I call low and high here where: high = 12000 low = 10000 high is a tollerance value, so I can do the following. - every time I've to insert a new element: - if the current history length > high - row = SELECT time FROM history ORDER BY time DESC LIMIT low,1 - timelimt = row.time - DELETE FROM history WHERE time >= timelimit With the high/low trick I can do this queries just 1 time every 2000 new history rows added, still I need all this to be very fast once it needs to be done. Of course time is INDEXed. Are there better ways to do this stuff? Regards, Salvatore |
|
|||
|
If I were doing it, I would write a script on the server that would
run at 2am (or whenever) to check the database and remove history. Unless it's a priority to have history removed constantly on the go, this is the best way to ensure it can be done properly and off peak hours. I run several scripts under cronjobs to manipulate the database at night. Nino On May 11, 4:28 pm, Salvatore Sanfilippo <anti...@gmail.com> wrote: > Hello All, > > I'm searching for the faster way (performance-wise) to create the > following structure with mysql: > > I've a table where I take an history of things that happen in a > system. > This table contains just some integer and a creation timestamp in UNIX > time > Every time there is a new event to add to the history I just INSERT it > > The problem is that I want this history to contain at MAX 10000 > elements. > Older elements should be removed in some way. > > What's the faster way to take the history limited in length? > > This is my best try so far. > > I take two max-length instead of just 10000, that I call low and high > here where: > > high = 12000 > low = 10000 > > high is a tollerance value, so I can do the following. > > - every time I've to insert a new element: > - if the current history length > high > - row = SELECT time FROM history ORDER BY time DESC LIMIT low,1 > - timelimt = row.time > - DELETE FROM history WHERE time >= timelimit > > With the high/low trick I can do this queries just 1 time every 2000 > new history rows added, > still I need all this to be very fast once it needs to be done. > > Of course time is INDEXed. > > Are there better ways to do this stuff? > > Regards, > Salvatore |
|
|||
|
On May 12, 1:53 am, "nino9st...@yahoo.com" <nino9st...@yahoo.com>
wrote: > If I were doing it, I would write a script on the server that would > run at 2am (or whenever) to check the database and remove history. > Unless it's a priority to have history removed constantly on the go, > this is the best way to ensure it can be done properly and off peak > hours. I run several scripts under cronjobs to manipulate the database > at night. > > Nino Hello Nino, Thanks for the suggestion, I also use to run scripts via cron, or even better to have sort of daemons in separated processes running DB operations incrementally. Here the problem is a bit harder since I've multiple histories (one for customer) and there are no "idle" periods on the server since we are serving customers in different timezones. Probably the best solution from the point of view of performances is to switch from DB to plain files with fixed record length implementing a circular buffer, but I could like to take the flexibility of the DB. Regards, Salvatore |
|
|||
|
Salvatore Sanfilippo wrote:
> Hello All, > > I'm searching for the faster way (performance-wise) to create the > following structure with mysql: > > I've a table where I take an history of things that happen in a > system. > This table contains just some integer and a creation timestamp in UNIX > time > Every time there is a new event to add to the history I just INSERT it > > The problem is that I want this history to contain at MAX 10000 > elements. > Older elements should be removed in some way. > > What's the faster way to take the history limited in length? > > This is my best try so far. > > I take two max-length instead of just 10000, that I call low and high > here where: > > high = 12000 > low = 10000 > > high is a tollerance value, so I can do the following. > > - every time I've to insert a new element: > - if the current history length > high > - row = SELECT time FROM history ORDER BY time DESC LIMIT low,1 > - timelimt = row.time > - DELETE FROM history WHERE time >= timelimit > > With the high/low trick I can do this queries just 1 time every 2000 > new history rows added, > still I need all this to be very fast once it needs to be done. > > Of course time is INDEXed. > > Are there better ways to do this stuff? > > Regards, > Salvatore Here's another way to think about it. Once you have 10000 rows, then every time you insert a new row, you can just delete the oldest one. So, if you're using MyISAM, you can check the result of SHOW TABLE STATUS and if it is 10000 then insert your row and delete the oldest one. |
|
|||
|
On May 12, 10:32 am, "Paul Lautman" <paul.laut...@btinternet.com>
wrote: > Salvatore Sanfilippo wrote: > > Hello All, > > > I'm searching for the faster way (performance-wise) to create the > > following structure with mysql: > > > I've a table where I take an history of things that happen in a > > system. > > This table contains just some integer and a creation timestamp in UNIX > > time > > Every time there is a new event to add to the history I just INSERT it > > > The problem is that I want this history to contain at MAX 10000 > > elements. > > Older elements should be removed in some way. > > > What's the faster way to take the history limited in length? > > > This is my best try so far. > > > I take two max-length instead of just 10000, that I call low and high > > here where: > > > high = 12000 > > low = 10000 > > > high is a tollerance value, so I can do the following. > > > - every time I've to insert a new element: > > - if the current history length > high > > - row = SELECT time FROM history ORDER BY time DESC LIMIT low,1 > > - timelimt = row.time > > - DELETE FROM history WHERE time >= timelimit > > > With the high/low trick I can do this queries just 1 time every 2000 > > new history rows added, > > still I need all this to be very fast once it needs to be done. > > > Of course time is INDEXed. > > > Are there better ways to do this stuff? > > > Regards, > > Salvatore > > Here's another way to think about it. Once you have 10000 rows, then every > time you insert a new row, you can just delete the oldest one. > So, if you're using MyISAM, you can check the result of SHOW TABLE STATUS > and if it is 10000 then insert your row and delete the oldest one. Or build a table with 10000 rows and then just UPDATE the oldest one each time |
|
|||
|
Salvatore Sanfilippo wrote:
> On May 12, 1:53 am, "nino9st...@yahoo.com" <nino9st...@yahoo.com> > wrote: >> If I were doing it, I would write a script on the server that would >> run at 2am (or whenever) to check the database and remove history. >> Unless it's a priority to have history removed constantly on the go, >> this is the best way to ensure it can be done properly and off peak >> hours. I run several scripts under cronjobs to manipulate the database >> at night. >> >> Nino > > Hello Nino, Thanks for the suggestion, I also use to run scripts via > cron, or even better to have sort of daemons in separated processes > running DB operations incrementally. Here the problem is a bit harder > since I've multiple histories (one for customer) and there are no > "idle" > periods on the server since we are serving customers in different > timezones. > > Probably the best solution from the point of view of performances is > to switch from DB to plain files with fixed record length implementing > a circular buffer, but I could like to take the flexibility of the DB. > > Regards, > Salvatore > Why would that be the "best solution". In fact, if your file gets large, it will probably not perform as well as a database - unless you add your own indicies and other performance improvements. And you don't need "idle times" to run these queries. None of them are going to take hours to execute. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
|
|||
|
On May 12, 4:24 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> > Probably the best solution from the point of view of performances is > > to switch from DB to plain files with fixed record length implementing > > a circular buffer, but I could like to take the flexibility of the DB. > Why would that be the "best solution". In fact, if your file gets > large, it will probably not perform as well as a database - unless you > add your own indicies and other performance improvements. Hello Jerry, thank you for your reply, the fact that a file-based schema is so fast for this application is because you can prebuild a fixed record length table of N elements, and take as first 4 bytes of the table a 32bit counter that points to the next record to write. The counter gets incremented every time a new record is instered. This means O(1) insert, with minimal constant times, and the query I need to run against this is just "find all the records with ID > X", this is trivial as I just need to check the current pointer, subtract X from this number, seek(2) (the actual POSIX syscall) and read fixed length records. Again O(1) seek + O(N) (with N being the length of the records to read) to read all the rest. If it is so good, why I'm posting questions about SQL you may wonder... because if I can find a solution with SQL I can, for instance, backup the DB and have everyting saved, from the users, to the records, and so on. Also this is not the first application I develop. I know that things tend to become more complex, so I bet in some week, month or year I'll need to perform more complex queries against this records. > > And you don't need "idle times" to run these queries. None of them are > going to take hours to execute. Sure but my requirements is that the DELETE query I need will never take more than 200 milliseconds in the worst case. For now I've 100 records for user and this is working very well. When I'll switch to 10000 records for user I hope it will continue to work well. About this, I wonder if it's better to create a tabl for every customer or instead, (like I'm doing currently), to have a csutomer_id in the table. The latter solution is more handy for me as I can do for example SELECT DISTINCT(customer_id) if I want to see in real time the active customers, but if a table for every customer will perform better in pratice I'll switch to this model. Thanks again, Salvatore |
|
|||
|
On May 12, 11:32 am, "Paul Lautman" <paul.laut...@btinternet.com>
wrote: > Here's another way to think about it. Once you have 10000 rows, then every > time you insert a new row, you can just delete the oldest one. > So, if you're using MyISAM, you can check the result of SHOW TABLE STATUS > and if it is 10000 then insert your row and delete the oldest one. Hello Paul, yes this can work as far as there is a way to make "delete the oldest one" very fast. AFAIK the faster I can get is: row = SELECT MIN(time) FROM history WHERE customer_id=$mycustomer DELETE FROM row WHERE time=row[0]; I wonder if there are faster ways. Regards, Salvatore |
|
|||
|
Salvatore Sanfilippo wrote:
> On May 12, 11:32 am, "Paul Lautman" <paul.laut...@btinternet.com> > wrote: > >> Here's another way to think about it. Once you have 10000 rows, then >> every time you insert a new row, you can just delete the oldest one. >> So, if you're using MyISAM, you can check the result of SHOW TABLE >> STATUS and if it is 10000 then insert your row and delete the oldest >> one. > > Hello Paul, > > yes this can work as far as there is a way to make "delete the oldest > one" > very fast. AFAIK the faster I can get is: > > row = SELECT MIN(time) FROM history WHERE customer_id=$mycustomer > DELETE FROM row WHERE time=row[0]; > > I wonder if there are faster ways. > > Regards, > Salvatore I find the "Strawberry Query" for finding the row with the minimum value (LEFT (self)JOIN and NULL test), many times faster than the MIN() approach. |
|
|||
|
Salvatore Sanfilippo wrote:
> On May 12, 11:32 am, "Paul Lautman" <paul.laut...@btinternet.com> > wrote: > >> Here's another way to think about it. Once you have 10000 rows, then >> every time you insert a new row, you can just delete the oldest one. >> So, if you're using MyISAM, you can check the result of SHOW TABLE >> STATUS and if it is 10000 then insert your row and delete the oldest >> one. > > Hello Paul, > > yes this can work as far as there is a way to make "delete the oldest > one" > very fast. AFAIK the faster I can get is: > > row = SELECT MIN(time) FROM history WHERE customer_id=$mycustomer > DELETE FROM row WHERE time=row[0]; > > I wonder if there are faster ways. > > Regards, > Salvatore Indeed, you should be able to do this in a single query thus: DELETE FROM `h1` USING `history` `h1` LEFT JOIN `history` `h2` ON `h1`.`time` > `h2`.`time` WHERE `h2`.`time` IS NULL |
![]() |
| Thread Tools | |
| Display Modes | |
|
|