Weird performance bottle-neck with large tables.

This is a discussion on Weird performance bottle-neck with large tables. within the MySQL Database forums, part of the Database Forums category; I am running into something really weird with MySQL. Bare in mind that the following is not a real-world ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 09-18-2006
jinxidoru@byu.net
 
Posts: n/a
Default Weird performance bottle-neck with large tables.

I am running into something really weird with MySQL. Bare in mind that
the following is not a real-world application but a test environment
that I have set up which is meant to only imitate another real-world
application. So, please don't respond with recommendations of how to
do this same thing differently. :)

I have created 10 tables each with 1 million to 10 million rows. Each
table has 1 million more than the previous. Each table contains two
64-bit integers which together form the primary key and a data field
which is of type TEXT. I then have a program that spawns 50
connections to the first table. Each thread then sends an equal
distribution of INSERTs, SELECTs and UPDATEs. I then monitor the
average query time of each type of query. After finishing with the
first table, I then move to the next table and so on until all tables
have been tested.

On my machine, I am seeing query times in the range of .6 ms/query.
There is little change in the values until I reach the larger tables.
Suddenly, at a point which changes each time, the query time jumps up
in the range of 30 ms/query. Sometimes this occurs at the 7 million
table, sometimes higher or lower. There are times when it never
happens at all and the .6 ms/query timing continues even in the 10
million row table. It never happens on the smaller tables though.

I then setup a test case where once it popped, it re-ran the tests on
the smaller tables. Once the query time has jumped to the 20-30 ms
point, queries on the small tables also take 20-30 ms. So, it has to
be something that is happening in mysql, not just the given table.
Once I let it run for a while, eventually the query times drop down low
again.

When I run "SHOW PROCESSLIST" while the test is running, I see only a
couple of queries running/waiting. Once the timing has popped, the
process list shows lots of queries, most of which are waiting.
Generally the waiting reason is either "Locked" or "System Lock".

I am running 3.23.51 on Linux 2.4.32. The tables are MyISAM. There is
plenty of memory. In fact, during the test my swap is completely
unused. So it can't be a swapping issue. At least, it can't be a
system swapping issue. It could very well be a swapping issue inside
of mysql.

If anyone could provide any light on this situation, I would greatly
appreciate it. Thank you.

Reply With Quote
  #2 (permalink)  
Old 09-18-2006
Brian Wakem
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.

jinxidoru@byu.net wrote:

> I am running into something really weird with MySQL. Bare in mind that
> the following is not a real-world application but a test environment
> that I have set up which is meant to only imitate another real-world
> application. So, please don't respond with recommendations of how to
> do this same thing differently. :)
>
> I have created 10 tables each with 1 million to 10 million rows. Each
> table has 1 million more than the previous. Each table contains two
> 64-bit integers which together form the primary key and a data field
> which is of type TEXT. I then have a program that spawns 50
> connections to the first table. Each thread then sends an equal
> distribution of INSERTs, SELECTs and UPDATEs. I then monitor the
> average query time of each type of query. After finishing with the
> first table, I then move to the next table and so on until all tables
> have been tested.
>
> On my machine, I am seeing query times in the range of .6 ms/query.
> There is little change in the values until I reach the larger tables.
> Suddenly, at a point which changes each time, the query time jumps up
> in the range of 30 ms/query. Sometimes this occurs at the 7 million
> table, sometimes higher or lower. There are times when it never
> happens at all and the .6 ms/query timing continues even in the 10
> million row table. It never happens on the smaller tables though.
>
> I then setup a test case where once it popped, it re-ran the tests on
> the smaller tables. Once the query time has jumped to the 20-30 ms
> point, queries on the small tables also take 20-30 ms. So, it has to
> be something that is happening in mysql, not just the given table.
> Once I let it run for a while, eventually the query times drop down low
> again.
>
> When I run "SHOW PROCESSLIST" while the test is running, I see only a
> couple of queries running/waiting. Once the timing has popped, the
> process list shows lots of queries, most of which are waiting.
> Generally the waiting reason is either "Locked" or "System Lock".
>
> I am running 3.23.51



That's a very old version of MySQL. It might not happen in 4.0, 4.1 or 5.0.


> on Linux 2.4.32.



That's old too, I haven't used a 2.4 kernel for a while.


> The tables are MyISAM. There is
> plenty of memory. In fact, during the test my swap is completely
> unused. So it can't be a swapping issue. At least, it can't be a
> system swapping issue. It could very well be a swapping issue inside
> of mysql.



But do all the indexes of all the tables fit in the key buffer?

I suggest you first check this, raise it if necessary, then if the problem
persists, first upgrade MySQL, then your kernel.



--
Brian Wakem
Email: http://homepage.ntlworld.com/b.wakem/myemail.png
Reply With Quote
  #3 (permalink)  
Old 09-18-2006
jinxidoru
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.


> That's a very old version of MySQL. It might not happen in 4.0, 4.1 or 5.0.


Yeah, I'm in the process of trying the same test on 5.0

> That's old too, I haven't used a 2.4 kernel for a while.


I agree, but because of certain project requirements, this is not
changeable.

> But do all the indexes of all the tables fit in the key buffer?
> I suggest you first check this, raise it if necessary, then if the problem
> persists, first upgrade MySQL, then your kernel.


I've tried increasing the key buffer, and it had no noticeable effect.
Also, if that were the case then I probably wouldn't see the timing
increase on the smaller tables when I returned to them.

Reply With Quote
  #4 (permalink)  
Old 09-18-2006
Brian Wakem
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.

jinxidoru wrote:

>
>> That's a very old version of MySQL. It might not happen in 4.0, 4.1 or
>> 5.0.

>
> Yeah, I'm in the process of trying the same test on 5.0
>
>> That's old too, I haven't used a 2.4 kernel for a while.

>
> I agree, but because of certain project requirements, this is not
> changeable.
>
>> But do all the indexes of all the tables fit in the key buffer?
>> I suggest you first check this, raise it if necessary, then if the
>> problem persists, first upgrade MySQL, then your kernel.

>
> I've tried increasing the key buffer, and it had no noticeable effect.
> Also, if that were the case then I probably wouldn't see the timing
> increase on the smaller tables when I returned to them.



Yes you would as the key buffer would contain the indexes of the larger
tables as the smaller tables' indexes would have been pushed out of the
buffer.


--
Brian Wakem
Email: http://homepage.ntlworld.com/b.wakem/myemail.png
Reply With Quote
  #5 (permalink)  
Old 09-18-2006
jinxidoru
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.

> Yes you would as the key buffer would contain the indexes of the larger
> tables as the smaller tables' indexes would have been pushed out of the
> buffer.


But, once a couple queries have been run on the smaller tables,
wouldn't the index for the smaller tables be put back into the cache.
I could see this being a problem when we're talking about just a few
individual queries, but my test case involves thousands of queries.

Reply With Quote
  #6 (permalink)  
Old 09-18-2006
Brian Wakem
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.

jinxidoru wrote:

>> Yes you would as the key buffer would contain the indexes of the larger
>> tables as the smaller tables' indexes would have been pushed out of the
>> buffer.

>
> But, once a couple queries have been run on the smaller tables,
> wouldn't the index for the smaller tables be put back into the cache.
> I could see this being a problem when we're talking about just a few
> individual queries, but my test case involves thousands of queries.



As I understand it, only the index blocks that are required for the queries
you execute would get cached in the key buffer, so if you keep querying
different parts of the table you might not get any buffer hits.

With MySQL 5.0 you can use 'LOAD INDEX INTO
CACHE' (http://dev.mysql.com/doc/refman/5.0/en/load-index.html) to make
sure all the index is cached beforehand, but of course you still need a key
buffer big enough to fit them all in.


--
Brian Wakem
Email: http://homepage.ntlworld.com/b.wakem/myemail.png
Reply With Quote
  #7 (permalink)  
Old 09-19-2006
Axel Schwenke
 
Posts: n/a
Default Re: Weird performance bottle-neck with large tables.

jinxidoru@byu.net wrote:

> I have created 10 tables each with 1 million to 10 million rows. Each
> table has 1 million more than the previous. Each table contains two
> 64-bit integers which together form the primary key and a data field
> which is of type TEXT. I then have a program that spawns 50
> connections to the first table. Each thread then sends an equal
> distribution of INSERTs, SELECTs and UPDATEs. I then monitor the
> average query time of each type of query. After finishing with the
> first table, I then move to the next table and so on until all tables
> have been tested.


So this is kind of "Performance vs. Data Set Size" test. How do you
access the rows? Random access, uniform distribution? This will not
simulate real workload then. Real data has hotspots and generally a
nonuniform access pattern.

> On my machine, I am seeing query times in the range of .6 ms/query.
> There is little change in the values until I reach the larger tables.
> Suddenly, at a point which changes each time, the query time jumps up
> in the range of 30 ms/query. Sometimes this occurs at the 7 million
> table, sometimes higher or lower. There are times when it never
> happens at all and the .6 ms/query timing continues even in the 10
> million row table. It never happens on the smaller tables though.


Looks like some cache gets too small. Since you are using MyISAM, there
is only the key_buffer inside MySQL. If you access the data files as
well (that is: fields not in indexes), those accesses will be cached by
the operating system. The fluctuation could be explained by the random
access pattern of your test. But linear progression is not very useful.
You should better increase your data set sizes in geometric
progression. Common are powers of two.

> I then setup a test case where once it popped, it re-ran the tests on
> the smaller tables. Once the query time has jumped to the 20-30 ms
> point, queries on the small tables also take 20-30 ms. So, it has to
> be something that is happening in mysql, not just the given table.


No. It's just how caches work. The key_buffer is not instantly flushed
and refilled, once you go back to a smaller table. Same goes for caches
outside MySQL. Some caches may also be dirty and need to get flushed
first.

> Once I let it run for a while, eventually the query times drop down low
> again.


Right. As soon as the caches are "hot" again, the performance
increases.

> When I run "SHOW PROCESSLIST" while the test is running, I see only a
> couple of queries running/waiting. Once the timing has popped, the
> process list shows lots of queries, most of which are waiting.
> Generally the waiting reason is either "Locked" or "System Lock".


Locking comes from your operation mix. MyISAM tables can be read or
written mutually exclusive. So locking happens all the time. Though
you cannot see it unless there is a backlog of queries in the server.
As long as the server can answer your questions faster than your
testing environment asks new ones, everything will be fine.

If you want to understand the phenomenon, you have to closely monitor
both MySQL and your operating system. I.e. the efficiency of the
key_buffer can be estimated by looking at the key_read_requests and
key_reads counters from SHOW STATUS.

With BLOBs in MyISAM tables you will experience table fragmentation.
Some UPDATEs will create new records, leaving a hole where the old
record was. Eventually some holes will be filled later, but overall
the table size will increase and make your caches less usefull and
performance drop.


XL
--
Axel Schwenke, Senior Software Developer, MySQL AB

Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
MySQL User Forums: http://forums.mysql.com/
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 03:54 PM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0