slow query on big table

This is a discussion on slow query on big table within the MySQL Database forums, part of the Database Forums category; I have a table with more than 3,000,000 records. Frequently, I have to do an update on this ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 09-14-2006
lambelly@gmail.com
 
Posts: n/a
Default slow query on big table

I have a table with more than 3,000,000 records. Frequently, I have to
do an update on this table that is taking a long time and I'd like some
advice as to what I can do to fix the problem.

Let's call the table students. The primary key is an auto-indent ID
which is unique for each student. However, the table itself basically
works to connect students with groups and campaigns they are associated
with. Because of this a student might exist in the table more than
once. When I do this update I want it to change information in each of
the instances of that student, so instead of updating based on the
unique id in Students I have to use an identifier that is common to the
student regardless of which campaign or group they belong to. Let's say
this is their phone number, which is an integer for the purposes of
this question. I realize this is a poor example, but for the purposes
of the question it'll work.

So, the table is indexed not only on the unique identifier, but on the
phone number as well. When I do a SHOW INDEX on this table it shows me
both of these. However, when I do an explain on both, the unique index
shows me that it will have to inspect 1 row while the phone number
shows me that it will have to inspect all of the rows in the table.
Because this table is big, this takes a long time.

So, I guess my question is, why isn't the phone number as efficient as
the unique indentifier? Is it because it could exist multiple times in
the database? Is there any way to make it more efficient? is the only
solution to this problem to make the phone number a unique key for this
table and having associations between groups and campaigns done
elsewhere?

Thanks for your help. I've been researching this for a while, and any
assistance would be greatly appreciated.

Reply With Quote
  #2 (permalink)  
Old 09-14-2006
Kim Hunter
 
Posts: n/a
Default Re: slow query on big table

lambelly@gmail.com wrote:
> I have a table with more than 3,000,000 records. Frequently, I have to
> do an update on this table that is taking a long time and I'd like some
> advice as to what I can do to fix the problem.
>
> Let's call the table students. The primary key is an auto-indent ID
> which is unique for each student. However, the table itself basically
> works to connect students with groups and campaigns they are associated
> with. Because of this a student might exist in the table more than
> once. When I do this update I want it to change information in each of
> the instances of that student, so instead of updating based on the
> unique id in Students I have to use an identifier that is common to the
> student regardless of which campaign or group they belong to. Let's say
> this is their phone number, which is an integer for the purposes of
> this question. I realize this is a poor example, but for the purposes
> of the question it'll work.
>
> So, the table is indexed not only on the unique identifier, but on the
> phone number as well. When I do a SHOW INDEX on this table it shows me
> both of these. However, when I do an explain on both, the unique index
> shows me that it will have to inspect 1 row while the phone number
> shows me that it will have to inspect all of the rows in the table.
> Because this table is big, this takes a long time.
>
> So, I guess my question is, why isn't the phone number as efficient as
> the unique indentifier? Is it because it could exist multiple times in
> the database? Is there any way to make it more efficient? is the only
> solution to this problem to make the phone number a unique key for this
> table and having associations between groups and campaigns done
> elsewhere?
>
> Thanks for your help. I've been researching this for a while, and any
> assistance would be greatly appreciated.
>

updates and insert speeds are affected by the number of indexes you have,

is there any reason you can't just use the autoincrement as the primary
key index, and drop the other indexes (if you do you will see a lot
faster update).
Reply With Quote
  #3 (permalink)  
Old 09-14-2006
lambelly@gmail.com
 
Posts: n/a
Default Re: slow query on big table

Kim Hunter wrote:
>
> is there any reason you can't just use the autoincrement as the primary
> key index, and drop the other indexes (if you do you will see a lot
> faster update).


I could, but I'd have to redesign the table structure. As it is, a
student might be in the table more than once and I need to update this
information on every instance of this student in the student table. So,
the student might be in there maybe two or three times.

If I were to use the autoincrement I'd have to move the associations
with groups and campaigns into a different table. Which might not be a
bad idea.

Reply With Quote
  #4 (permalink)  
Old 09-15-2006
Jerry Stuckle
 
Posts: n/a
Default Re: slow query on big table

lambelly@gmail.com wrote:
> Kim Hunter wrote:
>
>>is there any reason you can't just use the autoincrement as the primary
>>key index, and drop the other indexes (if you do you will see a lot
>>faster update).

>
>
> I could, but I'd have to redesign the table structure. As it is, a
> student might be in the table more than once and I need to update this
> information on every instance of this student in the student table. So,
> the student might be in there maybe two or three times.
>
> If I were to use the autoincrement I'd have to move the associations
> with groups and campaigns into a different table. Which might not be a
> bad idea.
>


First of all, a phone number will be a char field, not an int. And
chars fields always take longer to compare than ints.

However, you have a bigger problem. You shouldn't have anyone in there
more than once. (And what happens if you get two students sharing an
apartment - and therefore the same phone number?)

You need to normalize your database. Have one entry for a student in
this table, and use another table with two columns - a student id and an
organization id.

Yes, you'll have to do a little work to do it this way, but it will cut
down your table size, allow you to search in the (much quicker) auto
increment field and make everything a lot faster.

And google for tutorials on "database normalization". You'll probably
find other places you can make this more efficient.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Reply With Quote
  #5 (permalink)  
Old 09-15-2006
lambelly@gmail.com
 
Posts: n/a
Default Re: slow query on big table

Jerry Stuckle wrote:

> You need to normalize your database.


Yeah. I didn't design the database and kind of adopted it when I got
this job. When I talk about normalizing it my boss doesn't really want
to do that because of the changes we'd have to make to the software,
which would be complicated. So, I'm hoping that there are solutions
which will make both of us happy because right now we're hampered with
large problems. However, it looks like all we can do is fix the
database properly.

Reply With Quote
  #6 (permalink)  
Old 09-15-2006
Jerry Stuckle
 
Posts: n/a
Default Re: slow query on big table

lambelly@gmail.com wrote:
> Jerry Stuckle wrote:
>
>
>>You need to normalize your database.

>
>
> Yeah. I didn't design the database and kind of adopted it when I got
> this job. When I talk about normalizing it my boss doesn't really want
> to do that because of the changes we'd have to make to the software,
> which would be complicated. So, I'm hoping that there are solutions
> which will make both of us happy because right now we're hampered with
> large problems. However, it looks like all we can do is fix the
> database properly.
>


With the current design, your options are going to be very limited.
There probably isn't going to be a lot you can do to speed it up
significantly.

You can play with the MySQL tuning parameters some, and might get some
gains. But I suspect since you're main problem is on an UPDATE you
won't gain a lot.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 04:25 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0