This is a discussion on How to optimize primary key change on a large table? within the MySQL Database forums, part of the Database Forums category; I have a table with about 200 million rows with a AUTO INCREMENT primary key. However, the auto increment count ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
I have a table with about 200 million rows with a AUTO INCREMENT
primary key. However, the auto increment count is being locked all the time, because there are many concurrent inserts. I haved decided to use an UUID primary key instead, which would increse the primary key field from 32 bits to 96 bits. How to do this migration using the least time possible? I have read that changing primary keys is a very time consuming task, would it be faster to dump and re-populate the table? Thank you. |
|
|||
|
eduardorochabr wrote :
> I have a table with about 200 million rows with a AUTO INCREMENT > primary key. > I haved decided to use an UUID primary key instead, which would > increse the primary key field from 32 bits to 96 bits. Are you using InnoDB engine ? If so, I *highly* discourage you do to so. Primary Key is very special for InnoDB, and: 1) Long PK is to avoid as much as possible. 2) Randomly valued PK is to avoid as much as possible. By using an UUID, you are violating 1 and 2 statement at the same time. So at least consider doing some tests before actually make the migration, to validate it is a good solution (which I doubt). > the auto increment count is being locked all the time, because there > are many concurrent inserts. As a workaround, you may be able to use an outside process to attribute a pseudo auto-increment value to the PK. In case you are using MyISAM (or other) engine, I don't know much about those, and could not help you further. -- Hugo |
|
|||
|
On 16 nov, 09:04, Hugo <h...@nospam.invalid> wrote:
> eduardorochabr wrote : > > > I have a table with about 200 million rows with a AUTO INCREMENT > > primary key. > > I haved decided to use an UUID primary key instead, which would > > increse the primary key field from 32 bits to 96 bits. > > Are you using InnoDB engine ? If so, I *highly* discourage you do to so. > Primary Key is very special for InnoDB, and: > > 1) Long PK is to avoid as much as possible. > 2) Randomly valued PK is to avoid as much as possible. > > By using an UUID, you are violating 1 and 2 statement at the same time. > So at least consider doing some tests before actually make the > migration, to validate it is a good solution (which I doubt). > > > the auto increment count is being locked all the time, because there > > are many concurrent inserts. > > As a workaround, you may be able to use an outside process to attribute > a pseudo auto-increment value to the PK. > > In case you are using MyISAM (or other) engine, I don't know much about > those, and could not help you further. > > -- > Hugo Hi Hugo, Thank you for your answer. I am using innodb, sorry for not telling this before. I will look for alternatives, I am thinking about using a FK in the primary key, so it could be kind of [user_id, some_sequential_value]. This way, I could have the benefits of clustering the index for related records (user based), but the key would be bigger than it is today (at least 32 bits + 32 bits, maybe 32 bits + 64 bits, since I am about to use some kind of timestamping in "some_sequential_value"). Such primary key would still be a trouble for innodb? Thank you. |
|
|||
|
> I have a table with about 200 million rows with a AUTO INCREMENT
> primary key. > > However, the auto increment count is being locked all the time, > because there are many concurrent inserts. > > I haved decided to use an UUID primary key instead, which would > increse the primary key field from 32 bits to 96 bits. I don't get it. You want to migrate the database structure during full load of the database? > How to do this migration using the least time possible? I have read > that changing primary keys is a very time consuming task, would it be > faster to dump and re-populate the table? I would first be worried about the data itself, and only after that about the time it takes. Even then, key migration is not some "modify field" work. There are references to the key, and even if there are no references to it, I'd add a new key field first and drop the old one later, so I get back in every step I take. So: add a new field for UUID, with a uniqueness constraint and random contents, if that is at all possible (I would not know how). If you have to do this in steps (add a non-unique field, fill it with random data, add the constraint), your concurrent inserts will make this impossible. Really, you should get the inserts off-line before doing maintenance like this. Modify the other tables that have a foreign key to the above table, so that they use your new field. If all works well, drop the Primary Key index, The old primary key column, and create a new Primary Key with the new field. That should do it. -- Willem Bogaerts Application smith Kratz B.V. http://www.kratz.nl/ |
|
|||
|
On 16 nov, 11:36, eduardorochabr <eduardoroch...@gmail.com> wrote:
> On 16 nov, 09:04, Hugo <h...@nospam.invalid> wrote: > > > > > eduardorochabr wrote : > > > > I have a table with about 200 million rows with a AUTO INCREMENT > > > primary key. > > > I haved decided to use an UUID primary key instead, which would > > > increse the primary key field from 32 bits to 96 bits. > > > Are you using InnoDB engine ? If so, I *highly* discourage you do to so. > > Primary Key is very special for InnoDB, and: > > > 1) Long PK is to avoid as much as possible. > > 2) Randomly valued PK is to avoid as much as possible. > > > By using an UUID, you are violating 1 and 2 statement at the same time. > > So at least consider doing some tests before actually make the > > migration, to validate it is a good solution (which I doubt). > > > > the auto increment count is being locked all the time, because there > > > are many concurrent inserts. > > > As a workaround, you may be able to use an outside process to attribute > > a pseudo auto-increment value to the PK. > > > In case you are using MyISAM (or other) engine, I don't know much about > > those, and could not help you further. > > > -- > > Hugo > > Hi Hugo, > > Thank you for your answer. I am using innodb, sorry for not telling > this before. > > I will look for alternatives, I am thinking about using a FK in the > primary key, so it could be kind of [user_id, some_sequential_value]. > This way, I could have the benefits of clustering the index for > related records (user based), but the key would be bigger than it is > today (at least 32 bits + 32 bits, maybe 32 bits + 64 bits, since I am > about to use some kind of timestamping in "some_sequential_value"). > > Such primary key would still be a trouble for innodb? > > Thank you. I also had a thought about using a simple field primary key with such composition: (user_id << 64) + (some_sequential_value_maybe_using_timestamp_and_r andom_value) Innodb would cluster this single field primary key in similar way as using [user_id, some_sequential_value]? Thank you. |
|
|||
|
On 16 nov, 11:49, Willem Bogaerts
<w.bogae...@kratz.maardanzonderditstuk.nl> wrote: > > I have a table with about 200 million rows with a AUTO INCREMENT > > primary key. > > > However, the auto increment count is being locked all the time, > > because there are many concurrent inserts. > > > I haved decided to use an UUID primary key instead, which would > > increse the primary key field from 32 bits to 96 bits. > > I don't get it. You want to migrate the database structure during full > load of the database? > > > How to do this migration using the least time possible? I have read > > that changing primary keys is a very time consuming task, would it be > > faster to dump and re-populate the table? > > I would first be worried about the data itself, and only after that > about the time it takes. > > Even then, key migration is not some "modify field" work. There are > references to the key, and even if there are no references to it, I'd > add a new key field first and drop the old one later, so I get back in > every step I take. > > So: add a new field for UUID, with a uniqueness constraint and random > contents, if that is at all possible (I would not know how). If you have > to do this in steps (add a non-unique field, fill it with random data, > add the constraint), your concurrent inserts will make this impossible. > Really, you should get the inserts off-line before doing maintenance > like this. > > Modify the other tables that have a foreign key to the above table, so > that they use your new field. > > If all works well, drop the Primary Key index, The old primary key > column, and create a new Primary Key with the new field. > > That should do it. > > -- > Willem Bogaerts > > Application smith > Kratz B.V.http://www.kratz.nl/ Hi William, thanks for your answer. You're right, I will get the system offline while doing this migration. However, I will try to minimize this downtime as much as possible. The steps you've described are one of the paths I am about to follow, though I am afraid that recalculating the PK like this should be much slower than export/import the table with the new PK structure. |
|
|||
|
> The steps you've described are one of the paths I am about to follow,
> though I am afraid that recalculating the PK like this should be much > slower than export/import the table with the new PK structure. I don't know. You could be right. But whatever path you use, do not forget to set the autocommit off. That really speeds things up! Good luck, -- Willem Bogaerts Application smith Kratz B.V. http://www.kratz.nl/ |
|
|||
|
Not entirely sure this would work with InnoDB.. but I've used it with
MyISAM tables with more than 50 million rows. I'd suggest making a copy of the table. With the copy, do: ALTER TABLE <table> CHANGE <key_column_name> <key_column_name> BIGINT UNSIGNED AUTO_INCREMENT The PRIMARY KEY status will remain there. Per MySQL "However, BIGINT values have 64 bits, so do note that if you were to insert one million rows per second, it would still take nearly three hundred thousand years before BIGINT reached its upper bound." Then you get the benefits of keeping it as an integer type instead of a UUID. Once it finishes, this is where you could expect some down time. Go offline for a while, and select any entries from the first table that don't exist in the original table. If you've been using an auto_increment column, then you can do INSERT INTO <table> SELECT * FROM <original_table> WHERE id > <last_entry_id>. Shutdown the DB server and move the new tables/db over. For changes to multiple tables, I'd suggest making a new DB and use the same table names. This way you only need to rename the old DB files (add a _ to the end or something, so MySQL won't load them).. then move the new DB tables into the old DB directory and start the server up again. Note that this will work for MyISAM tables.. I'm not 100% sure about InnoDB tables. Although with an InnoDB table, I'm sure it will work up to the point of moving the new DB files into the old DB folder. From there, I'd advise doing a little test of your own before trying any of this. Another method that I've used is very circumstantial. I have it set up to work with a view for reads, but writes go directly to the table. This was by design, so I could change the view over to an offline table without losing any write requests. This allowed me to create the "offline" table, copy the data I wanted into it, change the view to read from the offline table, alter the original table, then change the view back and drop the "offline" table. The number of columns was low, and the row count would at most, EVER be around two million rows.. after years and years and years of collecting data. Hope some of this is useful. Regards, Michael Martinek |