Checksum Table producing different results?

This is a discussion on Checksum Table producing different results? within the MySQL Database forums, part of the Database Forums category; == Quote from ircmaxell (ircmaxell@gmail.com)'s article > On May 23, 5:46 pm, gordonb.ni...@burditt.org (Gordon ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #11 (permalink)  
Old 05-24-2007
lark
 
Posts: n/a
Default Re: Checksum Table producing different results?

== Quote from ircmaxell (ircmaxell@gmail.com)'s article
> On May 23, 5:46 pm, gordonb.ni...@burditt.org (Gordon Burditt) wrote:
> > >> >I've got 2 database servers running the same database (cannot do
> > >> >replication as one is live and the other development). One of the
> > >> >databases needs to be synced from time to time. Because it's rather
> > >> >large, I'm looking to only transfer tables that have changed, and
> > >> >hence am using the CHECKSUM TABLE command to determine if tables have
> > >> >identical data. Dev server is MySQL 5.0.24a Live is MySQL 5.0.18

> >
> > >> I don't think there is any guarantee that CHECKSUM TABLE produces
> > >> the same checksum even if the data contents is the same. For
> > >> example, the records could be in different physical order. (Try
> > >> that with a text file: run md5 on the file, re-order a couple of
> > >> the lines, and try again. You'll almost certainly come up with a
> > >> different checksum.) Unallocated space might make a difference,
> > >> too.

> >
> > >> >The problem, is that a few of our tables NEVER report identical
> > >> >checksums. I have verified that they are identical (by exporting from
> > >> >one server, importing to the other as a different name, and doing a
> > >> >checksum on that server). Any ideas as to what I can do? Schema's are
> > >> >Identical. I've ran optimize and repair table, to no effect. Thanks!

> >
> > >Is there any alternative as to how I can check a table for mirrored
> > >data?

> >
> > Consider this: mysqldump both tables into separate files, using
> > options that produce a single insert statement per record. Sort
> > both files. (at this point you've probably messed up the file to
> > the point of not being able to restore it, but this doesn't matter).
> > Run diff(1) between the two sorted files.- Hide quoted text -
> >
> > - Show quoted text -

> That's the problem I was hoping to avoid... We've got a T-1 between
> the servers, and about 100meg of data. To download all the data would
> take about 9 minutes, processing it would be quick, and updating would
> be slow (about 30 minutes, as instead of syncing, if they differ, I
> drop the entire table and re-install it). I wish there was an easier
> (and faster) way...



ok, this is a long shot but i've heard that google has some nifty tools you may be
able to use. take a look:

http://code.google.com/p/google-mysql-tools/



alternatively you could get creative and use something like this:

mysqldump -u root -proot_pwd --where="idimg between 93305 and 93346"
--insert-ignore mydb images | mysql -h master_host -u root -proot_pwd mydb

you may want to ignore --where clause for your purposes.
--
POST BY: PHP News Reader
Reply With Quote
  #12 (permalink)  
Old 05-24-2007
ircmaxell
 
Posts: n/a
Default Re: Checksum Table producing different results?

On May 24, 11:35 am, lark <ham...@sbcglobal.net> wrote:
> == Quote from ircmaxell (ircmax...@gmail.com)'s article
>
>
>
> > On May 23, 5:46 pm, gordonb.ni...@burditt.org (Gordon Burditt) wrote:
> > > >> >I've got 2 database servers running the same database (cannot do
> > > >> >replication as one is live and the other development). One of the
> > > >> >databases needs to be synced from time to time. Because it's rather
> > > >> >large, I'm looking to only transfer tables that have changed, and
> > > >> >hence am using the CHECKSUM TABLE command to determine if tables have
> > > >> >identical data. Dev server is MySQL 5.0.24a Live is MySQL 5.0.18

>
> > > >> I don't think there is any guarantee that CHECKSUM TABLE produces
> > > >> the same checksum even if the data contents is the same. For
> > > >> example, the records could be in different physical order. (Try
> > > >> that with a text file: run md5 on the file, re-order a couple of
> > > >> the lines, and try again. You'll almost certainly come up with a
> > > >> different checksum.) Unallocated space might make a difference,
> > > >> too.

>
> > > >> >The problem, is that a few of our tables NEVER report identical
> > > >> >checksums. I have verified that they are identical (by exporting from
> > > >> >one server, importing to the other as a different name, and doing a
> > > >> >checksum on that server). Any ideas as to what I can do? Schema's are
> > > >> >Identical. I've ran optimize and repair table, to no effect. Thanks!

>
> > > >Is there any alternative as to how I can check a table for mirrored
> > > >data?

>
> > > Consider this: mysqldump both tables into separate files, using
> > > options that produce a single insert statement per record. Sort
> > > both files. (at this point you've probably messed up the file to
> > > the point of not being able to restore it, but this doesn't matter).
> > > Run diff(1) between the two sorted files.- Hide quoted text -

>
> > > - Show quoted text -

> > That's the problem I was hoping to avoid... We've got a T-1 between
> > the servers, and about 100meg of data. To download all the data would
> > take about 9 minutes, processing it would be quick, and updating would
> > be slow (about 30 minutes, as instead of syncing, if they differ, I
> > drop the entire table and re-install it). I wish there was an easier
> > (and faster) way...

>
> ok, this is a long shot but i've heard that google has some nifty tools you may be
> able to use. take a look:
>
> http://code.google.com/p/google-mysql-tools/
>
> alternatively you could get creative and use something like this:
>
> mysqldump -u root -proot_pwd --where="idimg between 93305 and 93346"
> --insert-ignore mydb images | mysql -h master_host -u root -proot_pwd mydb
>
> you may want to ignore --where clause for your purposes.
> --
> POST BY: PHP News Reader

Not a bad idea... The only problem with something like that is that
we are using Windows 2003 servers on both ends... I'll check into
google's tools... Thanks!

Reply With Quote
  #13 (permalink)  
Old 05-24-2007
ircmaxell
 
Posts: n/a
Default Re: Checksum Table producing different results?

On May 24, 11:42 am, ircmaxell <ircmax...@gmail.com> wrote:
> On May 24, 11:35 am, lark <ham...@sbcglobal.net> wrote:
>
>
>
> > == Quote from ircmaxell (ircmax...@gmail.com)'s article

>
> > > On May 23, 5:46 pm, gordonb.ni...@burditt.org (Gordon Burditt) wrote:
> > > > >> >I've got 2 database servers running the same database (cannot do
> > > > >> >replication as one is live and the other development). One of the
> > > > >> >databases needs to be synced from time to time. Because it's rather
> > > > >> >large, I'm looking to only transfer tables that have changed, and
> > > > >> >hence am using the CHECKSUM TABLE command to determine if tables have
> > > > >> >identical data. Dev server is MySQL 5.0.24a Live is MySQL 5.0.18

>
> > > > >> I don't think there is any guarantee that CHECKSUM TABLE produces
> > > > >> the same checksum even if the data contents is the same. For
> > > > >> example, the records could be in different physical order. (Try
> > > > >> that with a text file: run md5 on the file, re-order a couple of
> > > > >> the lines, and try again. You'll almost certainly come up with a
> > > > >> different checksum.) Unallocated space might make a difference,
> > > > >> too.

>
> > > > >> >The problem, is that a few of our tables NEVER report identical
> > > > >> >checksums. I have verified that they are identical (by exporting from
> > > > >> >one server, importing to the other as a different name, and doing a
> > > > >> >checksum on that server). Any ideas as to what I can do? Schema's are
> > > > >> >Identical. I've ran optimize and repair table, to no effect. Thanks!

>
> > > > >Is there any alternative as to how I can check a table for mirrored
> > > > >data?

>
> > > > Consider this: mysqldump both tables into separate files, using
> > > > options that produce a single insert statement per record. Sort
> > > > both files. (at this point you've probably messed up the file to
> > > > the point of not being able to restore it, but this doesn't matter).
> > > > Run diff(1) between the two sorted files.- Hide quoted text -

>
> > > > - Show quoted text -
> > > That's the problem I was hoping to avoid... We've got a T-1 between
> > > the servers, and about 100meg of data. To download all the data would
> > > take about 9 minutes, processing it would be quick, and updating would
> > > be slow (about 30 minutes, as instead of syncing, if they differ, I
> > > drop the entire table and re-install it). I wish there was an easier
> > > (and faster) way...

>
> > ok, this is a long shot but i've heard that google has some nifty tools you may be
> > able to use. take a look:

>
> >http://code.google.com/p/google-mysql-tools/

>
> > alternatively you could get creative and use something like this:

>
> > mysqldump -u root -proot_pwd --where="idimg between 93305 and 93346"
> > --insert-ignore mydb images | mysql -h master_host -u root -proot_pwd mydb

>
> > you may want to ignore --where clause for your purposes.
> > --
> > POST BY: PHP News Reader

>
> Not a bad idea... The only problem with something like that is that
> we are using Windows 2003 servers on both ends... I'll check into
> google's tools... Thanks!- Hide quoted text -
>
> - Show quoted text -


Alright, I figured it out... Here's what I'm doing
SELECT BIT_XOR(
CONCAT( col1, col2, col3, col4, col...)
)
FROM table1
GROUP BY NULL

Since order doesn't matter in an XOR, different orderings will not
matter. There are a few limitations. For example, the following two
sets have the same XOR value
col1 - col2
1 - test1234
2 - test4321

AND
col1 - col2
1 - test4321
2 - test1234

It checks to see if the data is the same, not necessarally the
ordering/paring of data. This should be fine for my needs, so I think
I have my anser... (just need to check with the boss when he gets back
from lunch).

Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 05:27 PM.


Powered by vBulletin® Version 3.6.8
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0