Load testing results in "Lost connection to MySQL server during query" (MySQL Database forums, Database Forums category)
I'm load-testing a dedicated server by making 20 CGI requests at the same time, and I'm getting an occasional

    MySQL database error 2013: Lost connection to MySQL server during query

The server is running Linux (Red Hat Fedora 5) and MySQL 5.

The MySQL server and client are on the same machine, so there shouldn't be a network error. MySQL isn't crashing. The requests aren't large and each one affects only one table row. About 1 in 50 requests fails.

20 simultaneous requests shouldn't be overloading anything, although the server does go compute-bound.

I've read http://dev.mysql.com/doc/refman/5.0/en/gone-away.html and none of the problems there seem to apply.

John Nagle
== Quote from John Nagle (nagle@animats.com)'s article
> I'm load-testing a dedicated server by making 20 CGI requests
> at the same time, and I'm getting an occasional
> MySQL database error 2013:
> Lost connection to MySQL server during query
> [...]

A couple of things. First, make sure your connections go over the Unix socket (the socket file defined in my.cnf). Second, what are you getting in the MySQL log files? Please post them if you can. Finally, what kind of storage engine is the table using? Could it be that MyISAM is choking because of concurrency issues?
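To check the socket suggestion above: with host="localhost" the MySQL client library on Unix normally connects through the socket file rather than TCP, but the path can also be read from my.cnf and passed explicitly. The sketch below is only an illustration. The helper name `socket_path_from_cnf` and the sample path `/var/lib/mysql/mysql.sock` are assumptions, not taken from the poster's setup, and the parser handles only simple `key = value` lines.

```python
def socket_path_from_cnf(text, section="client"):
    """Return the `socket` value from one [section] of my.cnf-style text, or None.

    Hypothetical helper: handles plain `key = value` lines only.
    """
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;!":
            continue  # skip blank lines, comments, and !include directives
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()  # entering a new [section]
            continue
        if current == section and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "socket":
                return value.strip("'\"")
    return None


# Sample configuration text; the socket path is an assumed example value.
SAMPLE = """
[mysqld]
socket=/var/lib/mysql/mysql.sock

[client]
socket = /var/lib/mysql/mysql.sock
"""

path = socket_path_from_cnf(SAMPLE)
# With MySQLdb installed, the path could then be passed explicitly:
#   MySQLdb.connect(unix_socket=path, user=username, passwd=password, db=database)
```

If the [client] and [mysqld] sections name different socket files, local connections would fail outright rather than intermittently, so this check mainly rules out a configuration mismatch.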
More info: This is happening during the opening of a MySQL connection, not during a query. What's actually failing in Python is

    db = MySQLdb.connect(host="localhost",
        use_unicode = True, charset = "utf8",
        user=username, passwd=password, db=database)

and you can see the last line of that in the Python backtrace below. That's raising:

    OperationalError: (2013, 'Lost connection to MySQL server during query')

which is wrong. So it's not related to the query; we're still in connection setup.

This only occurs under load, when about 20 programs, each with several threads attached to the database, are running. All the connections are local, to the same machine, so it's not a network problem. The MySQL server can support up to 100 connections, and we monitor that; SHOW PROCESSLIST never shows more than 40 processes.

A Google search for 'MySQL error 2013 "lost connection"' has 52,000+ hits, so there might be a problem. Apparently this is the generic MySQL error for "can't connect", as well as an indication of a connection loss. The MySQL AB documentation says that you get a 2002 or 2003 error for "Can't connect" situations. But that does not seem to be the case.

So now I have the code retrying the database connect at one second intervals until it either succeeds or fails 30 times in succession. This seems to be a useful workaround. It now survives minutes of heavy load; it used to fail within a minute. Still, this shouldn't be happening on a local connection, ever.

    Exception in thread Thread-5:
    Traceback (most recent call last):
      File "/usr/local/lib/python2.5/threading.py", line 460, in __bootstrap
        self.run()
      File "./sitetruth/InfoSiteRating.py", line 111, in run
        if not self.getDb() : # if no database
      File "./sitetruth/InfoSiteBase.py", line 42, in getDb
        self.db = self.owner().getDbConnection() # get a new database connection
      File "./sitetruth/InfoSite.py", line 59, in getDbConnection
        return(miscutils.dbattach(self.keyfile)) # get a new database connection
      File "./sitetruth/miscutils.py", line 227, in dbattach
        user=username, passwd=password, db=database)
      File "build/bdist.linux-i686/egg/MySQLdb/__init__.py", line 74, in Connect
      File "build/bdist.linux-i686/egg/MySQLdb/connections.py", line 170, in __init__
        super(Connection, self).__init__(*args, **kwargs2)
    OperationalError: (2013, 'Lost connection to MySQL server during query')
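The retry workaround described in this post (reconnect at one-second intervals, give up after 30 failures in a row) might look roughly like the sketch below. It is not the poster's actual code. The function name `connect_with_retry`, the injectable `sleep` argument, and the choice to retry only on client error codes 2002, 2003, and 2013 are all assumptions made for illustration. It is written for current Python rather than the Python 2.5 shown in the backtrace.

```python
import time

# MySQL client error codes discussed in the thread.
CR_CONNECTION_ERROR = 2002  # can't connect through the local socket
CR_CONN_HOST_ERROR = 2003   # can't connect to the server on the given host
CR_SERVER_LOST = 2013       # lost connection to the server during query

# Assumption: only these codes are worth retrying.
RETRYABLE = {CR_CONNECTION_ERROR, CR_CONN_HOST_ERROR, CR_SERVER_LOST}


def error_code(exc):
    """Return the numeric code from a DB-API style exception, or None."""
    if exc.args and isinstance(exc.args[0], int):
        return exc.args[0]
    return None


def connect_with_retry(connect, attempts=30, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or `attempts` tries in a row have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as exc:
            # Re-raise on the final attempt, or when a retry would not help
            # (for example, an access-denied error).
            if attempt == attempts or error_code(exc) not in RETRYABLE:
                raise
            sleep(delay)
```

With MySQLdb installed, `connect` would be a small wrapper such as `lambda: MySQLdb.connect(host="localhost", user=username, passwd=password, db=database)`. Passing the connect call in as a function keeps the retry logic independent of the driver and lets it be exercised without a running server.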
== Quote from John Nagle (nagle@animats.com)'s article
> More info: This is happening during the opening of a MySQL connection,
> not during a query.
> [...]

That's odd. I ran perror on 2002, 2003, and 2013, but every time it returned illegal code.

I remember I had this same problem a while back. It turned out that I had some corrupted files in MyISAM and InnoDB databases. Are you sure there is no corruption of the data files? To be sure, try to access all of your objects, including any views and stored procedures you may have.
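One way to act on the corruption question above is to run MySQL's CHECK TABLE statement against every table, which is a substitute for manually accessing each object rather than what the reply literally describes. The sketch below assumes a DB-API cursor (such as one from MySQLdb) that is already connected to the database in question; the function name `check_all_tables` is hypothetical. It does not cover stored procedures.

```python
def check_all_tables(cursor):
    """Run CHECK TABLE on every table and return the rows reporting trouble.

    `cursor` is assumed to be a DB-API cursor on the target database.
    CHECK TABLE returns rows of (Table, Op, Msg_type, Msg_text).
    """
    cursor.execute("SHOW TABLES")
    names = [row[0] for row in cursor.fetchall()]

    problems = []
    for name in names:
        # Identifiers cannot be bound as query parameters, so quote with
        # backticks and double any backtick inside the name.
        cursor.execute("CHECK TABLE `%s`" % name.replace("`", "``"))
        for row in cursor.fetchall():
            table, msg_type, msg_text = row[0], row[2], row[3]
            if msg_type in ("error", "warning"):
                problems.append((table, msg_type, msg_text))
    return problems
```

An empty result would suggest the data files are not the cause, which fits the observation that the failure occurs while connecting and not while reading a table.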
lark wrote:
> That's odd. I ran perror on 2002, 2003, and 2013, but every time it returned
> illegal code.

"perror" is for UNIX "errno" values, which are a completely different numbering system.

> I remember I had this same problem a while back. It turned out that I had some
> corrupted files in MyISAM and InnoDB databases. Are you sure there is no
> corruption of the data files? To be sure, try to access all of your objects,
> including any views and stored procedures you may have.

Unlikely. This error is showing up when first establishing the connection to the database server. Retrying connections seems to work well. I haven't been able to force this error in load testing since I set up the code to retry connections at 1 second intervals if the initial attempt fails. It's not happening once a connection has been established.

(Remember, this is a local host connection on the same machine; there shouldn't be any errors.)

John Nagle