This is a discussion on REGEXP Clarification within the MySQL Database forums, part of the Database Forums category; Greetings: I am trying to locate incorrect data in a column using REGEXP. Only numeric data is allowed but I ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Greetings:
I am trying to locate incorrect data in a column using REGEXP. Only numeric data is allowed but I am importing data, some of which is not properly formatted. I am using: SELECT * FROM `table` WHERE `column` REGEXP "[^[:digit:]]"; to locate any value that is not a number. I've tried several variations and I get an empty set back. I know that there are illegal characters in some of the data but I can't seem to locate it using this query. The field can contain anywhere from 1 to 6 digits, if that is relevant. Advice is much appreciated. -- Regards, Jeff Gardner ___________________________ "Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are." --Kyle Hearn |
|
|||
|
"Jeff Gardner" wrote... > Greetings: > > I am trying to locate incorrect data in a column using REGEXP. Only > numeric data is allowed but I am importing data, some of which is not > properly formatted. I am using: > > SELECT * > FROM `table` > WHERE `column` REGEXP "[^[:digit:]]"; > > to locate any value that is not a number. I've tried several variations > and I get an empty set back. I know that there are illegal characters in > some of the data but I can't seem to locate it using this query. The > field can contain anywhere from 1 to 6 digits, if that is relevant. Advice > is much appreciated. Could you post an example of the "invalid" data, on my 5.1.12 you query works. Regards Dimitre |
|
|||
|
Radoulov, Dimitre wrote:
> "Jeff Gardner" wrote... >> Greetings: >> >> I am trying to locate incorrect data in a column using REGEXP. Only >> numeric data is allowed but I am importing data, some of which is not >> properly formatted. I am using: >> >> SELECT * >> FROM `table` >> WHERE `column` REGEXP "[^[:digit:]]"; >> >> to locate any value that is not a number. I've tried several variations >> and I get an empty set back. I know that there are illegal characters in >> some of the data but I can't seem to locate it using this query. The >> field can contain anywhere from 1 to 6 digits, if that is relevant. Advice >> is much appreciated. > > Could you post an example of the "invalid" data, > on my 5.1.12 you query works. > > > Regards > Dimitre > > The program that I am importing into tells me that one of the fields contains a period. I am using mysql from the command line to "sanitize" the data before final import. If I find it, I'll post it. -- Regards, Jeff Gardner ___________________________ "Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are." --Kyle Hearn |
|
|||
|
Jeff Gardner wrote:
> Radoulov, Dimitre wrote: >> "Jeff Gardner" wrote... >>> Greetings: >>> >>> I am trying to locate incorrect data in a column using REGEXP. Only >>> numeric data is allowed but I am importing data, some of which is not >>> properly formatted. I am using: >>> >>> SELECT * >>> FROM `table` >>> WHERE `column` REGEXP "[^[:digit:]]"; >>> >>> to locate any value that is not a number. I've tried several >>> variations and I get an empty set back. I know that there are >>> illegal characters in some of the data but I can't seem to locate it >>> using this query. The field can contain anywhere from 1 to 6 digits, >>> if that is relevant. Advice is much appreciated. >> >> Could you post an example of the "invalid" data, >> on my 5.1.12 you query works. >> >> >> Regards >> Dimitre >> > The program that I am importing into tells me that one of the fields > contains a period. I am using mysql from the command line to "sanitize" > the data before final import. If I find it, I'll post it. > It turns out that there were 13 rows with a NULL value in the field in question. NULL is not whitespace, nor is it digits. Any explanation as to why the above query returned no rows? -- Regards, Jeff Gardner ___________________________ "Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are." --Kyle Hearn |
|
|||
|
"Jeff Gardner" wrote... > Jeff Gardner wrote: >> Radoulov, Dimitre wrote: >>> "Jeff Gardner" wrote... >>>> Greetings: >>>> >>>> I am trying to locate incorrect data in a column using REGEXP. Only >>>> numeric data is allowed but I am importing data, some of which is not >>>> properly formatted. I am using: >>>> >>>> SELECT * >>>> FROM `table` >>>> WHERE `column` REGEXP "[^[:digit:]]"; >>>> >>>> to locate any value that is not a number. I've tried several >>>> variations and I get an empty set back. I know that there are illegal >>>> characters in some of the data but I can't seem to locate it using this >>>> query. The field can contain anywhere from 1 to 6 digits, if that is >>>> relevant. Advice is much appreciated. [...] > It turns out that there were 13 rows with a NULL value in the field in > question. NULL is not whitespace, nor is it digits. Any explanation as > to why the above query returned no rows? Because of its special meaning, I suppose. "Conceptually, NULL means "a missing unknown value" and it is treated somewhat differently from other values. " Unknown means that you cannot _know_ whether it's a digit or something else (btw, whitespaces are handled correctly with your example query). So, a) Use constraints (not null) or b) Add "column is not null" in your query Regards Dimitre |
|
|||
|
"Radoulov, Dimitre" wrote... > > "Jeff Gardner" wrote... >> Jeff Gardner wrote: >>> Radoulov, Dimitre wrote: >>>> "Jeff Gardner" wrote... >>>>> Greetings: >>>>> >>>>> I am trying to locate incorrect data in a column using REGEXP. Only >>>>> numeric data is allowed but I am importing data, some of which is not >>>>> properly formatted. I am using: >>>>> >>>>> SELECT * >>>>> FROM `table` >>>>> WHERE `column` REGEXP "[^[:digit:]]"; >>>>> >>>>> to locate any value that is not a number. I've tried several >>>>> variations and I get an empty set back. I know that there are illegal >>>>> characters in some of the data but I can't seem to locate it using >>>>> this query. The field can contain anywhere from 1 to 6 digits, if >>>>> that is relevant. Advice is much appreciated. > [...] >> It turns out that there were 13 rows with a NULL value in the field in >> question. NULL is not whitespace, nor is it digits. Any explanation as >> to why the above query returned no rows? > > Because of its special meaning, I suppose. > > "Conceptually, NULL means "a missing unknown value" and it is treated > somewhat differently from other values. " > Unknown means that you cannot _know_ whether it's a digit or something > else (btw, whitespaces are handled correctly with your example query). I ment "as you very likely expect", not "correctly" :) Regards Dimitre |