The Daily WTF: Curious Perversions in Information Technology

It works...

Last post 05-22-2007 12:16 AM by Random832. 34 replies.
  • 04-25-2007 4:51 PM

    It works...

    Found in previous employee's code.  Not a huge deal, but, I think it's kinda funny:

     

    public static bool IsSixDigits(string strDigits)
    {
       Regex objSixDigitPattern = new Regex("[0-9][0-9][0-9][0-9][0-9][0-9]");
       return objSixDigitPattern.IsMatch(strDigits);
    }

  • 04-25-2007 5:36 PM In reply to

    Re: It works...

    Not too bad. It's a little overkill to instantiate a Regex object just for that. I would've stuck with a numeric and length check.

    Join us at #TDWTF on irc.slashnet.org !

  • 04-25-2007 5:54 PM In reply to

    • skippy
    • Top 150 Contributor
    • Joined on 03-10-2006
    • Calgary, AB
    • Posts 174

    Re: It works...

    Is that for exactly six digits or at least 6?  
  • 04-25-2007 6:07 PM In reply to

    Re: It works...

    The real WTF is that he didn't do "[0-9]{6}".

    The real real WTF is that he forgot ^ and $. 
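
Both points can be checked quickly. The sketch below uses Python's `re` module rather than the .NET `Regex` class from the original post; the pattern syntax involved is the same, and the two function names are made up for this example. `search` looks for the pattern anywhere in the string (as .NET's `IsMatch` does), while `fullmatch` requires the whole string to match.

```python
import re

# The pattern from the original post, and the shorter equivalent with a quantifier.
ORIGINAL = re.compile(r"[0-9][0-9][0-9][0-9][0-9][0-9]")
SIX_DIGITS = re.compile(r"[0-9]{6}")


def contains_six_digits(text: str) -> bool:
    """True if six consecutive digits appear anywhere in text (unanchored)."""
    return ORIGINAL.search(text) is not None


def is_six_digits(text: str) -> bool:
    """True only if the entire string is exactly six ASCII digits."""
    return SIX_DIGITS.fullmatch(text) is not None


for sample in ["123456", "1234567", "abc123456xyz", "12345"]:
    print(repr(sample), contains_six_digits(sample), is_six_digits(sample))
```

So the unanchored version answers "at least six digits somewhere", which is probably not what a function called IsSixDigits was meant to do.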

  • 04-25-2007 8:10 PM In reply to

    Re: It works...

    viraptor:

    The real real WTF is that he forgot ^ and $. 



    Oh yeah.. that, too.


  • 04-25-2007 8:30 PM In reply to

    • tster
    • Top 10 Contributor
    • Joined on 04-11-2006
    • Worcester, MA
    • Posts 1,106

    Re: It works...

    AbbydonKrafts:
    Not too bad. It's a little overkill to instantiate a Regex object just for that. I would've stuck with a numeric and length check.

    probably because you have not mastered regexes.  Otherwise you would have realized that "^\d{6}$" is much easier to write and maintain than however you would have done it. 

    The pig go. Go is to the fountain. The pig put foot. Grunt. Foot in what? ketchup. The dove fly. Fly is in sky. The dove drop something. The something on the pig. The pig disgusting... see bio for the earth shattering ending.
  • 04-25-2007 8:40 PM In reply to

    Re: It works...

    tster:

    AbbydonKrafts:
    Not too bad. It's a little overkill to instantiate a Regex object just for that. I would've stuck with a numeric and length check.

    probably because you have not mastered regexes.  Otherwise you would have realized that "^\d{6}$" is much easier to write and maintain than however you would have done it. 

    Yes, but, since the one minor strike against me on my quiz-for-employment was with a regular expression problem not entirely dissimilar to this one, I must point out the error: that particular regular expression you have would match "123456\n" - which is six digits and a newline, not just six digits. Apparently it's generally-somewhat-better to use \A and \Z instead of ^ and $ in validation cases like this. Pesky newlines.
    :(){ :|:& };:
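
The trailing-newline behaviour described above can be reproduced in Python, whose `$` also matches just before a newline at the very end of the string. This is a sketch in Python's flavour, not Perl's, and the anchors differ slightly between the two: in Python `\Z` means "very end of string", whereas in Perl and .NET `\Z` still tolerates a final newline and the strict anchor is the lowercase `\z`.

```python
import re

DOLLAR = re.compile(r"^\d{6}$")
STRICT = re.compile(r"\A\d{6}\Z")  # in Python, \Z matches only at the very end

with_newline = "123456\n"

# $ accepts the position just before the final newline, so this validates.
print(bool(DOLLAR.match(with_newline)))

# \Z does not, so the trailing newline is rejected.
print(bool(STRICT.match(with_newline)))

# The plain six-digit string passes the strict check.
print(bool(STRICT.match("123456")))
```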
  • 04-25-2007 9:06 PM In reply to

    Re: It works...

    Logarithms are your friends...
  • 04-26-2007 4:12 AM In reply to

    Re: It works...

    fennec:
    the one minor strike against me on my quiz-for-employment was with a regular expression problem not entirely dissimilar to this one, I must point out the error: that particular regular expression you have would match "123456\n" - which is six digits and a newline, not just six digits. Apparently it's generally-somewhat-better to use \A and \Z instead of ^ and $ in validation cases like this. Pesky newlines.

    Whether your regex engine does that depends on the flavour. :)

    In JS, /^\d{6}$/ will test for precisely six numeric characters, and not the newline. $ will test for the end of the string. It will return false on "123456\n", as the end of the string is not taken by a digit.

    When multiline mode is set, all newlines function as end-of-string, and the $ will also match \n. However, $ matches a position, and not a character, and the string selected by the reg will not include a \n.

    And it will suck balls, by the way, if $ suddenly returns a whole character. It voids the use of $.

    — Flurp.
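
The multiline point, and the point that `$` matches a position rather than a character, can be illustrated in Python. Note the assumption: this is Python's `re`, whose default `$` is more lenient about a final newline than the JS behaviour described in the post, so only the multiline and zero-width parts carry over directly.

```python
import re

text = "123456\nabcdef\n654321"

# Default mode: ^ and $ refer to the whole string, so nothing matches here.
default_hits = re.findall(r"^\d{6}$", text)

# Multiline mode: ^ and $ also match at the start and end of each line.
multiline_hits = re.findall(r"^\d{6}$", text, flags=re.MULTILINE)

# $ is zero-width: the match stops before the newline and does not include it.
first = re.search(r"^\d{6}$", text, flags=re.MULTILINE)

print(default_hits)
print(multiline_hits)
print(repr(first.group()), first.end(), repr(text[first.end()]))
```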
  • 04-26-2007 9:57 AM In reply to

    Re: It works...

    dhromed:
    Whether your regex engine does that depends on the flavour. :)
    Perl.

    In JS, /^\d{6}$/ will test for precisely six numeric characters, and not the newline. $ will test for the end of the string. It will return false on "123456\n", as the end of the string is not taken by a digit.

    When multiline mode is set, all newlines function as end-of-string, and the $ will also match \n. However, $ matches a position, and not a character, and the string selected by the reg will not include a \n.

    And it will suck balls, by the way, if $ suddenly returns a whole character. It voids the use of $.

    True. But you're in an isSixDigits function, not an extractSixDigits function, so even if it doesn't return the character, the string being checked will still pass if it matches. This looks like Java. What would Java do with this, I wonder...? maybe I'll test it some day.
  • 04-26-2007 10:10 AM In reply to

    • XIU
    • Top 200 Contributor
    • Joined on 01-08-2007
    • Posts 126

    Re: It works...

    Looks dotnet to me, Java doesn't use Pascal casing.
  • 04-26-2007 10:22 AM In reply to

    • zip
    • Top 500 Contributor
    • Joined on 08-26-2005
    • Posts 89

    Re: It works...

    dhromed:

    fennec:
    the one minor strike against me on my quiz-for-employment was with a regular expression problem not entirely dissimilar to this one, I must point out the error: that particular regular expression you have would match "123456\n" - which is six digits and a newline, not just six digits. Apparently it's generally-somewhat-better to use \A and \Z instead of ^ and $ in validation cases like this. Pesky newlines.

    If your regex engine does that, it depends on the flavour. :)

    In JS, /^\d{6}$/ will test for precisely six numeric characters, and not the newline. $ will test for the end of the string. It will return false on "123456\n", as the end of the string is not taken by a digit.

    When multiline mode is set, all newlines function as end-of-string, and the $ will also match \n. However, $ matches a position, and not a character, and the string selected by the reg will not include a \n.

    And it will suck balls, by the way, if $ suddenly returns a whole character. It voids the use of $.

     

    The fact that there's this much discussion about as simple a regex as "is it 6 numeric characters?"  is why regexes are more trouble than they're worth at least half the time.

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)
    
    foreach(char in string)
    {
     match &= char >= '0' && char <= '9'
    }
    return match
    

     There's no gotchas about regex engines and I can get on with my life.
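
For reference, a direct Python rendering of the pseudocode above might look like the following. The function name is invented for this example. It compares each character against the characters '0' and '9' rather than calling `str.isdigit`, because `isdigit` also accepts non-ASCII digits, which is its own kind of gotcha.

```python
def is_six_digits_loop(text: str) -> bool:
    """Literal translation of the pseudocode: length check, then every character."""
    match = True
    match &= len(text) == 6
    for ch in text:
        # Compare against the characters '0' and '9', not the numbers 0 and 9.
        match &= "0" <= ch <= "9"
    return match


for sample in ["123456", "12345a", "1234567", "123456\n"]:
    print(repr(sample), is_six_digits_loop(sample))
```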


     

  • 04-26-2007 12:09 PM In reply to

    Re: It works...

    tster:

    probably because you have not mastered regexes.  Otherwise you would have realized that "^\d{6}$" is much easier to write and maintain than however you would have done it. 



    You have no idea how many RegExes are in this Frankenstein JavaScript/HTML/WinForm/COM thing that I have to maintain. I've had to write plenty due to the way the "SDK" works.

    However, for simple validation in WinForm apps, I would write out the code for it. It's easier to read and maintain. Also, it's guaranteed to work. I've already found that every single RegEx implementation is different from the next. So, "mastering" RegEx is almost like mastering the English language including its many dialects-within-dialects (ex: American English/New York Bronx). It's maddening when I have to tinker with a RegEx that works fine in one language just to make it work in another. Yet, it's pretty much a straight translation when regular code is involved.


  • 04-26-2007 2:43 PM In reply to

    Re: It works...

    zip:
    The fact that there's this much discussion about as simple a regex as "is it 6 numeric characters?"  is why regexes are more trouble than they're worth at least half the time.

    As the saying goes, 

    "Sometimes you have this problem, and you think Regex may help you solve it.

    Now you have two problems." 

  • 04-26-2007 4:13 PM In reply to

    Re: It works...

    Correct, this is C#
  • 04-26-2007 4:25 PM In reply to

    Re: It works...

    zip:

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

     There's no gotchas about regex engines and I can get on with my life.

    Well, ok, but I would leave out the  
    match = true
    and just write
    match = (string length is 6)

    I wouldn't bother iterating the chars either, if the length wasn't 6.  It's not a WTF, but there's no need to be gratuitously inefficient - how do you suppose Windoze ended up getting so bloated?
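
The short-circuiting variant suggested here is a one-liner in Python. This is only a sketch with a made-up function name: `and` skips the character scan when the length is wrong, and `all` stops at the first non-digit.

```python
def is_six_digits_short_circuit(text: str) -> bool:
    """Length check first; characters are only examined when the length is 6."""
    return len(text) == 6 and all("0" <= ch <= "9" for ch in text)


# A very long input is rejected by the length check alone.
print(is_six_digits_short_circuit("1" * 1_000_000))
print(is_six_digits_short_circuit("123456"))
```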


     
  • 04-26-2007 5:33 PM In reply to

    Re: It works...

    zip:

    The fact that there's this much discussion about as simple a regex as "is it 6 numeric characters?"  is why regexes are more trouble than they're worth at least half the time.

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

    The primary reason for using a regexp is that the regexp engine is probably much smarter than you are about executing them efficiently. For example, the code you quote is considerably slower (O(N)) than what any reasonable regexp engine would have generated for the correct regexp (O(1)).

    You can implement anything in a Turing-complete language, but that doesn't mean you necessarily should. More restrictive languages offer better opportunities for an optimising compiler. Regexps are a special case: for anything that can be implemented as a regexp, there is a known algorithm for generating the fastest code possible. (Turing-complete languages lie at the opposite end of the scale: there is a proof that no such algorithm can exist for a program written in such a language)

  • 04-26-2007 10:29 PM In reply to

    • zip
    • Top 500 Contributor
    • Joined on 08-26-2005
    • Posts 89

    Re: It works...

    DaveK:
    zip:

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

     There's no gotchas about regex engines and I can get on with my life.

    Well, ok, but I would leave out the  
    match = true
    and just write
    match = (string length is 6)

    I wouldn't bother iterating the chars either, if the length wasn't 6.  It's not a WTF, but there's no need to be gratuitously inefficient - how do you suppose Windoze ended up getting so bloated?

    I hope some day I can be as 1337 as you at critiquing pseudocode.



     

  • 04-26-2007 10:36 PM In reply to

    • zip
    • Top 500 Contributor
    • Joined on 08-26-2005
    • Posts 89

    Re: It works...

    asuffield:
    zip:

    The fact that there's this much discussion about as simple a regex as "is it 6 numeric characters?"  is why regexes are more trouble than they're worth at least half the time.

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

    The primary reason for using a regexp is that the regexp engine is probably much smarter than you are about executing them efficiently. For example, the code you quote is considerably slower (O(N)) than what any reasonable regexp engine would have generated for the correct regexp (O(1)).

     Are you joking?  You really think running time of a regex doesn't scale linearly as input size increases?

  • 04-26-2007 11:35 PM In reply to

    Re: It works...

    zip:
    asuffield:
    zip:

    The fact that there's this much discussion about as simple a regex as "is it 6 numeric characters?"  is why regexes are more trouble than they're worth at least half the time.

    If I can write code that does what the regex does in less than 60 seconds, there's no reason to use one.  If I write code like this:

    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

    The primary reason for using a regexp is that the regexp engine is probably much smarter than you are about executing them efficiently. For example, the code you quote is considerably slower (O(N)) than what any reasonable regexp engine would have generated for the correct regexp (O(1)).

     Are you joking?  You really think running time of a regex doesn't scale linearly as input size increases?

    For one that anchors to the start and end with ^$ or \A\Z and makes no use of the Kleene closure? I should hope it doesn't. Otherwise, whoever wrote the regular expression engine clearly hasn't optimized it nearly enough. Would you like me to run some benchmarks?
  • 04-27-2007 4:12 AM In reply to

    Re: It works...

    zip:
    asuffield:
    zip:
    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

    The primary reason for using a regexp is that the regexp engine is probably much smarter than you are about executing them efficiently. For example, the code you quote is considerably slower (O(N)) than what any reasonable regexp engine would have generated for the correct regexp (O(1)).

     Are you joking?  You really think running time of a regex doesn't scale linearly as input size increases?

    This particular regexp can be resolved by inspecting a maximum of six bytes of memory, regardless of the input size. The standard regexp compiler algorithm will find this solution.
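
To make the "bounded number of characters" claim concrete, here is a hand-written stand-in that counts how many characters it looks at. It is not what any particular regex engine does, and the function name is hypothetical. It is anchored on the left only, so it accepts any input that starts with six digits; a check anchored at both ends would also have to look at what follows the sixth digit.

```python
def match_six_digit_prefix(data: str) -> tuple[bool, int]:
    """Return (matched, characters_inspected) for a left-anchored six-digit check."""
    inspected = 0
    for index in range(6):
        if index >= len(data):
            # Ran out of input before seeing six digits.
            return False, inspected
        inspected += 1
        if not "0" <= data[index] <= "9":
            return False, inspected
    return True, inspected


# The count never exceeds six, however long the input is.
print(match_six_digit_prefix("123456" + "x" * 1000))
print(match_six_digit_prefix("12a456"))
```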

  • 04-27-2007 5:24 AM In reply to

    Re: It works...

    asuffield:

    This particular regexp can be resolved by inspecting a maximum of six bytes of memory, regardless of the input size. The standard regexp compiler algorithm will find this solution.

    I have two problems with this comment.

    1) It would take a minimum of 7 bytes to be checked - one for each digit and at least one to check that the length is 6 (which could be by reading the following byte and checking for a terminating zero, or reading a length value)

    2) Are you really claiming that the C# compiler will optimize a Regex object to inline code by examining the contents of the expression - I very very much doubt that!

    Linux is not a code base. Or a distro. Or a kernel. It's an attitude. And it's not about Open Source. It's about a bunch of people who still think vi is a good config UI.

    Notice: Phorm, and its agents including ISPs collecting data on Phorm's behalf, are specifically forbidden from performing any processing or monitoring of the content of the above post. Hence, under the Regulation of Investigatory Powers Act 2000 any such attempt to profile this page by Phorm or its agents is illegal.
  • 04-27-2007 10:07 AM In reply to

    • zip
    • Top 500 Contributor
    • Joined on 08-26-2005
    • Posts 89

    Re: It works...

    asuffield:
    zip:
    asuffield:
    zip:
    match = true
    match &= (string length is 6)

    foreach(char in string)
    {
    match &= char >= '0' && char <= '9'
    }
    return match

    The primary reason for using a regexp is that the regexp engine is probably much smarter than you are about executing them efficiently. For example, the code you quote is considerably slower (O(N)) than what any reasonable regexp engine would have generated for the correct regexp (O(1)).

     Are you joking?  You really think running time of a regex doesn't scale linearly as input size increases?

    This particular regexp can be resolved by inspecting a maximum of six bytes of memory, regardless of the input size. The standard regexp compiler algorithm will find this solution.

     

    Oh, ok, so if you know the right input contains exactly 6 chars then you only need to look at 6 bytes of memory.  So change my pseudocode to only run the foreach statement if string length is 6, or to break on a 7th character and they're both O(1).

     The regex still isn't magically faster.  I thought you were saying that the regex runs in O(1) when N is the number of input chars that need to be examined to determine validation, not just the total number of input chars.
     

  • 04-27-2007 10:17 AM In reply to

    Re: It works...

    GettinSadda:
    asuffield:

    This particular regexp can be resolved by inspecting a maximum of six bytes of memory, regardless of the input size. The standard regexp compiler algorithm will find this solution.

    I have two problems with this comment.

    1) It would take a minimum of 7 bytes to be checked - one for each digit and at least one to check that the length is 6 (which could be by reading the following byte and checking for a terminating zero, or reading a length value)

    Bah. Depends how you're thinking of the problem. For regexps anchored on both ends, yes, you need to check seven bytes. I was assuming left-anchor only, in the style of a stream parser (since that's what regexps are most commonly used for).

    2) Are you really claiming that the C# compiler will optimize a Regex object to inline code by examining the contents of the expression - I very very much doubt that!

    I have no idea what the C# compiler does. Good regexp engines more or less do this. They don't actually generate inline code, because the standard regexp algorithm uses exactly the same code to evaluate every regexp - what they do is generate a table of data which that code then uses. The code itself will be in a library function, which the compiler may or may not inline, and is approximately this:

    i = 0;
    state = start;
    while (state) {
      c = data[i]; // where data is the string to examine
      edge = state[c];
      state = edge.state;
      i += edge.shift;
    }
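
A runnable Python sketch of the same table-driven idea, for the specific case of "exactly six ASCII digits", might look like this. Everything here is an assumption made for illustration: the table is built by hand rather than by a regexp compiler, the names are invented, and the per-edge shift from the loop above is dropped, so the sketch always advances by one character.

```python
# Hypothetical transition table for "exactly six ASCII digits".
# States 0-5 count the digits seen so far; state 6 means six digits were read.
ACCEPT = 6
REJECT = -1


def build_table() -> list[dict[str, int]]:
    """Stand-in for the regexp compiler: one dict of edges per state."""
    table = [{digit: state + 1 for digit in "0123456789"} for state in range(6)]
    table.append({})  # state 6 has no outgoing edges: any further character rejects
    return table


TABLE = build_table()


def run(data: str) -> bool:
    """The generic loop: the same code runs whatever table it is given."""
    state = 0
    for ch in data:
        state = TABLE[state].get(ch, REJECT)
        if state == REJECT:
            return False
    return state == ACCEPT


for sample in ["123456", "12345", "1234567", "12345a"]:
    print(repr(sample), run(sample))
```

The loop itself knows nothing about digits or the number six; all of that lives in the table.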
    
    (All the real work is done by the regexp compiler, which generates a 'program' of states and e