Replace all? Nah, too hard; looping 10 times should be enough.



  • So I'm converting some (read: a lot of) legacy C code to C#, and discovered a substitue function used throughout the product to copy a char* replacing %FIELD% with a new value. For some reason it only converted the first occurance and I decided that was stupid and just changed it to a string.Replace in C# (which replaces all occurrances).

    It appears at some point someone recognised this limitation. Did they fix the function? Nope. Create a new SubstituteAll function? Nope. They did this every time it didn't work properly.

        for(j=0; j < 10; j++)
        {
          strcpy_s(temp, dest);
          Substitute(temp, "%STARTDATE%", startDate, dest);
          strcpy_s(temp, dest);
          Substitute(temp, "%ENDDATE%", endDate, dest);
        }

    Will someone please shoot me?

     



  • Why did they copy the string four times instead of calling Substitute in both directions?



  • The substitute function doesn't make an internal copy of the string, so it would/may crap itself when trying to read from and write to the same string. There are probably places in the code where it genuinely wants to leave the original unchanged and as we've seen from this post, the developer was rather averse to creating methods to do what he needs when he can use one that already exists (even if that means writing more code every time he needs to use it).



  • @Mithious said:

    The substitute function doesn't make an internal copy of the string, so it would/may crap itself when trying to read from and write to the same string. There are probably places in the code where it genuinely wants to leave the original unchanged and as we've seen from this post, the developer was rather averse to creating methods to do what he needs when he can use one that already exists (even if that means writing more code every time he needs to use it).

    I mean like this:

        for(j=0; j < 10; j++)
        {
          Substitute(dest, "%STARTDATE%", startDate, temp);
          Substitute(temp, "%ENDDATE%", endDate, dest);
        } 
    


  • Ah I see. To be honest I think that would be even more of a WTF, you're now coupling two unrelated statements so that if someone comments one out, or adds a third line suddenly the program breaks. More efficient yes, but much less maintainable.



  •     for(j=0; j < 5; j++)
        {
          Substitute(dest, "%STARTDATE%", startDate, temp);
          Substitute(temp, "%STARTDATE%", startDate, dest);
          Substitute(dest, "%ENDDATE%", endDate, temp);
          Substitute(temp, "%ENDDATE%", endDate, dest);
        } 
    




  • Does Substitute not take a size parameter?



  •  @immibis said:

    Does Substitute not take a size parameter?

    Do you mean a buffer size for the destination to prevent buffer overruns? It does, although in this case it's being obtained automatically using a template function (in the same way that strcpy_s can).



  • This is fine as long as he adds:

    #define 10 CountChar(dest, "%")/2

    Also, did Substitute return whether it had done any replacement? If not, the author had no choice but to call it multiple times and hope things would be alright.

     


  • BINNED

    @Mithious said:

    Ah I see. To be honest I think that would be even more of a WTF, you're now coupling two unrelated statements so that if someone comments one out, or adds a third line suddenly the program breaks. More efficient yes, but much less maintainable.

    So, if I understand you correctly, you're saying that you're TRWTF?



  • @geocities said:

    Also, did Substitute return whether it had done any replacement? If not, the author had no choice but to call it multiple times and hope things would be alright.
     

    No it doesn't, so you're right, looping 10 times and hoping for the best was totally his only option :P

    @PedanticCurmudgeon said:

    So, if I understand you correctly, you're saying that you're TRWTF because...(snip)?

    I'm going to pray you're having a laugh there, the alternative is little bit frightening. :)

     


  • Considered Harmful

    Early in my career I wrote an environmental variable replacement method once that would chew up all the available memory and crash if there was any kind of circular reference. (It was a required feature that variables be able to contain other variables, at least 4-ply deep.) Stupid me, but since then I always limit the maximum number of iterations an algorithm can take; 10 seems reasonable here.

    (10 in my method would mean maximum nesting depth not maximum number of total replacements.)


  • BINNED

    @Mithious said:

    @PedanticCurmudgeon said:

    So, if I understand you correctly, you're saying that you're TRWTF because...(snip)?

    I'm going to pray you're having a laugh there, the alternative is little bit frightening. :)

     

    At least you're aware that there was a joke.



  • @joe.edwards said:

    since then I always limit the maximum number of iterations an algorithm can take; 10 seems reasonable here.

    I've done similar, in other situations I've kept a dictionary of replacements it's already encountered and thrown an exception when it finds one it's seen before.

    In this case though, simply updating the substitute function to do what it should have done in the first place was the sensible course of action, the string can then contain 1000 instances of it and it'll cope fine. It's not like your example where making the substitution will potentially result in more substitutions, it simply requires one run through the string replacing each instance it finds.

     


Log in to reply