On 30 Nov 2002, Alexandre Julliard wrote:
Francois Gouget fgouget@free.fr writes:
I don't like pieces of code that go:
strcpy(foo, bar1);
strcat(foo, bar2);
strcat(foo, bar3);
strcat(foo, bar4);
strcat(foo, bar5);
strcat(foo, bar6);
It's really inefficient: the cost increases quadratically with the size of the resulting string.
Well, no, the cost is linear. It would only be quadratic if the number of strcat calls depended on the length of the string.
True.
It's more efficient to do:
sprintf(foo, "%s%s%s%s%s%s", bar1,bar2,bar3,bar4,bar5,bar6);
I seriously doubt that sprintf would be faster than a couple of strcats. And I don't think we need to worry about this kind of micro-optimization right now...
Which is why I'm not going through the source looking for them. It's just that sometimes I find one and cringe when I think about the inefficiency of reading the first string 3 or more times, with even more to re-read on each successive strcat.
This seems like a popular subject and everyone goes with their benchmark. So I had to do one too<g>.
I took what I think is a quite common case where one concatenates three strings, typically a path with a filename:
sprintf(buf,"%s/%s",path,filename);
The goal of my benchmark (attached) is to determine at what path length an sprintf becomes as efficient as an strcpy+2*strcat combination. I implemented the above operation using:
 * sprintf
 * strcpy+strcat
 * my own naive strcpy+strcat, just to see how much difference it makes compared to the optimized glibc implementation
 * a cpycat implementation, which is how strcpy and strcat should have been implemented in the first place. cpycat returns a pointer to the trailing '\0', which means you can chain calls to cpycat.
The tests were run on Debian (so assume the glibc may not be heavily optimized), on an Athlon processor, and compiled with a simple '-O2 -fPIC', i.e. with the same options as Wine. The cutoff point turns out to be at a path length of about 220 characters:
$ ./sprintf 100
len=100
sprintf:  elapsed=45791
strcat:   elapsed=27811
mystrcat: elapsed=27830
cpycat:   elapsed=13205

$ ./sprintf 220
len=220
sprintf:  elapsed=54540
strcat:   elapsed=55507
mystrcat: elapsed=53610
cpycat:   elapsed=21876

$ ./sprintf 300
len=300
sprintf:  elapsed=63352
strcat:   elapsed=76068
mystrcat: elapsed=73931
cpycat:   elapsed=30482
So yes, most of the time sprintf is probably not more efficient than strcpy+strcat. Just for fun I added a 5-string case (adding '.' + extension) and the cutoff goes down to 120 characters, which is still quite long. That being said, I still prefer the sprintf approach because it has a much better worst-case behavior.
Other observations:
 * my naive strcpy/strcat implementation seems more efficient than the one in the glibc! That's pretty weird.
 * cpycat is much more efficient in this type of scenario. That's not very surprising, of course. Why does the C library have such braindead functions as strcpy and strcat?
Oh, and I'll side with Alexandre to say that in most cases, a better alternative to using snprintf/strncpy/strncat is to carefully measure the length of your arguments, compute the exact length of the concatenated string (or at least an upper bound, e.g. in the case of widechar to multi-byte conversions), and allocate a buffer of the required size. Do anything else and you will end up with truncated data in your buffer, which means your function probably did not perform up to spec.