Charles Davis cdavis@mymail.mines.edu wrote:
Hi,
There may be a problem with the way the authors.c file is generated on a Mac with GNU sed installed.
On Mac OS X, the C locale's default encoding is MacRoman, not UTF-8. This has some pretty surprising consequences. For example, since the AUTHORS file contains UTF-8 multibyte sequences that aren't valid in the MacRoman encoding, GNU sed doesn't match those sequences in the .* regexp, and the authors.c file comes out wrong. Mac OS sed does not have this problem. So now I'm stuck changing the Makefile to use the system sed (in /usr/bin) instead of GNU sed.
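To make the failure mode concrete, here is a rough sketch in Python rather than sed. It is an analogy only, not GNU sed's actual code: the "strict" pattern stands in for a matcher that accepts only characters valid in the current encoding (ASCII is assumed here for simplicity), and the "bytewise" pattern stands in for a matcher that treats the input as raw bytes. The author name and the C-string wrapping are hypothetical, not taken from the real AUTHORS file or Makefile.

```python
import re

# Hypothetical AUTHORS line; the name is an illustration, not from the real file.
line = "José Example".encode("utf-8")

# Stand-in for a matcher that only accepts characters valid in the current
# encoding (ASCII here, as an assumption; this is not GNU sed's implementation).
strict = re.compile(rb'^([\x00-\x7f]*)$')

# Byte-oriented matcher: '.' accepts any byte except a newline.
bytewise = re.compile(rb'^(.*)$')

def quote(pattern, raw):
    """Wrap the line in a C string literal if the pattern matches;
    otherwise return the line unchanged."""
    return pattern.sub(rb'    "\1",', raw, count=1)

strict_out = quote(strict, line)      # no match: line passes through untouched
bytewise_out = quote(bytewise, line)  # match: line is wrapped in quotes
```

With the strict pattern the line comes through unmodified, which is the kind of damage that ends up in authors.c; with the bytewise pattern the whole line is wrapped as intended.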
I ran into this a long time ago. Fink just modifies the authors.c file to remove those sequences.
I went one better and removed GNU sed from my Mac (fink purge sed), since I found that the Mac OS X version of sed does everything GNU sed did, and I no longer had to edit the file.
I could uninstall GNU sed, but there's one small problem. I have a Gentoo prefix set up. It is the reason I have GNU sed installed. If I install or upgrade practically anything in my Gentoo prefix, then GNU sed will just get pulled right back in.
Reading the manual for GNU sed tells me that this is by design and that this behavior--not matching characters that are invalid in the current locale--is in fact mandated by POSIX. If that's the case, then the LC_ALL= statement in the Makefile needs to change. To what, I don't know. I'm hoping one of you has an idea.
If you think, however, that this is a bug in GNU sed, then I will gladly write a report to the maintainer about it.
No, it is not a bug in GNU sed. Perhaps the authors.c file needs to have the characters that are invalid in the encoding used by Mac OS X changed to something acceptable?
James McKenzie