Thu, 01 Mar 2007
Apache piped ErrorLog broken
I came across this frustrating little Apache bug today: if you configure your error log to be a pipe to an external command, the external process is not killed when doing a graceful restart of the server, so you end up with redundant processes building up (and holding old logfiles open), one per restart.
How it works is: let's say you have what appears to be a popular configuration these days, with:
ErrorLog "|/usr/sbin/cronolog /var/log/apache/error.%Y%m%d.log"
When you do a graceful restart, apache starts the new cronolog first, attaches it to its stderr (closing its own handle to the old cronolog), then kills the old child process. But the new cronolog has already inherited apache's original stderr, which points to the old cronolog. And the process apache kills is only its immediate child, which is a sh (because apache passes the ErrorLog command line to be interpreted by the shell), not the cronolog that the sh started. So the old cronolog is not killed, and although apache closes its end of the pipe that the old cronolog is reading from, the new cronolog process is still holding that pipe open, so the old one never sees end-of-file.
Once you have done a few graceful restarts, you will have a whole chain of cronologs (or of whatever external logging program you use). The first cronolog reads from the second cronolog's stderr, which reads from the third cronolog's stderr, which reads from … which reads from apache's stderr. So if you kill the penultimate cronolog, the whole stack of stale processes unwinds (each cronolog exits when the last process writing to its stdin exits).
Given that cronolog has been popular for more than 5 years, and apache itself ships with a similar rotatelogs program, I'm surprised this bug has lasted so long. I have tested that it occurs with Apache 1.3 on both FreeBSD and Linux. It appears to have been fixed only in October, so I guess only Apache 2.2 and up includes the fix?
But I seem to have a workaround:
ErrorLog "|exec /usr/sbin/cronolog /var/log/apache/error.%Y%m%d.log"
… a favourite trick of mine for avoiding all those sh processes kicking around, which here also happens to make everything work: there is now only one direct child process, so apache kills the cronolog itself when it has completed its restart. Disclaimer: I have only tried this in my test setup so far.
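As a rough illustration of why the exec makes a difference, here is a toy model in Ruby. It starts no real processes and is not Apache's code; it only encodes the behaviour described above (Apache signals just its immediate child), and logger_chain and stale_loggers_after are hypothetical names.

```ruby
# Toy model only: no real processes are started.
# A logger is represented as the chain of processes Apache ends up with,
# immediate child first.
def logger_chain(use_exec)
  # Without exec the shell stays around as cronolog's parent;
  # with exec the shell replaces itself with cronolog.
  use_exec ? ["cronolog"] : ["sh", "cronolog"]
end

# Count the cronologs left running after a number of graceful restarts.
def stale_loggers_after(restarts, use_exec:)
  stale = 0
  current = logger_chain(use_exec)
  restarts.times do
    replacement = logger_chain(use_exec)
    # Apache kills only the first process in the chain. Anything below
    # it survives, kept alive by the pipe the replacement inherited.
    survivors = current.drop(1)
    stale += survivors.count("cronolog")
    current = replacement
  end
  stale
end

puts stale_loggers_after(3, use_exec: false)  # prints 3
puts stale_loggers_after(3, use_exec: true)   # prints 0
```

The model gives one leftover cronolog per restart without exec, and none with it, which matches what the test setup showed.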
There is just one hit on Google for "exec cronolog", and it seems related (but they haven't realised there is a problem with restarts). It seems to me that all the sites recommending cronolog need to update to make this their suggested configuration.
[21:04] | [/computers/code] | #
Sun, 06 Aug 2006
Finding Unused Externs in C
gcc -Wunused-function -Wmissing-declarations does a good job of identifying functions that are not declared in any header file; once you mark those static, it then tells you whether they are unused. However, with programs that have been developed over many years, and have a lot of cruft in the source code (such as Doom/PrBoom), there may be many functions that are prototyped in header files but are not actually used from other files; gcc cannot tell you about these.
I had an idea for locating these unnecessary externs: just run the link stage of gcc with one .o file taken out, and record all the 'undefined reference to' errors you get. That gives you the list of used externs; so any others are unused. So I wrote a couple of scripts to assist with this. First, find-used-externs:
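#!/usr/bin/ruby -w
# Run the given link command and print every symbol that the linker
# reports as an undefined reference.
def filter_gcc(cmdline)
  IO.popen(cmdline.join(" ") + " 2>&1") do |p|
    p.each_line do |l|
      print $1, "\n" if l =~ /undefined reference to \`(.*)'/
    end
  end
end

# Re-run the link once per .o file, leaving that one file out each time.
ARGV.each do |a|
  next unless a =~ /\.o$/
  cmdline = ARGV.reject { |b| b == a }
  filter_gcc(cmdline)
end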
This tells you all the symbols in .o files that are referenced by other files. To use it, cut and paste the normal link step for the project, and prefix the command with ./find-used-externs. It runs the link command once for each .o file, omitting only that .o file from the command line, and parses the errors.
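To see what the script does without actually running a linker, the two steps can be pulled out into small functions. This is only a sketch: link_variants and undefined_symbols are hypothetical names, and the sample command and error text are made up in the style of GNU ld output.

```ruby
# Build one command line per .o file, each omitting just that file.
def link_variants(argv)
  objects = argv.select { |a| a.end_with?(".o") }
  objects.map { |o| argv.reject { |b| b == o } }
end

# Pull the symbol names out of linker error text.
def undefined_symbols(linker_output)
  linker_output.scan(/undefined reference to [`'](.*?)'/).flatten.uniq
end

# Made-up example input.
link_variants(%w[gcc -o prog a.o b.o -lm]).each { |cmd| puts cmd.join(" ") }

sample = "main.o: in function `main':\n" \
         "main.c:(.text+0x10): undefined reference to `helper'\n"
puts undefined_symbols(sample)
```

Printing the variants first is a cheap way to check that the link command was pasted correctly before spending time on the real links.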
Secondly, you have to find all the externs provided by each .h file, and see if any are not on the used list. list-externs is a very crude script to try and read extern variables and function prototypes from header files:
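#!/usr/bin/ruby -w
ARGF.each_line do |l|
  l.sub!(/\/\/.*/, '')   # strip // comments
  l.sub!(/\/\*.*/, '')   # strip from /* to the end of the line (crude)
  # extern variable declarations: last identifier before the ';'
  print $1, "\n" if l =~ /^\s*extern.*\s([a-zA-Z_][a-zA-Z_0-9]*)\s*;/
  # function prototypes: identifier before '(' on a non-preprocessor line
  print $1, "\n" if l =~ /^\s*[^#].*\s([a-zA-Z_][a-zA-Z_0-9]*)\s*\(/
end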
So sort the output of that, and use comm(1) to find the entries that are not in the (sorted) output of find-used-externs. That tells you which header file declarations are not currently needed.
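If you would rather stay in Ruby than use comm(1), the same comparison is a set difference. A minimal sketch, where unused_externs is a hypothetical helper and the symbol names are placeholders rather than anything from a real project:

```ruby
# Declared symbols that never show up in the used list: the equivalent
# of `comm -23` on the two sorted, de-duplicated lists.
def unused_externs(declared, used)
  (declared.uniq - used).sort
end

declared = %w[draw_thing spawn_thing old_hack]   # placeholder names
used     = %w[spawn_thing draw_thing]
puts unused_externs(declared, used)   # prints old_hack
```

In practice the two arrays would come from the output of list-externs and find-used-externs, for example via File.readlines with chomp: true.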
[13:38] | [/computers/code] | #
Sun, 09 Apr 2006
Net::SelectServer v0.2
Updated version of my single-process, multiple-connections server module for perl: Net::SelectServer.
[14:04] | [/computers/code] | #
Wed, 08 Feb 2006
perl IO::Select server
There seems to be a lack of a good standard single-process, multiple-connections server module for perl. I couldn't find one yesterday, anyway (Net::Daemon can only handle multiple connections simultaneously if you have a threads-enabled perl). So I have written one: Net::SelectServer. It uses select to multiplex different clients, and it's protocol-neutral, so it should work with more than just TCP sockets.
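For anyone unfamiliar with the technique, here is the general shape of a select loop, sketched in Ruby rather than perl. This is not Net::SelectServer's API or code, just an illustration of single-process multiplexing; select_pass is a hypothetical name, and the echo behaviour stands in for a real protocol handler.

```ruby
require "socket"

# One pass of a select loop: accept on any ready listener, and echo back
# whatever any ready client has sent. Works on any IO objects that
# select can watch, not only TCP sockets.
def select_pass(listeners, clients, timeout = 0.1)
  ready, = IO.select(listeners + clients, nil, nil, timeout)
  return if ready.nil?                  # timed out, nothing to do

  ready.each do |io|
    if listeners.include?(io)
      clients << io.accept              # new connection joins the set
    else
      data = io.read_nonblock(4096, exception: false)
      if data.nil?                      # EOF: the peer has closed
        clients.delete(io)
        io.close
      elsif data != :wait_readable
        io.write(data)                  # may block on a slow client
      end
    end
  end
end

# Usage sketch (the address and port are arbitrary examples):
#   listener = TCPServer.new("127.0.0.1", 7000)
#   clients = []
#   loop { select_pass([listener], clients, 1) }
```

A real server would also queue outgoing data and select for writability instead of writing directly, so that one slow client cannot stall the rest.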
[09:03] | [/computers/code] | #
Fri, 13 Jan 2006
Errors compiling g-wrap based programs on Debian or Ubuntu
If you get errors because <g-wrap-wct.h> is missing, the cause is that this file was incorrectly dropped from the Debian package of g-wrap (and the mistake was copied into Ubuntu's universe repository). The file is in fact trivial, because it's only there as a compatibility bridge for old programs that have not been updated to use the newer header:
/* Provided for compatibility with G-Wrap 1.3.4 */
#include <g-wrap/guile-wct.h>
So you could just edit the offending file to use the new header. I didn't realise this until I had done the long-winded solution: adding /usr/include/g-wrap-wct.h to debian/guile-g-wrap.install and rebuilding the Debian package.
[16:54] | [/computers/code] | #
Thu, 22 Dec 2005
Lighttpd error logging
I switched moria.org.uk to use the lighttpd web server at the weekend. Partly so I would have more free memory on my VPS account, but mostly because 'engineers love to change things' :-).
One significant little feature was missing though — an external log splitter like cronolog was only supported for the access log. Personally, I find using something like this easier than log rotation, as it makes it easier to find the logs for a given day; and clearly if you want it for one log, it makes sense to use it for both. So here is a patch.
[23:24] | [/computers/code/lighttpd] | #
Sun, 27 Nov 2005
Cacti Graphs for Apache
I have been using RRDtool for a while now, for graphing the traffic on my home ADSL. It's a huge improvement on MRTG, but fairly hard work to use by itself, as you have to write your own data collection scripts. So, having come into contact with Cacti at work, I decided to switch to that. This is an example of a really great open-source network app, with a very slick and powerful web interface.
I was surprised, though, at there being no included script or graph template for monitoring Apache stats. So I have written one. This tarball contains the script, template to import into cacti, and instructions. Very much a first draft, though.
PS. Looks like I should have searched Freshmeat and not Google, as there is a package called ApacheStats for cacti. Ah well — I prefer the presentation of my graphs.
[22:46] | [/computers/code] | #
Wed, 20 Apr 2005
pngrewrite
Here is a handy little program: pngrewrite. I came across it in the FreeBSD ports collection. It's a PNG optimiser, which is particularly good for images with few colours, as it converts them to indexed with a minimal palette. All the Kye screenshots (the graphics for which date from 16-colour VGA days) are 40% smaller after going through pngrewrite.
It looks like it is freely usable with the source code available, but not open source; none of the open source alternatives seems to be half as good. Only with pngrewrite can I get the tiles for Kye to be smaller as PNG than as GIF.
[22:29] | [/computers/code] | #
Sun, 27 Mar 2005
Finally, gzip-encoding works
I've been trying to fix gzip content-encoding on moria.org.uk.
mod_negotiation was refusing to offer a .gz as an alternative to an uncompressed file.
After trying a lot of different arrangements, I spotted that even when it served a .gz, it did not return the underlying content's MIME type; it seemed to be ignoring the AddEncoding for .gz.
Finally, I stopped messing around and read the source code.
It seems that apache simply looks up each component of a filename against its list of extensions, and that it checks MIME types before it checks encodings.
Remove .gz from /etc/mime.types and everything starts working.
Always read the source!
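The effect is easy to reproduce with a toy model. This is not Apache's code, only a sketch of the lookup order as described above: each filename extension is checked against the MIME types first and against the encodings only if that fails. classify is a hypothetical name and the table contents are sample values.

```ruby
# Toy model: look up each extension, MIME types before encodings.
def classify(filename, mime_types, encodings)
  result = { type: nil, encoding: nil }
  filename.split(".").drop(1).each do |ext|
    if mime_types.key?(ext)
      result[:type] = mime_types[ext]         # a later match overrides
    elsif encodings.key?(ext)
      result[:encoding] = encodings[ext]
    end
  end
  result
end

encodings  = { "gz" => "gzip" }                 # sample AddEncoding mapping
with_gz    = { "html" => "text/html", "gz" => "application/x-gzip" }
without_gz = { "html" => "text/html" }

p classify("index.html.gz", with_gz, encodings)     # gz taken as a MIME type
p classify("index.html.gz", without_gz, encodings)  # gz taken as an encoding
```

With gz in the MIME table, the type of the underlying content is overwritten and the encoding is never set, which is exactly the symptom; without it, the file comes out as text/html with a gzip encoding.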
[09:41] | [/computers/code] | #
Thu, 24 Mar 2005
tar pit
I've been working with some people on the use of zsync for Gentoo mentioned previously, and we came across this oddity of tar. We were trying to work out why a given directory tree took 160MB as a .tar, but only 88MB when stored in a different format. It turns out that tar uses a block size of 512 bytes, so every file takes at least 512 bytes for the data, and 512 bytes for the metadata (filename etc). So half the tarball was empty space because... because the tar format was designed to go to 512-byte-block tapes. Given that most of tar's use is for distributing files online now, I don't want to know how much space is wasted just because the old format demanded it.
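To put rough numbers on the padding, here is a small sketch of the arithmetic. It assumes the usual layout of one 512-byte header per file with the data rounded up to a multiple of 512 bytes, and it ignores the end-of-archive blocks and any extra headers needed for long names. tar_entry_size and tar_overhead are hypothetical helper names, and the file sizes are made up.

```ruby
# Bytes one regular file occupies in a tar archive: one header block
# plus the data rounded up to whole blocks.
def tar_entry_size(data_bytes, block = 512)
  data_blocks = (data_bytes + block - 1) / block
  block + data_blocks * block
end

# Total bytes of header and padding for a list of file sizes.
def tar_overhead(sizes)
  sizes.sum { |s| tar_entry_size(s) } - sizes.sum
end

small_files = Array.new(100, 100)   # 100 files of 100 bytes each
puts small_files.sum                             # 10000 bytes of real data
puts small_files.sum { |s| tar_entry_size(s) }   # 102400 bytes in the tar
puts tar_overhead(small_files)                   # 92400 bytes of overhead
```

A tree full of small files is the worst case: here the archive is about ten times the size of the data it holds, before compression.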
cpio seems to be more efficient - it doesn't use anything like as much padding.
Somehow I don't think I will boost zsync's popularity if I advise people to use cpio :-).
zip also seems to have very compact metadata.
[20:41] | [/computers/code] | #