Committed revision 399.
cph@athlon zsync/c% svn log -r399 makegz.c
-----------------------------------------------------------
r399 | cph | 2005-03-25 09:36:36 +0000 (Fri, 25 Mar 2005)

Built in gzip compressor which optimises for zsync.
I have been thinking about the problem of compression with zsync. Strangely, the code for looking inside gzipped files (written as a workaround, to help kickstart zsync when there is little rsync-able content already available) is currently the most efficient way of transferring most files. I was sure that there had to be something more efficient than compressing stuff with gzip --best and then doing elaborate hacks in zlib to enable us to decompress mid-file.
I had been intending to look at Transfer-Encoding: gzip, using mod_deflate/mod_gzip, to see if this could get us compression without the nasty hacks. But I was sceptical that this could take off, because it puts some load back on the server, and these modules are far from ubiquitous. What we want is for the individual blocks to be stored compressed on the server, so that it does not have to do any compression when clients connect.
Now we have it. I have imported the deflate code from zlib into zsync, and written a small gzip program (actually built into zsyncmake) which optimises the compressed file for zsync. It starts a new deflate block in the output at the start of every zsync block (so every 1024, 2048 or whatever bytes). Initial tests are very promising: on my main test case, updating a 12MB Debian Packages file with 1 week of changes, the total transfer drops from 140KB to 107KB, taking us amazingly close to rsync -z's best result of 82KB (in fact, the difference between zsync and rsync is now almost precisely the size of the Z-Map, the map of the .gz file, at 23KB).
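To make the block-aligned idea concrete, here is a rough sketch in Python using the standard zlib module. It is not the code in makegz.c: the function names, the offsets list (a toy stand-in for the Z-Map) and the use of Z_FULL_FLUSH are all assumptions made for illustration. Z_FULL_FLUSH also throws away the compression history at each boundary, which costs some compression but lets a decompressor start cold at any recorded offset; the real compressor may well handle this differently.

```python
import zlib


def compress_blockwise(data, blocksize=2048, level=9):
    """Raw-deflate `data`, flushing after every `blocksize` input bytes.

    Returns (compressed, offsets). offsets[i] is the byte position in the
    compressed stream where input block i starts. Both names are
    hypothetical; offsets is only a toy analogue of zsync's Z-Map.
    """
    # wbits=-15 selects a raw deflate stream with no zlib/gzip header.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    out = bytearray()
    offsets = []
    for start in range(0, len(data), blocksize):
        offsets.append(len(out))
        out += comp.compress(data[start:start + blocksize])
        # Assumption: a full flush ends the current deflate block, aligns
        # the output to a byte boundary and resets the history window.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), offsets


def decompress_block(compressed, offset, blocksize):
    """Decompress one input block, starting mid-stream at `offset`."""
    d = zlib.decompressobj(-15)
    # max_length stops inflation once one block's worth of data is out.
    return d.decompress(compressed[offset:], blocksize)


if __name__ == "__main__":
    sample = b"Package: zsync\nVersion: 0.3\n" * 400
    packed, table = compress_blockwise(sample, blocksize=1024)
    # Fetch the third block without touching anything before it.
    third = decompress_block(packed, table[2], 1024)
    print(third == sample[2048:3072])
```

The point of the sketch is only that each input block maps to a self-contained range of compressed bytes, so a client could fetch and inflate exactly the blocks it is missing, with the server doing no compression work at request time.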
I have updated the technical paper with the theory and the results.