At Devscovery this year, one of the libraries Wintellect distributed was a granular performance “timer” that John Robbins and Jeff Richter wrote. Like System.Diagnostics.Stopwatch it tracks time, but it also tracks CPU cycles and the number of garbage collections (GCs) in each generation, and it has an easy way to run a test for multiple iterations.
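To make the idea concrete, here is a rough Python analogue of that kind of timer. It is not the Wintellect library and does not reproduce its API: the names `Measurement` and `measure` are made up for illustration, and because Python has no portable cycle counter, process CPU time stands in for CPU cycles.

```python
import gc
import time
from dataclasses import dataclass


@dataclass
class Measurement:
    elapsed_ms: float   # wall-clock time across all iterations
    cpu_ms: float       # process CPU time (a stand-in for cycle counts)
    collections: tuple  # GC runs per generation, youngest first


def measure(func, iterations=1):
    """Run func() `iterations` times; report time, CPU time, and GC counts."""
    gc.collect()  # start from a clean heap so earlier garbage is not counted
    before = [s["collections"] for s in gc.get_stats()]
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(iterations):
        func()
    cpu_ms = (time.process_time() - cpu_start) * 1000
    elapsed_ms = (time.perf_counter() - wall_start) * 1000
    after = [s["collections"] for s in gc.get_stats()]
    # Difference of the per-generation counters = collections during the run.
    deltas = tuple(a - b for a, b in zip(after, before))
    return Measurement(elapsed_ms, cpu_ms, deltas)


if __name__ == "__main__":
    m = measure(lambda: [str(i) for i in range(100_000)], iterations=10)
    print(f"{m.elapsed_ms:,.0f}ms  cpu={m.cpu_ms:,.0f}ms  GCs={m.collections}")
```

The exact numbers depend on the machine and the Python version, so treat the output as relative rather than absolute.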
To try it out and do something useful at the same time, I benchmarked different compression algorithms and libraries. In the results below, ms is time in milliseconds, Kc is kilocycles, and G0, G1, and G2 are the number of garbage collections in each generation.
The test data was 95 MB of text files I downloaded from the SEC.
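For readers who want to run a comparable test without the .NET libraries, the sketch below does the same kind of comparison with the gzip, bz2, and lzma modules from the Python standard library. These are different implementations from the ones benchmarked here, the `benchmark` helper is hypothetical, and the sample input is synthetic text rather than the SEC files, so the numbers will not match the results that follow.

```python
import bz2
import gzip
import lzma
import time

# Standard-library codecs only; not the .NET libraries measured in this post.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data):
    """Time compression and decompression of `data` for each codec."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        compress_ms = (time.perf_counter() - start) * 1000

        start = time.perf_counter()
        unpacked = decompress(packed)
        decompress_ms = (time.perf_counter() - start) * 1000

        if unpacked != data:
            raise ValueError(f"{name} round trip did not reproduce the input")

        results[name] = {
            "compress_ms": compress_ms,
            "decompress_ms": decompress_ms,
            # Same metric as in the results: compressed size / uncompressed.
            "ratio_pct": 100 * len(packed) / len(data),
        }
    return results


if __name__ == "__main__":
    # Synthetic stand-in for real text files.
    sample = b"The quick brown fox jumps over the lazy dog. " * 20_000
    for name, r in benchmark(sample).items():
        print(f"{name:5} {r['compress_ms']:8.1f}ms compress  "
              f"{r['decompress_ms']:8.1f}ms decompress  "
              f"{r['ratio_pct']:.2f}% = compressed size / uncompressed")
```

Highly repetitive input like this sample compresses far better than real documents, so use your own files for any serious comparison.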
System.IO.Compression.GZipStream Compression
5,620ms 10,930,530Kc (G0= 4, G1= 2, G2= 1)
25.08% = compressed size / uncompressed
Decompression
2,696ms 5,253,716Kc (G0= 1, G1= 1, G2= 1)
------------------------------------------------------------
GNU BZip2 Compression
36,574ms 70,954,961Kc (G0= 16, G1= 1, G2= 1)
13.82% = compressed size / uncompressed
Decompression
6,201ms 11,536,973Kc (G0= 10, G1= 1, G2= 1)
------------------------------------------------------------
Xceed BZip2 Compression
380,690ms 740,282,089Kc (G0= 999, G1= 16, G2= 5)
13.82% = compressed size / uncompressed
Decompression
10,295ms 20,068,563Kc (G0= 6, G1= 3, G2= 3)
------------------------------------------------------------
LZMA Compression
137,585ms 269,214,645Kc (G0= 12, G1= 10, G2= 10)
13.7% = compressed size / uncompressed
Decompression
3,628ms 7,081,899Kc (G0= 2, G1= 2, G2= 2)
------------------------------------------------------------
SharpZipLib BZip2 Compression
88,732ms 172,606,685Kc (G0=3356, G1= 12, G2= 3)
13.83% = compressed size / uncompressed
Decompression
11,696ms 22,875,325Kc (G0= 4, G1= 3, G2= 3)
BZip2 and LZMA (7zip) get roughly equivalent compression ratios, and each is about twice as good at compressing the test data as GZip (about 14% of the original size versus 25%). But if you look at the performance of the LZMA and BZip2 libraries, there are huge differences.
LZMA and SharpZipLib are both over twice as fast as Xceed at compression (about 2.8 and 4.3 times as fast, respectively). The GNU BZip2 library (C code with a managed wrapper) is faster still: about 2.4 times as fast as SharpZipLib and 3.8 times as fast as LZMA. So taking the time to test out a couple of options can pay off, especially if you use compression to speed up moving data over the network.
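The speed comparisons can be checked directly from the compression times reported above. The short script below only does that arithmetic; the millisecond figures are copied from the results, and the `speedup` helper is a name made up for this example.

```python
# Compression times in milliseconds, copied from the results above.
compress_ms = {
    "GZipStream": 5_620,
    "GNU BZip2": 36_574,
    "Xceed BZip2": 380_690,
    "LZMA": 137_585,
    "SharpZipLib BZip2": 88_732,
}


def speedup(slower, faster):
    """How many times as fast `faster` is compared with `slower`."""
    return compress_ms[slower] / compress_ms[faster]


if __name__ == "__main__":
    pairs = [
        ("Xceed BZip2", "LZMA"),
        ("Xceed BZip2", "SharpZipLib BZip2"),
        ("LZMA", "GNU BZip2"),
        ("SharpZipLib BZip2", "GNU BZip2"),
    ]
    for slower, faster in pairs:
        print(f"{faster} is {speedup(slower, faster):.1f}x as fast as {slower}")
```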