Weissman score
The Weissman score is a performance metric for lossless compression applications. It was developed by Tsachy Weissman, a professor at Stanford University, and Vinith Misra, a graduate student, at the request of producers for HBO's television series Silicon Valley, a television show about a fictional tech start-up working on a data compression algorithm.[1][2][3][4] It compares both required time and compression ratio of measured applications, with those of a de facto standard according to the data type.
The formula is the following; where r is the compression ratio, T is the time required to compress, the overlined ones are the same metrics for a standard compressor, and alpha is a scaling constant.[1]
The Weissman score has been used by Daniel Reiter Horn and Mehant Baid of Dropbox to explain real-world work on lossless compression. According to the authors it "favors compression speed over ratio in most cases."[5]
Example
This example shows the score for the data of the Hutter Prize,[6] using the paq8f as a standard and 1 as the scaling constant.
| Application | Compression ratio | Compression time [min] | Weissman score | 
|---|---|---|---|
| paq8f | 5.467600 | 300 | 1.000000 | 
| raq8g | 5.514990 | 420 | 0.720477 | 
| paq8hkcc | 5.682593 | 300 | 1.039321 | 
| paq8hp1 | 5.692566 | 300 | 1.041145 | 
| paq8hp2 | 5.750279 | 300 | 1.051701 | 
| paq8hp3 | 5.800033 | 300 | 1.060801 | 
| paq8hp4 | 5.868829 | 300 | 1.073826 | 
| paq8hp5 | 5.917719 | 300 | 1.082325 | 
| paq8hp6 | 5.976643 | 300 | 1.093102 | 
| paq8hp12 | 6.104276 | 540 | 0.620247 | 
| decomp8 | 6.261574 | 540 | 0.63623 | 
| decomp8 | 6.276295 | 540 | 0.637726 | 
Limitations
Although the value is relative to the standards against which it is compared, the unit used to measure the times changes the score (see examples 1 and 2). This is a consequence of the requirement that the argument of the logarithmic function must be dimensionless. The multiplier also can't have a numeric value of 1 or less, because the logarithm of 1 is 0 (examples 3 and 4), and the logarithm of any value less than 1 is negative (examples 5 and 6); that would result in scores of value 0 (even with changes), undefined, or negative (even if better than positive).
| # | Standard compressor | Scored compressor | Weissman score | Observations | ||||
|---|---|---|---|---|---|---|---|---|
| Compression ratio | Compression time | Log (compression time) | Compression ratio | Compression time | Log (compression time) | |||
| 1 | 2.1 | 2 min | 0.30103 | 3.4 | 3 min | 0.477121 | 1×3.4/2.1×0.30103/0.477121=1.021506 | Change in unit or scale, changes the result. | 
| 2 | 2.1 | 120 sec | 2.079181 | 3.4 | 180 sec | 2.255273 | 1×3.4/2.1×2.079181/2.255273=1.492632 | |
| 3 | 2.2 | 1 min | 0 | 3.3 | 1.5 min | 0.176091 | 1×3.3/2.2×0/0.176091=0 | If time is 1, its log is 0; then the score can be 0 or ±∞. | 
| 4 | 2.2 | 0.667 min | −0.176091 | 3.3 | 1 min | 0 | 1×3.3/2.2×−0.176091/0=−∞ | |
| 5 | 1.6 | 0.5 h | −0.30103 | 2.9 | 1.1 h | 0.041393 | 1×2.9/1.6×−0.30103/0.041393=−13.18138 | If time is less than 1, its log is negative; then the score can be negative. | 
| 6 | 1.6 | 1.1 h | 0.041393 | 1.6 | 0.9 h | −0.045757 | 1×1.6/1.6×0.041393/−0.045757=−0.904627 | |
See also
References
- ^ a b Perry, Tekla (July 28, 2014). "A Fictional Compression Metric Moves Into the Real World". Retrieved January 25, 2016.
- ^ Perry, Tekla (July 25, 2014). "A Made-For-TV Compression Algorithm". Retrieved January 25, 2016.
- ^ Sandberg, Elise (April 12, 2014). "HBO's 'Silicon Valley' Tech Advisor on Realism, Possible Elon Musk Cameo". The Hollywood Reporter. Retrieved June 10, 2014.
- ^ Jurgensen, John; Rusli, Evelyn M. (April 3, 2014). "There's a New Geek in Town: HBO's 'Silicon Valley'". The Wall Street Journal. Retrieved June 10, 2014.
- ^ "Lossless compression with Brotli in Rust for a bit of Pied Piper on the backend". Dropbox Tech Blog. Retrieved 2017-06-24.
- ^ Hutter, Marcus (July 2016). "Contestants". Retrieved January 25, 2016.