Executable packers unpacking performance tested.

m^(2)
Posts: 890
Joined: Sat Mar 31, 2007 2:38 am
Location: Kce,PL

#1 Post by m^(2) »

At first I wanted to test UPX decompression performance in different modes. I knew that LZMA is slow, but at what size does it start to matter?
While preparing the tests, I also tried the other compressors I had on hand. The results turned out worse than I expected... Is the size gain worth a longer startup, bigger installers and the risk of breaking something?
Read and discuss.

Data: a compilation of ~30 executables, 104 914 944 bytes.
All tests were done on a Pentium D 805 @ 2.66 GHz.
Results:

Program    size (bytes)  decompression time (s)  switches used
UPX NRV2B  37060096       0.968                  --best --all-filters --crp-ms=99999 --nrv2b
UPX NRV2E  36585472       1.000                  --best --all-filters --crp-ms=99999 --nrv2e
UPX LZMA   28961792       6.453                  --best --all-filters --crp-ms=99999 --lzma
Upack      28923468      37.422                  -brute
kkrunchy   error (application failed to initialize properly)
mpress     error (bad executable)
MEW        error (crash after 34.828)
20to4      error (application hangs)
4 of 6 contenders couldn't correctly compress such a big file. :?
Upack is very slow and only slightly better than UPX LZMA; IMO it has no practical use.
That leaves only UPX. Which mode to use?
NRV2B decompresses at 103.3 MB/s, NRV2E at 100 MB/s, LZMA at 15.5 MB/s.
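
The speeds above follow from the test file size and the measured times. A minimal sketch of that arithmetic, assuming "MB" here means 2**20 bytes (that assumption reproduces the quoted figures to within rounding):

```python
# Figures copied from the results table above.
ORIGINAL_SIZE = 104_914_944   # bytes, the uncompressed test compilation
MIB = 1024 * 1024             # assumption: "MB" in this post means 2**20 bytes

decompression_time_s = {
    "NRV2B": 0.968,
    "NRV2E": 1.000,
    "LZMA": 6.453,
}


def decompression_speed(time_s, size_bytes=ORIGINAL_SIZE):
    """Throughput in MB/s, measured on the uncompressed output."""
    return size_bytes / time_s / MIB


for mode, t in decompression_time_s.items():
    print(f"{mode}: {decompression_speed(t):.1f} MB/s")
```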

Now count in HDD / pendrive speed... it gets tougher.
At startup, a compressed application has to read the whole executable, while an uncompressed one reads only a part of it. How big a part? There's no easy answer; it depends heavily on the particular application. Let's try to measure it. The accurate method would be to ask the Windows memory manager, but I don't know of any tool that does this. I could write one myself, but right now I lack access to the Microsoft documentation, which is necessary.
But there's another thing we can use... Compressed executables read the whole file, and this shows up in the amount of memory they use. The trick is to run the same program twice, once compressed and once not, and measure the difference. It's a bit less accurate, though.
Important: many applications are a combination of .exes and .dlls. Not all of them are loaded at startup, so the startup usage ratio (SUR) should be calculated against the total size of the modules actually in use. All good task managers let you see the list of loaded modules, so that's easy.
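
The post does not spell out the SUR formula, so the following is only a sketch of one way the memory-difference trick could be turned into a number. It assumes the packed run keeps every byte of the loaded modules in memory, so the difference between the two readings is the part the unpacked run never touched. The function name and the sample readings are hypothetical.

```python
def startup_usage_ratio(mem_packed, mem_unpacked, total_module_size):
    """Estimate SUR from two memory readings of the same program.

    Assumption: mem_packed - mem_unpacked is roughly the part of the
    loaded .exe/.dll files that the unpacked run never read from disk.
    All three arguments must use the same unit (e.g. KB).
    """
    never_loaded = mem_packed - mem_unpacked
    return 1.0 - never_loaded / total_module_size


# Hypothetical readings, not taken from the tests in this post:
# 4000 KB of loaded modules, 9000 KB used when packed, 7000 KB when not.
print(f"SUR = {startup_usage_ratio(9000, 7000, 4000):.1%}")
```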
My test results:

SumatraPDF          - 16.9%
CDex                - 25.3%
KMPlayer            - 34.1%
PhotoScape          - 35.8%
VirtualDub          - 40.4%
TrueCrypt           - 40.7%
WinHTTrack          - 40.8%
AngelWriter         - 43.0%
Rainlendar          - 45.9%
Artweaver           - 46.0%
Media Coder         - 46.5%
DSynchronize        - 46.9%
FSViewer            - 47.2%
Audio Identifier    - 52.7%
FreeOCR             - 55.4%
InfraRecorder       - 55.9%
Ant Movie Catalog   - 57.0%
IcoFX               - 57.0%
PStart              - 57.6%
PSPad               - 64.0%
HxD                 - 67.6%
Dev Project Manager - 72.3%
eToolz              - 78.6%
Opera               - 95.1%
I think we can say that it's usually 30-75%, with an average of about 50%.
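
As a quick check of that summary, the SUR values from the table can be fed into a few lines of Python. This is only a sketch that restates the table; the variable names are made up for illustration.

```python
from statistics import mean, median

# SUR values (percent) copied from the table above, in the same order.
sur_percent = [
    16.9, 25.3, 34.1, 35.8, 40.4, 40.7, 40.8, 43.0,
    45.9, 46.0, 46.5, 46.9, 47.2, 52.7, 55.4, 55.9,
    57.0, 57.0, 57.6, 64.0, 67.6, 72.3, 78.6, 95.1,
]

# Applications that fall inside the "usually 30-75%" band.
in_band = [s for s in sur_percent if 30 <= s <= 75]

print(f"mean   = {mean(sur_percent):.1f}%")
print(f"median = {median(sur_percent):.2f}%")
print(f"{len(in_band)} of {len(sur_percent)} apps are within 30-75%")
```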

Uncompressed application load speed:

load_speed = read_speed / startup_usage_ratio

Compressed:

load_speed = 1 / (compression_ratio / read_speed + 1 / decompression_speed)

Variant1: fast pendrive that reads at 30 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed:     100.0 MB/s      60.0 MB/s
NRV2B              46.6 MB/s      46.6 MB/s
NRV2E              46.3 MB/s      46.3 MB/s
LZMA               13.6 MB/s      13.6 MB/s
If you assume a 50% startup usage ratio, NRV2 loads about 22% slower than an uncompressed exe (46.6 vs 60.0 MB/s) while being almost 3 times smaller. Keep in mind that startup time is usually much longer than loading, so the total difference is going to be much smaller.
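
The two load-speed formulas and the Variant1 table can be reproduced with a short script. This is a sketch under the assumption that "MB" means 2**20 bytes and that the compression ratio is compressed size divided by original size; the function names are illustrative.

```python
ORIGINAL_SIZE = 104_914_944   # bytes, from the test description
MIB = 1024 * 1024             # assumption: "MB" means 2**20 bytes

# name: (compressed size in bytes, decompression time in seconds)
PACKERS = {
    "NRV2B": (37_060_096, 0.968),
    "NRV2E": (36_585_472, 1.000),
    "LZMA": (28_961_792, 6.453),
}


def uncompressed_speed(read_speed, sur):
    """Effective load speed of an unpacked program, in MB/s."""
    return read_speed / sur


def compressed_speed(read_speed, packed_size, time_s):
    """Effective load speed of a packed program, in MB/s."""
    compression_ratio = packed_size / ORIGINAL_SIZE
    decompression_speed = ORIGINAL_SIZE / time_s / MIB
    return 1.0 / (compression_ratio / read_speed + 1.0 / decompression_speed)


READ_SPEED = 30.0  # MB/s, the "fast pendrive" of Variant1
for sur in (0.3, 0.5):
    print(f"Uncompressed at {sur:.0%} SUR: "
          f"{uncompressed_speed(READ_SPEED, sur):.1f} MB/s")
for name, (size, t) in PACKERS.items():
    print(f"{name}: {compressed_speed(READ_SPEED, size, t):.1f} MB/s")
```

Changing READ_SPEED to 10.0 or 80.0 gives the figures for the other two variants.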
But how does it work in practice? I think that the limit of what you can notice is about 0.05 s. How big a program can you compress with virtually no load time increase?

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             2.33 MB       4.36 MB      10.45 MB
NRV2E             2.31 MB       4.30 MB      10.09 MB
LZMA              0.67 MB       0.78 MB       0.87 MB
The average application has less than 10 MB of executables, so it won't start noticeably slower. No app below 2.31 MB is going to be noticeably slower. As long as you use NRV2, of course.
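
The post does not show how the size limits were derived. One reading that reproduces the table closely is: find the program size at which the packed version takes exactly 0.05 s longer to load than the unpacked one. A sketch under that assumption, with illustrative names:

```python
import math

NOTICEABLE_DELAY_S = 0.05  # the "limit of what you can notice" from the post


def max_unnoticeable_size(compressed_speed, read_speed, sur,
                          delay=NOTICEABLE_DELAY_S):
    """Largest program size (MB) whose packed load is < `delay` s slower.

    compressed_speed: effective packed load speed in MB/s (from the table)
    read_speed:       raw media read speed in MB/s
    sur:              startup usage ratio, 0.0 .. 1.0
    Returns math.inf when the packed program is never slower.
    """
    packed_s_per_mb = 1.0 / compressed_speed
    unpacked_s_per_mb = sur / read_speed
    extra_s_per_mb = packed_s_per_mb - unpacked_s_per_mb
    if extra_s_per_mb <= 0:
        return math.inf
    return delay / extra_s_per_mb


# NRV2B on the 30 MB/s pendrive (46.6 MB/s effective load speed).
for sur in (0.0, 0.3, 0.5):
    limit = max_unnoticeable_size(46.6, 30.0, sur)
    print(f"SUR {sur:.0%}: {limit:.2f} MB")
```

An infinite result corresponds to the "every" entries in the slow pendrive table.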
Variant2: slow pendrive, 10 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed:      33.3 MB/s      20.0 MB/s
NRV2B              22.2 MB/s      22.2 MB/s
NRV2E              22.3 MB/s      22.3 MB/s
LZMA               10.9 MB/s      10.9 MB/s

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             1.11 MB       3.33 MB         every
NRV2E             1.11 MB       3.36 MB         every
LZMA              0.54 MB       0.80 MB       1.18 MB
NRV2 is faster than uncompressed in most cases here. LZMA is faster on some executables too, but only very rarely.
Variant3: HDD, 80 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed      266.7 MB/s     200.0 MB/s
NRV2B              71.0 MB/s      71.0 MB/s
NRV2E              69.7 MB/s      69.7 MB/s
LZMA               14.7 MB/s      14.7 MB/s

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             3.78 MB       4.83 MB       6.09 MB
NRV2E             3.70 MB       4.71 MB       5.89 MB
LZMA              0.74 MB       0.77 MB       0.80 MB
To sum up:
Claims that UPX does not slow down application startup are wrong. It usually doesn't have a big impact, but cases where it does happen often enough to be significant.
For very small programs you can use LZMA. You'll gain little, but there's virtually no loss.
For bigger ones use NRV2E: it compresses slightly better than NRV2B, which makes it faster in the pessimistic case and practically as fast otherwise.

Other thoughts:

1. Machines about half as fast as mine are still in use today. On them the impact of decompression speed is going to be bigger. I'd say that the upper limit for UPX LZMA should be no higher than 300-400 KB unless you know which machines you'll be running it on. That's for all of the application's .exes and .dlls together.
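
The post does not say how the 300-400 KB figure was obtained. A rough sketch that lands in the same range: model "half as fast" as a doubled LZMA decompression time, take the worst case of 0% SUR, and apply the 0.05 s limit from earlier. All of these modelling choices are assumptions, as is "MB" meaning 2**20 bytes.

```python
ORIGINAL_SIZE = 104_914_944   # bytes, the test compilation
LZMA_SIZE = 28_961_792        # bytes, UPX LZMA result
LZMA_TIME_S = 6.453           # seconds on the Pentium D 805
MIB = 1024 * 1024             # assumption: "MB" means 2**20 bytes


def lzma_limit_kb(read_speed, slowdown=2.0, delay_s=0.05):
    """Size limit in KB for UPX LZMA on a machine `slowdown` times slower.

    Worst case (0% SUR): the whole packed load time counts as extra delay.
    """
    ratio = LZMA_SIZE / ORIGINAL_SIZE
    decompression_speed = ORIGINAL_SIZE / (LZMA_TIME_S * slowdown) / MIB
    load_speed = 1.0 / (ratio / read_speed + 1.0 / decompression_speed)
    return delay_s * load_speed * 1024


for read_speed in (10.0, 30.0, 80.0):  # the three media from the tables
    print(f"{read_speed:>4.0f} MB/s media: "
          f"{lzma_limit_kb(read_speed):.0f} KB")
```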

2. Compressed executables are loaded to memory almost sequentially, while uncompressed not necessarily. This can reduce startup performance of uncompressed programs when running from HDD, but I think it matters very rarely.

3. When loading multiple instances of the same executable, the OS optimizes the process: everything is loaded only once. Packing breaks this: every instance has to be loaded separately, which is slower and takes more memory.
Things like Google Chrome, where each tab is a different process, shouldn't be packed. The same applies to DLLs shared between programs.

4. In my test I made one big executable containing many smaller ones. This affects the results in several ways:
- Bigger data tends to compress better, so real compression ratios are going to be somewhat worse.
- Each executable contains a header with some constant-sized parts that can't be compressed. This significantly affects the compression ratios of small files, while here it's not significant. (Upack scores better on smaller files ;))
The icon has the same impact.
- Compressors can do some optimizations that are section-specific. Putting everything in .rsrc like I did could break them. I guess that's not the case here, because the compression ratios are very good compared to what standalone programs can do.

5. Some programs have something called "overlay data", which can take up a lot of their size. Packers can do 2 things with it: either preserve it unchanged or remove it, and removal can break the program. This greatly affects compression and startup usage ratios, but:
a) It doesn't change load time.
b) The ratios above assumed no overlay data.
Therefore it can change the size gain, but doesn't break the performance calculations.

6. There's one thing that makes SUR calculations a bit less accurate. When you start an application, its first icon is practically always already in memory, even when the application doesn't use it, because your file manager loaded it. This can account for a difference of around 100 KB, but only in rare cases.

7. I remember reading somewhere that UPX does in-place decompression, so after it's done no memory is wasted. If this is wrong, SUR is going to be slightly higher, but usually by much less than 0.1% anyway. If anybody knows how it really works, please let me know.

8. The method of calculating SUR from the memory difference is incorrect in some cases: where multiple executables are stuffed into a single one. E.g. Sysinternals Process Explorer works this way. It often doesn't work with launchers either: many quit before you can measure anything, and some are written in NSIS and have plugins stuffed inside, like Process Explorer.
I didn't include any problematic app in this test.

9. The size gain can be different from what it looks like at first sight. What matters is not the file size, but how much space the file takes on your hard drive. The real size is usually the file size rounded up to a multiple of some value from 4-32 KB, plus some constant (called "metadata").
It makes no sense to compress files that are 4 KB; you'll only lose this way. The same is true of many bigger ones.

10. UPXed files = bigger installers.

For the sake of completeness, program versions:

UPX 3.03w
Upack 0.399
MEW11 SE
kkrunchy 0.23 alpha 2
Mpress 1.25
20to4 2004.04.18
I'll try PECompact shortly, tell me if you want something else.

#2 Post by m^(2) »

Nobody's interested?
I tested PECompact and it hangs whenever I try to pack the file, regardless of which codec I use.

chezduong
Posts: 67
Joined: Mon Jan 15, 2007 6:14 am

Au contraire

#3 Post by chezduong »

Au contraire. I was actually very interested and read your entire post.

Immediately after doing so, I promptly decompressed (UPX) all of my exe's and dll's and recompressed using UPX NRV2E. To my pleasant surprise, ALL THE APPS on my USB key loaded much much faster. It was very noticeable. I am using a Sandisk cruzer Titanium (U3) 2 GB, which is relatively fast with small files, and I think that this makes much more of a difference than if the drive was slow to begin with.

All these years I had been using LZMA to save a few MB's on my USB key, and the result was that I spent a few minutes a day staring at the screen, waiting for apps to launch… time lost for nothing.

Now I also have a 16 GB OCZ Rally2 (which is very slow for small files, but very fast for large files) that I might put all of my uncompressed files on if I thought that uncompressing them could speed up the launch due to non-decompression (albeit at the cost of having to read a file 3x as large). So I am not sure of the results here. Care to take a guess?

More generally, what exactly is your recommendation for the best compromise for USB drives? NRV2E or NO COMPRESSION? And for slow drives?

CD

#4 Post by m^(2) »

From my tests, the average SUR is ~50%, while the compression ratio is ~35%, which means that on average compressed exes read less data from the drive. So the slower the drive, the better the relative performance of compressed programs. But there's a huge variance depending on the particular application.
As I said, I recommend LZMA for very small programs and NRV2E for bigger ones.
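
To put a number on "the slower the drive, the better", the two load-speed formulas from the first post can be set equal and solved for the read speed. This is a sketch of that derivation, not something stated in the thread: packed wins when compression_ratio / read_speed + 1 / decompression_speed < SUR / read_speed, i.e. when read_speed < (SUR - compression_ratio) * decompression_speed.

```python
def break_even_read_speed(sur, compression_ratio, decompression_speed):
    """Media read speed (MB/s) below which the packed program loads faster.

    Returns 0.0 when the packed file is not smaller than the part an
    unpacked program would read, so packing never wins on load time.
    """
    return max(0.0, (sur - compression_ratio) * decompression_speed)


# Averages quoted above with NRV2E's ~100 MB/s decompression speed.
print(f"{break_even_read_speed(0.50, 0.35, 100.0):.1f} MB/s")
# A program with only 30% SUR reads less than the packed file holds.
print(f"{break_even_read_speed(0.30, 0.35, 100.0):.1f} MB/s")
```

With the average figures the break-even lands between the slow (10 MB/s) and fast (30 MB/s) pendrive variants, which matches the tables: NRV2 beats uncompressed at 50% SUR only on the slow drive.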

Thanks

#5 Post by chezduong »

Thanks. CD

Onesimus Prime
Posts: 133
Joined: Wed Sep 05, 2007 8:42 pm

#6 Post by Onesimus Prime »

Thanks for doing this! I really don't use much other than UPX (if that), but it's good to know now what some of the tradeoffs are in this sort of thing.

#7 Post by m^(2) »

Followup:
I decided to test one more thing: are NRV / LZMA good choices for an executable packer?
I tested several archivers to see how well they work compared to UPX.

Program         size (B)  time (s)
LZOP -9 -F      46023879  0.656(1)
QuickLZ -3      49342629  0.687
QuickLZ -2      52595713  0.718
QuickLZ -1      56463297  0.734
LZSS ex         47882127  0.828
FastLZ -2       57871075  0.875
FastLZ -1       59193978  0.921
FastLZ opt -1   59383836  0.921
FastLZ opt -2   58255894  0.953
UPX NRV2B       37060096  0.968(1)
UPX NRV2E       36585472  1.000(1)
tor -12 -c1     41928945  1.015(2)
LZTurbo 19      38773267  1.046
4x4 tor:1:128m  62324238  1.078
Quick -0        55693688  1.125
LZTurbo 29      36596233  1.187
tor -12 -c2     38032542  1.250(2)
FreeArc -1xx    42219499  1.359
LZTurbo 39      36982013  1.578(3)
FreeArc -2xx    38260468  1.640
CabArc LZX:21   35307686  1.671(1)
nz -cd          34508054  1.781(1)
tor -12 -c3     34284810  1.796(1)
4x4 tor:4:128m  41509713  2.156
Thor e2         50217340  2.281
Thor e5         41955492  2.406
FreeArc -3xx    31746653  2.546
Thor e1         53220550  2.546
nz -cD          31159412  2.687(1)
4x4 tor:12:48m  35831898  2.812(3)
Thor e3         45614772  3.031
tor -12 -c4     33729113  3.046
nz -cf          44669276  3.078
LZTurbo 49      33727604  3.468(3)
FreeArc -4xx    31192925  3.579
4x4 12:48m      31007550  4.781(1)(3)
slug            43081286  5.140
nz -cF          38839581  5.578
Thor e4         41243524  5.656
UPX LZMA        28961792  6.453(1)
FreeArc -9x     26162769  6.843(1)
7z -mx9         25791853  7.515(1)
LZTurbo 59      33079469  9.734(3)
rzm             25498961 15.875(1)
nz -co          25121167 23.562(1)
Upack           28923468 37.422
nz -cO          23867674 57.531(1)

(1) "The best" mode. Nothing beats it in both size and time.
(2) "The best" external.
(3) Global time, includes IO.
The external packers have a slight advantage - UPX / Upack have to add decompression code, so archive size is slightly bigger.
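
Footnote (1) describes what is usually called a Pareto front: an entry is "the best" when no other entry is at least as small and at least as fast. A minimal sketch of that check on a handful of rows copied from the table (only a subset, so the result says nothing about the rows left out):

```python
# (name, compressed size in bytes, decompression time in seconds)
results = [
    ("LZOP -9 -F", 46023879, 0.656),
    ("QuickLZ -3", 49342629, 0.687),
    ("UPX NRV2B", 37060096, 0.968),
    ("UPX NRV2E", 36585472, 1.000),
    ("LZTurbo 29", 36596233, 1.187),
    ("UPX LZMA", 28961792, 6.453),
    ("7z -mx9", 25791853, 7.515),
    ("Upack", 28923468, 37.422),
]


def pareto_front(rows):
    """Names of rows not beaten in both size and time by another row."""
    front = []
    for name, size, time_s in rows:
        dominated = any(
            other_size <= size and other_time <= time_s
            and (other_size < size or other_time < time_s)
            for _, other_size, other_time in rows
        )
        if not dominated:
            front.append(name)
    return front


print(pareto_front(results))
```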

LZOP... is fast: 152.5 MB/s. I repeated the previous calculations to see how it would perform once you count disk load time:

Variant1: fast pendrive that reads at 30 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed:     100.0 MB/s      60.0 MB/s
NRV2B              46.6 MB/s      46.6 MB/s
NRV2E              46.3 MB/s      46.3 MB/s
LZMA               13.6 MB/s      13.6 MB/s
LZOP               47.2 MB/s      47.2 MB/s

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             2.33 MB       4.36 MB      10.45 MB
NRV2E             2.31 MB       4.30 MB      10.09 MB
LZMA              0.67 MB       0.78 MB       0.87 MB
LZOP              2.36 MB       4.47 MB      11.08 MB
Variant2: slow pendrive, 10 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed:      33.3 MB/s      20.0 MB/s
NRV2B              22.2 MB/s      22.2 MB/s
NRV2E              22.3 MB/s      22.3 MB/s
LZMA               10.9 MB/s      10.9 MB/s
LZOP               19.8 MB/s      19.8 MB/s

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             1.11 MB       3.33 MB         every
NRV2E             1.11 MB       3.36 MB         every
LZMA              0.54 MB       0.80 MB       1.18 MB
LZOP              0.99 MB       2.44 MB      117.8 MB
Variant3: HDD, 80 MB/s.

Compression:  Speed(30% SUR) Speed(50% SUR)
Uncompressed      266.7 MB/s     200.0 MB/s
NRV2B              71.0 MB/s      71.0 MB/s
NRV2E              69.7 MB/s      69.7 MB/s
LZMA               14.7 MB/s      14.7 MB/s
LZOP               83.1 MB/s      83.1 MB/s

Compression: Size(0% SUR) Size(30% SUR) Size(50% SUR)
NRV2B             3.78 MB       4.83 MB       6.09 MB
NRV2E             3.70 MB       4.71 MB       5.89 MB
LZMA              0.74 MB       0.77 MB       0.80 MB
LZOP              4.15 MB       6.03 MB       8.63 MB
Well, not that great; the size difference is big. Before doing these calculations, I thought I'd request LZOP as another UPX mode, but it's just too weak.
The UPX team did a really impressive job.

portackager
Posts: 169
Joined: Sun Apr 29, 2007 2:01 pm

#8 Post by portackager »

I've noticed many programs I've packed with UPX in general have slower startup time than those unpacked. Mostly it's because my Antivirus (AVG 8 Free) is busy unpacking and scanning the programs in the memory which can get quite aggressive with the CPU. :?

I mainly use the "--best" compression option, I didn't know UPX had so many undocumented options. :shock:

I'm not much of a command line guy, I was looking for information on the "NRV2E" option, I found FreeUPX instead, which provides a nice interface, and plenty of features to boot.

Thanks for sharing your results with us.

#9 Post by m^(2) »

portackager wrote: I've noticed many programs I've packed with UPX in general have slower startup time than those unpacked. Mostly it's because my Antivirus (AVG 8 Free) is busy unpacking and scanning the programs in the memory which can get quite aggressive with the CPU. :?
Very good point. I've heard about issues with many antiviruses, but forgot about them.

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Gauging interest level

#10 Post by webfork »

[Edit: This post was originally saved to a separate thread, which created a lot of confusion. I hit the wrong button -- sorry about that.]

---
Nobody's interested?
Although this thread has been here for a while and obviously some people are interested, I wanted to reiterate that this topic is extremely important. Speed of application start seems pretty unimportant since the vast majority of computer users are using fairly speedy machines and those that aren't are usually prepared for the slowdown.

However, size versus performance remains important in the case of OpenOffice for example. This program, a key component of most any portable collection, already starts up fairly slow on my 2 GHz Core 2 Duo machine. Compressing it for even a minor improvement in file size isn't worth it in this case. So even with very fast systems, developers will continue to have to make decisions about their size/speed priority.

I know I have to.
Last edited by webfork on Wed Nov 19, 2008 9:00 am, edited 2 times in total.

MiDoJo
Posts: 282
Joined: Thu Apr 17, 2008 2:36 pm

#11 Post by MiDoJo »

I'm confused: this is its own thread, but it seems to start in the middle of a conversation.

What is this about?

#12 Post by Onesimus Prime »

I'm pretty sure webfork was trying to respond to m^(2)'s thread regarding "Executable packers unpacking performance tested", and hit "New Topic" instead of "Post Reply" on accident.

So there ya go - all the more funny with
Although this thread has been here for a while...
:)

Hey, I'm sure I'll do something similar before too long also (actually, it's almost a miracle I haven't already!)

#13 Post by MiDoJo »

Thanks for the link. I was confused and interested, not at all mad at webfork, nor was I tryin to clown :)

Gonna reread m^2's thread now

#14 Post by Onesimus Prime »

...not at all mad at webfork nor was I tryin to clown
Sorry, I didn't intend to imply that you were.

I had pointed out a particularly ironic part of the original post, but didn't want webfork to think I was making fun of him or anything - so I just closed my last post with a comment on my own frequent klutziness. (a la "It could happen to any of us.")

Hopefully I have now avoided any further misunderstandings or hurt feelings. Hopefully. 8)

#15 Post by m^(2) »

Can any moderator join the 2 topics?
webfork wrote:Speed of application start seems pretty unimportant since the vast majority of computer users are using fairly speedy machines and those that aren't are usually prepared for the slowdown.
Good to see a different point of view.
I'm disappointed by application's slowness if it doesn't show up instantly. After 1 second I get bored. After 3 - I'm getting angry. It changes with application's complexity, but slightly.

There are few programs that offer satisfactory startup performance for me.
And a lot that fail miserably.

I can list many more really slow ones:
Firefox (Yeah, I got like 20 addons)
Miranda (Plugins)
GIMP (It's so slow that I barely use it, PhotoFiltre is usually enough)
FoxitReader (Can't wait until Sumatra gets better)
ProcessExplorer (I've been using ProcessViewer instead, but it's not it)
PSPad (I love it, but had to switch to not portable, payware TextPad because of the PSPad's startup slowness).
WinMerge (I usually use TotalCommander's internal compare tool, though it lacks features).
webfork wrote:gauging interest level
Not really. Maybe a bit. I was very surprised that something so important to me gets 0 attention. It seems that my take on the topic is extreme.
