mentor information 201 glöps
- general:
- level: user
- personal:
- first name: rune
- last name: stubbe
- cdcs:
- cdc #1: fr-08: .the .product by Farbrausch [web]
- demotool Windows Crinkler by Loonies [web] & TBC
- g0blinish: I'm not sure if you are implying that you need Visual Studio for Crinkler. The Visual Studio integration is just a convenience feature. You can also just use Crinkler as a command-line tool.
- isok added on the 2015-07-28 10:49:34
- demotool Windows Crinkler by Loonies [web] & TBC
- g0blinish: Crinkler is not specific to any language, compiler or assembler. As long as it can generate a COFF obj it should work just fine with Crinkler.
Also, this might be a good time to remind everybody that you can still support the Crinkler project by donating intro objs to us for use in our internal test suite. In particular, we seem to be very low on 1k intros :) - isok added on the 2015-07-28 07:31:05
- demotool Windows Crinkler by Loonies [web] & TBC
- After all these years, this release has reminded me just how much there is left for us to do.
I hope to see some 1k intros at Assembly putting these ~100 bytes to good use :) - isok added on the 2015-07-28 02:07:08
- demotool Windows Crinkler by Loonies [web] & TBC
- iq: no, just crinkler for now :)
- isok added on the 2013-01-20 20:33:02
- demotool Windows Crinkler by Loonies [web] & TBC
- that would require more than just two pushes, as LoadLibrary is called in a loop. we actually need two pushes per dll. this can of course still be done, but it will be at least a couple of instructions and won't be easy to fit into the current header layout, and we are still talking about a potential 1 byte improvement :)
I'm curious now. If we are guaranteed that ebx=[fs:30] on all windows versions, do we have similar guarantees about any of the other registers, flags, etc.?
I get ecx=esi=edi=0 and edx=eip, but can I rely on any of this? I have tried, and failed, to find this information many times. - isok added on the 2012-07-19 21:23:42
- demotool Windows Crinkler by Loonies [web] & TBC
- ah ok, I misunderstood your suggestion then, as ebp is also zero at the end of the normal execution. I'm running with /UNSAFEIMPORT here, so I didn't think of the error message path. Yes, that push 0 can be turned into push ebp. The difference tends to be around +-1 byte depending on the style of the intro code. We are using 0 right now, as we expect sequences of
push 0 to be more common than sequences of push ebp in intro code. But looking at it again now, it seems to be pretty even with the ebp variant, so we should probably make it a majority vote based on our test suite of intros.
It is very common for 4ks to call ExitProcess, but that wouldn't really be a problem, as I can see it is also in kernelbase.
I tried the old import code again and it is just around one byte smaller after compression. Which will then unfortunately be negated by the need to push two zeroes on the stack before LoadLibraryExA.
Thanks again. It's great to have someone with in-depth knowledge of the darker corners of win32 look into this :) - isok added on the 2012-07-19 18:46:45
- demotool Windows Crinkler by Loonies [web] & TBC
- qkumba:
Thanks for all the input.
It seems you are right about ebx=[fs:30] on startup. A quick test suggests that we can
save about 3 bytes. It does complicate the call transform code, but I think it will still be a net win in that case.
Can we rely on this across all windows versions?
It would be much appreciated if you can find the documentation you are mentioning :)
Quote:
mov ebx,<image base>
could be
mov ebx,[eax+08]
if placed after the fs: line, for a 2 bytes saving.
This seems to be slightly worse, at least for my example project, because we go from 2 to 3 non-zero bytes.
Optimizing for compressed size is tricky.
We reuse the same instruction sequences, addressing modes and registers as much as possible.
In the end it tends to compress better than the more compact alternatives we have tried. (lodsd+xchg+SIB)
This is of course highly context dependent, so you would ideally want the import code and
intro code to be written in a similar style.
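The 2-versus-3 non-zero byte count mentioned above can be checked by hand-assembling both instructions. This is only a rough sketch: it assumes an image base of 0x00400000 (the thread only says <image base>), uses the standard x86 encodings, and the helper name is made up for illustration.

```python
import struct

def nonzero_bytes(code: bytes) -> int:
    """Count bytes that are not 0x00 (hypothetical helper, for illustration only)."""
    return sum(1 for b in code if b != 0)

# Assumption: image base 0x00400000; the thread only says <image base>.
IMAGE_BASE = 0x00400000

# mov ebx, imm32 is encoded as BB followed by a little-endian dword.
mov_imm = b"\xBB" + struct.pack("<I", IMAGE_BASE)

# mov ebx, [eax+08] is encoded as 8B 58 08 (opcode, ModRM, disp8).
mov_mem = bytes([0x8B, 0x58, 0x08])

for name, code in (("mov ebx, imm32", mov_imm), ("mov ebx, [eax+08]", mov_mem)):
    print(f"{name}: {len(code)} bytes, {nonzero_bytes(code)} non-zero")
```

The shorter form saves two raw bytes but has one more non-zero byte, which is consistent with the observation that it can end up slightly worse after compression.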
Quote:
also, the check for DLL loading:
test ebp,ebp
jne OUT
push 00000000
push 00000000
push edx
push 00000000
could be
test ebp,ebp
jne OUT
push ebp
push ebp
push edx
push ebp
since ebp is known to be zero at that point, for a 3 bytes saving.
You do realize that these pushes are in your code, right?
Yes, ebp is guaranteed to be zero after the import code, but you will have to exploit it
yourself. We are not going to rewrite user code :)
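For reference, the 3-byte figure in the quote follows from the raw encodings of the pushes. A small sketch, assuming the quoted "push 00000000" is the short imm8 form (6A 00), which is the form that matches the stated saving:

```python
# Standard x86 encodings for the pushes in the quoted snippet.
PUSH_0 = bytes([0x6A, 0x00])   # push 0 (imm8 form, assumed)
PUSH_EBP = bytes([0x55])       # push ebp
PUSH_EDX = bytes([0x52])       # push edx

# The four pushes as quoted, and the suggested replacement.
original = PUSH_0 + PUSH_0 + PUSH_EDX + PUSH_0
variant = PUSH_EBP + PUSH_EBP + PUSH_EDX + PUSH_EBP

saving = len(original) - len(variant)
print(len(original), len(variant), saving)
```

This only counts uncompressed bytes; as noted elsewhere in the thread, the difference after compression depends on the style of the surrounding intro code.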
Quote:
the code in the header could be shortened by at least 3 bytes, but I didn't see any advantage because it seems that nothing can move into the gap.
Yes, gaining a lone byte somewhere in the header doesn't really help.
I'm guessing part of the saving you are mentioning is in the code around the stack reserve field.
You also need to be aware that the code inside the header is not as naive as it might seem, as it is under some additional constraints.
mov ebx, dword 3
is actually just a shorter way of jumping across the subsystem field (dword 2/3), while at the same time initializing ebx to something >1.
The next 4 fields are the reserve/commit fields for the stack and heap. We have to be extra careful about these, as they need to be small
valued dwords in order for windows not to explode, so the instructions are chosen to always have 00/01 in the most significant bytes of these 4 dwords :)
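The way mov ebx, dword 3 doubles as header data can be seen by reading its immediate back as PE fields. This is an illustrative sketch only: it relies on the PE layout where the 16-bit Subsystem field is followed by the 16-bit DllCharacteristics field, and it says nothing about the actual offsets in the Crinkler header.

```python
import struct

# mov ebx, imm32 is encoded as BB followed by a little-endian dword.
instr = b"\xBB" + struct.pack("<I", 3)

# Read the four immediate bytes as the two 16-bit fields they overlap:
# Subsystem (2 = GUI, 3 = console) followed by DllCharacteristics.
subsystem, dll_characteristics = struct.unpack("<HH", instr[1:])

print(instr.hex(" "))
print("Subsystem:", subsystem, "DllCharacteristics:", dll_characteristics)
```

So the opcode byte sits just before the field, the loader sees a valid subsystem value, and the CPU sees a single instruction that steps over it and leaves ebx > 1.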
Quote:
mov eax,[edx+1C]
add eax,ebp
mov eax,[eax+4*ecx]
mov [esp+1C],eax
if ebp and edx were exchanged, you could use
add edx,[ebp+1C]
mov eax,[edx+4*ecx]
mov [esp+1C],eax
for a 2 bytes saving
I tried this and it seems to be more than a byte worse than the original code.
mov eax, [edx+xxh]
add eax, ebp
This is actually one of the repeated patterns I mentioned earlier. This is the third instance of the pattern and the fourth instance of add eax, ebp, so at this point it is at a significant discount :)
I also prefer to use ebp instead of edx as it is preserved across calls, so it is easier to exploit it being 0 in your intro.
Quote:
mov eax,[edx+4*ecx]
mov [esp+1C],eax
popad
could be
pop eax
push [edx+4*ecx]
popad
for a 3 bytes saving.
Wouldn't this store the value into EDI instead of EAX?
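The question can be answered by simulating the PUSHAD/POPAD stack layout. A minimal sketch, assuming the snippet runs between a pushad and the popad shown; the register contents and the 0xCAFE stand-in for [edx+4*ecx] are dummy values, and the function names are made up for illustration.

```python
# PUSHAD pushes in this order, so EDI ends up on top and EAX at [esp+1C].
PUSHAD_ORDER = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def pushad(regs, stack):
    for name in PUSHAD_ORDER:
        stack.append(regs[name])

def popad(regs, stack):
    for name in reversed(PUSHAD_ORDER):
        value = stack.pop()
        if name != "esp":  # POPAD discards the saved ESP value
            regs[name] = value

def run(variant):
    # Dummy register values 1..8; "esp" is just a placeholder here.
    regs = {name: i for i, name in enumerate(PUSHAD_ORDER, start=1)}
    stack = []
    pushad(regs, stack)
    resolved = 0xCAFE  # stand-in for the dword at [edx+4*ecx]
    if variant == "original":
        # mov eax,[edx+4*ecx] ; mov [esp+1C],eax
        regs["eax"] = resolved
        stack[-1 - 0x1C // 4] = resolved
    else:
        # pop eax ; push [edx+4*ecx]
        regs["eax"] = stack.pop()
        stack.append(resolved)
    popad(regs, stack)
    return regs

print("original:", run("original"))
print("proposed:", run("proposed"))
```

In this model the proposed sequence replaces the top stack slot, which POPAD restores into EDI, while EAX gets back its old saved value.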
Quote:
for the import table on Win2k, the requirement was to import from either kernel32.dll, or something that imports from kernel32.dll, so that kernel32.dll is loaded somehow. lz32.dll could have been wmi.dll for a 1 byte saving.
This is from the old header. We no longer support win2k. Also, you wouldn't really save 1 byte, as it was in an 8-byte slot in the header.
Quote:
the original PEB_LDR_DATA code could be used, if you resolved LoadLibraryExA instead of LoadLibraryA. the ExA version exists in kernelbase.dll and kernel32.dll, and it takes two additional parameters which would both be zero.
This sounds very interesting. I'm not really sure I follow. How do you get to kernelbase.dll and can you do this reliably on all windows versions?
Using kernelbase would result in an overhead when the intro imports from kernel32, but I guess we could redirect the imports to kernelbase in the
instances where we can. Either way, I'm curious about this :) - isok added on the 2012-07-19 10:53:19
- 1k Windows himalaya by TBC
- the archive has been updated with a win7 compatible binary.
- isok added on the 2010-05-29 15:30:51
- 1k Windows tracie by TBC
- the archive has been updated with a win7 compatible binary.
- isok added on the 2010-05-29 15:17:44
- 4k Windows receptor by TBC
- the archive has been updated with win7 compatible binaries.
- isok added on the 2010-05-29 14:51:36
account created on the 2003-02-16 23:09:29