OSX ld: why does pagezero_size default to 4GB on 64b OSX?

This is an OSX linker question. I don't think OSX (BSD or Mach layers) cares how large the zero page is or indeed whether it even exists. I think this is a tools thing. But that's my opinion and that's why I'm asking.
-pagezero_size size: By default the linker creates an unreadable segment starting at address zero named __PAGEZERO. Its existence will cause a bus error if a NULL pointer is dereferenced.
This is clear; it's for trapping NULL ptrs. On a 32b OSX system, the size of the segment is 4KB which is the system pagesize. But on current 64b system, the size of this segment increases to 4GB. Why doesn't it remain at the system pagesize 4KB or the architecture's maximum pagesize, 2MB? This means I can't use 32b absolute addressing at all.
Are there any problems with using this flag and overriding the default? Apple Store rules, ...?
(This feature is specific to the OSX ld64 linker. The feature dates at least to ld64-47.2 March 2006. Address Space Layout Randomization and 64b support start with Leopard in October 2007.)

The -pagezero_size option is a linker option, not a compiler option. So, when you use the compiler to drive linking, you need to pass it as -Wl,-pagezero_size,0x1000 (or whatever size you want). This works fine. The Wine project, to which I'm a contributor, relies on this for compatibility of its 64-bit build.
As I understand it, the default page-zero size is 4GB for 64-bit in order to catch cases where a pointer was inadvertently stored in a 32-bit variable and thus truncated. When the value is eventually cast back to a pointer, it will be in the low 4GB and therefore invalid, so any attempt to dereference it will cause an access violation.
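The truncation argument can be modeled with plain integer arithmetic. The following is only a sketch in Python: the function names and the sample address are made up for illustration, and it models addresses as integers rather than touching real memory.

```python
PAGEZERO_SIZE_64 = 0x1_0000_0000  # 4GB, the default 64-bit __PAGEZERO size discussed above


def truncate_to_32_bits(address):
    """Model storing a 64-bit pointer in a 32-bit integer: the high bits are lost."""
    return address & 0xFFFF_FFFF


def falls_in_pagezero(address, pagezero_size=PAGEZERO_SIZE_64):
    """True if the address lies inside the unmapped range [0, pagezero_size)."""
    return 0 <= address < pagezero_size


# Hypothetical 64-bit heap address, chosen only for illustration.
heap_pointer = 0x0000_7F8A_1C40_2010
truncated = truncate_to_32_bits(heap_pointer)

# With a 4GB __PAGEZERO every truncated pointer is guaranteed to fault.
print(hex(truncated), falls_in_pagezero(truncated))
# With a 4KB __PAGEZERO the same truncated value might point at mapped memory.
print(falls_in_pagezero(truncated, pagezero_size=0x1000))
```

Any value that fits in 32 bits is below 4GB, so with the default size a truncated pointer can never alias a valid mapping. With a smaller segment that guarantee is lost.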
Update:
It seems that -pagezero_size is recognized as a compiler option, too, and works just fine in my testing. Either way, I get a functioning executable and otool shows a __PAGEZERO segment of the desired size.
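As an alternative to reading the otool output by eye, the segment size can be pulled out of the load commands directly. This is a minimal sketch, assuming a thin (non-fat) little-endian 64-bit Mach-O file; the function name is my own, and the field layout follows the mach_header_64 and segment_command_64 structures.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a thin little-endian 64-bit Mach-O file
LC_SEGMENT_64 = 0x19       # load command that describes one 64-bit segment


def pagezero_size(macho):
    """Return the vmsize of the __PAGEZERO segment, or None if there is none."""
    fields = struct.unpack_from("<8I", macho, 0)  # mach_header_64 is eight 32-bit words
    magic, ncmds = fields[0], fields[4]
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")
    offset = 32  # load commands start right after the 32-byte header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", macho, offset)
        if cmd == LC_SEGMENT_64:
            # segname[16], vmaddr, vmsize follow the cmd/cmdsize pair
            segname, _vmaddr, vmsize = struct.unpack_from("<16s2Q", macho, offset + 8)
            if segname.rstrip(b"\0") == b"__PAGEZERO":
                return vmsize
        offset += cmdsize
    return None
```

Called as pagezero_size(open(path, "rb").read()) on an executable linked with -Wl,-pagezero_size,0x1000, it should return 0x1000 if the option took effect. Fat binaries would need the fat header unpacked first, which this sketch does not attempt.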
What versions of the tools are you using? I'm using Xcode 8 on Sierra (10.12.6):
$ cc --version
Apple LLVM version 8.1.0 (clang-802.0.41)
...

Related

Is it possible to generate native x86 code for ring0 in gcc?

I wonder, are there any ways to generate with the gcc some native x86 code (which can be booted without any OS)?
Yes, the Linux kernel is compiled with GCC and runs in ring 0 on x86.
The question isn't well-formed. Certainly not all of the instructions needed to initialize a modern CPU from scratch can be emitted by gcc alone; you'll need to use some assembly for that. But that's sort of academic because modern CPUs don't actually document all this stuff and instead expect your hardware manufacturer to ship firmware to do it. After firmware initialization, a modern PC leaves you either in an old-style 16-bit 8086 environment ("legacy" BIOS) or a fairly clean 32- or 64-bit (depending on your specific hardware platform) environment called "EFI Boot Services".
Operations in EFI mode are all done using C function pointers, and you can indeed build for this environment using gcc. See the gummiboot boot loader for an excellent example of working with EFI.

Mac OS X: Application with NX flag, Stack Cookies and ASLR enabled?

I want to know if an executable supports the common security protections such as NX flag, stack cookies or ASLR. It seems ASLR is set at the OS level but how do you know it is enabled? On Windows some executable do not support ASLR so I was wondering how you can determine this on Mac OS X.
First of all, the ASLR used in OSX 10.6 and below did not randomize all regions of memory. As far as I know, ASLR is enabled for all running executables. This is very easy to test for: fire up a debugger, set a breakpoint, and record any memory address on the stack. Restart the application and see whether that same variable has the same memory address.
I think in OSX 10.7 they started randomizing the dynamic linker, which Linux, BSD, and even Windows systems have been doing for a number of years.
For OSX, ASLR of linked libraries can be tested by executing export DYLD_PRINT_SEGMENTS=1 and then running a command. The __TEXT memory region is the base address for the library. Run this command twice against any binary. If the base address differs between the two executions, then ASLR's dirty work is to blame.
Stack cookies are an entirely different ballgame. This is a compiler-level protection and will vary based on the application. Modern versions of GCC should default to stack canaries enabled. Again, you should consult your debugger to see if a specific application is using canaries. Just examine the stack frame of any function to see if there is a random value inserted between the locally declared variables and the return address.
As far as the NX flag goes, you should assume any system made after 1999 uses this trivial form of protection. But this is by far the simplest protection to bypass: just use ret-to-libc, or employ a ROP chain (because of ASLR).

How does 64 bit code work on OS-X 10.5?

I initially thought that 64 bit instructions would not work on OS-X 10.5.
I wrote a little test program and compiled it with GCC -m64.
I used long long for my 64 bit integers.
The assembly instructions used look like they are 64 bit, e.g. imulq and movq 8(%rbp),%rax.
It seems to work.
I am only using printf to display the 64 bit values using %lld.
Is this the expected behaviour?
Are there any gotchas that would cause this to fail?
Am I allowed to ask multiple questions in a question?
Does this work on other OSes?
Just to make this completely clear, here is the situation for 32- and 64-bit executables on OS X:
Both 32- and 64-bit user space executables can be run on both 32- and 64-bit kernels in OS X 10.6, without emulation. On 10.4 and 10.5, both 32- and 64-bit executables can run on the 32-bit kernel. (This is not true on Windows)
The user space system libraries and frameworks are built 32/64-bit fat on 10.5 and 10.6. You can link against them normally, whether you're building for 32-bit, 64-bit, or both. A few libraries (basically the POSIX layer) are also built 32/64-bit fat on 10.4, but many of them are not.
On 10.6, the build tools produce 64-bit executables by default. On 10.5 and earlier, the default is 32-bit.
On 10.6, executables that are built fat will run the 64-bit side by default. On 10.5 and earlier, the 32-bit side is executed by default.
You can always manually specify which slice of a fat executable to use with the arch command, e.g. arch -arch i386 someCommandToRunThatIWantToRunIn32BitMode. For application bundles, you can either launch them from the command line, or there is a preference if you "get info" on the application.
OS X and Linux use the LP64 model for 64-bit executables. Pointers and long are 64 bits wide, int is still 32 bits, and long long is still 64 bits. (Windows uses the LLP64 model instead -- long is 32 bits wide in 64 bit Windows).
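The two data models can be tabulated to show which integer types are safe to hold a pointer. This is an illustrative sketch only: the table restates the sizes given above, and the helper name is made up.

```python
# Sizes in bytes of the basic C integer and pointer types in each 64-bit data model.
DATA_MODELS = {
    "LP64":  {"int": 4, "long": 8, "long long": 8, "pointer": 8},  # OS X, Linux
    "LLP64": {"int": 4, "long": 4, "long long": 8, "pointer": 8},  # 64-bit Windows
}


def types_that_hold_a_pointer(model):
    """Integer types wide enough to store a pointer without truncation."""
    sizes = DATA_MODELS[model]
    return [name for name, size in sizes.items()
            if name != "pointer" and size >= sizes["pointer"]]


for model in DATA_MODELS:
    print(model, types_that_hold_a_pointer(model))
```

The practical consequence is that code which stores pointers in a long happens to work under LP64 but truncates under LLP64, while int is too narrow in both.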
Mac OS X 10.5 supports 64-bit user-land applications pretty well. In fact, Xcode runs in 64-bit in 10.5 on a compatible architecture.
It's only in 10.6 that the built-in applications (Finder, Safari, frameworks, daemons, etc.) also have 64-bit versions.
Meta: I don't like to see answers deleted. I guess this has been discussed somewhere.
Anyway, KennyTM and the other kind soul got me started, and although one answer was deleted, I appreciated your efforts.
It looks like this is expected behaviour on the Mac, and it even seems to work on a 32-bit Linux as well (although I have not tested extensively)
Yep. GCC behaves differently (at least in my limited observation) for 32 (-m32) and 64 (-m64) bit modes. In 32-bit mode, I was able to access variable arguments using an array. In 64-bit mode this just does not work.
I have learnt that you MUST access variable parameters using va_list as defined by stdarg.h because it works in both modes.
Now I have a command-line program that runs and passes all of my test cases in 32 bit and 64 bit modes on Mac OS-X.
The program implements a linked list garbage collector sweeping 16-byte aligned malloc-allocated objects from a global list as well as machine registers and the stack - actually, there are extra registers in 64 bit mode, so I still have a bit of work to do.
Objects are either a collection of 32 or 64 bit words which link together to form LISP/Scheme-like data structures.
In summary, it is a complex program that does a lot of messing with pointers and it works the same under 32 and 64 bit modes.
Asking multiple questions does not get you all the answers you might want.
It seems to work, as I wrote, on Linux.
Again, thank you for helping me with this.

IMAGE_FILE_LARGE_ADDRESS_AWARE and 3GB OS Switch

If a Windows application has the IMAGE_FILE_LARGE_ADDRESS_AWARE flag set in the image header (via the /LARGEADDRESSAWARE linker flag), this is typically to allow a 32-bit application to use more than 2GB of memory (only makes sense if the 32-bit Operating System has set the 3GB switch in boot.ini). See MSDN article /3GB for more info.
My question is, what happens if you run this application on a system that does NOT have the 3GB switch set? Is it simply ignored? Or will the app try to use a 3GB heap and get out-of-memory errors because the userspace only has 2GB available?
I keep hearing anecdotally that the LARGEADDRESSAWARE switch is ignored for 2GB userspace systems but cannot find any official Microsoft documentation on this.
Thanks in advance.
Basically the IMAGE_FILE_LARGE_ADDRESS_AWARE tells the system, "I know that addresses with the high bit set are not negative, and can handle them".
If the system is prepared to provide user mode addresses above 2GB, then it will. If the system is not prepared to give those addresses (i.e., a 32-bit Windows OS without the /3GB setting), the process can't get those addresses anyway, but no harm done.
Also note that if an image has the IMAGE_FILE_LARGE_ADDRESS_AWARE bit set it will get access to address space above 2GB on Win64 systems, which do not support (or need) the /3GB switch. A 32-bit application will get an address space of something close to 4GB and a 64-bit application will get a huge address space - 7TB to 8TB depending on the platform (64-bit builds set the bit by default).
http://msdn.microsoft.com/en-us/library/aa366778.aspx#memory_limits
The switch is ignored, if you can call it that. For once, Microsoft actually managed to come up with a descriptive name.
The flag means exactly what it says. This image file is aware that large addresses exist.
That is, it won't crash, if it is given a pointer above the 2GB boundary.
And that's all. The OS doesn't have to treat the process special in any way. It simply indicates that if the OS is able to provide more than 2GB memory, this process can handle it without crashing.
You can make a simple hello world application which never uses more than 1.5MB, and still has this flag set. It doesn't mean "I want to use 3GB of memory", it means "When I request memory, I don't care if it's above or below the 2GB boundary".
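Since the flag is just one bit in the image header, it can be checked without running the program. The following is a minimal sketch that reads the Characteristics field of the COFF file header from raw bytes; the function name is my own, and it does no validation beyond the two signatures.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF header's Characteristics field


def is_large_address_aware(image):
    """Report whether a PE image has IMAGE_FILE_LARGE_ADDRESS_AWARE set."""
    if image[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Characteristics is the last 2 bytes of the 20-byte COFF file header,
    # which starts right after the 4-byte signature.
    (characteristics,) = struct.unpack_from("<H", image, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Only the first few hundred bytes of the file are needed, so is_large_address_aware(open(path, "rb").read(4096)) is usually enough. A True result says nothing about how much memory the program uses, only that it has declared it can cope with high addresses.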
So since the flag doesn't require the OS to do anything special, the OS simply won't do anything special if there is nothing special it can do.

Can a 32bit process access more memory on a 64bit windows OS?

From what I understand, a 32-bit process can only access 2 GB of memory on 32-bit Windows without the /3GB switch, and that some of that memory is taken up by the OS for its own diabolical reasons. This seems to mesh with my experience as we have an app that crashes when it reaches around 1.2 - 1.5 GB of RAM without memory exceptions, even though there is still plenty of memory available.
Would moving this 32-bit application to 64-bit Windows allow it to access more than the 1.5 GB it can now? Would the application itself have to be upgraded to 64-bit?
Newer versions of Visual Studio have a new flag which makes 32-bit apps "big address space aware". Basically what it does is say that if it's loaded on a 64-bit version of Windows, then it will get 4GB (the limit of 32-bit pointers). This is certainly better than the 2 or 3 GB you get on 32-bit versions of Windows. See http://msdn.microsoft.com/en-us/library/aa366778.aspx:
Most notably it says:
Limits on memory and address space vary by platform, operating system, and by whether the IMAGE_FILE_LARGE_ADDRESS_AWARE value of the LOADED_IMAGE structure and 4-gigabyte tuning (4GT) are in use. IMAGE_FILE_LARGE_ADDRESS_AWARE is set or cleared by using the /LARGEADDRESSAWARE linker option.
Also see: http://msdn.microsoft.com/en-us/library/wz223b1z.aspx
Yes, under the right circumstances, a 32-bit process on Windows can access a full 4GB of memory, rather than the 2GB it's normally limited to.
For this to work, you need the following:
The app must be running on a 64-bit OS
The app must be linked with the /LARGEADDRESSAWARE flag.
The app should be tested to make sure it actually works properly in this case. ;) (specifically, code that relies on all pointers pointing to addresses below the 2GB boundary will obviously not work here)
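The combinations discussed in these answers can be summarized as a small decision function. This is a sketch only: the function and parameter names are hypothetical, and the figures are the approximate user-mode limits quoted above, not values queried from a real system.

```python
GB = 1024 ** 3


def user_address_space_32bit(os_is_64bit, large_address_aware, three_gb_switch=False):
    """Approximate user-mode address space, in bytes, for a 32-bit Windows process."""
    if not large_address_aware:
        return 2 * GB   # flag clear: the classic 2GB, whatever the OS offers
    if os_is_64bit:
        return 4 * GB   # limited only by the 32-bit pointer size
    if three_gb_switch:
        return 3 * GB   # 32-bit Windows booted with /3GB
    return 2 * GB       # flag set, but the OS has nothing extra to give


for os64, laa, sw in [(False, False, False), (False, True, False),
                      (False, True, True), (True, True, False)]:
    print(os64, laa, sw, user_address_space_32bit(os64, laa, sw) // GB, "GB")
```

Note that the flag alone never reduces what the process gets; it only allows the OS to hand out addresses above 2GB when it has them.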
Your app will be limited by the pointer size, in your example 32 bits.
If your app were to access more memory, then you would need some sort of segmented memory architecture like we had in the 16-bit days, where apps used 16-bit pointers and offsets to access the full 32-bit memory space.
WOW64 allows using 32-bit Windows application on 64-bit Windows, translating 32-bit pointers to real 64-bit pointers. And actually 32-bit addressing should allow accessing 4GB of memory.
