Moving to 64-bit on OS X?

What is the best practice for moving to 64-bit on OS X? I'm using the 10.6 SDK with 64-bit Intel as my target.
I have int32 types to change.
Does OS X have an 'int64', or would one use a 'long long'?
Where might I find a resource listing the available data types?
What other issues are there?

Apple has exactly the documentation you want: the 64-Bit Transition Guide. OS X uses the LP64 model, so you should use
#ifdef __LP64__
etc. to conditionally compile things according to the bit width, especially if you want your code to be 32-bit/64-bit clean.
For Cocoa, see the 64-Bit Transition Guide for Cocoa. There, NSInteger has the appropriate bit width for the mode, so you don't have to deal with the width yourself.
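For a rough idea of what that conditional compilation looks like in practice, here is a minimal sketch (not from Apple's guide; the fixed-width types come from <cstdint>):

#include <cstdint>
#include <cstdio>

int main() {
#if defined(__LP64__)
    // LP64 build: long and pointers are 8 bytes wide.
    std::printf("64-bit: sizeof(long)=%zu, sizeof(void*)=%zu\n", sizeof(long), sizeof(void *));
#else
    // ILP32 build: long and pointers are 4 bytes wide.
    std::printf("32-bit: sizeof(long)=%zu, sizeof(void*)=%zu\n", sizeof(long), sizeof(void *));
#endif
    // int64_t is exactly 64 bits wide in either mode.
    int64_t big = INT64_C(1) << 40;
    std::printf("big = %lld\n", (long long)big);
    return 0;
}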

Both long and long long are 64-bit types when building 64-bit on OS X. In addition, you can use the int64_t and uint64_t types defined in <stdint.h> if you need to specify that an integer is exactly 64 bits wide.
You generally don't need to change existing int32 types in your program, unless they're being used to perform pointer arithmetic (or otherwise depend on being the same size as a pointer). 32-bit arithmetic continues to work just fine in 64-bit programs. If you do have variables that must be the same size as a pointer, use the uintptr_t type, which works in both 32- and 64-bit builds.
The other situation where you might need to make changes is if an API expects to be passed (or returns) a size_t or long or intptr_t, and you've been using int all this time instead of what the function in question actually specifies. It will have worked in 32-bit builds, but may introduce errors when built for 64-bit.
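A small sketch of the declarations involved (illustrative only; the variable names are made up, the types come from <cstdint> and <cstddef>):

#include <cstddef>
#include <cstdint>
#include <cstdio>

int main() {
    // Exactly 64 bits wide in both 32-bit and 64-bit builds.
    int64_t file_offset = -1;
    uint64_t byte_count = 42;

    // Always the same width as a pointer, so it is safe to round-trip addresses through it.
    uintptr_t address = reinterpret_cast<uintptr_t>(&byte_count);

    // Match the API's declared type instead of assuming int:
    // size_t is 32 bits in a 32-bit build and 64 bits in a 64-bit build.
    size_t length = sizeof(file_offset);

    std::printf("%lld %llu %zu %zu\n", (long long)file_offset,
                (unsigned long long)byte_count, (size_t)address, length);
    return 0;
}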
Yuji's suggestion of reading the 64-bit transition guides is excellent advice.

Related

ARM softfp vs hardfp performance

I have an ARM based platform with a Linux OS. Even though its gcc-based toolchain supports both hardfp and softfp, the vendor recommends using softfp and the platform is shipped with a set of standard and platform-related libraries which have only softfp version.
I'm making computation-intensive (NEON) AI code based on OpenCV and TensorFlow Lite. Following the vendor guide, I have built these with the softfp option. However, I have a feeling that my code underperforms compared to other, somewhat similar hardfp platforms.
Does the code performance depend on softfp/hardfp setting? Do I understand it right that all .o and .a files the compiler makes to build my program are also using softfp convention, which is less effective? If it does, are there any tricky ways to use hardfp calling convention internally but softfp for external libraries?
Normally, all objects that are linked together need to use the same float ABI. So if you need to use this softfp-only library, I'm afraid you have to compile your own software with softfp too.
I had the same question about mixing ABIs. See here
Regarding the performance: what you lose with softfp compared to hardfp is that floating-point function parameters are passed in the usual core registers instead of FPU registers, which requires some additional copying between registers. As old_timer said, it is impossible to evaluate the performance loss in general. If you have a single huge function with many float operations, the performance will be the same. If you have many small function calls with many floating-point arguments and few operations, the performance will be dramatically worse.
The softfp option only affects the parameter passing.
In other words, unless you are passing lots of float type arguments while calling functions, there won't be any measurable performance hit compared to hardfp.
And since well-designed projects rely heavily on passing pointers to structures instead of many single values, I would stick with softfp.
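To make the trade-off concrete, here is a hypothetical sketch (the function names are made up; the flags shown are the usual GCC options for selecting the float ABI):

// Built with one of (illustrative GCC invocations):
//   g++ -mfloat-abi=softfp -mfpu=neon ...   (float arguments are copied into core registers at each call)
//   g++ -mfloat-abi=hard   -mfpu=neon ...   (float arguments stay in VFP/NEON registers)

struct Sample {
    float x, y, z, w;
};

// Many scalar float parameters: this is where softfp pays the extra
// register-copying cost on every call.
float dot_scalars(float ax, float ay, float az, float aw,
                  float bx, float by, float bz, float bw) {
    return ax * bx + ay * by + az * bz + aw * bw;
}

// Passing a pointer to a structure instead: the calling convention barely
// matters, because only one pointer crosses the call boundary.
float dot_structs(const Sample* a, const Sample* b) {
    return a->x * b->x + a->y * b->y + a->z * b->z + a->w * b->w;
}

int main() {
    Sample a = {1, 2, 3, 4}, b = {5, 6, 7, 8};
    return dot_scalars(a.x, a.y, a.z, a.w, b.x, b.y, b.z, b.w) == dot_structs(&a, &b) ? 0 : 1;
}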

Finding most significant and least significant bit set in a 64-bit integer?

I'm trying to get this done in a C++ program on Windows, using Visual C++. I only need to support 64-bit targets. I know about hacks that use division or multiplication to get the info, but I'd like to know if there's a faster, non-generic way to do this... I would even consider inline assembly, but you can't use that in VS for 64-bit targets.
If code portability is not an issue you should try _BitScanForward64 and _BitScanReverse64. They're compiler intrinsics and map to a single, efficient assembler instruction.
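A minimal usage sketch (MSVC-specific; both intrinsics are declared in <intrin.h> and return 0 when the input is zero):

#include <intrin.h>
#include <cstdio>

int main() {
    unsigned __int64 value = 0x0000F00000000100ULL;
    unsigned long index;

    // Least significant set bit (bit 8 here).
    if (_BitScanForward64(&index, value))
        std::printf("lowest set bit: %lu\n", index);

    // Most significant set bit (bit 47 here).
    if (_BitScanReverse64(&index, value))
        std::printf("highest set bit: %lu\n", index);

    return 0;
}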

Any pointers to fix the Unix millennium bug or Y2k38 problem?

I've been reviewing the year 2038 problem (Unix Millennium Bug).
I read the Wikipedia article about it, which describes a possible solution to the problem.
Now I would like to change the time_t data type to an unsigned 32-bit integer, which would keep things working until 2106. I have Linux kernel 2.6.23 with RTPatch on PowerPC.
Is there any patch available that would change the time_t data type to an unsigned 32-bit integer on PowerPC? Or any patch available to resolve this bug?
time_t is actually defined by your libc implementation, not by the kernel itself.
The kernel provides various mechanisms for obtaining the current time (in the form of system calls), many of which already support more than 32 bits of precision. The problem is actually your libc implementation (glibc on most desktop Linux distributions), which, after fetching the time from the kernel, returns it to your application as a 32-bit signed integer.
While one could theoretically change the definition of time_t in your libc implementation, in practice it would be fairly complicated: such a change would alter libc's ABI, in turn requiring every application that uses libc to be recompiled from source.
The easiest solution instead is to upgrade your system to a 64-bit distribution, where time_t is already defined to be a 64-bit data type, avoiding the problem altogether.
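For reference, a quick sketch of where the two cut-off dates come from (plain arithmetic, nothing system-specific):

#include <cstdint>
#include <cstdio>

int main() {
    // A signed 32-bit time_t counts seconds since 1970-01-01 and
    // overflows at 2^31 - 1 seconds: 03:14:07 UTC on 2038-01-19.
    int32_t signed_max = INT32_MAX;          // 2147483647
    // An unsigned 32-bit counter lasts until 2^32 - 1 seconds,
    // which falls early in the year 2106.
    uint32_t unsigned_max = UINT32_MAX;      // 4294967295

    std::printf("signed 32-bit limit:   %d seconds (~year 2038)\n", signed_max);
    std::printf("unsigned 32-bit limit: %u seconds (~year 2106)\n", unsigned_max);
    // A 64-bit time_t pushes the limit out by roughly 292 billion years.
    std::printf("64-bit limit:          %lld seconds\n", (long long)INT64_MAX);
    return 0;
}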
Regarding the 64-bit distribution suggested here, may I note the issues with implementing that. There are many 32-bit non-PAE computers in the embedded industry, and replacing them with 64-bit computers is going to be a LARGE problem. Everyone is used to desktops that get replaced or upgraded frequently. All Linux OS suppliers need to get serious about providing a different option. It's not as if a 32-bit computer is flawed or useless or will wear out in 16 years. It doesn't take a 64-bit computer to monitor analog inputs, control equipment, and report alarms.

Is there really such a thing as a char or short in modern programming?

I've been learning to program for the Mac over the past few months (I have experience in other languages). Obviously that has meant learning Objective-C and thus the plainer C it is predicated on. So I stumbled on this quote, which refers to the C/C++ languages in general, not just the Mac platform.
"With C and C++ prefer use of int over char and short. The main reason behind this is that C and C++ perform arithmetic operations and parameter passing at integer level. If you have an integer value that can fit in a byte, you should still consider using an int to hold the number. If you use a char, the compiler will first convert the values into integer, perform the operations and then convert back the result to char."
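For reference, the conversion described there is C/C++ integer promotion; a minimal illustration (my own sketch, not from the quoted source):

#include <cstdio>

int main() {
    char a = 100;
    char b = 100;

    // Both operands are promoted to int before the addition, so the
    // intermediate result 200 is computed at int width.
    int as_int = a + b;

    // Converting back to char happens only on assignment; if char is signed
    // and 8 bits wide, 200 no longer fits (commonly it wraps to -56).
    char as_char = static_cast<char>(a + b);

    std::printf("as int: %d, back in a char: %d\n", as_int, (int)as_char);
    return 0;
}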
So my question is: is this the case in the Mac desktop and iPhone OS environments? I understand that when talking about these environments we're actually talking about 3-4 different architectures (PPC, i386, ARM and the A4 ARM variant), so there may not be a single answer.
Nevertheless, does the general principle hold that on modern 32-bit/64-bit systems, using 1-2 byte variables that don't align with the machine's natural 4-byte words doesn't provide much of the efficiency we might expect?
For instance, a plain old C array of 100,000 chars is smaller than the same 100,000 ints by a factor of four, but if reading out each index during an enumeration involves a cast/boxing/unboxing of sorts, will we see lower overall 'performance' despite the saved memory overhead?
The processor is very very fast compared to the memory speed. It will always pay to store values in memory as chars or shorts (though to avoid porting problems you should use int8_t and int16_t). Less cache will be used, and there will be fewer memory accesses.
Can't speak for PPC/ARM/A4, but x86 can operate on data as 8-bit, 16-bit, or 32-bit quantities (64-bit on x86_64 in 64-bit mode), although I'm not sure whether the compiler would take advantage of those instructions. Even when using a 32-bit load, the compiler could AND the data with a mask that clears the upper 16/24 bits, which would be relatively fast.
Likely, the ability to fit far more data into the cache would at least cancel out the speed difference... although the only way to know for sure would be to actually profile the code.
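A rough sketch of the kind of profiling suggested above (illustrative only; real measurements need care with optimization flags and warm caches):

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <vector>

// Sum a large array and time it, so the char and int versions can be compared.
template <typename T>
long long timed_sum(const std::vector<T>& data) {
    auto start = std::chrono::steady_clock::now();
    long long sum = std::accumulate(data.begin(), data.end(), 0LL);
    auto stop = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    std::printf("sum=%lld, took %lld us\n", sum, (long long)us);
    return sum;
}

int main() {
    const std::size_t n = 100000;
    std::vector<int8_t>  small_values(n, 3);   // ~100 KB of data
    std::vector<int32_t> large_values(n, 3);   // ~400 KB of data

    timed_sum(small_values);   // each element is promoted to int, then widened to long long
    timed_sum(large_values);
    return 0;
}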
Of course there is a need for data types smaller than the register size of the target machine. Imagine you are storing text encoded as UTF-8 or ASCII in memory, where each character is roughly a byte in size: do you want to store the characters as 64-bit quantities?
The advice you are looking at is a warning not to over-optimize.
You have to balance the savings in space against the computational performance of your choice.
I wouldn't worry too much about it; today's CPUs are complicated enough that it's hard to make this kind of judgement on your own. Choose the obvious data type and let the compiler worry about the rest.
In the addressing model of the x86 architecture, the basic unit of memory is the 8-bit byte.
This is to simplify operations on character strings and decimal arithmetic.
Then, in order to have useful sizes of integers, the instruction set allows using these in units of 1, 2, 4, and (more recently) 8 bytes.
A fact to remember is that most software development targets processors other than the ones most of us deal with on a day-to-day basis.
C and assembler are common languages for these.
About ten billion CPUs were manufactured in 2008. About 98% of new CPUs produced each year are embedded.

Non-Linux Implementations of boost::random_device

Currently, Boost only implements the random_device class for Linux (maybe *nix) systems. Does anyone know of existing implementations for other OS-es? Ideally, these implementations would be open-source.
If none exist, how should I go about implementing a non-deterministic RNG for Windows as well as Mac OS X? Do API calls exist in either environment that would provide this functionality? Thanks (and sorry for all the questions)!
On Mac OS X, you can use /dev/random (since it's a *nix).
On Windows, you probably want the CryptGenRandom function. I don't know if there's an implementation of boost::random_device that uses it.
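On the Mac side, a minimal sketch of pulling bytes straight from the device file (illustrative; error handling kept short):

#include <cstdint>
#include <cstdio>

int main() {
    // /dev/random (or /dev/urandom) behaves like an ordinary file of random bytes.
    std::FILE* dev = std::fopen("/dev/random", "rb");
    if (!dev) {
        std::perror("fopen");
        return 1;
    }

    uint32_t value = 0;
    if (std::fread(&value, sizeof(value), 1, dev) == 1)
        std::printf("random value: %u\n", value);

    std::fclose(dev);
    return 0;
}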
It depends on what you want to use your RNG for.
In general terms, you'll feed seed data into a buffer, generate hash values of the buffer, mix a counter into the result and hash it some more. The reason for using a hash function is that good hashes are designed to yield random-looking results from input data that's more structured.
If you want to use it for cryptography, things turn a lot hairier. You'll need to jump through more hoops to ensure that any repeating patterns in your RNG stay within reasonably safe limits. I can recommend Bruce Schneier's "Practical Cryptography" for an introduction to RNGs and a sample implementation. He's also got some RNG-related material up about his Yarrow RNG.
If Boost relies on /dev/random, chances are it works on Mac OS X as well (since it has that too).
On Windows there is CryptoAPI as part of the OS, and it provides a crypto-quality RNG.
Also, I believe modern Intel CPUs have a hardware RNG on the chip; however, you'd have to figure out how to get at it on each OS. Using the higher-level APIs is probably a better bet.
edit: Here's a link to how the Intel RNG works
OpenSSL has a decent one.
#include <openssl/rand.h>
#include <time.h>
...
time_t now = time(NULL);
RAND_seed(&now, sizeof(now));               // seed before you request the first number
unsigned char buf[16];                      // example: ask for 16 random bytes
int success = RAND_bytes(buf, sizeof(buf)); // returns 1 on success, 0 on failure
if (!success) die_loudly();                 // die_loudly() is a placeholder for your error handling
RAND_cleanup();                             // once you don't need any more numbers
Microsoft CryptoAPI has one on Win32. It requires a few more function calls; I'm not including all the details here because each of these calls takes 2 to 5 arguments (a sketch follows the list below). Be careful: CryptoAPI seems to require the user to have a complete local profile (C:\Documents and Settings\user\Local Settings) correctly set up before it can give you a random number.
CryptAcquireContext // see docs
CryptGenRandom
CryptReleaseContext
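A minimal sketch of that call sequence (Windows-only; CRYPT_VERIFYCONTEXT avoids creating a persistent key container; link against advapi32):

#include <windows.h>
#include <wincrypt.h>
#include <cstdio>

int main() {
    HCRYPTPROV provider = 0;

    // Acquire a context purely for random-number generation.
    if (!CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT)) {
        std::printf("CryptAcquireContext failed: %lu\n", GetLastError());
        return 1;
    }

    unsigned char buffer[16];
    if (CryptGenRandom(provider, sizeof(buffer), buffer))
        std::printf("got %u random bytes\n", (unsigned)sizeof(buffer));

    CryptReleaseContext(provider, 0);
    return 0;
}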
