Clang compiler auto-changed strcpy to memmove. macOS - gcc

When I compile with Clang on macOS (with or without Xcode), the call to strcpy is automatically replaced with memmove.
Is there a Clang flag to turn this off?
#include <string.h>

int main(void)
{
    char nice_message[6];
    const char *message = "hello";
    strcpy(nice_message, message);
    return 0;
}
Compile
clang -arch x86_64 -mmacosx-version-min=10.13 -g -fno-PIE main.c -o foobar
Trace
frida-trace -i "*memmove" -i "*strcpy" -f foobar
Instrumenting functions...
Loaded handler at "/libSystem.B.dylib/_platform_memmove.js"
Loaded handler at "/libSystem.B.dylib/wmemmove.js"
Loaded handler at "/libSystem.B.dylib/_platform_strcpy.js"
Started tracing 3 functions. Press Ctrl+C to stop.
/* TID 0x407 */
8 ms _platform_strcpy()
8 ms | _platform_memmove()
Update
I tried the same with gcc-9 (installed via Homebrew) and the behavior was largely the same.
Why do I care?
I was demonstrating Stack Overflows and Heap Overflows with strcpy and the differences between:

Compilers recognize certain functions and automatically expand them inline. For GCC and Clang, you can disable this behavior by compiling without optimization (-O0), switching to freestanding mode (-ffreestanding, as opposed to hosted), or by disabling built-in expansions (-fno-builtin).
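As a per-call alternative to those flags, here is a minimal sketch (not part of the answer above) that calls strcpy through a volatile function pointer, so the compiler cannot recognize the call as a built-in and expand it. The names call_strcpy and copy_via_libc are made up for illustration. Note that the trace in the question already shows _platform_strcpy being entered, with _platform_memmove called inside it, so this only controls what the compiler emits, not what libc does internally.

```c
#include <string.h>

/* Volatile function pointer: the compiler has to load it at run time,
   so it cannot treat the call as the strcpy built-in and expand it. */
static char *(*volatile call_strcpy)(char *, const char *) = strcpy;

/* Copies src into dst through the library's strcpy.
   The caller must make sure dst is large enough. */
char *copy_via_libc(char *dst, const char *src)
{
    return call_strcpy(dst, src);
}
```

Calling copy_via_libc(nice_message, message) in place of the direct strcpy call should leave an indirect call in the generated code, which can be checked by compiling with -S.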

Related

How to prevent GCC from inserting memset during link-time optimization?

While developing a bare-metal firmware in C for an RV32IM target (RISC-V), I encountered a linking error when LTO is enabled:
/home/duranda/riscv/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: /tmp/firmware.elf.5cZNyC.ltrans0.ltrans.o: in function `.L0 ':
/home/duranda/whatever/firmware.c:493: undefined reference to `memset'
There is, however, no call to memset in my firmware. The memset is inserted by GCC during optimization, as described here. The build is optimized for size using GCC's -Os together with the -flto -fuse-linker-plugin flags. In addition, the -fno-builtin-memset -nostdinc -fno-tree-loop-distribute-patterns -nostdlib -ffreestanding flags are used to prevent the use of memset during optimization and to avoid including the standard libraries.
How to prevent memset insertion during LTO? Note that the firmware should not be linked against libc. I also tried providing a custom implementation of memset but the linker does not want to use it for memset inserted during optimization (still throws undefined reference).
I hit a similar issue several years ago and tried to fix it, but it turned out I had misunderstood the meaning of -fno-builtin [1]: -fno-builtin does not guarantee that GCC won't call memcpy, memmove or memset implicitly.
I guess the simplest solution is: do NOT compile your libc.c with -flto; in other words, compile libc.c with -fno-lto.
This is my guess about what happens. I don't know how to reproduce what you see, so it might be incorrect:
During the first phase of LTO, GCC collects every symbol your program uses,
then asks the linker to provide the files that define them, and discards any unused symbols.
It then reads those files back into GCC and optimizes again. At this point GCC uses some built-in functions for optimization or code generation, but they were not pulled in earlier.
The symbol reference is created at the LTO stage, which is too late to pull in any symbol in the current GCC LTO flow; in this case, memset was discarded at an earlier stage.
So you might ask why compiling libc.c with -fno-lto works. Because that file is not involved in the LTO flow, it won't be discarded by it.
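A minimal sketch of what such a libc.c could contain, assuming the firmware supplies its own memset and that this file is the one built with -fno-lto. The byte-wise loop and the volatile pointer are illustrative choices, not taken from the question: the volatile stores keep GCC from recognizing the loop as a memset idiom and replacing it with a call to memset itself.

```c
#include <stddef.h>

/* Freestanding memset for a firmware libc.c (build this file with -fno-lto). */
void *memset(void *dst, int value, size_t count)
{
    /* volatile: every store must be emitted as written, so the loop is
       not turned into a call to memset, which would recurse. */
    volatile unsigned char *p = dst;

    while (count-- > 0)
        *p++ = (unsigned char)value;
    return dst;
}
```

A byte-at-a-time loop is slow but small, which fits a build optimized with -Os. A word-wise version would be faster at the cost of code size.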
Here is a sample program showing that GCC will call memset even if you compile with -fno-builtin; both aarch64 GCC and RISC-V GCC generate a function call to memset.
// $ riscv64-unknown-elf-gcc x.c -o - -O3 -S -fno-builtin
struct bar {
int a[100];
};
struct bar y;
void foo(){
struct bar x = {{0}};
y = x;
}
Here is the corresponding gcc source code[2] for this case.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2014-August/397382.html
[2] https://github.com/riscv/riscv-gcc/blob/riscv-gcc-10.2.0/gcc/expr.c#L3143
I'm not sure -fno-builtin-* does what you think it does. If you use those flags, then GCC will try to call an external function. If you don't use those flags, GCC will instead just insert inline code instead of relying on the library.
So it would appear to me you should simply not use any -fno-builtin flags.

What are __CUDABE__ and __CUDA_LIBDEVICE__ for?

Let's say I'm interested in preprocessing (with gcc) hpp/cpp files which include CUDA kernel declarations. I want the preprocessor not to strip the __global__ specifier, otherwise I wouldn't be able to link against the definition in the .cu file.
For instance, a file t1.hpp:
__global__ void foo(int* v, int n);
And preprocess with:
gcc -E t1.hpp -I/usr/local/cuda/include -include cuda_runtime.h
But the result strips __global__:
...
# 1888 "/usr/local/cuda/include/cuda_runtime.h"
#pragma GCC diagnostic pop
# 1 "<command-line>" 2
# 1 "t1.hpp"
void foo();
But if I define __CUDABE__ (on CUDA 8.0) or __CUDA_LIBDEVICE__ (on CUDA 9.0+), I am able to keep that information:
gcc -E t1.hpp -I/usr/local/cuda/include -include cuda_runtime.h -D__CUDABE__
Final result:
...
# 1888 "/usr/local/cuda/include/cuda_runtime.h"
#pragma GCC diagnostic pop
# 1 "<command-line>" 2
# 1 "t1.hpp"
__attribute__((global)) void foo();
So my question is: what are __CUDABE__ and __CUDA_LIBDEVICE__ for, and what could the side effects be?
I've also seen that clang defines those macros in __clang_cuda_runtime_wrapper.h. Is this then something safe to do?
Since it is not documented anywhere, it's some sort of an internal flag they use (which can, as you've noticed change between the compilers), so you probably shouldn't rely on it. It is defined in crt/host_defines.h, which is not very well documented, so I cannot decipher what it means.
Is there any reason why you cannot preprocess the file with nvcc?
This should do what you want, and it invokes gcc with correct parameters (at least on my system):
nvcc -E --x=cu t1.hpp
If you cannot use nvcc for whatever reason, you can always call it in verbose mode (nvcc -E -v --x=cu t1.hpp) and see which flags it sets. On my Linux system with CUDA 9.1 I get:
gcc -std=c++14 -D__CUDA_ARCH__=300 -E -x c++ \
-DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ \
-D__NVCC__ "-I/opt/cuda/bin/..//include" \
-D"__CUDACC_VER_BUILD__=85" -D"__CUDACC_VER_MINOR__=1" \
-D"__CUDACC_VER_MAJOR__=9" -include "cuda_runtime.h" \
-m64 "t1.hpp"
However, you'll probably have to do it for each CUDA version you want to use, as these flags can change.

Passing multiple -std switches to g++

Is it safe to assume that running g++ with
g++ -std=c++98 -std=c++11 ...
will compile using C++11? I haven't found an explicit confirmation in the documentation, but I see the -O flags behave this way.
The GCC manual doesn't state that the
last of any mutually exclusive -std=... options specified takes effect. The first occurrence
or the last occurrence are the only alternatives. There are numerous
GCC flags that take mutually exclusive alternative values from a finite set - mutually
exclusive, at least modulo the language of a translation unit. Let's call them mutex options for short.
It is a seemingly random rarity for it to be documented that the last setting takes effect. It is
documented for the -O options as you've noted, and in general terms for mutually exclusive warning options, perhaps
others. It's never documented that the first of multiple settings takes effect, because
it's never true.
The documentation leans - with imperfect consistency - on the historical conventions
of command usage in Unix-like OSes. If a command accepts a mutex option
then the last occurrence of the option takes effect. If the command were - unusually -
to act only on the first occurrence of the option then it would be a bug for
the command to accept subsequent occurrences at all: it should give a usage error.
This is custom and practice. The custom facilitates scripting with tools that
respect it, e.g. a script can invoke a tool passing a default setting of some
mutex option but enable the user to override that setting via a parameter of the script,
whose value can simply be appended to the default invocation.
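The scripting custom described above can be sketched in C as a tiny command builder. build_compile_command and the hard-coded default -std=c++98 are hypothetical, chosen only to mirror the question; the point is that the user's override is appended after the default, so it wins with any tool that honours the last occurrence.

```c
#include <stdio.h>

/* Builds "g++ <default std> [<user override>] <source>" into out.
   The override, if any, comes last so that it takes precedence.
   Returns the length snprintf reports for the full command. */
int build_compile_command(char *out, size_t out_size,
                          const char *user_std, const char *source)
{
    const char *default_std = "-std=c++98"; /* assumed default */

    if (user_std == NULL || user_std[0] == '\0')
        return snprintf(out, out_size, "g++ %s %s", default_std, source);
    return snprintf(out, out_size, "g++ %s %s %s",
                    default_std, user_std, source);
}
```

With user_std set to "-std=c++11" and source set to "main.cpp", the buffer holds g++ -std=c++98 -std=c++11 main.cpp. The wrapper never has to parse or remove its own default.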
In the absence of official GCC documentation to the effect you want, you might get
reassurance by attempting to find any GCC mutex option for which it is not
the case that the last occurrence takes effect. Here's one stab:
I'll compile and link this program:
main.cpp
#include <cstdio>
#if __cplusplus >= 201103L
static const char * str = "C++11";
#else
static const char * str = "Not C++11";
#endif
int main()
{
printf("%s\n%d\n",str,str); // Format `%d` for `str` mismatch
return 0;
}
with the commandline:
g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp
which requests contradictory option pairs:
-std=c++98 -std=c++11: Conform to C++98. Conform to C++11.
-m32 -m64: Produce 32-bit code. Produce 64-bit code.
-O0 -O1: Do not optimise at all. Optimize to level 1.
-g3 -g0: Emit maximum debugging info. Emit no debugging info.
-Wformat -Wno-format: Sanity-check printf arguments. Don't sanity-check them.
-o wrong -o right: Output program wrong. Output program right.
It builds successfully with no diagnostics:
$ echo "[$(g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp 2>&1)]"
[]
It outputs no program wrong:
$ ./wrong
bash: ./wrong: No such file or directory
It does output a program right:
$ ./right
C++11
-1713064076
which tells us it was compiled to C++11, not C++98.
The bug exposed by the garbage -1713064076 was not diagnosed because
-Wno-format, not -Wformat, took effect.
It is a 64-bit, not 32-bit executable:
$ file right
right: ELF 64-bit LSB shared object, x86-64 ...
It was optimized -O1, not -O0, because:
$ echo "[$(nm -C right | grep str)]"
[]
shows that the local symbol str is not in the symbol table.
And it contains no debugging information:
$ echo "[$(readelf --debug-dump right)]"
[]
as per -g0, not -g3.
Since GCC is open-source software, another way of resolving doubts
about its behaviour that is available to C programmers, at least,
is to inspect the relevant source code, available via git source-control at
https://github.com/gcc-mirror/gcc.
The relevant source code for your question is in file gcc/gcc/c-family/c-opts.c,
function,
/* Handle switch SCODE with argument ARG. VALUE is true, unless no-
form of an -f or -W option was given. Returns false if the switch was
invalid, true if valid. Use HANDLERS in recursive handle_option calls. */
bool
c_common_handle_option (size_t scode, const char *arg, int value,
int kind, location_t loc,
const struct cl_option_handlers *handlers);
It is essentially a simple switch ladder over option settings enumerated by scode - which
is OPT_std_c__11 for option -std=c++11 - and leaves no doubt that it
puts an -std option setting into effect regardless of what setting was in effect previously. You can look at branches other than master
(gcc-{5|6|7}-branch) with the same conclusion.
It's not uncommon to find GCC build system scripts that rely on the validity of
overriding an option setting by appending a new setting. Legalistically, this
is usually counting on undocumented behaviour, but there's a better
chance of Russia joining NATO than of GCC ceasing to take the last setting that
it parses for a mutex option.

How to make printf work on STM32F103?

I am new to the world of STM32F103. I have a demo code for STM32F103 and I am using arm-none-eabi to compile it.
I tried what I could find on Google, but nothing worked so far. I have already spent three days on the problem.
Can anyone give me demo code for printf that works well?
Part of my makefile:
CFLAG = -mcpu=$(CPU) -mthumb -Wall -fdump-rtl-expand -specs=nano.specs --specs=rdimon.specs -Wl,--start-group -lgcc -lc -lm -lrdimon -Wl,--end-group
LDFLAG = -mcpu=$(CPU) -T ./stm32_flash.ld -specs=nano.specs --specs=rdimon.specs -Wl,--start-group -lgcc -lc -lm -lrdimon -Wl,--end-group
By including the following linker flags:
LDFLAGS += --specs=rdimon.specs -lc -lrdimon
it looks like you are trying to use what is called semihosting. You are telling the linker to include system call libraries.
Semihosting is a mechanism that enables code running on an ARM target to communicate and use the Input/Output facilities on a host computer that is running a debugger.
Examples of these facilities include keyboard input, screen output, and disk I/O. For example, you can use this mechanism to enable functions in the C library, such as printf() and scanf(), to use the screen and keyboard of the host instead of having a screen and keyboard on the target system.
Since you are using open-source tools for your STM32 development (Makefile and arm-none-eabi), I am assuming you are also using OpenOCD to program your microcontroller. OpenOCD requires you to enable semihosting as well, using the following command:
arm semihosting enable
You can add the command to your OpenOCD script, making sure you terminate the configuration stage and enter the run stage with the 'init' command. Below is an example of an OpenOCD script (adapted for STM32F103):
source [find target/stm32f1x.cfg]
init
arm semihosting enable
Other solutions mentioned here, where you retarget the fputc() function to a UART interface, will also work. Semihosting will work on all recent ARM Cortex-M devices but requires some compiler and debugger configuration (see above). Retargeting fputc() to a UART interface works with any compiler, but you will have to check the pin configuration for every board.
Writing your own printf implementation is an option, and in my opinion probably the most recommended one. Get some inspiration from the standard library implementation and write your own version, catering only to your requirements. In general, you first retarget a putc function to send chars through your serial interface, then override printf using that custom putc implementation. A very simple approach is to send the string character by character through repeated calls to putc.
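A minimal sketch of the character-by-character approach. uart_putc here is a hypothetical stand-in: on real hardware it would wait for the USART transmit register to be empty and then write the byte, whereas this version only records bytes in a buffer so the sketch can run on a host. tx_log and tx_len are likewise made up for illustration.

```c
#include <stddef.h>

/* Host-side stand-in for the UART: bytes are logged instead of transmitted. */
static char tx_log[64];
static size_t tx_len;

/* Hypothetical transmit hook; replace the body with the board-specific
   register access on the target. */
static void uart_putc(char c)
{
    if (tx_len < sizeof tx_log - 1) {
        tx_log[tx_len++] = c;
        tx_log[tx_len] = '\0';
    }
}

/* Send a NUL-terminated string one character at a time. */
void uart_puts(const char *s)
{
    while (*s != '\0')
        uart_putc(*s++);
}
```

Once uart_putc talks to the real peripheral, uart_puts is enough for fixed messages, and a formatting layer can be built on top of it.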
Last but not least, you can find some lightweight printf implementations. The code size and the set of features offered by these lightweight implementations lie in between the custom-written printf function and the stock standard printf function (aka the beast). I have recently tried this Tiny Printf and am very pleased with its performance on an ARM core in terms of memory footprint and the number of execution cycles required.
-PS
Copied from my own writings sometime back.
Link: How to retarget printf() on an STM32F10x?
Try hijacking the _write function like so:
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
int _write(int file, char *ptr, int len)
{
switch (file)
{
case STDOUT_FILENO: /*stdout*/
// Send the string somewhere
break;
case STDERR_FILENO: /* stderr */
// Send the string somewhere
break;
default:
return -1;
}
return len;
}
The original printf will go through this function (depending on what libs you use of course).
Look there. This is printf from glibc. But you have a microcontroller, so you should write your own printf, in which vsnprintf writes the result into a buffer and you then send the data from the buffer to the UART. Something like:
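#include <stdarg.h>
#include <stdio.h>

void send_via_USART1(const char *s); /* board-specific, defined elsewhere */

int printf(const char *format, ...)
{
    char buffer[256];
    va_list args;
    va_start(args, format);
    /* vsnprintf is bounded, so long output cannot overflow buffer */
    int len = vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    send_via_USART1(buffer);
    return len;
}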
You can also write your own vsprintf. The standard vsprintf is very heavy, and usually only a small part of its features is used.
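To illustrate how small such a formatter can be, here is a sketch that supports only %s, %d, %c and %%. mini_vsnprintf and mini_snprintf are hypothetical names; width, precision, and all other conversions are deliberately left out, and the buffer-size handling follows the vsnprintf convention as an assumption about what a caller would want.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store one character if there is room (leaving space for the NUL),
   and always count it. */
static void put(char *out, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        out[*pos] = c;
    (*pos)++;
}

/* Minimal formatter: %s, %d, %c and %% only. Returns the length the
   full output would have had, like vsnprintf. */
int mini_vsnprintf(char *out, size_t size, const char *fmt, va_list ap)
{
    size_t pos = 0;

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put(out, size, &pos, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0')
            break; /* lone '%' at the end of the format */
        switch (*fmt) {
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                put(out, size, &pos, *s++);
            break;
        }
        case 'd': {
            int v = va_arg(ap, int);
            unsigned int u = (unsigned int)v;
            char digits[12];
            int n = 0;
            if (v < 0) {
                put(out, size, &pos, '-');
                u = 0u - u; /* magnitude, also correct for INT_MIN */
            }
            do {
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            while (n > 0)
                put(out, size, &pos, digits[--n]);
            break;
        }
        case 'c':
            put(out, size, &pos, (char)va_arg(ap, int));
            break;
        default: /* "%%" and unknown specifiers: emit the character itself */
            put(out, size, &pos, *fmt);
            break;
        }
    }
    if (size > 0)
        out[pos < size ? pos : size - 1] = '\0';
    return (int)pos;
}

/* Variadic convenience wrapper around mini_vsnprintf. */
int mini_snprintf(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int len = mini_vsnprintf(out, size, fmt, ap);
    va_end(ap);
    return len;
}
```

A custom printf can call mini_vsnprintf in place of the library formatter and pass the buffer to the UART routine, so the full stdio formatting code need not be linked in.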

line number information lost during linking in gcc

I'm using Red Hat 4.4.7-3 and gcc 4.8.3
I have code in two files(test.c and sum.c) and I compiled them separately with gcc(with debug information). In the last phase when I'm making the final output by combining both files, debug information is lost.
test.c:
#include <stdio.h>

int testsum(int a, int b); /* defined in sum.c */

int main()
{
    int a=5,b=7;
    int c=testsum(a,b);
    printf("%d + %d=%d\n",a,b,c);
    return 0;
}
sum.c:
int testsum(int a, int b)
{
return a+b;
}
I did the following:
gcc -c -g test.c -o test.o
gcc -c -g sum.c -o sum.o
gcc -g test.o sum.o -o output
When I run gdb sum.o, it shows the line number information:
(gdb) l testsum
1 int testsum(int a, int b)
2 {
3 return a+b;
4 }
but with gdb output I get no line number information:
(gdb) l testsum
No line number known for testsum.
(gdb)
I repeated the same thing on my personal laptop(gcc-4.8.real (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1) and here it is working perfectly fine. But I need the debug information in the final output on the redhat machine for some project.
Any suggestions/comments regarding obtaining the line number information in final executable would be much appreciated.
You need to compile and link with gcc -g. Perhaps you forgot the -g flag at link time.
And you surely want to compile with gcc -Wall -g, since warnings are incredibly useful.
You should run gdb on the ELF executable file, not on object files (so gdb sum.o is wrong):
gdb ./output
You should have a Makefile (see this example) and build your program using GNU make.
Perhaps the gdb on the remote Red Hat server does not accept the same DWARF format as the one on your local laptop. Check the versions of gdb. (Perhaps consider compiling on the remote server, or passing some explicit debugging option like -gdwarf-3, or whatever is appropriate for the remote gdb, to gcc on your laptop.)

Resources