I'm trying to debug code generated by Bison + Flex (what a joy!). It segfaults so badly that there isn't even stack information available to gdb. Is there any way to make this combination generate code that's more debuggable?
Note that I'm trying to compile a reentrant lexer and parser (which is in itself a huge pain).
Below is the program that tries to use the yyparse:
int main(int argc, char** argv) {
int res;
if (argc == 2) {
yyscan_t yyscanner;
res = yylex_init(&yyscanner);
if (res != 0) {
fprintf(stderr, "Couldn't initialize scanner\n");
return res;
}
FILE* h = fopen(argv[1], "rb");
if (h == NULL) {
fprintf(stderr, "Couldn't open: %s\n", argv[1]);
return errno;
}
yyset_in(h, yyscanner);
fprintf(stderr, "Scanner set\n");
res = yyparse(&yyscanner);
fprintf(stderr, "Parsed\n");
yylex_destroy(&yyscanner);
return res;
}
if (argc > 2) {
fprintf(stderr, "Wrong number of arguments\n");
}
print_usage();
return 1;
}
Trying to run this gives:
(gdb) r
Starting program: /.../program
[Inferior 1 (process 3292) exited with code 01]
Note 2: I'm passing -d to flex and -t to bison.
After shuffling the code around I was able to get a backtrace. But it appears that passing -t has zero effect, as does the %debug directive in the *.y file. The only way to get traces is to set yydebug = 1 in your code.
You are clobbering the stack by passing the address of yyscanner instead of its value to yyparse. Once the stack has been overwritten in that fashion, even gdb will be unable to provide accurate backtraces.
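To illustrate the difference, here is a minimal sketch that does not use flex or Bison at all. The names scan_handle_t, scan_init, scan_use and scan_destroy are hypothetical stand-ins that only mimic the shape of yyscan_t, yylex_init, yyparse and yylex_destroy: the init function receives the address of the handle, while every later call receives the handle itself.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an opaque scanner handle such as yyscan_t. */
typedef void *scan_handle_t;

struct scan_state {
    int magic; /* marker used to check that we were given real state */
};

/* Same shape as an init function: takes the ADDRESS of the handle and fills it in. */
int scan_init(scan_handle_t *out)
{
    struct scan_state *s = malloc(sizeof *s);
    if (s == NULL)
        return 1;
    s->magic = 0x5CA9;
    *out = s;
    return 0;
}

/* Same shape as a consumer: takes the handle VALUE.
 * Passing &handle here would make the callee treat the caller's
 * stack slot as if it were the scanner state. */
int scan_use(scan_handle_t h)
{
    struct scan_state *s = h;
    return s->magic == 0x5CA9 ? 0 : 1;
}

/* Also takes the handle value, not its address. */
void scan_destroy(scan_handle_t h)
{
    free(h);
}
```

In the program from the question, the analogous calls would presumably be yyparse(yyscanner) and yylex_destroy(yyscanner), assuming the generated parser was declared to take a yyscan_t parameter.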
The -t option and the %debug directive cause bison to emit the code necessary to perform debugging traces. (This makes the parser code somewhat larger and a tiny bit slower, so it is not enabled by default.) That is necessary for tracing to work, but you still have to request traces by setting yydebug to a non-zero value.
This is mentioned right at the beginning of the Bison manual section on tracing: (emphasis added)
8.4.1 Enabling Traces
There are several means to enable compilation of trace facilities
And slightly later on:
Once you have compiled the program with trace facilities, the way to request a trace is to store a nonzero value in the variable yydebug. You can do this by making the C code do it (in main, perhaps), or you can alter the value with a C debugger.
Unless you are working in an extremely resource-constrained environment, I suggest you always use the -t option, as do the Bison authors:
We suggest that you always enable the trace option so that debugging is always possible.
Here is a test program I wrote:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
int main( int argc, const char* argv[] )
{
const char name[1024] = "/dev/shm/test_file";
off_t len = atol(argv[argc - 1]);
char buf[1024];
FILE * f = fopen(name, "w");
for (int i = 0; i < len; i++) {
int ret = fwrite(buf, 1024, 1, f);
if (ret != 1) {
printf("disk full\n");
}
}
if ( fclose(f) != 0)
printf("failed to close\n");
return 0;
}
I filled /dev/shm until it was almost full:
tmpfs 36G 36G 92K 100% /dev/shm
and ran
$ ./a.out 93
failed to close
My glibc:
$ /lib/libc.so.6
GNU C Library stable release version 2.12, by Roland McGrath et al.
the kernel version is 2.6.32-642.13.1.el6.x86_64
I understand that this behavior is caused by fwrite trying to cache the data in memory. (I tried setvbuf(NULL...) and fwrite immediately returned failure.) But this seems a little different from the definition:
The fwrite() function shall return the number of elements successfully
written, which may be less than nitems if a write error is
encountered. If size or nitems is 0, fwrite() shall return 0 and the
state of the stream remains unchanged. Otherwise, if a write error
occurs, the error indicator for the stream shall be set, [CX] [Option
Start] and errno shall be set to indicate the error. [Option End]
The data was not successfully written to disk, yet the return value is 1 and errno is not set.
In this test case, fclose catches the failure. But it could be caught even by an ftell call, which is quite confusing.
I am wondering if this happens to all versions of glibc and would this be considered a bug.
The data was not successfully written to disk
The standard doesn't talk about the disk. It talks about data being successfully written to the stream (which it has been).
I am wondering if this happens to all versions of glibc
Most likely.
and would this be considered a bug.
It's a bug in your interpretation of the requirements on fwrite.
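As a rough sketch of what this means in practice: because stdio buffers the data, an error may only surface when the buffer is flushed, so fflush and fclose need to be checked as well as fwrite. The helper name checked_write and its return codes are made up for this illustration.

```c
#include <stdio.h>

/* Write len bytes to path and report the first stage at which an error shows up.
 * Returns 0 on success, -1 if the file cannot be opened,
 * 1 if fwrite reports a short write, 2 if fflush fails, 3 if fclose fails. */
int checked_write(const char *path, const void *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    /* With a buffered stream this usually only copies into the stdio buffer. */
    if (fwrite(data, 1, len, f) != len) {
        fclose(f);
        return 1;
    }

    /* The buffer is handed to the operating system here, so this is
     * where an error such as a full filesystem can become visible. */
    if (fflush(f) != 0) {
        fclose(f);
        return 2;
    }

    /* fclose can still fail, so its result is checked too. */
    if (fclose(f) != 0)
        return 3;

    return 0;
}
```

A caller that cares about the data reaching the file would treat any non-zero return as a failed write, not just the fwrite case.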
I am trying to run sample rsa/dsa code using libtomcrypt.
I first installed LibTomMath with make install; as a result, the following files were created:
/usr/lib/libtommath.a
/usr/include/tommath.h
After that I installed libtomcrypt with LibTomMath as external library
CFLAGS="-DLTM_DESC -DUSE_LTM -I/usr/include" EXTRALIBS="/usr/lib/libtommath.a " make install
As a result, the following file was created:
/usr/lib/libtomcrypt.a
I get no errors when running the following command:
CFLAGS="-DLTM_DESC -DUSE_LTM -I/usr/include" EXTRALIBS="/usr/lib/libtommath.a " make test
I went through the documents libtomcrypt_installation and libtomcrypt_resolved and compiled successfully using
gcc -DLTM_DESC rsa_make_key_example.c -o rsa -ltomcrypt
or
gcc rsa_make_key_example.c -o rsa -ltomcrypt
There is no compile error. However, when I try to run it, I get the following error:
./rsa
LTC_ARGCHK 'ltc_mp.name != NULL' failure on line 34 of file src/pk/rsa/rsa_make_key.c
Aborted
Here is my sample rsa code
#include <tomcrypt.h>
#include <stdio.h>
int main(void) {
# ifdef USE_LTM
ltc_mp = ltm_desc;
# elif defined (USE_TFM)
ltc_mp = tfm_desc;
# endif
rsa_key key;
int err;
register_prng(&sprng_desc);
if ((err = rsa_make_key(NULL, find_prng("sprng"), 1024/8, 65537,&key)) != CRYPT_OK) {
printf("make_key error: %s\n", error_to_string(err));
return -1;
}
/* use the key ... */
return 0;
}
Here is my sample dsa code
#include <tomcrypt.h>
#include <stdio.h>
int main(void) {
# ifdef USE_LTM
ltc_mp = ltm_desc;
# elif defined (USE_TFM)
ltc_mp = tfm_desc;
# endif
int err;
register_prng(&sprng_desc);
dsa_key key;
if ((err = dsa_make_key(NULL, find_prng("sprng"), 20, 128,&key)) != CRYPT_OK) {
printf("make_key error: %s\n", error_to_string(err));
return -1;
}
/* use the key ... */
return 0;
}
Here is how I compiled it successfully:
gcc dsa_make_key_example.c -o dsa -ltomcrypt
When I try to run the code, I get the following error:
./dsa
segmentation fault
EDIT 1:
I investigated further and found the reason for segmentation fault
#ifdef LTC_MPI
#include <stdarg.h>
int ltc_init_multi(void **a, ...)
{
...
...
if (mp_init(cur) != CRYPT_OK) ---> This line causes segmentation fault
Where am I going wrong? How can I resolve this problem so that these programs run successfully?
I am using Linux and gcc. Any help or link will be highly appreciated. Thanks in advance.
It's been a year or so since this was asked, but I have some component of an answer, and a workaround.
The reason mp_init fails is that the math descriptor is uninitialized. mp_init is defined as
#define mp_init(a) ltc_mp.init(a)
where ltc_mp is a global struct (of type ltc_math_descriptor) that holds pointers to the math routines.
There are several implementations of the math routines available, and a user can choose which they want. For whatever reason, there does not seem to be a default math implementation chosen for certain builds of libtomcrypt. Thus, the init member of ltc_mp is null, and we get the SIGSEGV.
Here is a manual workaround:
You can make your desired ltc_math_descriptor struct available to your main() routine by defining one of the following macros
LTM_DESC -- built-in math lib
TFM_DESC -- an external fast math package
GMP_DESC -- presumably a GNU MultiPrecision implementation?
before #include <tomcrypt.h> (or by using -D on the command line).
Whichever you choose, a corresponding object will be declared:
extern const ltc_math_descriptor ltm_desc;
extern const ltc_math_descriptor tfm_desc;
extern const ltc_math_descriptor gmp_desc;
To use it, manually copy it to the global math descriptor:
E.g., in my case, for the local math implementation,
ltc_mp = ltm_desc;
Now libtomcrypt works.
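The mechanism can be sketched without libtomcrypt itself. Everything below is a hypothetical, much-reduced analogue: math_descriptor stands in for ltc_math_descriptor, active_mp for ltc_mp, and demo_desc for ltm_desc. It only shows why a zero-initialized global descriptor leads to a call through a null function pointer until one implementation is copied into it.

```c
#include <stddef.h>

/* Hypothetical, reduced analogue of a descriptor holding math routines. */
typedef struct {
    const char *name;
    int (*init)(int *a);
} math_descriptor;

/* One concrete implementation of the init routine. */
static int demo_init(int *a)
{
    *a = 0;
    return 0;
}

/* Analogue of a provided descriptor such as ltm_desc. */
static const math_descriptor demo_desc = { "demo", demo_init };

/* Analogue of the global descriptor: zero-initialized,
 * so name and init are both NULL until something is copied in. */
math_descriptor active_mp;

/* Guarded call: returns -1 instead of jumping through a NULL pointer.
 * An unguarded active_mp.init(a) would crash at this point. */
int safe_mp_init(int *a)
{
    if (active_mp.name == NULL || active_mp.init == NULL)
        return -1;
    return active_mp.init(a);
}

/* Analogue of the manual copy into the global descriptor. */
void select_demo_math(void)
{
    active_mp = demo_desc;
}
```

The name check in safe_mp_init mirrors the kind of argument check that produced the LTC_ARGCHK message in the question, while the unguarded path corresponds to the segmentation fault.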
My C++ 2011 main() program for DiGSE is:
int main(int argc, char* argv[]) {
. . .
return EXIT_SUCCESS;
} // this } DOES match the opening { above
It compiles and executes correctly. A print statement immediately before the return outputs normally. However, a Windows 7.1 notification pops up saying "DiGSE.exe has stopped working." It then graciously offers to search the web for a solution.
I tried replacing the return with return 0; exit(0); and nothing so execution falls out the bottom (which, as I understand, is acceptable). However, in all cases I still get the pop-up.
What do I do to get the main() to exit gracefully?
DiGSE is just the name of the Windows 7 executable compiled on MinGW 4.9.2. The "full" program is already stripped down:
int main(int argc, char* argv[]) {
try {
DiGSE::log_init(DiGSE::log_dest_T::console_dest, "dig.log", true,
DiGSE::log_lvl_T::trace_lvl);
}//try
catch (const std::exception& ex) {
std::cerr << FMSG("\n"
"Executing '%1%' raised this exception:\n"
" %2%", % DiGSE::Partition::productName()
% ex.what())
<< std::endl;
return EXIT_FAILURE;
}//exception
catch (...) {
std::cerr << FMSG("\n"
"Executing '%1%' instance raised an unknown exception.",
% DiGSE::Partition::productName())
<< std::endl;
return EXIT_FAILURE;
}//exception
L_INFO(FMSG("'%1% v%2%' terminated normally.",
% DiGSE::Partition::productName()
% DiGSE::Partition::productVersion()))
return EXIT_SUCCESS;
}//main()
The L_INFO() is a logging call, which outputs as it should. The log_init() at the top initializes the log. Commenting out log_init() and L_INFO() has the same result as originally reported.
Program received signal SIGSEGV, Segmentation fault.
0x000000006fc8da9d in libstdc++-6!_ZNSo6sentryC1ERSo ()
from D:\Program Files\mingw-w64\x86_64-4.9.2-posix-seh-rt_v3-rev0\mingw64\bin
\libstdc++-6.dll
This is what gdb returns while main() is exiting. It does this even with the log_init() and L_INFO() commented out. So the problem is probably in one of the globals of something it's linked to.
It is completely possible for a program to crash after the end of main -- the program isn't over yet. The following items execute after main() returns:
Registered atexit handlers
Destructors for main()'s own automatic variables, and all variables with static storage duration (globals and function-static) (C++ only)
DllMain(PROCESS_DETACH) code in all dynamic libraries you are using (Windows only)
In addition to that, various events can occur outside your program and cause failures which you might mistake for a failure of your program (especially if your program forks or spawns copies of itself):
SIGCHLD is raised (on *nix). Process handles become signaled and cause wait functions to return (on Windows)
All open handles (file descriptors) get abandoned, and the close handler in the driver is invoked
The other end of connections (pipes, sockets) shift into a disconnected state (reads return 0, writes fail, on *nix SIGHUP may be raised)
I suggest attaching a debugger, setting a breakpoint at the end of main, and then single-stepping through the cleanup code to find out where the failure occurs. Divide and conquer may also be helpful (cut out some global variables, or all usage of a particular DLL).
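A small sketch of the first item in the list above, written in plain C so that it also compiles as C++: a handler registered with atexit runs only after main() has returned, so a fault inside it looks like a crash after the program has finished. The names cleanup_handler and register_cleanup are illustrative only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Runs during normal process termination, after main() has returned
 * (or after exit() is called). A bug in here would crash the process
 * even though main() completed normally. */
static void cleanup_handler(void)
{
    fputs("cleanup_handler: running after main() returned\n", stderr);
}

/* Registers the handler; atexit returns 0 on success. */
int register_cleanup(void)
{
    return atexit(cleanup_handler);
}
```

Calling register_cleanup() early in main() and placing a breakpoint on cleanup_handler is one way to watch the shutdown sequence in a debugger.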
As this is my first post to stackoverflow I want to thank you all for your valuable posts that helped me a lot in the past.
I use MinGW (gcc 4.4.0) on Windows-7(64) - more specifically I use Nokia Qt + MinGW but Qt is not involved in my Question.
I need to find the address and -more important- the length of specific functions of my application at runtime, in order to encode/decode these functions and implement a software protection system.
I already found a way to compute the length of a function, based on the assumption that static functions placed one after another in a source file are also placed sequentially in the compiled object file and subsequently in memory.
Unfortunately this is true only if the whole CPP file is compiled with option: "g++ -O0" (optimization level = 0).
If I compile it with "g++ -O2" (which is the default for my project) the compiler seems to relocate some of the functions and as a result the computed function length seems to be both incorrect and negative(!).
This is happening even if I put a "#pragma GCC optimize 0" line in the source file,
which is supposed to be the equivalent of a "g++ -O0" command line option.
I suppose that "g++ -O2" instructs the compiler to perform some global file-level optimization (some function relocation?) which is not avoided by using the #pragma directive.
Do you have any idea how to prevent this, without having to compile the whole file with -O0 option?
OR: Do you know of any other method to find the length of a function at runtime?
I prepared a small example for you, with the results for different compilation options, to highlight the case.
The Source:
// ===================================================================
// test.cpp
//
// Intention: To find the addr and length of a function at runtime
// Problem: The application output is correct when compiled with: "g++ -O0"
// but it's erroneous when compiled with "g++ -O2"
// (although a directive "#pragma GCC optimize 0" is present)
// ===================================================================
#include <stdio.h>
#include <math.h>
#pragma GCC optimize 0
static int test_01(int p1)
{
putchar('a');
putchar('\n');
return 1;
}
static int test_02(int p1)
{
putchar('b');
putchar('b');
putchar('\n');
return 2;
}
static int test_03(int p1)
{
putchar('c');
putchar('\n');
return 3;
}
static int test_04(int p1)
{
putchar('d');
putchar('\n');
return 4;
}
// Print a HexDump of a specific address and length
void HexDump(void *startAddr, long len)
{
unsigned char *buf = (unsigned char *)startAddr;
printf("addr:%ld, len:%ld\n", (long )startAddr, len);
len = (long )fabs(len);
while (len)
{
printf("%02x.", *buf);
buf++;
len--;
}
printf("\n");
}
int main(int argc, char *argv[])
{
printf("======================\n");
long fun_len = (long )test_02 - (long )test_01;
HexDump((void *)test_01, fun_len);
printf("======================\n");
fun_len = (long )test_03 - (long )test_02;
HexDump((void *)test_02, fun_len);
printf("======================\n");
fun_len = (long )test_04 - (long )test_03;
HexDump((void *)test_03, fun_len);
printf("Test End\n");
getchar();
// Just a trick to block optimizer from eliminating test_xx() functions as unused
if (argc > 1)
{
test_01(1);
test_02(2);
test_03(3);
test_04(4);
}
}
The (correct) Output when compiled with "g++ -O0":
[note the 'c3' byte (= assembly 'ret') at the end of all functions]
======================
addr:4199344, len:37
55.89.e5.83.ec.18.c7.04.24.61.00.00.00.e8.4e.62.00.00.c7.04.24.0a.00.00.00.e8.42
.62.00.00.b8.01.00.00.00.c9.c3.
======================
addr:4199381, len:49
55.89.e5.83.ec.18.c7.04.24.62.00.00.00.e8.29.62.00.00.c7.04.24.62.00.00.00.e8.1d
.62.00.00.c7.04.24.0a.00.00.00.e8.11.62.00.00.b8.02.00.00.00.c9.c3.
======================
addr:4199430, len:37
55.89.e5.83.ec.18.c7.04.24.63.00.00.00.e8.f8.61.00.00.c7.04.24.0a.00.00.00.e8.ec
.61.00.00.b8.03.00.00.00.c9.c3.
Test End
The erroneous Output when compiled with "g++ -O2":
(a) function test_01 addr & len seem correct
(b) functions test_02, test_03 have negative lengths,
and fun. test_02 length is also incorrect.
======================
addr:4199416, len:36
83.ec.1c.c7.04.24.61.00.00.00.e8.c5.61.00.00.c7.04.24.0a.00.00.00.e8.b9.61.00.00
.b8.01.00.00.00.83.c4.1c.c3.
======================
addr:4199452, len:-72
83.ec.1c.c7.04.24.62.00.00.00.e8.a1.61.00.00.c7.04.24.62.00.00.00.e8.95.61.00.00
.c7.04.24.0a.00.00.00.e8.89.61.00.00.b8.02.00.00.00.83.c4.1c.c3.57.56.53.83.ec.2
0.8b.5c.24.34.8b.7c.24.30.89.5c.24.08.89.7c.24.04.c7.04.
======================
addr:4199380, len:-36
83.ec.1c.c7.04.24.63.00.00.00.e8.e9.61.00.00.c7.04.24.0a.00.00.00.e8.dd.61.00.00
.b8.03.00.00.00.83.c4.1c.c3.
Test End
This is happening even if I put a "#pragma GCC optimize 0" line in the source file, which is supposed to be the equivalent of a "g++ -O0" command line option.
I don't believe this is true: it is supposed to be the equivalent of attaching __attribute__((optimize(0))) to subsequently defined functions, which causes those functions to be compiled with a different optimisation level. But this does not affect what goes on at the top level, whereas the command line option does.
If you really must do horrible things that rely on top level ordering, try the -fno-toplevel-reorder option. And I suspect that it would be a good idea to add __attribute__((noinline)) to the functions in question as well.
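A sketch of how these suggestions might be combined, assuming GCC. The attributes are GCC extensions and the KEEP_AS_IS macro is a name invented here. Even with them in place, nothing in the language guarantees that the distance between two function addresses equals a function's length, so the result should be verified on the actual toolchain.

```c
/* Possible build line (an assumption, adjust to your project):
 *   gcc -O2 -fno-toplevel-reorder test.c
 */
#include <stdio.h>

/* GCC-specific attributes; on other compilers the macro expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_AS_IS __attribute__((noinline, optimize("O0")))
#else
#define KEEP_AS_IS
#endif

/* Compiled without optimisation and never inlined into callers. */
KEEP_AS_IS int test_01(int p1)
{
    (void)p1;
    putchar('a');
    putchar('\n');
    return 1;
}

/* Defined directly after test_01; with -fno-toplevel-reorder GCC is
 * asked to emit top-level functions in the order they appear. */
KEEP_AS_IS int test_02(int p1)
{
    (void)p1;
    putchar('b');
    putchar('b');
    putchar('\n');
    return 2;
}
```

The functions are left non-static here only to keep the sketch easy to link against; the question's static functions can take the same attributes.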
In OSX during C++ program compilation with g++ I use
LD_FLAGS= -Wl,-stack_size,0x100000000
but in SUSE Linux I constantly get errors like:
x86_64-suse-linux/bin/ld: unrecognized option '--stack'
and similar.
I know that it is possible to use
ulimit -s unlimited
but this is not ideal, since an individual user is not always allowed to do that.
How can I increase the stack size in Linux with GCC for a single application?
You can set the stack size programmatically with setrlimit, e.g.
#include <stdio.h>
#include <sys/resource.h>
int main (int argc, char **argv)
{
const rlim_t kStackSize = 16 * 1024 * 1024; // min stack size = 16 MB
struct rlimit rl;
int result;
result = getrlimit(RLIMIT_STACK, &rl);
if (result == 0)
{
if (rl.rlim_cur < kStackSize)
{
rl.rlim_cur = kStackSize;
result = setrlimit(RLIMIT_STACK, &rl);
if (result != 0)
{
fprintf(stderr, "setrlimit returned result = %d\n", result);
}
}
}
// ...
return 0;
}
Note: even when using this method to increase stack size you should not declare large local variables in main() itself, since you may well get a stack overflow as soon as you enter main(), before the getrlimit/setrlimit code has had a chance to change the stack size. Any large local variables should therefore be defined only in functions which are subsequently called from main(), after the stack size has successfully been increased.
Instead of stack_size, use --stack like so:
gcc -Wl,--stack,4194304 -o program program.c
This example should give you 4 MB of stack space. Works on MinGW's GCC, but as the manpage says, "This option is specific to the i386 PE targeted port of the linker" (i.e. only works for outputting Windows binaries). Seems like there isn't an option for ELF binaries.
This is an old topic, but none of the flags suggested here worked for me. Anyway, I found out that -Wl,-z,stack-size=4194304 (example for 4 MB) seems to work.
Consider using the -fsplit-stack option: https://gcc.gnu.org/wiki/SplitStacks
Change it with the ulimit bash builtin, or setrlimit(), or at login with PAM (pam_limits.so).
It's a settable user resource limit; see RLIMIT_STACK in setrlimit(2).
http://bytes.com/topic/c/answers/221976-enlarge-stack-size-gcc