gcc or javac slow at first startup

Can anyone explain why, on Linux, when I start gcc or javac after some period of inactivity, it takes a while for them to start? Subsequent invocations are much faster. Is there a way to ensure quick startup every time? (This requirement may seem strange, but it's necessary in my case.) This is on Ubuntu, by the way.

Most likely, it's the time it takes for the code pages to fault in. There are a few ways to avoid this delay if you really have to. The simplest would be to run gcc periodically. Another would be to install gcc on a RAM disk.
Another approach would be to make a list of which files are involved and then write a simple program to lock all those files into memory. To find the files, you can use something like:
strace -f gcc *rest of gcc command* 2>&1 | grep open | grep -v -- -1
Use a GCC command line that's typical of how you are using GCC.
You'll find libraries and binaries being opened in there. Make a full list in a file. Then write a program that calls mlockall(MCL_FUTURE) then reads in filenames from the file. For each file, mmap it into memory and read each byte. Then have the program just sleep forever (or until killed).
This will have the effect of forcing every page of every file into memory. You should check the total size of all these files and make sure it's not a significant fraction of the amount of memory you actually have!
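A minimal sketch of such a program (assuming the file list sits one path per line in files.txt, a name chosen here for illustration; note that locking pages needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK):

/* Sketch: pin every file listed in files.txt into RAM and keep it there. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    /* Lock all future mappings into RAM, as described above. */
    if (mlockall(MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    FILE *list = fopen("files.txt", "r");
    if (!list) {
        perror("files.txt");
        return 1;
    }

    char path[4096];
    while (fgets(path, sizeof path, list)) {
        path[strcspn(path, "\n")] = '\0';   /* strip trailing newline */
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            continue;                       /* skip unreadable entries */
        struct stat st;
        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            if (p != MAP_FAILED) {
                volatile char sink = 0;
                for (off_t i = 0; i < st.st_size; i++)
                    sink += p[i];           /* touch every page */
                /* deliberately no munmap: the mapping keeps pages resident */
            }
        }
        close(fd);
    }
    fclose(list);

    for (;;)
        pause();                            /* sleep until killed */
}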
By the way, there used to be something called a sticky bit that did something like this. If by some chance your platform supports it, just set it on all the files used. (Although it traditionally caused the files to be saved to swap, which on a modern system won't make things any faster.)

Related

Using binary breakpoints in GDB - how exact is the location?

I have some memory dumps from GCC-compiled programs on Red Hat Linux, like:
/apps/suns/runtime/bin/mardb82[0x40853b]
When I open mardb82 and set a breakpoint with break *0x40853b, it gives me a C filename/line number which seems mostly correct, but not completely.
Can I trust it, and what does it depend on? Is it sufficient if the source file in question is the same, or do the files making up the executable have to be the same?
Can I find the locations in sources in some other way?
(Max debug info and sources are present; I haven't tried omitting the sources or passing them in explicitly.)
When I open mardb82 and set a breakpoint with break *0x40853b, it gives me a C filename/line number which seems mostly correct, but not completely.
A faster way to get the filename/line:
addr2line -fe /path/to/mardb82 0x40853b
You didn't say where the ...bin/mardb82[0x40853b] line came from. Assuming it is part of a crash stack, note that the recorded address is usually the return address, i.e. the instruction after a CALL, so you may be interested in 0x40853b-5 (on x86, where a direct CALL is 5 bytes) for all but the innermost frame in the stack.
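For example, asking addr2line about both the raw and the adjusted address in one go (a sketch; the 5-byte offset assumes a plain direct CALL on x86):
addr2line -fe /path/to/mardb82 0x40853b 0x408536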
What does it depend on? Is it sufficient if the source file in question is the same, or do the files making up the executable have to be the same?
The instruction address depends on the particular executable. Any change to the source code comprising that executable, to compilation or linking flags, and so on, may cause the instructions to shift to a different address.

Why doesn't Linux cache object and/or ".so" files when using GNU Linker?

When linking executables (more than 200 of them) in a large project, I get a link rate of 0.5 executables per second, even if I ran the link stage a minute earlier. vmstat shows a disk read rate of more than 20 MB/s.
But if I pre-cache the build directory once using "tar cf /dev/null build-dir", I get a consistent link rate of 4.8 executables per second, and the disk read rate is basically zero.
Why doesn't Linux cache the object files and/or ".so" files when they are read by the GNU linker, but does so when they are read by tar? There is plenty of RAM (16 GB). Kernel version is 4.4.146, on CentOS 7.5.
It looks like an incorrect setting of vm.vfs_cache_pressure = 1000 was causing this misbehaviour. Setting it to 70 fixed the problem and restored good cache performance.
And the documentation explicitly recommends against increasing the value beyond 100. Unfortunately, the Internet is full of examples with insane values like 1000.
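For reference, inspecting and correcting the setting might look like this (a sketch; the sysctl.d file name is just an example):
sysctl vm.vfs_cache_pressure                # inspect the current value
sudo sysctl -w vm.vfs_cache_pressure=70     # set it at runtime
echo 'vm.vfs_cache_pressure = 70' | sudo tee /etc/sysctl.d/99-vfs-cache.conf   # persist across reboots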

Windows equivalent to pause() syscall?

I would like to create a minimal Windows executable that does nothing - and is minimal in size.
All I care about is keeping a process entry in the task manager.
On Linux, this is very easy (it only takes 2 assembly instructions to invoke the pause syscall). How can I achieve similar results on Windows?
I'm trying to keep the executable size to a minimum; I don't want a 10 kB executable that literally does nothing.
Is there a way to achieve this in assembly? As I mentioned, I'd rather not include huge libraries just to make the process "hang".
As Hans suggests in the comments, Sleep(INFINITE) is probably the simplest non-busy wait. It does however mean you have to kill the process with Task Manager to stop it.
Calling MessageBox followed by ExitProcess is probably less annoying if you need to start/stop this process multiple times.
You can probably get it down to 1 KiB with Visual C++ if you don't use the CRT (provide your own WinMainCRTStartup entry point, compile with /Zl, and link with a smaller section alignment).
You can get it slightly smaller with assembly but it is probably not worth it.
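For what it's worth, a minimal sketch of that CRT-free approach (assuming MSVC on x64, built with something like cl /O1 /Zl tiny.c /link /SUBSYSTEM:WINDOWS /NODEFAULTLIB kernel32.lib; the linker's default entry point name for this subsystem is WinMainCRTStartup):

#include <windows.h>

/* CRT-free entry point: no startup code, no default libraries. */
void WinMainCRTStartup(void)
{
    Sleep(INFINITE);   /* non-busy wait forever; kill via Task Manager */
    ExitProcess(0);    /* never reached, but gives a clean exit path */
}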

What do I do with a SIGFPE address in gdb?

While running an executable in gdb, I encountered the following error:
Program received signal SIGFPE, Arithmetic exception.
0x08158307 in radtra_ ()
How do I find out what line number and file 0x08158307 corresponds to, without recompiling or otherwise modifying the source? If it helps, the source language was Fortran.
How do I find out what line number and file 0x08158307 corresponds to, without recompiling or otherwise modifying the source?
That isn't easy. You could use GDB's disassemble command, look for accesses to global variables and for CALL instructions, and make a guess about where inside radtra_ you are. This gets harder the larger the routine is, the more optimizations the compiler has applied to it, and the fewer calls and global-variable accesses it performs.
If you can't guess, your only options are:
Rebuild the application adding the -g flag but leaving all other compile options unmodified, then use addr2line to translate the address to a line number. (This is how you should build the application from the start.)
If you can't rebuild the entire application, rebuild just the source file containing radtra_ (again with the same flags, but adding -g). You should be able to match the output of objdump -d radtra.o with the output of disassemble. Once you have a match, read the output of readelf -wl radtra.o or objdump -g radtra.o to associate code offsets within radtra_ with the source lines the code was generated from (see the command sketch after this list).
Hire an expert to guess for you. This wouldn't be cheap, as people skilled in this kind of reverse engineering are usually gainfully employed and value their time.
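For the second option, the rebuild-and-match step might look like this (a sketch, assuming gfortran and a source file named radtra.f; reuse whatever compiler and flags your build actually uses, plus -g):
gfortran -O2 -g -c radtra.f -o radtra.o   # same flags as the original build, plus -g
objdump -d radtra.o                       # match this against GDB's "disassemble radtra_"
readelf -wl radtra.o                      # line table: code offsets to source lines
objdump -g radtra.o                       # alternative view of the same debug info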

How to debug potential CPU/RAM errors in Bash script on Linux

I have a relatively simple bash script that reads from a set of static input files, stores the input in bash variables and then does a bunch of processing over said input by calling out to external scripts (e.g. written in Python, Go, other bash scripts etc.) and using the intermediate results.
Lately I have been experiencing an intermittent problem where a single character seems to be getting altered somewhere during the processing which then causes subsequent errors. Specifically, a lot of the processing I'm doing involves slicing up a list of comma-separated records, and one of the values on each line is a unix timestamp, e.g. 1354245000.
What seems to be happening is that occasionally one of these values will get altered slightly, so I end up with a timestamp like 13542458=2 or 13542458>2 or 13542458;2 coming out of one of the intermediate scripts. This then subsequently gets fed into another script, which throws an exception when it tries to parse the value to an integer.
In the title of this question, I've suggested that this might be a CPU/RAM error. I know the general folly of blaming errors on low-level things like hardware or compilers, but the nature of this particular error makes me think it may be possible, for the following reasons:
The input files are the same on each invocation of the script, and the script only fails on some invocations.
I cannot think of any sources of randomness in the source code prior to where the script is breaking. It's basically just slicing and dicing csv input.
I cannot think of any sources of concurrency in the source code -- even the Go scripts aren't actually written to run anything concurrently.
This problem has only arisen in the last week or so. Prior to this time, this error would never occur.
While I haven't documented every erroneous character, they often seem to be quite close in the ASCII table to numeric values (=, >, ;, etc.). That said, I suppose the Hamming distance between two characters that are far apart in the table can still be small if the change is to a single high-order bit.
The script often breaks at a different stage on different runs. i.e. I have a number of separate Python scripts, and sometimes it'll make it past one script and then the error will be induced in another. Other times it'll be induced on an earlier script.
What I'd like to know is, is there any methodical way to either confirm or rule out a hardware error for this problem? Or if it is a hardware problem, is it possibly undetectable by the operating system?
A bit of further info on the machine:
Linux 64-bit, Ubuntu 12.04
Intel i7 processor
16GB DDR3 RAM
I'm hoping someone can either point me to a reliable way to verify whether the hardware is to blame or otherwise a sound reason as to what else might be the cause.
Try booting into Memtest to check your memory.
While it is highly unlikely that it will be hardware, if you have exhausted your standard software debugging as suggested by @OliCharlesworth, here is an outline of a hardware error investigation:
(1) Check your log area for any MCE (machine check exception) logs. If you find any, either in your log area (syslog) or sometimes in the present working dir or /, you have a hardware failure.
(2) Check your log area for disk errors, e.g.:
smartd[3963]: Device: /dev/sda [SAT], 34 Currently unreadable (pending) sectors
(3) Check your drive integrity, e.g. (as root) run smartctl -a /dev/sda; if you see any abnormality, run:
smartctl -t short /dev/sda (change the drive as required)
(4) Download, install, and boot into memtest86 (http://www.memtest86.com/download.htm) and run the complete test.
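A compact version of checks (1) through (3) (a sketch; log paths and device names vary by distribution):
dmesg | grep -i -e mce -e 'machine check'   # recent machine check exceptions
grep -i mce /var/log/syslog                 # older ones in the system log
sudo smartctl -a /dev/sda                   # full SMART health report
sudo smartctl -t short /dev/sda             # start a short self-test; re-run -a for results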
If your CPU/motherboard has thrown no MCEs, you have no disk errors, your drive tests OK with smartctl, and you have no memory errors with memtest86, then recheck the software debugging. While other hardware errors can still be present (bad capacitors, etc.), the likelihood at this point is software. Good luck.
