Descriptions of lttng kernel events - lttng

I have just started to explore using LTTng to diagnose a network performance problem and it looks like a great tool to use for this. I know I can get a list of events I can capture with lttng list -k but I can't find any documentation on what the events mean.
For example since I am interested in networking performance of an application it looks like I am interested in the events:
net_dev_xmit (loglevel: TRACE_EMERG (0)) (type: tracepoint)
net_dev_queue (loglevel: TRACE_EMERG (0)) (type: tracepoint)
netif_receive_skb (loglevel: TRACE_EMERG (0)) (type: tracepoint)
netif_rx (loglevel: TRACE_EMERG (0)) (type: tracepoint)
I can pretty much intuit what the difference between net_dev_xmit and net_dev_queue is, but what does netif_recieve_skb mean?
This is with Ubuntu 12.04 LTS.
If it turns out that the documentation is just the kernel source code then so be it -- but I didn't want to dig into that if a reference for this was somewhere around and i missed it.

I don't know if you are still interested, but just for the sake of completion, netif_recieve_skb tracepoint is in the netif_recieve_skb() function in the kernel. It's basically used to notify the kernel that a packet has being received and is in the socket buffer. It is similar to netif_rx() but is supposed to be used only by NAPI-compliant drivers. What the tracepoint in the function records can be seen here. Basically it's just some relevant stuff from the sk_buff struct.

Related

How to implement new instruction in linux KVM at unused x86 opcode

As a part of understanding virtualization, I am trying to extend the support of KVM and defin a new instruction. The instruction will use previously unused opcodes.
ref- ref.x86asm.net/coder32.html.
Now, lets say an instruction like 'CPUID' (which causes a vm-exit) and i want to add a new instruction, say - 'NEWCPUID', which is similar to 'CPUID' in priviledge and is trapped by hypervisor, but will differ in the implementation.
After going through some online resources, I was able to understand how to define new system calls, but I am not sure about which all files in linux source code do I need to add the code for NEWCPUID? Is there a better way than only relying on 'find' command?
I am facing below challenges:
1. Which all places in linux source code do I need to add code?
2. Not sure how this new instruction can be mapped to a previously unused opcode?
As I am completely new to this field and willing to learn this, can someone explain me in short how to go about this task? I will need the right direction to achieve this. If there is a reference/tutorial/blog describing the process, it will be of great help!
Here are answers to some of your questions:
... but I am not sure about which all files in linux source code do I need to add the code for NEWCPUID?
A - The right place to add emulation for KVM is arch/x86/kvm/emulate.c. Take a look at how opcode_table[] is defined and the hooks to the functions that they execute. The basic idea is the guest executes and undefined instruction such as "db 0xunused"; this is results in an exit since the instruction is undefined. In KVM, you look at the rip from the VMCS/VMCB and determine if it's an instruction KVM knows about (such as NEWCPUID) and then KVM calls x86_emulate_instruction().
...Is there a better way than only relying on 'find' command?
A - Yes, pick an example system call and then use a symbol cross reference such as cscope.
...n me in short how to go about this task?
A - As I mentioned in 1, first of all find a way for the guest to attempt to execute this unused opcode (such as the db trick). I think the assembler will trying to reject unknown opcodes. So, that the first step. Second, check whether your instruction causes an vmexit(). For this, you can use tracing. Tracing emits a lot of output, so, you have to use some filter options. If tracing is overwhelming, simply printk something in vmx_handle_exit (vmx.c). Finally, find a way to hook to your custom function from here. KVM already has handle_exception() to handle guest exceptions; that would be a good place to insert your custom function. See how this function calls emulate_instruction to emulate an exception to be injected to the guest.
I have deliberately skipped some of the questions since I consider them essential to figure out yourself in the process of learning. BTW, I don't think this may not be the best way to understand virtualization. A better way might be to write your own userspace hypervisor that utlizes kvm services via /dev/kvm or maybe just a standalone hypervisor.

How to debug copy-on-write?

We have some code that relies on extensive usage of fork. We started to hit the performance problems and one of our hypothesis is that we do have a lot of speed wasted when copy-on-write happens in the forked processes.
Is there a way to specifically detect when and how copy-and-write happens, to have a detailed insight into this process.
My platform is OSX but more general information is also appreciated.
There are a few ways to get this info on OS X. If you're satisfied with just watching information about copy-on-write behavior from the command-line, you can use the vm_stat tool with an interval. E.g., vm_stat 0.5 will print full statistics twice per second. One of the columns is the number of copy-on-write faults.
If you'd like to gather specific information in a more detailed way, but still from outside the actual running process, you can use the Instruments application that comes with OS X. This includes a set of tools for gathering information about a running process, the most useful of which for your case are likely to be the VM Tracker, Virtual Memory Trace, or Shared Memory instruments. These capture lots of useful information over the lifetime of a process. The application is not super intuitive, but it will do what you need.
If you'd like detailed information in-process, I think you'll need to use the (poorly documented) VM statistics API. You can request that the kernel fill a vm_statistics struct using the host_statistics routine. For example, running this code:
mach_msg_type_number_t count = HOST_VM_INFO_COUNT;
vm_statistics_data_t vmstats;
kern_return_t host_statistics(mach_host_self(), HOST_VM_INFO, (host_info_t) &vmstats, &count);
will fill the vmstats structure with information such as cow_faults, which gives the number of faults triggered by copy-on-write behavior. Check out the headers /usr/include/mach/vm_*, which declare the types and routines for gathering this information.

How to debug deadlock problems in kernel

I have a buggy kernel module which I am trying to fix. Basically when this module is running, it will cause other tasks to hang for more than 120 seconds. Since almost all the hung tasks are waiting for either mm->mmap_sem or some file system locks (i_node->i_mutex) I suspect that it has something to do with this module doesn't not grab the mmap_sem lock and some file-system level lock (like inote->i_mutex) in order, which could have caused some deadlock problem. Since my module does not try to grab those locks directly though, I assume it is some function I called that grab those locks. And now I am trying to figure out which function calls in my module is causing the problem.
However, I am having a hard time debugging it for the following reasons:
I don't know exactly which lock the hung task is trying to grab. I got the call trace of the hung task, and know at what point it hangs. Kernel also gives me some kind of information like:
"1 lock held by automount/3115:
0: (&type->i_mutex_dir_key#2){--..}, at: [] real_lookup+0x24/0xc5".
However, I want to know exact which lock a task holds, and exactly which lock it is trying to acquire in order to figure out the problem. As kernel doesn't provide the arguments of function calls along with the call trace, I find this information difficult to obtain.
I am using gdb andvmware to debug this, which allows me to set breakpoints, step into a function and such. However, as which task and at what point that task will hang is kind of un-deterministic, I don't really know where to set breakpoints and inspect. It will be great if I can somehow "attach" to the task which kernel reported to be blocked for more than 120 secs, and get some information about it.
So my questions are as following:
Where can I get, along with the call trace, the arguments of the functions in the call trace, in order to figure out exactly which lock a task is trying to grab.
Is it possible for me to use gdb to somehow "attach" to a hung task in a kernel? If not, is there some way for me to at least examine the data structure which represents that task? As I am having a hard time examining all the global data structure in kernel too. GDB always complains that "can't access memory 0x3200" or something similar.
It would also be very helpful if I can print out for every task in the kernel, what locks they are currently holding. Is there a way to do it?
Thank you very much!
Not answering your question directly, but hopefully this is more helpful - the Linux kernel has a built heavy duty lock validator called lockdep. Turn it on and let it run. If you have a lock order problem, it is likely to catch it and give you a detailed report.
See: http://www.mjmwired.net/kernel/Documentation/lockdep-design.txt
The kernel feature lockdep can help you in this regard. Check out my post on how to use it in your kernel: How to use lockdep feature in linux kernel for deadlock detection
Let me try.
1) Try KGDB
2) You mean a hung process?
http://www.ibm.com/developerworks/aix/library/au-unix-strace.html
3) Try the lsof package maybe.

Snoop interprocess communications

Has anyone tried to create a log file of interprocess communications? Could someone give me a little advice on the best way to achieve this?
The question is not quite clear, and comments make it less clear, but anyway...
The two things to try first are ipcs and strace -e trace=ipc.
If you want to log all IPC(seems very intensive), you should consider instrumentation.
Their are a lot of good tools for this, check out PIN in perticular, this section of the manual;
In this example, we show how to do
more selective instrumentation by
examining the instructions. This tool
generates a trace of all memory
addresses referenced by a program.
This is also useful for debugging and
for simulating a data cache in a
processor.
If your doing some heavy weight tuning and analysis, check out TAU (Tuning and analysis utilitiy).
Communication to a kernel driver can take many forms. There is usually a special device file for communication, or there can be a special socket type, like NETLINK. If you are lucky, there's a character device to which read() and write() are the sole means of interaction - if that's the case then those calls are easy to intercept with a variety of methods. If you are unlucky, many things are done with ioctls or something even more difficult.
However, running 'strace' on the program using the kernel driver to communicate can reveal just about all it does - though 'ltrace' might be more readable if there happens to be libraries the program uses for communication. By tuning the arguments to 'strace', you can probably get a dump which contains just the information you need:
First, just eyeball the calls and try to figure out the means of kernel communication
Then, add filters to strace call to log only the kernel communication calls
Finally, make sure strace logs the full strings of all calls, so you don't have to deal with truncated data
The answers which point to IPC debugging probably are not relevant, as communicating with the kernel almost never has anything to do with IPC (atleast not the different UNIX IPC facilities).

How to debug a program without a debugger? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Interview question-
Often its pretty easier to debug a program once you have trouble with your code.You can put watches,breakpoints and etc.Life is much easier because of debugger.
But how to debug a program without a debugger?
One possible approach which I know is simply putting print statements in your code wherever you want to check for the problems.
Are there any other approaches other than this?
As its a general question, its not restricted to any specific language.So please share your thoughts on how you would have done it?
EDIT- While submitting your answer, please mention a useful resource (if you have any) about any concept. e.g. Logging
This will be lot helpful for those who don't know about it at all.(This includes me, in some cases :)
UPDATE: Michal Sznajderhas put a real "best" answer and also made it a community wiki.Really deserves lots of up votes.
Actually you have quite a lot of possibilities. Either with recompilation of source code or without.
With recompilation.
Additional logging. Either into program's logs or using system logging (eg. OutputDebugString or Events Log on Windows). Also use following steps:
Always include timestamp at least up to seconds resolution.
Consider adding thread-id in case of multithreaded apps.
Add some nice output of your structures
Do not print out enums with just %d. Use some ToString() or create some EnumToString() function (whatever suits your language)
... and beware: logging changes timings so in case of heavily multithreading you problems might disappear.
More details on this here.
Introduce more asserts
Unit tests
"Audio-visual" monitoring: if something happens do one of
use buzzer
play system sound
flash some LED by enabling hardware GPIO line (only in embedded scenarios)
Without recompilation
If your application uses network of any kind: Packet Sniffer or I will just choose for you: Wireshark
If you use database: monitor queries send to database and database itself.
Use virtual machines to test exactly the same OS/hardware setup as your system is running on.
Use some kind of system calls monitor. This includes
On Unix box strace or dtrace
On Windows tools from former Sysinternals tools like http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx, ProcessExplorer and alike
In case of Windows GUI stuff: check out Spy++ or for WPF Snoop (although second I didn't use)
Consider using some profiling tools for your platform. It will give you overview on thing happening in your app.
[Real hardcore] Hardware monitoring: use oscilloscope (aka O-Scope) to monitor signals on hardware lines
Source code debugging: you sit down with your source code and just pretend with piece of paper and pencil that you are computer. Its so called code analysis or "on-my-eyes" debugging
Source control debugging. Compare diffs of your code from time when "it" works and now. Bug might be somewhere there.
And some general tips in the end:
Do not forget about Text to Columns and Pivot Table in Excel. Together with some text tools (awk, grep or perl) give you incredible analysis pack. If you have more than 32K records consider using Access as data source.
Basics of Data Warehousing might help. With simple cube you may analyse tons of temporal data in just few minutes.
Dumping your application is worth mentioning. Either as a result of crash or just on regular basis
Always generate you debug symbols (even for release builds).
Almost last but not least: most mayor platforms has some sort of command line debugger always built in (even Windows!). With some tricks like conditional debugging and break-print-continue you can get pretty good result with obscure bugs
And really last but not least: use your brain and question everything.
In general debugging is like science: you do not create it you discover it. Quite often its like looking for a murderer in a criminal case. So buy yourself a hat and never give up.
First of all, what does debugging actually do? Advanced debuggers give you machine hooks to suspend execution, examine variables and potentially modify state of a running program. Most programs don't need all that to debug them. There are many approaches:
Tracing: implement some kind of logging mechanism, or use an existing one such as dtrace(). It usually worth it to implement some kind of printf-like function that can output generally formatted output into a system log. Then just throw state from key points in your program to this log. Believe it or not, in complex programs, this can be more useful than raw debugging with a real debugger. Logs help you know how you got into trouble, while a debugger that traps on a crash assumes you can reverse engineer how you got there from whatever state you are already in. For applications that you use other complex libraries that you don't own that crash in the middle of them, logs are often far more useful. But it requires a certain amount of discipline in writing your log messages.
Program/Library self-awareness: To solve very specific crash events, I often have implemented wrappers on system libraries such as malloc/free/realloc which extensions that can do things like walk memory, detect double frees, attempts to free non-allocated pointers, check for obvious buffer over-runs etc. Often you can do this sort of thing for your important internal data types as well -- typically you can make self-integrity checks for things like linked lists (they can't loop, and they can't point into la-la land.) Even for things like OS synchronization objects, often you only need to know which thread, or what file and line number (capturable by __FILE__, __LINE__) the last user of the synch object was to help you work out a race condition.
If you are insane like me, you could, in fact, implement your own mini-debugger inside of your own program. This is really only an option in a self-reflective programming language, or in languages like C with certain OS-hooks. When compiling C/C++ in Windows/DOS you can implement a "crash-hook" callback which is executed when any program fault is triggered. When you compile your program you can build a .map file to figure out what the relative addresses of all your public functions (so you can work out the loader initial offset by subtracting the address of main() from the address given in your .map file). So when a crash happens (even pressing ^C during a run, for example, so you can find your infinite loops) you can take the stack pointer and scan it for offsets within return addresses. You can usually look at your registers, and implement a simple console to let you examine all this. And voila, you have half of a real debugger implemented. Keep this going and you can reproduce the VxWorks' console debugging mechanism.
Another approach, is logical deduction. This is related to #1. Basically any crash or anomalous behavior in a program occurs when it stops behaving as expected. You need to have some feed back method of knowing when the program is behaving normally then abnormally. Your goal then is to find the exact conditions upon which your program goes from behaving correctly to incorrectly. With printf()/logs, or other feedback (such as enabling a device in an embedded system -- the PC has a speaker, but some motherboards also have a digital display for BIOS stage reporting; embedded systems will often have a COM port that you can use) you can deduce at least binary states of good and bad behavior with respect to the run state of your program through the instrumentation of your program.
A related method is logical deduction with respect to code versions. Often a program was working perfectly at one state, but some later version is not longer working. If you use good source control, and you enforce a "top of tree must always be working" philosophy amongst your programming team, then you can use a binary search to find the exact version of the code at which the failure occurs. You can use diffs then to deduce what code change exposes the error. If the diff is too large, then you have the task of trying to redo that code change in smaller steps where you can apply binary searching more effectively.
Just a couple suggestions:
1) Asserts. This should help you work out general expectations at different states of the program. As well familiarize yourself with the code
2) Unit tests. I have used these at times to dig into new code and test out APIs
One word: Logging.
Your program should write descriptive debug lines which include a timestamp to a log file based on a configurable debug level. Reading the resultant log files gives you information on what happened during the execution of the program. There are logging packages in every common programming language that make this a snap:
Java: log4j
.Net: NLog or log4net
Python: Python Logging
PHP: Pear Logging Framework
Ruby: Ruby Logger
C: log4c
I guess you just have to write fine-grain unit tests.
I also like to write a pretty-printer for my data structures.
I think the rest of the interview might go something like this...
Candidate: So you don't buy debuggers for your developers?
Interviewer: No, they have debuggers.
Candidate: So you are looking for programmers who, out of masochism or chest thumping hamartia, make things complicated on themselves even if they would be less productive?
Interviewer: No, I'm just trying to see if you know what you would do in a situation that will never happen.
Candidate: I suppose I'd add logging or print statements. Can I ask you a similar question?
Interviewer: Sure.
Candidate: How would you recruit a team of developers if you didn't have any appreciable interviewing skill to distinguish good prospects based on relevant information?
Peer review. You have been looking at the code for 8 hours and your brain is just showing you what you want to see in the code. A fresh pair of eyes can make all the difference.
Version control. Especially for large teams. If somebody changed something you rely on but did not tell you it is easy to find a specific change set that caused your trouble by rolling the changes back one by one.
On *nix systems, strace and/or dtrace can tell you an awful lot about the execution of your program and the libraries it uses.
Binary search in time is also a method: If you have your source code stored in a version-control repository, and you know that version 100 worked, but version 200 doesn't, try to see if version 150 works. If it does, the error must be between version 150 and 200, so find version 175 and see if it works... etc.
use println/log in code
use DB explorer to look at data in DB/files
write tests and put asserts in suspicious places
More generally, you can monitor side effects and output of the program, and trigger certain events in the program externally.
A Print statement isn't always appropriate. You might use other forms of output such as writing to the Event Log or a log file, writing to a TCP socket (I have a nice utility that can listen for that type of trace from my program), etc.
For programs that don't have a UI, you can trigger behavior you want to debug by using an external flag such as the existence of a file. You might have the program wait for the file to be created, then run through a behavior you're interested in while logging relevant events.
Another file's existence might trigger the program's internal state to be written to your logging mechanism.
like everyone else said:
Logging
Asserts
Extra Output
&
your favorite task manager or process
explorer
links here and here
Another thing I have not seen mentioned here that I have had to use quite a bit on embedded systems is serial terminals.
You can cannot a serial terminal to just about any type of device on the planet (I have even done it to embedded CPUs for hydraulics, generators, etc). Then you can write out to the serial port and see everything on the terminal.
You can get real fancy and even setup a thread that listens to the serial terminal and responds to commands. I have done this as well and implemented simple commands to dump a list, see internal variables, etc all from a simple 9600 baud RS-232 serial port!
Spy++ (and more recently Snoop for WPF) are tremendous for getting an insight into Windows UI bugs.
A nice read would be Delta Debugging from Andreas Zeller. It's like binary search for debugging

Resources