Is there any way to analyze the real-time (RT) characteristics of a Linux kernel?
Just for fun, I plan to study the behavior of an RT system on a Raspberry Pi. I want to add events at each task switch, around each ISR, etc. Those events shall contain the exact jiffy time, the processor, and the pid. The event information shall be stored in a file. After the run, I want to study the timing characteristics.
Of course, I want those measurements to disturb the system as little as possible.
Is there some kind of framework for doing this? Is it even possible to put events around ISRs (in a generic way)? I see this as a Stack Overflow question, as I'm willing to modify the kernel code if necessary.
NB: I'm not looking for some kind of statistical view of aggregated data. I want it all! ;)
Have a look at SystemTap and DTrace. They do what you want and more.
https://sourceware.org/systemtap/
http://dtrace.org/blogs/about/
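For the task-switch and ISR events specifically, a minimal SystemTap sketch could look like the following. It logs a nanosecond timestamp, the CPU, and the pids at every context switch and around every hardware interrupt; the scheduler and irq_handler probe points come from the standard tapsets and are assumed to resolve on your kernel build:

    #!/usr/bin/stap
    # Log every context switch: timestamp, CPU, outgoing and incoming pid.
    probe scheduler.ctxswitch {
        printf("%d sw  cpu=%d prev=%d next=%d\n",
               gettimeofday_ns(), cpu(), prev_pid, next_pid)
    }

    # Bracket every hardware IRQ handler (jiffies() is also available if
    # you really want jiffy time, but it is far too coarse for ISRs).
    probe irq_handler.entry {
        printf("%d irq+ cpu=%d irq=%d\n", gettimeofday_ns(), cpu(), irq)
    }
    probe irq_handler.exit {
        printf("%d irq- cpu=%d irq=%d\n", gettimeofday_ns(), cpu(), irq)
    }

Run it as stap -o trace.log sched_irq.stp and post-process the file afterwards. SystemTap collects the output through per-CPU kernel buffers, which keeps the disturbance small, though never zero.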
Our application is a DSP synthesizer, mostly used to create music and written in C, and I want to add a system-wide feature that gives the user visual feedback so they can find out which DSP objects are the most CPU-hungry.
I have researched a lot, but I can't find a way to implement this feature.
Can anyone guide me on how to implement it? I just want someone to point me in the right direction! Thanks in advance.
I have tried to understand how the Windows Task Manager works and how the ps command in Linux works...
I also looked into the Win32 API, but those all just show currently running processes, and my task is to find the CPU usage of the DSP objects currently in use...
My naive approach would be to count CPU cycles in each method of the object, but I have no idea if that's even the right place to start thinking about it.
How about this: measure the time each block takes to do its thing?
The scheduler calls each perform-routine in the DSP graph, so you just need to measure the time it takes for the perform-routine to return.
The longer it takes, the more CPU-hungry the object is (optionally, scale the values by the block size).
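A minimal C sketch of that idea, assuming a hypothetical node type (dsp_obj_t, perform, and avg_ns are stand-ins for whatever your graph really uses):

    #include <time.h>

    typedef struct dsp_obj {
        /* the perform-routine the scheduler calls once per block */
        void (*perform)(struct dsp_obj *self, float *in, float *out, int n);
        double avg_ns;   /* smoothed cost per sample, in nanoseconds */
    } dsp_obj_t;

    /* Call this instead of obj->perform() directly. */
    static void timed_perform(dsp_obj_t *obj, float *in, float *out, int n)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        obj->perform(obj, in, out, n);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        /* exponential moving average, scaled by block size so objects
         * running at different block sizes stay comparable */
        obj->avg_ns = 0.9 * obj->avg_ns + 0.1 * (ns / n);
    }

The UI thread can then read avg_ns from each node and rank the objects; since the field is a plain double written from the audio thread, reads may be slightly stale, which is fine for a meter.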
I need an FPGA that can have 50 I/O pins. I'm going to use it as a MUX. I thought about using a dedicated MUX chip or a CPLD, but the guy I'm designing this circuit for says that he might need more features in the future, so it has to be an FPGA.
So I'm looking for one with enough design examples on the internet. Can you suggest anything (a family, for example)?
Also, if you could tell me what I should consider when picking one, that would be great. I'm new to this and still learning.
This is a very open question, and the answer to it as stated could be very long, if it is answerable at all given all the options. What I suggest is that you make a list of all current and future requirements. This will help you communicate your needs (here and elsewhere) and force you, and the people you work with on this project, to think about them more carefully. Saying that "more features in the future" will be needed is meaningless; would you buy the most capable FPGA on the market? No.
When you've compiled this list and thought about the requirements, post them here again, and then you'd get plenty of help.
Another possibility to get feedback and help is to describe what you are trying to do/solve. Maybe an FPGA is not the best solution -- people here will tell you that.
I agree with Saar, but you have to go back one step further: when you decide which technology to target, keep in mind that an FPGA needs a lot of things to run, e.g. different supply voltages for core, I/O, auxiliary, and probably more. You also need some kind of configuration mechanism, as an FPGA is in general (there are exceptions) SRAM-based and therefore needs to be configured at startup. CPLDs are less flexible but much easier to handle...
Basically, what I want to achieve is to run a program in an environment that supplies any value the program asks for, based on criteria decided by me. For example, games regularly query the system for the time in order to pace their animations; if I had control over the time values passed to the program, I could control the speed of the animation (and perhaps clear some difficult rounds easily :P).
On similar grounds, keystrokes and mouse movements could also be controlled (I have tried Java's Robot class but did not find it satisfactory: it slows the system {or perhaps my implementation was bad}, and its commands are executed on the currently focused program and can't be targeted at a specific one).
Any graceful way of doing this, or some pointers on achieving it, would be highly appreciated.
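On Linux, one graceful trick for the time part is an LD_PRELOAD shim that interposes on the C library's clock functions. A minimal sketch, with the caveat that SLOWDOWN and the choice of clock_gettime() are assumptions (a given game may instead read time via gettimeofday(), time(), SDL, or the Windows API, so you may need to wrap those entry points too):

    /* time_shim.c -- slow down the flow of time as seen by one program.
     * Build: gcc -shared -fPIC -o time_shim.so time_shim.c -ldl
     * Run:   LD_PRELOAD=./time_shim.so ./the_game                      */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <time.h>

    #define SLOWDOWN 4   /* the program sees time pass 4x slower */

    int clock_gettime(clockid_t clk, struct timespec *ts)
    {
        static int (*real)(clockid_t, struct timespec *);
        static struct timespec origin;
        static int have_origin;

        if (!real)
            real = (int (*)(clockid_t, struct timespec *))
                       dlsym(RTLD_NEXT, "clock_gettime");

        int rc = real(clk, ts);
        if (rc != 0 || clk != CLOCK_MONOTONIC)
            return rc;   /* only warp the monotonic clock here */

        if (!have_origin) {
            origin = *ts;              /* first reading = time zero */
            have_origin = 1;
        }
        long long ns = (ts->tv_sec - origin.tv_sec) * 1000000000LL
                     + (ts->tv_nsec - origin.tv_nsec);
        ns /= SLOWDOWN;                /* compress elapsed time */
        ts->tv_sec  = origin.tv_sec  + ns / 1000000000LL;
        ts->tv_nsec = origin.tv_nsec + ns % 1000000000LL;
        if (ts->tv_nsec >= 1000000000L) {
            ts->tv_sec++;
            ts->tv_nsec -= 1000000000L;
        }
        return rc;
    }

For keystrokes and mouse movements, the analogous pointers are the uinput subsystem on Linux (inject events through a virtual input device) and, on Windows, posting WM_KEYDOWN/WM_CHAR messages to a specific window handle with PostMessage(), which sidesteps the Robot class's focus problem; note that many games read input in ways that bypass the message queue.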
I would like to analyze a stream of events that share certain characteristics (such as a common source) within a given time window, ultimately to correlate those multiple events, draw some inference from them, and finally launch some action.
My limited knowledge of Complex Event Processing (CEP) tells me that it is the ideal candidate for such things. However, in my research so far, I have found people comparing it with rule engines and Bayesian classifiers, sometimes using a combination of those.
I wanted to know:
- are there best practices (ideally supported by performance data and a description of the nature/type of events) to follow, especially in Erlang?
- does Erlang have a CEP framework of its own?
- is there any Bayesian classifier library available in Erlang?
Esper from the Java world seems to be quite close to what I'd like to do, but I'd prefer to keep my environment Erlang-only (or Erlang and C/C++ only) if possible.
Pointers, advice, guidance -- all welcome.
thanks,
IC
This appears to be under active development:
https://github.com/vascokk/rivus_cep
This may be a non-solution for you, but anyway:
One of Erlang's strengths is its ability to act as glue between different systems. You let the Erlang VM sit in the middle and control a number of subsystems running in other OS processes. The robustness comes from the ability to restart those subsystems should they crash.
For a classification problem, it would seem, to a certain extent, that the classification could happen separately from the Erlang subsystem. In other words, you use the erlang:open_port/2 call to open a port to the other program and set up communication with it. The point is that your Erlang program will know if the port crashes and can react accordingly.
To my limited knowledge of the Erlang libraries and tools out there, there are no CEP tools on the radar. Are they hard to write yourself?
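For what it's worth, the C side of such a port is small. A minimal sketch, assuming the Erlang side opens it with open_port({spawn, "./classifier"}, [{packet, 2}, binary]) so every message carries a 2-byte big-endian length header (classify() is a hypothetical placeholder; this version just echoes):

    /* classifier.c -- external program for an Erlang {packet, 2} port. */
    #include <unistd.h>

    static int read_exact(unsigned char *buf, int len)
    {
        int got = 0, i;
        while (got < len) {
            i = read(0, buf + got, len - got);    /* stdin <- Erlang */
            if (i <= 0)
                return i;                         /* 0 means port closed */
            got += i;
        }
        return len;
    }

    static int write_exact(const unsigned char *buf, int len)
    {
        int sent = 0, i;
        while (sent < len) {
            i = write(1, buf + sent, len - sent); /* stdout -> Erlang */
            if (i <= 0)
                return i;
            sent += i;
        }
        return len;
    }

    int main(void)
    {
        unsigned char hdr[2], buf[65536];
        while (read_exact(hdr, 2) > 0) {
            int len = (hdr[0] << 8) | hdr[1];     /* big-endian length */
            if (len > 0 && read_exact(buf, len) <= 0)
                break;
            /* classify(buf, len) would go here; echo the input for now */
            write_exact(hdr, 2);
            write_exact(buf, len);
        }
        return 0;  /* exiting closes the port on the Erlang side */
    }

If this process crashes, the owning Erlang process is notified (for example via the exit_status port option) and a supervisor can simply restart it, which is exactly the robustness argument above.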
There are now a few new CEP libs for Erlang.
See below:
https://github.com/danmacklin/erlang_cep
https://github.com/darach/eep-erl
I have a big problem. My boss said to me that he wants two "magic black boxes":
1- Something that takes a microprocessor as input and returns, as output, its MIPS and/or MFLOPS.
2- Something that takes C code as input and returns, as output, something that characterizes the code in terms of performance (something like the MIPS that a uP must have to execute the code in a given time).
I think the first "black box" could be an EEMBC or SPEC benchmark: different uPs, same benchmark, which returns the MIPS/MFLOPS of each uP. So the first problem is OK (I hope).
But the second... the second black box is my nightmare. The only thing I can find is to use a profiling tool, but I am asking about a particular kind of profiling tool.
Does anybody know a profiling tool that can take, as input, plain C code and give me, as output, the performance characteristics of my C code (or the number of times each assembly instruction is executed)?
The real problem is that we must choose the correct uP for certain C code... We want a uP tailored to our C code, so we need to know the MIPS (and the uP's architectural structure, memory structure, ...) and what our code needs.
Thanks to everyone
I have to agree with Adam, though I would be a little more gracious about it. Compiler optimizations only matter in hotspot code, i.e. tight loops that a) don't call functions, and b) take a large percentage of time.
On a positive note, here's what I would suggest:
Run the C code on a processor, any processor. On that processor, find out what takes the most time.
You could use a profiler for this. The simple method I prefer is to just run the code under a debugger and manually halt it some number of times (like 10), and each time write down the call stack. Suppose there is something in the code taking a good percentage of the time, like 50%; then you will see it doing that thing on roughly that percentage of samples, so you won't have to guess what it is.
If that activity is something that would be helped by some special processor, then try that processor.
It is important not to guess. If you say "I think this needs a DSP chip", or "I think it needs a multi-core chip", that is a guess. The guess might be right, but probably not. It is probably the case that what takes the most time is something you never would guess, like memory management or I/O formatting. Performance issues are very good at hiding from you.
No. If someone made a tool that could analyse (non-trivial) source code and tell you its performance characteristics, it would be commonplace, i.e. everyone would be using it.
Until source code is compiled for a particular target architecture, you will not be able to determine its overall performance. For instance, a parallelising compiler targeting n processors might conceivably be able to change an O(n^2) algorithm to one of O(n).
You won't find a tool to do what you want.
Your only option is to cross-compile the code and profile it on an emulator for the architecture you're targeting. The problem with profiling high-level code is that the compiler makes a stack of non-trivial optimizations, and you'd need to know how the particular compiler did that.
It sounds dumb, but why do you want to fit your code to a uP and a uP to your code? If you're writing signal processing, buy a DSP. If you're building a SCADA box, then look into Atmel or ARM stuff. Are you building a general-purpose appliance with a user interface? Look into PPC or x86-compatible stuff.
Simply put, choose a bloody architecture that's suitable and provides the features you need. Optimizing before choosing the processor is premature, to put it mildly (very roughly paraphrasing Knuth).
Fix the architecture at something roughly appropriate, work out the rough processing requirements (you can scratch up an estimate by hand, which will always be too high when looking at C code), and buy a uP to match.
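To make "scratch up an estimate by hand" concrete with made-up numbers: if the code filters audio at 48,000 samples per second and the inner loop costs roughly 200 instructions per sample, that is about 48,000 x 200, roughly 10 MIPS; double it for the control flow, memory stalls, and I/O you didn't count, and shop for a uP with comfortable headroom above 20 MIPS.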