I'm trying to parallelize my model (I want to parallelize a single config run, not run multiple configs in parallel).
I'm using OMNeT++ 4.2.2, but the version probably doesn't matter.
I've read the Parallel Distributed Simulation chapter of the OMNeT++ manual
and the principle seems very straightforward:
simply assign different modules/submodules to different partitions.
Following the provided cqn example:
*.tandemQueue[0]**.partition-id = 0
*.tandemQueue[1]**.partition-id = 1
*.tandemQueue[2]**.partition-id = 2
If I simulate relatively simple models, everything works fine and I can partition the model as I wish.
However, when I run simulations that use the StandardHost module, or modules that are interconnected using Ethernet links, this doesn't work anymore.
Take, for example, the INET-provided example WiredNetWithDHCP (inet/examples/dhcp/eth). As an experiment, let's say I want to run the hosts in a different partition than the switch,
so I assign the switch to one partition and everything else to another:
**.switch**.partition-id = 1
**.partition-id = 0
The partitions are separated by links that have delays, so it should be possible to partition the model this way.
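For completeness, parallel simulation itself is enabled in my omnetpp.ini roughly as follows (a minimal sketch; I use the null message protocol, and the communications class can be cNamedPipeCommunications or cMPICommunications depending on the setup):

[General]
parallel-simulation = true
parsim-communications-class = "cNamedPipeCommunications"
parsim-synchronization-class = "cNullMessageProtocol"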
When I run the model using the graphical interface, I can see that it is correctly partitioned, but the connections are somehow wrong and I get the following error message:
during network initialization: the input/output datarates differ
Clearly the datarates don't differ (and running the model sequentially works perfectly). Looking at where the error message comes from, this exception is also triggered when a link is not connected, and that is indeed what happens: it seems the gates are not correctly linked.
Clearly I'm missing something in the link connection mechanism. Should I partition somewhere else?
Given the simplicity of the paradigm I feel like an idiot, but I'm not able to solve this issue by myself.
Just to give some feedback:
It seems that this cannot be done directly. In short, INET as it is cannot be fully parallelized, because it uses global variables in some places.
In this particular case, MAC address assignment is one of the issues (it uses a global variable), hence the Ethernet interface cannot be parallelized.
For more details, refer to this paper explaining why this is not possible:
Enabling Distributed Simulation of OMNeT++ INET Models
For a reference/possible solution, refer to the authors' webpage at Aachen University, where you can download a complete copy of OMNeT++ and INET that can be parallelized:
project overview and code
I would like to know if there is a proper method to track memory accesses
across multiple resources at once. For example, I set up a simple dual-core CPU
by extending the simple.py from learning gem5 (I just added another
TimingSimpleCPU and made the port connections).
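Roughly, the dual-core part of my modified script looks like this (a sketch; port names such as cpu_side_ports are the ones used in recent gem5 releases and were called slave/master in older ones, and the X86 interrupt port wiring is left out):

# the rest of simple.py (clock domain, memory controller, workload) is as in the tutorial
system.cpu = [TimingSimpleCPU(cpu_id=i) for i in range(2)]
system.membus = SystemXBar()
for cpu in system.cpu:
    cpu.icache_port = system.membus.cpu_side_ports
    cpu.dcache_port = system.membus.cpu_side_ports
    cpu.createInterruptController()
# run e.g. with: build/X86/gem5.opt --debug-flags=MemoryAccess <this script>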
I took a look at the different debug options and found, for example, the
MemoryAccess flag (among others), but this seemed to show only the accesses at
the DRAM or one other resource component.
Nevertheless, I imagine there is a way to track events across the CPU, the bus and finally the memory.
Does this feature already exist?
What can I try next? Would it be an idea to add my own --debug-flag, or can I work
with the TraceCPU for my specific use case?
I haven't worked much with gem5 yet, so I'm not sure how to achieve this. Since I have only run in SE mode so far, would FS mode be a solution?
Finally, I also found the TraceCPUData flag in the --debug-flags, but running
this with my config script created no output (like many other flags, by the way ...).
It seems that this is a --debug-flag for the TraceCPU; what kind of output does this flag create, and can it help me?
I'm trying to model a business process using Spring State Machine. So far I've been very successful with it, but I'm stuck trying to model a dynamic bit, where:
the user is in state A
in that state he can create a short (predefined) task for a different user (a small state machine)
those users have to basically execute a state machine flow until the end
it should be possible to spawn many tasks concurrently.
the user returns to state A once all the tasks he created have completed.
Here is a graphical representation of what I'm trying to achieve.
I think I could do this if I represented each task as a state machine and so on, but I would prefer to avoid going that route as it would complicate the application. Ideally I would have just one state machine configuration.
In the Spring reference I found the fork pseudo-state, which may be what I'm looking for; however, the official example repo only covers a static configuration (https://github.com/spring-projects/spring-statemachine/blob/master/docs/src/reference/asciidoc/sm-examples.adoc#statemachine-examples-tasks) where the tasks are already defined (T1, T2, T3). For my application's needs, however, I would want to be able to add "T4" at runtime.
In essence, I would like to know whether my requirements can be fulfilled with a single state machine and whether I could use fork() for my needs. If that's not the case, I welcome any advice that would push me in the right direction.
As I commented over the weekend, if you need a "dynamic" configuration, then the easiest way to do it is using the "dynamic builder interfaces", the same as in all the other examples. These were basically added to make it possible to use SSM outside of a Spring application context. The Tasks recipe uses this model, as it supports running a DAG of tasks using hierarchical regions and submachines.
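As a rough illustration of the builder API (the state and event names here are made up for the example, not taken from the tasks recipe), something along these lines lets you assemble a machine at runtime from whatever tasks exist at that moment:

import java.util.Set;
import org.springframework.statemachine.StateMachine;
import org.springframework.statemachine.config.StateMachineBuilder;

public class TaskMachineFactory {

    // builds a machine from whatever task states exist right now
    public StateMachine<String, String> build(Set<String> currentTaskStates) throws Exception {
        StateMachineBuilder.Builder<String, String> builder = StateMachineBuilder.builder();

        builder.configureStates()
            .withStates()
                .initial("A")
                .states(currentTaskStates)   // e.g. T1..T3 today, T1..T4 tomorrow
                .end("DONE");

        builder.configureTransitions()
            .withExternal()
                .source("A").target("DONE").event("ALL_TASKS_DONE");

        return builder.build();
    }
}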
You don't necessarily need fork, since entering parallel regions via their initial states is equivalent. You do, however, need join to wait for the parallel regions to finish their execution.
While that recipe provides some background on how things can be done, we hopefully have something better on our roadmap, which is supposed to add a DSL that should make these kinds of custom implementations much easier to build.
I am using a Cyclone V on a SoCKit board (link here) provided by Terasic, with an HSMC-NET daughter card (link here) connected to it, in order to create a system that can communicate over Ethernet while both transmitted and received traffic goes through the FPGA. The problem is that I am having a really, really hard time getting this system to work using Altera's Triple Speed Ethernet core.
I am using Qsys to construct the system that contains the Triple Speed Ethernet core, instantiating it inside a VHDL wrapper that also contains an instantiation of a packet generator module, connected directly to the transmit Avalon-ST sink port of the TSE core and controlled through an Avalon-MM slave interface connected to a JTAG to Avalon Master bridge core, which has its master port exported to the VHDL wrapper as well.
Then, using System Console, I am configuring the Triple Speed Ethernet core as described in the core's user guide (link here) in section 5-26 (Register Initialization), and instructing the packet generator module (also through System Console) to start and generate Ethernet packets into the TSE core's transmit Avalon-ST sink interface ports.
Although everything is configured exactly as described in the core's user guide (linked above), I cannot get it to output anything on the MII/GMII output interfaces, nor do any of the statistics counters increase or even change. Clearly I am doing something wrong, or missing something, but I just can't find out what exactly it is.
Can anyone please help me with this?
Thanks in advance,
Itamar
Starting with the basic checks:
Have you simulated it? It's not clear to me if you are just simulating or synthesizing.
If you haven't simulated, you really should. If it's not working in simulation, why would it ever work in real life?
Make sure you are using the QIP file to synthesize the design. It will automatically include your auto-generated SDC constraints. You will still need to add your own pin constraints; more on that below.
The TSE is fairly old and reliable, so the obvious first things to check are Clock, Reset, Power and Pins.
a.) Power is usually less of a problem on devkits if you have already run the demo that came with the kit.
b.) Pins can cause a whole slew of issues if they are not mapped right for this core. I'll assume you are leveraging something from Terasic. It should define a pin for reset, the input clock and the signal standards. A lot of times this goes in the .qsf file, and you also reference the QIP file (mentioned above) in there too.
c.) Clock and reset are the more likely culprits in my mind. No activity on the interface is kind of a clue. One way to check is to route your clocks to spare pins, o-scope them, and ensure they are what you think they are. Similarly, you may want to bring your reset out to a pin and check it. MAKE SURE YOU KNOW THE POLARITY and that you haven't been using ~reset in some places and non-inverted reset in others.
Reconfig block: some Altera chips and certain versions of Quartus require you to use a reconfig block to configure the XCVR. This doesn't seem like your issue to me, because you say the GMII is flatlined.
In my default Veins scenario (the one in the example) I need a second antenna on my car. In Car.ned I entered the following code (copying and pasting from the connections block):
nic2.upperLayerOut --> appl2.lowerLayerIn;
nic2.upperLayerIn <-- appl2.lowerLayerOut;
nic2.upperControlOut --> appl2.lowerControlIn;
nic2.upperControlIn <-- appl2.lowerControlOut;
veinsradioIn2 --> nic2.radioIn;
Now I have two antennas on my node (and they work!). But how can I decide which one sends and which one receives? This way I have only changed the topology of the network, but I can't control the communications! I need to reach this scenario: node->node (first antenna) and node->RSU (second antenna). I think I should work on TraCIDemo11p.cc and TraCIDemoRSU11p.cc, but the code is immense and I get lost too easily. The final target is to make these two antennas work with different protocols, but for the moment I can make do with the same protocol and the two different channels I mentioned earlier.
It's a bit difficult to give a concise answer to your question, because it has multiple components, but here are some important things you should look at:
First off: right now, what you've done is specified a car with two network interfaces (nic and nic2) and two separate applications (appl and appl2). I think, by your description, that is not what you want. I would suggest that your first step is to create an application interface that has connections to two network interfaces. This means creating the corresponding .ned file. You can use ./veins/modules/application/traci/TraCIDemo11p.ned as an example. Make sure to define your application object appl (in Car.ned) as that .ned file and connect both of these in the way you described. You'll then have 8 channels from your application to the two network interfaces (I'd call them appl.nic1LayerIn, appl.nic1ControlIn, appl.nic2LayerIn and so on).
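A sketch of what the gates of such an application .ned could look like (the module and gate names are only suggestions mirroring TraCIDemo11p.ned, not existing Veins definitions):

simple DualNicAppl
{
    gates:
        input nic1LayerIn;
        output nic1LayerOut;
        input nic1ControlIn;
        output nic1ControlOut;
        input nic2LayerIn;
        output nic2LayerOut;
        input nic2ControlIn;
        output nic2ControlOut;
}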
After that, you will want to write logic that decides whether a particular message should go to one network interface or the other, and put that code in your application's source. To communicate with the different network interfaces you'll just use the respective channels. To see how this works you'll need to dig into the Veins source code a little bit: the code interacting with the channels is not directly in the TraCIDemo11p source, but somewhere in a superclass thereof (I think it is BaseWaveApplLayer, but I'm not 100% sure). You could either modify those files to work with multiple antennae, or create new source files -- I'm not sure which one is less code, though.
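For illustration only (the method and gate names below are hypothetical, matching the gate naming suggested above rather than anything that exists in Veins), the dispatch logic in your application class could be as simple as:

// somewhere in your application class (a subclass of the Veins base application layer)
void DualNicAppl::sendVia(cMessage* msg, bool viaSecondNic)
{
    // pick the output gate that Car.ned connects to the wanted NIC
    if (viaSecondNic)
        send(msg, "nic2LayerOut");   // second antenna, e.g. towards the RSU
    else
        send(msg, "nic1LayerOut");   // first antenna, e.g. towards other nodes
}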
Another thing to remember is that you'll need to provide the corresponding settings in the omnetpp.ini too (*.**.nic2..., analogous to *.**.nic...). I'm not sure what Veins will do with two antennae at the same position (it might lead to some weird effects), but I also don't remember where the antenna position is specified.
I need to write a program which does two tasks at the same time, for better efficiency and responsiveness. The first task is, for example, to get vision data from a camera and process it.
The second task is to receive the processed data from the first task and do something else with it (a robot control strategy). However, while the robot control task is being performed, the camera data acquisition should still be working.
Is there a solution for this type of programming in C++/C#? I'm learning TBB; is it the right choice? However, I'm reading about things like "loop parallelization", so am I going in the right direction?
This relates to a very common style in control programming, where the computer is used as a central unit connected to electronic devices (sensors) and actuators, and all these devices are serviced concurrently.
No, your loop parallelization example is about using parallel programming to speed up a calculation on one set of data.
What you need is multitasking. You didn't mention any target architecture. Assuming this will be an embedded system, like a microprocessor, you have several options. There are embedded micro-OSes like VxWorks and uC/OS that let you do just what you are asking: they allow you to set up multiple "tasks" that run virtually concurrently. Of course, true concurrency is impossible with one CPU, but the scheduler in these OSes is designed to be very deterministic, for the quasi-real-time systems you describe.
Sounds good to me! TBB is OK, and C# has useful thread pool etc. classes. Just one thing, if you haven't done anything like this before: it's all about the data, not the code. If you design the data flow correctly, the code will write itself (well, OK, not really :)).
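To make the "it's all about the data" point concrete, here is a minimal plain C++11 sketch of the two tasks sharing a queue (Frame and the grab/control calls are placeholders; TBB tasks or a C# thread pool would give you the same shape):

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

struct Frame { /* processed vision data */ };

std::queue<Frame> frames;
std::mutex m;
std::condition_variable cv;
std::atomic<bool> running{true};

void visionTask() {                      // task 1: grab and process camera data
    while (running) {
        Frame f{};                       // stand-in for grabAndProcess()
        {
            std::lock_guard<std::mutex> lock(m);
            frames.push(f);
        }
        cv.notify_one();
    }
}

void controlTask() {                     // task 2: consume frames, run the control strategy
    while (true) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return !frames.empty() || !running; });
        if (frames.empty()) break;       // shutting down and nothing left to process
        Frame f = frames.front();
        frames.pop();
        lock.unlock();
        // applyControl(f);              // camera keeps producing while this runs
    }
}

int main() {
    std::thread vision(visionTask), control(controlTask);
    // ... run until shutdown is requested, then:
    {
        std::lock_guard<std::mutex> lock(m);
        running = false;
    }
    cv.notify_all();
    vision.join();
    control.join();
}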