How to do many-to-many pub sub relation - zeromq

I have seen it mentioned that ZMQ supports a many-to-many PUB/SUB relation.
In this case what I would like is to have multiple subscribers to multiple publishers, (this is for a common bus style application), however what I am confused about is how to physically implement it since I have also seen it mentioned that there can only be a single bind and multiple connections to that bound socket.
Thus I am slightly confused as to how to achieve this.
I have seen that pgm may be a way to acheive this (since all members would connect to the same multicast address), but I am not sure how to physically do that...

Q : how to physically implement it
A PUB_A on a Computer A PUB_A.bind()-s, any SUB may .connect() there, onto A
A PUB_B on a Computer B PUB_B.bind()-s, any SUB may .connect() there, onto B
A rev_PUB_C on any host rev_PUB_C.connect()-s to a few or many SUB-s, who've had a previously successful SUB_xyz.bind() onto their respective local address(es)
And the merry-go-round goes on, as the distributed system designer likes it to let the show evolve.

Related

Need to c++ functions like handlemessage function for single or multi-radio wireless node interface

In interface element level of wireless nodes:
I know handleMessage() is called by the simulation kernel when the module receives a message.
Is there a similar function when a physical wireless link is established between two single or multi-radio nodes to communicate them is called? If there is no such function, how can I generate it?
Thanks
There is no such thing as physical wireless link. Radios transmit packets that may or may not be received on the other end. That physical wireless link is just an abstraction layered on top of the low level communication.
I.e. when do you consider a physical wireless link present? When data was exchanged between the two peer? Only one way or data should travel in both ways? How will node1 know that node2 received the sent data? Should it wait for acknowledgement? For how long? What is if the acknowledgement is lost during the transmission. And so on...
Providing reliable communication channel between two nodes is the responsibility of the Link Layer (or Mac layer if you want to call it that way like in Ieee80211Mac). So you should add your logic somewhere there, however you must define your own logic. Take a look at the handleLowerPacket() as a good place to insert your code.

What exactly does handleParkingUpdate() do?

I am trying to implement a VANET model for smart parking simulations. Trying to fully understand the TraCIDemo11pp.cc and files relevant to it and its proving quite difficult to get my head around the general structure of each module and the communications between them despite understanding the TicToc tutorial.
I understand how SUMO and OMNETPP are run in parallel, TraCIScenarioManager from OMNETPP communicates with the TraCI server in order to exchange information to SUMO etc. But I'm finding it hard to get my head around how the TraCIDemoApp is utilised.
The question is quite specific, but hoping an answer to it would let me figure out the rest. Any help would be appreciated!
Thanks,
Wesley
Veins comes with a very small demo example in the city of Erlangen:
Vehicles start at the parking lot of the university and drive to a location off-sight. After some time the first vehicle (node[0]) emulates an accident and stops driving. Therefore, it broadcasts this information which gets re-distributed via the RSU to all other vehicles in range. They, in turn, try to use an alternative route to their destination while re-broadcasting the information about the accident. Thus, newly spawned vehicles also get informed and immediately try to choose a different route to the destination.
All of this (i.e. accident, broadcasting, switching route) is implemented in the TraCIDemo* files which represent a VANET application running in a car or RSU utilizing the NIC (i.e. PHY & MAC) to do communication. See what policy is based vehicle rerouting in case of accident? for more information.
handleParkingUpdate() is used to react to a vehicle having switched it's state from driving to parking or vice versa. Depending on the current state and whether parked cars should be allowed to communicate in the simulation this method registers the vehicle's NIC module at the BaseConnectionManager which is involved in handling the actual wireless communication. For more details see this module or follow a packet from one application layer to another (i.e. twice through the networking stack and the wireless transmission).

Run a second antenna on my car with Veins

In my default Veins scenario (the one in the example) I need a second antenna on my car. In Car.ned I entered the following code (doing copy and paste from connections block):
nic2.upperLayerOut --> appl2.lowerLayerIn;
nic2.upperLayerIn <-- appl2.lowerLayerOut;
nic2.upperControlOut --> appl2.lowerControlIn;
nic2.upperControlIn <-- appl2.lowerControlOut;
veinsradioIn2 --> nic2.radioIn;
Now I have two antennas on my node (and they work!). But how can I decide who sends and who receives? In this way I just changed the topology of the network, but I can't handle the communications! I need to reach this scenario: node->node (first antenna) and node->RSU (second antenna). I think I should work on TraCIDemo11p.cc and TraCIDemoRSU11p.cc, but the code is immense and I get lost too easily. The final target is to make sure that these two antennas work with different protocols, but at the moment I make do with the same protocol and with these two different channels I mentioned earlier.
It's a bit difficult to give a concise answer to your question, because it has multiple components, but here are some important things you should look at:
First off: right now, what you've done is specified a car with two network interfaces (nic and nic2) and two separate applications (appl and appl2). I think, by your description, that is not what you want. I would suggest that your first step is to create an application interface that has connections to two network interfaces. This means creating the corresponding .ned file. You can use ./veins/modules/application/traci/TraCIDemo11p.ned as an example. Make sure to define your application object appl (in Car.ned) as that .ned file and connect both of these in the way you described. You'll then have 8 channels from your application to the two network interfaces (I'd call them appl.nic1LayerIn, appl.nic1ControlIn, appl.nic2LayerIn and so on).
After that, you will want to write logic that decides whether a particular message should go to the one network interface, or to the other, and put that code in your application's source. To communicate with the different network interfaces you'll just use the respective channels. To see how this works you'll need to dig in the veins source code a little bit: the code interacting with the channels is not directly in the TraCIDemo11p source, but somewhere in a super class there-of (I think it is BaseWaveApplLayer, but I'm not 100% sure). You could either modify those files to work with multiple antennae, or create new source files -- I'm not sure which one is less code, though.
Another thing to remember is that you'll need to provide the corresponding settings in the omnetpp.ini too (*.**.nic2..., analogous to *.**.nic...). I'm not sure what veins will do with two antennae at the same position (it might lead to some weird effects), but I also don't remember where the antenna position is specified.

Best form of IPC for a decentralized roguelike?

I've got a project to create a roguelike that in some way abstracts the UI from the engine and the engine from map creation, line-of-site, etc. To narrow the focus, i first want to just get the UI (player's client) and engine working.
My current idea is to make the client basically a program that decides what one character (player, monsters) will do for its turn and waits until it can move again. So each monster has a client, and so does the player. The player's client prints the map, waits for input, sends it to the engine, and tells the player what happened. The monster's client does the same except without printing the map and using AI instead of keyboard input.
Before i go any futher, if this seems somehow an obfuscated way of doing things, my goal is to learn, not write a roguelike. It's the journy, not the destination.
And so i need to choose what form of ipc fits this model best.
My first attempt used pipes because they're simplest and i wrote a
UI for the player and a program to pipe in instructions such as
where to put the map and player. While this works, it only allows
one client--communicating through stdin and out.
I've thought about making the engine a daemon that looks in a spool
where clients, when started, create unique-per-client temp files to
give instructions to the engine and recieve feedback.
Lastly, i've done a little introductory programing with sockets.
They seem like they might be the way to go, and would allow the game
to perhaps someday be run over a net. I'd like to, if possible, use
a simpler solution, and since i'm unfamiliar with them, it's more
error prone.
I'm always open to suggestions.
I've been playing around with using these combinations for a similar problem (multiple clients talking via a single daemon on the local box, with much of the intelligence shoved off into the clients).
mmap for sharing large data blobs, with unix domain sockets, messages queues, or named pipes for notification
same, but using individual files per blob instead of munging them all together in an mmap
same, but without the files or mmap (in other words, more like conventional messaging)
In general I like the idea of breaking things up into separate executables this way -- it certainly makes testing easier, for instance. I think the choice of method comes down to usage patterns -- how large are messages, how persistent does the data in them need to be, can you afford the cost of multiple trips through the network stack for a socket-based message, that sort of thing. The fact that you're sticking to Linux makes things easy in terms of what's available -- you don't need to worry about portability of message queues, for instance.
This one's also applicable: https://stackoverflow.com/a/1428542/1264797

centralized / distributed sharing

I would like to make a system whereby users can upload and download files. The system will have a centralized topography but will rely heavily on peers to transfer relevant data through the central node to other peers. Instead of peers holding entire files I would like for them to hold a compressed an encrypted portion of the whole data set.
Some client uploads file to server anonymously
I would like for the client to be able to upload using some sort of NAT (random ip), realizing that the server would not be able to send confirmation packets back to the client. Is ensuring data integrity feasible with a header relaying the total content length, and disregarding the entire upload if there is a mismatch?
Server indexes, compresses and splits the data into chunks adding identifying bytes to each chunk, encrypts it, and splits the data over the network while mapping the locations of each chunk.
The server will also update the file index for peers upon request. As more data is added to the system, I imagine that the compression can become more efficient. I would like to be able to push these new dictionary entries to peers so they can update both their chunks and the decompression system in the client software, without causing overt network strain. If encrypted, the chunks can be large without any client being aware of having part of x file.
Some client requests a file
The central node performs a lookup to determine the location of the chunks within the network and requests these chunks from peers. Once the chunks have been assembled, they are sent (still encrypted and compressed) to the client, who then translates the content into the decompressed file. It would be nice if an encrypted request could be made through a peer and relayed to a server, and onion routed through multiple paths with end-to-end encryption.
In the background, the server will be monitoring the stability and redundancy of the chunks, and if necessary will take on chunks that near extinction, and either hold them in it's own bank or redistribute them over the network if there are willing clients. In this way, the central node can shrink and grow as appropriate.
The goal is to have a network within which any client can upload or download data with no single other peer knowing who has done either, but with free and open access to all.
The system must be able to handle a massive amount of simultaneous connections while managing the peers and data library without loosing it's head.
What would be your optimal implementation?
Edit : Bounty opened.
Over the weekend, I implemented a system that does basically the above, minus part 1. For the upload, I just implemented SSL instead of forging the IP address. The system is weak in several areas. Files are split into 1MB chunks and encrypted, and sent to registered peers at random. The recipient(s) for each chunk are stored in the database. I fear that this will quickly grow too large to be manageable, but I also want to avoid having to flood the network with chunk requests. When a file is requested, the central node informs peers possessing the chunks that they need to send the chunks to x client (in p2p mode) or to the server (in direct mode), which then transfers the file down. The system is just one big hack, and written in ruby, which I imagine is not really up to the task. For the rewrite, I am considering using C++ with Boost.Asio.
I am looking for general suggestions regarding architecture and system design. I am not at all attached to my current implementation.
Current Topography
Server Handling client uploads,indexing, and content propagation
Server Handling client requests
Client for upload files and requesting files
Client Server accepting chunks and requests
I would like for the client not to have to have a persistent server running, but I can't think of a good way around it.
I would post some of the code but its embarassing. Thanks. Please ask any questions, the basic idea is to have a decent anonymous file sharing model combining the strengths of both the distributed and centralized model of content distribution. If you have a totally different idea, please feel free to post it if you want.
I would like for the client to be able
to upload using some sort of NAT
(random ip), realizing that the server
would not be able to send confirmation
packets back to the client. Is
ensuring data integrity feasible with
a header relaying the total content
length, and disregarding the entire
upload if there is a mismatch?
No, that's not feasible. If your packets are 1500 bytes, and you have 0.1% packetloss, the chance of a one megabyte file being uploaded without any lost packets is .999 ^ (1048576 / 1500) = 0.497, or under 50%. Further, it's not clear how the client would even know if the upload succeeded if the server has no way to send acknowledgements back to the client.
One way around the acknowledgement issue would be to use a rateless code, which allows the client to compute and send an effectively infinite number of unique blocks, such that any sufficiently large subset is enough to reconstruct the original file. This adds a large amount of complexity to both the client and server, however, and still requires some way to notify the client that the server has received the complete file.
It seems to me you're confusing several issues here. If your system has a centralized component to which your clients upload, why do you need to do NAT traversal at all?
For parts two and three of your question, you probably want to research Distributed Hash Tables and content-based addressing (but with major caveats explained here). Preventing the nodes from knowing the content of the files they store could be accomplished by, for example, encrypting the files with the first hash of their content, and storing them keyed by the second hash - this means that anyone who knows the hash of the file can retrieve it, but clients cannot decrypt the files they host.
In general, I would suggest starting by writing down a solid list of goals for the system you're designing, then looking for an architecture that suits those goals. In contrast, it sounds like you have some implicit goals, and have already picked a basic system architecture - which may not suit your full goals - based on that.
Sorry for arriving late at the generous 500 reputation party, but even if i am too late i would like to add a little of my research to your discussion.
Yes such a system would be nice, like Bittorrent but with encrypted files and hashes of the un-encrypted data. In BT you can add encrypted files of course, but then the hashes would be of the encrypted data and thus not possible to identify retrieval-sources without a centralized queryKey->hashCollection storage, i.e. a server that does all the work of identifying package-sources for every client. A similar system was attempted by Freenet (http://freenetproject.org/), although more limited than what you attempt.
For the NAT consideration let's first look at: aClient -> yourServer (and aClient->aClient later)
For the communication between a client and your server the NATs (and firewalls that shield the clients) are not an issue! Since the clients initiate the connection to your server (which has either fixed ip-address or a dns-entry (or dyndns)) you dont even have to think about NATs, the server can respond without an issue since, even if multiple clients are behind a single corporate firewall the firewall (its NAT) will look up with which client the server wants to communicate and forwards accordingly (without you having to tell it to).
Now the "hard" part: client -> client communication through firewalls/NAT: The central technique you can use is hole-punching (http://en.wikipedia.org/wiki/UDP_hole_punching). It works so well it is what Skype uses (from one client behind a corporate firewall to another; (if it does not succeed it uses a mirroring-server)). For this you need both clients to know the address of the other and then shoot some packets at eachother so how do they get eachother's addresses?: Your server gives the addresses to the clients (this requires that not only a requester but also every distributer open a connection to your server periodically).
Before i talk about your concern about data-integrity, the general distinction between packages and packets you could (and i guess should) make:
You just have to consider that you can separate between your (application-domain) packages (large) and the packets used for internet-transmission (small, limited by MTU among other things): It would be slow to have both be the same size, the size of a maximum tcp/ip packet is 576 (minus overhead; take a look here: http://www.comsci.us/datacom/ippacket.html ); you could do some experiments about what a good size for your packages is, my best guess is that 50k-1M would all be fine (but profiling would optimize that since we dont if most of the files you want to distribute are large or small).
About data-integrity: For your packages you definitely need a hash, i would recommend to directly take a cryptographic hash since this prevents tampering (in addition to corruption); you dont need to record the size of the packages since if the hash is faulty you have to re-transmit the package anyways. Bear in mind, that this kind of package-corruption is not very frequent if you use TCP/IP for packet transmission (yes, you can use TCP/IP even in your scenario), it automatically corrects (re-requests) transmission-errors. The huge advantage is that all computers and routers in between know TCP/IP and check for corruption automatically on every step in between the source and destination computer, so they can re-request the packet themselves which makes it very fast. They would not know about a packet-integrity-protocol you implement yourself so with that custom protocol the packet has to arrive at the destination before the re-request can even start.
For the next thought let's call the client which publishes a file the "publisher", i know this is kind of obvious, however it is important to distinguish this from "uploader", since the client does not need to upload the file to your server (just some info about it, see below).
Implementing the central indexing-server should be no problem, the problem would be that you plan to have it encrypt all the files itself instead of making the publisher do that heavy work (good encryption is very heavy lifting). The only problem with having the publisher (not the server) encrypt the data is, that you have to trust the publisher to give you reasonable search-keywords: theoretically it could give you a very attractive search-keyword every client desires together with a reference to bogus data (encrypted data is hard to distinguish from random data). But the solution to this problem is crowd-sourcing: make your server store a user-rating so downloaders can vote on files. The implementation of the table you need could be a regular-old hash-table of individual search-keywords to client-ID's (see below) who have that package. The publisher is at first the only client that holds the data, but every client that downloaded at least one of the packages should then be added to the hash-table's entry, so if the publisher goes offline and every package has been downloaded by at least one client everything continues working. Now critically the mapping client-ID->IP-Addresses is non-trivial because it changes often (e.g. every 24 hours for many clients); to compensate, you have to have another table on your server that makes this mapping and make the clients contact the server periodically (e.g. every hour) to tell it its IP-address. I would recommend using a crypto-hash for client-ID's so that it is impossible for one client to trash this table by telling you fake ID's.
For any questions and criticism, please comment.
I am not sure having one central point of attack (the central server) is a very good idea. That of course depends on the kind of problems you want to be able to handle. It also limits your scalability a lot

Resources