What is pump.io? - social-networking

Recently I have been looking into the development of social networks and I often find references to pump.io. There is however very limited information available on what pump.io actually is. The official website says nothing more than: "It's a stream server that does most of what people really want from a social network." I found some more information on this website (http://slid.es/evanp/understanding-pumpio/fullscreen#/) but that still doesn't say a lot to me.
Could someone please provide an elaborate discussion on what pump.io actually is (and does) to someone who does not know anything about (activity) stream servers? Maybe the better question is: "What is an activity stream server?"

Yeah, the term is one a lot of people are unfamiliar with and it makes a couple of distinctions that aren't immediately obvious even if you use and post to a pump.io site.
pump.io, as it is distributed, is really two programs with different sets of functions. One is the Activity Stream Server and the other is the Web Client.
At the risk of being pedantic, let me define each of the words. I know you know what the words mean, but I hope the specific contexts/usage will help:
Server: a program which distributes information (usually) across a
network.
Stream: a (usually) chronological series of some sorts of pieces of information.
Activity: a description or depiction of something someone is doing.
The Activity Stream Server is a program which distributes (server) a chronological series (stream) of posts about stuff people do (activities).
The distinction is important because the website part of a pump.io website is a client for the pump server—essentially no different from a desktop or smartphone pump.io client. It listens to the pump's stream of posts and sends new posts to the pump using the same API and data formats that standalone applications—or other pumps—do.
You could actually totally decouple the Web Client and have a fully-functioning pump.io instance without any website. Users on other pump sites could see your posts and you could see theirs, and you could comment back and forth. It would make no difference.

ActivityStream is a JSON-based data format to describe "activities". The specification of ActivityStream 2.0 can be found at https://www.w3.org/TR/activitystreams-core/ and the vocabulary of activities at https://www.w3.org/TR/activitystreams-vocabulary/. To get the feeling of how the data format looks like you can have a look at the few examples at https://www.w3.org/TR/activitystreams-core/#examples. More examples can be found throughout the two specifications.

pump.io is an activity stream server that does most of what people
really want a social network server to do.
That's a pretty packed sentence, I understand, but I can try to unwind
it a little.
"Activities" are the things we do in our on-line or off-line
life—waking up in the morning, going for a run, tasting a beer,
uploading a photo, adding a friend, eating a burrito, joining a group,
liking a blog post.
pump.io uses a simple JSON format to represent all these kinds of
activities and many more. It organizes activities into streams—time
ordered lists of activities, with the newest first. Most streams are
organized by theme, like: all the things that my friends did, or all
the things that I did, or all the things anyone has done to this
picture.
Programmers use a simple API to connect to a pump.io server and add
new activities. pump.io automatically organizes the activities into
streams and makes sure the activities get to the people who are
interested in them.
And, really, that's what we want from a social network
Behrenshausen, B. (2013). 'Interview with Evan Prodromou, lead developer of pump.io'. Retrieved from: https://opensource.com/life/13/7/pump-io

If you peer a few centimeters down the page on the official website, you'll see:
What's it for? I post something and my followers see it. That's the
rough idea behind the pump.
There's an API defined in the API.md file. It uses activitystrea.ms
JSON as the main data and command format.
You can post almost anything that can be represented with activity
streams -- short or long text, bookmarks, images, video, audio,
events, geo checkins. You can follow friends, create lists of people,
and so on.
The software is useful for at least these scenarios:
Mobile-first social networking
Activity stream functionality for an existing app
Experimenting with social software
Those last 3 items hopefully answer your question.
Currently, you can:
install the nodejs-based pump.io server
(or) sign up for an account on a public service
post notes and pictures with configurable permissions
log in to web and client applications using your webfinger ID

Related

How to handle request traffic of a background location update application

I am working on a family networking app for Android that enables family members to share their location and track location of others simultaneously. You can suppose that this app is similar with Life360 or Sygic Family Locator. At first, I determined to use a MBaaS and then I completed its coding by using Parse. However, I realized that although a user read and write geolocation data per minute (of course, in some cases geolocation data is sent less frequently), the request traffic exceeds my forward-looking expectations. For this reason, I want to develop a well-grounded system but I have some doubts about whether Parse can still do its duty if number of users increases to 100-500k.
Considering all these, I am looking for an alternative method/service to set such a system. I think using a backend service like Parse is a moderate solution but not the best one. What are the possible ways to achieve this from bad to good? To exemplify, one of my friends say that I can use Sinch which is an instant messaging service in background between users that set the price considering number of active users. Nevertheless, it sounds weird to me, I have never seen such a usage of an instant messaging service as he said.
Your comments and suggestions will be highly appreciated. Thank you in advance.
Well sinch wouldn't handle location updates or storing of location data, that would be parse you are asking about.
And since you implied that the requests would be to much for your username maybe I wrongly assumed price was the problem with parse.
But to answer your question about sending location data I would probably throttle it if I where you to aile or so. No need for family members to know down to the feet in realtime. if there is a need for that I would probably inement a request method instead and ask the user for location when someone is interested.

Where to begin with SNMP agent implementation?

before I start I realise there are a few SNMP related questions here already but not many seem to have been answered - that could mean I'm asking in the wrong place but I don't know where else to go at the moment.
I've been reading up as best I can on SNMP for a couple of days but am finding it difficult to get my head around what is meant to be happening. The idea is eventually we will integrate SNMP into our Java application server which will allow the end users to incorporate it into their pre-existing Network Management Systems(NMS).
Unfortunately I'm feeling entirely confused by what is meant to be going on. From what I understood from talking to the end users (which was unfortunately before any research) was that the monitoring allows their existing NMS to give their admin guys a view of the vital statistics in a tree type display, giving them feedback regarding different parts of the system at a high level and allowing them to dig down into specific subsystems.
From reading around we would implement an 'Agent' which has several defined interfaces allowing for GET requests etc to be processed and responded to. That makes sense but I am at a loss to work out what the format of the communication is - there don't seem to be any specific examples of what any of the messages look like, how the information is encoded.
More of my confusion though is regarding Management Information Base(MIB). I had, wrongly, assumed that the interface of the agent would allow for the monitored attributes to be requested and then in turn the values for those attributes requested. Allowing any new Agent to be started and detected without any configuration on the NMS end (with the exception of authentication in v3). This, if I understand correctly, is not the case and the Agent must instead define MIBs which can be used by the NMS to determine those attributes. My confusion is increased when people start referring to thousands of existing MIBs and that they can be reused which I don't understand. Is the intention that a single MIB definition can be used to say describe how a particular attribute of a network device (something simple like internet connected on a router:yes/no) for many different devices? If so I don't believe that our software would allow the monitoring of anything common to any other device/system but should we be looking for already exising MIBs? At the moment I don't really see any good rational for such a system, surely it would be easier for the Agent to export that information - so I'd appreciate it if someone could enlighten me!
I think it would help if I was able to setup a simple SNMP agent and some sort of client, I could begin to see the process and eventually inspect the communication between the two but am finding it difficult to find anywhere that provides any information on doing such a thing. Nagios has been recommended to us as a test 'client'/NMS but their 'get started quick' section recommends downloading a 600Mb virtual machine - surely there is a quicker way to get started?
Any help or suggestions will be appreciated, I have been through the Wiki page but it doesn't seem to go into much detail about the MIBs and the having not had to deal with anything like the referenced RFCs before, while they may contain all of the information they seem completely impenetrable to me at the moment. Or if there are any books that can be recommended for an overview and implementation of v3?
Thanks for reading and even more thanks if you think you can help!
It seems to me that you read all SNMP information piece by piece in an disorganized way. This is highly not recommended and of course lead you to confusion.
What about forgetting what you have learnt so far and dive into a good book such as Essential SNMP?
http://shop.oreilly.com/product/9780596008406.do
Click the Google Preview icon to preview it please.
You could not depend on a network forum to tell you the ABCs, as that's impractical I find out.
The communications interface is SNMP. That's the protocol used for transmission (usually on top of UDP). The thing that services information requests is an SNMP Agent. The thing that sends information requests is an SNMP Manager.
The definition of what information should be made available by the Agent, and requested by the Manager, goes in a MIB. A MIB is the "glue", a directory of what sort of things any particular system can/should offer. It maps numeric codes to names and types that allow us to make sense of the data, much like how a phone directory maps phone numbers to people's names and addresses.
Generally you would create and ship and use your own MIBs that can describe aspects specific to your own product, but you are supposed to service some standard information requests as well, which are defined in existing MIBs. Yes there are thousands of other pre-existing MIBs and the likelihood that you need more than one or two of these is remote. They are typically published versions of MIBs for existing products.
The conventional way to "toy around" is to install Net-SNMP (a software suite that includes an agent implementation and allows you to "bolt on" your own logic and your own MIBs fairly easily) then examine the results using a packet capturer like Wireshark.
For a fuller implementation in production you may stick with Net-SNMP, or write your own Agent software, or do what I did and create a hybrid of the two that's a little more flexible and performant but uses Net-SNMP's backend for handling all the low-level SNMP stuff.
Your first step, though, is to read a book or some other teaching material that can clear all your misconceptions, because guesswork won't cut it.
I had success using the samples from this page. Both the shell and Perl NetSNMP code was very straightforward to implement and query.

Performance logging/monitoring API/product

I'm not sure how to categorize this question, so let me just explain what I would like and hopefully it will make sense.
I'm after a product (with an API) which I can send different numbers to with tags, and it will take care of all the monitoring/logging stuff.
So for example, say I have a program that downloads a file from a website every 10 seconds. I would like to monitor how long each of these downloads is taking. It is quite easy in my application to time how long it takes. I would now like to send this number and tag (e.g., tag='download time', value = '1.234') to a 3rd party product. The 3rd party product will now store this value/tag for me. The product will have a website I can go to, and configure a bunch of things. So in this example, I could setup an alert like "if 'download time' > 5 send me an email". I could also visit a website, and view a graph of the logged values and maybe some random statistics (e.g., how often the value has been in the warning/error zone).
I think that's about it. Sure it wouldn't be too hard to do this myself, but I'm no web designer and it'd end up looking pretty ugly. The more user friendly this kind of product is the more willing users will be to look at the data and actually monitor stuff.
Does such a service exist?
EDIT: Products similar to this: http://dashboard.kpilibrary.com/. This is pretty much exactly what I was after, but am still searching around.
There are many monitoring tools out there. Nagios or RHQ (http://rhq-project.org/) come to mind. Most of the tools work a little different: rather than throwing stuff at them, they have plugins that actively go out and do something to do the measuring. In your example, the plugin would download the file and then report the measurement data to the central server, which can then show you graphs or run alerts on it.
On Windows, you can use this:
http://technet.microsoft.com/en-us/library/cc771692%28WS.10%29.aspx
(Windows Performance Monitor)
It pretty much does what you are looking for:
Passively collects performance data (E.g. CPU Usage)
Can be fed App specific performance metrics (E.g. download time)
Can alert you on various thresholds
Has a reporting interface for analyzing metrics
EDIT : http://technet.microsoft.com/en-us/library/cc749249.aspx , more documentation on this.
This answer is specific to Windows.
If you are looking to analyze events from various systems and you also what the opportunity to create your own events you should consider ETW.
The ETW system allows you to consume data events from any number of sub-systems. You can look at an exhaustive list of built in providers by running the following command:
logman query providers
The beauty of ETW is that you also have the opportunity to create your own providers and push your own data into the resulting report. This is a high-performance logging mechanism and is used by Windows itself for many performance investigations.
The resulting report will be an ETL file. This is a standard file that can be viewed using xPerf, ships with Windows SDK, or the build-in ETL analyzer, tracerpt.exe.

is it reasonable to protect drm'd content client side

Update: this question is specifically about protecting (encipher / obfuscate) the content client side vs. doing it before transmission from the server. What are the pros / cons on going in an approach like itune's one - in which the files aren't ciphered / obfuscated before transmission.
As I added in my note in the original question, there are contracts in place that we need to comply to (as its the case for most services that implement drm). We push for drm free, and most content providers deals are on it, but that doesn't free us of obligations already in place.
I recently read some information regarding how itunes / fairplay approaches drm, and didn't expect to see the server actually serves the files without any protection.
The quote in this answer seems to capture the spirit of the issue.
The goal should simply be to "keep
honest people honest". If we go
further than this, only two things
happen:
We fight a battle we cannot win. Those who want to cheat will succeed.
We hurt the honest users of our product by making it more difficult to use.
I don't see any impact on the honest users in here, files would be tied to the user - regardless if this happens client or server side. This does gives another chance to those in 1.
An extra bit of info: client environment is adobe air, multiple content types involved (music, video, flash apps, images).
So, is it reasonable to do like itune's fairplay and protect the media client side.
Note: I think unbreakable DRM is an unsolvable problem and as most looking for an answer to this, the need for it relates to it already being in a contract with content providers ... in the likes of reasonable best effort.
I think you might be missing something here. Users hate, hate, hate, HATE DRM. That's why no media company ever gets any traction when they try to use it.
The kicker here is that the contract says "reasonable best effort", and I haven't the faintest idea of what that will mean in a court of law.
What you want to do is make your client happy with the DRM you put on. I don't know what your client thinks DRM is, can do, costs in resources, or if your client is actually aware that DRM can be really annoying. You would have to answer that. You can try to educate the client, but that could be seen as trying to explain away substandard work.
If the client is not happy, the next fallback position is to get paid without litigation, and for that to happen, the contract has to be reasonably clear. Unfortunately, "reasonable best effort" isn't clear, so you might wind up in court. You may be able to renegotiate parts of the contract in the client's favor, or you may not.
If all else fails, you hope to win the court case.
I am not a lawyer, and this is not legal advice. I do see this as more of a question of expectations and possible legal interpretation than a technical question. I don't think we can help you here. You should consult with a lawyer who specializes in this sort of thing, and I don't even know what speciality to recommend. If you're in the US, call your local Bar Association and ask for a referral.
I don't see any impact on the honest users in here, files would be tied to the user - regardless if this happens client or server side. This does gives another chance to those in 1.
Files being tied to the user requires some method of verifying that there is a user. What happens when your verification server goes down (or is discontinued, as Wal-Mart did)?
There is no level of DRM that doesn't affect at least some "honest users".
Data can be copied
As long as client hardware, standalone, can not distinguish between a "good" and a "bad" copy, you will end up limiting all general copies, and copy mechanisms. Most DRM companies deal with this fact by a telling me how much this technology sets me free. Almost as if people would start to believe when they hear the same thing often enough...
Code can't be protected on the client. Protecting code on the server is a largely solved problem. Protecting code on the client isn't. All current approaches come with stingy restrictions.
Impact works in subtle ways. At the very least, you have the additional cost of implementing client-side-DRM (and all follow-up cost, including the horde of "DMCA"-shouting lawyer gorillas) It is hard to prove that you will offset this cost with the increased revenue.
It's not just about code and crypto. Once you implement client-side DRM, you unleash a chain of events in Marketing, Public Relations and Legal. A long as they don't stop to alienate users, you don't need to bother.
To answer the question "is it reasonable", you have to be clear when you use the word "protect" what you're trying to protect against...
For example, are you trying to:
authorized users from using their downloaded content via your app under certain circumstances (e.g. rental period expiry, copied to a different computer, etc)?
authorized users from using their downloaded content via any app under certain circumstances (e.g. rental period expiry, copied to a different computer, etc)?
unauthorized users from using content received from authorized users via your app?
unauthorized users from using content received from authorized users via any app?
known users from accessing unpurchased/unauthorized content from the media library on your server via your app?
known users from accessing unpurchased/unauthorized content from the media library on your server via any app?
unknown users from accessing the media library on your server via your app?
unknown users from accessing the media library on your server via any app?
etc...
"Any app" in the above can include things like:
other player programs designed to interoperate/cooperate with your site (e.g. for flickr)
programs designed to convert content to other formats, possibly non-DRM formats
hostile programs designed to
From the article you linked, you can start to see some of the possible limitations of applying the DRM client-side...
The third, originally used in PyMusique, a Linux client for the iTunes Store, pretends to be iTunes. It requested songs from Apple's servers and then downloaded the purchased songs without locking them, as iTunes would.
The fourth, used in FairKeys, also pretends to be iTunes; it requests a user's keys from Apple's servers and then uses these keys to unlock existing purchased songs.
Neither of these approaches required breaking the DRM being applied, or even hacking any of the products involved; they could be done simply by passively observing the protocols involved, and then imitating them.
So the question becomes: are you trying to protect against these kinds of attack?
If yes, then client-applied DRM is not reasonable.
If no (for example, you're only concerned about people using your app, like Apple/iTunes does), then it might be.
(repeat this process for every situation you can think of. If the adig nswer is always either "client-applied DRM will protect me" or "I'm not trying to protect against this situation", then using client-applied DRM is resonable.)
Note that for the last four of my examples, while DRM would protect against those situations as a side-effect, it's not the best place to enforce those restrictions. Those kinds of restrictions are best applied on the server in the login/authorization process.
If the server serves the content without protection, it's because the encryption is per-client.
That being said, wireshark will foil your best-laid plans.
Encryption alone is usually just as good as sending a boolean telling you if you're allowed to use the content, since the bypass is usually just changing the input/output to one encryption API call...
You want to use heavy binary obfuscation on the client side if you want the protection to literally hold for more than 5 minutes. Using decryption on the client side, make sure the data cannot be replayed and that the only way to bypass the system is to reverse engineer the entire binary protection scheme. Properly done, this will stop all the kids.
On another note, if this is a product to be run on an operating system, don't use processor specific or operating system specific anomalies such as the Windows PEB/TEB/syscalls and processor bugs, those will only make the program even less portable than DRM already is.
Oh and to answer the question title: No. It's a waste of time and money, and will make your product not work on my hardened Linux system.

How to implement a secure distributed social network?

I'm interested in how you would approach implementing a BitTorrent-like social network. It might have a central server, but it must be able to run in a peer-to-peer manner, without communication to it:
If a whole region's network is disconnected from the internet, it should be able to pass updates from users inside the region to each other
However, if some computer gets the posts from the central server, it should be able to pass them around.
There is some reasonable level of identification; some computers might be dissipating incomplete/incorrect posts or performing DOS attacks. It should be able to describe some information as coming from more trusted computers and some from less trusted.
It should be able to theoretically use any computer as a server, however, optimizing dynamically the network so that typically only fast computers with ample internet work as seeders.
The network should be able to scale to hundreds of millions of users; however, each particular person is interested in less than a thousand feeds.
It should include some Tor-like privacy features.
Purely theoretical question, though inspired by recent events :) I do hope somebody implements it.
Interesting question. With the use of already existing tor, p2p, darknet features and by using some public/private key infrastructure, you possibly could come up with some great things. It would be nice to see something like this in action. However I see a major problem. Not by some people using it for file sharing, BUT by flooding the network with useless information. I therefore would suggest using a twitter like approach where you can ban and subscribe to certain people and start with a very reduced set of functions at the beginning.
Incidentally we programmers could make a good start to accomplish that goal by NOT saving and analyzing to much information about the users and use safe ways for storing and accessing user related data!
Interesting, the rendezvous protocol does something similar to this (it grabs "buddies" in the local network)
Bittorrent is a mean of transfering static information, its not intended to have everyone become producers of new content. Also, bittorrent requires that the producer is a dedicated server until all of the clients are able to grab the information.
Diaspora claims to be such one thing.

Resources