NEFilterProvider: record network activity - macOS

NEFilterProvider, or more specifically its two subclasses NEFilterDataProvider and NEFilterPacketProvider, has the functionality to allow or deny network activity. However, I couldn't find any way to log that activity for debugging purposes.
I know the documentation says this:
it runs in a very restrictive sandbox. The sandbox prevents the Filter Data Provider extension from moving network content outside of its address space by blocking all network access, IPC, and disk write operations.
but is there any trick to log this anyway in debug mode? Maybe using os_log or something like that?

Yes, you can use os_log and read the output in the Console app. If you want to work around the privacy redaction (while developing/testing), use the %{public} prefix, like so:
import os.log

// ...somewhere in the provider class
os_log("something I want to log: %{public}@", someVar)
You're right, the documentation is really, really lacking in this area, apart from the SimpleFirewall sample code and the WWDC video. I have an app in production using NEFilterDataProvider, but it just about cost me my sanity to figure out how to put it all together. At some point I'm going to try to write some blog posts or make a demo repo, to help create a central community resource that shares knowledge and fills in the gaps in the documentation with hard-won lessons.

Related

Should I make my CouchDB database server public-facing?

I'm new to CouchDB and am trying to understand how to make proper use of it. I'm coming from MongoDB, where I would always write a web layer and put it in front of Mongo so that I could control users' access to the data inside it; in fact, this is how I've used every database for every web site I've ever written. Looking at Couch, I see that its native API is HTTP and that it has built-in things like OAuth support, and other features that hint that perhaps I should no longer have my code layer sitting in front of Couch, but instead write views and the like and just give my users accounts on Couch directly? I'm thinking in terms of an HTTP-based API for a site of mine, or something through which users would consume my data. Opening up Couch like this seems odd to me, though. Is OAuth, in Couch's sense, meant more for remote access by software that I'd write and run "officially" inside my own network, or is it literally meant for end users?
I know there might be things that could only be done through a code layer on top of CouchDB, for instance if you wanted additional, non-database-related things to happen during API requests. So, thinking along those lines, I suspect I'll still need a code layer anyway.
Dealer's choice.
Nodejitsu has a great writeup on this sort of topic here.
Not knowing your application specifics, I'll take a broad approach...
Back-end
If you want to prevent users from ever seeing your database, then make it back-end. You can pipe everything through something like Node.js, present only what the user needs to see, and they'll never know anything about the database.
See Resource View Presenter
Front-end
If you are not concerned about data security, you can host an entire app on CouchDB; see CouchApp. This approach has the benefit of using the replication mechanism to control the publishing of your site/data. The drawback is that you will almost certainly run into technical limitations that will require moving CouchDB closer to the back end.
Bl-end
Have the app server present the interface and the client pull the data from the database separately. This gives the most flexibility, but it can be a bag of hurt, because even with good design it can lead to supportability and scalability issues.
My recommendation
Use CouchDB on the backend. If you need mobile clients to synchronize then use a secondary DB publicly exposed for this purpose and selectively sync this data to wherever it needs to go.
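To make the selective-sync idea concrete, here is a rough sketch that kicks off a one-shot filtered replication through CouchDB's _replicate endpoint, using Java 11's built-in HTTP client. The host names, database names, and the app/public_docs filter (a JavaScript function that would live in a design document) are all placeholders:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReplicateExample {
    public static void main(String[] args) throws Exception {
        // Push only documents that pass the "app/public_docs" filter
        // from the private DB to the publicly exposed one.
        String body = "{\"source\": \"internal_db\", "
                + "\"target\": \"http://public-host:5984/public_db\", "
                + "\"filter\": \"app/public_docs\"}";
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:5984/_replicate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}

In this setup only the public-facing database is ever reachable from outside, and the filter decides exactly which documents it receives.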
Simply put, no.
There's no way to secure Couch properly on a public-facing site. There's no way to discriminate access at a fine enough level of granularity: if someone has access to any of the data, they have access to all of the data.
Not all data on a site is meant for public consumption, save for the most trivial of sites.

Where to begin with SNMP agent implementation?

Before I start, I realise there are a few SNMP-related questions here already, but not many seem to have been answered; that could mean I'm asking in the wrong place, but I don't know where else to go at the moment.
I've been reading up as best I can on SNMP for a couple of days but am finding it difficult to get my head around what is meant to be happening. The idea is that we will eventually integrate SNMP into our Java application server, which will allow the end users to incorporate it into their pre-existing Network Management Systems (NMS).
Unfortunately I'm feeling entirely confused about what is meant to be going on. What I understood from talking to the end users (which was, unfortunately, before any research) was that the monitoring allows their existing NMS to give their admin guys a view of the vital statistics in a tree-type display, giving them feedback on different parts of the system at a high level and allowing them to dig down into specific subsystems.
From reading around, it seems we would implement an 'Agent' which has several defined interfaces allowing GET requests etc. to be processed and responded to. That makes sense, but I am at a loss to work out what the format of the communication is; there don't seem to be any specific examples of what the messages look like or how the information is encoded.
More of my confusion, though, is regarding the Management Information Base (MIB). I had wrongly assumed that the agent's interface would allow the monitored attributes to be requested, and then in turn the values of those attributes, allowing any new Agent to be started and detected without any configuration on the NMS end (with the exception of authentication in v3). This, if I understand correctly, is not the case: the Agent must instead define MIBs which the NMS can use to determine those attributes. My confusion only increases when people refer to thousands of existing MIBs that can be reused. Is the intention that a single MIB definition can describe a particular attribute of a network device (something simple, like whether a router is connected to the internet: yes/no) across many different devices? If so, I don't believe our software would allow the monitoring of anything common to any other device/system, but should we be looking for already existing MIBs? At the moment I don't really see any good rationale for such a system; surely it would be easier for the Agent to export that information itself, so I'd appreciate it if someone could enlighten me!
I think it would help if I were able to set up a simple SNMP agent and some sort of client; then I could begin to see the process and eventually inspect the communication between the two, but I'm finding it difficult to find anywhere that provides information on doing such a thing. Nagios has been recommended to us as a test 'client'/NMS, but its 'get started quick' section recommends downloading a 600 MB virtual machine; surely there is a quicker way to get started?
Any help or suggestions will be appreciated. I have been through the Wiki page, but it doesn't seem to go into much detail about MIBs, and, not having had to deal with anything like the referenced RFCs before, I find them completely impenetrable at the moment, even though they may contain all of the information. Or are there any books that can be recommended for an overview and implementation of v3?
Thanks for reading and even more thanks if you think you can help!
It seems to me that you have read all this SNMP information piece by piece, in a disorganized way. That is not recommended and of course leads to confusion.
What about forgetting what you have learnt so far and diving into a good book such as Essential SNMP?
http://shop.oreilly.com/product/9780596008406.do
Click the Google Preview icon to preview it.
You can't depend on a network forum to teach you the ABCs; I've found that impractical.
The communications interface is SNMP. That's the protocol used for transmission (usually on top of UDP). The thing that services information requests is an SNMP Agent. The thing that sends information requests is an SNMP Manager.
The definition of what information should be made available by the Agent, and requested by the Manager, goes in a MIB. A MIB is the "glue", a directory of what sort of things any particular system can/should offer. It maps numeric codes to names and types that allow us to make sense of the data, much like how a phone directory maps phone numbers to people's names and addresses.
Generally you would create, ship, and use your own MIBs that describe aspects specific to your own product, but you are expected to service some standard information requests as well, which are defined in existing MIBs. Yes, there are thousands of pre-existing MIBs, and the likelihood that you need more than one or two of them is remote; they are typically the published MIBs of existing products.
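Since the question mentions a Java application server: here is a rough manager-side sketch using the open-source SNMP4J library (just one common choice, not something this answer prescribes; the API shown is the SNMP4J 2.x style, and the address/community string are placeholders). It sends a GET for sysDescr.0, an object every agent is expected to serve because it is defined in the standard SNMPv2-MIB:

import org.snmp4j.CommunityTarget;
import org.snmp4j.PDU;
import org.snmp4j.Snmp;
import org.snmp4j.event.ResponseEvent;
import org.snmp4j.mp.SnmpConstants;
import org.snmp4j.smi.GenericAddress;
import org.snmp4j.smi.OID;
import org.snmp4j.smi.OctetString;
import org.snmp4j.smi.VariableBinding;
import org.snmp4j.transport.DefaultUdpTransportMapping;

public class SnmpGetExample {
    public static void main(String[] args) throws Exception {
        // Agent assumed to listen on localhost:161 with the "public" community (SNMPv2c)
        CommunityTarget target = new CommunityTarget();
        target.setCommunity(new OctetString("public"));
        target.setAddress(GenericAddress.parse("udp:127.0.0.1/161"));
        target.setVersion(SnmpConstants.version2c);
        target.setRetries(1);
        target.setTimeout(2000);

        // sysDescr.0 is OID 1.3.6.1.2.1.1.1.0; the MIB is what maps this number to a name
        PDU pdu = new PDU();
        pdu.setType(PDU.GET);
        pdu.add(new VariableBinding(new OID("1.3.6.1.2.1.1.1.0")));

        Snmp snmp = new Snmp(new DefaultUdpTransportMapping());
        snmp.listen();
        ResponseEvent event = snmp.send(pdu, target);
        if (event != null && event.getResponse() != null) {
            System.out.println(event.getResponse().get(0)); // e.g. "1.3.6.1.2.1.1.1.0 = Linux ..."
        }
        snmp.close();
    }
}

Watching this exchange in Wireshark (as suggested just below) is a good way to see what the BER-encoded SNMP messages actually look like on the wire.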
The conventional way to "toy around" is to install Net-SNMP (a software suite that includes an agent implementation and allows you to "bolt on" your own logic and your own MIBs fairly easily) then examine the results using a packet capturer like Wireshark.
For a fuller implementation in production you may stick with Net-SNMP, or write your own Agent software, or do what I did and create a hybrid of the two that's a little more flexible and performant but uses Net-SNMP's backend for handling all the low-level SNMP stuff.
Your first step, though, is to read a book or some other teaching material that can clear all your misconceptions, because guesswork won't cut it.
I had success using the samples from this page. Both the shell and Perl Net-SNMP code were very straightforward to implement and query.

Getting started with Single Sign-On / Windows Authentication

First off, the problem:
We have a web app with a Flash front end that talks to our ASP.NET web service via SOAP, which then deals with all of our server-side code (C#).
Right now, we implement a simple user sign-on in our application, storing the info in our MSSQL DB.
A client has requested what I understand to be Windows authentication through our application, using the currently logged-in user.
So, I have been tasked with investigating this. Nobody, including myself, has any experience in this area.
I have been reading up on some basic Active Directory information and some simple tutorials. I understand how to get access to the directory using ADSI through code. What I'm really interested in seeing is how the entire thing should be architected; I don't want to throw together a hacky solution.
Does anyone know of a good tutorial for this kind of thing or have any advice on getting started? More importantly, does this even sound viable?
I know I haven't given much information, but feel free to ask and I will provide answers.
Thanks.
Edit:
Will, to give you an idea of the scope of this: the network will include every computer in a large hospital. So yes, this is huge. Clearly I need to start small, so I would like to come up with something that will work at my office first: maybe ~10 Windows computers on a single domain, with one domain controller.
I am also open to any good books on the subject.
If you are going to tie into Active Directory, you will want to take a look at the System.DirectoryServices namespace. The implementations can vary wildly depending on your system architecture, but this should give you a good starting point.
Enjoy!

How to implement a secure distributed social network?

I'm interested in how you would approach implementing a BitTorrent-like social network. It might have a central server, but it must be able to run in a peer-to-peer manner, without communicating with it:
If a whole region's network is disconnected from the internet, it should be able to pass updates between users inside the region.
However, if some computer gets the posts from the central server, it should be able to pass them around.
There is some reasonable level of identification; some computers might be disseminating incomplete/incorrect posts or performing DoS attacks. It should be able to treat some information as coming from more trusted computers and some from less trusted ones.
It should theoretically be able to use any computer as a server, while dynamically optimizing the network so that typically only fast computers with ample bandwidth act as seeders.
The network should be able to scale to hundreds of millions of users; however, each particular person is interested in fewer than a thousand feeds.
It should include some Tor-like privacy features.
Purely theoretical question, though inspired by recent events :) I do hope somebody implements it.
Interesting question. Using the already existing Tor, P2P, and darknet features, plus some public/private key infrastructure, you could possibly come up with some great things. It would be nice to see something like this in action. However, I see a major problem: not some people using it for file sharing, BUT the network being flooded with useless information. I would therefore suggest a Twitter-like approach, where you can ban and subscribe to certain people, and starting with a very reduced set of functions at the beginning.
Incidentally, we programmers could make a good start toward that goal by NOT saving and analyzing too much information about users, and by using safe ways of storing and accessing user-related data!
Interestingly, the Rendezvous protocol does something similar to this (it grabs "buddies" on the local network).
BitTorrent is a means of transferring static information; it's not intended to have everyone become a producer of new content. Also, BitTorrent requires that the producer remain a dedicated server until all of the clients are able to grab the information.
Diaspora claims to be one such thing.

Logging vs. Debugging

Background: I've inherited a web application that is intended to create on-the-fly connections between local and remote equipment. There are a tremendous number of moving parts at the moment: the app itself has changed significantly, the development toolchain was just updated, and both the local and remote equipment have been "modified" to support those changes.
The bright side is that it has a reasonable logging system that writes debug messages to a file and can log to both the file and a real-time user screen. I have an opportunity to rework the entire log/debug mechanism.
Examples (a code sketch follows the list):
All messages are time-stamped and prefixed with a severity level.
Logs are for the customer. They record the system's responses to his/her requests.
Any log that identifies a problem also suggests a solution.
Debugs are for developers and Tech Support. They reveal the system internals.
Debugs indicate the function and/or line that generated them.
The customer can adjust the debug level on the fly to set the verbosity.
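Here is a minimal sketch of what those conventions could look like in code; the class name, severity set, and message formats are invented, just to make the bullets concrete:

import java.time.Instant;

public final class Log {
    public enum Severity { DEBUG, INFO, WARN, ERROR }

    // Every message is timestamped and prefixed with a severity level
    public static void write(Severity sev, String message, String suggestion) {
        StringBuilder sb = new StringBuilder()
                .append(Instant.now()).append(" [").append(sev).append("] ").append(message);
        if (suggestion != null) {
            // a log that identifies a problem also suggests a solution
            sb.append(" | try: ").append(suggestion);
        }
        System.out.println(sb);
    }
}

// usage: Log.write(Log.Severity.ERROR, "remote device did not respond", "check the cable and retry");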
Question: What best practices have you used as a developer, or seen as a consumer, that generate useful logs and debugs?
Edit: Many helpful suggestions so far, thanks! To clarify: I'm more interested in what to log (content, format, etc.), and the reasons for doing so, than in specific tools.
What was it about the best logs you've seen that made them most helpful?
Thanks for your help!
Don't confuse logging, tracing, and error reporting; some people I know do, and it creates one hell of a log file to grep through in order to get the information I want.
If I want to have everything churned out, I separate it into the following:
Tracing -> dumps every action and step, timestamped, with the input and output data of that stage (the ugliest and largest file)
Logging -> logs the business process steps only; e.g. the client does an enquiry, so log the enquiry criteria and output data, nothing more
Error Reporting / Debugging -> exceptions logged, detailing where they occurred, timestamped, with input/output data if possible, user information, etc.
That way, if an error occurs and the error/debug log doesn't contain enough information for my liking, I can always do a grep -A 50 -B 50 'timestamp' tracing_file to get more detail.
EDIT:
As has also been said, sticking to standard packages, such as the built-in logging module for Python, is always good. Rolling your own is not a great idea unless the language does not have one in its standard library. I do like wrapping the logging in a small function that takes the message and a value determining which logs it goes to, i.e. 1 = tracing, 2 = logging, 4 = debugging, so sending a value of 7 writes to all three, etc.
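A minimal sketch of that wrapper idea (the flag values, file names, and class name are all made up):

import java.io.FileWriter;
import java.io.IOException;
import java.time.Instant;

public final class LogRouter {
    // Bit flags for each destination; TRACE | LOG | DEBUG == 7 writes to all three
    public static final int TRACE = 1, LOG = 2, DEBUG = 4;

    public static void write(int destinations, String message) {
        String line = Instant.now() + " " + message;
        if ((destinations & TRACE) != 0) append("trace.log", line);
        if ((destinations & LOG) != 0) append("app.log", line);
        if ((destinations & DEBUG) != 0) append("debug.log", line);
    }

    private static void append(String file, String line) {
        // Append-only for simplicity; a real version would keep the files open
        try (FileWriter out = new FileWriter(file, true)) {
            out.write(line + System.lineSeparator());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

// usage: LogRouter.write(LogRouter.TRACE | LogRouter.DEBUG, "stage 2: input=...");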
The absolutely most valuable thing you can do with any logging framework is a "1-click" tool that gathers all the logs and mails them to me, even when the application is deployed on a machine belonging to a customer.
And make good choices about what to log, so you can roughly follow the main paths through your application.
As frameworks, I've used the standards (log4net, log4j, log4c++).
Do NOT implement your own logging framework when there is already a good one out of the box. Most people who do just reinvent the wheel.
Some people never use a debugger but log everything. Those are different philosophies; you have to make your own choice. You can find plenty of advice like this, or this one. Note that this advice is not language-specific...
The Coding Horror guy has an interesting post about the logging problem and why excessive logging can be a waste of time in certain conditions.
I simply believe logging is for tracing things that may remain in production. Debugging is for development. Maybe that's too simple a way of seeing things, because some people use logs for debugging since they can't stand debuggers. But debugger mode can be a waste of time too: you shouldn't use it as a sort of test case, because nothing is written down and it will disappear after the debug session.
So I think my opinion about this is:
logging for necessary and useful traces through development and production environments, with development and production levels, using a logging framework (the log4 family of tools)
debugging mode for special, strange cases when things are going out of control
test cases, used as an anti-regression method, are important and can save time spent in infernal, labyrinthine debugging sessions. Note that most people don't use test cases.
Coding Horror said to resist the tendency to log everything. That's right, but I've already seen a huge app that does the exact opposite in a clean way (and through a database)...
I would just set up your logging system to have multiple logging levels. In the services I write, I have logging/auditing for almost every action, and each is assigned an audit level from 1 to 5; the higher the number, the more audit events you get:
1 - Very basic logging: starting, stopping, and restarting
2 - Basic logging: processing x number of files, etc.
3 - Standard logging: beginning processing, finished processing, etc.
4 - Advanced logging: the beginning and end of every stage in processing
5 - Everything: every action taken
You set the audit level in a config file so it can be changed on the fly; a rough sketch of the idea follows.
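For example, a minimal sketch (the property name audit.level and the exact level semantics are my assumptions based on the scheme above):

import java.io.FileInputStream;
import java.io.IOException;
import java.time.Instant;
import java.util.Properties;

public final class Audit {
    private static volatile int level = 3; // default to "standard logging"

    // Re-read the config file so verbosity can be changed without a restart
    public static void reload(String path) throws IOException {
        Properties p = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            p.load(in);
        }
        level = Integer.parseInt(p.getProperty("audit.level", "3"));
    }

    public static void log(int msgLevel, String msg) {
        if (msgLevel <= level) { // level 5 lets everything through, level 1 almost nothing
            System.out.println(Instant.now() + " [level " + msgLevel + "] " + msg);
        }
    }
}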
Some general rules of thumb I have found useful in server-side applications:
requestID - assign a request ID to each incoming (HTTP) request and then log it on every log line, so you can later grep the logs by that ID and find all the relevant lines. If adding the ID to every log statement sounds tedious: the Java logging frameworks, at least, have made it transparent via the Mapped Diagnostic Context (MDC); see the sketch after this list.
objectID - if your application/service manipulates business objects that have a primary key, it is useful to attach that primary key to the diagnostic context as well. Later, if someone asks "when was this object manipulated?", you can grep by the objectID and see all log records related to that object. In this situation it is (sometimes) useful to use the Nested Diagnostic Context (NDC) instead of the MDC.
when to log? - at a minimum, log whenever you cross an important service/component boundary. That way you can later reconstruct the call flow and drill down into the particular codebase that seems to cause the error.
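Here is a minimal sketch of the requestID idea using SLF4J's MDC; the class name and MDC key are invented, and the logging pattern would need %X{requestId} for the ID to actually appear on each line:

import java.util.UUID;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class RequestHandler {
    private static final Logger log = LoggerFactory.getLogger(RequestHandler.class);

    public void handle() {
        // Attach a per-request ID; every log call on this thread now carries it
        MDC.put("requestId", UUID.randomUUID().toString());
        try {
            log.info("request received");
            // ... business logic; any logging in called code gets the same requestId
        } finally {
            MDC.remove("requestId"); // don't leak the ID to the next request on this thread
        }
    }
}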
As I'm a Java developer, I will also give my experience with Java APIs and frameworks.
API
I'd recommend using the Simple Logging Facade for Java (SLF4J); in my experience, it is the best logging facade:
full-featured: it has not followed the least-common-denominator approach (like commons-logging); instead, it uses a graceful-degradation approach.
has adapters for practically all popular Java logging frameworks (e.g. log4j)
has solutions available on how to redirect all legacy logging APIs (log4j, commons-logging) to SLF4J
Implementation
The best implementation to use with SLF4J is Logback, written by the same guy who created the SLF4J API.
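A minimal sketch of the SLF4J API in use (assuming slf4j-api plus Logback on the classpath; the class and messages are invented):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void process(int orderId, String user) {
        // {} placeholders are only rendered when the level is enabled,
        // so suppressed levels cost almost nothing
        log.debug("processing order {} for user {}", orderId, user);
        log.info("order {} accepted", orderId);
    }
}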
Use an existing logging format, such as that used by Apache, and you can then piggyback on the many tools available for analysing the format.
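For instance, a line in Apache's Common Log Format (host ident authuser [date] "request" status bytes) can be produced like this; the field values are invented:

import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class ClfExample {
    public static void main(String[] args) {
        // CLF timestamp, e.g. 10/Oct/2000:13:55:36 -0700
        String date = ZonedDateTime.now()
                .format(DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH));
        System.out.printf("127.0.0.1 - frank [%s] \"GET /index.html HTTP/1.0\" 200 2326%n", date);
    }
}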
