Why do I sometimes get mangled replies with concurrent NSURLSession requests - macos

I am working on an OS X (Yosemite) app which downloads two types of CSV data (call them type A and type B) from the internet asynchronously using the NSURLSession API. There are multiple requests for each type of CSV. Each request is its own dedicated session wrapped in a custom class. There is one base request class with a subclass for each type. (In hindsight maybe not an ideal design, but irrelevant for my issue, I think.)
The app is constructed such that each type of csv data is downloaded in a sequential queue. Only one request of each type can be active at a time but both types can occur simultaneously and both use the main thread for delegate callbacks. All of this works fine usually.
The issue I am seeing is that sometimes with heavy traffic I get "cross hearing", i.e. I sometimes get a response back to a type B request that is reported as completed successfully but contains a number of type B CSV lines and then some type A lines tagged on after. So I sometimes (rarely) get type A data in my type B requests (or the other way around).
Basically it looks like the "switching" logic in Apple's API gets confused about which incoming packet belongs to which request/session. The two different request types go to different URLs, but they are related and it may be that they both in the end resolve to the same IP; I am not sure about that. I wonder if there may be something related to the packet headers, if they come from the same server, that makes it difficult to determine what request they belong to (I'm not good enough at the internet protocols to know if this is a sensible guess). If that is the case then the solution must be to ensure all requests are in one queue so that they cannot be active simultaneously, but I do not want to make that large architecture change before I am confident there is no other workaround.
I looked for similar questions and found this old question (Why is my data getting corrupted when I send requests asynchronously in objective c for iOS?) which appears to describe the exact same issue but unfortunately it has no answer. Other than that I found nothing similar so I guess I am doing something stupid here but it would be good to know why this issue occurs before I start changing the architecture to fix it.
Has anyone seen this before and know what the cause and workaround is?
I did not include any code as I felt there was no point given it appears to be an architecture issue and if I added code it would need to be a lot. However I will be happy to add whatever you suggest if that helps understand the question.
Edit:
The relevant (I hope) code added below. Note objects are one shot only. The parameters for the request are injected by the init method and the NSURLSession is used for a single task only. Hence the session is invalidated after launch and the NSMutableData array released after parsing of the data.
-(BOOL)executeRequest {
    // One ephemeral session per request object; this object is the session delegate.
    NSURLSessionConfiguration *theConfig = [NSURLSessionConfiguration ephemeralSessionConfiguration];
    NSURLSession *theSession = [NSURLSession sessionWithConfiguration:theConfig delegate:self delegateQueue:[NSOperationQueue mainQueue]];
    NSURLRequest *theRequest = [NSURLRequest requestWithURL:self.queryURL cachePolicy:NSURLRequestReloadIgnoringLocalCacheData timeoutInterval:BSTTIMEOUT];
    NSURLSessionDataTask *theTask = [theSession dataTaskWithRequest:theRequest];
    if(!theTask) {
        return NO;
    }
    [theTask resume];
    // Let the running task finish, then invalidate the session.
    [theSession finishTasksAndInvalidate];
    self.internetData = [NSMutableData dataWithCapacity:0];
    return YES;
}
-(void)URLSession:(NSURLSession *)session dataTask:(NSURLSessionDataTask *)dataTask didReceiveData:(NSData *)data {
    // Accumulate each chunk; the buffer is raw bytes and is not zero terminated.
    [self.internetData appendData:data];
    return;
}
-(void)URLSession:(NSURLSession *)session task:(NSURLSessionTask *)task didCompleteWithError:(NSError *)error {
    if((error)||(![self parseData]))
    {
        self.internetData = nil;
        if(!error) {
            NSDictionary *errorDictionary = @{ NSLocalizedDescriptionKey : @"Parsing of internet data failed", NSLocalizedFailureReasonErrorKey : @"Bad data was found in received buffer"};
            error = [NSError errorWithDomain:NSCocoaErrorDomain code:EIO userInfo:errorDictionary];
        }
        NSDictionary* ui = [NSDictionary dictionaryWithObject:error forKey:@"Error"];
        [[NSNotificationCenter defaultCenter] postNotificationName:[self failNotification] object:self userInfo:ui];
        return;
    }
    [[NSNotificationCenter defaultCenter] postNotificationName:[self successNotification] object:self];
    return;
}

First of all: You should not create a new session for each request. Then they are not really sessions anymore. From the docs:
With the NSURLSession API, your app creates one or more sessions, each of which coordinates a group of related data transfer tasks. For example, if you are writing a web browser, your app might create one session per tab or window, or one session for interactive use and another session for background downloads. Within each session, your app adds a series of tasks, each of which represents a request for a specific URL (following HTTP redirects if necessary).
Second: Where do you store the session et al., so it is not deallocated?
Your main problem: Obviously you start new requests while other requests are potentially still running. But you have only one NSMutableData instance that receives the data in -URLSession:dataTask:didReceiveData:. Many requests, one storage … Of course the data gets mixed up.

I finally managed to track down my (stupid) error. For future reference the issue was caused by a failure to realise that the data coming back was not zero terminated.
Most of the data requested in my case is XML, and the NSXMLParser class wants an NSData without extra trailing zeros, so that works well.
But the requests which occasionally failed use a CSV format where the data passes through an NSString created by [NSString stringWithUTF8String:], which expects a zero-terminated C-style string as input. This was the main culprit. Often it worked as it should. Sometimes it failed outright, and sometimes it just did a buffer overrun and picked up some of the previous request's data that was in the same memory area. These were the cases I noticed when posting the question.
Thus the solution is to switch to [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding], which works with NSData buffers that are not null-terminated.
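To illustrate the underlying issue, here is a minimal sketch in plain C (which also compiles as Objective-C). It assumes the received bytes sit in a buffer of known length with unrelated leftover bytes after them. The helper name copy_bytes_as_cstring is hypothetical and is not part of any Apple API; in Cocoa, initWithData:encoding: performs the equivalent length-aware conversion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy exactly `len` received bytes into a new,
 * NUL-terminated C string. The caller owns the result and must free() it.
 * Returns NULL if the allocation fails. */
char *copy_bytes_as_cstring(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(out, bytes, len);
    }
    out[len] = '\0';   /* terminate at the known length */
    return out;
}
```

Anything that scans for a terminating zero instead (as a C-string API must) will keep reading past the end of the received data until it happens to hit a zero byte, which is how bytes from an earlier response can end up appended to the current one.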

Related

GCDAsyncSocket didReadData only gets called once

I am trying to set up a Java server talking to an iPhone client using GCDAsyncSocket. For some reason my client code on the iPhone is not reading back all of the data.
I see didReadData gets called the first time, but never again. Ideally, I need to mimic the functionality of the HTTP protocol where it sends a header and then the payload. The size of the payload would be in the header. But that wasn't working, so I simplified my code even further in hopes of finding the issue. Below is the code, and below that the output.
client:
- (BOOL) sendString:(NSString *) string
{
    [asyncSocket writeData:[string dataUsingEncoding:NSUTF8StringEncoding]
               withTimeout:-1 tag:TAG_PAYLOAD];
    // Queue a read of the 1-byte "header" that precedes the payload.
    [asyncSocket readDataToLength:1 withTimeout:(-1) tag:TAG_HEADER];
    return YES;
}
- (void) socket:(GCDAsyncSocket *) sock didReadData:(NSData *)data
        withTag:(long)tag
{
    NSString *str = [[NSString alloc] initWithData:data
                                          encoding:NSUTF8StringEncoding];
    NSLog(@"Read data %@ with tag: %ld", str, tag);
    if(tag == TAG_HEADER)
    {
        //TODO - parse the header, get the fields
        [sock readDataToLength:3 withTimeout:-1 tag:TAG_PAYLOAD];
        //[sock readDataToData:[GCDAsyncSocket CRLFData] withTimeout:-1 tag:TAG_PAYLOAD];
    }
    else
    {
        NSLog(@"Payload... %@", str);
        NSLog(@"Tag: %ld", tag);
    }
}
Java server:
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
PrintWriter out = new PrintWriter(new OutputStreamWriter(clientSocket.getOutputStream()));
String clientCommand = in.readLine();
System.out.println("Client Says :" + clientCommand);
out.println("Andrew\r\nAUSTIN\r\n");
out.flush();
However, on the client, the only thing I get back is A. The exact output is:
Read data A with tag: 10
My question is:
How come the didReadData method is never called again? There should be more of "Andrew\r\nAustin" recv'd back on the client. But it just hangs. The readDataToData and readDataToLength both seem to never get the full string.
I noticed the CRLF defined in GCDAsyncSocket.h is not \r\n but instead the hex values. Does this matter? That's why I tried the readDataToLength method, but that still failed. But I would like to know if this matters cross-platform or not.
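On the hex-value point, a small check in plain C (also valid Objective-C) suggests it should not matter. It assumes an ASCII-compatible character set, where carriage return is 0x0D and line feed is 0x0A. The function name is hypothetical.

```c
#include <string.h>

/* Returns 1 if the two bytes written as hex values (0x0D, 0x0A) are
 * byte-for-byte identical to the string literal "\r\n", else 0. */
int crlf_hex_matches_escape(void)
{
    const unsigned char hex_crlf[2] = { 0x0D, 0x0A };
    const char escaped_crlf[] = "\r\n";
    return memcmp(hex_crlf, escaped_crlf, 2) == 0;
}
```

Under that assumption the two spellings describe the same two bytes on the wire, so the delimiter itself is unlikely to be the cause of the missing reads.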
Thanks.
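The header-then-payload scheme mentioned in the question can be sketched as follows, in plain C (also valid Objective-C). The wire format here is an assumption made purely for illustration and is not from the original post: a 4-byte big-endian length followed by that many payload bytes. The names parse_frame and HEADER_SIZE are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed wire format (illustrative only): a 4-byte big-endian payload
 * length, followed by that many payload bytes. */
enum { HEADER_SIZE = 4 };

/* Try to extract one frame from buf[0..len).
 * Returns the total number of bytes consumed (header + payload) and sets
 * *payload and *payload_len, or returns 0 if the frame is incomplete. */
size_t parse_frame(const unsigned char *buf, size_t len,
                   const unsigned char **payload, size_t *payload_len)
{
    if (len < HEADER_SIZE) {
        return 0;                       /* header not fully received */
    }
    uint32_t n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (len - HEADER_SIZE < n) {
        return 0;                       /* payload not fully received */
    }
    *payload = buf + HEADER_SIZE;
    *payload_len = n;
    return (size_t)HEADER_SIZE + n;
}
```

With GCDAsyncSocket the same two stages would map onto a readDataToLength: call for the fixed-size header, followed by a second readDataToLength: call for the length the header announces.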
OK - so I figured it out after pulling out what little hair I have left.
What is happening is that I have client code above in a separate class outside of the view. Practically all of the examples I came across had the GCDAsyncSocket stuff handled inside the view. It works great in there! I really didn't want to do this because on each view I need to send/read data and didn't want to duplicate my work. By placing an NSLog() line in the dealloc method of this helper class, called SocketComm, I was able to see it was getting deallocated before it was firing. So I needed to change the way I was calling my helper class. I declare SocketComm* sockComm a strong property in the viewController.h file and allocated it in the viewDidLoad() method. This means that it stays in scope the whole time. Of course, this means I need to deallocate it manually and do some other housekeeping things.
I still am not sure if this is the best way to handle this situation either, as far as memory management goes. Because now I will have to alloc this on every viewDidLoad method. It seems like it should be simpler than this, but here we are. And I still don't know why it never read the data the first time (my only guess is that the GCDAsyncSocket library or the iphone software detected a dead thread when the parent that spawned it got deallocated and decided to terminate it - but this is only a guess as I have just started objective-c).
This would also explain why sometimes it would work and sometimes it wouldn't. It seemed like it was in a race condition. Not sure if the above code I originally posted resulted in a race condition exactly, but some things I would try would work, and then the next time fail. It never read more than the first time though, and only about half the time would it even read that. Sometimes it wouldn't even send the data out over the socket!
In summation (and for whoever else comes looking for an answer):
Always check your memory management. I had to place an NSLog in dealloc() of the SocketComm helper class to fully see what was happening, and as soon as I did that I knew what the culprit was.
If you get weird results where sometimes it works and sometimes it doesn't, check your memory management. For me, sometimes it would do the first read and sometimes it wouldn't. This led me to believe the thread was getting terminated.
If I find a better way to do this I will come back and update this answer.
Memory management. Let me repeat: memory management.

Should didReceiveResponse always be called for NSURLSessionUploadTasks with custom delegates?

I'm investigating using NSURLSessionUploadTasks to manage the background uploading of a few files. The session is created using:
_urlsession = [NSURLSession sessionWithConfiguration:[NSURLSessionConfiguration backgroundSessionConfiguration:identifier] delegate:self delegateQueue:nil];
This is created within a class that conforms to NSURLSessionDataDelegate, and specifically defines:
– URLSession:dataTask:didReceiveResponse:completionHandler:
– URLSession:dataTask:didBecomeDownloadTask:
– URLSession:dataTask:didReceiveData:
It logs to the console each time one of these delegate methods is called.
Then, an upload task is created with the following code:
NSString *urlString = [NSString stringWithFormat:@"%@%@?filename=%@", HOST, UPLOAD_PATH, filename];
NSMutableURLRequest *attachmentUploadRequest = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:urlString]];
attachmentUploadRequest.HTTPMethod = @"POST";
[attachmentUploadRequest addValue:@"application/binary" forHTTPHeaderField:@"Content-Type"];
NSURLSessionTask* task = [_urlsession uploadTaskWithRequest:attachmentUploadRequest fromFile:filePath];
task.taskDescription = @"upload";
However, the sequence of delegate callbacks that I get is not as expected:
URLSession:didReceiveChallenge:completionHandler:]:196: Respond with <NSURLCredential: 0x1cf4fe00>:
URLSession:task:didSendBodyData:totalBytesSent:totalBytesExpectedToSend:]:282: Task 'upload' sent 32768 bytes
URLSession:task:didSendBodyData:totalBytesSent:totalBytesExpectedToSend:]:282: Task 'upload' sent 48150 bytes
URLSession:dataTask:didReceiveData:]:222: Task 'upload' got some data:
Notably, the body data is sent, as expected, but then it switches immediately to didReceiveData delegate callbacks, with no didReceiveResponse callback beforehand. This is an unexpected behavior for me: I'd expected to receive information about the response so that I can properly set up data, or better yet, convert the task to a download task to save the response to a file.
If the upload task is submitted in a default URL session, then didReceiveResponse is called, and I can successfully convert the task to a background download task.
I can't find any indications in Apple's documentation for whether or not didReceiveResponse should be called for NSURLSessionUploadTasks that are in the background. It seems that they should: the documentation for NSURLSessionUploadTask indicates that it is a subclass of NSURLSessionDataTask with small modifications in behavior, but neither of the listed differences involves not sending the didReceiveResponse callback. None of the background-session-specific docs mention this limitation.
Is this a bug, or have I missed/misinterpreted some piece of the documentation that explains that upload tasks in the background do not call didReceiveResponse?
I asked Apple engineers about this during recent Tech Talks. They followed up and gave the following response - not entirely satisfactory, and I feel like they should document this behavior if it is different than any other HTTP handling flow. Especially since the foreground behavior does get the didReceiveData, but doesn't get the didReceiveResponse. At the very least they need to document this non-obvious behavior.
"The way things work today is that we don’t send the didReceiveResponse callback for background uploads to avoid waking the app if it’s not already running. The drawback is that the app cannot choose to convert the background upload into a download task when the response is received. Our decision was based on expecting the response data for a file upload would be small and therefore delivering the response data to the client as NSData instead of a downloaded file would be fine."

Design strategy to save in the background with MagicalRecord

Recently I started a new app requiring just one store (no document based app). For some time I was quite happy thinking I could finally get rid of throwing around the NSManagedObjectContext... until I wanted to save in the background :-(
Now I am confused about my own code. For example:
- (void)awakeFromInsert
{
    [super awakeFromInsert];
    [self resetCard];
    self.creationDate = TODAY;
    self.dictionary = [Dictionary activeDictionary];
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    [center postNotificationName:NOTE_NEWCARD object:self];
}
[Dictionary activeDictionary] is an NSManagedObject static function returning a pointer to an NSManagedObject created in the main thread. That will cause a cross-context error during the background save. Because my program always reads from the same store, I thought I could avoid writing this:
[Dictionary activeDictionaryWithContext:...]
I suppose that with MagicalRecord, as long as I always work with the same backend, it is possible to avoid passing the context pointer. Which function should I use to get that context?
[NSManagedObjectContext MR_defaultContext]
[NSManagedObjectContext MR_context]
[NSManagedObjectContext MR_contextForCurrentThread]
In the example the object sends itself within a notification, something almost guaranteed to cause more conflicts.
In the case of the notification should I always send only the objectID?
It seems to me that my objects should issue side effect operations/notifications only if they are running in the main context. However some of those side operations change my object graph creating new instances of other entities.
Can I safely omit the two problematic function calls I have mentioned if I save with [MagicalRecord MR_saveAll] ?
Should I assume that the objects of the new background saving context will be an exact copy of the ones in my main thread without calling those extra functions?
Now I am having problems because I never expected awakeFromInsert to run several times for the same object of the same store. I was thinking about something like this:
- (void)awakeFromInsert
{
    [super awakeFromInsert];
    if ([self managedObjectContext] == [NSManagedObjectContext MR_defaultContext]) {
        [self resetCard];
        self.creationDate = TODAY;
        self.dictionary = [Dictionary activeDictionary];
        NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
        [center postNotificationName:NOTE_NEWCARD object:self];
    }
}
That should make my awakeFromInsert code run only once, and not in the background saving context. I am concerned about losing information if I do so.
While you can certainly send your object in a notification that way, I would recommend against that. Remember, even with the new parent-child contexts in CoreData, NSManagedObjects are NOT thread safe. If you create or import objects, you will need to save them prior to using them in another context.
MagicalRecord provides a relatively simple API for background saving:
[MagicalRecord saveInBackgroundWithBlock:^(NSManagedObjectContext *localContext){
    MyEntity *newEntity = [MyEntity MR_createInContext:localContext];
    //perform other entity operations here
}];
This block does all the work for you, without worrying about setting up the NSManagedObjectContext properly.
Another reason you should not pass NSManagedObjects across a notification is that you do not know what thread the notification will be received on. This can potentially lead to a crash, because, again, NSManagedObjects are NOT thread safe.
Another alternative to the notification approach you present is to add an observer to NSManagedObjectContextDidSaveNotification, and merge your changes on that notification. This will fire only after your objects are saved, and are safe for crossing contexts through either the parent-child relationship, or the persistent store (the old way).

How can I make multiple calls to initWithContentsOfURL without it eventually returning the wrong stuff?

I'm doing multiple levels of parsing of web pages where I use information from one page to drill down and grab a "lower" page to parse. When I get to the lowest level of my hierarchy, I no longer hit a new page, I basically hit the same one (with different parameters) and make SQL database entries.
If I don't slow things down (by putting a sleep(1)) before that inner loop, initWithContentsOfURL eventually returns a kind of stub piece of HTML. Here's the code I use to get my HTML nodes:
NSError *err = nil;
NSString* webStringURL = [sURL stringByAddingPercentEscapesUsingEncoding: NSUTF8StringEncoding];
NSData *contentData = [[[NSData alloc] initWithContentsOfURL: [NSURL URLWithString: webStringURL]
options: 0
error: &err] autorelease];
NSString *dataString = [[[NSString alloc] initWithData: contentData
encoding: NSISOLatin1StringEncoding] autorelease];
NSData *data = [dataString dataUsingEncoding: NSUTF8StringEncoding];
TFHpple *xPathDoc = [[[TFHpple alloc] initWithHTMLData: data] autorelease];
It works fine with 4 levels of looping. In fact, it can run 24/7 with no real memory leak problem. It only dies when I have a connection issue. That is, as long as I put in the sleep(1) before the innermost loop.
It's like it's too fast and initWithContentsOfURL can't keep up. I suppose I could try to do something asynchronous but this is not for user-consumption and the direct synchronous looping works just fine... almost. I've tried different ways of slowing things down. Pausing for one second on a regular basis works but if I take that out, it starts getting bogus data after about 10 times through the inner loop. Is there a way to handle this properly?
I don't think it's a problem of initWithContentsOfURL; rather, I suspect it's the server or network that is unable to respond that quickly.
The following assumes that's the case.
If you want to receive network errors and/or server response errors, you need to use NSURLConnection. There's no way to get notified about such an error from initWithContentsOfURL. If you know what the stub page looks like, or if you know a magic string that appears in every successful response, you can check the returned NSData against those.
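The "magic string" check could look roughly like this sketch in plain C (also valid Objective-C). It works on a raw byte buffer with an explicit length, such as the bytes and length of the returned NSData. The function name response_contains is hypothetical, and the marker to search for is something you would have to choose from your own pages.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the NUL-terminated `marker` occurs anywhere in the first
 * `len` bytes of `data`, otherwise 0. `data` need not be NUL-terminated. */
int response_contains(const unsigned char *data, size_t len, const char *marker)
{
    size_t mlen = strlen(marker);
    if (mlen == 0) {
        return 1;                       /* an empty marker always matches */
    }
    if (data == NULL || len < mlen) {
        return 0;
    }
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(data + i, marker, mlen) == 0) {
            return 1;
        }
    }
    return 0;
}
```

If the marker is missing, the loop could wait and request the same URL again rather than sleeping unconditionally before every request.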

Suggestions needed for architecting my code

Background
I'm writing a part of my app that has no UI. It sits in the background watching what you do and timing your work.
There should be no overlapping times, and there should be no breaks in the time data. If there are either of these things, the app has a bug somewhere and I need to be notified.
What I Want
A class called JGDataIntegrityController that does the following:
Check the data store for duplicate times. Scan since the last Duplicate Report Date stored in NSUserDefaults.
If duplicate times are found, build a report.
Send the report.
If the sending isn't successful, then exit. Otherwise continue.
Remove the duplicates
Update the last Duplicate Report Date in NSUserDefaults
Repeat the above for data breaks.
What I've Got
I've made a base class that does all the hard work of sending the report.
Class Diagram http://synapticmishap.co.uk/ReportClasses.jpg
JGReportSender has the following code:
-(void)postReport:(NSString *)report {
    NSMutableDictionary *form = // Dictionary Holding Report;
    NSURLRequest *request = [NSURLRequest requestWithURL:@"http://postURL" postForm:form];
    [NSURLConnection connectionWithRequest:request delegate:self];
}
Where I'm Getting Stuck
What should I do when the report has been sent?
The delegate methods:
-(void)connectionDidFinishLoading:(NSURLConnection *)connection
-(void)connection:(NSURLConnection *)connection didFailWithError:(NSError*)error
are called when the report has been sent. But how should I communicate with JGDataIntegrityController?
My Crap Idea
My idea is to have a reportStatus NSNumber property in JGReportSender. Then when the delegate methods get called, this is updated.
reportStatus = 1 means "report sent OK".
reportStatus = 2 means "problem sending report".
Then I could add an observer for reportStatus for JGDataDuplicateReportSender and JGDataBreakReportSender. This would then handle the report sending error or continue on.
Any Good Ideas?
I get the feeling this is a really messy way of doing this. I also feel like I'm overlooking something really obvious.
Any ideas how to do this in a neat way?
Update
I totally forgot to mention - this will be a 100% opt in feature. It'll be disabled by default. It'll also have 3 levels of privacy - from "a data break occurred" through to "a data break occurred after this application was active with this document path". And the reports will also be anonymous.
I'm conscious of all the privacy concerns - this is so I can make the software better, not so I can spy on people!
Give the report sender a delegate property and protocol, with at least two methods: reportSenderDidSucceed: and reportSender:failedWithError:. The report sender will send the latter message from its connection:didFailWithError: method, passing along the error object it got.
I do hope you'll make this feature optional. Expect lots of angry/curious email from users (not to mention public warnings of “don't use this app because it phones home” on web pages) if you don't.
Just a quick note to say if anyone wants a good tutorial on implementing your own delegates as Peter is suggesting I do, I found this one:
http://cocoadevcentral.com/articles/000075.php
Check it out. It's excellent!

Resources