Most efficient way to pull first non-whitespace line from NSTextView? - cocoa

What is the most efficient way to pull the first non-whitespace line from an NSTextView?
For example, if the text is:
\n
\n
\n
This is the text I want \n
\n
Foo bar \n
\n
The result would be "This is the text I want".
Here is what I have:
NSString *content = self.textView.textStorage.string;
NSInteger len = [content length];
NSInteger i = 0;
// Scan past leading whitespace and newlines
while (i < len && [[NSCharacterSet whitespaceAndNewlineCharacterSet] characterIsMember:[content characterAtIndex:i]]) {
i++;
}
// Now, scan to first newline
while (i < len && ![[NSCharacterSet newlineCharacterSet] characterIsMember:[content characterAtIndex:i]]) {
i++;
}
// Grab the substring up to that newline
NSString *resultWithWhitespace = [content substringToIndex:i];
// Trim leading and trailing whitespace/newlines from the substring
NSString *result = [resultWithWhitespace stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
Is there a better, more efficient way?
I'm thinking of putting this in the -textStorageDidProcessEditing: NSTextStorageDelegate method so I can get it as the text is edited. That's why I'd like the method to be as efficient as possible.

Just use NSScanner which is designed for this sort of thing:
NSString* output = nil;
NSScanner* scanner = [NSScanner scannerWithString:yourString];
[scanner scanCharactersFromSet:[NSCharacterSet whitespaceAndNewlineCharacterSet] intoString:NULL];
[scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&output];
output = [output stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
Note that it's much faster if you can scan up to a particular character rather than a character set:
[scanner scanUpToString:@"\n" intoString:&output];
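For comparison, here is a minimal C sketch of the same single-pass idea: skip leading whitespace, stop at the first newline, then drop trailing whitespace. The name first_nonblank_line is hypothetical, and the sketch assumes a NUL-terminated buffer (for example the bytes returned by -UTF8String) in which only ASCII whitespace needs to be recognised.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first line of `text` that contains a
 * non-whitespace character and reports it, trimmed, as a (start, length)
 * slice of the original buffer. Nothing is allocated or copied.
 * Returns 1 if such a line exists, 0 if the text is entirely blank. */
int first_nonblank_line(const char *text, const char **start, size_t *length)
{
    assert(text != NULL && start != NULL && length != NULL);

    const char *p = text;
    /* Skip leading whitespace, including blank lines. */
    while (*p != '\0' && isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '\0') {
        return 0;
    }

    /* The line ends at the next newline or at the end of the string. */
    const char *end = p + strcspn(p, "\r\n");

    /* Drop trailing whitespace on that line. */
    while (end > p && isspace((unsigned char)end[-1])) {
        end--;
    }

    *start = p;
    *length = (size_t)(end - p);
    return 1;
}
```

Because the result is a slice of the input, the work per edit is bounded by the length of the leading blank lines plus the first real line, not by the length of the whole text.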

Related

Truncate the last line of multi-line NSTextField

I'm trying to create a text field similar to Finder's file labels. I would like the last (second) line to be truncated in the middle.
I started with a multi-line NSTextField.
However, calling [self.cell setLineBreakMode:NSLineBreakByTruncatingMiddle]; results in the text field showing only a single truncated line (no line breaks anymore).
Here is what it looks like in Finder:
If you want to wrap text like Finder labels, using two labels doesn't do you any good, since you need to know the maximum breakable amount of text on the first line. Plus, if you're building something that will display a lot of items, two labels per item will overburden the GUI needlessly.
Set your NSTextField.cell like this:
[captionLabel.cell setLineBreakMode: NSLineBreakByCharWrapping];
Then find the code for "NS(Attributed)String+Geometrics" (Google it, it's out there). You must #import "NS(Attributed)String+Geometrics.h" to measure text. It adds measuring methods such as widthForHeight:font: (used below) to NSString and NSAttributedString through categories.
I include the following code to wrap text exactly how Finder does in its captions. Using one label below the icon it assumes that, like Finder, there will be two lines of caption.
First this is how you will call the following code in your code:
NSString *caption = self.textInput.stringValue;
CGFloat w = self.captionLabel.bounds.size.width;
NSString *wrappedCaption = [self wrappedCaptionText:self.captionLabel.font caption:caption width:w];
self.captionLabel.stringValue = wrappedCaption ? [self middleTruncatedCaption:wrappedCaption withFont:self.captionLabel.font width:w] : caption;
Now for the main code:
#define SINGLE_LINE_HEIGHT 21
/*
This is the way finder captions work -
1) see if the string needs wrapping at all
2) if so find the maximum amount that will fit on the first line of the caption
3) See if there is a (word)break character somewhere between the maximum that would fit on the first line and the beginning of the string
4) If there is a break character (working backwards) on the first line- insert a line break then return a string so that the truncation function can trunc the second line
*/
-(NSString *) wrappedCaptionText:(NSFont*) aFont caption:(NSString*)caption width:(CGFloat)captionWidth
{
NSString *wrappedCaption = nil;
//get the width for the text as if it was in a single line
CGFloat widthOfText = [caption widthForHeight:SINGLE_LINE_HEIGHT font:aFont];
//1) nothing to wrap
if ( widthOfText <= captionWidth )
return nil;
//2) find the maximum amount that fits on the first line
NSRange firstLineRange = [self getMaximumLengthOfFirstLineWithFont:aFont caption:caption width:captionWidth];
//3) find the first breakable character on the first line looking backwards
NSCharacterSet *notAlphaNums = [NSCharacterSet alphanumericCharacterSet].invertedSet;
NSCharacterSet *whites = [NSCharacterSet whitespaceAndNewlineCharacterSet];
NSRange range = [caption rangeOfCharacterFromSet:notAlphaNums options:NSBackwardsSearch range:firstLineRange];
NSUInteger splitPos;
if ( (range.length == 0) || (range.location < firstLineRange.length * 2 / 3) ) {
// no break found, or the break falls within the first two thirds of the line (too close to the start of the text)
splitPos = firstLineRange.length;
} else {
splitPos = range.location+range.length;
}
//4) put a line break at the logical end of the first line
wrappedCaption = [NSString stringWithFormat:@"%@\n%@",
[[caption substringToIndex:splitPos] stringByTrimmingCharactersInSet:whites],
[[caption substringFromIndex:splitPos] stringByTrimmingCharactersInSet:whites]];
return wrappedCaption;
}
/*
Binary search is great, but once we split the caption in half we usually don't have far to go.
It depends on the average length of the text you are trying to wrap; file names are not usually that long
compared to the captions that hold them...
*/
-(NSRange) getMaximumLengthOfFirstLineWithFont:(NSFont *)aFont caption:(NSString*)caption width:(CGFloat)captionWidth
{
BOOL fits = NO;
NSString *firstLine = nil;
NSRange range;
range.length = caption.length /2;
range.location = 0;
NSUInteger lastFailedLength = caption.length;
NSUInteger lastSuccessLength = 0;
int testCount = 0;
NSUInteger initialLength = range.length;
NSUInteger actualDistance = 0;
while (!fits) {
firstLine = [caption substringWithRange:range];
fits = [firstLine widthForHeight:SINGLE_LINE_HEIGHT font:aFont] < captionWidth;
testCount++;
if ( !fits ) {
lastFailedLength = range.length;
range.length-= (lastFailedLength - lastSuccessLength) == 1? 1 : (lastFailedLength - lastSuccessLength)/2;
continue;
} else {
if ( range.length == lastFailedLength -1 ) {
actualDistance = range.length - initialLength;
#ifdef DEBUG
NSLog(@"# of tests:%d actualDistance:%lu iteration better? %@", testCount, (unsigned long)actualDistance, testCount > actualDistance ? @"YES" : @"NO");
#endif
break;
} else {
lastSuccessLength = range.length;
range.length += (lastFailedLength-range.length) / 2;
fits = NO;
continue;
}
}
}
return range;
}
-(NSString *)middleTruncatedCaption:(NSString*)aCaption withFont:(NSFont*)aFont width:(CGFloat)captionWidth
{
NSArray *components = [aCaption componentsSeparatedByString:@"\n"];
NSString *secondLine = [components objectAtIndex:1];
NSString *newCaption = aCaption;
CGFloat widthOfText = [secondLine widthForHeight:SINGLE_LINE_HEIGHT font:aFont];
if ( widthOfText > captionWidth ) {
//ignore the fact that the length might be an odd/even number "..." will always truncate at least one character
int middleChar = ((int)secondLine.length-1) / 2;
NSString *newSecondLine = nil;
NSString *leftSide = secondLine;
NSString *rightSide = secondLine;
for (int i=1; i <= middleChar; i++) {
leftSide = [secondLine substringToIndex:middleChar-i];
rightSide = [secondLine substringFromIndex:middleChar+i];
newSecondLine = [NSString stringWithFormat:@"%@…%@", leftSide, rightSide];
widthOfText = [newSecondLine widthForHeight:SINGLE_LINE_HEIGHT font:aFont];
if ( widthOfText <= captionWidth ) {
newCaption = [NSString stringWithFormat:@"%@\n%@", [components objectAtIndex:0], newSecondLine];
break;
}
}
}
return newCaption;
}
Cheers!
PS Tested in prototype works great probably has bugs...find them
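The middle-truncation step can be illustrated in isolation with a small C sketch. It is only a simplification under stated assumptions: it measures in characters rather than rendered width (the code above measures with widthForHeight:font:), it assumes single-byte ASCII text, and it uses three dots instead of the "…" character. The name truncate_middle is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: shortens `src` to at most `max_chars` characters by
 * replacing the middle with "...", writing the result into `dst`.
 * Assumes single-byte (ASCII) text and counts characters, not pixels.
 * `max_chars` must be at least 3 and `dst_size` at least max_chars + 1. */
void truncate_middle(const char *src, size_t max_chars, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    assert(max_chars >= 3 && dst_size > max_chars);

    size_t len = strlen(src);
    if (len <= max_chars) {
        /* Already fits: copy unchanged, including the terminator. */
        memcpy(dst, src, len + 1);
        return;
    }

    size_t keep = max_chars - 3;   /* characters kept besides "..." */
    size_t head = (keep + 1) / 2;  /* the head gets the odd character */
    size_t tail = keep - head;

    /* Head of the string, the dots, then the last `tail` characters. */
    snprintf(dst, dst_size, "%.*s...%s", (int)head, src, src + len - tail);
}
```

With a 13-character budget, "A very long file name.txt" becomes "A ver...e.txt", which keeps the file extension visible in the same way Finder does.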
I suspect there are two labels there. The top one contains the first 20 characters of a file name, and the second contains any overflow, truncated.
The length of the first label is probably restricted based on the user's font settings.

NSTask Output Formatting

I'm using an NSTask to grab the output from /usr/bin/man. I'm getting the output but without formatting (bold, underline). Something that should appear like this:
Bold text with underline
(note the italic text is actually underlined, there's just no formatting for it here)
Instead gets returned like this:
BBoolldd text with _u_n_d_e_r_l_i_n_e
I have a minimal test project at http://cl.ly/052u2z2i2R280T3r1K3c that you can download and run; note the window does nothing; the output gets logged to the Console.
I presume I need to somehow interpret the NSData object manually but I have no idea where to start on that. I'd ideally like to translate it to an NSAttributedString but the first order of business is actually eliminating the duplicates and underscores. Any thoughts?
What is your actual purpose? If you want to show a man page, one option is to convert it to HTML and render it with a Web view.
Parsing man’s output can be tricky because it is processed by groff using a terminal processor by default. This means that the output is tailored to be shown on terminal devices.
One alternative solution is to determine the actual location of the man page source file, e.g.
$ man -w bash
/usr/share/man/man1/bash.1.gz
and manually invoke groff on it with -a (ASCII approximation) and -c (disable colour output), e.g.
$ gunzip -c /usr/share/man/man1/bash.1.gz | groff -c -a -Tascii -man
This will result in an ASCII file without most of the formatting. To generate HTML output,
$ gunzip -c /usr/share/man/man1/bash.1.gz | groff -Thtml -man
You can also specify these options in a custom configuration file for man, e.g. parseman.conf, and tell man to use that configuration file with the -C option instead of invoking man -w, gunzip, and groff. The default configuration file is /private/etc/man.conf.
Also, you can probably tailor the output of the terminal device processor by passing appropriate options to grotty.
Okay, here's the start of my solution, though I would be interested in any additional (easier?) ways to do this.
The output is UTF-8, but decoding it with NSUTF8StringEncoding doesn't give you the text you see in Terminal. The reason is the way bold and underline are encoded in the byte stream that NSTask hands back.
The letter N is 0x4e in UTF-8. But the NSData corresponding to a bold N is 0x4e 0x08 0x4e. 0x08 corresponds to a Backspace. So a bold letter is sent as letter-backspace-letter.
An underlined c (0x63 in UTF-8) arrives as 0x5f 0x08 0x63, with 0x5f corresponding to an underscore. So an underlined letter is sent as underscore-backspace-letter.
I really don't see any way around this at this point besides just scanning the raw NSData for these sequences. I'll probably post the source to my parser here once I finish it, unless anybody has any existing code. As the common programming phrase goes, never write yourself what you can copy. :)
Follow-Up:
I've got a good, fast parser together for taking man output and replacing the bold/underlined output with bold/underlined formatting in an NSMutableAttributedString. Here's the code if anybody else needs to solve the same problem:
NSMutableIndexSet *boldChars = [[NSMutableIndexSet alloc] init];
NSMutableIndexSet *underlineChars = [[NSMutableIndexSet alloc] init];
// A single backspace byte to search for; no heap allocation is needed
const char backspace = 0x08;
NSData *bData = [NSData dataWithBytes:&backspace length:1];
NSRange testRange = NSMakeRange(1, [inputData length] - 1);
NSRange bRange = NSMakeRange(0, 0);
do {
bRange = [inputData rangeOfData:bData options:0 range:testRange];
if (bRange.location == NSNotFound || bRange.location > [inputData length] - 2) break;
const char * buff = [inputData bytes];
if (buff[bRange.location - 1] == 0x5f) {
// it's an underline
//NSLog(@"Undr %c\n", buff[bRange.location + 1]);
[inputData replaceBytesInRange:NSMakeRange(bRange.location - 1, 2) withBytes:NULL length:0];
[underlineChars addIndex:bRange.location - 1];
testRange = NSMakeRange(bRange.location, [inputData length] - (bRange.location));
} else if (buff[bRange.location - 1] == buff[bRange.location + 1]) {
// It's a bold
//NSLog(@"Bold %c\n", buff[bRange.location + 1]);
[inputData replaceBytesInRange:NSMakeRange(bRange.location - 1, 2) withBytes:NULL length:0];
[boldChars addIndex:bRange.location - 1];
testRange = NSMakeRange(bRange.location, [inputData length] - (bRange.location));
} else {
testRange.location = bRange.location + 1;
testRange.length = [inputData length] - testRange.location;
}
} while (testRange.location <= [inputData length] - 3);
NSMutableAttributedString *str = [[NSMutableAttributedString alloc] initWithString:[[NSString alloc] initWithData:inputData encoding:NSUTF8StringEncoding]];
NSFont *font = [NSFont fontWithDescriptor:[NSFontDescriptor fontDescriptorWithName:@"Menlo" size:12] size:12];
NSFont *boldFont = [[NSFontManager sharedFontManager] convertFont:font toHaveTrait:NSBoldFontMask];
[str addAttribute:NSFontAttributeName value:font range:NSMakeRange(0, [str length])];
__block NSUInteger begin = [underlineChars firstIndex];
__block NSUInteger end = begin;
[underlineChars enumerateIndexesUsingBlock:^(NSUInteger idx, BOOL *stop) {
if (idx - end < 2) {
// it's the next item to the previous one
end = idx;
} else {
// it's a split, so drop in the accumulated range and reset
[str addAttribute:NSUnderlineStyleAttributeName value:[NSNumber numberWithInt:NSSingleUnderlineStyle] range:NSMakeRange(begin, (end-begin)+1)];
begin = idx;
end = begin;
}
if (idx == [underlineChars lastIndex]) {
[str addAttribute:NSUnderlineStyleAttributeName value:[NSNumber numberWithInt:NSSingleUnderlineStyle] range:NSMakeRange(begin, (end-begin)+1)];
}
}];
begin = [boldChars firstIndex];
end = begin;
[boldChars enumerateIndexesUsingBlock:^(NSUInteger idx, BOOL *stop) {
if (idx - end < 2) {
// it's the next item to the previous one
end = idx;
} else {
// it's a split, so drop in the accumulated range and reset
[str addAttribute:NSFontAttributeName value:boldFont range:NSMakeRange(begin, (end-begin)+1)];
begin = idx;
end = begin;
}
if (idx == [boldChars lastIndex]) {
[str addAttribute:NSFontAttributeName value:boldFont range:NSMakeRange(begin, (end-begin)+1)];
}
}];
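The decoding rule itself (letter-backspace-letter is bold, underscore-backspace-letter is underlined) can also be applied in one forward pass without editing the buffer in place. The following C sketch is only an illustration under assumptions: the names decode_overstrike and STYLE_* are hypothetical, it treats every character as a single byte, and it handles only the two sequences described above. Like the code above, it checks for the underscore first.

```c
#include <assert.h>
#include <stddef.h>

enum { STYLE_PLAIN = 0, STYLE_BOLD = 1, STYLE_UNDERLINE = 2 };

/* Hypothetical helper: decodes overstruck text in `in` (in_len bytes).
 * Writes the visible bytes to `out` and one style flag per visible byte to
 * `styles`; both buffers must hold at least in_len entries.
 * Returns the number of visible bytes. Assumes single-byte characters. */
size_t decode_overstrike(const unsigned char *in, size_t in_len,
                         unsigned char *out, unsigned char *styles)
{
    assert(in != NULL && out != NULL && styles != NULL);

    size_t n = 0; /* next write position in out/styles */
    size_t i = 0; /* next read position in in */
    while (i < in_len) {
        if (i + 2 < in_len && in[i + 1] == 0x08) {
            if (in[i] == '_') {
                /* underscore, backspace, letter: an underlined letter */
                out[n] = in[i + 2];
                styles[n] = STYLE_UNDERLINE;
                n++;
                i += 3;
                continue;
            }
            if (in[i] == in[i + 2]) {
                /* letter, backspace, same letter: a bold letter */
                out[n] = in[i];
                styles[n] = STYLE_BOLD;
                n++;
                i += 3;
                continue;
            }
        }
        /* Anything else is copied through as plain text. */
        out[n] = in[i];
        styles[n] = STYLE_PLAIN;
        n++;
        i++;
    }
    return n;
}
```

Runs of equal flags in styles map directly onto the attribute ranges added to the NSMutableAttributedString, so the index-set bookkeeping becomes a single scan over that array.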
Another method would be to convert the man page to PostScript source code, run that through the PostScript-to-PDF converter, and put that into a PDFView.
The implementation would be similar to Bavarious's answer, just with different arguments to groff (-Tps instead of -Thtml).
This would be the slowest solution, but also probably the best for printing.

Cocoa - Trim all leading whitespace from NSString

(have searched, but not been able to find a simple solution to this one either here, or in Cocoa docs)
Q. How can I trim all leading whitespace only from an NSString? (i.e. leaving any other whitespace intact.)
Unfortunately, for my purposes, NSString's stringByTrimmingCharactersInSet method works on both leading and trailing.
Mac OS X 10.4 compatibility needed, manual GC.
This creates an NSString category to do what you need. With this, you can call NSString *newString = [mystring stringByTrimmingLeadingWhitespace]; to get a copy minus leading whitespace. (Code is untested, may require some minor debugging.)
@interface NSString (trimLeadingWhitespace)
- (NSString *)stringByTrimmingLeadingWhitespace;
@end
@implementation NSString (trimLeadingWhitespace)
- (NSString *)stringByTrimmingLeadingWhitespace {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceCharacterSet];
    NSUInteger length = [self length];
    NSUInteger i = 0;
    // Advance past every leading whitespace character
    while (i < length && [whitespace characterIsMember:[self characterAtIndex:i]]) {
        i++;
    }
    return [self substringFromIndex:i];
}
@end
This is another solution using regular expressions (requires iOS 3.2 or Mac OS X 10.7, so it will not work on 10.4):
NSRange range = [string rangeOfString:@"^\\s+" options:NSRegularExpressionSearch];
NSString *result = (range.location == NSNotFound) ? string : [string stringByReplacingCharactersInRange:range withString:@""];
And if you want to trim only the trailing whitespace you can use @"\\s+$" instead.
This code strips blanks, though note that it trims both ends of the string, not just the leading whitespace:
NSString *trimmedText = [strResult stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSLog(@"%@", trimmedText);
Here is a very efficient (uses CoreFoundation) way of doing it (Taken from kissxml):
- (NSString *)trimWhitespace {
NSMutableString *mStr = [self mutableCopy];
CFStringTrimWhitespace((CFMutableStringRef)mStr);
NSString *result = [mStr copy];
[mStr release];
return [result autorelease];
}
NSString *myText = @" foo ";
NSString *trimmedText = [myText stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
NSLog(@"old = [%@], trimmed = [%@]", myText, trimmedText);
Here's what I would do, and it doesn't involve categories!
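// Find the first character that is NOT whitespace; everything before it is leading whitespace
NSRange range = [inputString rangeOfCharacterFromSet:[[NSCharacterSet whitespaceCharacterSet] invertedSet]];
// If the string is all whitespace, the result is empty
NSString *outputString = (range.location == NSNotFound) ? @"" : [inputString substringFromIndex:range.location];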
This is much less code.
I didn't really have much time to test this, and I'm not sure if 10.4 contains the UTF8String method for NSString, but here's how I'd do it:
NSString+Trimming.h
#import <Foundation/Foundation.h>
@interface NSString (Trimming)
- (NSString *)stringByTrimmingWhitespaceFromFront;
@end
NSString+Trimming.m
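#import "NSString+Trimming.h"
#include <ctype.h>
@implementation NSString (Trimming)
- (NSString *)stringByTrimmingWhitespaceFromFront
{
    const char *cStringValue = [self UTF8String];
    unsigned int i;
    // Cast to unsigned char: passing a negative char (a UTF-8 byte >= 0x80) to isspace() is undefined.
    // In the default C locale isspace() only matches ASCII whitespace, so the byte offset equals the character index.
    for (i = 0; cStringValue[i] != '\0' && isspace((unsigned char)cStringValue[i]); i++);
    return [self substringFromIndex:i];
}
@end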
It may not be the most efficient way of doing this but it should work.
str = [str stringByReplacingOccurrencesOfString:@" " withString:@""];

cocoa: NSString not removing all the characters

I have an int and for some reason it isn't working after 16 or so. Here's my code:
NSArray *sortedArray;
sortedArray = [doesntContainAnother sortedArrayUsingFunction:firstNumSort context:NULL];
int count2 = [sortedArray count];
//NSLog(@"%d", count2);
int z = 0;
while (z < count2) {
NSString *myString = [sortedArray objectAtIndex:z];
NSString *intstring = [NSString stringWithFormat:@"%d", z];
NSString *stringWithoutSpaces;
stringWithoutSpaces = [[myString stringByReplacingOccurrencesOfString:intstring
withString:@""] mutableCopy];
[hopefulfinal addObject:stringWithoutSpaces];
NSLog(@"%@", [hopefulfinal objectAtIndex:z]);
z++;
}
Edit: It's not the int, it's the stringWithoutSpaces line... I can't figure out what's causing it.
So it (the NSLog, see above the z++) looks like this:
"Here"
"whatever"
"17 whatevere"
"18 this"
etc.
I'm guessing this is related to your earlier question Sort NSArray’s by an int contained in the array, and that you're trying to strip the leading number and whitespace from an array that looks like the one you had in that question:
"0 Here is an object"
"1 What the heck, here's another!"
"2 Let's put 2 here too!"
"3 Let's put this one right here"
"4 Here's another object"
Without knowing the full input, I'd guess that your code is likely failing because the leading numbers and the value of z are getting out of sync. Since you don't seem to actually care what the leading number is and just want to vamoose it, I'd recommend a different approach that scans for leading digits and extracts the substring from the position where those digits end:
NSArray *array = [NSArray arrayWithObjects:@"1 One",
@"2 Two",
@"5 Five",
@"17 Seventeen",
nil];
NSMutableArray *results = [NSMutableArray array];
NSScanner *scanner;
NSCharacterSet *whitespace = [NSCharacterSet whitespaceCharacterSet];
for (NSString *item in array) {
scanner = [NSScanner scannerWithString:item];
[scanner scanInteger:NULL]; // throwing away the BOOL return value...
// if string does not start with a number,
// the scanLocation will be 0, which is good.
[results addObject:[[item substringFromIndex:[scanner scanLocation]]
stringByTrimmingCharactersInSet:whitespace]];
}
NSLog(@"Resulting array is: %@", results);
// Resulting array is: (
// One,
// Two,
// Five,
// Seventeen
// )

Best way to escape characters like newline and double-quote in NSString

Say I have an NSString (or NSMutableString) containing:
I said "Hello, world!".
He said "My name's not World."
What's the best way to turn that into:
I said \"Hello, world!\".\nHe said \"My name\'s not World.\"
Do I have to manually use -replaceOccurrencesOfString:withString: over and over to escape characters, or is there an easier way? These strings may contain characters from other alphabets/languages.
How is this done in other languages with other string classes?
stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding
I don't think there is any built-in method to "escape" a particular set of characters.
If the set of characters you wish to escape is well-defined, I'd probably stick with the simple solution you proposed, replacing the instances of the characters crudely.
Be warned that if your source string already has escaped characters in it, then you'll probably want to avoid "double-escaping" them. One way of achieving this would be to go through and "unescape" any escaped character strings in the string before then escaping them all again.
If you need to support a variable set of escaped characters, take a look at the NSScanner methods "scanUpToCharactersFromSet:intoString:" and "scanCharactersFromSet:intoString:". You could use these methods on NSScanner to cruise through a string, copying the parts from the "scanUpTo" section into a mutable string unchanged, and copying the parts from a particular character set only after escaping them.
This will escape double quotes in NSString:
NSString *escaped = [originalString stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
So you need to be careful and also escape the escape character...
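One way to avoid the ordering problem altogether is to walk the string once and decide per character, so an inserted backslash is never looked at again. Here is a minimal C sketch of that approach; escape_string is a hypothetical name, the set of escaped characters (backslash, both quote characters, newline) is taken from the example in the question, and bytes of multi-byte UTF-8 sequences are assumed to pass through unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of `src` in which
 * backslash, double quote and single quote are prefixed with a backslash
 * and a newline is written as the two characters '\' and 'n'.
 * Bytes >= 0x80 (UTF-8 sequences) are copied through unchanged.
 * The caller frees the result; returns NULL if allocation fails. */
char *escape_string(const char *src)
{
    assert(src != NULL);

    /* Worst case: every byte becomes two bytes, plus the terminator. */
    char *dst = malloc(strlen(src) * 2 + 1);
    if (dst == NULL) {
        return NULL;
    }

    char *out = dst;
    for (const char *p = src; *p != '\0'; p++) {
        switch (*p) {
        case '\\':
        case '"':
        case '\'':
            *out++ = '\\';
            *out++ = *p;
            break;
        case '\n':
            *out++ = '\\';
            *out++ = 'n';
            break;
        default:
            *out++ = *p;
            break;
        }
    }
    *out = '\0';
    return dst;
}
```

With repeated replaceOccurrencesOfString:withString: calls, the backslash has to be escaped first or the backslashes added for the quotes get doubled; the single pass has no such ordering constraint.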
I think in cases like these, it's useful to operate on a character at a time, either in UniChars or UTF8 bytes. If you're using UTF-8, then vis(3) will do most of the work for you (see below). Can I ask why you want to escape a single-quote within a double-quoted string? How are you planning to handle multi-byte characters? In the example below, I'm using UTF-8, encoding 8-bit characters using C-Style octal escapes. This can also be undone by unvis(3).
#import <Foundation/Foundation.h>
#import <vis.h>
@interface NSString (Escaping)
- (NSString *)stringByEscapingMetacharacters;
@end
@implementation NSString (Escaping)
- (NSString *)stringByEscapingMetacharacters
{
const char *UTF8Input = [self UTF8String];
char *UTF8Output = [[NSMutableData dataWithLength:strlen(UTF8Input) * 4 + 1 /* Worst case */] mutableBytes];
char ch, *och = UTF8Output;
while ((ch = *UTF8Input++))
if (ch == '\'' || ch == '\\' || ch == '"')
{
*och++ = '\\';
*och++ = ch;
}
else if (isascii(ch))
och = vis(och, ch, VIS_NL | VIS_TAB | VIS_CSTYLE, *UTF8Input);
else
och+= sprintf(och, "\\%03hho", ch);
return [NSString stringWithUTF8String:UTF8Output];
}
@end
int
main(int argc, const char *argv[])
{
NSAutoreleasePool *pool = [NSAutoreleasePool new];
NSLog(@"%@", [@"I said \"Hello, world!\".\nHe said \"My name's not World.\"" stringByEscapingMetacharacters]);
[pool drain];
return 0;
}
This is a snippet I have used in the past that works quite well:
- (NSString *)escapeString:(NSString *)aString
{
NSMutableString *returnString = [[NSMutableString alloc] init];
for(int i = 0; i < [aString length]; i++) {
unichar c = [aString characterAtIndex:i];
// if char needs to be escaped
if((('\\' == c) || ('\'' == c)) || ('"' == c)) {
[returnString appendFormat:@"\\%C", c]; // %C formats a unichar; %c would mangle non-ASCII characters
} else {
[returnString appendFormat:@"%C", c];
}
}
return [returnString autorelease];
}
Do this:
NSString * encodedString = (NSString *)CFURLCreateStringByAddingPercentEscapes(
NULL,
(CFStringRef)unencodedString,
NULL,
(CFStringRef)@"!*'();:@&=+$,/?%#[]",
kCFStringEncodingUTF8 );
Reference: http://simonwoodside.com/weblog/2009/4/22/how_to_really_url_encode/
You might even want to look into using a regex library (there are a lot of options available, RegexKit is a popular choice). It shouldn't be too hard to find a pre-written regex to escape strings that handles special cases like existing escaped characters.
