What is the benefit of NSScanner's charactersToBeSkipped? - cocoa

I have the string @"    ILL WILL KILLS ", and I'm using NSScanner's scanUpToString:intoString: to find every occurrence of "ILL". If it's accurate, it will NSLog 4, 9, and 14.
My string begins with 4 spaces, which I realize are members of the NSScanner's default charactersToBeSkipped NSCharacterSet. If I set charactersToBeSkipped to nil, as in the example below, then this code accurately finds the 3 occurrences of "ILL".
NSScanner* scanner = [NSScanner scannerWithString:@"    ILL WILL KILLS "] ;
scanner.charactersToBeSkipped = nil ;
NSString* scannedCharacters ;
while ( TRUE ) {
    BOOL didScanUnignoredCharacters = [scanner scanUpToString:@"ILL" intoString:&scannedCharacters] ;
    if ( scanner.isAtEnd ) {
        break ;
    }
    NSLog(@"Found match at index: %tu", scanner.scanLocation) ;
    // Since stopString "ILL" is 3 characters long, advance scanLocation by 3 to find the next "ILL".
    scanner.scanLocation += 3 ;
}
However, if I don't nullify the default charactersToBeSkipped, here's what happens:
scanner is initialized with scanLocation == 0.
scanUpToString executes for the 1st time. It "looks past" 4 empty spaces and "sees" ILL at index 4, so it immediately stops. scanLocation is still 0.
I believe that I found a match, and I increment scanLocation by 3.
scanUpToString executes for the 2nd time. It "looks past" 1 empty space and "sees" ILL at index 4, so it immediately stops. scanLocation is still 3.
To me, it's a design flaw that scanner stopped at scanLocation == 0 the first time, since I expected it to stop at scanLocation == 4. If you believe that the above code can be rewritten to accurately NSLog 4, 9, and 14 without setting charactersToBeSkipped to nil, then please, show me how. For now, my opinion is that charactersToBeSkipped exists solely to make NSScanners more difficult to use.

For now, my opinion is that charactersToBeSkipped exists solely to make NSScanners more difficult to use.
Then you aren't very imaginative. The "benefit" of charactersToBeSkipped is to… wait for it… skip characters. For example, if you have a string like @" 8 9 10 ", you can scan those three integers using -scanInt: three times. You don't have to care about the precise amount of whitespace that separates them.
Given the task you describe, where you're just looking for instances of a string within a string, NSScanner is probably not the right tool. You probably want to use -[NSString rangeOfString:options:range:].
The docs for -scanUpToString:intoString: are fairly clear. If stopString is the first string in the receiver (taking into account that charactersToBeSkipped will be skipped), then the method returns NO, meaning it didn't scan anything. Consequently, the scan location won't be changed.
The return value indicates success or failure. If the stop string is next (ignoring characters to be skipped), then there's nothing to scan "up to" the stop string; the scanner is already at the stop string, so the method fails.


Slice indexing weird edge case

In Go, slice indices use a half-open range. For a[low:high], the slice includes a[low] but excludes a[high]. Ref: https://tour.golang.org/moretypes/7. Another way to think of it is that it goes from a[low] to a[high-1] inclusive.
But take a look at this code. You can run it at: https://play.golang.com/p/DSLs7V8gom0
package main

import "fmt"

func main() {
	t := []int{5}
	fmt.Println(t)
	fmt.Println(t[0:0])
	fmt.Println(t[1:1])
}
If you run the code, you can see that both t[0:0] and t[1:1] are valid and allowed.
t[0:0] means include t[0] (which exists) but stop before t[0] (which is t[-1], which does not exist).
t[1:1] means include t[1] (which does not exist) but stop before t[1] (which is t[0], which exists).
In both cases one of the indices refers to something that doesn't exist, i.e. index out of range ==> yet syntactically perfect and allowed in Golang!!
Yet if you try any other out-of-range index, such as t[2], Go will expectedly complain loudly.
You need to think of a range as starting at the lower value and stopping when you reach the upper value. The upper can never be less than the lower, but both can be any valid index or the one past the end, i.e. 0 or 1 in your example.
So t[0:0] means start then stop immediately giving a zero-length range. Eg:
t[0:0] and t[1:1] are valid zero length slices
t[0:1] is a slice of length 1
t[2:2] and t[1:0] are invalid
I explain in detail about the advantages of half-open ranges and not making zero a special case in my blog - eg see http://devmethodologies.blogspot.com/2012/12/asymmetric-bounds-and-zero-based.html

Vb6 Case a to b in Select Case(Switch) Enum

I saw this kind of code in vb6.
Private Enum enmMain
    STEP_INIT = 1
    STEP_RUN = 2
    STEP_SLEEP = 3
    STEP_SUSPEND = 4
    STEP_ERROR = 5
End Enum
Private mStep As enmMain
Select Case mStep
    Case Is <= enmMain.STEP_RUN
        'Do something
    Case enmMain.STEP_RUN To enmMain.STEP_ERROR
        'Do something
End Select
I don't understand this:
Case enmMain.STEP_RUN To enmMain.STEP_ERROR
If it goes into that case when it meets this condition:
the latest value is STEP_RUN
current value is STEP_ERROR
How does it work?
I am posting from mobile, so I can't format this cleanly.
It means that the case statement will be satisfied by all values of mStep that are between 2 and 5, inclusive.
So there is an imprecision in the code, because the value STEP_RUN appears in an inclusive test twice (see the <= operator). Which behaviour is intended for STEP_RUN, the first or the second? You need to figure it out by understanding the program's logic.
Well, let's read the manual:
If testexpression matches any Case expressionlist expression, the statements following that Case clause are executed up to the next Case clause, or, for the last clause, up to End Select. Control then passes to the statement following End Select. If testexpression matches an expressionlist expression in more than one Case clause, only the statements following the first match are executed.
Select Case will run the first block that matches, and the criteria you can use to match are much more flexible than those allowed in many other languages. Case Is <= enmMain.STEP_RUN will run for any value of mStep that is less than or equal to 2, and Case enmMain.STEP_RUN To enmMain.STEP_ERROR would run for any value between 2 and 5 inclusive.
Now it seems like somebody didn't quite understand what that meant, though, or at least wrote it in a confusing way, because for a value of 2 only the first Case would run, since, as the section I quoted says, only the first match is executed.
So the end result is that the first 'Do something will run on values of 2 or less, and the second 'Do something will run on values of 3, 4, or 5.
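The first-match rule can be illustrated with an analogous sketch in Go, whose expression-less switch also runs only the first case that matches. This is an analogy, not VB6; the names step, stepInit through stepError, and branch are hypothetical stand-ins for the enum and the two Case clauses.

```go
package main

import "fmt"

type step int

// Mirrors the values of enmMain: 1 through 5.
const (
	stepInit step = iota + 1
	stepRun
	stepSleep
	stepSuspend
	stepError
)

// branch reports which clause handles s. The cases are tested in
// order and only the first match runs, so 2 never reaches "second".
func branch(s step) string {
	switch {
	case s <= stepRun: // like: Case Is <= STEP_RUN
		return "first"
	case s >= stepRun && s <= stepError: // like: Case STEP_RUN To STEP_ERROR
		return "second"
	default:
		return "none"
	}
}

func main() {
	for s := stepInit; s <= stepError; s++ {
		fmt.Println(int(s), branch(s))
	}
}
```

Running it pairs 1 and 2 with "first" and 3, 4, and 5 with "second", matching the conclusion above.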

Playground shows strange count of execution

let intArray = [5]
intArray.allSatisfy{$0<0} //false, of course, but why 2 times?
There are no more operators, as far as I can see...
It's not telling you how many times the block was executed, but rather how many outputs were called on that line. Since the block returns false and the function allSatisfy returns false, that's 2 outputs in 1 line. You'll notice that no matter the size of the array you get the same value, and if you expand the code, i.e.
intArray.allSatisfy {
$0 < 0
}
you don't see 2 times.
Playgrounds don't seem to give an execution counter for closures/functions passed as parameters, only for regular for-loops.

Determine if the number of open brackets "(" equal the close brackets ")"

Given a string in the following format:
"(1 AND (2 OR 3) AND 4)"
What is the fastest way to determine if the number of "open" brackets "(" equals the number of "close" brackets ")"?
NOTE: The string could be several hundred characters long.
Just use a simple counter that starts with 0.
When you encounter "(", increase it by one. When you encounter ")", decrease by one.
If the counter isn't 0 at the end, you've got a mismatch.
Also, as others have mentioned, if the counter ever becomes negative, this means a situation such as )( has occurred. Signal an error and stop further parsing.
1. Initialise counter to zero.
2. Iterate through the characters of the string.
   a. On an opening parenthesis, increase the counter.
   b. On a closing parenthesis, decrease the counter.
   c. Error out if the counter is negative.
3. Error out if the counter is not equal to zero after the loop.
This also catches cases like )(, which do have a matching number of opening and closing parens, but should probably be considered erroneous anyway.
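The counter steps above can be sketched in Go as follows. The function name balanced is a hypothetical choice, and the sketch treats every character other than the two parentheses as irrelevant.

```go
package main

import "fmt"

// balanced reports whether every ')' in s closes an earlier '('
// and no '(' is left open at the end of the string.
func balanced(s string) bool {
	depth := 0
	for _, r := range s {
		switch r {
		case '(':
			depth++
		case ')':
			depth--
			if depth < 0 {
				return false // a ')' appeared before its '('
			}
		}
	}
	return depth == 0
}

func main() {
	fmt.Println(balanced("(1 AND (2 OR 3) AND 4)")) // true
	fmt.Println(balanced(")("))                     // false
	fmt.Println(balanced("(1 AND (2 OR 3)"))        // false
}
```

It is a single pass over the string, so the cost stays linear in the input length.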
If you're trying to check that the number of ( matches the number of ), just run through the string once maintaining a counter, incrementing if you see a ( and decrementing if you see a ). This is O(n) and you can't do better; you have to inspect every character.
However, I suspect you meant to ask a different question. Namely, how do you tell if the ( balance with the ). In this case, you maintain a stack and you push whenever you see a ( and you pop when you see a ). If ever you try to pop when the stack is empty, the parentheses are not balanced. If the stack is not empty when you reach the end of the input string, the parentheses are not balanced.
Of course, you can just mimic this with a counter, but it's more natural to think about from the perspective of a stack.
It's O(n). There is no way around it. Here is a rough idea.
For i=0 to string.length
    if string[i] == ')'
        add to rightBracketCount
    else if string[i] == '('
        add to leftBracketCount
end for
compare rightBracketCount to leftBracketCount
:)
You can
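String s = "your string";
// split takes a regex, so the parentheses must be escaped; limit -1 keeps trailing empty pieces
int left = s.split("\\(", -1).length - 1;
int right = s.split("\\)", -1).length - 1;
boolean ok = (left == right);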
Of course this is stupid, but it is just another way

How do I use CGEventKeyboardSetUnicodeString with multiple characters?

I'm trying to use event taps to create an OS X program that will listen for Yiddish typed in transliteration and post the result in Hebrew characters. I made a very short program to test one of the things I'd have to do: http://pastie.org/791398
As is, the program successfully replaces every typed 'q' with 'w':
if(inputString[0] == 'q') { inputString[0] = 'w'; }
But how does one post a string of more than one character? For instance, if someone types 'sh' you'd presumably have to post a backspace (to delete the character that was posted for 's' alone) and then post the character that corresponds to 'sh'. However, this code results in only a backspace being posted:
else if(inputString[0] == 'm') { inputString[0] = '\b'; inputString[1] = 'n'; }
I apologize if these are basic questions; I have read all the documentation I could find, but I might not have understood it all. It's also possible that I'm going about this entirely the wrong way.
Ideally you should be using an input method instead of a program with event taps, most likely using Input Method Kit if you don't need to support pre-10.5. Using event taps for this purpose is inherently a bad idea because the user can change where he/she is typing with the mouse as well as the keyboard. So if the user typed an "s" in one text field followed by an "h" in another, you wouldn't be able to tell the difference.
That said, here's a direct answer to your question.
The string is length-counted, so you can't just provide the incoming length (1); the second character will be ignored. However, most applications also don't like to get more than a single character per event, so they'll just discard the remaining characters. (Terminal is a notable exception.)
So what you can do is simply post a second event with the second character in it.
else if(inputString[0] == 'm') {
    inputString[0] = 'n';
    CGEventKeyboardSetUnicodeString(event, 1, inputString);
    CGEventPost(kCGSessionEventTap, event);
    inputString[0] = '\b';
}
In the general case (simulating > 2 keypresses) you'll need to create an event for each character you want to insert. This mailing list post includes a simple example.
This is how I send a string to the first responder (foreground application):
// 1 - Get the string length in UTF-16 code units.
NSUInteger l = string.length;
// 2 - Copy the UTF-16 code units into a buffer
UniChar *uc = malloc(l * sizeof(UniChar));
[string getCharacters:uc range:NSMakeRange(0, l)];
// 3 - create an empty tap event, and set unicode string
CGEventRef tap = CGEventCreateKeyboardEvent(NULL,0, YES);
CGEventKeyboardSetUnicodeString(tap, string.length, uc);
// 4 - Send event and tear down
CGEventPost(kCGSessionEventTap, tap);
CFRelease(tap);
free(uc);
