Proper validation with [NSScanner scanInteger:] in Cocoa

I am converting a string to a signed integer via -[NSScanner scanInteger:], but it seems to accept values such as '123abc' as '123' instead of reporting an error on invalid input.
I can do my own custom validation, but would prefer to find an API which will do the conversion and fail on '123abc'.
By the way, 'abc123' does fail with scanInteger, which is good.

I don't think that using a scanner is the way to do this -- you could, but there are easier ways. I would use the NSString method rangeOfCharacterFromSet: to check for non-digit characters.
NSCharacterSet *notDigits = [[NSCharacterSet decimalDigitCharacterSet] invertedSet];
NSUInteger nonDigits = [enteredString rangeOfCharacterFromSet:notDigits].length;
If nonDigits is not zero, then the user has entered something other than a digit. If you want to allow a leading minus sign or a decimal point, then you would have to create your own set that contains everything other than digits and those extra characters.
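If you do want one call that converts and also fails on '123abc', plain C's strtol (usable as-is from Objective-C) reports where parsing stopped. Below is a minimal sketch; parse_strict_long is a hypothetical helper name, not an existing API. The same idea should carry over to NSScanner by checking isAtEnd after scanInteger: succeeds.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse `text` as a signed decimal integer and reject
   any trailing characters. Returns true and stores the value on success.
   Note that strtol itself skips leading whitespace, so " 42" is accepted. */
static bool parse_strict_long(const char *text, long *out)
{
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text)        /* nothing was consumed: "abc123" or "" */
        return false;
    if (*end != '\0')       /* trailing junk after the number: "123abc" */
        return false;
    if (errno == ERANGE)    /* value does not fit in a long */
        return false;

    *out = value;
    return true;
}
```

With an NSString you could pass [enteredString UTF8String] as the text argument.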

Related

padding the middle of a string

I have a TextBox field that takes in a string with a character limit of 10. I would like to implement a shorthand version because there are a lot of zeros in the string that have to be entered. An example of the string is T000028999, but I'd like to key in T28999 and have the zeros padded between the "T" and the "28999" so that it shows up as T000028999 in the TextBox field.
Is this even possible?
I've tried searching for examples on Google and have only found ways to pad the beginning and end of the string.
You want to keep the first character, so you can use oldString.Chars(0) to get that.
You want the remainder of the string: oldString.Substring(1), and you can pad it to the width you require with a character of your choice with PadLeft, like this:
Dim newString = oldString.Chars(0) & oldString.Substring(1).PadLeft(9, "0"c)
It would be a good idea to check that oldString is at least 1 character long before doing that; otherwise .Chars(0) will throw an error.
Alternatively you could insert a new string of the required quantity of "0"s:
Dim newString = oldString.Insert(1, New String("0"c, 10 - oldString.Length))
A good place to perform the formatting would be in the control's Validating event handler. (The TextChanged event handler would not be a good place because it would interfere with the user's typing.)
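For comparison, here is the same "keep the first character, zero-fill the rest" idea as a small C sketch. pad_middle is a hypothetical helper name, and the fixed total width is an assumption taken from the 10-character limit in the question.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: keep the first character of `in`, then insert '0'
   characters after it so that the result is exactly `width` characters long.
   `out` must have room for width + 1 bytes. Returns false when `in` is
   empty or already longer than `width`. */
static bool pad_middle(const char *in, size_t width, char *out)
{
    size_t len = strlen(in);
    if (len == 0 || len > width)
        return false;

    size_t zeros = width - len;                 /* how many '0' to insert */
    out[0] = in[0];                             /* the leading letter */
    memset(out + 1, '0', zeros);                /* the padding */
    memcpy(out + 1 + zeros, in + 1, len - 1);   /* the remaining digits */
    out[width] = '\0';
    return true;
}
```

Calling pad_middle("T28999", 10, buffer) fills buffer with T000028999.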
Refs:
String.Chars[Int32] Property
String.Substring Method
String.PadLeft Method
String.Insert(Int32, String) Method
String Constructors

Cocoa: Extracting "A" from "Æ"

I have a bunch of NSStrings from which I would like to grab the first character and match it against the range A-Z, with # as a catch-all for anything that doesn't fit.
Different graphemes (I believe that's the correct word after some wiki'ing) have been giving me trouble. For example, I would like to extract A from "Æ".
I have taken a look at CFStringTransform, normalizing and folding, but none of them had the desired effect.
Is there a reliable way of doing this? All the strings I'm working with are UTF-8, if that makes a difference.
Æ cannot be broken down into components. It is not a compound glyph of A+E, but a separate glyph. Compound glyphs are things like a+` (à).
The thing about "Æ" is that it is a single character in itself (U+00C6, which is outside the ASCII range), not a combination of two different characters, so you can't extract the A from it.
Edit:
You could, however, check whether the string equals "Æ" and, if it does, replace it with "A", or convert it to its decimal value (198) and subtract 133, which gives 65, the value of "A".
Did you want to get rid of all æ?
This should work if you do.
NSString *string = @"Æaæbcdef";
string = [string stringByReplacingOccurrencesOfString:@"æ" withString:@"a"];
string = [string stringByReplacingOccurrencesOfString:@"Æ" withString:@"A"];
Edit
Rereading, you only seem to want the first character:
NSString *string = @"Æaæbcdef";
NSString *firstChar = [string substringToIndex:1];
firstChar = [firstChar stringByReplacingOccurrencesOfString:@"æ" withString:@"a"];
firstChar = [firstChar stringByReplacingOccurrencesOfString:@"Æ" withString:@"A"];
NSString *finalString = [NSString stringWithFormat:@"%@%@", firstChar, [string substringFromIndex:1]];

Unicode with format

I want to add a bunch of Emoji icons to an array. From my earlier question I found out how to write the Emoji icons in an NSString.
Now I want to make a loop and add these icons to an array. This should be fairly easy as the unicodes are in certain ranges so something like the following should do it:
for (int i = 0; i < 10; i++)
[someArray addObject:[NSString stringWithFormat:@"\U0001F43%i", i]];
Problem is, when doing so I get an error saying:
Incomplete universal character name.
Does anyone know of a way to do this?
That's because the escape sequence \Uxxxxxxxx is evaluated by the compiler, which replaces it with the corresponding Unicode code point. Then, at runtime, the method stringWithFormat: replaces the format specifier %i with the decimal representation of i. The final string is the concatenation of the characters corresponding to \Uxxxxxxxx and the characters representing i. stringWithFormat: replaces characters with other characters; it doesn't alter existing characters.
But the problem is that here the compiler sees an incomplete escape sequence, as you only wrote 7 hexadecimal digits. So it's not able to generate the string and raises an error.
The solution is to generate the character (a simple integer value) at runtime and create a string with it using +[NSString stringWithCharacters:length:].
But if you look in the headers, you'll see that NSString stores its characters as unichar, which is defined as an unsigned short, i.e. a 16-bit value, whereas the Unicode code point U+1F430 (🐰) requires at least 17 bits.
So you cannot use a single unichar character to represent that code point. But don't worry: you can use two characters to represent it.
Lost? Here is the explanation. Unicode doesn't define characters; it defines code points, which are integer values in the range U+0000 – U+10FFFF. The implementation then decides how to represent those code points using characters. It may use any data type it wants for characters as long as it can represent all valid code points. The simplest solution would be to use 32-bit integers, but that would require too much memory, as most of the code points you use are in the first Unicode plane (U+0000 – U+FFFF). So NSString stores the code points in the UTF-16 encoding, which uses 16-bit characters.
In UTF-16, every code point beyond U+FFFF is stored using a pair of characters (known as a surrogate pair) in the range 0xD800 – 0xDFFF (the corresponding code points are explicitly reserved in the Unicode standard).
In conclusion, any valid Unicode code point may be represented using one or two unichar characters. The method to do so is described there. And here is a simple implementation:
static NSString *stringWithCodePoint(uint32_t codePoint)
{
    // NOTE: As I edited the answer, you'll find a simpler implementation of
    // this function below
    unichar characters[2];
    NSUInteger length;
    if ( codePoint <= 0xD7FF || (codePoint >= 0xE000 && codePoint <= 0xFFFF) ) {
        // Basic Multilingual Plane (excluding surrogates): one UTF-16 unit
        characters[0] = codePoint;
        length = 1;
    }
    else if ( codePoint >= 0x10000 && codePoint <= 0x10ffff ) {
        // Supplementary planes: split the 20-bit offset into a surrogate pair
        codePoint -= 0x10000;
        characters[0] = 0xD800 + (codePoint >> 10);
        characters[1] = 0xDC00 + (codePoint & 0x3ff);
        length = 2;
    }
    else {
        length = 0; // invalid code point
    }
    return [NSString stringWithCharacters:characters length:length];
}
Now that we can generate a string from any valid code point, we just need to update the code to use the function we wrote before:
for (int i = 0; i < 10; i++)
[someArray addObject:stringWithCodePoint(0x0001F430 + i)];
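To check the surrogate-pair arithmetic on its own, here is the same computation as a plain C sketch with no Foundation dependency. utf16_encode is a hypothetical helper name. For U+1F430: 0x1F430 - 0x10000 = 0xF430, the high 10 bits are 0x3D and the low 10 bits are 0x30, giving the pair 0xD83D 0xDC30.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes one or two code units into `units` and returns how many were
   written, or 0 if `cp` is not a valid code point (a surrogate value or
   anything above U+10FFFF). */
static size_t utf16_encode(uint32_t cp, uint16_t units[2])
{
    if (cp <= 0xD7FF || (cp >= 0xE000 && cp <= 0xFFFF)) {
        units[0] = (uint16_t)cp;              /* fits in a single unit */
        return 1;
    }
    if (cp >= 0x10000 && cp <= 0x10FFFF) {
        cp -= 0x10000;                        /* 20-bit offset */
        units[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
        units[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
        return 2;
    }
    return 0;
}
```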
EDIT: I just figured out a simpler method to get an NSString from a code point. It works by using -[NSString initWithBytes:length:encoding:] and the NSUTF32StringEncoding encoding:
static NSString *stringWithCodePoint(uint32_t codePoint)
{
NSString *string = [[NSString alloc] initWithBytes:&codePoint length:4 encoding:NSUTF32StringEncoding];
// You may remove the next 3 lines if you use ARC
#if ! __has_feature(objc_arc)
[string autorelease];
#endif
return string;
}
Note this similar question. As one of its answers explains, backslash escapes in a string literal are evaluated at compile time. If you want to make a Unicode character using a \Uxxxx escape, the xxxx all need to be numbers in the string literal.
What you can do instead, as per another answer is use the format specifier %C -- not together with the \Uxxxx escape, but on its own -- and pass in the full character code as an integer. (Actually, a wchar_t, which is a 32-bit integer on Mac OS X now, which you'll need since the character code you're looking for is more than 16 bits long.) To put this together with a base, you can just add the integers:
wchar_t base = 0x0001F430; // unfamiliar? we start with 0x for hexadecimal integers
for (int i = 0; i < 10; i++)
[someArray addObject:[NSString stringWithFormat:@"%C", base + i]];
There's also stringWithCharacters: but that explicitly takes a (16-bit) unichar, so you'd need to use a character sequence to encode your emoji in UTF-16.
Use %C instead of %i
so:
[someArray addObject:[NSString stringWithFormat:@"\U0001F43%C", i]];

NSFileManager contentsOfDirectoryAtPath encoding problem with samba path

I mount an SMB path using this code:
urlStringOfVolumeToMount = [urlStringOfVolumeToMount stringByAddingPercentEscapesUsingEncoding:NSMacOSRomanStringEncoding];
NSURL *urlOfVolumeToMount = [NSURL URLWithString:urlStringOfVolumeToMount];
FSVolumeRefNum returnRefNum;
FSMountServerVolumeSync( (CFURLRef)urlOfVolumeToMount, NULL, NULL, NULL, &returnRefNum, 0L);
Then I get the contents of some paths:
NSMutableArray *content = (NSMutableArray *)[[NSFileManager defaultManager] contentsOfDirectoryAtPath:path error:&error];
My problem is that every path in the "content" array that contains special characters (ü, for example) comes back as two encoded characters: ü becomes u¨.
When I log the bytes using:
[contentItem dataUsingEncoding:NSUTF8StringEncoding];
it gives me 75cc88, which is u (75) and ¨ (cc88).
What I expected is the ü character encoded in UTF-8. In bytes, it should be c3bc.
I've tried converting my path using the ISOLatin1 and MacOSRoman encodings, but since the path already contains two separate characters instead of one for ü, every conversion gives me two encoded characters.
If someone can help, thanks.
My configuration: localized in French and using Snow Leopard.
urlStringOfVolumeToMount = [urlStringOfVolumeToMount stringByAddingPercentEscapesUsingEncoding:NSMacOSRomanStringEncoding];
Unless you specifically need MacRoman for some reason, you should probably be using UTF-8 here.
NSMutableArray *content = (NSMutableArray *)[[NSFileManager defaultManager] contentsOfDirectoryAtPath:path error:&error];
My problem is every path in "content" array containing special chars (ü for example) give me 2 chars encoded : ü becomes u¨
You're expecting composed characters and getting decomposed sequences.
Since you're getting the pathnames from the file-system, this is not a problem: The pathnames are correct as you're receiving them, and as long as you pass them to something that does Unicode right, they will display correctly as well.
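The two byte sequences in the question can be reproduced by hand. This is a minimal C sketch of UTF-8 encoding (utf8_encode is a hypothetical helper name): the composed ü is the single code point U+00FC, while the decomposed form is u (U+0075) followed by the combining diaeresis U+0308.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Writes up to 4 bytes into `out` and returns the byte count,
   or 0 if `cp` is a surrogate value or above U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp <= 0x7F) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                         /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Encoding U+00FC yields c3 bc, and encoding U+0075 then U+0308 yields 75 followed by cc 88, which matches the 75cc88 seen in the log. Both are valid UTF-8 for the same visible character.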
Well, four years later I'm struggling with the same thing, but for åäö in my case.
It took a lot of time to find the simple solution.
NSString has the necessary comparator built in.
Comparing aString with anotherString, where one comes from the array returned by NSFileManager's contentsOfDirectoryAtPath:error:, is as simple as:
if( [aString compare:anotherString] == NSOrderedSame )
The compare: method takes care of bringing both strings into a comparable canonical form. In effect: "if they look the same, they are the same".

Filtering/Sanitizing User Input. Best Approach

To clarify, I want to check for valid characters. For first name, last name I want to check for [A-Za-z]. For Email I want to check for chars valid for email.
What algorithm would I use to CHECK user input? Specifically, I'd like to CHECK the user's first name, last name and email address for valid characters before I add to database.
Obviously, something like this isn't sufficient:
if ( [firstName.text isEqualToString:@""] || [lastName.text isEqualToString:@""] || [emailAddress.text isEqualToString:@""]) { ... }
Revised for your edit:
What you probably want to look at is NSCharacterSet. It lets you define explicit sets of characters, which you can then use to test strings for presence of those characters. A trivial example:
NSCharacterSet* nonAlphaNumericSet = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
if ([someString rangeOfCharacterFromSet:nonAlphaNumericSet].location != NSNotFound)
{
NSLog(@"It's not alphanumeric!");
}
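Note that alphanumericCharacterSet also accepts digits and non-ASCII letters, so it is broader than the [A-Za-z] check the question asks for. If you really want only those 52 letters, a direct range test is enough. A minimal C sketch follows; is_ascii_letters is a hypothetical helper name, and rejecting the empty string is my assumption, based on the emptiness check in the question.

```c
#include <stdbool.h>

/* Hypothetical helper: true only if `s` is non-empty and every byte is an
   ASCII letter A-Z or a-z. Bytes of multi-byte UTF-8 sequences fall outside
   both ranges, so accented letters are rejected. */
static bool is_ascii_letters(const char *s)
{
    if (*s == '\0')
        return false;
    for (; *s != '\0'; ++s) {
        bool upper = (*s >= 'A' && *s <= 'Z');
        bool lower = (*s >= 'a' && *s <= 'z');
        if (!upper && !lower)
            return false;
    }
    return true;
}
```

The same restriction could be expressed with an NSCharacterSet built from the explicit list of allowed letters and then inverted, as in the answer above.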
