sscanf skips capital 'N' letter - windows

I have a strange sscanf problem with the capital letter 'N' (maybe I misunderstand something; please correct me):
Example 1:
char cBuff[128];
sscanf("GUIDNameNENE","%*[GUIDName]%127s" ,cBuff);
returns cBuff:ENE
Example 2:
char cBuff[128];
sscanf("GUIDNamenENE","%*[GUIDName]%127s" ,cBuff);
returns cBuff:nENE
Example 3:
char cBuff[128];
sscanf("GUIDNaMENE","%*[GUIDNa]%127s" ,cBuff);
returns cBuff:ENE
I have tried many other variants, but it always skips the capital N.
Where is the problem?
Thank you in advance!

%[GUIDName] is not a weird way of quoting and matching an exact string. It defines a set of characters that will match. They will match in any order, and they will match repeatedly.
The longest match for the set %[GUIDName] in your input is GUIDNameN.
You could of course say %*[G]%*[U]%*[I]%*[D]%*[N]%*[a]%*[m]%*[e], which matches the letters only in that order and so would not eat the N that follows the e. Each set still matches a run of its character, though, so it would still eat multiple es.

I would guess it skips the capital N because N is part of the set of characters that you ignore. The key point is that what you specify between the brackets is a set of characters to match, not a fixed sequence: sscanf matches the longest run consisting only of the characters listed between the '[' and the closing ']', if I recall correctly.
You could try specifying the size for the set of characters to be skipped like this:
sscanf("GUIDNameNENE","%*8[GUIDName]%127s" ,cBuff);
But that will of course only work if the prefix is always eight characters long, and if it is, you could simply ignore the first eight characters like this:
sscanf("GUIDNameNENE","%*8s%127s" ,cBuff);
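If the prefix is a fixed string, another option is to put it into the format as plain literal text, because non-whitespace characters in a scanf format must match the input exactly and in order. Here is a minimal sketch comparing the three approaches; the function names are made up for illustration, and each `out` buffer is assumed to hold at least 128 bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Scanset from the question: skips the longest run of characters drawn
 * from the set {G,U,I,D,N,a,m,e}, in any order, then reads one word. */
int skip_with_scanset(const char *input, char out[128])
{
    assert(input != NULL && out != NULL);
    return sscanf(input, "%*[GUIDName]%127s", out);
}

/* Width-limited scanset: skips at most 8 characters from the set. */
int skip_with_width(const char *input, char out[128])
{
    assert(input != NULL && out != NULL);
    return sscanf(input, "%*8[GUIDName]%127s", out);
}

/* Literal text: "GUIDName" must appear exactly at the start of the input. */
int skip_literal(const char *input, char out[128])
{
    assert(input != NULL && out != NULL);
    return sscanf(input, "GUIDName%127s", out);
}
```

With the input "GUIDNameNENE", the scanset version leaves ENE in the buffer, while the width-limited and literal versions both leave NENE. The literal version also returns 0 when the input does not start with GUIDName, which makes a wrong prefix easy to detect.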

Related

Discard contractions from string

I have a special use case where I want to discard all the contractions from the string and select only words that consist of letters and do not contain any special character.
For example:
string = "~ ASAP ASCII Achilles Ada Stackoverflow James I'd I'll I'm I've"
string.scan(/\b[A-z][a-z]+\b/)
#=> ["Achilles", "Ada", "Stackoverflow", "James", "ll", "ve"]
Note: it does not discard the whole words I'll and I've; it keeps "ll" and "ve".
Can someone please help me discard whole words that are contractions?
Try this Regex:
(?:(?<=\s)|(?<=^))[a-zA-Z]+(?=\s|$)
Explanation:
(?:(?<=\s)|(?<=^)) - finds the position immediately preceded by either start of the line or by a white-space
[a-zA-Z]+ - matches 1+ occurrences of a letter
(?=\s|$) - The substring matched above must be followed by either a whitespace or end of the line
Update:
To make sure that not all the letters are in upper case, use the following regex:
(?:(?<=\s)|(?<=^))(?=\S*[a-z])[a-zA-Z]+(?=\s|$)
The only thing added here is (?=\S*[a-z]), which means that there must be at least one lowercase letter.
I know that there's an accepted answer already, but I'd like to give my own shot:
(?<=\s|^)\w+[a-z]\w*
This regex is shorter and more efficient (157 steps against 315 from the accepted answer).
The explanation is rather simple:
(?<=\s|^) - This is a positive lookbehind. It means that we want strings preceded by a whitespace character or the start of the string.
\w+[a-z]\w* - This means that we want strings composed of word characters (letters, digits and underscores) containing at least one lowercase letter, thus discarding words that are entirely uppercase. Along with the positive lookbehind, the whole regex ends up discarding words containing special characters.
NOTE: this regex won't take one-letter words into account. If you want to include them, use \w*[a-z]\w* instead, at a small efficiency cost.

Regex incorrectly matching punctuation (including spaces)

I am trying to check if a string contains at least one lowercase letter, uppercase letter, and a number, but not punctuation (including spaces).
For example
4aBc8Fk3 should match
4aBc 8.;3 should not match
I tried the following, but it matches spaces:
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{6,}[^[:punct:]]$
Any ideas how to not match strings containing punctuation including spaces?
As far as I understand, the regular expression you have there does the following (I'm not familiar with the Ruby variety, and I'm still quite new to regex myself; this will give you an idea, but may not be 100% correct):
Go to the beginning of the string
Ensure the string matches any number of any characters followed by a lowercase letter, e.g. --a
Ensure the string matches any number of any characters followed by an uppercase letter, e.g. --aA
Ensure the string matches any number of any characters followed by a number, e.g. --aA0
If that is all true, make sure the beginning of the string is followed by at least 6 characters of any kind, e.g. --aA0-
Ensure that is followed by a single non-punctuation character (although this is the part I'm not sure about, as I haven't used character classes before, and don't know if it's [^[:punct:]] or [^:punct:]), e.g. --aA0-c
Ensure that is followed directly by the end of the string
Now, the lookaheads would also allow a different order of occurrences, e.g. 0---Aa, as long as the string contains any characters followed by what they are looking for.
What you probably want is ^[a-zA-Z0-9]{6,}$, i.e. at least six characters, with the characters being letters and numbers (though that would also allow aaaaaa, for example).
Maybe try ^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])[a-zA-Z0-9]{6,}$ to make sure each group is present, and to get alpha-numerical characters (at least six of them) only.
I always use a tool such as http://www.regexpal.com/ to slowly build up my regex and to see where I go wrong, deconstructing a "bad" regex until I get to a "good" one, then slowly adding to it again.
Hope that helps. :)
P.S.: I'm still a bit unclear how many characters you want to match in total, i.e. if the string is fixed length or not...?

Regex that allows for A-Z, a-z, 0-9, and dashes in the middle, never on the ends?

I'm working to create a ruby regex that meets the following conditions:
Supported:
A-Z, a-z, 0-9, dashes in the middle but never starting or ending in a dash.
At least 5, no more than 500 characters
So far I have:
[0-9a-z]{5,500}
Any suggestions on how to update to meet the criteria above?
Thanks
[A-Za-z\d][-A-Za-z\d]{3,498}[A-Za-z\d]
If you are willing to treat _ as a letter also, it's even simpler:
\w[-\w]{3,498}\w
This should work:
[0-9A-Za-z][0-9A-Za-z-]{3,498}[0-9A-Za-z]
Here you go:
/^[0-9A-Za-z][0-9A-Za-z\-]{3,498}[0-9A-Za-z]$/
or if you want the beginning and end to be only 0-9,A-Z,a-z (instead of non dash) then:
Explanation:
The first ^ matches beginning of string.
The next [] matches one character from A-Z, a-z, 0-9.
The next [] matches 3 to 498 chars of A-Z,a-z,0-9,dash. Note that we match 3 to 498 chars because we match one char in the beginning and one in the end.
The next [] again matches one character from A-Z, a-z, 0-9.
And lastly we match $ for the end of the string.
This assumes that there are either always dashes or never dashes. It also assumes only one dash is allowed between alphanumeric characters. It's the only way I can think of offhand to limit the number of characters instead of the number of instances of the string.
(([0-9a-zA-Z]{4,499})|([0-9a-zA-Z][-]?){2,249})[0-9a-zA-Z]
Assuming there's no limit to the number of adjacent dashes allowed, this would work:
[0-9a-zA-Z][0-9a-zA-Z-]{3,498}[0-9a-zA-Z]

Please Help me about preg match Validation

I need to validate a password. It should meet the following requirements:
The password should have at least 8 characters
The password should have at least 1 uppercase, 1 lowercase, 1 number, and 1 special character
The password should have no consecutive characters (e.g. 12345 or abcd)
Please help me do this. Any suggestions would be a big help.
Thank you
Iterate over the string. If the character is uppercase, set bool isUppercase to true... If the character is a special character, set bool isSpecialCharacter to true. If the difference between this character and the previous character is 1, then you have two consecutive characters, and you can stop iterating (set bool haveConsecutiveCharacters to true).
The thing about consecutive characters is that if one of them is a special character, then they are not really consecutive (consider 'Z' and '[', which are next to each other in the ASCII table).
After iterating check if all booleans are true and there are no consecutive characters.
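The loop described above could be sketched roughly like this. The function name password_ok is made up for illustration, and the sketch rests on a few assumptions that are not in the original: the input is ASCII, "special" means whatever ispunct() accepts, and a consecutive pair is two adjacent alphanumeric characters whose codes differ by exactly +1 (so "ab" and "45" are rejected, while descending pairs like "ba" are not checked).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if pw has at least 8 characters, at least one uppercase
 * letter, lowercase letter, digit and special character, and no two
 * adjacent ascending alphanumerics (assumptions: ASCII, ispunct()). */
bool password_ok(const char *pw)
{
    assert(pw != NULL);

    bool has_upper = false, has_lower = false;
    bool has_digit = false, has_special = false;
    size_t len = 0;
    unsigned char prev = 0;

    for (const unsigned char *p = (const unsigned char *)pw; *p != '\0'; ++p, ++len) {
        unsigned char c = *p;

        if (isupper(c))      has_upper = true;
        else if (islower(c)) has_lower = true;
        else if (isdigit(c)) has_digit = true;
        else if (ispunct(c)) has_special = true;

        /* Stop at the first ascending neighbour pair such as "ab" or "45".
         * Pairs like 'Z' '[' are ignored because '[' is not alphanumeric. */
        if (len > 0 && isalnum(prev) && isalnum(c) && c - prev == 1)
            return false;

        prev = c;
    }

    return len >= 8 && has_upper && has_lower && has_digit && has_special;
}
```

For example, password_ok("Xk7#mQ2z") is true, while password_ok("Abc12345!") is false because of the adjacent "bc".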
If you really want a regexp for this, you'll have to use assertions:
/^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9]).{8,}$/
Now, the hard part is no consecutive characters. I suggest doing it with a loop instead of doing it with a regexp (actually, I don't know how to do it with a regexp).

Regular expression Unix shell script

I need to filter all lines with words starting with a letter followed by zero or more letters or numbers, but no special characters (basically names which could be used for a C++ variable).
egrep '^[a-zA-Z][a-zA-Z0-9]*'
This works fine for words such as "a" and "ab10", but it also includes words like "b.b". I understand that the * at the end of the expression is the problem. If I replace * with + (one or more), it skips words that contain only one letter, so it doesn't help.
EDIT:
I should be more precise. I want to find lines with any number of possible words as described above. Here is an example:
int = 5;
cout << "hello";
//some comments
In that case it should print all of the lines above, as they all include at least one word which fits the described conditions, and a line does not have to begin with a letter.
Your solution will look roughly like this example. In this case, the regex requires that the "word" be preceded by space or start-of-line and then followed by space or end-of-line. You will need to modify the boundary requirements (the parenthesized stuff) as needed.
'(^| )[a-zA-Z][a-zA-Z0-9]*( |$)'
Assuming the line ends after the word:
'^[a-zA-Z][a-zA-Z0-9]+|^[a-zA-Z]$'
You have to add something to it. It might be that the rest of the line can be whitespace, or you can just append the end-of-line anchor (AFAIR it was $).
Your problem lies in the ^ and $ anchors, which match the start and end of the line respectively. You want the line to match if it contains a word, so getting rid of the anchors does what you want:
egrep '[a-zA-Z][a-zA-Z0-9]+'
Note that the + only matches words of length 2 and higher; a * in that place would match single characters too.
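To see how the patterns suggested in this thread behave on individual lines, here is a small sketch using the POSIX regcomp/regexec functions with REG_EXTENDED, which is the extended syntax egrep uses. The helper name line_matches is made up for illustration, and the sketch assumes a POSIX system that provides <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if line contains a match for the POSIX extended regular
 * expression pattern, 0 if it does not, and -1 if the pattern fails
 * to compile. */
int line_matches(const char *pattern, const char *line)
{
    assert(pattern != NULL && line != NULL);

    regex_t re;
    /* REG_NOSUB: only a yes/no answer is needed, not match positions. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, the pattern from the question, ^[a-zA-Z][a-zA-Z0-9]*, matches "b.b" because only a prefix of the line has to match. The space-delimited pattern (^| )[a-zA-Z][a-zA-Z0-9]*( |$) rejects "b.b" and accepts "int = 5;", and the unanchored pattern with + does not match "a = 5;" because it needs a word of at least two characters.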

Resources