I have a filepath that can have both / and \ and multiple of them, for example
\\abc/tr\record.csv
or
\\re/nst/opr\etc/some/nov\
I would like to get the first 2 pieces of it (abc and tr in the first example and re and nst in the second)
How can I do this in visual C++? (windows)
With regexes or some msdn function? Maybe there is an msdn function to normalize the filepath to all \ or all / and only one of them? (I cant use extra libs like boost)
I think you can simply "parse" the string in a loop if you are sure that you only ever need encounter \ and / as path separator. Yes, you can use a regex or something more complicated, but it's a pretty simple problem as is.
Something like this - note, the code is untested and most likely won't compile, but it'd illustrate the algorithm:
std::string::citerator beg = path.cbegin();
std::string::citerator end = path.cend();
unsigned separator_count = 0;
while(beg != end && separator_count < 4)
{
if (*beg == '/' || *beg == '\\')
++separator_count;
++beg;
}
std::string extracted_path = (separator_count == 3)
? path.substr(path.cbegin(), std::distance(path.cbegin(), beg))
: std::string();
Related
I absolutely disagree that this question is a duplicate! I am asking for an efficiency way to replace hundreds of words at once. This is an algorithm question! All the provided links are about to replace one word. Should I repeat that expensive operation hundreds of times? I'm sure that there are better ways as a suffix tree where I sort out html while building that tree. I removed that regex tag since for no good reason you are focusing on that part.
I want to translate a given set of words (more then 100) and translate them. My first idea was to use a simple regular expression that works better then expected. As sample:
const input = "I like strawberry cheese cake with apple juice"
const en2de = {
apple: "Apfel",
strawberry: "Erdbeere",
banana: "Banane",
/* ... */}
input.replace(new RegExp(Object.keys(en2de).join("|"), "gi"), match => en2de[match.toLowerCase()])
This works fine on the first view. However it become strange if you words which contains each other like "pineapple" that would return "pineApfel" which is totally nonsense. So I was thinking about checking word boundaries and similar things. While playing around I created this test case:
Apple is a company
That created the output:
Apfel is a company.
The translation is wrong, which is somehow tolerable, but the link is broken. That must not happen.
So I was thinking about extend the regex to check if there is a quote before. I know well that html parsing with regex is a bad idea, but I thought that this should work anyway. In the end I gave up and was looking for solutions of other devs and found on Stack Overflow a couple of questions, all without answers, so it seems to be a hard problem (or my search skills are bad).
So I went two steps back and was thinking to implement that myself with a parser or something like that. However since I have multiple inputs and I need to ignore the case I was thinking what the best way is.
Right now I think to build a dictionary with pointers to the position of the words. I would store the dict in lower case so that should be fast, I could also skip all words with the wrong prefix etc to get my matches. In the end I would replace the words from the end to the beginning to avoid breaking the indices. But is that way efficiency? Is there a better way to achieve that?
While my sample is in JavaScript the solution must not be in JS as long the solution doesn't include dozens of dependencies which cannot be translated easy to JS.
TL;DR:
I want to replace multiple words by other words in a case insensitive way without breaking html.
You may try a treeWalker and replace the text inplace.
To find words you may tokenize your text, lower case your words and map them.
const mapText = (dic, s) => {
return s.replace(/[a-zA-Z-_]+/g, w => {
return dic[w.toLowerCase()] || w
})
}
const dic = {
'grodzi': 'grodzila',
'laaaa': 'forever',
}
const treeWalker = document.createTreeWalker(
document.body,
NodeFilter.SHOW_TEXT
)
// skip body node
let currentNode = treeWalker.nextNode()
while(currentNode) {
const newS = mapText(dic, currentNode.data)
currentNode.data = newS
currentNode = treeWalker.nextNode()
}
p {background:#eeeeee;}
<p>
grodzi
LAAAA
</p>
The link stay untouched.
However mapping each word in an other language is bound to fail (be it missing representation of some word, humour/irony, or simply grammar construct). For this matter (which is a hard problem on its own) you may rely on some tools to translate data for you (neural networks, api(s), ...)
Here is my current work in progress solution of a suffix tree (or at least how I interpreted it). I'm building a dictionary with all words, which are not inside of a tag, with their position. After sorting the dict I replace them all. This works for me without handling html at all.
function suffixTree(input) {
const dict = new Map()
let start = 0
let insideTag = false
// define word borders
const borders = ' .,<"\'(){}\r\n\t'.split('')
// build dictionary
for (let end = 0; end <= input.length; end++) {
const c = input[end]
if (c === '<') insideTag = true
if (c === '>') {
insideTag = false
start = end + 1
continue
}
if (insideTag && c !== '<') continue
if (borders.indexOf(c) >= 0) {
if(start !== end) {
const word = input.substring(start, end).toLowerCase()
const entry = dict.get(word) || []
// save the word and its position in an array so when the word is
// used multiple times that we can use this list
entry.push([start, end])
dict.set(word, entry)
}
start = end + 1
}
}
// last word handling
const word = input.substring(start).toLowerCase()
const entry = dict.get(word) || []
entry.push([start, input.length])
dict.set(word, entry)
// create a list of replace operations, we would break the
// indices if we do that directly
const words = Object.keys(en2de)
const replacements = []
words.forEach(word => {
(dict.get(word) || []).forEach(match => {
// [0] is start, [1] is end, [2] is the replacement
replacements.push([match[0], match[1], en2de[word]])
})
})
// converting the input to a char array and replacing the found ranges.
// beginning from the end and replace the ranges with the replacement
let output = [...input]
replacements.sort((a, b) => b[0] - a[0])
replacements.forEach(match => {
output.splice(match[0], match[1] - match[0], match[2])
})
return output.join('')
}
Feel free to leave a comment how this can be improved.
initlNum453 = List[];
num1 = 2;
(*Pt1, initial work to make initlNum full of good variables*)
algorithmicNum = 1;
For[i7 = 1, i7 <= (num1 + 1)^2, i7++,
AppendTo[initlNum453, algorithmicNum];
If[((algorithmicNum) == (num1 + 1)), algorithmicNum = 1,
algorithmicNum++];
];
(*Pt2, delete unneeded variables*)
deleteValue = 1;
Do[
Delete[initlNum453, deleteValue];
deleteValue = (deleteValue + num1 + 2);
, {num1 + 1}
]
Here's a snippet of the code I'm trying to make (it involves pseudo-automating Lagrange polynomials). It should be simple; the first part creates a series of numbers in a list, and then the second should be delete a particular section (e.g., the 1,4,7 if n=2).
For some reason, one of the following occurs:
No Error, but the elements in the list remains the same/no elements get deleted
Taking out the semicolon says that the "Tag Times in ___ is Protected"-- can someone explain what exactly this means?
When putting this into a Module, the error states that the expression .' cannot be used as a part specification. Use Key[.`] instead.
In any case, I don't understand why something as simple as this is just doesn't work on Mathematica. The "Delete" function works outside of a for/do loop, but doesn't inside-- can someone explain why or tell me what I did wrong?
Thanks for your help! I appreciate it!
You need to write something like
initlNum453 = Delete[initlNum453, deleteValue]
When the expression is too long and cannot fit into the display area or for source code neatness, we will write the long expression in multi-line. I tried to check related programming rules but I cannot find some related about the symbol should put at end of line or first of next line.
For example long IF expression in C# code,
if (!double.TryParse(text4Hours.Text, out outputValue)||
!double.TryParse(text4Hours.Text, out outputValue)||
!double.TryParse(text10Hours.Text, out outputValue)){ }
OR
if (!double.TryParse(text4Hours.Text, out outputValue)
||!double.TryParse(text4Hours.Text, out outputValue)
||!double.TryParse(text10Hours.Text, out outputValue)){ }
The parallel symbol || can be put at end of line or first of next line, any rules states that we must follow the first or second one? What are the advantages to do like that?
Another example is
answer = amount4Hours * double.Parse(text4Hours.Text) +
amount8Hours * double.Parse(text8Hours.Text) -
break45 * double.Parse(text8Hours.Text) +
amount10Hours * double.Parse(text10Hours.Text) -
break45 * double.Parse(text10Hours.Text);
OR
answer = amount4Hours * double.Parse(text4Hours.Text)
+amount8Hours * double.Parse(text8Hours.Text)
-break45 * double.Parse(text8Hours.Text)
+amount10Hours * double.Parse(text10Hours.Text)
-break45 * double.Parse(text10Hours.Text);
I know we should try to avoid have long expression but if consider the reason of display area of the text editor is small or other issues, then this question will still pop out.
Thanks.
I believe any answers for this question will be opinion based, but I also believe that most people would prefer the first approach. Imagine something like:
if (expression1 &&
expression243787 &&
expression3 &&
expression938533485934 &&
expression7247)
Now imagine that you catch a bug on your code and realize that the logic error is in this expression, where all && operators should be || instead. It might just be an opinion, but I do think that it would be much easier to fix that if the code was like this:
if (expression1
&& expression243787
&& expression3
&& expression938533485934
&& expression7247)
This is just an example that illustrates why I think this practice makes code refactoring easier, besides other benefits, like readability.
Putting the symbol at the start of the next line will show the symbol more easily. If at the end of the previous line, the symbol sequence is harder to follow.
I need to go to the registry and check a programs installed version. I am using perl to a whole lot of others things but the registry checking part isn't working. The program version has to be 9.7 and up so it could be 9.8 or 9.7.5.
When I install the program it shows 9.7.4 but I just need the 9.7 to be checked.
Bellow is me going to DisplayVersion which is a REG_SZ which shows 9.7.4.
OR
I could use VersionMajor and VersionMinor together which is a REG_DWORD. Which for Major is 9 and Minor is 7.
$ProgVersion= `$rootpath\\system32\\reg\.exe query \\\\$ASSET\\HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\{9ACB414D-9347-40B6-A453-5EFB2DB59DFA} \/v DisplayVersion`;
if ($ProgVersion == /9.7/)
This doesn't work I could make it 9.200 and it still works. I tried to use this instead and it still wouldn't work. This next part is assuming that a new client needs to be install if it goes from 9.7. I was trying to use Great than or equal to, but it didn't work.
$ProgVersionMajor= `$rootpath\\system32\\reg\.exe query \\\\$ASSET\\HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\{9ACB414D-9347-40B6-A453-5EFB2DB59DFA} \/v VersionMajor`;
$ProgVersionMinor= `$rootpath\\system32\\reg\.exe query \\\\$ASSET\\HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\{9ACB414D-9347-40B6-A453-5EFB2DB59DFA} \/v VersionMinor`;
if (($ProgVersionMajor=~ /9/) && ($ProgVersionMinor=~ /7/))
Any help on doing this correctly or fixing what I am doing.
Several things:
You don't mention it, but are you using the Perl module Win32::TieRegistry? If not, you should. It'll make handling the Windows registry much easier.
In the Perl documentation, you can look at Version String under Scalar Value Constructors. This will make manipulating version strings much, much easier. Version strings have either more than one decimal place in them, or start with the letter v. I always prefix them with v to make it obvious what it is.
Here's a sample program below showing you how they can be used in comparisons:
#! /usr/bin/env perl
#
use strict;
use warnings;
my $version = v4.10.3;
for my $testVersion (v3.5.2, v4.4.1, v5.0.1) {
if ($version gt $testVersion) {
printf qq(Version %vd is greater than test %vd\n), $version, $testVersion;
}
else {
printf qq(Version %vd is less than test %vd\n), $version, $testVersion;
}
}
Note that I can't just print version strings. I have to use printf and sprintf and use the %vd vector decimal format to print them out. Printing version strings via a regular print statement can cause all sorts of havoc since they're really unicode representations. You put them in a print statement and you don't know what you're getting.
Also notice that you do not put quotes around them! Otherwise, you'll just make them regular strings.
NEW ANSWER
I was trying to find a way to convert a string into a v-string without downloading an optional package like Perl::Version or (Version), and I suddenly read that v-strings are deprecated, and I don't want to use a deprecated feature.
So, let's try something else...
We could simply divide up version numbers into their constituent components as arrays:
v1.2.3 => $version[0] = 1, $version[1] = 2, $version[2] = 3
By using the following bit of code:
my #version = split /\./, "9.7.5";
my #minVersion = split /\./, "9.7"
Now, we can each part of the version string against the other. In the above example, I compare the 9 of #version with the 9 of #version, etc. If #version was 9.6 I would have compared the 6 in #version against the 7 in #minVersion and quickly discovered that #minVersion is a higher version number. However, in both the second parts are 7. I then look at the third section. Whoops! #minVersion consists of only two sections. Thus, #version is bigger.
Here's a subroutine that does the comparison. Note that I also verify that each section is an integer via the /^\d+$/ regular expression. My subroutine can return four values:
0: Both are the same size
1: First Number is bigger
2: Second Number is bigger
undef: There is something wrong.
Here's the program:
my $minVersion = "10.3.1.3";
my $userVersion = "10.3.2";
# Create the version arrays
my $result = compare($minVersion, $userVersion);
if (not defined $results) {
print "Non-version string detected!\n";
}
elsif ($result == 0) {
print "$minVersion and $userVersion are the same\n";
}
elsif ($result == 1) {
print "$minVersion is bigger than $userVersion\n";
}
elsif ($result == 2) {
print "$userVersion is bigger than $minVersion\n";
}
else {
print "Something is wrong\n";
}
sub compare {
my $version1 = shift;
my $version2 = shift;
my #versionList1 = split /\./, $version1;
my #versionList2 = split /\./, $version2;
my $result;
while (1) {
# Shift off the first value for comparison
# Returns undef if there are no more values to parse
my $versionCompare1 = shift #versionList1;
my $versionCompare2 = shift #versionList2;
# If both are empty, Versions Matched
if (not defined $versionCompare1 and not defined $versionCompare2) {
return 0;
}
# If $versionCompare1 is empty $version2 is bigger
if (not defined $versionCompare1) {
return 2;
}
# If $versionCompare2 is empty $version1 is bigger
if (not defined $versionCompare2) {
return 1;
}
# Make sure both are numeric or else there's an error
if ($versionCompare1 !~ /\^d+$/ or $versionCompare2 !~ /\^\d+$/) {
return;
}
if ($versionCompare1 > $versionCompare2) {
return 1;
}
if ($versionCompare2 > $versionCompare1) {
return 2;
}
}
}
Using Win32::TieRegistry
You said in your answer you didn't use Win32::TieRegistry. I just want to show you what it can do for the readability of your program:
Your Way
$ProgVersion= `$rootpath\\system32\\reg\.exe query \\\\$ASSET\\HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\{9ACB414D-9347-40B6-A453-5EFB2DB59DFA} \/v DisplayVersion`;
With Win32::TieRegistry
use Win32::TieRegistry ( TiedHash => '%RegHash', DWordsToHex => 0 );
my $key = $TiedHash->{LMachine}->{Software}->{Wow6432Node}->{Microsoft}->{Windows}->{CurrentVersion}->{Uninstall}->{9ACB414D-9347-40B6-A453-5EFB2DB59DFA}->{Version};
my $programValue = $key->GetValue;
my $stringValue = unpack("L", $programValue);
Or, you can split it up:
my $MSSoftware = $TiedHash->{LMachine}->{Software}->{Wow6432Node}->{Microsoft};
my $uninstall = $MSSoftware->{Windows}->{CurrentVersion}->{Uninstall};
my $programVersion = $uninstall->{9ACB414D-9347-40B6-A453-5EFB2DB59DFA}->{Version};
See how much easier that's to read. You can also use this to test keys too.
(Word 'o Warning: I don't have a Windows machine in front of me, so I didn't exactly check the validity of the code. Try playing around with it and see what you get.)
Suppose I have this function:
std::string Func1(std::string myString)
{
//do some string processing
std::string newString = Func2(myString)
return newString;
}
How do I set a conditional break when newString has a specific value? (without changing the source)
Setting the condition newString == "my value" didn't work. The breakpoints were disabled with an error overloaded operator not found.
There is a much easier way in Visual Studio 2010/2012.
To accomplish what you are looking for in ANSI use this:
strcmp(newString._Bx._Ptr,"my value")==0
And in unicode (if newString were unicode) use this:
wcscmp(newString._Bx._Ptr, L"my value")==0
There are more things you can do than just a compare, you can read more about it here:
http://blogs.msdn.com/b/habibh/archive/2009/07/07/new-visual-studio-debugger-2010-feature-for-c-c-developers-using-string-functions-in-conditional-breakpoints.aspx
In VS2017, I was able to set the condition as:
strcmp(&newString[0], "my value") == 0
Some searching has failed to turn up any way to do this. Suggested alternatives are to put the test in your code and add a standard breakpoint:
if (myStr == "xyz")
{
// Set breakpoint here
}
Or to build up your test from individual character comparisons. Even looking at individual characters in the string is a bit dicey; in Visual Studio 2005 I had to dig down into the member variables like
myStr._Bx._Buf[0] == 'x' && myStr._Bx._Buf[1] == 'y' && myStr._Bx._Buf[2] == 'z'
Neither of these approaches is very satisfactory. We should have better access to a ubiquitous feature of the Standard Library.
In VS2017 you can do
strcmp(newString._Mypair._Myval2._Bx._Buf,"myvalue")==0
While I've had to work around this using something similar to Brad's answer (plus using DebugBreak() to break right from the code), sometimes editing/recompiling/re-running a bit of code is either too time consuming or just plain impossible.
Luckily, it's apparently possible to spelunk into the actual members of the std::string class. One way is mentioned here -- and though he calls out VS2010 specifically, you can still access individual chars manually in earlier versions. So if you're using 2010, you can just use the nice strcmp() functions and the like (more info), but if you're like me and still have 2008 or earlier, you can come up with a raggedy, terrible, but functional alternative by setting a breakpoint conditional something like:
strVar._Bx._Ptr[0] == 'a' && strVar._Bx._Ptr[1] == 'b' &&
strVar._Bx._Ptr[2] == 'c'
to break if the first three characters in strVar are "abc". You can keep going with additional chars, of course. Ugly.. but it's saved me a little time just now.
VS2012:
I just used the condition below because newString._Bx._Ptr ( as in OBWANDO's answer ) referenced illegal memory
strcmp( newString._Bx._Buf, "my value")==0
and it worked...
#OBWANDO (almost) has the solution, but as multiple comments rightly point out, the actual buffer depends on the string size; I see 16 to be the threshold. Prepending a size check to the strcmp on the appropriate buffer works.
newString._Mysize < 16 && strcmp(newString._Bx._Buf, "test value") == 0
or
newString._Mysize >= 16 && strcmp(newString._Bx._Ptr, "ultra super long test value") == 0
Tried to use strcmp in gdb8.1 under ubuntu18.04, but it doesn't work:
(ins)(gdb) p strcmp("a", "b")
$20 = (int (*)(const char *, const char *)) 0x7ffff5179d60 <__strcmp_ssse3>
According to this answer, strcmp, is a special IFUNC, one can setup condition like this:
condition 1 __strcmp_ssse3(camera->_name.c_str(), "ping")==0
It's pretty ugly, don't want to do it the second time.
This answer gives a much better solution, it use std::string::compare :
condition 1 camera->_name.compare("ping") == 0
In VS2015 you can do
newstring[0]=='x' && newString[1]=='y' && newString[2]=='z'
Comparing string works better than comparing characters
strcmp(name._Mypair._Myval2._Bx._Buf, "foo")==0
This works, but is very inconvenient to use and error prone.
name._Mypair._Myval2._Bx._Buf[0] == 'f' &&
name._Mypair._Myval2._Bx._Buf[1] == '0' &&
name._Mypair._Myval2._Bx._Buf[2] == '0'
You could convert it into a c string using c_str() like so:
$_streq(myStr.c_str(), "foo")
To set a conditional breakpoint in std::string you need to set it on real internal members of std::string. What you see on watch window is simplified.
You can display real structure of a variable in the watch window by using ,! suffix. In your example:
newString,!
For MSVC 2015 – 2019 you can use:
For string that were never longer than 15 characters:
(newString._Mypair._Myval2._Myres < 16) ?
strcmp(newString._Mypair._Myval2._Bx._Buf, "short") == 0 :
false
For (even historically) longer strings:
(newString._Mypair._Myval2._Myres < 16) ? false :
strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_str_value_longer_than_16_chars") == 0
Beware:
The variable name is written twice in each condition!
You need whole expression on single line. Use the copy-paste versions bellow.
Universal condition needs to put the test value twice and variable name three times:
(newString._Mypair._Myval2._Myres < 16) ?
strcmp(newString._Mypair._Myval2._Bx._Buf, "My_test_string") == 0 :
strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_string") == 0
Notes: use wcscmp instead of strcmp if you are working with std::wstring.
Find more info on small string optimization in C++ https://vorbrodt.blog/2019/03/30/sso-of-stdstring/ includes sample code to find size of string's internal buffer.
All std:string and std::wstring single line versions for your copy paste convenience:
(newString._Mypair._Myval2._Myres < 16) ? strcmp(newString._Mypair._Myval2._Bx._Buf, "short") == 0 : false
(newString._Mypair._Myval2._Myres < 16) ? false : strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_str_value_longer_than_16_chars") == 0
(newString._Mypair._Myval2._Myres < 16) ? strcmp(newString._Mypair._Myval2._Bx._Buf, "My_test_string") == 0 : strcmp(newString._Mypair._Myval2._Bx._Ptr, "My_test_string") == 0
(newString._Mypair._Myval2._Myres < 16) ? wcscmp(newString._Mypair._Myval2._Bx._Buf, L"short") == 0 : false
(newString._Mypair._Myval2._Myres < 16) ? false : wcscmp(newString._Mypair._Myval2._Bx._Ptr, L"My_test_str_value_longer_than_16_chars") == 0
(newString._Mypair._Myval2._Myres < 16) ? wcscmp(newString._Mypair._Myval2._Bx._Buf, L"My_test_string") == 0 : wcscmp(newString._Mypair._Myval2._Bx._Ptr, L"My_test_string") == 0
All above copy/paste samples tested on MSVC version 16.9.10 and program for Windows 10.