C++ substring check

C++ substring check - c++11

I'm given two strings s1, s2 and an integer i. I want to check if the substring of s1 starting at index i of the same length as s2 equals s2. I know this is trivial to implement with a little loop, but I would prefer to use a library method if it exists.

As discussed in the comments, here's an in place solution using string::compare (http://www.cplusplus.com/reference/string/string/compare/)
s1.compare(i,s2.length(),s2)==0
Which would throw an std::out_of_range exception if (i>=s1.length()). Otherwise, it works. To avoid the out_of_range exception, just add something to short circuit the compare:
(i<s1.length()) && (s1.compare(i,s2.length(),s2)==0)

I would do it using std::equal with a check of proper size first. That way it directly works for vectors as well. The check of sizes is necessary to avoid invalid iterators:
if ( i + s2.size() > s1.size() )
{
return false;
}
auto s1beg = s1.cbegin() + i;
auto s1end = s1.cbegin() + i + s2.size();
return std::equal(s1beg, s1end, s2.cbegin());
If you want to support any ranges, use std::advance to create the start and end iterator:
auto s1beg = s1.cbegin();
std::advance(s1beg, i);
auto s2end = s1beg;
std::advance(s2beg, s2.size());
Here's a demo for both strings and a list. A further design decision may allow different container types as well.

bool is_sub_string=i <= s1.size() && s1.substr(i, s2.size()) == s2;

Related

how to get the if else condition using filter operation of java stream?

Question :
from the list of an integer get the square of odd number and half the even number and then return the list of value.
Ans : 1> i will write the logic and if else condition inside the map() method.
List<Integer> output = intArray.stream().map(x-> {
if(x%2 ==0){
x=x/2;
}else{
x= x*x;
}
}).collect(Collectors.toList());
Is their any better way to do this specially using Filter?

Try using map:
map(x -> x % 2 == 0? x / 2: x * x);
Let me know if this works for you.
You can learn more about map and filter here

As you are transforming data (performing a math operation), you cannot use Filter here. Filter is used to filter out elements in your stream. For example if you only want to preserve the even numbers, you could use a Filter
What you need to use is the Map, as you already did. Do note that a map should always return data. Your code is missing this return statement.
To make it more readable, you could split your mapping logic in a method. This makes your stream easy to read and easy to follow (when you give the method a good name ofcourse).
Code example
List<Integer> output = intArray.stream()
.map(test::divideOrPow)
.collect(Collectors.toList());
private int divideOrPow(intx) {
if (x % 2 == 0) {
return x / 2;
} else {
return x * x;
}
}

What is the most efficient way to replace a list of words without touching html attributes?

I absolutely disagree that this question is a duplicate! I am asking for an efficiency way to replace hundreds of words at once. This is an algorithm question! All the provided links are about to replace one word. Should I repeat that expensive operation hundreds of times? I'm sure that there are better ways as a suffix tree where I sort out html while building that tree. I removed that regex tag since for no good reason you are focusing on that part.
I want to translate a given set of words (more then 100) and translate them. My first idea was to use a simple regular expression that works better then expected. As sample:
const input = "I like strawberry cheese cake with apple juice"
const en2de = {
apple: "Apfel",
strawberry: "Erdbeere",
banana: "Banane",
/* ... */}
input.replace(new RegExp(Object.keys(en2de).join("|"), "gi"), match => en2de[match.toLowerCase()])
This works fine on the first view. However it become strange if you words which contains each other like "pineapple" that would return "pineApfel" which is totally nonsense. So I was thinking about checking word boundaries and similar things. While playing around I created this test case:
Apple is a company
That created the output:
Apfel is a company.
The translation is wrong, which is somehow tolerable, but the link is broken. That must not happen.
So I was thinking about extend the regex to check if there is a quote before. I know well that html parsing with regex is a bad idea, but I thought that this should work anyway. In the end I gave up and was looking for solutions of other devs and found on Stack Overflow a couple of questions, all without answers, so it seems to be a hard problem (or my search skills are bad).
So I went two steps back and was thinking to implement that myself with a parser or something like that. However since I have multiple inputs and I need to ignore the case I was thinking what the best way is.
Right now I think to build a dictionary with pointers to the position of the words. I would store the dict in lower case so that should be fast, I could also skip all words with the wrong prefix etc to get my matches. In the end I would replace the words from the end to the beginning to avoid breaking the indices. But is that way efficiency? Is there a better way to achieve that?
While my sample is in JavaScript the solution must not be in JS as long the solution doesn't include dozens of dependencies which cannot be translated easy to JS.
TL;DR:
I want to replace multiple words by other words in a case insensitive way without breaking html.

You may try a treeWalker and replace the text inplace.
To find words you may tokenize your text, lower case your words and map them.
const mapText = (dic, s) => {
return s.replace(/[a-zA-Z-_]+/g, w => {
return dic[w.toLowerCase()] || w
})
}
const dic = {
'grodzi': 'grodzila',
'laaaa': 'forever',
}
const treeWalker = document.createTreeWalker(
document.body,
NodeFilter.SHOW_TEXT
)
// skip body node
let currentNode = treeWalker.nextNode()
while(currentNode) {
const newS = mapText(dic, currentNode.data)
currentNode.data = newS
currentNode = treeWalker.nextNode()
}
p {background:#eeeeee;}
<p>
grodzi
LAAAA
</p>
The link stay untouched.
However mapping each word in an other language is bound to fail (be it missing representation of some word, humour/irony, or simply grammar construct). For this matter (which is a hard problem on its own) you may rely on some tools to translate data for you (neural networks, api(s), ...)

Here is my current work in progress solution of a suffix tree (or at least how I interpreted it). I'm building a dictionary with all words, which are not inside of a tag, with their position. After sorting the dict I replace them all. This works for me without handling html at all.
function suffixTree(input) {
const dict = new Map()
let start = 0
let insideTag = false
// define word borders
const borders = ' .,<"\'(){}\r\n\t'.split('')
// build dictionary
for (let end = 0; end <= input.length; end++) {
const c = input[end]
if (c === '<') insideTag = true
if (c === '>') {
insideTag = false
start = end + 1
continue
}
if (insideTag && c !== '<') continue
if (borders.indexOf(c) >= 0) {
if(start !== end) {
const word = input.substring(start, end).toLowerCase()
const entry = dict.get(word) || []
// save the word and its position in an array so when the word is
// used multiple times that we can use this list
entry.push([start, end])
dict.set(word, entry)
}
start = end + 1
}
}
// last word handling
const word = input.substring(start).toLowerCase()
const entry = dict.get(word) || []
entry.push([start, input.length])
dict.set(word, entry)
// create a list of replace operations, we would break the
// indices if we do that directly
const words = Object.keys(en2de)
const replacements = []
words.forEach(word => {
(dict.get(word) || []).forEach(match => {
// [0] is start, [1] is end, [2] is the replacement
replacements.push([match[0], match[1], en2de[word]])
})
})
// converting the input to a char array and replacing the found ranges.
// beginning from the end and replace the ranges with the replacement
let output = [...input]
replacements.sort((a, b) => b[0] - a[0])
replacements.forEach(match => {
output.splice(match[0], match[1] - match[0], match[2])
})
return output.join('')
}
Feel free to leave a comment how this can be improved.

Using the Haxe While Loop to Remove All of a Value from an Array

I'm wanting to remove all of a possibly duplicated value in an array. At the moment I'm using the remove(x:T):Bool function in a while loop, but I'm wondering about the expression part.
I've started by using:
function removeAll(array:Array<String>, element:String):Void
while (array.remove(element)) {}
but I'm wondering if any of these lines would be more efficient:
while (array.remove(element)) continue;
while (array.remove(element)) true;
while (array.remove(element)) 0;
or if it makes any kind of difference.
I'm guessing that using continue is less efficient because it actually has to do something, true and 0 are slightly more efficient, but still do something, and {} would probably be most efficient.
Does anyone have any background information on this?

While other suggested filter, it will create a new instance of list/array which may cause your other code to lose reference.
If you loop array.remove, it is going to loop through all the elements in the front of the array every time, which is not so performant.
IMO a better approach is to use a reverse while loop:
var i = array.length;
while(--i >= 0)
if(array[i] == element) array.splice(i, 1);

It doesn't make any difference. In fact, there's not even any difference in the generated code for the {}, 0 and false cases: they all end up generating {}, at least on the JS target.
However, you could run into issues if you have a large array with many duplicates: in that case, remove() would be called many times, and it has to iterate over the array each time (until it finds a match, that is). In that case, it's probably more efficient to use filter():
function removeAll(array:Array<String>, element:String):Array<String>
return array.filter(function(e) return e != element);
Personally, I also find this to be a bit more elegant than your while-loop with an empty body. But again, it depends on the use case: this does create a new array, and thus causes an allocation. Usually, that's not worth worrying about, but if you for instance do it in the update loop of a game, you might want to avoid it.

In terms of the expression part of the while loop, it seems that it's just set to empty brases ({}) when compiled so it doesn't really matter what you do.
In terms of performance, a much better solution is the Method 2 from the following:
class Test
{
static function main()
{
var thing:Array<String> = new Array<String>();
for (index in 0...1000)
{
thing.push("0");
thing.push("1");
}
var copy1 = thing.copy();
var copy2 = thing.copy();
trace("epoch");
while (copy1.remove("0")) {}
trace("check");
// Method 2.
copy2 = [
for (item in Lambda.filter(copy2, function(v)
{return v != "0";}))
item
];
trace("check");
}
}
which can be seen [here](https://try.haxe.org/#D0468"Try Haxe example."). For 200,000 one-character elements in an Array<String>, Method 2 takes 0.017s while Method 1 takes 44.544s.

For large arrays it will be faster to use a temporary array and then assign that back after populating ( method3 in try )?
OR
If you don't want to use a temp you can assign back and splice ( method4 in try )?
https://try.haxe.org/#5f80c
Both are more verbose codewise as I setup vars, but on mac seems faster at runtime, summary of my method3 approach:
while( i < l ) { if( ( s = copy[ i++ ] ) != '0' ) arr[ j++ ] = s;
copy = arr;
am I missing something obvious against these approaches?

Concise notation for assigning `unique_ptr`?

I have a pointer to a parent class and I want to assign a new child object to that pointer conditionally. Right now, the syntax I have is rather lengthly:
std::unique_ptr<ParentClass> parentPtr;
if (...) {
parentPtr = std::unique_ptr<ParentClass>(new ChildClass1());
} else {
parentPtr = std::unique_ptr<ParentClass>(new ChildClass2());
}
Is there a good way of making this more readable / less lengthly?

Two possibilities would be:
std::unique_ptr<ParentClass> parentPtr(condition ?
(ParentClass*)new ChildClass1() :
(ParentClass*)new ChildClass2());
If condition is complicated, just assign a boolean to it and then write the construction. This solution only works for a binary condition though.
Another is to embrace C++14, and use
parentPtr = std::make_unique<ChildClass>();

First off, the "obvious" solution C ? new X : new Y does not work, since even if X and Y have a common base class A, the types X * and Y * have no common type. This is actually not so surprising after all if you consider that a class can have many bases (direct or indirect) and a given type may appear as a base multiple times.
You could make the conditional operator work by inserting a cast:
A * = C ? static_cast<A *>(new X) : static_cast<A *>(new Y);
But this would quickly get long and tedious to read when you try to apply this to your real situation.
However, as for std::unique_ptr, it offers the reset function which can be used to good effect here:
std::unique_ptr<A> p;
if (C)
{
p.reset(new X);
}
else
{
p.reset(new Y);
}
Now even if the actual new expressions are long, this is still nicely readable.

Storing pointers wrong/not using Unordered_map.find correctly

so the title essentially says it all. I am writing a symbol table in c++ for a compiler project I am working on, and all is going well except for looking up identifiers in the table.
So this is how I store into the table (pseudo like):
vector<symbolTable*>* symbolStack = new symbolTable();
//where a symbolStack is a vector of unordered_maps (symbolTables),
//each iteration in vector referencing a new block of code.
string* check = new string(root->children[0]->lexicode->c_str());
symbol* sym = new symbol();
...... //setting sym info
symbol_entry pair = make_pair(check, test)
//the unordered_map has keys of (string*, symbol*)
symbolStack[tableNumber]->insert(pair);
I am pretty solid that this works, as I have tested printing the size/infos from the map and it all seems to be storing as expect. Here is where the problem is happening for me (this takes place in a different function later):
for(int i = 0; i =< tableNumber;i++){
auto finder = symbolStack[i]->find(checkS) //checkS == check from above
if(finder == symbolStack[i]->end()) cout<<not found;
else cout<<we did it!!!!
My else is never reached. However, if I do this assuming the string*->c_str() == "test":
cout<<string->c_str(); // prints out "test"
cout<<finder->second->c_str() //prints out "test".
So the question. Why is it finding the key, and knowing it found the key, but at the same time returning that is has reached the end of the symbol stack without finding it? I have been trying to figure this out for a good 4 days solid now. Is it that my pointers are somehow off? Any insight is appreciated greatly.

So somewhat answer to my own question.
First I will say this: I have concluded the comparison with find() or similar methods do not work because for some reason the pointers are not matching up. I have no clue why this is still, or what I am doing wrong.
What I did to solve my issue and complete my code is this:
for(int k = 0; k<= tableNumber; k++){
unordered_map<string*,symbol*>::iterator it;
for(it = symbolStack[k]->begin(); it != symbolStack[k]->end(); it++)
{
string a = targetString->c_str();
string b = it->first->c_str();
if(a.compare(b) == 0) cout<<"You have found the match! \n";
}
}
}
So this answers how to get it working pragmatically if somebody else is in a similar ship, however not really answers why my other attempt failed other than noticing the pointer values were different.

In symbolTable you store pointers to strings as keys, not strings themselves. Therefore unordered_map compares pointers, not strings, and cannot find matching items. When you reconstruct the key string (as in your answer, using string b = it->first->c_str()), the comparison on strings works again. So, either you need to store string instead of string * in symbolTable, or you need to provide your own comparison function that will compare keys of type string *.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

C++ substring check - c++11

I'm given two strings s1, s2 and an integer i. I want to check if the substring of s1 starting at index i of the same length as s2 equals s2. I know this is trivial to implement with a little loop, but I would prefer to use a library method if it exists.

bool is_sub_string=i <= s1.size() && s1.substr(i, s2.size()) == s2;

Related

how to get the if else condition using filter operation of java stream?

What is the most efficient way to replace a list of words without touching html attributes?

Using the Haxe While Loop to Remove All of a Value from an Array

Concise notation for assigning `unique_ptr`?

Storing pointers wrong/not using Unordered_map.find correctly

Categories

Resources