At my workplace, we changed our string type (which holds internationalized characters) from std::wstring to std::u16string after upgrading the compiler to VS 2015 (Update 3).
As a result, we are seeing a large number of performance regressions.
Profiler analysis reveals that the std::char_traits<char16_t> operations used by std::u16string, such as copy, compare, find and assign, are hit the most and take longer than their std::char_traits<wchar_t> counterparts used by std::wstring.
The std::char_traits<wchar_t> operations are written in terms of std::wmem*, whereas the std::char_traits<char16_t> operations are written as plain for loops.
If we replace these traits operations for char16_t (i.e. for std::u16string) with our own customized traits, performance improves and becomes comparable to std::wstring.
We are planning to write our own custom traits (until MS fixes this in the next version of VS) as follows:
struct string_custom_traits : public std::char_traits<char16_t>
{
    // Hides std::char_traits<char16_t>::copy. Like the standard version, it returns dest.
    static char16_t* copy(char16_t* dest, const char16_t* src, size_t count)
    {
        return count == 0 ? dest : static_cast<char16_t*>(std::memcpy(dest, src, count * sizeof(char16_t)));
    }
};
Would that be OK? Are there any problems with this approach?
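For illustration, here is a minimal sketch of how such a traits class could be plugged into std::basic_string. The names fast_u16_traits, fast_u16string, to_fast and traits_example are hypothetical and not from the original post. Only copy and move are replaced; compare is deliberately left as inherited, because a byte-wise std::memcmp would not order char16_t code units correctly on little-endian machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical traits class: only the bulk-memory operations are replaced,
// everything else is inherited from std::char_traits<char16_t>.
struct fast_u16_traits : std::char_traits<char16_t> {
    static char16_t* copy(char16_t* dest, const char16_t* src, std::size_t count) {
        if (count != 0) std::memcpy(dest, src, count * sizeof(char16_t));
        return dest;  // the standard traits return dest, not src
    }
    static char16_t* move(char16_t* dest, const char16_t* src, std::size_t count) {
        if (count != 0) std::memmove(dest, src, count * sizeof(char16_t));
        return dest;
    }
};

// A string type with custom traits is a distinct type from std::u16string.
using fast_u16string = std::basic_string<char16_t, fast_u16_traits>;

// Hypothetical helper: an explicit conversion is needed at API boundaries.
fast_u16string to_fast(const std::u16string& s) {
    return fast_u16string(s.data(), s.size());
}

void traits_example() {
    fast_u16string s = to_fast(u"hello");
    s += to_fast(u" world");
    assert(s.size() == 11);
    assert(s == to_fast(u"hello world"));
}
```

One consequence worth weighing: because fast_u16string is not the same type as std::u16string, any interface that takes std::u16string needs a conversion such as to_fast, which copies the data.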
Given the following code, how can I convert the v8::Local<v8::Value> into a uint32_t, or into other types based on the Is* methods?
v8::Local<v8::Value> value;
v8::Local<v8::Context> context = v8::Context::New(v8::Isolate::GetCurrent());
if(value->IsUint32()) {
v8::MaybeLocal<Int32> maybeLocal = value->Uint32Value(context);
uint32_t i = maybeLocal;
}
Your posted code doesn't work because value->Uint32Value(context) doesn't return a v8::MaybeLocal<Int32>. C++ types are your friend (just like TypeScript)!
You have two possibilities:
(1) You can use Value::Uint32Value(...) which returns a Maybe<uint32_t>. Since you already checked that value->IsUint32(), this conversion cannot fail, so you can extract the uint32_t wrapped in the Maybe using Maybe::ToChecked().
(2) You can use Value::ToUint32(...) which returns a MaybeLocal<Uint32>. Again, since you already checked that value->IsUint32(), that cannot fail, so you can get a Local<Uint32> via MaybeLocal::ToLocalChecked(), and then simply use -> syntax to call the wrapped Uint32's Value() method, which gives a uint32_t.
If you're only interested in the final uint32_t (and not in the intermediate Local<Uint32>, which you could pass back to JavaScript), then option (1) will be slightly more efficient.
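The two options could look roughly like this. This is only a sketch: it assumes the V8 headers are available, that an isolate and context are already entered, and that the caller has checked value->IsUint32() first. The function names viaMaybe and viaLocal are made up for illustration.

```cpp
#include <cstdint>
#include <v8.h>

// Option (1): go straight to the primitive via Maybe<uint32_t>.
// Assumes the caller already checked value->IsUint32().
uint32_t viaMaybe(v8::Local<v8::Value> value, v8::Local<v8::Context> context) {
    return value->Uint32Value(context).ToChecked();
}

// Option (2): obtain a Local<Uint32> first, then read its Value().
// Assumes the caller already checked value->IsUint32().
uint32_t viaLocal(v8::Local<v8::Value> value, v8::Local<v8::Context> context) {
    v8::Local<v8::Uint32> wrapped = value->ToUint32(context).ToLocalChecked();
    return wrapped->Value();
}
```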
Note that IsUint32() will return false for objects like {valueOf: () => 42}. If you want to handle such objects, then attempt the conversion and handle failures, e.g.:
Maybe<uint32_t> maybe_uint = value->Uint32Value(context);
if (maybe_uint.IsJust()) {
    uint32_t i = maybe_uint.FromJust();
} else {
    // Conversion failed. Maybe it threw an exception (use a `v8::TryCatch` to catch it),
    // or maybe the object wasn't convertible to a uint32.
    // Handle that somehow.
}
Also, note that most of these concepts are illustrated in V8's samples and API tests. Reading comments and implementations in the API headers themselves also provides a lot of insight.
Final note: you'll probably want to track the current context you're using, rather than creating a fresh context every time you need one.
I'm experiencing some strange behaviour with variadic templates that I have never seen before.
To keep it simple, I'll give a simplified example of what I was trying to achieve; the real code is a bit more involved.
So I have one function that looks like:
template <typename T, typename F, typename ... Args>
static T Func( std::string str, Args&& ... args )
{
    // ... do something
}
I then call this function a number of times from different locations, where most times the passed types are different. But when I debug this, when I look for the symbol of Func, I get that the same function address has a number of different symbols:
00000000`70ba4ae0 Func<unsigned long,unsigned long (__stdcall*)(wchar_t const *),wchar_t const *>
00000000`70ba4ae0 Func<void *,void * (__stdcall*)(unsigned int),unsigned int>
00000000`70ba4ae0 Func<int,int (__stdcall*)(void *),void *>
So they are all basically the same function. When I try to call for example:
call Func<void *,void * (__stdcall*)(unsigned int),unsigned int>
In the debugger I see:
call Func<int,int (__stdcall*)(void *),void *>
I can see that a symbol is generated for every instance of the template function, but all instances that take the same number of arguments with the same byte sizes are linked to a single function.
I can understand why this may happen, but is there a way to force each function to be standalone?
The issue was indeed COMDAT folding. Turning off the /OPT:ICF and /OPT:REF linker options did the trick. It is worth noting, though, that compiling with VS 2017 with these options enabled produced more "sensible" results than compiling the same code with VS 2013.
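To see the effect in isolation, a small probe along these lines can help. It is a sketch only: Func here is a simplified, single-argument stand-in for the function in the question, and twiceSigned, twiceUnsigned, callSigned, callUnsigned and instancesShareAddress are hypothetical names. Whether the two instantiations end up at the same address depends on the toolchain and linker options (with MSVC, /OPT:NOICF asks the linker not to fold identical COMDATs), so the probe reports the result rather than assuming it.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the template in the question (one argument, callable passed in).
template <typename T, typename F, typename Arg>
T Func(F f, Arg arg) {
    return f(arg);
}

inline std::int32_t twiceSigned(std::int32_t x) { return x * 2; }
inline std::uint32_t twiceUnsigned(std::uint32_t x) { return x * 2u; }

using SignedFn = std::int32_t (*)(std::int32_t);
using UnsignedFn = std::uint32_t (*)(std::uint32_t);

std::int32_t callSigned(std::int32_t x) {
    return Func<std::int32_t, SignedFn, std::int32_t>(&twiceSigned, x);
}

std::uint32_t callUnsigned(std::uint32_t x) {
    return Func<std::uint32_t, UnsignedFn, std::uint32_t>(&twiceUnsigned, x);
}

// Reports whether the linker merged the two instantiations into one body.
bool instancesShareAddress() {
    using Generic = void (*)();
    auto a = &Func<std::int32_t, SignedFn, std::int32_t>;
    auto b = &Func<std::uint32_t, UnsignedFn, std::uint32_t>;
    return reinterpret_cast<Generic>(a) == reinterpret_cast<Generic>(b);
}

void folding_example() {
    // Behaviour is the same whether or not the bodies were folded.
    assert(callSigned(21) == 42);
    assert(callUnsigned(21u) == 42u);
}
```

Folding does not change what the calls compute; it only affects symbol names in the debugger and comparisons of function addresses.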
I have a project that makes extensive use (high frequency) of a limited set of key linear algebra operations such as matrix multiplication, matrix inverse, addition, etc. These operations are implemented by a handful of linear algebra libraries that I would like to benchmark without having to recompile the business logic code to accommodate the different mannerisms of these various libraries.
I'm interested in figuring out what is the smartest way of accommodating a wrapper class as an abstraction across all of these libraries in order to standardize these operations against the rest of my code. My current approach relies on the Curiously Recurring Template Pattern and the fact that C++11 gcc is smart enough to inline virtual functions under the right circumstances.
This is the wrapper interface that will be available to the business logic:
template <class T>
class ITensor {
    virtual void initZeros(uint32_t dim1, uint32_t dim2) = 0;
    virtual void initOnes(uint32_t dim1, uint32_t dim2) = 0;
    virtual void initRand(uint32_t dim1, uint32_t dim2) = 0;
    virtual T mult(T& t) = 0;
    virtual T add(T& t) = 0;
};
And here is an implementation of that interface using e.g. Armadillo
template <typename precision>
class Tensor : public ITensor<Tensor<precision> >
{
public:
    Tensor(){}
    Tensor(arma::Mat<precision> mat) : M(mat) { }
    ~Tensor(){}
    inline void initOnes(uint32_t dim1, uint32_t dim2) override final
    { M = arma::ones<arma::Mat<precision> >(dim1,dim2); }
    inline void initZeros(uint32_t dim1, uint32_t dim2) override final
    { M = arma::zeros<arma::Mat<precision> >(dim1,dim2);}
    inline void initRand(uint32_t dim1, uint32_t dim2) override final
    { M = arma::randu<arma::Mat<precision> >(dim1,dim2);}
    inline Tensor<precision> mult(Tensor<precision>& t1) override final
    {
        Tensor<precision> t(M * t1.M);
        return t;
    }
    inline Tensor<precision> add(Tensor<precision>& t1) override final
    {
        Tensor<precision> t( M + t1.M);
        return t;
    }
    arma::Mat<precision> M;
};
Questions:
Does it make sense to use CRTP and inlining in this scenario?
Can this be improved with respect to optimizing performance?
As pointed out in an answer, the use of polymorphism here is a bit odd due to the templating of the base class. Here is why I think this still makes sense:
You will notice the derived class is named "Tensor" rather than something more specific like "ArmadilloTensor" (after all, it implements the ITensor methods using Armadillo methods). I kept the name as is because, in my current design, the use of polymorphism is more a matter of formalism than anything else.
The plan is for the project code to be aware of a class called Tensor that offers the functionality specified in ITensor. For each new library that I want to benchmark, I would just write a new "Tensor" class in a new compilation unit, package the compilation results into an .a archive, and, when doing a benchmarking test, link the business logic code against that library. Switching between implementations then becomes a matter of choosing which Tensor implementation to link against. To the base code it is all the same whether the Tensor methods are implemented by Armadillo or something else.
Advantages: this avoids having code that knows about every library (they are all independent), and no compile-time changes are required in the base code in order to use a new implementation.
So, why the polymorphism? In my mind I just wanted to somehow formalize the functions that need to be implemented by any new library that is added to the benchmark. In reality, the base code would work with ITensors in the function parameters, but then potentially static_cast them down to Tensors in the method bodies themselves.
It's possible I'm missing something here, or you haven't shown enough details.
You use polymorphism. As its name suggests, it's about the same type taking different shapes (different behaviour). So you have an interface that is accepted by user code, and you can provide different implementations of that interface.
But in your case you don't have different implementations of a single interface. Your ITensor template generates different classes and each final implementation of your Tensor derives from a distinct base.
Consider your user code is something like this:
template<typename T>
void useTensor(ITensor<T>& tensor);
and you can provide your Tensor implementation. It's almost the same as
template<typename T>
void useTensor(T& tensor);
just without CRTP and virtual calls. Now each wrapper has to implement some set of functionality. The problem is that this set of functionality is not explicitly defined. The compiler is a great help here, but it's not ideal. That's why we all look forward to getting Concepts in the next standard.
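As a rough sketch of that template-only approach, the required set of functionality can be spelled out with a hand-written trait and a static_assert until Concepts are available. Everything named here is hypothetical: ToyTensor is a fixed 2x2 stand-in for a library-backed wrapper (it does not use Armadillo), and is_tensor_like, squarePlusSelf and tensor_example are illustrative names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: no base class, no virtual functions.
struct ToyTensor {
    std::array<double, 4> m{};  // 2x2 matrix, row-major

    void initOnes() { m.fill(1.0); }

    ToyTensor mult(const ToyTensor& rhs) const {
        ToyTensor out;
        for (std::size_t r = 0; r < 2; ++r)
            for (std::size_t c = 0; c < 2; ++c)
                out.m[r * 2 + c] = m[r * 2] * rhs.m[c] + m[r * 2 + 1] * rhs.m[2 + c];
        return out;
    }

    ToyTensor add(const ToyTensor& rhs) const {
        ToyTensor out;
        for (std::size_t i = 0; i < 4; ++i) out.m[i] = m[i] + rhs.m[i];
        return out;
    }
};

// C++11-compatible equivalent of std::void_t.
template <typename...> struct make_void { using type = void; };
template <typename... Ts> using void_t_ = typename make_void<Ts...>::type;

// Trait that states the required operations explicitly.
template <typename T, typename = void>
struct is_tensor_like : std::false_type {};

template <typename T>
struct is_tensor_like<T, void_t_<
    decltype(std::declval<const T&>().mult(std::declval<const T&>())),
    decltype(std::declval<const T&>().add(std::declval<const T&>()))>>
    : std::true_type {};

static_assert(is_tensor_like<ToyTensor>::value, "ToyTensor satisfies the interface");
static_assert(!is_tensor_like<int>::value, "int does not");

// Business logic written against "anything tensor-like": computes t*t + t.
template <typename T>
T squarePlusSelf(const T& t) {
    static_assert(is_tensor_like<T>::value, "T must provide mult() and add()");
    return t.mult(t).add(t);
}

void tensor_example() {
    ToyTensor t;
    t.initOnes();
    ToyTensor r = squarePlusSelf(t);  // ones*ones = all 2s, plus ones = all 3s
    assert(r.m[0] == 3.0 && r.m[3] == 3.0);
}
```

All calls are resolved at compile time, so there is nothing for the optimizer to devirtualize; the trade-off is that the business logic has to be a template or be compiled per implementation.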
I have an Entity-Component System.
Components : classes with data, but no complex functions
Entity : integer + list of Components (<=1 instance per type per entity)
Systems : lots of functions with minimal data; they do the complex game logic
Here is a sample. A bullet entity pewpew has exploded, so it is set to be invisible (at #2):-
class Component_Projectile : public ComponentBase{
    public: GraphicObject* graphicObject=nullptr; //#1
    public: Entity whoCreateMe=nullptr;
    public: Entity targetEnemy=nullptr;
    //.... other fields ....
};
class System_Projectile : public SystemBase {
    public: void explodeDamage(Entity pewpew){ //called every time-step
        Pointer<Component_Projectile> comPro=pewpew;
        comPro->graphicObject->setVisible(false); //#2
        //.... generate some cool particles, damage surrounded object, etc
    }
    //.... other functions ....
};
It works OK.
New Version
Half a year later, I realized that my architecture looks inconsistent.
I cache all game-logic objects using Entity.
But I cache Physic Objects and Graphic Objects by direct pointer (e.g. #1).
I have a crazy idea :
Physic Objects and Graphic Objects should also be game-logic objects!
They should be encapsulated into Entity.
Now the code will be (the changes are marked with #1 and #2):-
class Component_Projectile : public ComponentBase{
    public: Entity graphicObject=nullptr; //#1
    public: Entity whoCreateMe=nullptr;
    public: Entity targetEnemy=nullptr;
    //.... other fields ....
};
class System_Projectile : public SystemBase {
    public: void explodeDamage(Entity pewpew){ //called every time-step
        Pointer<Component_Projectile> comPro=pewpew;
        system_graphic->setVisible(comPro->graphicObject,false); //#2
        //.... generate some cool particles, damage surrounded object, etc
    }
    //.... other functions ....
};
After playing with it for a week, I can summarize the pros/cons of this approach as below :-
Advantage
1. Drastically reduces coupling between game logic and the graphic-engine/physic-engine
All graphics-specific functions are now encapsulated inside 1-3 systems.
No other game system (e.g. System_Projectile) refers to a hard-coded type such as GraphicObject.
2. Dramatically increases flexibility in design
The old vanilla graphic object is not just a graphic object anymore!
I can change it to something else, especially to add special features that are totally insane / too specific for the physic/graphic engine, e.g.
rainbow blinking graphic
strange gravity, magnet
swap in/out many physic-object types in the same Entity (not sure)
3. Reduces compile time
It accidentally becomes a pimpl idiom.
For example, System_Projectile doesn't have to #include "GraphicObject.h" any more.
Disadvantage
1. I have to wrap many of the graphic/physic objects' functions.
For example,
system_graphic->setVisible(comPro->graphicObject,false);
is implemented as
public: void setVisible(Entity entity,bool visible){
    entity.getComponent<Graphic_Component>()->underlying->setVisible(visible);
}
It is tedious, but not very hard work.
It can be partially alleviated by <...>.
2. Performance is (only) a little bit worse.
It needs a few additional indirections.
3. Code is less readable.
The new version is harder to read.
comPro->graphicObject->setVisible(false); //old version
system_graphic->setVisible(comPro->graphicObject,false); //new version
4. Losing the type makes Ctrl+space less usable
In the old version, I can easily ctrl+space in this code :-
comPro->graphicObject-> (ctrl+space)
It is now harder. I have to think about which system I want to call.
system_(ctrl+space)graphic->(ctrl+space)setVisible
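The forwarding described under disadvantage 1 can be made concrete with a stripped-down sketch. This is not the real engine code: Entity is reduced to a plain integer id, component storage is a simple map owned by the system, and GraphicObject, create and isVisible are hypothetical stand-ins added so the example is self-contained.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Hypothetical engine-side object that game logic should no longer see directly.
struct GraphicObject {
    bool visible = true;
    void setVisible(bool v) { visible = v; }
};

using Entity = int;  // assumption: an entity is just an integer id here

struct Graphic_Component {
    std::unique_ptr<GraphicObject> underlying;
};

// The only place that knows about GraphicObject.
class System_Graphic {
public:
    Entity create() {
        Entity e = next_++;
        components_[e].underlying.reset(new GraphicObject());
        return e;
    }

    // Forwarding wrapper: Entity -> component -> underlying object.
    void setVisible(Entity entity, bool visible) {
        auto it = components_.find(entity);
        if (it != components_.end()) it->second.underlying->setVisible(visible);
    }

    bool isVisible(Entity entity) const {
        return components_.at(entity).underlying->visible;
    }

private:
    Entity next_ = 0;
    std::unordered_map<Entity, Graphic_Component> components_;
};

void ecs_example() {
    System_Graphic system_graphic;
    Entity graphicObject = system_graphic.create();
    assert(system_graphic.isVisible(graphicObject));
    system_graphic.setVisible(graphicObject, false);  // the "new version" call style
    assert(!system_graphic.isVisible(graphicObject));
}
```

The extra lookup and pointer hop inside setVisible are the "few additional indirections" mentioned under disadvantage 2.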
Question
In most code locations, the advantages outweigh the disadvantages, so I decided to use the new version.
How can I alleviate the disadvantages, especially numbers 3 and 4?
Design pattern? C++ magic?
I may be using the Entity-Component System in a wrong way. (?)
I would like to store some closures in an array.
I tagged the question MSVC10 because, according to C++11, closures should be convertible to function pointers (at least under some conditions), but MSVC10 does not support that.
Is there a way around this limitation?
example:
typedef double (*Func)(const C* c);
struct Feature{
    Feature(FeatureId i_id = None, const QString& i_name=QString(), Func i_ex = nullptr)
        :id(i_id),name(i_name), extraction(i_ex)
    {}
    FeatureId id;
    QString name;
    Func extraction;
};
QList<Feature> features;
features.append(Feature(feat_t, "a/t", [](const C* c) -> double{return c->a.t;} ));
I want to be able to assign closures to the function pointer because I do not want to define dozens of separate functions.
Thanks in advance for your suggestions.
You should use std::function<double(const C*)> instead of Func, so
struct Feature{
    FeatureId id;
    QString name;
    std::function<double(const C*)> extraction;
    /// etc...
};
You may need to upgrade your compiler (I guess that Visual Studio 2010 appeared before the C++11 standard, but I have never used Windows or other Microsoft products). Did you consider using a recent GCC (4.9 at least) or a recent Clang/LLVM (3.5)?
If you cannot upgrade your compiler, stick to C++98 and don't use C++11 features.
By definition, a closure is heavier than a function pointer, since it contains closed-over values (some of which might be hidden or non-obvious).
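A self-contained sketch of the std::function approach, using standard containers instead of Qt so it compiles on its own. The types A and C, the trimmed-down Feature, and the functions makeFeatures and closure_example are hypothetical stand-ins for the code in the question. It also shows the C++11 rule the question refers to: a lambda without captures converts to a plain function pointer, while a capturing one needs something like std::function to hold its state.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types in the question.
struct A { double t; };
struct C { A a; };

typedef double (*Func)(const C* c);

struct Feature {
    std::string name;
    std::function<double(const C*)> extraction;  // can hold any matching callable
};

std::vector<Feature> makeFeatures(double scale) {
    std::vector<Feature> features;
    // Captureless lambda: would also fit a plain function pointer.
    features.push_back(Feature{"a/t", [](const C* c) -> double { return c->a.t; }});
    // Capturing lambda: carries `scale` with it, so it cannot become a Func.
    features.push_back(Feature{"a/t scaled",
                               [scale](const C* c) -> double { return c->a.t * scale; }});
    return features;
}

void closure_example() {
    C c;
    c.a.t = 2.5;

    // In a conforming C++11 compiler, a captureless lambda converts to Func.
    Func plain = [](const C* p) -> double { return p->a.t; };
    assert(plain(&c) == 2.5);

    std::vector<Feature> features = makeFeatures(2.0);
    assert(features[0].extraction(&c) == 2.5);
    assert(features[1].extraction(&c) == 5.0);
}
```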