XercesDOMParser* and DOMDocument* going out of scope before a DOMElement* - xerces-c

Short Version: Is it safe for a XercesDOMParser* and DOMDocument* to go out of scope before a DOMElement* that they were used to create does?
Long version:
In the code snippet below I create a local XercesDOMParser* and DOMDocument* in order to get the root element of the document and store it in a member DOMElement* variable. The XercesDOMParser* and DOMDocument* both go out of scope at the end of the constructor, but the DOMElement* lives on as a member variable. Is this ok? So far it seems to work but I am nervous that I may have problems later.
JNIRequest::JNIRequest(JNIEnv *env, jobject obj, jstring input)
{
char *szInputXML = (char*) env->GetStringUTFChars(input, NULL);
XMLPlatformUtils::Initialize();
XercesDOMParser* pParser = new XercesDOMParser();
XMLByte* xmlByteInput = (XMLByte*) szInputXML;
xercesc::MemBufInputSource source(xmlByteInput, strlen(szInputXML), "BufferID");
pParser->parse(source);
DOMDocument* pDocument = pParser->getDocument();
/* This next variable is a DOMElement* */
this->_pRootElement = pDocument->getDocumentElement();
}

Your code snippet leaks memory: the parser allocated with new is never deleted. I'm afraid that leak is also the reason the code seems to "work" for the moment.
In general the Xerces parser owns the document tree. For your code this means that if you correctly released the parser, the document would be deleted with it, leaving your pointer to the DOMElement invalid. Take a look at AbstractDOMParser::adoptDocument(), which transfers ownership of the document away from the parser.
The solution is to call adoptDocument() and then save the pointer to the document element. Note that you then have to release the node tree yourself (on closing the application?) and that the tree can consume a lot of memory, depending on the size of the XML.
Hope this helps
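The ownership rule described above can be illustrated without Xerces at all. The sketch below uses hypothetical ToyParser, ToyDocument, and ToyElement types (stand-ins for illustration, not the Xerces-C API) to show an adoptDocument()-style transfer: the object that stores the element pointer also takes over the document, so the pointer stays valid after the local parser is gone.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; not the Xerces-C API.
struct ToyElement {
    std::string name;
};

struct ToyDocument {
    ToyElement root;  // the element lives inside the document
    ToyElement* getDocumentElement() { return &root; }
};

class ToyParser {
public:
    // "Parsing" builds a document that the parser owns.
    void parse(const std::string& rootName) {
        doc_.reset(new ToyDocument());
        doc_->root.name = rootName;
    }
    // Non-owning access: this pointer dies together with the parser.
    ToyDocument* getDocument() { return doc_.get(); }
    // Ownership transfer: afterwards the parser no longer holds the document.
    std::unique_ptr<ToyDocument> adoptDocument() { return std::move(doc_); }
private:
    std::unique_ptr<ToyDocument> doc_;
};

// Keeps the adopted document alive for as long as the root pointer is used.
class Request {
public:
    explicit Request(const std::string& rootName) {
        ToyParser parser;               // local, destroyed at end of constructor
        parser.parse(rootName);
        doc_ = parser.adoptDocument();  // take ownership before the parser dies
        root_ = doc_->getDocumentElement();
    }
    const std::string& rootName() const { return root_->name; }
private:
    std::unique_ptr<ToyDocument> doc_;  // owner of the tree
    ToyElement* root_ = nullptr;        // valid only while doc_ is alive
};
```

Constructing Request("config") and later calling rootName() is safe here because the Request owns the tree. With the real library the adopted document is not held in a std::unique_ptr like this; how to free it is described in the Xerces documentation for adoptDocument().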

Related

When does c++ right value destruct in this scenario?

Here is the code:
class SomeType {
public:
SomeType() {}
~SomeType() {}
std::string xxx;
};
bool funtion_ab() {
SomeType(); // This is a right val;
// The right val destructs here when I test the code. I want to make sure that it would always destructs here.
int a = 0, b = 10;
....// other code
return true;
}
Please tell me if you know the truth. Thank you!
What you have is called a temporary object. From §6.7.7,
Temporary objects are created
when a prvalue is converted to an xvalue
or, more specifically,
[Note 3: Temporary objects are materialized:
...
when a prvalue that has type other than cv void appears as a discarded-value expression ([expr.context]).
— end note]
and, on the lifetime, the same section has this to say
Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created.
You can read more about the expression semantics, but in your case "full-expression" is fairly unambiguous.
SomeType();
The "full-expression" containing your constructor call is... the constructor call itself. So the destructor will be called immediately after evaluating the constructor. There are some exceptions to this rule (such as when the temporary object is thrown as an exception or is bound to a reference), but none of those apply here.
As noted in the comments, compilers are free to inline your constructor and destructor calls and then are free to notice that they do nothing and omit them entirely. Optimizers can do fun stuff with your code, provided it doesn't change the semantics. But a strict reading of the standard states that the destructor is called exactly where you suggested.
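To watch the rule in action, here is a small sketch with a hypothetical Traced type whose constructor and destructor append to a log. Because they have observable side effects, the compiler cannot optimize them away. The second half shows the reference-binding exception mentioned above.

```cpp
#include <string>
#include <vector>

// Records construction and destruction events in order.
std::vector<std::string> g_log;

// Hypothetical type used only to make lifetimes visible.
struct Traced {
    Traced() { g_log.push_back("ctor"); }
    ~Traced() { g_log.push_back("dtor"); }
};

std::vector<std::string> run_demo() {
    g_log.clear();
    Traced();  // temporary: destroyed at the end of this full-expression
    g_log.push_back("next statement");
    {
        const Traced& ref = Traced();  // bound to a reference: lifetime extended
        (void)ref;
        g_log.push_back("inside block");
    }  // the extended temporary is destroyed here
    g_log.push_back("end");
    return g_log;
}
```

run_demo() returns the events in the order ctor, dtor, next statement, ctor, inside block, dtor, end.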

c++ access PEB_LDR_DATA struct member by offset

I am new to c++ and I am trying to access the InLoadOrderModuleList member in PEB_LDR_DATA structure.
I tried this:
// the ldrData data type is PPEB_LDR_DATA
PLIST_ENTRY firstitem_InMemoryOrderModuleList = ((PLIST_ENTRY)(pebLdrData + 0x0010)-> Flink);
without success. How should I access it?
LIST_ENTRY is how Windows implements linked lists internally. There is plenty of information about them online if you need more details, but there are two things you need to know here:
First, the next/back pointers don't point to the head of the object (as they do in most implementations); they point to the LIST_ENTRY member inside it. To get to the head of the object you have to apply a fixup to the pointer based on the offset of the LIST_ENTRY member. This is what the CONTAINING_RECORD macro is for.
Second, you don't want to do this fixup on the first LIST_ENTRY in the PEB_LDR_DATA object. Think of that one as the "head" pointer; you need to move through its Flink before you get to the data that you care about.
Sample code:
LIST_ENTRY *current_record = NULL;
LIST_ENTRY *start = &(pebLdrData->InLoadOrderModuleList);
// move off the initial list entry to the first actual object
current_record = start->Flink;
while (true)
{
// find the head of the object
LDR_DATA_TABLE_ENTRY *module_entry = (LDR_DATA_TABLE_ENTRY*)
CONTAINING_RECORD(current_record, LDR_DATA_TABLE_ENTRY, InLoadOrderLinks);
printf("%wZ\n", &module_entry->BaseDllName);
// advance to the next object
current_record = current_record->Flink;
if (current_record == start)
{
break;
}
}
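The solution is to declare your own typedef structures for LDR_DATA_TABLE_ENTRY and PEB_LDR_DATA with their full layouts, since the definitions in winternl.h hide most members (InLoadOrderModuleList included) behind reserved fields.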
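The pointer fixup itself can be tried on any platform. The following is a portable sketch with hypothetical ListEntry and ModuleRecord types (not the Windows definitions) that mimics what CONTAINING_RECORD does: subtract the offset of the link member from the link's address to recover the enclosing record.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of LIST_ENTRY: links point at other links,
// not at the start of the enclosing record.
struct ListEntry {
    ListEntry* Flink;
    ListEntry* Blink;
};

// Hypothetical record whose link member is NOT at offset zero.
struct ModuleRecord {
    const char* name;
    ListEntry links;
};

// Equivalent of the CONTAINING_RECORD fixup for this one type:
// subtract the member's offset from the link's address.
inline ModuleRecord* record_from_links(ListEntry* entry) {
    char* raw = reinterpret_cast<char*>(entry);
    return reinterpret_cast<ModuleRecord*>(raw - offsetof(ModuleRecord, links));
}

// Appends a record at the tail of a circular list with a sentinel head.
inline void append_record(ListEntry& head, ModuleRecord& rec) {
    rec.links.Flink = &head;
    rec.links.Blink = head.Blink;
    head.Blink->Flink = &rec.links;
    head.Blink = &rec.links;
}

// Walks the list, skipping the head itself, and collects the names.
inline std::vector<std::string> collect_names(ListEntry& head) {
    std::vector<std::string> names;
    for (ListEntry* cur = head.Flink; cur != &head; cur = cur->Flink) {
        names.push_back(record_from_links(cur)->name);
    }
    return names;
}
```

The head is only a sentinel and is never passed to record_from_links(). Writing the walk as a for loop that stops when it is back at the head also handles an empty list.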

Is there a way to make a moved object "invalid"?

I've some code that moves an object into another object. I won't need the original, moved object anymore in the upper level. Thus move is the right choice I think.
However, thinking about safety I wonder if there is a way to invalidate the moved object and thus preventing undefined behaviour if someone accesses it.
Here is a nice example:
// move example
#include <utility> // std::move
#include <vector> // std::vector
#include <string> // std::string
int main () {
std::string foo = "foo-string";
std::string bar = "bar-string";
std::vector<std::string> myvector;
myvector.push_back (foo); // copies
myvector.push_back (std::move(bar)); // moves
return 0;
}
The description says:
The first call to myvector.push_back copies the value of foo into the
vector (foo keeps the value it had before the call). The second call
moves the value of bar into the vector. This transfers its content
into the vector (while bar loses its value, and now is in a valid but
unspecified state).
Is there a way to invalidate bar, such that access to it will cause a compiler error? Something like:
myvector.push_back (std::move(bar)); // moves
invalidate(bar); //something like bar.end() will then result in a compiler error
Edit: And if there is no such thing, why?
Accessing the moved object is not undefined behavior. The moved object is still a valid object, and the program may very well want to continue using said object. For example,
template< typename T >
void swap_by_move(T &a, T &b)
{
using std::move;
T c = move(b);
b = move(a);
a = move(c);
}
The bigger-picture answer is that whether an object is actually moved from is decided at run time, while a compiler error has to be decided at compile time.
foo(bar); // foo might move or not
bar.baz(); // compile time error or not?
That is not going to work. A compile-time analysis can only approximate the answer, so either developers get errors on perfectly valid programs and struggle to write anything useful without tripping them, or they have to add annoying and fragile annotations to every called function to promise that it does not move from its argument.
To put it a different way, you are asking for a compile-time error if you use an integer variable that contains the value 42, or a pointer that contains a null pointer value. You might, however, succeed in implementing an approximate build-time convention checker using the clang analysis API, working on the CFG built from the C++ AST and reporting an error whenever it cannot prove that std::move has not been called on a variable before a given use.
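A minimal sketch of why this is a run-time question, using a hypothetical maybe_store() function: std::move at the call site is only a cast, and whether the argument is really moved from depends on a flag the compiler cannot see.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sink: takes ownership only when `keep` is true,
// which is known only at run time.
inline void maybe_store(std::vector<std::unique_ptr<int>>& sink,
                        std::unique_ptr<int>&& p, bool keep) {
    if (keep) {
        sink.push_back(std::move(p));  // the actual move happens here
    }
    // otherwise p is untouched; std::move at the call site was only a cast
}

// Returns true if the caller's pointer was emptied by the call.
inline bool was_moved_from(bool keep) {
    std::vector<std::unique_ptr<int>> sink;
    std::unique_ptr<int> bar(new int(42));
    maybe_store(sink, std::move(bar), keep);
    return bar == nullptr;  // unique_ptr is guaranteed to be null after a move
}
```

was_moved_from(true) returns true and was_moved_from(false) returns false, although the call site looks identical in both cases.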
Move semantics are designed so that the moved-from object is left in some valid state. Valid means that all fields have sensible values and all internal invariants still hold. It was done this way because after a move you no longer care about the contents of the moved-from object, but resource management, assignment, and the destructor still have to work.
Standard library classes, and classes whose defaulted move operations simply move each member, typically hand their contents over to the new object (often by swapping internals), so both objects end up in a valid state. This is easy to implement, fast, and convenient enough.
You can define your own class with an isValid flag that is normally true and is set to false when the object is moved from (i.e. in the move constructor and move assignment operator). The object then has a well-defined "I am invalid" state. Just don't forget to check the flag where needed (destructor, assignment, etc.).
The isValid flag could equally be a pointer member that holds null. The point is that you know the object is in a predictable state after the move, not just random bytes in memory.
Edit: example of String:
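#include <string>
#include <utility>
class String {
public:
std::string data;
private:
bool m_isValid;
public:
String(std::string const& b): data(b), m_isValid(true) {}
// the new object inherits validity; the source becomes invalid
String(String &&b): data(std::move(b.data)), m_isValid(b.m_isValid) {
b.m_isValid = false;
}
String& operator =(String &&b) {
data = std::move(b.data);
m_isValid = b.m_isValid;
b.m_isValid = false;
return *this;
}
bool isValid() const {
return m_isValid;
}
};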

Example to use boost::weak_ptr to break cyclic dependencies

I have seen one of the usages of boost::weak_ptr is to break cyclic dependencies. Can someone give me a simple concrete example to illustrate this feature?
Thank you
In simple terms:
{ // Enter scope
shared_ptr<A> my_a(new A);
shared_ptr<B> my_b(new B);
my_a->remember_this_b( my_b ); // Stores a copy of a smart pointer
my_b->remember_this_a( my_a ); // Stores a copy of a smart pointer
} // Leave scope. my_a and my_b are destroyed.
If both these functions stored a shared_ptr, the objects would never be deleted, because neither shared_ptr would reach a reference count of zero.
However, if either one used a weak_ptr, the object pointed to by the weak_ptr would be destroyed when leaving the scope. And that would in turn destroy the last shared_ptr to the other object.
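Here is a self-contained sketch of both cases. It uses std::shared_ptr and std::weak_ptr instead of the Boost versions (an assumption made to keep the example dependency-free; the interfaces used here are the same), and the A and B types with their member names are hypothetical.

```cpp
#include <memory>

// Counts destructor calls so the effect is observable.
static int g_destroyed = 0;

struct B;

struct A {
    std::shared_ptr<B> b;         // owning link A -> B
    ~A() { ++g_destroyed; }
};

struct B {
    std::shared_ptr<A> strong_a;  // owning back link: creates a cycle if set
    std::weak_ptr<A> weak_a;      // non-owning back link: no cycle
    ~B() { ++g_destroyed; }
};

// Builds the pair, links it both ways, leaves the scope, and reports
// how many of the two objects had been destroyed at that point.
inline int destroyed_after_scope(bool use_weak) {
    g_destroyed = 0;
    std::weak_ptr<B> watch;  // observer, used only to clean up afterwards
    {
        std::shared_ptr<A> my_a(new A);
        std::shared_ptr<B> my_b(new B);
        watch = my_b;
        my_a->b = my_b;
        if (use_weak) {
            my_b->weak_a = my_a;
        } else {
            my_b->strong_a = my_a;  // deliberate cycle for demonstration
        }
    }  // my_a and my_b are destroyed here
    int count = g_destroyed;
    if (std::shared_ptr<B> alive = watch.lock()) {
        alive->strong_a.reset();  // break the cycle so the demo does not leak
    }
    return count;
}
```

destroyed_after_scope(true) returns 2 because both objects are gone when the scope ends. destroyed_after_scope(false) returns 0 because the cycle keeps both alive until it is broken by hand.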

When parsing YAML file using yaml-cpp, does it 'copy' all child nodes?

When parsing a YAML file, we normally get the root node from the parser.
I'm wondering whether I can still reference the root node after parsing has finished, as in the code below.
YAML::Node* globalRoot;
void ParseDocument(const std::string& filename)
{
std::ifstream fin(filename.c_str());
YAML::Parser parser(fin);
parser.GetNextDocument(*globalRoot);
}
void myFunction()
{
ParseDocument("myYAML.yml");
// After the method above, we lose the parser instance since it's a local variable.
// But if all child data is copied, below code should be safe.
// If globalRoot is just pointing inside the parser, this could be dangerous.
std::string stringKey;
(*globalRoot)["myKey"] >> stringKey;
}
Can I use code like the above?
Yes, it does - once a Node is parsed, it does not rely on any memory from the Parser.
That said, in your example, you never actually construct the node pointed to by globalRoot. You'd need to call
globalRoot = new YAML::Node;
and even better, keep it in a smart pointer like std::auto_ptr (or std::unique_ptr in C++11 and later, where std::auto_ptr is deprecated).
