In the following example, where does the pointer p get its information? - c++11

vector& vector::operator = (const vector& a)
//make this vector a copy of a
{
double* p = new double [ a.sz ]; // allocate new space
copy(a.elem, a.elem+a.sz, elem); // copy elements
delete[] elem; // deallocate old space
elem = p; // now we can reset elem
sz = a.sz;
return *this; // return a self-reference
}
I thought that the third argument of std::copy() should be the pointer p, but the book (Programming principles and practice using C++ - 2nd edition) says:
"When implementing the assignment, you could consider simplifying the code by freeing the memory for the old elements before creating the copy, but it is usually a very good idea not to throw away information before you know that you can replace it. Also, if you did that, strange things would happen if you assigned a vector to itself" - Page 635 and 636.
So the pointer elem must be the third argument of std::copy(), so that the pointer is never invalid, even for a moment. I think...
But where does p get the information that should be in the array it points to, so that we can then do elem = p?
I already know copy and swap strategy exist, you don't have to explain that.
I want to comprehend what is above.

No, that is a typo.
std::copy(a.elem, a.elem+a.sz, p);
is what the code should read.
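For illustration, here is a minimal sketch of the corrected assignment inside a small self-contained class. The class name Vec and its constructors, size() and operator[] are assumptions made for this example; only the body of operator= mirrors the code from the question, with p as the destination of std::copy.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the book's vector class (names are assumptions).
class Vec {
    int sz;
    double* elem;
public:
    explicit Vec(int s) : sz{s}, elem{new double[s]} {
        std::fill(elem, elem + sz, 0.0);
    }
    Vec(const Vec& a) : sz{a.sz}, elem{new double[a.sz]} {
        std::copy(a.elem, a.elem + a.sz, elem);
    }
    ~Vec() { delete[] elem; }

    Vec& operator=(const Vec& a) {
        double* p = new double[a.sz];        // allocate new space first
        std::copy(a.elem, a.elem + a.sz, p); // copy into the NEW space, p
        delete[] elem;                       // only now release the old space
        elem = p;                            // p already holds the copied elements
        sz = a.sz;
        return *this;
    }

    int size() const { return sz; }
    double& operator[](int i) { return elem[i]; }
};
```

Because the elements are copied into p before elem is deleted, self-assignment also works: the old array is still intact while it is being read.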

Related

C++ shared pointer points to the wrong address when I use the make_shared function from the STL

So I have a class in which a shared_ptr is declared as follows:
std :: shared_ptr< T > dyn_arr{ new T[ dyn_arr_max_size ], std :: default_delete< T[] >() };
This points to a dynamic array of some size.
I also implemented an iterator for it. Inside this iterator there is an overloaded ++ operator. Now when I write
shared_ptr<T> ptr_iter= dyn_arr;
for example, it works for the first one or two elements. After that it does not iterate properly. I also noticed the following:
For example, ptr_iter.get() returns the address ABDC0 in the beginning.
After doing
ptr_iter = std :: make_shared<T>( *(ptr.get() + 1 ) );
or
ptr_iter = std :: make_shared<T>( ptr.get()[1] );
ptr_iter.get() now points to some other address, like SDBC, instead of pointing to ABDC4 (for integers, for example). Can someone please explain why this is happening?
I need to be able to do ptr_iter = make_shared( ptr_iter.get() + 1 ); somehow, instead of ptr_iter = make_shared( *(ptr_iter.get() + 1) );
std::make_shared allocates new memory and constructs a new object there, which you don't want. To solve this problem, just use the constructor of std::shared_ptr and pass the address of the element in the array. However, std::shared_ptr attempts to deallocate as soon as the reference count falls to zero, and you would then call delete on an element of the array. That's why you'll need to pass a custom deleter which does nothing:
std::shared_ptr<int> ptr_iter{dyn_arr.get() + 1, [](int* pi) {}};
// ^-- Custom deleter does nothing
Example in a loop:
for (int i = 0; i < 9; ++i) {
std::shared_ptr<int> ptr_iter{dyn_arr.get() + i, [](int* pi) {}};
std::cout << *ptr_iter.get() << std::endl;
}
However, I strongly recommend against doing this anywhere other than in your assignment!
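As a possible alternative to the no-op deleter, std::shared_ptr also has an aliasing constructor, which shares ownership with an existing shared_ptr while pointing at a different address. The sketch below assumes an int array; element_at and make_array are hypothetical helper names that do not appear in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: builds a shared array the same way the question does.
std::shared_ptr<int> make_array(std::size_t n) {
    std::shared_ptr<int> arr(new int[n], std::default_delete<int[]>());
    for (std::size_t i = 0; i < n; ++i) {
        arr.get()[i] = static_cast<int>(i) * 10;
    }
    return arr;
}

// Hypothetical helper: returns a shared_ptr to element i of the array.
std::shared_ptr<int> element_at(const std::shared_ptr<int>& owner, std::size_t i) {
    // Aliasing constructor: shares ownership (and the array deleter) with
    // `owner`, but get() returns owner.get() + i. No new memory is allocated.
    return std::shared_ptr<int>(owner, owner.get() + i);
}
```

With this approach the element pointer keeps the whole array alive, and the array is released with delete[] once the last owner goes away.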

Why is this an infinite loop?

I have declared a map below using the STL and inserted some values into it.
#include<bits/stdc++.h>
using namespace std;
int main()
{
map<int,int> m;
m[1]=1;
m[2]=1;
m[3]=1;
m[4]=1;
m[5]=1;
m[6]=1;
for(auto it=m.begin();it!=m.end();)
{
cout<<it->first<<" "<<it->second<<endl;
it=it++;
}
return 0;
}
When I executed the code above, it ended up in an infinite loop. Can someone tell me why?
I am incrementing the iterator it and storing the result back in it, so it should advance each time the loop body runs and the loop should eventually terminate normally. Am I wrong?
The bad line is it = it++;. The post-increment advances it but returns a copy of the old value, and that old value is then assigned back to it, so it stays at the first position. The correct line would be ++it; or it++; on its own (it = ++it; also works but is redundant), because the increment already modifies it.
Edit
For built-in types such as int, i = i++; was undefined behavior before C++17. Here it is a std::map iterator, a class type whose operators are function calls, so the behavior is defined: the loop simply never advances.
If you try doing something similar with an int, you'll get a warning:
int nums[] = { 1, 2, 3, 4, 5 };
for (int i = 0; i < sizeof nums / sizeof *nums; ) {
cout << nums[i] << '\n';
i = i++;
}
warning: operation on 'i' may be undefined [-Wsequence-point]
However, when you're using a class (std::map::iterator) with overloaded operators, the compiler gives no such warning. There it++ is an ordinary function call, so the expression is well defined; it just does not do what you expect.
In other words, the problem with the iterator is not a sequence point violation but the value that the post-increment returns.
The post-increment operation would behave like this:
iterator operator ++ (int) {
auto copy = *this;
++*this;
return copy;
}
So your increment step overwrites the iterator it with a copy of its original value. If the map isn't empty, your loop stays stuck on the first element.
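A small sketch of both behaviors, assuming a non-empty std::map<int, int>. The function names keys_in_order and assign_post_increment_stays_put are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Walks the map with a plain pre-increment, which is all the loop needs.
std::vector<int> keys_in_order(const std::map<int, int>& m) {
    std::vector<int> keys;
    for (auto it = m.begin(); it != m.end(); ++it) {
        keys.push_back(it->first);
    }
    return keys;
}

// Shows why `it = it++` makes no progress. Requires a non-empty map,
// because incrementing end() is not allowed.
bool assign_post_increment_stays_put(const std::map<int, int>& m) {
    auto it = m.begin();
    const auto before = it;
    it = it++;  // it++ advances it, then the returned old copy overwrites it
    return it == before;
}
```

The second function returns true for any non-empty map, which is exactly the situation that keeps the original loop on its first element.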

std::vector<std::string> insert empty string instead

In Visual Studio 2013, I created a std::vector and stored some strings in it. Then I wanted to copy some of the strings in the vector and append them to the end (the goal is to move them to the end, so the insert will be followed by an erase), but after calling insert I saw only empty strings at the end, which is very strange. I reproduced it with some simple test code:
std::vector<std::string> v;
std::string s = "0";
for (int i = 0; i < 7; ++i)
{
s[0] = '0' + i;
v.push_back(s);
}
v.insert(v.end(), v.begin(), v.begin() + 3);
for (std::string& s : v)
std::cout << "\"" << s.c_str() << "\" ";
What I get there is
"0" "1" "2" "3" "4" "5" "6" "" "" ""
I debugged into the insert method. Inside the _Insert(..) method of the vector class, it reallocates memory, moves elements and so on.
The first _Umove call moves all 7 strings to the newly allocated memory. I think std::move is invoked, so the old memory is left holding empty strings.
Then the _Ucopy method tries to copy 3 items, but from the old memory, so 3 empty strings are appended.
There is another _Umove call; I am not sure what that is for. After all this, the old memory is freed and the new memory is attached to the vector.
Using a scalar type like int does not produce wrong output, because the memory is copied and no std::move is invoked.
Am I doing something wrong here, or is it a bug in MS Visual Studio's STL?
From this std::vector::insert reference:
Causes reallocation if the new size() is greater than the old capacity(). If the new size() is greater than capacity(), all iterators and references are invalidated
[Emphasis mine]
You are inserting elements into the vector from a range of iterators into that same vector. Since the insertion can cause the vector to reallocate, those iterators can be invalidated in the middle of the operation. That leads to undefined behavior.
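One way to avoid the problem is to copy the source elements out before inserting, so that insert never reads from the vector it is modifying (the range form of insert requires that its iterators do not point into the same vector). This is only a sketch; append_prefix is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends copies of the first `count` elements of v
// to the end of v.
void append_prefix(std::vector<std::string>& v, std::size_t count) {
    if (count > v.size()) {
        count = v.size();
    }
    // Copy the source range into a temporary first, so the iterators
    // passed to insert belong to a different container.
    const std::vector<std::string> head(
        v.begin(), v.begin() + static_cast<std::ptrdiff_t>(count));
    v.insert(v.end(), head.begin(), head.end());
}
```

For the test data in the question, append_prefix(v, 3) should leave "0", "1" and "2" at the end instead of three empty strings.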

Hand-over-hand locking with Rust

I'm trying to write an implementation of union-find in Rust. This is famously very simple to implement in languages like C, while still having a complex run time analysis.
I'm having trouble getting Rust's mutex semantics to allow iterative hand-over-hand locking.
Here's how I got where I am now.
First, this is a very simple implementation of part of the structure I want in C:
#include <stdlib.h>
struct node {
struct node * parent;
};
struct node * create(struct node * parent) {
struct node * ans = malloc(sizeof(struct node));
ans->parent = parent;
return ans;
}
struct node * find_root(struct node * x) {
while (x->parent) {
x = x->parent;
}
return x;
}
int main() {
struct node * foo = create(NULL);
struct node * bar = create(foo);
struct node * baz = create(bar);
baz->parent = find_root(bar);
}
Note that the structure of the pointers is that of an inverted tree; multiple pointers may point at a single location, and there are no cycles.
At this point, there is no path compression.
Here is a Rust translation. I chose to use Rust's reference-counted pointer type to support the inverted tree type I referenced above.
Note that this implementation is much more verbose, possibly due to the increased safety that Rust offers, but possibly due to my inexperience with Rust.
use std::rc::Rc;
struct Node {
parent: Option<Rc<Node>>
}
fn create(parent: Option<Rc<Node>>) -> Node {
Node {parent: parent.clone()}
}
fn find_root(x: Rc<Node>) -> Rc<Node> {
let mut ans = x.clone();
while ans.parent.is_some() {
ans = ans.parent.clone().unwrap();
}
ans
}
fn main() {
let foo = Rc::new(create(None));
let bar = Rc::new(create(Some(foo.clone())));
let mut prebaz = create(Some(bar.clone()));
prebaz.parent = Some(find_root(bar.clone()));
}
Path compression re-parents each node along a path to the root every time find_root is called. To add this feature to the C code, only two new small functions are needed:
void change_root(struct node * x, struct node * root) {
while (x) {
struct node * tmp = x->parent;
x->parent = root;
x = tmp;
}
}
struct node * root(struct node * x) {
struct node * ans = find_root(x);
change_root(x, ans);
return ans;
}
The function change_root does all the re-parenting, while the function root is just a wrapper to use the results of find_root to re-parent the nodes on the path to the root.
In order to do this in Rust, I decided I would have to use a Mutex rather than just a reference counted pointer, since the Rc interface only allows mutable access by copy-on-write when more than one pointer to the item is live. As a result, all of the code would have to change. Before even getting to the path compression part, I got hung up on find_root:
use std::sync::{Mutex,Arc};
struct Node {
parent: Option<Arc<Mutex<Node>>>
}
fn create(parent: Option<Arc<Mutex<Node>>>) -> Node {
Node {parent: parent.clone()}
}
fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
let mut ans = x.clone();
let mut inner = ans.lock();
while inner.parent.is_some() {
ans = inner.parent.clone().unwrap();
inner = ans.lock();
}
ans.clone()
}
This produces the error (with 0.12.0)
error: cannot assign to `ans` because it is borrowed
ans = inner.parent.clone().unwrap();
note: borrow of `ans` occurs here
let mut inner = ans.lock();
What I think I need here is hand-over-hand locking. For the path A -> B -> C -> ..., I need to lock A, lock B, unlock A, lock C, unlock B, ... Of course, I could keep all of the locks open: lock A, lock B, lock C, ... unlock C, unlock B, unlock A, but this seems inefficient.
However, Mutex does not offer unlock, and uses RAII instead. How can I achieve hand-over-hand locking in Rust without being able to directly call unlock?
EDIT: As the comments noted, I could use Rc<RefCell<Node>> rather than Arc<Mutex<Node>>. Doing so leads to the same compiler error.
For clarity about what I'm trying to avoid by using hand-over-hand locking, here is a RefCell version that compiles but uses space linear in the length of the path.
fn find_root(x: Rc<RefCell<Node>>) -> Rc<RefCell<Node>> {
let mut inner : RefMut<Node> = x.borrow_mut();
if inner.parent.is_some() {
find_root(inner.parent.clone().unwrap())
} else {
x.clone()
}
}
We can pretty easily do full hand-over-hand locking as we traverse this list using just a bit of unsafe, which is necessary to tell the borrow checker a small bit of insight that we are aware of, but that it can't know.
But first, let's clearly formulate the problem:
We want to traverse a linked list whose nodes are stored as Arc<Mutex<Node>> to get the last node in the list
We need to lock each node in the list as we go along the way such that another concurrent traversal has to follow strictly behind us and cannot muck with our progress.
Before we get into the nitty-gritty details, let's try to write the signature for this function:
fn find_root(node: Arc<Mutex<Node>>) -> Arc<Mutex<Node>>;
Now that we know our goal, we can start to get into the implementation - here's a first attempt:
fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
// We have to separate this from incoming since the lock must
// be borrowed from incoming, not this local node.
let mut node = incoming.clone();
let mut lock = incoming.lock();
// Could use while let but that leads to borrowing issues.
while lock.parent.is_some() {
node = lock.parent.as_ref().unwrap().clone(); // !! uh-oh !!
lock = node.lock();
}
node
}
If we try to compile this, rustc will error on the line marked !! uh-oh !!, telling us that we can't move out of node while lock still exists, since lock is borrowing node. This is not a spurious error! The data in lock might go away as soon as node does - it's only because we know that we can keep the data lock is pointing to valid and in the same memory location even if we move node that we can fix this.
The key insight here is that the lifetime of data contained within an Arc is dynamic, and it is hard for the borrow checker to make the inferences we can about exactly how long data inside an Arc is valid.
This happens every once in a while when writing Rust; you have more knowledge about the lifetime and organization of your data than rustc, and you want to be able to express that knowledge to the compiler, effectively saying "trust me". Enter: unsafe - our way of telling the compiler that we know more than it, and it should allow us to inform it of the guarantees that we know but it doesn't.
In this case, the guarantee is pretty simple - we are going to replace node while lock still exists, but we are going to ensure that the data inside lock continues to be valid even though node goes away. To express this guarantee we can use mem::transmute, a function which allows us to reinterpret the type of any variable, by just using it to change the lifetime of the lock returned by node to be slightly longer than it actually is.
To make sure we keep our promise, we are going to use another handoff variable to hold node while we reassign lock - even though this moves node (changing its address) and the borrow checker will be angry at us, we know it's ok since lock doesn't point at node, it points at data inside of node, whose address (in this case, since it's behind an Arc) will not change.
Before we get to the solution, it's important to note that the trick we are using here is only valid because we are using an Arc. The borrow checker is warning us of a possibly serious error - if the Mutex was held inline and not in an Arc, this error would be a correct prevention of a use-after-free, where the MutexGuard held in lock would attempt to unlock a Mutex which has already been dropped, or at least moved to another memory location.
use std::mem;
use std::sync::{Arc, Mutex};
fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
let mut node = incoming.clone();
let mut handoff_node;
let mut lock = incoming.lock().unwrap();
// Could use while let but that leads to borrowing issues.
while lock.parent.is_some() {
// Keep the data in node around by holding on to this `Arc`.
handoff_node = node;
node = lock.parent.as_ref().unwrap().clone();
// We are going to move out of node while this lock is still around,
// but since we kept the data around it's ok.
lock = unsafe { mem::transmute(node.lock().unwrap()) };
}
node
}
And, just like that, rustc is happy, and we have hand-over-hand locking, since the last lock is released only after we have acquired the new lock!
There is one open question about this implementation which I have not yet received an answer to: whether the drop of the old value and the assignment of a new value to a variable are guaranteed to be atomic. If not, there is a race condition where the old lock is released before the new lock is acquired in the assignment of lock. It's pretty trivial to work around this by just having another holdover_lock variable and moving the old lock into it before reassigning, then dropping it after reassigning lock.
Hopefully this fully addresses your question and shows how unsafe can be used to work around "deficiencies" in the borrow checker when you really do know more. I would still like to warn that the cases where you know more than the borrow checker are rare, and transmuting lifetimes is not "usual" behavior.
Using Mutex in this way, as you can see, is pretty complex, and you have to deal with many possible sources of a race condition. I may not even have caught all of them! Unless you really need this structure to be accessible from many threads, it would probably be best to just use Rc and, if you need it, RefCell, as this makes things much easier.
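For comparison only, and not as a Rust solution: the locking order from the question (lock A, lock B, unlock A, lock C, unlock B, ...) can be sketched in C++, where a lock guard can be moved and released explicitly. The Node type below is a hypothetical stand-in in which each node's mutex guards its own parent pointer.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical node type: the mutex guards the node's own parent pointer.
struct Node {
    std::mutex m;
    std::shared_ptr<Node> parent;
};

std::shared_ptr<Node> find_root(std::shared_ptr<Node> node) {
    std::unique_lock<std::mutex> lock(node->m);           // lock A
    while (node->parent) {
        std::shared_ptr<Node> next = node->parent;
        std::unique_lock<std::mutex> next_lock(next->m);  // lock B while A is held
        lock = std::move(next_lock);  // unlocks A, keeps B locked
        node = std::move(next);       // the old node may be dropped only now
    }
    return node;  // the remaining lock is released on return
}
```

At most two locks are held at any time, and the old node stays alive until after its mutex has been unlocked, which is the same invariant the handoff_node variable maintains in the Rust version above.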
I believe this to fit the criteria of hand-over-hand locking.
use std::sync::Mutex;
fn main() {
// Create a set of mutexes to lock hand-over-hand
let mutexes = Vec::from_fn(4, |_| Mutex::new(false));
// Lock the first one
let val_0 = mutexes[0].lock();
if !*val_0 {
// Lock the second one
let mut val_1 = mutexes[1].lock();
// Unlock the first one
drop(val_0);
// Do logic
*val_1 = true;
}
for mutex in mutexes.iter() {
println!("{}" , *mutex.lock());
}
}
Edit #1
Does it work when access to lock n+1 is guarded by lock n?
If you mean something that could be shaped like the following, then I think the answer is no.
struct Level {
data: bool,
child: Option<Mutex<Box<Level>>>,
}
However, it is sensible that this should not work. When you wrap an object in a mutex, then you are saying "The entire object is safe". You can't say both "the entire pie is safe" and "I'm eating the stuff below the crust" at the same time. Perhaps you jettison the safety by creating a Mutex<()> and lock that?
This is still not an answer to your literal question of how to do hand-over-hand locking, which should only be important in a concurrent setting (or if someone else forced you to use Mutex references to nodes). It is instead how to do this with Rc and RefCell, which you seem to be interested in.
RefCell only allows writes while exactly one mutable reference is held. Importantly, the Rc<RefCell<Node>> objects are not mutable references. The mutable references in question are the results of calling borrow_mut() on the Rc<RefCell<Node>> object, and as long as you do that in a limited scope (e.g. the body of the while loop), you'll be fine.
The important thing happening in path compression is that the next Rc object will keep the rest of the chain alive while you swing the parent pointer for node to point at root. However, it is not a reference in the Rust sense of the word.
struct Node
{
parent: Option<Rc<RefCell<Node>>>
}
fn find_root(mut node: Rc<RefCell<Node>>) -> Rc<RefCell<Node>>
{
while let Some(parent) = node.borrow().parent.clone()
{
node = parent;
}
return node;
}
fn path_compress(mut node: Rc<RefCell<Node>>, root: Rc<RefCell<Node>>)
{
while node.borrow().parent.is_some()
{
let next = node.borrow().parent.clone().unwrap();
node.borrow_mut().parent = Some(root.clone());
node = next;
}
}
This runs fine for me with the test harness I used, though there may still be bugs. It certainly compiles and runs without a panic! due to trying to borrow_mut() something that is already borrowed. It may actually produce the right answer, that's up to you.
On IRC, Jonathan Reem pointed out that inner is borrowing until the end of its lexical scope, which is too far for what I was asking. Inlining it produces the following, which compiles without error:
fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
let mut ans = x.clone();
while ans.lock().parent.is_some() {
ans = ans.lock().parent.clone().unwrap();
}
ans
}
EDIT: As Francis Gagné points out, this has a race condition, since the lock doesn't extend long enough. Here's a modified version that only has one lock() call; perhaps it is not vulnerable to the same problem.
fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
let mut ans = x.clone();
loop {
ans = {
let tmp = ans.lock();
match tmp.parent.clone() {
None => break,
Some(z) => z
}
}
}
ans
}
EDIT 2: This only holds one lock at a time, and so is racy. I still don't know how to do hand-over-hand locking.
As pointed out by Frank Sherry and others, you shouldn't use Arc/Mutex when single threaded. But his code was outdated, so here is the new one (for version 1.0.0alpha2).
This does not take linear space either (like the recursive code given in the question).
struct Node {
parent: Option<Rc<RefCell<Node>>>
}
fn find_root(node: Rc<RefCell<Node>>) -> Rc<RefCell<Node>> {
let mut ans = node.clone(); // Rc<RefCell<Node>>
loop {
ans = {
let ans_ref = ans.borrow(); // std::cell::Ref<Node>
match ans_ref.parent.clone() {
None => break,
Some(z) => z
}
} // ans_ref goes out of scope, and ans becomes mutable
}
ans
}
fn path_compress(mut node: Rc<RefCell<Node>>, root: Rc<RefCell<Node>>) {
while node.borrow().parent.is_some() {
let next = {
let node_ref = node.borrow();
node_ref.parent.clone().unwrap()
};
node.borrow_mut().parent = Some(root.clone());
// RefMut<Node> from borrow_mut() is out of scope here...
node = next; // therefore we can mutate node
}
}
Note for beginners: Pointers are automatically dereferenced by dot operator. ans.borrow() actually means (*ans).borrow(). I intentionally used different styles for the two functions.
Although this is not the answer to your literal question (hand-over-hand locking), union-find with weighted union and path compression can be very simple in Rust:
fn unionfind<I: Iterator<(uint, uint)>>(mut iterator: I, nodes: uint) -> Vec<uint>
{
let mut root = Vec::from_fn(nodes, |x| x);
let mut rank = Vec::from_elem(nodes, 0u8);
for (mut x, mut y) in iterator
{
// find roots for x and y; do path compression on look-ups
while (x != root[x]) { root[x] = root[root[x]]; x = root[x]; }
while (y != root[y]) { root[y] = root[root[y]]; y = root[y]; }
if x != y
{
// weighted union swings roots
match rank[x].cmp(&rank[y])
{
Less => root[x] = y,
Greater => root[y] = x,
Equal =>
{
root[y] = x;
rank[x] += 1
},
}
}
}
root // return the parent array, as the signature promises
}
Maybe the meta-point is that the union-find algorithm may not be the best place to handle node ownership, and that using references to existing memory (in this case, just uint identifiers for the nodes) without affecting the lifecycle of the nodes makes for a much simpler implementation, if you can get away with it of course.
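The same index-based idea carries over to other languages. Below is a rough C++ sketch, not a translation of any code in this thread; the class name DisjointSets and its members are assumptions made for this example. It uses the same grandparent shortcut (path halving) as the loops above, plus union by rank.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical class: nodes are plain indices 0..n-1, so no ownership
// or locking questions arise.
class DisjointSets {
    std::vector<std::size_t> root_;
    std::vector<std::uint8_t> rank_;
public:
    explicit DisjointSets(std::size_t n) : root_(n), rank_(n, 0) {
        std::iota(root_.begin(), root_.end(), std::size_t{0});  // each node is its own root
    }

    std::size_t find(std::size_t x) {
        while (x != root_[x]) {
            root_[x] = root_[root_[x]];  // point x at its grandparent
            x = root_[x];
        }
        return x;
    }

    void unite(std::size_t x, std::size_t y) {
        x = find(x);
        y = find(y);
        if (x == y) return;
        if (rank_[x] < rank_[y]) std::swap(x, y);  // x is now the higher-ranked root
        root_[y] = x;
        if (rank_[x] == rank_[y]) ++rank_[x];
    }
};
```

For example, after unite(0, 1) and unite(3, 4) on five nodes, find(0) and find(1) agree, while node 2 is still its own root.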

How to create a string on the heap in D?

I'm writing a trie in D and I want each trie object to have a pointer to some data, which has a non-NULL value if the node is a terminal node in the trie, and NULL otherwise. The type of the data is undetermined until the trie is created (in C this would be done with a void *, but I plan to do it with a template), which is one of the reasons why pointers to heap objects are desirable.
This requires me to eventually create my data on the heap, at which point it can be pointed to by the trie node. Experimenting, it seems like new performs this task, much as it does in C++. However, for some reason, this fails with strings. The following code works:
import std.stdio;
void main() {
string *a;
string b = "hello";
a = &b;
writefln("b = %s, a = %s, *a = %s", b, a, *a);
}
/* OUTPUT:
b = hello, a = 7FFF5C60D8B0, *a = hello
*/
However, this fails:
import std.stdio;
void main() {
string *a;
a = new string();
writefln("a = %s, *a = %s", a, *a);
}
/* COMPILER FAILS WITH:
test.d(5): Error: new can only create structs, dynamic arrays or class objects, not string's
*/
What gives? How can I create strings on the heap?
P.S. If anyone writing the D compiler is reading this, the apostrophe in "string's" is a grammatical error.
Strings are always allocated on the heap. This is the same for any other dynamic array (T[], string is only an alias to type immutable(char)[]).
If you need only one pointer there are two ways to do it:
auto str = "some immutable(char) array";
auto ptr1 = &str; // pointer to the string reference itself (immutable(char)[]*)
auto ptr2 = str.ptr; // pointer to the first element of the string (immutable(char)*)
If you need a pointer to an empty string, use this:
auto ptr = &"";
Remember that you can't change the value of any single character in a string (because the characters are immutable). If you want to operate on the characters of a string, use this:
auto mutableString1 = cast(char[])"Convert to mutable."; // shouldn't be used
// or
auto mutableString2 = "Convert to mutable.".dup; // T[].dup returns mutable duplicate of array
Generally you should avoid pointers unless you know exactly what you are doing.
From a memory point of view, any pointer takes 4 bytes (8 bytes on 64-bit machines). If you are using a pointer to an array and the pointer is not null, then 12 bytes (24 bytes on 64-bit machines), plus the data in the array, are in use: 4 bytes for the pointer and 8 bytes for the array reference, because an array reference is a pair of a length and a pointer to the first element.
Remember that string is just immutable(char)[]. So you don't need pointers since string is already a dynamic array.
As for creating them, you just do new char[X], not new string.
The string contents are on the heap already because strings are dynamic arrays. However, in your case, it is better to use a char dynamic array instead as you require mutability.
import std.stdio;
void main() {
char[] a = null; // redundant as dynamic arrays are initialized to null
writefln("a = \"%s\", a.ptr = %s", a, a.ptr); // prints: a = "", a.ptr = null
a = "hello".dup; // dup is required because a is mutable
writefln("a = \"%s\", a.ptr = %s", a, a.ptr); // prints: a = "hello", a.ptr = 7F3146469FF0
}
Note that you don't actually hold the array's contents, but a slice of it. The array is handled by the runtime and it is allocated on the heap.
A good read on the subject is this article: http://dlang.org/d-array-article.html
If you can only use exactly one pointer and you don't want to use the suggestions in Marmyst's answer (&str in his example creates a reference to the stack, which you might not want, and str.ptr loses information about the string's length, as D strings are not always zero terminated), you can do this:
Remember that you can think of D arrays (and therefore strings) as a struct with a data pointer and a length member:
struct ArraySlice(T)
{
T* ptr;
size_t length;
}
So when dealing with an array the array's content is always on the heap, but the ptr/length combined type is a value type and therefore usually kept on the stack. I don't know why the compiler doesn't allow you to create that value type on the heap using new, but you can always do it manually:
import core.memory;
import std.stdio;
string* ptr;
void alloc()
{
ptr = cast(string*)GC.malloc(string.sizeof);
*ptr = "Hello World!";
}
void main()
{
alloc();
writefln("ptr=%s, ptr.ptr=%s, ptr.length=%s, *ptr=%s", ptr, ptr.ptr, ptr.length, *ptr);
}
