Memory allocation of string literal in c

Memory allocation of string literal in c - memory-management

I am having a strange issue with memory allocation in c, the file is fairly complicated so I cannot include it all here but perhaps you can point me in the right direction as to why this may be happening.
I am trying to create a string literal as such:
char * p = "root"
But when i look at the value of this variable at runtime (at the line directly after the declaration) i get this:
$1 = 0x7001260c "me"
and when I look at the contents of the memory at 0x7001260c it indeed holds the string "me".
EDIT:
To give more context when I run the following code the value of p on the last line is "root".
create_directory("root/home");
char * p = "root";
char * q = "foo";
And when I run the following code the value of p is "io"
create_directory("io/home");
char * p = "root";
char * q = "foo";
The create_directory function:
void create_directory(char * path) {
directory d;
directory * dir = &d;
//Browse to closest directory
path = find_directory(path, dir);
//Create remaining directories
char component[20];
path = next_component(path, component);
while (strlen(component) > 0) {
add_dir_entry(dir, component, inode_next);
write_dir_entry(dir, inode_to_loc(dir->inode));
directory new;
new.type = DIRECTORY;
new.inode = inode_next;
write_dir_entry(&new, inode_to_loc(inode_next));
inode_next++;
dir = &new;
path = next_component(path, component);
}
}

Almost certainly, there's a bug somewhere in your program that causes a constant to be modified which is, of course, illegal. Perhaps you're doing something like this:
void to_lower(char *j)
{
while (*j != 0) { *j = tolower(*j); j++; }
}
...
bool is_yes(char *k)
{
to_lower(k);
return strcmp(k, "yes") == 0;
}
void someFunc(char *k)
{
if (is_yes(k)) // ...
...
}
someFunc("testing");
See what this does? We pass a pointer to a constant to sumeFunc, but it flows down to to_lower which modifies the thing it points to -- modifying a constant.
Somehow, your code probably does something like that.
Start by changing code like char * p = "root" to code like char const* p = "root". That will give you a better chance of catching this kind of problem at compile time.

Related

Algorithm / data structure for resolving nested interpolated values in this example?

I am working on a compiler and one aspect currently is how to wait for interpolated variable names to be resolved. So I am wondering how to take a nested interpolated variable string and build some sort of simple data model/schema for unwrapping the evaluated string so to speak. Let me demonstrate.
Say we have a string like this:
foo{a{x}-{y}}-{baz{one}-{two}}-foo{c}
That has 1, 2, and 3 levels of nested interpolations in it. So essentially it should resolve something like this:
wait for x, y, one, two, and c to resolve.
when both x and y resolve, then resolve a{x}-{y} immediately.
when both one and two resolve, resolve baz{one}-{two}.
when a{x}-{y}, baz{one}-{two}, and c all resolve, then finally resolve the whole expression.
I am shaky on my understanding of the logic flow for handling something like this, wondering if you could help solidify/clarify the general algorithm (high level pseudocode or something like that). Mainly just looking for how I would structure the data model and algorithm so as to progressively evaluate when the pieces are ready.
I'm starting out trying and it's not clear what to do next:
{
dependencies: [
{
path: [x]
},
{
path: [y]
}
],
parent: {
dependency: a{x}-{y} // interpolated term
parent: {
dependencies: [
{
}
]
}
}
}
Some sort of tree is probably necessary, but I am having trouble figuring out what it might look like, wondering if you could shed some light on that with some pseudocode (or JavaScript even).
watch the leaf nodes at first
then, when the children of a node are completed, propagate upward to resolving the next parent node. This would mean once x and y are done, it could resolve a{x}-{y}, but then wait until the other nodes are ready before doing the final top-level evaluation.
You can just simulate it by sending "events" to the system theoretically, like:
ready('y')
ready('c')
ready('x')
ready('a{x}-{y}')
function ready(variable) {
if ()
}
...actually that may not work, not sure how to handle the interpolated nodes in a hacky way like that. But even a high level description of how to solve this would be helpful.
export type SiteDependencyObserverParentType = {
observer: SiteDependencyObserverType
remaining: number
}
export type SiteDependencyObserverType = {
children: Array<SiteDependencyObserverType>
node: LinkNodeType
parent?: SiteDependencyObserverParentType
path: Array<string>
}
(What I'm currently thinking, some TypeScript)

Here is an approach in JavaScript:
Parse the input string to create a Node instance for each {} term, and create parent-child dependencies between the nodes.
Collect the leaf Nodes of this tree as the tree is being constructed: group these leaf nodes by their identifier. Note that the same identifier could occur multiple times in the input string, leading to multiple Nodes. If a variable x is resolved, then all Nodes with that name (the group) will be resolved.
Each node has a resolve method to set its final value
Each node has a notify method that any of its child nodes can call in order to notify it that the child has been resolved with a value. This may (or may not yet) lead to a cascading call of resolve.
In a demo, a timer is set up that at every tick will resolve a randomly picked variable to some number
I think that in your example, foo, and a might be functions that need to be called, but I didn't elaborate on that, and just considered them as literal text that does not need further treatment. It should not be difficult to extend the algorithm with such function-calling features.
class Node {
constructor(parent) {
this.source = ""; // The slice of the input string that maps to this node
this.texts = []; // Literal text that's not part of interpolation
this.children = []; // Node instances corresponding to interpolation
this.parent = parent; // Link to parent that should get notified when this node resolves
this.value = undefined; // Not yet resolved
}
isResolved() {
return this.value !== undefined;
}
resolve(value) {
if (this.isResolved()) return; // A node is not allowed to resolve twice: ignore
console.log(`Resolving "${this.source}" to "${value}"`);
this.value = value;
if (this.parent) this.parent.notify();
}
notify() {
// Check if all dependencies have been resolved
let value = "";
for (let i = 0; i < this.children.length; i++) {
const child = this.children[i];
if (!child.isResolved()) { // Not ready yet
console.log(`"${this.source}" is getting notified, but not all dependecies are ready yet`);
return;
}
value += this.texts[i] + child.value;
}
console.log(`"${this.source}" is getting notified, and all dependecies are ready:`);
this.resolve(value + this.texts.at(-1));
}
}
function makeTree(s) {
const leaves = {}; // nodes keyed by atomic names (like "x" "y" in the example)
const tokens = s.split(/([{}])/);
let i = 0; // Index in s
function dfs(parent=null) {
const node = new Node(parent);
const start = i;
while (tokens.length) {
const token = tokens.shift();
i += token.length;
if (token == "}") break;
if (token == "{") {
node.children.push(dfs(node));
} else {
node.texts.push(token);
}
}
node.source = s.slice(start, i - (tokens.length ? 1 : 0));
if (node.children.length == 0) { // It's a leaf
const label = node.texts[0];
leaves[label] ??= []; // Define as empty array if not yet defined
leaves[label].push(node);
}
return node;
}
dfs();
return leaves;
}
// ------------------- DEMO --------------------
let s = "foo{a{x}-{y}}-{baz{one}-{two}}-foo{c}";
const leaves = makeTree(s);
// Create a random order in which to resolve the atomic variables:
function shuffle(array) {
for (var i = array.length - 1; i > 0; i--) {
var j = Math.floor(Math.random() * (i + 1));
[array[j], array[i]] = [array[i], array[j]];
}
return array;
}
const names = shuffle(Object.keys(leaves));
// Use a timer to resolve the variables one by one in the given random order
let index = 0;
function resolveRandomVariable() {
if (index >= names.length) return; // all done
console.log("\n---------------- timer tick --------------");
const name = names[index++];
console.log(`Variable ${name} gets a value: "${index}". Calling resolve() on the connected node instance(s):`);
for (const node of leaves[name]) node.resolve(index);
setTimeout(resolveRandomVariable, 1000);
}
setTimeout(resolveRandomVariable, 1000);

your idea of building a dependency tree it's really likeable.
Anyway I tryed to find a solution as simplest possible.
Even if it already works, there are many optimizations possible, take this just as proof of concept.
The background idea it's produce a List of Strings which you can read in order where each element it's what you need to solve progressively. Each element might be mandatory to solve something that come next in the List, hence for the overall expression. Once you solved all the chunks you have all pieces to solve your original expression.
It's written in Java, I hope it's understandable.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
public class StackOverflow {
public static void main(String[] args) {
String exp = "foo{a{x}-{y}}-{baz{one}-{two}}-foo{c}";
List<String> chunks = expToChunks(exp);
//it just reverse the order of the list
Collections.reverse(chunks);
System.out.println(chunks);
//output -> [c, two, one, baz{one}-{two}, y, x, a{x}-{y}]
}
public static List<String> expToChunks(String exp) {
List<String> chunks = new ArrayList<>();
//this first piece just find the first inner open parenthesys and its relative close parenthesys
int begin = exp.indexOf("{") + 1;
int numberOfParenthesys = 1;
int end = -1;
for(int i = begin; i < exp.length(); i++) {
char c = exp.charAt(i);
if (c == '{') numberOfParenthesys ++;
if (c == '}') numberOfParenthesys --;
if (numberOfParenthesys == 0) {
end = i;
break;
}
}
//this if put an end to recursive calls
if(begin > 0 && begin < exp.length() && end > 0) {
//add the chunk to the final list
String substring = exp.substring(begin, end);
chunks.add(substring);
//remove from the starting expression the already considered chunk
String newExp = exp.replace("{" + substring + "}", "");
//recursive call for inner element on the chunk found
chunks.addAll(Objects.requireNonNull(expToChunks(substring)));
//calculate other chunks on the remained expression
chunks.addAll(Objects.requireNonNull(expToChunks(newExp)));
}
return chunks;
}
}
Some details on the code:
The following piece find the begin and the end index of the first outer chunk of expression. The background idea is: in a valid expression the number of open parenthesys must be equal to the number of closing parenthesys. The count of open(+1) and close(-1) parenthesys can't ever be negative.
So using that simple loop once I find the count of parenthesys to be 0, I also found the first chunk of the expression.
int begin = exp.indexOf("{") + 1;
int numberOfParenthesys = 1;
int end = -1;
for(int i = begin; i < exp.length(); i++) {
char c = exp.charAt(i);
if (c == '{') numberOfParenthesys ++;
if (c == '}') numberOfParenthesys --;
if (numberOfParenthesys == 0) {
end = i;
break;
}
}
The if condition provide validation on the begin and end indexes and stop the recursive call when no more chunks can be found on the remained expression.
if(begin > 0 && begin < exp.length() && end > 0) {
...
}

MPI: How to ensure a subroutine is executed only on one processor on the default node?

I use a large-scale parallelized code, and I am new to MPI itself.
I try to run a set of shell commands from Fortran, and hence it would be entirely wasteful (and cause my results to be incorrect) if done on more than one processor.
The most relevant commands I have found are MPI_gather and MPI_reduce, but these seem problematic, because they are trying to take information from other processors and use them on the processor 0, but I have no information that I am calling from other processors.
Basically I want to do something like this :
if (MPI_node = 0 .and. MPI_process = 0) then
(execute a code)
end if

I recently had an issue like this. The way I solved it was to use MPI_Comm_split to create a communicator for each node. Something like this (C++):
char node_name[MPI_MAX_PROCESSOR_NAME];
int name_len;
int processor_hash_id;
int global_proc_id;
int global_proc_num;
int node_proc_id;
int node_proc_num;
MPI_Comm node_comm;
//Get global info
MPI_Comm_rank(MPI_COMM_WORLD, &global_proc_id);
MPI_Comm_size(MPI_COMM_WORLD, &global_proc_num);
MPI_Get_processor_name(node_name, &name_len);
//Hash the node name
processor_hash_id = get_hash_id(node_name);
//Make a new communicator for processes only on the node
// and get node info
MPI_Comm_split(MPI_COMM_WORLD, processor_hash_id, global_proc_id, &node_comm);
MPI_Comm_rank(node_comm, &node_proc_id);
MPI_Comm_size(node_comm, &node_proc_num);
//Now, if you know the name of the "root node" to execute shell commands on:
if (node_proc_id==0 && processor_hash_id == get_hash_id("name-of-root-node"))
{
//do whatever
}
//Some hash function
int get_hash_id(const char* s)
{
int h = 37;
while (*s)
{
h = (h * 54059) ^ (s[0] * 76963);
s++;
}
return h;
}
Of course you will need to know the name of the root node.
If it doesn't matter what node it executes on, then I would suggest the following:
int global_proc_id;
int global_proc_num;
//Get global info
MPI_Comm_rank(MPI_COMM_WORLD, &global_proc_id);
MPI_Comm_size(MPI_COMM_WORLD, &global_proc_num);
if (global_proc_id==0)
{
//do whatever
}
global_proc_id==0 will only be true on one node.

Pointer not printing char[] array

I'm writing some code to take in a string, turn it into a char array and then print back to the user (before passing to another function).
Currently the code works up to dat.toCharArray(DatTim,datsize); however, the pointer does not seem to be working as the wile loop never fires
String input = "Test String for Foo";
InputParse(input);
void InputParse (String dat)
//Write Data
datsize = dat.length()+1;
const char DatTim[datsize];
dat.toCharArray(DatTim,datsize);
//Debug print back
for(int i=0;i<datsize;i++)
{
Serial.write(DatTim[i]);
}
Serial.println();
//Debug pointer print back
const char *b;
b=*DatTim;
while (*b)
{
Serial.print(*b);
b++;
}
Foo(*DatTim);
I can't figure out the difference between what I have above vs the template code provided by Majenko
void PrintString(const char *str)
{
const char *p;
p = str;
while (*p)
{
Serial.print(*p);
p++;
}
}

The expression *DatTim is the same as DatTim[0], i.e. it gets the first character in the array and then assigns it to the pointer b (something the compiler should have warned you about).
Arrays naturally decays to pointers to their first element, that is DatTim is equal to &DatTim[0].
The simple solution is to simply do
const char *b = DatTim;

Char vector is modified on copying character in C++

Following is my submission for
https://leetcode.com/problems/letter-combinations-of-a-phone-number
However I see very weird behavior on lines
char c = map1[s[i]][j];
t.replace(i,1,1,c);
The vector map1[s[i]] is changing size on execution of above line in the code below. How can character c be copied by value rather than reference. If I remove reference for string in function argument (meaning no &) everything works fine but it won't be the right way, since copying string by value is costly.
class Solution {
public:
string s;
vector<string> res;
unordered_map<char, vector<char>> map1;
void letterCombinations(string &t, int i){
char c;
if(s.size()==0)
return;
if(i==s.size())
res.push_back(t);
else{
for(int j=0;j<map1[s[i]].size();j++){
//cout<<"Starting:"<<map1[s[i]].size()<<endl;
char c = map1[s[i]][j];
t.replace(i,1,1,c);
letterCombinations(t, i+1);
//cout<<"Return:"<<map1[s[i]].size()<<endl;
}
}
}
vector<string> letterCombinations(string digits) {
stringstream t;
map1['1'] = vector<char>{};
map1['0'] = vector<char>{};
map1['2'] = vector<char>{'a','b','c'};
map1['3'] = vector<char>{'d','e','f'};
map1['4'] = vector<char>{'g','h','i'};
map1['5'] = vector<char>{'j','k','l'};
map1['6'] = vector<char>{'m','n','o'};
map1['7'] = vector<char>{'p','q','r','s'};
map1['8'] = vector<char>{'t','u','v'};
map1['9'] = vector<char>{'w','x','y','z'};
s=digits;
letterCombinations(s,0);
return res;
}
};

Same .txt files, different sizes?

I have a program that reads from a .txt file
I use the cmd prompt to execute the program with the name of the text file to read from.
ex: program.exe myfile.txt
The problem is that sometimes it works, sometimes it doesn't.
The original file is 130KB and doesn't work.
If I copy/paste the contents, the file is 65KB and works.
If I copy/paste the file and rename it, it's 130KB and doesn't work.
Any ideas?
After more testing it shows that this is what makes it not work:
int main(int argc, char *argv[])
{
char *infile1
char tmp[1024] = { 0x0 };
FILE *in;
for (i = 1; i < argc; i++) /* Skip argv[0] (program name). */
{
if (strcmp(argv[i], "-sec") == 0) /* Process optional arguments. */
{
opt = 1; /* This is used as a boolean value. */
/*
* The last argument is argv[argc-1]. Make sure there are
* enough arguments.
*/
if (i + 1 <= argc - 1) /* There are enough arguments in argv. */
{
/*
* Increment 'i' twice so that you don't check these
* arguments the next time through the loop.
*/
i++;
optarg1 = atoi(argv[i]); /* Convert string to int. */
}
}
else /* not -sec */
{
if (infile1 == NULL) {
infile1 = argv[i];
}
else {
if (outfile == NULL) {
outfile = argv[i];
}
}
}
}
in = fopen(infile1, "r");
if (in == NULL)
{
fprintf(stderr, "Unable to open file %s: %s\n", infile1, strerror(errno));
exit(1);
}
while (fgets(tmp, sizeof(tmp), in) != 0)
{
fprintf(stderr, "string is %s.", tmp);
//Rest of code
}
}
Whether it works or not, the code inside the while loop gets executed.
When it works tmp actually has a value.
When it doesn't work tmp has no value.
EDIT:
Thanks to sneftel, we know what the problem is,
For me to use fgetws() instead of fgets(), I need tmp to be a wchar_t* instead of a char*.
Type casting seems to not work.
I tried changing the declaration of tmp to
wchar_t tmp[1024] = { 0x0 };
but I realized that tmp is a parameter in strtok() used elsewhere in my code.
I here is what I tried in that function:
//tmp is passed as the first parameter in parse()
void parse(wchar_t *record, char *delim, char arr[][MAXFLDSIZE], int *fldcnt)
{
if (*record != NULL)
{
char*p = strtok((char*)record, delim);
int fld = 0;
while (p) {
strcpy(arr[fld], p);
fld++;
p = strtok('\0', delim);
}
*fldcnt = fld;
}
else
{
fprintf(stderr, "string is null");
}
}
But typecasting to char* in strtok doesn't work either.
Now I'm looking for a way to just convert the file from UTF-16 to UTF-8 so tmp can be of type char*
I found this which looks like it can be useful but in the example it uses input from the user as UTF-16, how can that input be taken from the file instead?
http://www.cplusplus.com/reference/locale/codecvt/out/

It sounds an awful lot like the original file is UTF-16 encoded. When you copy/paste it in your text editor, you then save the result out as a new (default encoding) (ASCII or UTF-8) text file. Since a single character takes 2 bytes in a UTF-16-encode file but only 1 byte in a UTF-8-encoded file, that results in the file size being roughly halved when you save it out.
UTF-16 is fine, but you'll need to use Unicode-aware functions (that is, not fgets) to work with it. If you don't want to deal with all that Unicode jazz right now, and you don't actually have any non-ASCII characters to deal with in the file, just do the manual conversion (either with your copy/paste or with a command-line utility) before running your program.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Memory allocation of string literal in c - memory-management

Related

Algorithm / data structure for resolving nested interpolated values in this example?

MPI: How to ensure a subroutine is executed only on one processor on the default node?

Pointer not printing char[] array

Char vector is modified on copying character in C++

Same .txt files, different sizes?

Categories

Resources