sort files by ascii order - sorting

I'm trying to write a simple function that sorts the contents of a directory. The thing is, the output comes out in alphabetical order, regardless of uppercase or lowercase. I'd like to sort the contents in ASCII order instead.
Example: I have 4 files named Art, boat, Cat and donkey. My current code lists them in that order, while I'd like to get Art, Cat, boat and donkey.
void list_dir(char *str)
{
    DIR *rep = NULL;
    struct dirent *read_file = NULL;
    rep = opendir(str);
    if (!rep)
    {
        ft_putstr("ft_ls: ");
        perror(str);
        ft_putchar('\n');
        return; /* nothing to read if the directory could not be opened */
    }
    while ((read_file = readdir(rep)) != NULL)
    {
        /* skip hidden entries, including "." and ".." */
        if (read_file->d_name[0] != '.')
        {
            ft_putstr(read_file->d_name);
            ft_putchar('\n');
        }
    }
    closedir(rep);
}

readdir(3) does not normally sort at all; it returns the entries in directory order. If the list looks sorted, either the files were created in sorted order or the OS sorts them.
To sort the output yourself, put the names into an array and then sort it, e.g. with qsort(3) and strcmp(3).
Alternatively, just pipe the output through sort(1). Make sure that the LC_COLLATE environment variable is set properly, because sort(1) orders lines according to the locale's collation rules. For example, run ./yourprogram | (unset LC_ALL; LC_CTYPE=en_US.UTF-8 LC_COLLATE=C sort).
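A minimal sketch of the qsort(3) approach, assuming the names returned by readdir(3) have already been copied into an array of strings (each d_name has to be copied, e.g. with strdup, because the buffer readdir returns may be overwritten by later calls). The helper names compare_names_ascii and sort_names_ascii are illustrative, not part of any library:

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, i.e. pointers to char *. */
static int compare_names_ascii(const void *a, const void *b)
{
    const char *lhs = *(const char *const *)a;
    const char *rhs = *(const char *const *)b;
    /* byte-wise comparison: 'C' (67) sorts before 'b' (98) */
    return strcmp(lhs, rhs);
}

/* Sort an array of n C strings in place, in ASCII (byte) order. */
void sort_names_ascii(char **names, size_t n)
{
    qsort(names, n, sizeof *names, compare_names_ascii);
}
```

Calling sort_names_ascii on the four names from the question and then printing the array in a loop would list the uppercase names first.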

Calling scandir with a user-defined filter and comparator is a simple solution, IMHO. Here is the code:
#include <dirent.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h> /* strcasecmp */
#include <stdio.h>
static int my_dir_filter(const struct dirent* dir);
static int my_dir_comparator(const struct dirent**, const struct dirent**);
int main(int argc, char* const* argv) {
    struct dirent** ent_list_ = NULL;
    /* scandir filters the entries, sorts them, and returns the count (-1 on error) */
    int r = scandir(".", &ent_list_, my_dir_filter, my_dir_comparator);
    for (int i = 0; i < r; ++i)
        printf("No. %-3d [%s]\n", i + 1, ent_list_[i]->d_name);
    for (int i = 0; i < r; ++i)
        free(ent_list_[i]);
    free(ent_list_);
    return r < 0 ? 1 : 0;
}
/* keep regular files only */
int my_dir_filter(const struct dirent* dir) {
    return (dir->d_type == DT_REG) ? 1 : 0;
}
/* case-insensitive comparison of entry names */
int my_dir_comparator(const struct dirent** lhs, const struct dirent** rhs) {
    return strcasecmp((*lhs)->d_name, (*rhs)->d_name);
}
And test result:
$ ls|LANG=C sort ## in ASCII order
Art
Cat
boat
donkey
$ ../a.out ## in my_dir_comparator order
No. 1 [Art]
No. 2 [boat]
No. 3 [Cat]
No. 4 [donkey]
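The comparator above uses strcasecmp, which is why its output is case-insensitive. For the ASCII order the question asks for, a strcmp-based comparator could be sketched as follows (ascii_dir_comparator is an illustrative name, not a library function):

```c
#include <dirent.h>
#include <string.h>

/* Comparator for scandir(3): order entries by the byte values of their names. */
int ascii_dir_comparator(const struct dirent **lhs, const struct dirent **rhs)
{
    return strcmp((*lhs)->d_name, (*rhs)->d_name);
}
```

Passing it as the last argument of scandir in place of my_dir_comparator should put Art and Cat before boat and donkey, matching the LANG=C sort output.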

Related

how to find characters in a string and then remove them if they match a particular character?

/* I have to find character and remove them, based on the character it has to go inside the if/else if condition. I am facing difficulty in getting inside the else if condition */
#include <iostream>
#include <boost/algorithm/string.hpp>
#include <string>
using namespace std;
int main() {
int fut = 0, spd =0;
std::string symbol = "PGSh/d TWOGK h/d"; //it will contain either 'h/d' or '/'
std::string str = "h/d";
std::string str1 = "/";
if(symbol.find(str)) //if it finds "h/d" then it belongs to future
{
++fut; //even one count is enough
boost::erase_all(symbol, "h/d");
std::cout<<"Future Instrument "<<std::endl;
}
else if(symbol.find(str1)) //if it finds "/" then it belongs to spread
{
++spd; //even one count is enough
boost::erase_all(symbol, "//");
std::cout<<"Spread Instrument "<<std::endl;
}
boost::erase_all(symbol, " ");
boost::to_upper(symbol);
std::cout<<symbol<<std::endl;
return 0;
}

... operator in function

Sorry for the noob question. I've been immersed in Java for the past while and the book for this course doesn't cover C++.
I have to fill in a function that adds keywords (of string type) to an Item object. The prototype of the function is as follows.
void addKeywordsForItem(const Item* const item, int nKeywords, ...);
In Java ... returns the remainder of arguments as a String object and I'm guessing C++ does something similar but I don't know the name of ... so searching for it is rather difficult.
What is ... called and what does it do?
There are multiple places where ... is used in C++. In the context in which you are using it, it denotes variadic arguments.
The standard header cstdarg provides a type and macros to help you extract specific arguments from variadic arguments.
Example code from http://en.cppreference.com/w/cpp/utility/variadic/va_start:
#include <iostream>
#include <cstdarg>
int add_nums(int count, ...)
{
    int result = 0;
    va_list args;
    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        result += va_arg(args, int);
    }
    va_end(args);
    return result;
}
int main()
{
    std::cout << add_nums(4, 25, 25, 50, 50) << '\n';
}
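For a prototype like the one in the question, the keywords would presumably be passed as C strings (const char *) rather than std::string objects, since class types are not reliably supported through the ellipsis. Here is a sketch in plain C; the same macros are available in C++ through cstdarg. The function collect_keywords is hypothetical: it stands in for addKeywordsForItem and stores the pointers in a caller-supplied array because the Item type is not shown.

```c
#include <stdarg.h>

/*
 * Copy up to max_out keyword pointers from the variadic arguments into out.
 * The caller must pass exactly n_keywords arguments of type const char *.
 * Returns the number of keywords stored.
 */
int collect_keywords(const char **out, int max_out, int n_keywords, ...)
{
    int stored = 0;
    va_list args;

    va_start(args, n_keywords); /* n_keywords is the last named parameter */
    for (int i = 0; i < n_keywords; ++i) {
        const char *kw = va_arg(args, const char *);
        if (stored < max_out)
            out[stored++] = kw;
    }
    va_end(args);
    return stored;
}
```

The count parameter matters: the callee has no other way to know how many arguments were passed, so a call such as collect_keywords(buf, 8, 2, "red", "fast") has to state the number 2 explicitly.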

As one MPI process executes MPI_Barrier(), other processes hang

I have an MPI program in which multiple processes read from a file that contains a list of file names. For each file name it reads, a process reads the corresponding file and counts the frequency of words.
If one of the processes finishes this and returns to block in MPI_Barrier(), the other processes hang as well. While debugging, I could see that readFile() is not entered by the processes that are still in process_files(). I am unable to figure out why this happens. Please find the code below:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"
void process_files(char*, int* , int, hashtable_t* );
void initialize_word(char *c,int size)
{
int i;
for(i=0;i<size;i++)
c[i]=0;
return;
}
char* readFilesList(MPI_File fh, char* file,int rank, int nprocs, char* block, const int overlap, int* length)
{
char *text;
int blockstart,blockend;
MPI_Offset size;
MPI_Offset blocksize;
MPI_Offset begin;
MPI_Offset end;
MPI_Status status;
MPI_File_open(MPI_COMM_WORLD,file,MPI_MODE_RDONLY,MPI_INFO_NULL,&fh);
MPI_File_get_size(fh,&size);
/*Block size calculation*/
blocksize = size/nprocs;
begin = rank*blocksize;
end = begin+blocksize-1;
end+=overlap;
if(rank==nprocs-1)
end = size;
blocksize = end-begin+1;
text = (char*)malloc((blocksize+1)*sizeof(char));
MPI_File_read_at_all(fh,begin,text,blocksize,MPI_CHAR, &status);
text[blocksize+1]=0;
blockstart = 0;
blockend = blocksize;
if(rank!=0)
{
while(text[blockstart]!='\n' && blockstart!=blockend) blockstart++;
blockstart++;
}
if(rank!=nprocs-1)
{
blockend-=overlap;
while(text[blockend]!='\n'&& blockend!=blocksize) blockend++;
}
blocksize = blockend-blockstart;
block = (char*)malloc((blocksize+1)*sizeof(char));
block = memcpy(block, text + blockstart, blocksize);
block[blocksize]=0;
*length = strlen(block);
MPI_File_close(&fh);
return block;
}
void calculate_term_frequencies(char* file, char* text, hashtable_t *hashtable,int rank)
{
printf("Start File %s, rank %d \n\n ",file,rank);
fflush(stdout);
if(strlen(text)!=0||strlen(file)!=0)
{
int i,j;
char w[100];
i=0,j=0;
while(text[i]!=0)
{
if((text[i]>=65&&text[i]<=90)||(text[i]>=97&&text[i]<=122))
{
w[j]=text[i];
j++; i++;
}
else
{
w[j] = 0;
if(j!=0)
{
//ht_set( hashtable, strcat(strcat(w,"#"),file),1);
}
j=0;
i++;
initialize_word(w,100);
}
}
}
return;
}
void readFile(char* filename, hashtable_t *hashtable,int rank)
{
MPI_Status stat;
MPI_Offset size;
MPI_File fx;
char* textFromFile=0;
printf("Start File %d, rank %d \n\n ",strlen(filename),rank);
fflush(stdout);
if(strlen(filename)!=0)
{
MPI_File_open(MPI_COMM_WORLD,filename,MPI_MODE_RDONLY,MPI_INFO_NULL,&fx);
MPI_File_get_size(fx,&size);
printf("Start File %s, rank %d \n\n ",filename,rank);
fflush(stdout);
textFromFile = (char*)malloc((size+1)*sizeof(char));
MPI_File_read_at_all(fx,0,textFromFile,size,MPI_CHAR, &stat);
textFromFile[size]=0;
calculate_term_frequencies(filename, textFromFile, hashtable,rank);
MPI_File_close(&fx);
}
printf("Done File %s, rank %d \n\n ",filename,rank);
fflush(stdout);
return;
}
void process_files(char* block, int* length, int rank,hashtable_t *hashtable)
{
char s[2];
s[0] = '\n';
s[1] = 0;
char *file;
if(*length!=0)
{
/* get the first file */
file = strtok(block, s);
/* walk through other tokens */
while( file != NULL )
{
readFile(file,hashtable,rank);
file = strtok(NULL, s);
}
}
return;
}
void execute_process(MPI_File fh, char* file, int rank, int nprocs, char* block, const int overlap, int * length, hashtable_t *hashtable)
{
block = readFilesList(fh,file,rank,nprocs,block,overlap,length);
process_files(block,length,rank,hashtable);
}
int main(int argc, char *argv[]){
/*Initialization*/
MPI_Init(&argc, &argv);
MPI_File fh=0;
int rank,nprocs,namelen;
char *block=0;
const int overlap = 70;
char* file = "filepaths.txt";
int *length = (int*)malloc(sizeof(int));
hashtable_t *hashtable = ht_create( 65536 );
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Get_processor_name(processor_name, &namelen);
printf("Rank %d is on processor %s\n",rank,processor_name);
fflush(stdout);
execute_process(fh,file,rank,nprocs,block,overlap,length,hashtable);
printf("Rank %d returned after processing\n",rank);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Finalize();
return 0;
}
filepaths.txt is a file that contains the absolute paths of ordinary text files, e.g.:
/home/mpiuser/mpi/MPI_Codes/code/test1.txt
/home/mpiuser/mpi/MPI_Codes/code/test2.txt
/home/mpiuser/mpi/MPI_Codes/code/test3.txt
Your readFilesList function is pretty confusing, and I believe it doesn't do what you want it to do, but maybe I just don't understand it correctly. I believe it is supposed to collect a different set of file names from the list file for each process. It does not do that, but that is not the main problem: even if it did, the subsequent MPI I/O would not work.
When reading the files, you call MPI_File_open and MPI_File_read_at_all with MPI_COMM_WORLD as the communicator. These are collective calls, so every process in the communicator has to take part in reading the same file. If each process is supposed to read a different file, this obviously is not going to work.
So there are several issues with your implementation. Although I cannot fully explain the behavior you describe, I would first fix these issues before debugging in detail what else might go wrong.
I am under the impression that you want an algorithm along these lines:
1. Read a list of file names
2. Distribute that list of files equally to all processes
3. Have each process work on its own set of files
4. Do something with the data from this processing
I would suggest trying the following approach:
1. Read the list on a single process (no MPI I/O)
2. Scatter the list of files to all processes, such that each gets about the same amount of work
3. Have each process work on its list of files independently and serially (serial file access and processing)
4. Reduce the data with MPI, as needed
I believe this would be the best (easiest and fastest) strategy in your scenario. Note that no MPI I/O is involved here at all. I don't think a complicated distributed read of the file list in the first step would bring any advantage, and in the actual processing it would actually be harmful. The more independent your processes are, the better your scalability usually is.
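The "about the same amount of work" step can be sketched without any MPI calls. The hypothetical helper split_evenly below computes, for each rank, how many list entries it gets and where its share starts, in the per-rank layout that MPI_Scatterv takes for its sendcounts and displs arguments. The values are in units of list entries; converting them to bytes for fixed-width name records is left out.

```c
/*
 * Split n_items as evenly as possible among n_procs ranks.
 * counts[r] receives the number of items for rank r and displs[r] the index
 * of its first item; both arrays must have room for n_procs entries.
 */
void split_evenly(int n_items, int n_procs, int *counts, int *displs)
{
    int base = n_items / n_procs;  /* every rank gets at least this many */
    int extra = n_items % n_procs; /* the first `extra` ranks get one more */
    int offset = 0;

    for (int r = 0; r < n_procs; ++r) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

With this split no rank gets more than one file more than any other, and a rank may legitimately receive zero files when there are fewer files than processes.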

Main C Program does not find header and methods

I wasn't really sure how to explain this any better in the title. Basically, I am learning how to split my code across files in C. I have a main file, the equivalent of an ArrayList class from Java (converted to C and very basic), and a header file that declares my struct and all the functions in use. I am using sample code from the textbook and the latest version of Dev-C++ on Windows 8.
Every time I try to compile main I get:
In function main undefined reference to "newList"
[Error] ld returned 1 exit status
Here is my code:
main.c
#include <stdio.h>
#include "ArrayList.h"
int main(int numParms, char *parms[]){
list myList;
myList = newList(myList);
printf("End");
return 0;
}
ArrayList.c
#include <stdio.h>
#include "ArrayList.h"
list newList(list myList){
myList.size = 0;
return myList;
}
list add(list myList, int value){
myList.values[myList.size] = value;
myList.size++;
return myList;
}
int get(list myList, int position){
int entry;
entry = myList.values[position];
return entry;
}
int size(list myList){
return myList.size;
}
list delete(list myList, int position){
int count;
for(count = position; count<(myList.size-1); count++){ /* shift the later elements left over the deleted slot */
myList.values[count] = myList.values[count+1];
}
myList.size --;
return myList;
}
void print(list myList){
int count;
printf("Current list contents:\n");
if (myList.size > 0){
for (count=0; count<myList.size; count++){
printf("Element %d is %d\n", count, get(myList, count));
}
printf("\n");
}
else{
printf("The list is empty\n\n");
}
}
ArrayList.h
#define MAX_SIZE 100
typedef struct{
int size;
int values[MAX_SIZE];
}list;
list newList(list);
list add(list, int);
int get(list, int);
int size(list);
list delete(list, int);
void print(list);
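That is actually a linker problem. Compilation succeeds, but when the linker tries to assemble the pieces, it cannot find newList anywhere. My guess is that you did not compile ArrayList.c and link the result into your program. Both source files have to be part of the same build: in an IDE that usually means adding both to one project, and on the command line something like gcc main.c ArrayList.c -o main.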

Sorting a structure of arrays using Thrust

I am working with CUDA and facing the following problem.
I have the following structure of arrays:
typedef struct Edge{
    int *from, *to, *weight;
} Edge;
I want to sort this structure by the weight array such that the corresponding "from" and "to" arrays are reordered too. I thought of using the Thrust library, but as far as I understand it works only on vectors. I can use sort_by_key to get two arrays sorted, but I don't understand how to sort three. I even looked at zip_iterator but did not understand how to use it for my purpose. Please help.
First decouple the structure into 1) keys and 2) paddings (the values carried along with each key). Then sort the keys and reorder the paddings accordingly. For example, break this structure:
typedef struct Edge{
    int from, to, weight;
} Edge;
into:
int weight[N];
typedef struct Edge{
    int from, to;
} Edge;
The full code is here:
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/generate.h>
#include <thrust/copy.h>
#include <thrust/sort.h>
#include <cstdlib>   // rand
#include <iostream>
#include <iterator>  // std::ostream_iterator
typedef struct pad {
    int from;
    int to;
} padd;
__host__ padd randPad() {
    padd p;
    p.from = rand();
    p.to = rand();
    return p;
}
// prints a padding as (to , from)
__host__ std::ostream& operator<< (std::ostream& os, const padd& p) {
    os << "(" << p.to << " , " << p.from << " )";
    return os;
}
int main(void)
{
    // allocation
    #define N 4
    thrust::host_vector<int> h_keys(N);
    thrust::host_vector<padd> h_pad(N);
    // initialization
    thrust::generate(h_keys.begin(), h_keys.end(), rand);
    thrust::generate(h_pad.begin(), h_pad.end(), randPad);
    // print unsorted data
    std::cout<<"Unsorted keys\n";
    thrust::copy(h_keys.begin(), h_keys.end(), std::ostream_iterator<int>(std::cout, "\n"));
    std::cout<<"\nUnsorted paddings\n";
    thrust::copy(h_pad.begin(), h_pad.end(), std::ostream_iterator<padd>(std::cout, "\n"));
    // transfer to device
    thrust::device_vector<int> d_keys = h_keys;
    thrust::device_vector<padd> d_pad = h_pad;
    // sort the keys and reorder the paddings along with them
    thrust::sort_by_key(d_keys.begin(), d_keys.end(), d_pad.begin());
    // transfer back to host
    thrust::copy(d_keys.begin(), d_keys.end(), h_keys.begin());
    thrust::copy(d_pad.begin(), d_pad.end(), h_pad.begin());
    // print the results
    std::cout<<"\nSorted keys\n";
    thrust::copy(h_keys.begin(), h_keys.end(), std::ostream_iterator<int>(std::cout, "\n"));
    std::cout<<"\nSorted paddings\n";
    thrust::copy(h_pad.begin(), h_pad.end(), std::ostream_iterator<padd>(std::cout, "\n"));
    return 0;
}
The output would be something like this:
Unsorted keys
1804289383
846930886
1681692777
1714636915
Unsorted paddings
(424238335 , 1957747793 )
(1649760492 , 719885386 )
(1189641421 , 596516649 )
(1350490027 , 1025202362 )
Sorted keys
846930886
1681692777
1714636915
1804289383
Sorted paddings
(1649760492 , 719885386 )
(1189641421 , 596516649 )
(1350490027 , 1025202362 )
(424238335 , 1957747793 )
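The idea of sorting the keys and reordering the rest accordingly also works when the data stays in the three separate arrays from the question. The sketch below is plain host-side C, not Thrust code, and only illustrates the principle; KeyIndex, compare_by_weight and sort_edges_by_weight are hypothetical names. It sorts (weight, original index) pairs and then gathers from and to through the resulting permutation. In Thrust, the zip_iterator mentioned in the question is the usual way to carry several value arrays through sort_by_key.

```c
#include <stdlib.h>

/* One sort record: a weight and the position it came from. */
typedef struct {
    int weight;
    size_t index;
} KeyIndex;

static int compare_by_weight(const void *a, const void *b)
{
    const KeyIndex *x = a;
    const KeyIndex *y = b;
    if (x->weight != y->weight)
        return (x->weight > y->weight) - (x->weight < y->weight);
    /* equal weights: keep the original relative order */
    return (x->index > y->index) - (x->index < y->index);
}

/*
 * Sort the parallel arrays from/to/weight (n entries each) by weight.
 * Returns 0 on success, -1 if temporary memory could not be allocated.
 */
int sort_edges_by_weight(int *from, int *to, int *weight, size_t n)
{
    if (n == 0)
        return 0;

    KeyIndex *keys = malloc(n * sizeof *keys);
    int *old_from = malloc(n * sizeof *old_from);
    int *old_to = malloc(n * sizeof *old_to);
    if (!keys || !old_from || !old_to) {
        free(keys);
        free(old_from);
        free(old_to);
        return -1;
    }

    for (size_t i = 0; i < n; ++i) {
        keys[i].weight = weight[i];
        keys[i].index = i;
        old_from[i] = from[i];
        old_to[i] = to[i];
    }

    qsort(keys, n, sizeof *keys, compare_by_weight);

    /* gather: entry i of the result comes from position keys[i].index */
    for (size_t i = 0; i < n; ++i) {
        weight[i] = keys[i].weight;
        from[i] = old_from[keys[i].index];
        to[i] = old_to[keys[i].index];
    }

    free(keys);
    free(old_from);
    free(old_to);
    return 0;
}
```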

Resources