How to Write A Second Column in Bash in an Existing txt file - bash

I need to extract the ID name of a parent directory and put that in a tab-delimited text file. Then I need to extract the names of the contents of that folder and put them in the same row as the ID name I first extracted. Essentially, Column 1 should list the directory name from the parent, Column 2 should list the name of the first file in that directory, Column 3 should be the name of the next file, and so on.
My directory structure looks like:
/path/to/folder/ID/
and I extract the ID with:
pwd | xargs echo | awk -F "/" '{print $n; exit}' >> Text.txt
where 'n' is the location of the desired parent folder (in this case, ID). This works fine, and writes something like "ID001" to my Text.txt file.
I try the same little hack again, using my pwd as my input to xargs, listing out the contents of that folder, and writing the names to my Text.txt file:
pwd | xargs echo | awk -F "/" '{print $7; exit}' >> Text.txt | pwd | xargs echo | xargs ls | xargs echo >> Text.txt
But instead of
ID001 file1 file2
I get
file1 file2
ID001
This is mostly to be expected, given the commands, but I am confused as to why my file names are being written to the first row and not to the last row. The only related article I could find was this one on writing a specific column to a CSV, but it wasn't quite what I was looking for.

This find plus awk pipeline MAY be what you're trying to do:
$ ls tmp
a b
$ find tmp -print | awk '{sub("^[^/]+/",""); printf "%s%s", sep, $0; sep="\t"} END{print ""}'
tmp a b
YMMV if your file names contain tabs or newlines of course.
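Applied to the layout in the question, the same pipeline run from /path/to/folder would build the whole row in one go (a sketch, reusing the ID001 and Text.txt names from the question):
$ find ID001 -print | awk '{sub("^[^/]+/",""); printf "%s%s", sep, $0; sep="\t"} END{print ""}' >> Text.txt
That appends a single tab-separated row: ID001, then each file name in the directory.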

You probably want to do that as multiple commands, for ease of understanding.
You can put the commands in a bash script.
Example scenario
$ pwd
/Users/pa357856/test/tmp/foo
$ ls
file1.txt file2.txt
commands -
$ parentDIR=`pwd | xargs echo | awk -F "/" '{print $6}'`
$ filesList=`ls`
$ echo "$parentDIR" "$filesList" >> test.txt
Result -
$ cat test.txt
foo file1.txt file2.txt
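Wrapped into a small script, the same steps might look like this (a minimal sketch; the field number 6 and the test.txt name come from the example above and depend on your own path depth and naming):
#!/bin/bash
parentDIR=$(pwd | awk -F "/" '{print $6}')   # parent directory name (adjust the field number to your path depth)
filesList=$(ls)                              # file names in the current directory
echo "$parentDIR" $filesList >> test.txt     # one row: directory name, then its files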

Related

How to save a list of all files in a directory in a single text file and add prefixes and suffixes?

I am trying to save a list of files in a directory into a single file using
ls > output.txt
Let's say we have in the directory:
a.txt
b.txt
c.txt
I want to modify the names of these files in the output.txt to be like:
1a.txt$
1b.txt$
1c.txt$
Another easy way is to use awk to change the content and save it back via a .tmp file.
This command prints the content the way you want, adding "1" at the beginning and "$" at the end of each line:
cat output.txt | awk '{print "1"$1"$"}'
You can then save it back to the original file by chaining the commands with && (the second runs only if the first succeeds):
cat output.txt | awk '{print "1"$1"$"}' > output.txt.tmp && mv output.txt.tmp output.txt
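If the markers are only needed inside output.txt (and not in the actual file names), a single sed pass is another option; a sketch using the same output.txt name:
ls | sed 's/^/1/; s/$/$/' > output.txt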
#!/bin/sh -x
# Renames the actual files, prepending "1" to each .txt name
for f in *.txt
do
    nf=$(echo "${f}" | sed 's#^#1#')   # new name with the "1" prefix
    mv -v "${f}" "${nf}"
done

How to rename a CSV file from a value in the CSV file

I have 100 1-line CSV files. The files are currently labeled AAA.txt, AAB.txt, ABB.txt (after I used split -l 1 on them). The first field in each of these files is what I want to rename the file as, so instead of AAA, AAB and ABB it would be the first value.
Input CSV (filename AAA.txt)
1234ABC, stuff, stuff
Desired Output (filename 1234ABC.csv)
1234ABC, stuff, stuff
I don't want to edit the content of the CSV itself, just change the filename
something like this should work:
for f in ./*; do new_name=$(head -1 "$f" | cut -d, -f1); cp "$f" dir/"$new_name"; done
Copy them into a new dir just in case something goes wrong, or you need the original file names.
Or, starting with your original file before splitting,
$ awk -F, '{print > ($1".csv")}' originalFile.csv
and do it all in one shot.
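With the one-line sample from the question, that one-shot command behaves like this (a sketch, assuming originalFile.csv holds the pre-split data):
$ awk -F, '{print > ($1".csv")}' originalFile.csv
$ cat 1234ABC.csv
1234ABC, stuff, stuff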
This will store each whole line into a .csv file named after column 1 of the input file.
awk -F, '{print $0 > ($1".csv")}' aaa.txt
In a terminal, change directory (e.g. cd /path/to/directory) to where the files are, and then use the following compound command:
for f in *.txt; do echo mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"; done
Note: There is an intentional echo command there for you to test with; it will only print out the mv commands so you can see that the outcome is what you wish. You can then run it again, removing just echo from the compound command, to actually rename the files as desired via the mv command.

Append xargs argument number as prefix

I want to analyze the most frequently occurring entries in a column of a logfile. To write the detailed results, I am creating new directories from the output of something along the lines of:
cat logs| cut -d',' -f 6 | sort | uniq -c | sort -rn | head -10 | \
awk '{print $2}' |xargs mkdir -p
Is there a way to create the directories with the sequence number of the argument as processed by xargs as a prefix? For example, if "oranges" is the most frequent entry (of the column), the directory created should be named "1.oranges", and so on.
A quick (and dirty?) solution could be to pipe your directory names through cat -n in their proper order and then remove the whitespace separating the line number from the directory name, before passing them to xargs.
A better solution would be to modify your awk command:
... | awk '{ print NR "." $2 }' | xargs mkdir -p
The NR variable contains the record (i.e. line) number.
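Putting that together with the original pipeline from the question (a sketch; the column number, head -10, and log file name are the question's):
cat logs | cut -d',' -f 6 | sort | uniq -c | sort -rn | head -10 | \
awk '{ print NR "." $2 }' | xargs mkdir -p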

Parsing CSV file in bash script [duplicate]

I am trying to parse a CSV file, which contains a typical access control matrix table, in a shell script. My sample CSV file would be:
"user","admin","security"
"user1","x",""
"user2","","x"
"user3","x","x"
I would be using this list in order to create files in their respective folders. The problem is how do I get it to store the values of column 2/3 (admin/security)? The output I'm trying to achieve is to group/sort all users that have admin/security rights and create files in their respective folders. (My idea is to probably store all admin/security users into different files and run from there.)
The environment does not allow me to use any Perl or Python programs. However any awk or sed commands are greatly appreciated.
My desired output would be
$ cat sample.csv
"user","admin","security"
"user1","x",""
"user2","","x"
"user3","x","x"
$ cat security.csv
user2
user3
$ cat admin.csv
user1
user3
If you can use cut(1) (which you probably can if you're on any type of unix), you can use
cut -d , -f (n) (file)
where n is the column you want.
You can use a range of columns (2-3) or a list of columns (1,3).
This will leave the quotes but you can use a sed command or something light-weight for that.
$ cat sample.csv
"user","admin","security"
"user1","x",""
"user2","","x"
"user3","x","x"
$ cut -d , -f 2 sample.csv
"admin"
"x"
""
"x"
$ cut -d , -f 3 sample.csv
"security"
""
"x"
"x"
$ cut -d , -f 2-3 sample.csv
"admin","security"
"x",""
"","x"
"x","x"
$ cut -d , -f 1,3 sample.csv
"user","security"
"user1",""
"user2","x"
"user3","x"
Note that this won't work for general CSV files (it doesn't deal with escaped commas), but it should work for files similar to the format in the example, with simple usernames and x's.
If you want to just grab the list of usernames, then awk is pretty much the tool made for the job, and an answer below does a good job that I don't need to repeat.
But a grep solution might be quicker and more lightweight.
The grep solution:
grep '^\([^,]\+,\)\{N\}"x"'
where N is the Nth column, with the users being column 0.
$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv
"user1","x",""
"user3","x","x"
$ grep '^\([^,]\+,\)\{2\}"x"' sample.csv
"user2","","x"
"user3","x","x"
from there on you can use cut to get the first column:
$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv | cut -d , -f 1
"user1"
"user3"
and sed 's/"//g' to get rid of quotes:
$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g'
user1
user3
$ grep '^\([^,]\+,\)\{2\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g'
user2
user3
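To produce the admin.csv and security.csv files from the desired output, those pipelines can simply be redirected (a sketch; the file names come from the question):
$ grep '^\([^,]\+,\)\{1\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g' > admin.csv
$ grep '^\([^,]\+,\)\{2\}"x"' sample.csv | cut -d , -f 1 | sed 's/"//g' > security.csv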
Something to get you started (please note this will not work for csv files with embedded commas and you will have to use a csv parser):
awk -F, '
NR>1 {
    gsub(/["]/,"",$0);                  # strip the quotes (reassigning $0 re-splits the fields)
    if($2!="" && $3!="")
        print $1 " has both privileges";
    print $1 > "file"                   # collect every user name in a file named "file"
}' csv
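A possible extension of that starter so it writes the admin.csv and security.csv files described in the question (a sketch; the output file names are taken from the question's desired output, not from the answer above):
awk -F, '
NR>1 {
    gsub(/["]/,"",$0)                        # strip the quotes
    if ($2 != "") print $1 > "admin.csv"     # column 2 marked with x -> admin
    if ($3 != "") print $1 > "security.csv"  # column 3 marked with x -> security
}' sample.csv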

Searching for Strings

I would like to have a shell script that searches two files and returns a list of strings:
File A contains just a list of unique alphanumeric strings, one per line, like this:
accc_34343
GH_HF_223232
cwww_34343
jej_222
File B contains a list of SOME of those strings (sometimes more than once), and a second column of information, like this:
accc_34343 dog
accc_34343 cat
jej_222 cat
jej_222 horse
I would like to create a third file that contains a list of the strings from File A that are NOT in File B.
I've tried using some loops with grep -v, but that doesn't work. So, in the above example, the new file would have this as its contents:
GH_HF_223232
cwww_34343
Any help is greatly appreciated!
Here's what you can do:
grep -v -f <(awk '{print $1}' file_b) file_a > file_c
Explanation:
grep -v : Use -v option to grep to invert the matching
-f : Use -f option to grep to specify that the patterns are from file
<(awk '{print $1}' file_b): The <(awk '{print $1}' file_b) is to simply extract the first column values from file_b without using a temp file; the <( ... ) syntax is process substitution.
file_a : Tell grep that the file to be searched is file_a
> file_c : Output to be written to file_c
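With the example File A and File B from the question, this produces:
$ grep -v -f <(awk '{print $1}' file_b) file_a
GH_HF_223232
cwww_34343
(The patterns are treated as unanchored substrings here, which is fine for this sample but could over-match on similar IDs.)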
comm is used to find intersections and differences between files:
comm -23 <(sort fileA) <(cut -d' ' -f1 fileB | sort -u)
result:
GH_HF_223232
cwww_34343
I assume your shell is bash/zsh/ksh
awk 'FNR==NR{a[$1];next} !($0 in a)' fileB fileA
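With the sample files, that prints the strings from File A that never appear in File B:
$ awk 'FNR==NR{a[$1];next} !($0 in a)' fileB fileA
GH_HF_223232
cwww_34343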
