Populate a value in a particular column in csv - shell

I have a folder where there are 50 excel sheets in CSV format. I have to populate a particular value say "XYZ" in the column I of all the sheets in that folder.
I am new to Unix and have looked at a couple of pages, here and here. Can anyone please provide a sample script to begin with?
For example :
Let's say column C in this case:
A B C
ASFD 2535
BDFG 64486
DFGC 336846
I want to update column C to value "XYZ".
Thanks.

I would export those files in CSV format
- with a semicolon as the field separator
- possibly leaving out the column headers (otherwise see the comment below)
Then the following combination of shell and sed script could more or less do the trick:
#! /bin/sh
for i in *.csv
do
sed -i -e 's/$/;XYZ/' "$i"
done
-i edits the file in place; here it appends the value to every line
-e specifies the sed expression, in this case a substitution
If the CSV files also contain column headers, you might want a similar script that puts "C" instead of "XYZ" on the first line only.
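A header-aware variant of the loop above might look like the sketch below. It assumes a semicolon separator and a header on the first line of every file. The function name append_column and its arguments are made up for illustration, and it writes through a temporary file instead of relying on sed -i, which behaves differently between GNU and BSD sed.

```shell
# append_column DIR HEADER VALUE   (hypothetical helper, not part of the original answer)
# Appends ";HEADER" to line 1 and ";VALUE" to every other line of each DIR/*.csv.
# Assumes HEADER and VALUE contain no '/', '&' or '\', since they are pasted
# straight into a sed expression.
append_column() {
    dir=$1 header=$2 value=$3
    for f in "$dir"/*.csv; do
        [ -f "$f" ] || continue        # skip the unexpanded pattern if nothing matches
        tmp=$(mktemp) || return 1
        # \$ keeps a literal $ (end of line / last line) inside the double quotes
        if sed -e "1s/\$/;$header/" -e "2,\$s/\$/;$value/" "$f" > "$tmp"; then
            cat "$tmp" > "$f"          # overwrite the content, keep the file's permissions
        fi
        rm -f "$tmp"
    done
}
```

Calling append_column . C XYZ in the folder with the exported files would then add the new column to all of them. Try it on a copy of the folder first.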

Related

How to check whether the following files exist, based on a condition, in a shell script?

I have the following scenario.
In the path $path1 I have this list of files:
LINUX-7.1.0.00.00-010.RHEL6.DEBUG.i386.rpm
LINUX-7.1.0.00.00-010.RHEL6.DEBUG.x86_64.rpm
LINUX-7.1.0.00.00-010.RHEL6.i386.rpm
LINUX-7.1.0.00.00-010.RHEL6.x86_64.rpm
LINUX-7.1.0.00.00-010.RHEL7.DEBUG.x86_64.rpm
LINUX-7.1.0.00.00-010.RHEL7.x86_64.rpm
LINUX-7.1.0.00.00-010.SLES12SP4.DEBUG.x86_64.rpm
LINUX-7.1.0.00.00-010.SLES12SP4.x86_64.rpm
In $path2 I have these files:
7.1.0.00.00-010 - (build.major).(build.minor).(build.servicepack).(build.patch).(build.hotfix)-(build.number)
build.major - 7
build.minor - 1
build.servicepack - 0
build.patch - 0
build.hotfix - 0
build.number - 010
I need to check whether a particular list of files exists. If it does, the script should carry on with some steps; otherwise it should exit.
As Barmar said, this website is more aimed at solving technical issues.
Assuming you don't know where to start, I would approach the problem with the following steps:
"cat" the input file and use "awk" to extract the 3rd column
use the output in a for loop to iterate through the lines (although you could do it with awk directly), concatenating the parts into a variable (called tmp, for example)
look for the files that have $tmp in their name.
So, in shell, you can use awk to select a column from text input, iterate directly over the lines of a text stream with a for loop, and insert the value of a variable into a string using $myVariable.
You're now on track!
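The existence check itself can be sketched as follows. This is only one possible shape: the helper name require_files is hypothetical, and which package names must be present is an assumption, since the question does not say.

```shell
# require_files DIR NAME...   (hypothetical helper)
# Returns 0 only if every NAME exists as a regular file in DIR;
# reports the first missing one on stderr and returns 1 otherwise.
require_files() {
    dir=$1
    shift
    for name in "$@"; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $dir/$name" >&2
            return 1
        fi
    done
    return 0
}

# Possible use, assuming the build.* values were already read into variables:
#   version="$major.$minor.$servicepack.$patch.$hotfix-$number"
#   require_files "$path1" "LINUX-$version.RHEL6.x86_64.rpm" \
#                          "LINUX-$version.RHEL7.x86_64.rpm" || exit 1
```

Returning a status instead of calling exit inside the function leaves the decision to the caller, which matches the "follow some steps, else exit" requirement.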

how to merge multiple text files using bash and preserving column order

I'm new to bash. I have a folder containing many text files, among them a group named namefile-0, namefile-1, ... namefile-100. I need to merge all of these files into a new file. The format of each of these files is: a header and 3 columns of data.
It is very important that the format of the new file is:
3 * 100 columns of data respecting the order of the columns (123123123...).
I don't mind if the header is also repeated or not.
I'm also willing, in case it was necessary, to place all these files in a folder in which no other files are present.
I've tried to do something like this:
for i in {1..100}
do
paste `echo "namefile$i"` >> `echo "b"`
done
which prints only the first file into b.
I've also tried to do this:
STR=""
for i in {1..100}
do
STR=$STR"namefile"$i" "
done
paste $STR > b
which prints everything but does not preserve the order of the columns.
You need to mention which delimiter separates the columns in your files.
Assuming the columns are separated by a single space,
paste -d' ' namefile-* > newfile
Other conditions, such as other similar files or directories in the working directory or stripping the headers, can also be handled, but the question would need to provide more information.
paste namefile-{0..100} >combined
paste namefile* > new_file_name
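One thing to watch with the glob-based commands: the shell expands namefile-* in lexical order (namefile-0, namefile-1, namefile-10, namefile-100, namefile-11, ...), so the column groups do not come out in numeric order, whereas the brace expansion namefile-{0..100} does list them numerically but is a bash feature. A minimal POSIX sh sketch that builds the argument list in numeric order is shown below; the function name merge_numbered is made up, and it assumes every file from 0 to the last index exists.

```shell
# merge_numbered PREFIX LAST OUT   (hypothetical helper)
# Pastes PREFIX0, PREFIX1, ... PREFIXLAST side by side, in numeric order, into OUT.
# paste joins the files with a tab by default.
merge_numbered() {
    prefix=$1 last=$2 out=$3
    set --                          # reuse the positional parameters as the file list
    i=0
    while [ "$i" -le "$last" ]; do
        set -- "$@" "$prefix$i"
        i=$((i + 1))
    done
    paste "$@" > "$out"
}
```

For the files in the question this would be called as merge_numbered namefile- 100 combined.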

AIX script for file information

I have a file, in AIX server, with multiple record entries in below format
Name(ABC XYZ) Gender(Male)
AGE(26) BDay(1990-12-09)
My problem is I want to extract the name and the b'day from the file for all the records. I am trying to list it like below:
ABC XYZ 1990-12-09
Can someone please help me with the scripting
Something like this maybe:
awk -F"[()]" '/Name/ && /Gender/{name=$2} /BDay/{print name,$4}' file.txt
That says... "treat opening and closing parentheses as field separators. If you see a line go by that contains Name and Gender, save the second field in the variable name. If you see a line go by that contains the word BDay, print out the last name you saw and also the fourth field on the current line."

Create CSV from specific columns in another CSV using shell scripting

I have a CSV file with several thousand lines, and I need to take some of the columns in that file to create another CSV file to use for import to a database.
I'm not in shape with shell scripting anymore, is there anyone who can help with pointing me in the correct direction?
I have a bash script to read the source file but when I try to print the columns I want to a new file it just doesn't work.
while IFS=, read symbol tr_ven tr_date sec_type sec_name name
do
echo "$name,$name,$symbol" >> output.csv
done < test.csv
Above is the code I have. Out of the 6 columns in the original file, I want to build a CSV with "column6, column6, column1".
The test CSV file is like this:
Symbol,Trading Venue,Trading Date,Security Type,Security Name,Company Name
AAAIF,Grey Market,22/01/2015,Fund,,Alternative Investment Trust
AAALF,Grey Market,22/01/2015,Ordinary Shares,,Aareal Bank AG
AAARF,Grey Market,22/01/2015,Ordinary Shares,,Aluar Aluminio Argentino S.A.I.C.
What am I doing wrong with my script? Or, is there an easier - and faster - way of doing this?
Edit
These are the real headers:
Symbol,US Trading Venue,Trading Date,OTC Tier,Caveat Emptor,Security Type,Security Class,Security Name,REG_SHO,Rule_3210,Country of Domicile,Company Name
I'm trying to get the last column, which is number 12, but it always comes up empty.
The snippet looks fine and works for me. Maybe you have some weird characters in the file, or it comes from a DOS environment (use dos2unix to "clean" it!). Also, you can use read -r to prevent strange behaviour with backslashes.
But let's see how awk can solve this even faster:
awk 'BEGIN{FS=OFS=","} {print $6,$6,$1}' test.csv >> output.csv
Explanation
BEGIN{FS=OFS=","} this sets the input and output field separators to the comma. Alternatively, you can say -F"," or -F, or pass it as a variable with -v FS=",". OFS can be set with -v OFS="," in the same way.
{print $6,$6,$1} prints the 6th field twice and then the 1st one. Note that using print, every comma-separated parameter that you give will be printed with the OFS that was previously set. Here, with a comma.
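For the real 12-column file, where the last column keeps coming up empty, a sketch along these lines may help. It assumes the cause is DOS line endings (the question does not confirm this), strips a trailing carriage return before splitting, and uses $NF so it picks the last column whatever the column count. The function name last_last_first is hypothetical, and like the command above it splits on every comma, so quoted fields that contain commas are not handled.

```shell
# last_last_first FILE   (hypothetical helper)
# Prints "last column,last column,first column" for each line of FILE.
last_last_first() {
    awk 'BEGIN { FS = OFS = "," }
         { sub(/\r$/, "") }          # drop a trailing CR; changing $0 re-splits the fields
         { print $NF, $NF, $1 }' "$1"
}
```

Running last_last_first test.csv > output.csv would then write the new file in one pass.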

Grep and substitute?

I have to parse ASCII files, output the relevant data to a comma-delimited file and load it into a database table.
The specs for the file format have been recently updated and one section is causing problems. This is the original layout for that section.
CSVHeaderAttr:PUIS,IdleImmediate,POH,Temp,WorstTemp
CSVValuesAttr:NO,NO,9814,31,56
I parse it with grep thusly
CSVAttributes=$(grep ^CSVValuesAttr: ${filename}|cut -d':' -f2)
[ -z "$CSVAttributes" ] && CSVAttributes="NA"
It works great, but the section now has new fields and they are named differently:
CSVHeaderAttr:PUIS,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
CSVValuesAttr:NO,YES,YES,23861,31,51
Right now, I am grepping the files based on their layout (there is a field in the header which tells me the version of the layout) into two different comma-delimited files and loading them into two different tables. I would like to output both sections to the same file so the data scientist only has one table to use in his analysis.
Is there a way to use grep to produce an output like this and substitute empty fields with NA?
For one file type:
CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
CSVValuesAttr:NO,NO,NA,NA,9814,31,56
For the other file type:
CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
CSVValuesAttr:NO,NA,YES,YES,23861,31,51
Thanks for your input.
sed -n '/CSVHeaderAttr:/ c\
CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
/CSVValuesAttr:/ {
/\([^,]*,\)\{5\}/ s/\([^,]*,\)/&NA,/
t p
s/\(\([^,]*,\)\{2\}\)/\1NA,NA,/
# t p
: p
p
}' AllYourFiles > ConcatFile
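This uses sed to test how many columns the values line has (with /\([^,]*,\)\{5\}/, which matches only when the line contains at least five commas, i.e. the new layout) before inserting the NA placeholders: one NA after the first field for the new layout, two NAs after the second field for the old one. The header line is simply replaced with the combined header.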
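The same idea can also be written with awk, which some may find easier to read than the sed branches. This is an alternative sketch, not the answer's method. It assumes only the two layouts shown in the question exist (five or six comma-separated values), and the function name normalize_attrs is made up.

```shell
# normalize_attrs FILE...   (hypothetical helper)
# Rewrites both layouts to the combined 7-column layout, filling the gaps with NA.
normalize_attrs() {
    awk -F, 'BEGIN { OFS = "," }
        /^CSVHeaderAttr:/ {
            print "CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp"
            next
        }
        /^CSVValuesAttr:/ {
            # $1 still carries the "CSVValuesAttr:" prefix together with the PUIS value
            if (NF == 5)      print $1, $2, "NA", "NA", $3, $4, $5   # old layout
            else if (NF == 6) print $1, "NA", $2, $3, $4, $5, $6     # new layout
        }' "$@"
}
```

Lines with any other field count are silently dropped here, so a third layout would need its own branch.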

Resources