How to get a particular text from a file using bash - bash

I have a file with data like the following:
a==1 b==2 c==9 x==4 d==5 ... and so on, each on its own line. I need to get c==9 from the file. It could be anywhere in the file, so selecting it by line number is not possible. I would like a bash solution if possible.

What exactly do you want, and what have you tried? The simplest way is to use awk:
~$ cat test
a==1
b==1
dskjfhjks
kjfhkhd
c==1
kjfds
~$ awk '/c==1/ { print }' test
c==1
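Note that a pattern such as /c==1/ matches any line that contains that text, so a line like abc==10 would also be printed. A minimal sketch of two stricter variants follows, assuming the key==value layout from the question; the temporary file and the variable names are made up for the demo.

```shell
#!/bin/bash
# Hypothetical sample data, written to a temporary file for the demo.
demo_file=$(mktemp)
printf '%s\n' 'a==1' 'b==2' 'abc==10' 'c==9' 'x==4' > "$demo_file"

# Print only lines that are exactly "c==9" (-F: fixed string, -x: whole line).
exact_line=$(grep -Fx 'c==9' "$demo_file")
echo "$exact_line"

# Print the value stored for key "c", whatever it is:
# split each line on "==" and compare the first field to the key.
value_of_c=$(awk -F'==' '$1 == "c" { print $2 }' "$demo_file")
echo "$value_of_c"

rm -f "$demo_file"
```

The grep form is enough when the whole line is known in advance; the awk form is the one to reach for when only the key is known.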

Related

Bash - Read Directory Path From TXT, Append Executable, Then Execute

I am setting up a directory structure with many different R & bash scripts in it. They all will be referencing files and folders. Instead of hardcoding the paths I would like to have a text file where each script can search for a descriptor in the file (see below) and read the relevant path from that.
Getting the search-append to work in R is easy enough for me; I am having trouble getting it to work in Bash, since I don't know the language very well.
My guess is it has something to do with the way awk works / stores the variable, or maybe the way the / works on the awk output. But I'm not familiar enough with it and would really appreciate any help
Text File "Master_File.txt":
NOT_DIRECTORY "/file/paths/Fake"
JOB_TEST_DIRECTORY "/file/paths/Real"
ALSO_NOT_DIRECTORY "/file/paths/Fake"
Bash Script:
#! /bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { print $2 }' $master_file_name)
Rscript --vanilla $SRCPATH/$R_SCRIPT
The last line, $SRCPATH/$R_SCRIPT, seems to be replacing part of SRCPATH with the value of $R_SCRIPT, which outputs something like /RScript.Rs/Real instead of what I would like, which is /file/paths/Real/RScript.R.
Note: if I hard-code the path as path="/file/paths/Real", then $path/$R_SCRIPT outputs what I want.
The R Script:
system(command = "echo \"SUCCESSFUL_RUN\"", intern = FALSE, wait = TRUE)
q("no")
Please let me know if there's any other info that would be helpful, I added everything I could think of. And thank you.
Edit Upon Answer:
I found two solutions.
Solution 1 - By Mheni:
[ see his answer below ]
Solution 2 - My Adaptation of Mheni's Answer:
After seeing Mheni's note on ignoring the " quotation marks, I looked into it further and found that it's possible to change the character awk uses to split the text into fields. Adding -F\" to the awk call makes it split on the " character.
The following works
#!/bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk -F\" '/JOB_TEST_DIRECTORY/ { print $2 }' "$master_file_name")
Rscript --vanilla "$SRCPATH/$R_SCRIPT"
Thank you so much everyone that took the time to help me out. I really appreciate it.
The problem is caused by the quotes around the path; this change to the awk command strips them before printing the path.
There was also a space in the shebang line that shouldn't be there, as @david mentioned.
#!/bin/bash
master_file_name="/tmp/data"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { if(NR==2) { gsub("\"",""); print $2 } }' "$master_file_name")
Rscript --vanilla "$SRCPATH/$R_SCRIPT"
Output:
[1] "Hello World!"
In my example the paths are in /tmp/data:
NOT_DIRECTORY "/tmp/file/paths/Fake"
JOB_TEST_DIRECTORY "/tmp/file/paths/Real"
ALSO_NOT_DIRECTORY "/tmp/file/paths/Fake"
In the path that corresponds to JOB_TEST_DIRECTORY I have a simple hello-world R script:
[user@host tmp]$ cat /tmp/file/paths/Real/RScript.R
print("Hello World!")
I would use
Master_File.txt :
NOT_DIRECTORY="/file/paths/Fake"
JOB_TEST_DIRECTORY="/file/paths/Real"
ALSO_NOT_DIRECTORY="/file/paths/Fake"
Bash Script:
#!/bin/bash
R_SCRIPT="RScript.R"
if [[ -r /path/to/Master_File.txt ]]; then
    # Source the file so each KEY="value" line becomes a shell variable
    . /path/to/Master_File.txt
else
    echo "ERROR -- Can't read Master_File" >&2
    exit 1
fi
Rscript --vanilla "$JOB_TEST_DIRECTORY/$R_SCRIPT"
Basically, you create a configuration file of key=value pairs, source it, then use the keys as variables for whatever you need throughout the script.
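One caveat with sourcing is that the shell executes whatever the file contains. As an alternative, here is a rough sketch that reads a single key without executing the file, assuming the same KEY="value" layout; get_config_value is a hypothetical helper name, and the config file is created in a temporary location just for the demo.

```shell
#!/bin/bash
# Hypothetical config file in KEY="value" form, created here for the demo.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
NOT_DIRECTORY="/file/paths/Fake"
JOB_TEST_DIRECTORY="/file/paths/Real"
ALSO_NOT_DIRECTORY="/file/paths/Fake"
EOF

# get_config_value KEY FILE: print the value stored for KEY, without quotes.
# Returns non-zero if the key is not found.
get_config_value() {
    local wanted=$1 file=$2 key value
    while IFS='=' read -r key value; do
        if [[ $key == "$wanted" ]]; then
            value=${value%$'\r'}   # drop a trailing carriage return, if any
            value=${value#\"}      # drop a leading double quote
            value=${value%\"}      # drop a trailing double quote
            printf '%s\n' "$value"
            return 0
        fi
    done < "$file"
    return 1
}

job_dir=$(get_config_value JOB_TEST_DIRECTORY "$config_file")
echo "$job_dir/RScript.R"

rm -f "$config_file"
```

This keeps the file as plain data, at the cost of one lookup per key instead of getting every variable at once.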

Need to replace first line with the input from user in shell script

I am a newbie in shell scripting. I am making a script where I need to replace certain values in the first line of a file with an input value from the user. How can I achieve this?
I have these two lines in my file named exclude:
Exclude = CLASS:itsc_usa7061vm1300-1399
Exclude = RECYCLER
Now I want to replace everything after CLASS: with a value supplied by the user.
I used the command below, but it yielded no result:
sed "1s/*/Exclude = CLASS:$1/" exclude
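Use the following command instead. In a sed regular expression a leading * is a literal asterisk rather than a wildcard, so your pattern never matches; CLASS:.* matches CLASS: and everything after it: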
sed "1s/CLASS:.*/CLASS:$1/" exclude
Output:
$ a="hello"
$ echo $a
hello
$ sed "1s/CLASS:.*/CLASS:$a/" sample
Exclude = CLASS:hello
Exclude = RECYCLER
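Two details are worth keeping in mind here: sed prints the result to standard output and leaves the file itself unchanged, and a user value containing / or & would break the replacement. The following is a sketch only, assuming the value arrives as a single line; new_class stands in for "$1", and the file is a temporary copy made for the demo.

```shell
#!/bin/bash
# Hypothetical copy of the "exclude" file, created in a temp location for the demo.
exclude_file=$(mktemp)
printf '%s\n' 'Exclude = CLASS:itsc_usa7061vm1300-1399' 'Exclude = RECYCLER' > "$exclude_file"

# In a real script this would be "$1"; a fixed value is used here for the demo.
new_class="web/servers&db"

# Escape the characters that are special in a sed replacement: \, / and &.
# A different delimiter (|) is used so the / inside the brackets is unambiguous.
escaped=$(printf '%s' "$new_class" | sed 's|[\\/&]|\\&|g')

# Rewrite line 1 into a temporary file, then replace the original with it.
tmp_file=$(mktemp)
sed "1s/CLASS:.*/CLASS:$escaped/" "$exclude_file" > "$tmp_file" && mv "$tmp_file" "$exclude_file"

first_line=$(head -n 1 "$exclude_file")
echo "$first_line"

rm -f "$exclude_file"
```

GNU sed also offers -i to edit the file in place; the temporary-file approach is shown because it does not depend on that option.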

Create CSV from specific columns in another CSV using shell scripting

I have a CSV file with several thousand lines, and I need to take some of the columns in that file to create another CSV file to use for import to a database.
I'm not in shape with shell scripting anymore, is there anyone who can help with pointing me in the correct direction?
I have a bash script to read the source file but when I try to print the columns I want to a new file it just doesn't work.
while IFS=, read symbol tr_ven tr_date sec_type sec_name name
do
echo "$name,$name,$symbol" >> output.csv
done < test.csv
Above is the code I have. Out of the 6 columns in the original file, I want to build a CSV with "column6, column6, column1".
The test CSV file is like this:
Symbol,Trading Venue,Trading Date,Security Type,Security Name,Company Name
AAAIF,Grey Market,22/01/2015,Fund,,Alternative Investment Trust
AAALF,Grey Market,22/01/2015,Ordinary Shares,,Aareal Bank AG
AAARF,Grey Market,22/01/2015,Ordinary Shares,,Aluar Aluminio Argentino S.A.I.C.
What am I doing wrong with my script? Or, is there an easier - and faster - way of doing this?
Edit
These are the real headers:
Symbol,US Trading Venue,Trading Date,OTC Tier,Caveat Emptor,Security Type,Security Class,Security Name,REG_SHO,Rule_3210,Country of Domicile,Company Name
I'm trying to get the last column, which is number 12, but it always comes up empty.
The snippet looks and works fine to me. Maybe you have some odd characters in the file, or it comes from a DOS environment (use dos2unix to "clean" it!). Also, you can use read -r to prevent strange behaviour with backslashes.
But let's see how awk can solve this even faster:
awk 'BEGIN{FS=OFS=","} {print $6,$6,$1}' test.csv >> output.csv
Explanation
BEGIN{FS=OFS=","} sets the input and output field separators to the comma. Alternatively, you can set the input separator with -F"," or -F, or pass it as a variable with -v FS=",". OFS has no dedicated option, so it can only be set in BEGIN or with -v OFS=",".
{print $6,$6,$1} prints the 6th field twice and then the 1st one. Note that using print, every comma-separated parameter that you give will be printed with the OFS that was previously set. Here, with a comma.
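For the 12-column file from the edit, the same idea can be written with $NF (the last field), and a trailing carriage return can be stripped inside awk in case the file has DOS line endings. This is a sketch under those assumptions; the sample row is adapted from the question, padded with empty fields to 12 columns, and it does not handle quoted fields that contain commas.

```shell
#!/bin/bash
# Hypothetical two-line sample with the 12 real headers and DOS (CRLF) line endings.
input_csv=$(mktemp)
printf '%s\r\n' \
  'Symbol,US Trading Venue,Trading Date,OTC Tier,Caveat Emptor,Security Type,Security Class,Security Name,REG_SHO,Rule_3210,Country of Domicile,Company Name' \
  'AAAIF,Grey Market,22/01/2015,,,Fund,,,,,,Alternative Investment Trust' \
  > "$input_csv"

# Strip a trailing carriage return (this re-splits the fields),
# then print the last field twice followed by the first field.
result=$(awk 'BEGIN { FS = OFS = "," } { sub(/\r$/, ""); print $NF, $NF, $1 }' "$input_csv")
echo "$result"

rm -f "$input_csv"
```

Using $NF instead of a fixed number keeps the command working if columns are added before the last one.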

Printing lines according to their columns in shell scripting

I know this is a very basic question, but I'm totally new to shell scripting.
I have a text file called 'berkay' whose content looks like this:
03:05:16 debug blablabla1
03:05:18 error blablablablabla2
05:42:14 degub blabblablablabal
06:21:24 debug balbalbal1
I want to print the lines whose second column is error so the output will be
03:05:18 error blablablablabla2
I am thinking about something like "if nawk { $2 }", but I need help.
With this for example:
$ awk '$2=="error"' file
03:05:18 error blablablablabla2
Why is this working? Because when the condition is true, awk automatically performs its default behaviour: {print $0}. So there is no need to explicitly write it.

Need a quick way of removing partial duplicates from a log

I'm using a bash script to grep out some lines from a log file. The basic format of this log file is:
field1: value1, field2=value2, field3=value3,
field4=value4,value5,value6, field5=value7
Sometimes there will be lines in which field1: value1 is identical, but some of the other information is either the same or different. I'd like to filter those lines out, so that I only grep out the first instance of anything that has the same "field1: value1" tuple.
I'd prefer a nice command-line one-liner if you can find something especially simple. I definitely want to keep it in the bash script. This is on linux, so we've got all the command-line tools available.
Thanks!
Using awk:
awk -F, '!arr[$1]++ { print }' LOGFILE
The awk program uses an array to count how many times a particular 'field1: value1' string has been seen, and prints the incoming line only the first time.
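A small demonstration of that idiom, as a sketch: the log lines below are invented to match the format described in the question, and the array is named seen here instead of arr.

```shell
#!/bin/bash
# Hypothetical log lines in the format described in the question.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
field1: A, field2=1, field3=2,
field1: B, field2=1, field3=2,
field1: A, field2=9, field3=9,
EOF

# seen[$1]++ evaluates to 0 (false) the first time a key appears, so
# !seen[$1]++ is true only for the first line with a given "field1: value1" prefix.
# With no action block, awk prints the line when the condition is true.
unique_lines=$(awk -F, '!seen[$1]++' "$log_file")
echo "$unique_lines"

rm -f "$log_file"
```

The third line is dropped because its first comma-separated field repeats the key from the first line, even though the remaining fields differ.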
