Quotation marks in text file - bash

I am new to Unix, R, etc. I am analysing some data from the NCBI website using GEO2R datasets. I have downloaded some data from the website and saved it in a text file. However, the data has quotation marks throughout, and I have been unable to get rid of them. The file is called GSE23182_geo2r.txt, and I have tried the following commands:
sed 's/\"//g' GSE23182_geo2r.txt > GSE23182_geo2r_2.txt
and
sed 's/\"//g' GSE23182_geo2r.txt
and
sed "s/\"//g" GSE23182_geo2r.txt
and
cat GSE23182_geo2r.txt | tr -d '\"' > GSE23182_geo2r_2.txt
but none of them has worked; each seems to fail with the error: No such file or directory
Would be so grateful for any help!!
Thanks

File 'test' contains lots of "
The contents of the file are:
$ cat test
hi " this is a a quote"
"'"starting quote""
To delete every " character, use the tr -d command.
$ cat test |tr -d "\""
hi this is a a quote
'starting quote
You can then redirect the output to another file as follows:
$ cat test |tr -d "\"" >test1
$ cat test1
hi this is a a quote
'starting quote
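The "No such file or directory" message in the question is what gets printed when the path given to the command cannot be opened, so it may be worth confirming that the file is reachable from the current directory before changing the sed or tr syntax. The following is a minimal sketch under that assumption; the function name strip_quotes is made up for illustration.

```shell
#!/bin/sh
# strip_quotes IN OUT: copy IN to OUT with every double quote removed.
# (strip_quotes is a hypothetical helper name, not a standard command.)
strip_quotes() {
    in=$1
    out=$2
    # Fail early with a clear message if the input path is wrong.
    if [ ! -f "$in" ]; then
        echo "strip_quotes: cannot find '$in' in $(pwd)" >&2
        return 1
    fi
    # tr reads standard input, so redirect the file instead of using cat.
    tr -d '"' < "$in" > "$out"
}

# Example call, assuming the file is in the current directory:
# strip_quotes GSE23182_geo2r.txt GSE23182_geo2r_2.txt
```

If the function reports that it cannot find the file, running ls or pwd should show whether the shell is in the directory where the download was saved.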

One simple way would be to open the file in vim and issue the command:
:%s/\"//g
This would remove " throughout the file.

Related

format a file content using shell script [duplicate]

Hello everyone, I'm a beginner in shell scripting. On a daily basis I need to convert a file's data to another format. I usually do it manually in a text editor, but I often make mistakes, so I decided to write a simple script that can do the work for me. The file's content looks like this:
/release201209
a1,a2,"a3",a4,a5
b1,b2,"b3",b4,b5
c1,c2,"c3",c4,c5
to this:
a2>a3
b2>b3
c2>c3
The script should ignore the first line and print the second and third values separated by '>'
I'm half way there, and here is my code
#!/bin/bash
cat $1 | sed '1d' | cut -d, -f2-3 | tr -d '"' > $2
It worked well until I found that it fails for data containing a comma in the third field, like this:
data,VERSION,"FUNDS.TRANSFER,ASS.VERS.TIERS.BOP",,
Which returns
VERSION>FUNDS.TRANSFER
instead of
VERSION>FUNDS.TRANSFER,ASS.VERS.TIERS.BOP
Can you help me update it, please? Thanks
Consider using a proper CSV parsing tool like csvtool to extract the relevant columns (it's much easier and more reliable than rolling your own parsing). Then use tr and sed to do the necessary transformations:
sed '1d' file.txt | csvtool -t ',' col 2,3 - | tr -d '"' | sed 's/,/>/'
Steps:
Remove the header line using sed
Use csvtool to extract the 2nd and 3rd columns
Use tr to remove the double quotes
Use sed to map the first , to a > (you can't use tr for this since that does a global translation)
You can install csvtool with your package manager, e.g. on a Debian-based system sudo apt-get install csvtool. Replace apt-get with your package manager on other systems e.g. yum, brew, ...
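If installing csvtool is not an option, a rough alternative is to let awk split each line on the double quote instead of the comma. This is only a sketch, and it assumes the input always looks like the sample in the question: the third field is always quoted, and the first two fields contain no quotes or commas. The function name second_and_third is hypothetical.

```shell
#!/bin/sh
# second_and_third FILE: print "field2>field3" for every line but the first.
# Assumes field 3 is always wrapped in double quotes.
second_and_third() {
    awk -F'"' 'NR > 1 {
        # $1 is everything before the first quote, e.g. a1,a2,
        split($1, head, ",")
        # $2 is the quoted third field, embedded commas included
        print head[2] ">" $2
    }' "$1"
}

# Example call: second_and_third input.csv > output.txt
```

For general CSV input (escaped quotes, quoted first fields, multi-line values) a real CSV parser remains the safer choice.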
Ruby has a CSV module included:
ruby -rcsv -e '
CSV.read(ARGV.shift).map {|row|
printf "%s>%s\n", row[1], row[2]
}
' file | sed 1d
Hello, I'm also learning shell scripting. Is this what you want?
This is my code:
function klir(){
# Second comma-separated field of every line
cut -d, -f2 "$1" > "$1.temp.filterline2"
# Text between the first pair of double quotes (the third field)
cut -d\" -f2 "$1" > "$1.temp.filterline3"
# Join the two columns with '>' and drop the header line
paste "$1.temp.filterline2" "$1.temp.filterline3" | sed "s/\t/>/g ; 1d"
rm "$1.temp.filterline2" "$1.temp.filterline3" 2>/dev/null
}
klir "$1"

Extract specific string from line with standard grep,egrep or awk

I'm trying to extract a specific string from grep output.
uci show minidlna
produces a large list
.
.
.
minidlna.config.enabled='1'
minidlna.config.db_dir='/mnt/sda1/usb/db'
minidlna.config.enable_tivo='1'
minidlna.config.wide_links='1'
.
.
.
So I tried to narrow down what I wanted by running
uci show minidlna | grep -oE '\bdb_dir=\S+'
This narrows the output to
db_dir='/mnt/sda1/usb/db'
What I want is to output only
/mnt/sda1/usb/db
without the quotes and without the leading "db_dir=", so I can run rm /mnt/sda1/usb/db/file.db
I've used the answers found here:
How to extract string following a pattern with grep, regex or perl
and that's as close as I got.
EDIT: after using Ed Morton's awk command I needed to pass the output to the rm command.
I used:
| ( read DB; rm "$DB/files.db" )
read DB reads the output into the variable DB.
( ... ) groups the commands in a subshell.
rm "$DB/files.db" deletes the file files.db.
Is this what you're trying to do?
$ awk -F"'" '/db_dir/{print $2}' file
/mnt/sda1/usb/db
That will work in any awk in any shell on every UNIX box.
If that's not what you want then edit your question to clarify your requirements and post more truly representative sample input/output.
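To go from that output to the rm the question is after, one option is to capture the path in a variable and quote it when it is used. A minimal sketch, assuming the uci output format shown in the question; extract_db_dir is a hypothetical helper name, and the match is narrowed to keys ending in .db_dir so that other lines mentioning db_dir are less likely to be picked up.

```shell
#!/bin/sh
# extract_db_dir: read `uci show minidlna` style lines on stdin and
# print the value of the db_dir option without its single quotes.
extract_db_dir() {
    # With ' as the separator, $1 is the key plus '=' and $2 is the value.
    awk -F"'" '$1 ~ /\.db_dir=$/ { print $2; exit }'
}

# Demo on a line copied from the question:
printf "%s\n" "minidlna.config.db_dir='/mnt/sda1/usb/db'" | extract_db_dir

# On the router itself this would presumably look like:
# db_dir=$(uci show minidlna | extract_db_dir)
# [ -n "$db_dir" ] && rm -- "$db_dir/file.db"
```

The emptiness check guards against running rm on /file.db when no db_dir line was found.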
Using sed with some effort to avoid single quotes:
sed -n 's/^minidlna.config.db_dir=\s*\S\(\S*\)\S\s*$/\1/p' input
Well, so you end up having a string like db_dir='/mnt/sda1/usb/db'.
I would first remove the quotes by piping this to
.... | tr -d "'"
Now you end up with a string like db_dir=/mnt/sda1/usb/db.
Say you have this string stored in a variable named confstr, then
${confstr##*=}
gives you just /mnt/sda1/usb/db, since the pattern *= matches everything from the start up to an equals sign, and ## removes the longest such match from the beginning of the value.
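The quote removal can also be done with parameter expansion alone, without tr. A small sketch using only POSIX shell expansions; confstr is the variable name used above, while value is an illustrative name.

```shell
#!/bin/sh
confstr="db_dir='/mnt/sda1/usb/db'"

# '#' removes the shortest matching prefix, so an '=' inside the value
# would survive; '##' would remove up to the last '=' instead.
value=${confstr#*=}
value=${value#\'}      # drop a leading single quote, if present
value=${value%\'}      # drop a trailing single quote, if present

echo "$value"
```

This avoids starting any external process, which can matter on small embedded systems.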
I would do this:
Once you have either extracted the line above into file.txt (or piped it into this command), split the fields using the quote character. Use printf to generate the rm command and pass it to bash to execute.
$ awk -F"'" '{printf "rm %s/file.db\n", $2}' file.txt | bash
rm: /mnt/sda1/usb/db/file.db: No such file or directory
With your original command:
$ uci show minidlna | grep -oE '\bdb_dir=\S+' | \
awk -F"'" '{printf "rm %s/file.db\n", $2}' | bash

Unable to remove spaces between strings from a file

I have a file whose contents are like below.
$ cat test
static2 deploy
TDPlanValidator-Prod
I am trying to upload the contents of these directories to an S3 bucket. The issue is that S3 doesn't accept spaces, so I am getting an error. To work around this, I am trying to remove the space in "static2 deploy". The file will have around 400 entries, and some of them will be directory names containing a space, like "static2 deploy". The script I have written is not able to do that. The script and its output are below.
for i in `cat test`;do var="$( echo "$i" | tr -d ' ' )"; echo $var;done
static2
deploy
TDPlanValidator-Prod
I have tried sed too, but that doesn't work either. I want the output below so that I can push it to the S3 bucket:
static2deploy
Can someone please help me out here? I have been trying things since yesterday but have been unable to fix it.
You can achieve this with the sed command below:
echo "static2 deploy" | sed "/static2/s/ //g"
How does it work? sed first selects the lines that contain the string static2, and on each of those lines it removes all the spaces.
So the above command will output
static2deploy
But if you try the following:
echo "static deploy" | sed "/static2/s/ //g"
the output would be
static deploy
because that line does not contain static2. So in your case you need to try:
cat test | sed "/static2/s/ //g" > output.txt
Hope this helps.
for i in `cat test` loops over every whitespace-separated word in the file, not over every line.
This works:
cat test | while read line; do var="$( echo "$line" | tr -d ' ' )"; echo $var; done
or shorter if you only want to print the line:
cat test | while read line; do echo "$line" | tr -d ' '; done
Output:
static2deploy
TDPlanValidator-Prod
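Since tr already processes its whole input, the loop is not strictly needed when the goal is only to strip spaces. A sketch with two hypothetical helper names, remove_spaces and each_dir; the second keeps a loop for cases where each entry needs more work than a plain print, and uses IFS= read -r so that leading whitespace and backslashes in a line are kept intact.

```shell
#!/bin/sh
# remove_spaces FILE: print FILE with every space character deleted.
remove_spaces() {
    tr -d ' ' < "$1"
}

# each_dir FILE: same result, but handled one line at a time.
each_dir() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | tr -d ' '
    done < "$1"
}

# Example call, with the file name from the question:
# remove_spaces test
```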

How to locate files that contain a specific word in a specific folder first, then replace it with a different one in Mac Terminal

I'm looking for a way, within the Mac Terminal, to first locate a specific word (e.g. "testword") in a folder containing many text files and then replace that word with a different one.
I found the following line to locate a word in a folder with several text files in it:
grep -r 'testword' "path"
which works fine, but I can't find what to add so that the word is also replaced with another one in a single combined command. Any suggestions?
Thanks a lot for your help! :)
For safety, I would suggest doing it in two steps. First use grep, tr and sed to prepare the word replacements for all the files. Check that everything looks right, then execute the sed replacements.
$ touch hello123 helloABC helloXYZ hello000
$ for f in hello*; do echo "hello" > "$f"; done
$ cat hello*
hello
hello
hello
hello
$ grep -lr "hello" . | tr "\n" " " | sed 's/^/sed \-i "s\/hello\/NewWord\/g" /g' | tee replace.sh
sed -i "s/hello/NewWord/g" ./helloXYZ ./helloABC ./hello000 ./hello123
$ sh replace.sh
$ cat hello*
NewWord
NewWord
NewWord
NewWord
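The question is about the macOS Terminal, where the bundled BSD sed expects a backup suffix after -i. One form that both GNU and BSD sed accept is -i.bak (suffix attached to the flag), which also leaves a backup of each edited file. A minimal sketch of the same two-step idea, under the assumptions that the words contain no characters special to sed (such as / or &) and that file names contain no newlines; replace_word is a hypothetical function name.

```shell
#!/bin/sh
# replace_word DIR OLD NEW: replace OLD with NEW in every file under DIR
# that contains OLD, keeping each original as FILE.bak.
replace_word() {
    dir=$1 old=$2 new=$3
    list=$(mktemp) || return 1
    # Step 1: record the matching files first, so the backups created
    # in step 2 cannot be picked up by the search.
    grep -rl -- "$old" "$dir" > "$list"
    # Step 2: edit each recorded file in place, keeping FILE.bak.
    while IFS= read -r f; do
        sed -i.bak "s/$old/$new/g" "$f"
    done < "$list"
    rm -f "$list"
}

# Example call: replace_word "path" testword newword
```

Once the result has been checked, the .bak files can be deleted, or moved back to undo the change.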

How to replace a particular string with another in UNIX shell script

Could you please let me know how to replace a particular string in a text file or ksh file on the server with another string?
For example:
I have 10 files in the path /file_sys/file, and I have to replace the word "BILL" with "BILLING" in all 10 files.
Works for me:
I created a file 'test' with this content: "This is a simple test". Now I execute this call to the sed command:
sed -i 's/ is / is not /' test
Afterwards the file 'test' contains this content: "This is not a simple test"
If your sed utility does not support the -i flag, then there is a somewhat awkward workaround:
sed 's/ is / is not /' test > tmp_test && mv tmp_test test
This should work. Here is a test as well. The character on each side of BILL is captured and put back, so the surrounding spaces are kept.
$ cat > file1
I am a BILL boy
$ sed 's/\(.\)BILL\(.\)/\1BILLING\2/g' file1 > file2
$ cat file2
I am a BILLING boy
Using sed:
sed 's/\bBILL\b/BILLING/g' file
For inplace:
sed --in-place 's/\bBILL\b/BILLING/g' file
A little for loop might help when dealing with multiple files. Here I'm assuming the -i option is not available:
for file in $(grep -wl BILL /file_sys/file/*); do
echo "$file"
sed -e 's/\bBILL\b/BILLING/g' "$file" > tmp
mv tmp "$file"
done
Here's what's happening:
grep -w: search for all (and only) files containing the whole word BILL
grep -l: list the file names (rather than the matching content)
$(....): execute what's inside the brackets (command substitution)
for file in: loop over each item in the list (each file with BILL in it)
echo "$file": print each file name we loop over
sed command: replace the word BILL (here, specifically delimited with word boundaries "\b") with BILLING, writing to a tmp file
mv command: move the tmp file back to the original name (replacing the original)
You can easily test this without actually changing anything - e.g. just print the file name, or just print the contents (to make sure you've got what you expect before replacing the original files).
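One way to do such a dry run is to pipe the sed output into diff instead of a file, so the changes are shown but nothing is written. A sketch, assuming GNU sed (for \b), as the commands above already do; preview_replace is a hypothetical name.

```shell
#!/bin/sh
# preview_replace FILE: show what the BILL -> BILLING substitution would
# change in FILE, without modifying it.
preview_replace() {
    # diff exits non-zero when the inputs differ; that is expected here,
    # so the status is deliberately ignored.
    sed 's/\bBILL\b/BILLING/g' "$1" | diff "$1" - || true
}

# Example call over the same file list as the loop above:
# for file in $(grep -wl BILL /file_sys/file/*); do preview_replace "$file"; done
```

Lines prefixed with < are the current content and lines prefixed with > are what the replacement would produce.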

Resources