Unable to remove spaces between strings from a file - bash

I have a file whose contents are like below.
$ cat test
static2 deploy
TDPlanValidator-Prod
I am trying to upload the contents of these directories to an S3 bucket. The issue is that S3 doesn't accept spaces, so I am getting an error. To get around this, I am trying to remove the space in "static2 deploy". The file will have around 400 entries, and some of them will be directories with a space in the name, like "static2 deploy". The script I have written is not able to do that. The script and its output are below.
for i in `cat test`;do var="$( echo "$i" | tr -d ' ' )"; echo $var;done
static2
deploy
TDPlanValidator-Prod
I have tried sed too, but that doesn't work either. I want the output below so that I can push it to the S3 bucket:
static2deploy
Can someone please help me out here? I have been trying things since yesterday but have been unable to fix it.

You can achieve this with the sed command below:
echo "static2 deploy" | sed "/static2/s/ //g"
How does it work? sed first selects the lines that contain the string static2; on each such line it then finds all the spaces and removes them.
So the above command will output
static2deploy
But if you try the following:
echo "static deploy" | sed "/static2/s/ //g"
the output would be
static deploy
because the line does not contain static2, so it is left untouched. In your case, try:
sed "/static2/s/ //g" test > output.txt
To remove spaces from every line rather than only the static2 ones, drop the address: sed "s/ //g" test > output.txt
Hope this helps.

for i in `cat test` loops over every whitespace-separated word in the file, not over every line.
This works:
while IFS= read -r line; do var="$( echo "$line" | tr -d ' ' )"; echo "$var"; done < test
or, shorter, if you only want to print the line:
while IFS= read -r line; do echo "$line" | tr -d ' '; done < test
Output:
static2deploy
TDPlanValidator-Prod
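Since the goal is an upload, you will probably need both the original directory name (to read from) and the space-free name (to use as the key). Here is a minimal sketch of the line-by-line approach above that prints both; map_names is a hypothetical helper name, and the tab-separated output format is just an assumption for illustration.

```shell
# map_names is a hypothetical helper, not from the posts above.
# For each line of the file given as $1, print the original name,
# a tab, and the same name with all spaces removed.
map_names() {
    # IFS= and -r keep the line intact; the || clause also handles
    # a last line that has no trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        stripped=$(printf '%s' "$name" | tr -d ' ')
        printf '%s\t%s\n' "$name" "$stripped"
    done < "$1"
}
```

Called as map_names test, this prints one pair per line, which a later loop can split on the tab.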

Related

creating a variable from sed output

I am banging my head against the keyboard on this simple piece of code.
#!/bin/bash
connstate="Connected"
vpnstatus=$(/opt/cisco/anyconnect/bin/vpn state | (grep -m 1 'state:'))
echo $vpnstatus
vpnconn=$(echo $vpnstatus | sed -e 's/>>\ state: //g' | sed "s/ //g")
echo "$vpnconn" "$connstate"
if [ "$vpnconn" = "$connstate" ];then
echo $vpnconn
else echo "this script still fails"
fi
echo done
This is the output from the above code:
>> state: Connected
Connected Connected
this script still fails
done
I believe the issue revolves around the vpnconn=$(...) line: if I comment that section out and set vpnconn="Connected" directly, the code works fine. Something in how sed processes the input from vpnstatus and writes the result to vpnconn makes what looks like a correct result compare as unequal in the if statement.
I have tried splitting the vpnconn line into two separate lines, and that did not change anything. I also took out the sed "s/ //g" and replaced it with tr -d ' ', and that did not change the results either. I know I am missing something small in this tiny piece of code.
Did you try this?
vpnconn=$(echo "$vpnstatus" | awk '{print $3}')
Or something like:
vpnstatus=$(/opt/cisco/anyconnect/bin/vpn state|grep -m 1 'state:'|awk '{print $3}')
should do the job.
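The posts above do not say why the two "Connected" strings compare as different. One common cause of strings that look identical but are not is an invisible character such as a trailing carriage return, which awk's whitespace splitting does not remove. That is only an assumption about this case. A minimal sketch that takes the third field and strips carriage returns; parse_state and the sample line are hypothetical stand-ins for the real vpn state output:

```shell
# parse_state is a hypothetical helper: it takes one line such as
# ">> state: Connected" and prints the third whitespace-separated field,
# with any carriage returns removed first.
parse_state() {
    printf '%s\n' "$1" | tr -d '\r' | awk '{print $3}'
}

# Sample input standing in for the real tool's output (assumed format),
# with a trailing carriage return added on purpose.
sample=$(printf '  >> state: Connected\r')
vpnconn=$(parse_state "$sample")

if [ "$vpnconn" = "Connected" ]; then
    echo "match"
else
    echo "no match"
fi
```

To check whether hidden characters are really present in your own output, piping it through od -c shows every byte.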

bash: cURL from a file, increment filename if duplicate exists

I'm trying to curl a list of URLs to aggregate the tabular data on them from a set of 7000+ URLs. The URLs are in a .txt file. My goal was to cURL each line and save them to a local folder after which I would grep and parse out the HTML tables.
Unfortunately, because of the format of the URLs in the file, duplicates exist (example.com/State/City.html). When I ran a short while loop, I got back fewer than 5500 files, so there are at least 1500 dupes in the list. As a result, I tried to grep the "/State/City.html" section of the URL and pipe it to sed to remove the / and substitute a hyphen to use with curl -O. cURL was trying to grab
Here's a sample of what I tried:
while read line
do
FILENAME=$(grep -o -E '\/[A-z]+\/[A-z]+\.htm' | sed 's/^\///' | sed 's/\//-/')
curl $line -o '$FILENAME'
done < source-url-file.txt
It feels like I'm missing something fairly straightforward. I've scanned the man page because I worried I had confused -o and -O which I used to do a lot.
When I run the loop in the terminal, the output is:
Warning: Failed to create the file State-City.htm
I think you don't need multiple seds and a grep; a single sed should suffice:
urls=$(echo -e 'example.com/s1/c1.html\nexample.com/s1/c2.html\nexample.com/s1/c1.html')
for u in $urls
do
FN=$(echo "$u" | sed -E 's/^(.*)\/([^\/]+)\/([^\/]+)$/\2-\3/')
if [[ ! -f "$FN" ]]
then
touch "$FN"
echo "$FN"
fi
done
This script should work, and it also takes care of the same file being listed multiple times by skipping names that already exist.
Just replace the touch command with your curl one.
First: you didn't pass the URL to grep.
Second: try this line instead:
FILENAME=$(echo "$line" | grep -oE '/[^/]+/[^/]+\.html' | sed 's/^\///' | sed 's/\//-/')
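If you would rather keep every download and number the duplicates, as the title asks, something along these lines might do. It is a sketch only: url_to_name and unique_name are hypothetical helper names, and it assumes every URL ends in /State/City.ext with an extension present.

```shell
# url_to_name: print "State-City.html" for a URL ending in /State/City.html.
url_to_name() {
    printf '%s\n' "$1" | sed -E 's#^.*/([^/]+)/([^/]+)$#\1-\2#'
}

# unique_name: print the name $1 unchanged if no such file exists in
# directory $2; otherwise print name-1.ext, name-2.ext, ... using the
# first number that is still free. Assumes the name has an extension.
unique_name() {
    name=$1
    dir=$2
    base=${name%.*}     # everything before the last dot
    ext=${name##*.}     # everything after the last dot
    candidate=$name
    n=1
    while [ -e "$dir/$candidate" ]; do
        candidate="$base-$n.$ext"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Possible use in the download loop (not run here):
#   while IFS= read -r url; do
#       out=$(unique_name "$(url_to_name "$url")" .)
#       curl -s "$url" -o "$out"
#   done < source-url-file.txt
```

Note the double quotes around "$out" in the curl line: single quotes would pass the literal text $out as the file name.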

Merging fastq files by identifiers with a shell script

I have to merge files with the following naming pattern :
[SampleID]_[custom_ID01]_ID[RUN_ID]_L001_R1.fastq
[SampleID]_[custom_ID02]_ID[RUN_ID]_L002_R1.fastq
[SampleID]_[custom_ID03]_ID[RUN_ID]_L003_R1.fastq
[SampleID]_[custom_ID04]_ID[RUN_ID]_L004_R1.fastq
I need to merge all files with identical [SampleID] but different "Lanes" (L001-L004).
The following script works fine when directly run in the terminal:
custom_id="000"
RUN_ID="0025"
wd="/path/to/script/" # was missing/ incorrect
# get ALL sample identifiers
touch temp1.txt
for line in $wd/*.fastq ; do
fastq_identifier=$(echo "$line" | cut -d"_" -f1);
echo $fastq_identifier >> temp1.txt
done
# get all unique sample identifiers
cat temp1.txt | uniq > temp2.txt
input_var=$(cat temp2.txt)
# concatenate all fastq (different lanes) with identical identifier
for line in $input_var; do
cat $line*fastq >> $line"_"$custom_id"_ID"$Run_ID"_L001_R1.fastq"
done
rm temp1.txt temp2.txt;
But if I create a script file (concatenate_fastq.sh) and make it executable
$ chmod +x concatenate_fastq.sh
and run it
$ ./concatenate_fastq.sh
I got the following error:
$ concatenate_fastq.sh: line 17: /*.fastq_000_ID_L001_R1.fastq: Keine Berechtigung # = Permission denied
Thx to your hints below I solved the problem by fixing
wd=/path/to/script/
The immediate problem seems to be that wd is unset. If your script genuinely contains exactly the line
wd="/path/to/script/"
then I would suspect invisible control characters in the script file (using a Windows editor is a common way to shoot yourself in the foot).
More generally, your script should cope correctly when the wildcard does not match any files. A common way to do that is to shopt -s nullglob but the subsequent script would still need adaptation then.
Refactoring the script to loop only over actual matches would help avoid trouble. Perhaps something like this:
shopt -s nullglob # bashism
printf '%s\n' "$wd"/*.fastq |
cut -d_ -f1 |
uniq |
while read -r line; do
cat "$line"*fastq >> "${line}_${custom_id}_ID${RUN_ID}_L001_R1.fastq"
done
You'll notice that this simplifies the script tremendously, and avoids the pesky temporary files.
I solved it with:
if [ $# -ne 3 ] ; then
echo -e "Usage: $0 {path_to_working_directory} {custom_ID:Z+} {run_ID:ZZZZ}\n"
exit 1
fi
cwd=$(pwd)
wd=$1
custom_id=$2
RUN_ID=$3
folder=$(basename $wd)
input_var=$(ls *fastq | cut --fields 1 -d "_" | uniq)
for line in $input_var; do
cat $line*fastq >> $line"_"$custom_id"_ID"$RUN_ID"_L001_R1.fastq"
done
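One thing to watch with both versions above: the merged file is written next to the inputs and its name also matches $line*fastq, so a second run would pick up the previous result as input. A rough sketch that writes to a separate directory instead; merge_lanes and its argument order are hypothetical, and it assumes the sample ID itself contains no underscore.

```shell
# merge_lanes is a hypothetical helper.
# $1 = directory with the per-lane files, $2 = custom ID, $3 = run ID,
# $4 = output directory (kept separate so merged files never match the
#      input glob on a later run).
merge_lanes() {
    in_dir=$1; custom_id=$2; run_id=$3; out_dir=$4
    mkdir -p "$out_dir"
    for f in "$in_dir"/*_R1.fastq; do
        [ -e "$f" ] || continue      # the glob matched nothing
        # Sample ID = file name up to the first underscore.
        basename "$f" | cut -d_ -f1
    done | sort -u | while IFS= read -r sample; do
        # The underscore after the ID keeps sample S1 from matching S10.
        cat "$in_dir/$sample"_*_R1.fastq \
            > "$out_dir/${sample}_${custom_id}_ID${run_id}_L001_R1.fastq"
    done
}
```

For example, merge_lanes /path/to/fastq 000 0025 /path/to/fastq/merged would leave one merged file per sample in the merged directory. sort -u is used instead of uniq so that the result does not depend on the IDs arriving already grouped.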

Appending the output of a variable to the end of a specific line

Within my template (callbacks), there is a line that ends with "IP:" that I would like to append to. I tried this command:
cat callbacks | grep "IP:" | cut -d ":" -f 2 | echo $(ping -c2 host.com)
I thought I would be able to echo something at the end, but that didn't work. Could someone please shed some light on what I am doing wrong?
This is what i have so far:
for textfile in $(find . -iname "2013*-malware-callback*.txt")
do cat callbacks | cat - $textfile > tmpfile && mv tmpfile $textfile
done
The following takes the contents of $textfile, finds any occurrence of IP: and appends to it an IP address, and saves the result in tmpfile:
v="1.2.3.4"
cat "$textfile" | sed 's/IP:/IP: '"$v/" >tmpfile
The pipeline can be simplified:
sed 's/IP:/IP: '"$v/" <"$textfile" >tmpfile
Further, if the ultimate goal is to replace $textfile with the modified version, we can use sed's modify-in-place feature:
sed -i.bak 's/IP:/IP: '"$v/" "$textfile"
This modifies $textfile in place and, for safekeeping, leaves a backup copy of the original with extension .bak.
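Wrapped up as a function, the in-place variant might look like the sketch below. append_ip is a hypothetical name, and the pattern is anchored with $ so that only lines ending in "IP:" are touched, which is an assumption based on how the question describes the template.

```shell
# append_ip is a hypothetical helper: append the value $2 to every line of
# file $1 that ends with "IP:", keeping a .bak copy of the original.
# The value must not contain /, & or backslashes, because it is pasted
# straight into the sed replacement; a plain IP address is fine.
append_ip() {
    sed -i.bak 's/IP:$/IP: '"$2"'/' "$1"
}
```

Inside the loop from the question this would be called as append_ip "$textfile" "$v".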

shell script to read contain from file and grep on other file

I am working in the shell and want to write a one-liner that reads the contents of file A and runs grep against file B for each entry.
For example, suppose there are two files.
dataFile.log has the following values:
abc
xyz
... and so on
Now read abc and grep for it in searchFile.log, like grep abc searchFile.log
I have a shell script for this but want a one-liner for it:
for i in "cat dataFile.log" do grep $i searchFile.log done;
try this:
grep -f dataFile.log searchFile.log
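Note that grep treats each line of dataFile.log as a basic regular expression by default. If you want to match the lines as fixed strings, add -F; if you want extended or Perl-compatible regular expressions, use -E or -P (where your grep supports it).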
How about the following? It even ignores blank lines and # comments:
while read -r FILE; do if [[ "$FILE" != [/a-zA-Z0-9]* ]]; then continue; fi; grep -h pattern "$FILE"; done
Beware: I have not tested this.
You can use grep -f option:
cat dataFile.log | grep -f searchFile.log
Edit
OK, now I understand the problem. You want to use every line from dataFile.log to grep in searchFile.log. I also see you have value1|value2|..., so instead of grep you need egrep.
Try with this:
for i in `cat dataFile.log`
do
egrep "$i" searchFile.log
done
Edit 2
Following chepner's suggestion:
egrep -f dataFile.log searchFile.log
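To see the -f form in action without touching real logs, here is a small self-contained sketch. The file contents are made up for illustration, and -F is used on the assumption that the entries in dataFile.log are plain strings rather than patterns.

```shell
# Build two small sample files in a temporary directory
# (contents invented for this example).
dir=$(mktemp -d)
printf 'abc\nxyz\n' > "$dir/dataFile.log"
printf 'line with abc\nnothing here\nxyz at start\n' > "$dir/searchFile.log"

# -f reads one pattern per line from dataFile.log;
# -F treats each of those lines as a literal string.
matches=$(grep -F -f "$dir/dataFile.log" "$dir/searchFile.log")
printf '%s\n' "$matches"

rm -rf "$dir"
```

This prints the two lines that contain abc or xyz. Be careful with blank lines in the pattern file: an empty pattern matches every line of the searched file.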

Resources