Echo the command result in a file.txt - bash

I have a script such as:
cat list_id.txt | while read line; do for ACC in $line;
do
echo -n "$ACC\t"
curl -s "link=fasta&retmode=xml" |\
grep TSeq_taxid |\
cut -d '>' -f 2 |\
cut -d '<' -f 1 |\
tr -d "\n"
echo
sleep 0.25
done
done
This script lets me take a list of IDs in list_id.txt and get the corresponding names from the database at https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=${ACC}&rettype=fasta&retmode=xml
From this script I get something like:
CAA42669\t9913
V00181\t7154
AH002406\t538120
What I would like is to write this result directly to a file called new_ids.txt. I tried echo >> new_ids.txt, but the file is empty.
Thanks for your help.

A minimal refactoring of your script might look like
# Avoid useless use of cat
# Use read -r
# Don't use upper case for private variables
while read -r line; do
for acc in $line; do
# printf expands \t to a real tab; bash's plain echo -n would print it literally
printf '%s\t' "$acc"
# No backslash necessary after | character
curl -s "link=fasta&retmode=xml" |
# Probably use a proper XML parser for this
grep TSeq_taxid |
cut -d '>' -f 2 |
cut -d '<' -f 1 |
tr -d "\n"
echo
sleep 0.25
done
done <list_id.txt >new_ids.txt
This could probably still be simplified significantly, but without knowing exactly what your input file looks like or what curl returns, the following is somewhat speculative:
tr -s ' \t\n' '\n' <list_id.txt |
while read -r acc; do
curl -s "link=fasta&retmode=xml" |
awk -v acc="$acc" '/TSeq_taxid/ {
split($0, a, /[<>]/); print acc "\t" a[3] }'
sleep 0.25
done >new_ids.txt
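The key point is where the redirection goes: echo >> new_ids.txt redirects only that one bare echo, which prints nothing but a newline, whereas a redirection placed after done captures everything the loop prints. Here is a minimal offline sketch of that idea. fetch_taxid is a hypothetical stand-in for the curl pipeline (its values are copied from the sample output in the question), and the temporary directory is only there to keep the example self-contained.

```shell
#!/bin/bash
# Hypothetical stand-in for the curl | grep | cut pipeline, so the sketch runs offline.
fetch_taxid() {
    case $1 in
        CAA42669) echo 9913 ;;
        V00181)   echo 7154 ;;
        AH002406) echo 538120 ;;
        *)        echo unknown ;;
    esac
}

workdir=$(mktemp -d)
# Two IDs on the first line, one on the second, to exercise the inner for loop.
printf '%s\n' 'CAA42669 V00181' 'AH002406' > "$workdir/list_id.txt"

# The redirection after "done" applies to the whole loop, not to a single command.
while read -r line; do
    for acc in $line; do
        printf '%s\t%s\n' "$acc" "$(fetch_taxid "$acc")"
    done
done < "$workdir/list_id.txt" > "$workdir/new_ids.txt"

cat "$workdir/new_ids.txt"
```

Using > truncates new_ids.txt once, when the loop starts; use >> after done instead if the file should accumulate results across runs.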

Related

Using xargs parameters as variables to compare two md5sum

I'm extracting two md5sums by using this code:
md5sum test{1,2} | cut -d' ' -f1-2
I receive two md5sums, as in the example below:
02eace9cb4b99519b49d50b3e44ecebc
d8e8fca2dc0f896fd7cb4cb0031ba249
Afterwards I'm not sure how to compare them. I have tried using xargs:
md5sum test{1,2} | cut -d' ' -f1-2 | xargs bash -c '$0 == $1'
However, it tries to execute the first md5sum as a command.
Any advice?
Try using a command substitution instead:
#!/bin/bash
echo 1 > file_a
echo 2 > file_b
echo 1 > file_c
file1=file_a
# try doing "file2=file_b" as well
file2=file_c
if [[ $(sha1sum "$file1" | cut -d ' ' -f1) = $(sha1sum "$file2" | cut -d ' ' -f1) ]]; then
echo same
else
echo different
fi
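The same idea can be applied to md5sum as in the question. The sketch below is only one way to do it: same_md5 is a hypothetical helper name, and the three temporary files are made-up test data. Feeding each file to md5sum on stdin keeps the file name out of the output, so no cut is needed.

```shell
#!/bin/bash
# same_md5 (hypothetical helper): succeeds when both files have the same MD5 checksum.
same_md5() {
    local sum1 sum2
    sum1=$(md5sum < "$1")   # output is "<hash>  -" for both files
    sum2=$(md5sum < "$2")
    [[ $sum1 == "$sum2" ]]
}

tmp=$(mktemp -d)
echo 1 > "$tmp/test1"
echo 1 > "$tmp/test2"
echo 2 > "$tmp/test3"

if same_md5 "$tmp/test1" "$tmp/test2"; then echo same; else echo different; fi
if same_md5 "$tmp/test1" "$tmp/test3"; then echo same; else echo different; fi
```

If only equality of the contents matters, cmp -s file1 file2 answers the same question without hashing at all.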

how to awk pattern as variable and loop the result?

I assign a keyword to a variable, and need to filter a file with awk using this variable and loop over the result. The file has millions of lines.
I have tried the code below.
DEVICE="DEV2"
while read -r line
do
echo $line
X_keyword=`echo $line | cut -d ',' -f 2 | grep -w "X" | cut -d '=' -f2`
echo $X_keyword
done <<< "$(grep -w $DEVICE $config)"
log="Dev2_PRT.log"
while read -r file
do
VALUE=`echo $file | cut -d '|' -f 1`
HEADER=`echo $VALUE | cut -c 1-4`
echo $file
if [[ $HEADER = 'PTR:' ]]; then
VALUE=`echo $file | cut -d '|' -f 4`
echo $VALUE
XCOORD+=($VALUE)
((X++))
fi
done <<< "awk /$X_keyword/ $log"
Expected result:
The log file contains many lines like the following:
PTR:1|2|3|4|X_keyword
PTR:1|2|3|4|Y_rest .....
Filter on X_keyword and get field number 4.
Unfortunately your shell script is simply the wrong approach to this problem (see https://unix.stackexchange.com/q/169716/133219 for some of the reasons why) so you should set it aside and start over.
To demonstrate the solution, let's create a sample input file:
$ seq 10 | tee file
1
2
3
4
5
6
7
8
9
10
and a shell variable to hold a regexp that's a character list of the chars 5, 6, or 7:
$ var='[567]'
Now, given the above input, here is how to g/re/p for a pattern held in a variable and count the results:
$ awk -v re="$var" '$0~re{print; c++} END{print "---" ORS c+0}' file
5
6
7
---
3
If that's not all you need then please edit your question to clarify your requirements and provide concise, testable sample input and expected output.
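Applied to the log format shown in the question, the same technique might look like the sketch below. The sample log lines and the values in field 4 are made up for illustration, and the choice to print the matches followed by a count line is an assumption about what is wanted.

```shell
#!/bin/bash
# Made-up sample log in the PTR:...|...|...|value|keyword layout from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
PTR:1|2|3|40|X_keyword
PTR:1|2|3|41|Y_rest
PTR:5|6|7|80|X_keyword
EOF

keyword='X_keyword'

# -F'|' splits on the pipe character; -v passes the shell variable in as a regexp.
# Only lines starting with PTR: that also match the keyword contribute field 4.
xcoord=$(awk -F'|' -v re="$keyword" '
    $1 ~ /^PTR:/ && $0 ~ re { print $4; n++ }
    END { print "count=" n+0 }
' "$log")

echo "$xcoord"
```

A single awk pass like this avoids starting several cut and echo processes per line, which matters for a file with millions of lines.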

sh to read a file and take particular value in shell

I need to read a JSON file, extract values such as 99XXXXXXXXXXXX0 and cccs, and write them to a CSV file with the columns BASE_No and Schedule.
Input file: classedFFDCD_5666_4888_45_2018_02112018012106.021.json
"bfgft":"99XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"cccs"
"bfgft":"21XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"nncs"
"bfgft":"56XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"fgbs"
"bfgft":"44XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"ddss"
"bfgft":"94XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"jjjs"
Expected output:
BASE_No,Schedule
99XXXXXXXXXXXX0,cccs
21XXXXXXXXXXXX0,nncs
56XXXXXXXXXXXX0,fgbs
44XXXXXXXXXXXX0,ddss
94XXXXXXXXXXXX0,jjjs
I am using the code below to read the file name and date, but I am unable to read the file to get BASE_No and Schedule.
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for line in `ls -lrt *.json`; do
date=$(echo $line |awk -F ' ' '{print $6" "$7}');
file=$(echo $line |awk -F ' ' '{print $9}');
echo ''$file','$(date "+%Y/%m/%d %H.%M.%S")'' >> $File_Tracker
done
Assuming the structure of the JSON doesn't change from line to line, the sample code below reads the file line by line, extracts the two values with cut, and joins them with printf. The output is stored in a new file, output.txt.
#!/bin/bash
input="/home/kj4458/winhome/Downloads/sample.json"
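# Write the CSV header expected in the question
printf 'BASE_No,Schedule\n' > output.txt
while IFS= read -r var
do
# Splitting on ':' puts the base number in field 2 and the schedule in field 4
base=$(printf '%s\n' "$var" | cut -d':' -f 2 | cut -d',' -f 1)
schedule=$(printf '%s\n' "$var" | cut -d':' -f 4 | cut -d',' -f 2)
printf '%s,%s\n' "$base" "$schedule" | sed 's/"//g' >> output.txt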
done < "$input"
Using the double quote as the awk field separator also works:
awk -F '"' '{print $4 "," $12}' file
99XXXXXXXXXXXX0,cccs
21XXXXXXXXXXXX0,nncs
56XXXXXXXXXXXX0,fgbs
44XXXXXXXXXXXX0,ddss
94XXXXXXXXXXXX0,jjjs
I got that result!
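For completeness, here is a sketch of the awk approach that also emits the header row. It assumes every line has exactly the layout shown in the question, so that splitting on the double quote always puts the base number in field 4 and the schedule in field 12. The temporary input file holds two of the sample lines.

```shell
#!/bin/bash
# Two sample lines copied from the question, stored in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
"bfgft":"99XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"cccs"
"bfgft":"21XXXXXXXXXXXX0","fp":"XXXXXX","cur_gt":225XXXXXXXX0,"nncs"
EOF

# BEGIN prints the header once; each line is then split on the double quote.
csv=$(awk -F'"' 'BEGIN { print "BASE_No,Schedule" } { print $4 "," $12 }' "$input")

echo "$csv"
```

If the real files are valid JSON rather than fragments like these, a JSON-aware tool would be more robust than counting quote-separated fields.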

Too many arguments error in shell script

I am trying a simple shell script like the following:
#!/bin/bash
up_cap=$( cat result.txt | cut -d ":" -f 6,7 | sort -n | cut -d " " -f 2 | sort -n)
down_cap=$( cat result.txt | cut -d : -f 6,7 | sort -n | cut -d " " -f 6| sort -n)
for value in "${down_cap[@]}";do
if [ $value > 80000 ]; then
cat result.txt | grep -B 1 "$value"
fi
done
echo " All done, exiting"
When I execute the above script as ./script.sh, I get the error:
./script.sh: line 5: [: too many arguments
All done, exiting
I have searched extensively and am still not able to fix this.
You want
if [ "$value" -gt 80000 ]; then
Use -gt, not >, to check whether A is greater than B: inside [ ], a bare > is treated by the shell as an output redirection. I added the quotation marks merely to prevent the script from failing in case $value is empty.
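To see why > misbehaves here, the following sketch compares the two forms. It runs in a scratch directory and uses a made-up value of 5; the variable names redirect_result and gt_result are only for illustration.

```shell
#!/bin/bash
# Work in a scratch directory, because the first test creates a file as a side effect.
cd "$(mktemp -d)" || exit 1
value=5

# Inside [ ], an unescaped > is a redirection: the test becomes [ 5 ], which is
# true for any non-empty string, and an empty file named 80000 is created.
if [ $value > 80000 ]; then redirect_result=true; else redirect_result=false; fi

# -gt compares the two operands as integers.
if [ "$value" -gt 80000 ]; then gt_result=true; else gt_result=false; fi

echo "with >: $redirect_result, with -gt: $gt_result"
ls
```

The first test reports true even though 5 is not greater than 80000, and ls shows the stray file named 80000 that the redirection left behind.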
Try declaring the variable value explicitly as an integer:
declare -i value
So, with dominikh's additions and mine, the code should look like this:
#!/bin/bash
up_cap=$( cat result.txt | cut -d ":" -f 6,7 | sort -n | cut -d " " -f 2 | sort -n)
down_cap=$( cat result.txt | cut -d : -f 6,7 | sort -n | cut -d " " -f 6| sort -n)
# down_cap holds one plain string, not an array, so let the shell split it into words
for value in $down_cap; do
declare -i value
if [ "$value" -gt 80000 ]; then
cat result.txt | grep -B 1 "$value"
fi
done
echo " All done, exiting"
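Since $( ... ) yields a single string rather than an array, another option is to read the values into a real bash array first and loop over that. This sketch assumes bash 4 or later (for mapfile) and uses made-up numbers in place of the output of the cut and sort pipeline. The over_limit array is a hypothetical name used only to collect the matches.

```shell
#!/bin/bash
# Made-up sample values standing in for the output of the cut/sort pipeline.
mapfile -t down_cap < <(printf '%s\n' 120000 500 90000)

over_limit=()
for value in "${down_cap[@]}"; do
    # Each array element is a single number, so the quoted test gets exactly one operand.
    if [ "$value" -gt 80000 ]; then
        over_limit+=("$value")
    fi
done

printf '%s\n' "${over_limit[@]}"
```

With a real array, "${down_cap[@]}" expands to one word per element, which is what the original loop appears to have intended.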

hash each line in text file

I'm trying to write a little script which will open a text file and give me an md5 hash for each line of text. For example I have a file with:
123
213
312
I want output to be:
ba1f2511fc30423bdbb183fe33f3dd0f
6f36dfd82a1b64f668d9957ad81199ff
390d29f732f024a4ebd58645781dfa5a
I'm trying to do this part in bash which will read each line:
#!/bin/bash
#read.file.line.by.line.sh
while read line
do
echo $line
done
Later on I do:
$ more 123.txt | ./read.line.by.line.sh | md5sum | cut -d ' ' -f 1
But I'm missing something here; it does not work :(
Maybe there is an easier way...
Almost there, try this:
while read -r line; do printf %s "$line" | md5sum | cut -f1 -d' '; done < 123.txt
Unless you also want to hash the newline character at the end of every line, you should use printf or echo -n instead of plain echo.
In a script:
#! /bin/bash
cat "$@" | while read -r line; do
printf %s "$line" | md5sum | cut -f1 -d' '
done
The script can be called with multiple files as parameters.
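To see the effect of the trailing newline mentioned above, this small sketch hashes the same text both ways, using the value 123 from the question. It assumes GNU md5sum is available; the variable names are only for illustration.

```shell
#!/bin/bash
# echo appends a newline, so this hashes the four bytes "123\n".
with_newline=$(echo 123 | md5sum | cut -d' ' -f1)

# printf %s writes the argument as is, so this hashes the three bytes "123".
without_newline=$(printf %s 123 | md5sum | cut -d' ' -f1)

echo "echo:   $with_newline"
echo "printf: $without_newline"
```

The two hashes differ, so it is worth checking which variant the consumer of the hashes expects before choosing between echo and printf.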
You can just call md5sum directly in the script:
#!/bin/bash
#read.file.line.by.line.sh
while IFS= read -r line
do
echo "$line" | md5sum | awk '{print $1}'
done
That way the script spits out directly what you want: the md5 hash of each line.
This worked for me (md5 is the BSD/macOS counterpart of md5sum):
cat "$file" | while IFS= read -r line; do printf %s "$line" | tr -d '\r\n' | md5 >> hashes.csv; done

Resources