Fill an argument with tab file values in bash - bash

I need to create a bash file in order to run a certain command on a server.
Here is one of the lines:
Programm/programm.pl -k 1 -q --acc_number
where --acc_number needs a comma-separated list of accession numbers, e.g. --acc_number Number13JJ2,Number0090D93,Number088DF.
But I actually have a file called file_acc_number where each accession number is on its own line, such as:
Number13JJ2
Number0090D93
Number088DF
Does someone have an idea how to parse this tab file and put the accession numbers directly into a comma-separated list, to get:
Programm/programm.pl -k 1 -q --acc_number Number13JJ2,Number0090D93,Number088DF
Thank you for your help

Try using paste:
Programm/programm.pl -k 1 -q --acc_number `paste -s -d, file_acc_number`
Try running paste -s -d, file_acc_number first to understand whether you get what you require.
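To see what paste does here, a minimal sketch may help. It assumes a throwaway sample file (created with mktemp) standing in for file_acc_number, and it only echoes the final command instead of running programm.pl:

```shell
# Hypothetical sample file standing in for file_acc_number.
acc_file=$(mktemp)
printf '%s\n' Number13JJ2 Number0090D93 Number088DF > "$acc_file"

# -s joins all lines of the file into a single line; -d, uses a comma as the separator.
acc_list=$(paste -s -d, "$acc_file")

# Quote the expansion so the whole list is passed as one argument.
echo Programm/programm.pl -k 1 -q --acc_number "$acc_list"

rm -f "$acc_file"
```

Storing the list in a variable first makes it easy to inspect before the real command is run.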

With an inline command substitution, maybe? Like this:
Programm/programm.pl -k 1 -q --acc_number $(sed -z 's/\n/,/g' file_acc_number)
Make sure your file "file_acc_number" has no newline at the end of it; otherwise the list will end with a trailing comma. Note that -z is a GNU sed option.
With this, you replace each newline character with a comma on the fly, without affecting the original file.
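If the file does end with a newline (most text files do), one possible workaround is to strip the trailing comma afterwards. This is only a sketch that swaps in tr plus a second sed instead of sed -z, and it assumes a temporary sample file in place of file_acc_number:

```shell
# Hypothetical sample file; printf '%s\n' leaves a newline after the last entry.
acc_file=$(mktemp)
printf '%s\n' Number13JJ2 Number0090D93 Number088DF > "$acc_file"

# tr turns every newline into a comma (including the last one),
# then sed removes the single trailing comma.
joined=$(tr '\n' ',' < "$acc_file" | sed 's/,$//')

echo Programm/programm.pl -k 1 -q --acc_number "$joined"

rm -f "$acc_file"
```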

Related

How to use grep/awk/sed to print until a certain character?

I am a complete beginner at shell scripting and I am trying to iterate through a set of JSON files and extract a certain field from each of them. Each JSON file has a "country":"xxx" field. In each JSON file the same field occurs about 10k times with the same country name, so I need only the first occurrence, and I can get that using "-m 1".
I tried to use grep for this but could not figure out how to extract the whole field, including the country name, at its first occurrence in each file.
for FILE in *.json; do
    grep -o -a -m 1 -h -r '"country":"' "$FILE"
done
I tried adding another pipe with the pattern below, but it did not work:
| egrep -o '^[^"]+'
Actual Output:
"country":"
"country":"
"country":"
Desired Output:
"country":"romania"
"country":"united kingdom"
"country":"tajikistan"
but I need the whole thing. Any help would be great. Thanks
There is one general answer to the question "I only want the first occurrence", and that answer is:
... | head -n 1
This means: whatever you do, take the head (the first lines) of the output. The -n switch lets you say how many lines you want (one in this case).
The same can be done for the last occurrence(s), but then you use tail instead of head (tail also accepts the -n switch).
After trying many things, I found the pattern I was looking for (-P needs a grep with PCRE support, such as GNU grep):
grep -Po '"country":.*?[^\\]",' "$FILE" | head -n 1;
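For comparison, here is a minimal sketch that avoids -P. It assumes the country values never contain an escaped double quote, uses a made-up one-line JSON sample, and relies on grep -o, which GNU and BSD grep support but POSIX does not require:

```shell
# Hypothetical sample: one line holding the same field twice.
json_file=$(mktemp)
printf '%s\n' '{"id":1,"country":"romania"},{"id":2,"country":"romania"}' > "$json_file"

# -o prints every match on its own line; [^"]* stops at the closing quote.
# head -n 1 keeps only the first match, even when several sit on one line.
first_country=$(grep -o '"country":"[^"]*"' "$json_file" | head -n 1)

echo "$first_country"

rm -f "$json_file"
```

The head -n 1 is still needed because -m 1 limits matching lines, not matches within a line.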

Shell script to fetch a value from a file and save it to other

I have some text in one file which I want to be copied to another file, using shell script.
This is the script -
#!/bin/sh
PROPERTY_FILE=/path/keyValuePairs.properties
function getValue {
FIELD_KEY=$1
FIELD_VALUE=`cat $PROPERTY_FILE | grep "$FIELD_KEY" | cut --complement -d'=' -f1`
}
SERVER_FILE=/path/FileToReplace.yaml
getValue "xyz.abc"
sed -i -e "s|PASSWORD|$FIELD_VALUE|g" $SERVER_FILE
keyValuePairs.properties:
xyz.abc=abs
FileToReplace.yaml:
someField:
address: "someValue"
password: PASSWORD
The goal of the script is to fetch "abs" from keyValuePairs.properties and put it in place of the PASSWORD placeholder in FileToReplace.yaml.
The FileToReplace.yaml should look like
someField:
address: "someValue"
password: abs
Note - Instead of "abs", there could be '=' in the text. It should work fine too.
The current situation is that when I run the script, it updates FileToReplace.yaml as
someField:
address: "someValue"
password:
It is setting the value as empty.
Can someone please help me figure what's wrong with this script?
Note - Whenever I execute the script, I get the issue -
sh scriptToRun.sh
cut: illegal option -- -
usage: cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-s] [-d delim] [file ...]
If I use gcut, the code just works fine, but I can't use gcut (requirement issues). I need to fix this using cut.
There are a few issues with your script:
FIELD_VALUE is set as a side effect inside getValue() (shell variables are global unless declared local), which makes the script harder to follow.
getValue() will match rows containing FIELD_KEY anywhere in the line (e.g. some.property=string.containing.xyz.abc)
getValue() could return multiple rows.
All occurrences of the string "PASSWORD" in the server file will be updated, not just the ones on the "password: PASSWORD" line.
If you can use bash instead of sh, this should resolve all of the issues:
#!/bin/bash
declare property_file=/path/keyValuePairs.properties
declare server_file=/path/FileToReplace.yaml
declare property="xyz.abc"
property_line=$(grep -m 1 "^${property}=" "${property_file}")
# ${property_line#*=} strips only up to the first '=', so values containing '=' survive
sed -i 's|^\(\s*password:\s*\)PASSWORD\s*$|\1'"${property_line#*=}"'|g' "${server_file}"
The original code which I posted worked. I was using the wrong file name in the shell (in my real code), which caused it to not read the value and hence set it to empty.
Replace the cut command with:
cut -d'=' -f2-
and it should work on all versions of cut.
-f2- means field 2 and all later. This is necessary to handle values containing '='s.
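A quick way to check this is a small sketch like the following, which assumes a temporary properties file with a made-up value containing '=' characters:

```shell
# Hypothetical properties file; the value itself contains '=' signs.
props=$(mktemp)
printf '%s\n' 'xyz.abc=p=ss=word' 'other.key=unused' > "$props"

# Anchor the key at the start of the line, then keep field 2 and everything after it.
value=$(grep -m 1 '^xyz\.abc=' "$props" | cut -d'=' -f2-)

echo "$value"

rm -f "$props"
```

With -f2 alone, everything after the second '=' would be dropped.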
And yes, some characters will cause problems for the sed command. It's hard to get a robust solution without getting into trouble here. A Python script may be the better choice.
If shell script is the only option, you could try something like this:
(sed -e '/PASSWORD/,$d' FileToReplace.yaml;
echo " password: ${FIELD_VALUE}";
sed -e '1,/PASSWORD/d' FileToReplace.yaml) > FileToReplace.yaml.new \
&& mv FileToReplace.yaml.new FileToReplace.yaml
but it gets quite ugly. (Print the lines before the one containing "PASSWORD", then echo the full password line, then print the lines after it.)
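Another option, if awk is available, is to pass the value through the environment so that no character in it is treated specially. This is only a sketch of a different technique than the sed approach: the sample YAML, its two-space indentation, the value, and the VALUE variable name are all assumptions made for illustration.

```shell
# Hypothetical sample file and a value full of characters that upset sed.
yaml_file=$(mktemp)
printf '%s\n' 'someField:' '  address: "someValue"' '  password: PASSWORD' > "$yaml_file"
field_value='a|b&c=d\e'

# ENVIRON hands the value to awk as a plain string (no escape processing).
new_yaml=$(VALUE="$field_value" awk '
    /^[ \t]*password:[ \t]*PASSWORD[ \t]*$/ {
        sub(/PASSWORD[ \t]*$/, "")    # drop the placeholder, keep the indentation
        print $0 ENVIRON["VALUE"]     # append the literal value
        next
    }
    { print }
' "$yaml_file")

printf '%s\n' "$new_yaml"

rm -f "$yaml_file"
```

To update the file itself, the output could be written to a temporary file and moved over the original, as in the snippet above.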
You can also use something like this:
cat << EOF > FileToCreate.yaml
someField:
address: "someValue"
password: ${FIELD_VALUE}
EOF
if keeping the old contents of the file is not important.

Defining a variable using head and cut

This might be an easy question, but I'm new to bash and haven't been able to find a solution.
I'm writing the following script:
for file in `ls *.map`; do
ID=${file%.map}
convertf -p ${ID}_par #this is a program that I use, no problem
NAME=head -n 1 ${ID}.ind | cut -f1 -d":" #Now: This step is the problem: don't seem to be able to make a proper NAME function. I just want to take the first column of the first line of the file ${ID}.ind
It gives me the return
line 5: bad substitution
any help?
Thanks!
There are a couple of issues in your code:
for file in `ls *.map` does not do what you want. It will fail e.g. if any of the filenames contains a space or *, but there's more. See http://mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29 for details.
You should just use for file in *.map instead.
ALL_UPPERCASE names are generally used for system variables and built-in shell variables. Use lowercase for your own names.
That said,
for file in *.map; do
    id="${file%.map}"
    convertf -p "${id}_par"
    name="$(head -n 1 "${id}.ind" | cut -f1 -d":")"
    ...
looks like it would work. We just use $( cmd ) to capture the output of a command in a string.
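Here is a small self-contained sketch of that capture step. The .ind contents are invented for the example; only the colon-separated first column matters:

```shell
# Hypothetical stand-in for ${id}.ind with a colon-separated first column.
ind_file=$(mktemp)
printf '%s\n' 'sample1:pop1 M Control' 'sample2:pop1 F Control' > "$ind_file"

# $( ... ) runs the pipeline and stores its output (minus trailing newlines) in name.
name=$(head -n 1 "$ind_file" | cut -d':' -f1)

echo "name is: $name"

rm -f "$ind_file"
```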

How to get the highest numbered link from curl result?

I have created a small program consisting of a couple of shell scripts that work together. It is almost finished
and everything seems to work fine, except for one thing that I'm not really sure how to do,
and which I need in order to finish this project.
There seem to be many routes that can be taken, but I just can't get there.
I have some curl results with lots of unused data, including different links, and among all that data there is a bunch of similar links.
I only need to get (into a variable) the link with the highest number (without the always-same text).
The links are all similar and have this structure:
always same text
always same text
always same text
i was thinking about something like;
content="$(curl -s "$url/$param")"
linksArray= get from $content all links that are in the href section of the links
that contain "always same text"
declare highestnumber;
for file in $linksArray
do
href=${1##*/}
fullname=${href%.html}
OIFS="$IFS"
IFS='_'
read -a nameparts <<< "${fullname}"
IFS="$OIFS"
if ${nameparts[1]} > $highestnumber;
then
highestnumber=${nameparts[1]}
fi
done
echo ${nameparts[1]}_${highestnumber}.html
result:
https://always/same/link/unique-name_19.html
This was just my guess; any working code that can be run from a bash script is fine.
thanks...
update
i found this nice program, it is easily installed by:
# 64bit version
wget -O xidel/xidel_0.9-1_amd64.deb https://sourceforge.net/projects/videlibri/files/Xidel/Xidel%200.9/xidel_0.9-1_amd64.deb/download
apt-get -y install libopenssl
apt-get -y install libssl-dev
apt-get -y install libcrypto++9
dpkg -i xidel/xidel_0.9-1_amd64.deb
It looks awesome, but I'm not really sure how to tweak it to my needs.
Based on that link and the answer below, I guess a possible solution would be:
use xidel, or use "$ sed -n 's/.*href="\([^"]*\).*/\1/p' file" as suggested in this link, but then tweak it to get the link with its HTML tags, like:
<a href="https://always/same/link/same-name_17.html">always same text</a>
then filter out everything that doesn't end with ( ">always same text</a> )
and then use the grep and sort approach mentioned below.
Continuing from the comment, you can use grep, sort and tail to isolate the highest number in your list of similar links without too much trouble. For example, if your list of links is as you have described (I've saved them in a file dat/links.txt for the purpose of the example), you can easily isolate the highest number in a variable:
Example List
$ cat dat/links.txt
always same text
always same text
always same text
Parsing the Highest Numbered Link
$ myvar=$(grep -o 'https:.*[.]html' dat/links.txt | sort | tail -n1); \
echo "myvar : '$myvar'"
myvar : 'https://always/same/link/same-name_19.html'
(note: the command above is all one line, separated by the line-continuation '\')
Applying Directly to Results of curl
Whether your list is in a file, or returned by curl -s, you can apply the same approach to isolate the highest number link in the returned list. You can use process substitution with the curl command alone, or you can pipe the results to grep. E.g. as noted in my original comment,
$ myvar=$(grep -o 'https:.*[.]html' < <(curl -s "$url/$param") | sort | tail -n1); \
echo "myvar : '$myvar'"
or pipe the result of curl to grep,
$ myvar=$(curl -s "$url/$param" | grep -o 'https:.*[.]html' | sort | tail -n1); \
echo "myvar : '$myvar'"
(same line continuation note.)
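One caveat: plain sort compares text, so _9 would sort after _19. If the numbers can have different lengths, a numeric sort on the part after the last underscore may be safer. The following is a sketch under that assumption; the sample links reuse the structure from the question, but the numbers 9, 100 and 19 are made up:

```shell
# Hypothetical sample standing in for the curl output.
links_file=$(mktemp)
cat > "$links_file" <<'EOF'
<a href="https://always/same/link/same-name_9.html">always same text</a>
<a href="https://always/same/link/same-name_100.html">always same text</a>
<a href="https://always/same/link/same-name_19.html">always same text</a>
EOF

# 1. grep -o pulls out each link ([^"]* stops at the closing quote).
# 2. sed prefixes every link with its number, e.g. "19 https://...".
# 3. sort -n orders by that number, tail keeps the largest,
#    and cut drops the numeric prefix again.
highest=$(grep -o 'https:[^"]*[.]html' "$links_file" |
    sed 's/.*_\([0-9][0-9]*\)[.]html$/\1 &/' |
    sort -n |
    tail -n 1 |
    cut -d' ' -f2-)

echo "$highest"

rm -f "$links_file"
```

The same pipeline can be fed from curl -s "$url/$param" instead of a file.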
Why not use Xidel with XQuery to sort the links and return the last one?
xidel -q links.txt --xquery '(for $i in //@href order by $i return $i)[last()]' --input-format xml
The --input-format parameter makes sure you don't need any HTML tags at the start and end of your txt file. The query is in single quotes so that bash does not expand $i.
If I'm not mistaken, in the latest Xidel the -q (quiet) parameter has been replaced by -s (silent).

Bash tr -s command

So let's say I have several characters in an email address which don't belong. I want to take them out with the tr command. For example:
jsmith#test1.google.com
msmith#test2.google.com
zsmith#test3.google.com
I want to take out every test[123]. part, so I am using the command tr -s 'test[123].' < email > mail. That is one of the ways I have tried, but none of my two or three attempts works as intended. The output I am trying to get is:
jsmith#google.com
msmith#google.com
zsmith#google.com
You could use sed.
$ sed 's/#test[1-3]\./#/' file
jsmith#google.com
msmith#google.com
zsmith#google.com
[1-3] matches any single character in the range 1 to 3 (1, 2 or 3). Add the -i option to edit the file in place and save the changes.
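As for why the tr attempt has no effect: tr works on sets of single characters, not on strings, and -s only squeezes runs of a repeated character from the set. A minimal sketch comparing the two, using two of the sample lines from the question in a temporary file:

```shell
# Sample lines copied from the question into a throwaway file.
emails=$(mktemp)
printf '%s\n' 'jsmith#test1.google.com' 'msmith#test2.google.com' > "$emails"

# tr sees the set {t,e,s,[,1,2,3,],.}; these lines contain no doubled
# character from that set, so the text comes out unchanged.
squeezed=$(tr -s 'test[123].' < "$emails")

# sed matches the whole string "#testN." and puts back only the "#".
cleaned=$(sed 's/#test[1-3]\./#/' "$emails")

printf 'tr:  %s\n' "$squeezed"
printf 'sed: %s\n' "$cleaned"

rm -f "$emails"
```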
