How would I run this file without snippy? - anaconda

I am running the following script to read strain names and phenotypes from the snippy output file "core.vcf":
$ grep '^CP001217' core.vcf > new_core.txt

Related

diff 2 files and print only difference bash in Jenkins job

I am trying to diff 2 files in bash in a Jenkins job.
If I do this in MinGW (e.g. Git Bash), everything works just fine.
But if I run the same command in Jenkins, I get the entire contents of file 2.
For example, I tried 3 different methods:
comm -1 -3 --nocheck-order file1.txt file2.txt
grep -vxF -f file1.txt file2.txt
diff --changed-group-format='%>' --unchanged-group-format='' file1.txt file2.txt
Each file is the output of a sqlplus command, and the output is already sorted like this:
First file:
STANDARD
CONSTANT
PL_SQL
CREATE_OUT
RECALL
And the second file:
STANDARD
CONSTANT
PL_SQL
CREATE_OUT
RECALL
CONFIRM
I'm using Git Bash as the shell and it works in Windows. In Git Bash, if I run any of the above commands I get only the changes, but if I run the same command in Jenkins I get all of the output from file2.txt.
This is driving me crazy.
UPD: I also tried the Windows command findstr /bevg:file1.txt file2.txt
Same result.
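All three commands compare whole lines, so an invisible difference such as Windows CRLF line endings on one side would make every line of file2.txt look different. A minimal sketch of one thing worth checking in the Jenkins workspace; this is only a guess at the cause, not a confirmed fix, and the *.lf.txt names are placeholders:
cat -A file1.txt | head -n 5    # lines ending in "^M$" have CRLF endings
cat -A file2.txt | head -n 5
tr -d '\r' < file1.txt > file1.lf.txt   # strip carriage returns before comparing
tr -d '\r' < file2.txt > file2.lf.txt
comm -1 -3 file1.lf.txt file2.lf.txt    # prints only lines unique to file2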

bash one liner to remove duplicate path in line

I have a file with a lot of strings, and one line starts with LIBXML2_INCLUDE.
The file is generated by another program (to be specific, by ./configure). This line wrongly contains two paths; the first path is not correct and I need to remove it. This is how the line appears in the file:
LIBXML2_INCLUDE=-I/home/gan/Music/wvm/build/level/ast/deliveryx/libxml2//home/gan/Music/wvm/build/level/ast/deliveryx/libxml2/include/libxml2
I need to remove the first /home/gan/Music/wvm/build/level/ast/deliveryx/libxml2/
and the expected output is
LIBXML2_INCLUDE=-I/home/gan/Music/wvm/build/level/ast/deliveryx/libxml2/include/libxml2
How can I write a bash one-liner to accomplish this?
Try like this:
# cat file
SOMEVAR=-I/some/path//some/path
# sed -i -e '/^SOMEVAR=/s,=-I.*//,=-I/,' file
# cat file
SOMEVAR=-I/some/path
#
To be a bit fancier:
$ cat file
SOMEVAR=-I/some/path//some/path
$ sed -i -e '/^SOMEVAR=/s,=-I\(.*\)/\1$,=-I\1/,' file
$ cat file
SOMEVAR=-I/some/path/
$
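Applied to the actual line from the question, the first form would look like this (a sketch; "outfile" stands in for whatever file ./configure generated, and it assumes the variable name is exactly LIBXML2_INCLUDE as shown):
$ sed -i -e '/^LIBXML2_INCLUDE=/s,=-I.*//,=-I/,' outfile
$ grep '^LIBXML2_INCLUDE=' outfile
LIBXML2_INCLUDE=-I/home/gan/Music/wvm/build/level/ast/deliveryx/libxml2/include/libxml2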

using cat in a bash script is very slow

I have very big text files (~50,000) over which I have to do some text processing, basically running multiple grep commands.
When I run them manually they return in an instant, but when I do the same in a bash script it takes a lot of time. What am I doing wrong in the bash script below? I pass the names of the files as command line arguments to the script.
Example Input data :
BUSINESS^GFR^GNevil
PERSONAL^GUK^GSheila
Output that should go into a file - BUSINESS^GFR^GNevil
The script starts printing the whole file to the terminal after quite a while. How do I suppress that?
#!/bin/bash
cat $2 | grep BUSINESS
Do NOT use cat with a program that can read the file itself.
It slows things down and you lose functionality:
grep BUSINESS test | grep -E '\^GFR|\^GDE'
Or you can do it like this with awk:
awk '/BUSINESS/ && /\^GFR|\^GDE/' test
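If the matches should end up in a file rather than on the terminal, redirect the output inside the script. A minimal sketch based on the original script; "matches.txt" is only an example name:
#!/bin/bash
# $2 is the input file, as in the original script; write matching lines
# to a file instead of the terminal ("matches.txt" is an arbitrary name)
grep BUSINESS "$2" | grep -E '\^GFR|\^GDE' > matches.txt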

File contents not deleted in linux centos

I have a program running using this command:
command 2> sample.txt
Now that file is growing continuously; the command will exit in 5-6 days, and I believe the file size won't go into GBs.
I tried this:
echo "" > sample.txt but that's not making any difference, and the file size keeps growing.
I was thinking of setting up a cron job to empty its contents every hour.
How can I empty the contents of the file?
Try the following command; it will write the command's output to a file (your console will also get the messages printed). Since the original command redirected stderr with 2>, merge stderr into the pipe:
command 2>&1 | tee -a file.log
and you can empty the contents at any time with
> file.log
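For context on why the original truncation appeared to fail: with command 2> sample.txt the process keeps writing at its old file offset, so the size reported by ls keeps growing even after the file is emptied. Because tee -a opens the log in append mode, every write goes to the current end of the file, so truncation really works. A small sketch of the hourly cron job mentioned in the question; the path is an example only:
# crontab entry: truncate the log at the start of every hour
0 * * * * truncate -s 0 /path/to/file.log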

BASH shell scripting file parsing [newbie]

I am trying to write a bash script that goes through a file line by line (ignoring the header), extracts a file name from the beginning of each line, and then finds a file by this name in one directory and moves it to another directory. I will be processing hundreds of these files in a loop and moving over a million individual files. A sample of the file is:
ImageFileName Left_Edge_Longitude Right_Edge_Longitude Top_Edge_Latitude Bottom_Edge_Latitude
21088_82092.jpg: -122.08007812500000 -122.07733154296875 41.33763821961143 41.33557596965434
21088_82093.jpg: -122.08007812500000 -122.07733154296875 41.33970040427444 41.33763821961143
21088_82094.jpg: -122.08007812500000 -122.07733154296875 41.34176252364274 41.33970040427444
I would like to ignore the first line and then grab 21088_82092.jpg as a variable. File names may not always be the same length, but they will always have the format digits_digits.jpg
Any help for an efficient approach is much appreciated.
This should get you started:
$ tail -n +2 input | cut -f 1 -d: | while read -r file; do test -f "$dir/$file" && mv -v "$dir/$file" "$destination"; done
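The one-liner assumes $dir and $destination are already set; a sketch of wrapping it in a small script, where the directory names are examples only and "input" is the metadata file shown in the question:
#!/bin/bash
# example paths -- replace with the real source and destination directories
dir=/data/tiles_in
destination=/data/tiles_sorted
# skip the header line, take the name before the ":", and move each file
tail -n +2 input | cut -f 1 -d: | while read -r file; do
    test -f "$dir/$file" && mv -v "$dir/$file" "$destination"
done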
You can construct a script that will do something like this, then simply run the script. The following command will give you a script which will copy the files from one place to another, but you can make the script generation more complex simply by changing the awk output:
pax:~$ cat qq.in
ImageFileName Left_Edge_Longitude Right_Edge_Longitude
21088_82092.jpg: -122.08007812500000 -122.07733154296875
21088_82093.jpg: -122.08007812500000 -122.07733154296875
21088_82094.jpg: -122.08007812500000 -122.07733154296875
pax:~$ awk -F: '/^[0-9]+_[0-9]+\.jpg:/ {
printf "cp /srcdir/%s /dstdir\n",$1
}' qq.in
cp /srcdir/21088_82092.jpg /dstdir
cp /srcdir/21088_82093.jpg /dstdir
cp /srcdir/21088_82094.jpg /dstdir
You capture the output of that command (the last three lines) to another file; that file is then your script for doing the actual copies.
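To run it, capture the generated commands into a file and execute that file. A sketch using the example names from the session above; "do_copies.sh" is just an example name for the generated script:
pax:~$ awk -F: '/^[0-9]+_[0-9]+\.jpg:/ { printf "cp /srcdir/%s /dstdir\n",$1 }' qq.in > do_copies.sh
pax:~$ sh do_copies.sh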
