getting error while using sed function on hp unix box - bash

I'm trying to retrieve nth column from "busfile" file by substituting values in "i"
the below code works fine on redhat linux, when tried on hp unix i'm getting error
"sed: Function {i}{p} cannot be parsed."
here is my code
acList=/z/temp/busfile
i=1
temp1=`sed -n "{i}{p}" $acList`
echo $temp1
Update:
Even when I add the $ as suggested in some of the answers, I still have the same problem.
temp1=`sed -n "${i}{p}" $acList`

If you're trying to use the i variable to print a line, you need to precede it with $:
temp1=`sed -n "${i}p" $acList`
as per the following transcript:
pax> i=3
pax> echo 'a
...> b
...> c
...> d
...> e
...> f
...> g' | sed -n "${i}p"
c
In situations like this, I tend to first try the simplest solution then gradually add complexity until it fails.
The first step would be to create a four-line file (called myfile) with the words one through four:
one
two
three
four
then try various commands with it, in ever increasing complexity:
sed -n "p" myfile # Print all lines.
sed -n "3p" myfile # Print hard-coded line.
i=3 ; sed -n "${i}p" myfile # Print line with parameter.
i=3 ; x=`sed -n "${i}p" myfile` ; echo $x # Capture line with parameter.
At some point, it will hopefully "break" and you can then target your investigations in a more concentrated manner.
However, I suspect it's unnecessary here since your purported use of that command to extract a column is incorrect. If you're trying to print a column rather than a line, then awk may be a better tool for the job:
pax> i=5
pax> echo 'pax is a really great guy' | awk -vf=$i '{print $f}'
great

You can use:
acList=/z/temp/busfile
i=1
temp1=`sed -n $i'p' $acList`
echo "$temp1"

Related

Speed up bash for loop which contains multiple sed commands

my bash for loop looks like:
for i in read_* ; do
cut -f1 $i | sponge $i
sed -i '1 s/^/>/g' $i
sed -i '3 s/^/>ref\n/g' $i
sed -i '4d' $i
sed -i '1h;2H;1,2d;4G' $i
mv $i $i.fasta
done
Are there any methods of speeding up this process, perhaps using GNU parallel?
EDIT: Added input and expected output.
Input:
sampleid 97 stuff 2086 42 213M = 3322 1431
TATTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TTTTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
Hopeful output:
>ref
TTTTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
>sampleid
TATTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
I used the sed -i '1h;2H;1,2d;4G' $i command to swap lines 2 and 4.
If I read it right, this should create the same result, though it would probably help a LOT if I could see what your input and expected output look like...
awk '{$0=$1}
FNR==1{hd=">"$0; next}
FNR==2{hd=hd"\n"$0;next}
FNR==3{print ">ref\n"$0 > FILENAME".fasta"}
FNR==4{next}
FNR==5{print hd"\n"$0 > FILENAME".fasta"}
' read_*
My input files:
$: cat read_x
foo x
bar x
baz x
last x
curiosity x
$: cat read_y
FOO y
BAR y
BAZ y
LAST y
CURIOSITY y
and the resulting output files:
$: cat read_x.fasta
>ref
baz
>foo
bar
curiosity
$: cat read_y.fasta
>ref
BAZ
>FOO
BAR
CURIOSITY
This runs in one pass with no loop aside from awk's usual internals, and leaves the originals in place so you can check it first. If all is good, all that's left is to remove the originals. For that, I would use extended globbing.
$: shopt -s extglob; rm read_!(*.fasta)
That will clean up the original inputs but not the new outputs.
Same results, three commands, no loops.
I am, or course, making some assumptions about what you are meaning to do that might not be accurate. To get this format in a single sed call -
$: sed -e 's/[[:space:]].*//' -e '1{s/^/>/;h;d}' -e '2{H;s/.*/>ref/}' -e '4x' read_x
>ref
baz
>foo
bar
curiosity
but that's not the same commands you used, so maybe I'm misreading it.
To use this to in-place edit multiple files at a time (instead of calling it in a loop on each file), use -si so that the line numbers apply to each file rather than the stream of records they collectively produce.
DON'T use -is, though you could use -i -s.
$: sed -s -i -e 's/[[:space:]].*//' -e '1{s/^/>/;h;d}' -e '2{H;s/.*/>ref/}' -e '4x' read_*
This still leaves you with the issue of renaming each, but xargs makes that pretty easy in the given example.
printf "%s\n" read_* | xargs -I# mv # #.fasta
addendum
Using the file you gave in the OP, assuming every file is the same general structure and exactly 4 lines -
$: cat file_0 # I made files 0 through 7, but with same data
sampleid 97 stuff 2086 42 213M = 3322 1431
TATTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TTTTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
$: sed -Esi '1{s/^([^[:space:]]+).*/>\1/;h;s/.*/>ref/}; 3x;' file_?
$: cat file_0 # used a diff on each, worked on all at once
>ref
TATTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
>sampleid
TTTTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGA
Breakout:
-Esi Extended pattern matching, separate file linecounts, in-place edits
1{...}; Collectively do these commands, in order, only on every line 1
s/^([^[:space:]]+).*/>\1/ add leading > but strip everything after any whitespace
h store the resulting >\1 line in the hold buffer
s/.*/>ref/ then replace the whole line with a literal >ref
`3x' swap line 3 with the value in the hold buffer from line 1
file_? I used a glob to supply the appropriate list of files all at once.
Doing same with awk:
$: awk 'FNR==1{id=">"$1; print ">ref" >FILENAME".fasta"; next} FNR==3{print id > FILENAME".fasta"; next} {print $0 > FILENAME".fasta"}' file_?
Then you can do file management as above with the xargs/mv for the sed or the shopt/rm for the awk - or we could add a little organizational work in awk if you like. Consider this:
awk 'BEGIN { system(" mkdir -p done ") }
FNR==1 { id=">"$1; print ">ref" > FILENAME".fasta"; next } # skip printing original
FNR==3 { print id > FILENAME".fasta"; next } # skip printing original
{ print $0 > FILENAME".fasta" } # every line NOT skipped
FNR==4 { close(FILENAME); close(FILENAME".fasta");
system("mv " FILENAME " done/")
}' file_?
Then if there are any problems, it's easy to delete the fasta's, move the originals back, adjust the code, and try again. If everything is ok, it's fast and easy to rm -fr done, yes?
Note that I really only added the mkdir inside a system call in the awk to show that you can, and to keep from having to manually do it separately if you have to run a few iterations or move it all into a wrapper script, etc.
The code in the question runs multiple subprocesses (cut, sponge, sed four times, and mv) for each file that is processed. Running subprocesses is relatively slow, so you can speed up the code significantly by reducing the number of them.
This Shellcheck-clean code is one way to do it:
#! /bin/bash -p
old_files=()
for f in read_* ; do
readarray -t lines <"$f"
printf '>ref\n%s\n>%s\n%s\n' \
"${lines[3]}" "${lines[0]%%[[:space:]]*}" "${lines[1]}" >"$f.fasta"
old_files+=( "$f" )
done
rm -- "${old_files[#]}"
This runs no subprocesses when processing individual files. It just reads the lines of the old file into an array using the built-in readarray command and writes to the new file using the built-in printf.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of the %% in ${lines[0]%%[[:space:]]*}.
To avoid running rm for each file, the code keeps a list of files to be deleted and removes all of them at the end. If you try the code, consider commenting the rm line until you are very confident that the rest of the code is doing what you want.

shell script compare file with multiple line pattern

I have a file which is created after some manual configuration.
I need to check this file automatically with a shell script.
The file looks like this:
eth0;eth0;1c:98:ec:2a:1a:4c
eth1;eth1;1c:98:ec:2a:1a:4d
eth2;eth2;1c:98:ec:2a:1a:4e
eth3;eth3;1c:98:ec:2a:1a:4f
eth4;eth4;48:df:37:58:da:44
eth5;eth5;48:df:37:58:da:45
eth6;eth6;48:df:37:58:da:46
eth7;eth7;48:df:37:58:da:47
I want to compare it to a pattern like this:
eth0;eth0;*
eth1;eth1;*
eth2;eth2;*
eth3;eth3;*
eth4;eth4;*
eth5;eth5;*
eth6;eth6;*
eth7;eth7;*
If I would only have to check this pattern I could run this loop:
c=0
while [ $c -le 7 ]
do
if [ "$(grep "eth"${c}";eth"${c}";*" current_mapping)" ];
then
echo "eth$c ok"
fi
(( c++ ))
done
There are 6 or more different patterns possible. A pattern could also look like this for example (depending and specific configuration requests):
eth4;eth0;*
eth5;eth1;*
eth6;eth2;*
eth7;eth3;*
eth0;eth4;*
eth1;eth5;*
eth2;eth6;*
eth3;eth7;*
So I don't think I can run a standard grep per line command in a loop. The eth numbers are not consistently the same.
Is it possible somehow to compare the whole file to pattern like it would be possible with grep for a single line?
Assuming file is your data file and patt is your file that contains above pattern. You can use this grep -f in conjunction with sed in a process substitution that replaces * with .* and ? with . to make it a workable regex.
grep -f <(sed 's/\*/.*/g; s/?/./g' patt) file
eth0;eth0;1c:98:ec:2a:1a:4c
eth1;eth1;1c:98:ec:2a:1a:4d
eth2;eth2;1c:98:ec:2a:1a:4e
eth3;eth3;1c:98:ec:2a:1a:4f
eth4;eth4;48:df:37:58:da:44
eth5;eth5;48:df:37:58:da:45
eth6;eth6;48:df:37:58:da:46
eth7;eth7;48:df:37:58:da:47
I wrote this loop now and it does the job (current_mapping being the file with the content in the first code block of the question). I would have to create arrays with different patterns and use a case for every pattern. I was just wondering if there is something like grep for multiple lines, that could the same without writing this loop.
array=("eth0;eth0;*" "eth1;eth1;*" "eth2;eth2;*" "eth3;eth3;*" "eth4;eth4;*" "eth5;eth5;*" "eth6;eth6;*" "eth7;eth7;*")
c=1
while [ $c -le 8 ]
do
if [ ! "$(sed -n "${c}"p current_mapping | grep "${array[$c-1]}")" ];
then
echo "somethings wrong"
fi
(( c++ ))
done
Try any:
grep -P '(eth[0-9]);\1'
grep -E '(eth[0-9]);\1'
sed -n '/\(eth[0-9]\);\1/p'
awk -F';' '$1 == $2'
There are commands only. Apply them to a pipe or file.
Updated the answer after the question was edited.
As we can see the task requirements are as follows:
a file (a set of lines) formatted like ethN;ethM;MAC
examine each line for equality ethN and ethM
if they are equal, output a string ethN ok
If I understand the task correctly we can achieve this using the following code without loops:
awk -F';' '$1 == $2 { print $1, "ok" }'

Using sed to replace character

I am leaning how to use awk and sed this week. I know this question might have been asked before but I am not sure what is wrong with my script. I have three files and I am using grep to search for the pattern gge0001x gge0001y gge0001z. x is in file1, y is in file 2, and z is in file3. If anyone wants to see L2E[1-3].iva they are here: https://gist.github.com/anonymous/1112988408874c730cd4f3d313226ba4
#!/bin/bash
echo "Performance Data"
sed -n '1,19p' L2E1.iva|cat > file1 #take lines 1-19 in L2E1 and take the
# output into file1. The next two commands do the same thing
sed -n '1,19p' L2E2.iva|cat > file2
sed -n '1,19p' L2E3.iva|cat > file3
curveName=`grep "F" file1|sed "s/F/ /"`
# This will search for F in file 1, and then substitute F with a space
curveName2=`grep "F" file2|sed "s/F/ /"`
curveName3=`grep "F" file3|sed "s/F/ /"`
echo "Curve Name" "$curveName $curveName2 $curveName3"
I want my output to be Curve Name gge0001x gge0001y gge0001z. But the output is this instead:
Performance Data gge0006ze gge0006x
If I echo them out by themselves then it is fine, but once I echo all three on the same line the output gets skewed. Why does x show up last when it is first when I echo it and where did my y go to?
A few tips at first:
sed -n '1,19p' L2E1.iva|cat > file1
You can omit the cat an redirect the output of sed directly to the file:
sed -n '1,19p' L2E1.iva > file1
curveName=`grep "F" file1|sed "s/F/ /"`
Use $() instead of backticks for process substitution:
curveName=$(grep "F" file1 | sed "s/F/ /")
But the output is this instead: Performance Data gge0006ze gge0006x
The reason for Performance Data in your output is, that you echo it at the beginning of your script.
Moreover you've got a typo in the last echo: $curvename2 -> $curveName2, this is why your y is missing.
Did you double check your files for the right contents? That's the only reason i can imagine, why your x comes last and the z first.
You can perhaps compress your script into one line
$ echo Curve name $(grep -Pohm1 '(?<=F ).*' L2E{1..3})
Curve name gge0006x gge0006y gge0006z
exercise is to search the options used in grep

Bash - extracting a string between two points

For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.
If your data are "line oriented", so the marker is alone (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
Ps: yes, its looks like a dance of crazy toothpicks
Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
## cuts away the longest string from the beginning; # cuts away the shortest string from the beginning; %% cuts away the longest string at the end; % cuts away the shortes string at the end
The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=o;next} p' /tmp/l
extract everything here, ignore the rest
someother text
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.

Setting a BASH environment variable directly in AWK (in an AWK one-liner)

I have a file that has two columns of floating point values. I also have a C program that takes a floating point value as input and returns another floating point value as output.
What I'd like to do is the following: for each row in the original, execute the C program with the value in the first column as input, and then print out the first column (unchanged) followed by the second column minus the result of the C program.
As an example, suppose c_program returns the square of the input and behaves like this:
$ c_program 4
16
$
and suppose data_file looks like this:
1 10
2 11
3 12
4 13
What I'd like to return as output, in this case, is
1 9
2 7
3 3
4 -3
To write this in really sketchy pseudocode, I want to do something like this:
awk '{print $1, $2 - `c_program $1`}' data_file
But of course, I can't just pass $1, the awk variable, into a call to c_program. What's the right way to do this, and preferably, how could I do it while still maintaining the "awk one-liner"? (I don't want to pull out a sledgehammer and write a full-fledged C program to do this.)
you just do everything in awk
awk '{cmd="c_program "$1; cmd|getline l;print $1,$2-l}' file
This shows how to execute a command in awk:
ls | awk '/^a/ {system("ls -ld " $1)}'
You could use a bash script instead:
while read line
do
FIRST=`echo $line | cut -d' ' -f1`
SECOND=`echo $line | cut -d' ' -f2`
OUT=`expr $SECOND \* 4`
echo $FIRST $OUT `expr $OUT - $SECOND`
done
The shell is a better tool for this using a little used feature. There is a shell variable IFS which is the Input Field Separator that sh uses to split command lines when parsing; it defaults to <Space><Tab><Newline> which is why ls foo is interpreted as two words.
When set is given arguments not beginning with - it sets the positional parameters of the shell to the contents of the arguments as split via IFS, thus:
#!/bin/sh
while read line ; do
set $line
subtrahend=`c_program $1`
echo $1 `expr $2 - $subtrahend`
done < data_file
Pure Bash, without using any external executables other than your program:
#!/bin/bash
while read num1 num2
do
(( result = $(c_program num2) - num1 ))
echo "$num1 $result"
done
As others have pointed out: awk is not not well equipped for this job. Here is a suggestion in bash:
#!/bin/sh
data_file=$1
while read column_1 column_2 the_rest
do
((result=$(c_program $column_1)-$column_2))
echo $column_1 $result "$the_rest"
done < $data_file
Save this to a file, say myscript.sh, then invoke it as:
sh myscript.sh data_file
The read command reads each line from the data file (which was redirected to the standard input) and assign the first 2 columns to $column_1 and $column_2 variables. The rest of the line, if there is any, is stored in $the_rest.
Next, I calculate the result based on your requirements and prints out the line based on your requirements. Note that I surround $the_rest with quotes to reserve spacing. Failure to do so will result in multiple spaces in the input file to be squeezed into one.

Resources