I have a bash script that loops through the lines of a file (a.txt) containing paths to gzipped files and searches each of those files for phrases using zgrep.
My bash script is:
for i in $(cat $1); do
echo $i;
echo zgrep -E '"phrase1|phrase2|phrase3|phrase4|phrase5|phrase6"' $i;
zgrep -E '"phrase1|phrase2|phrase3|phrase4|phrase5|phrase6"' $i;
done
which I call by myscript.sh a.txt:
The output is:
zgrep -E "phrase1|phrase2|phrase3|phrase4|phrase5|phrase6" myzippedfile.1.gz
phrase2 48 48.00 48
phrase3 35 35.00 35
phrase4 67 67.00 67
phrase5 99 99.00 99
(repeated for each file listed in a.txt).
However, when I run the same zgrep command that is executed inside the script's for loop, I get different output.
Executing:
zgrep -E "phrase1|phrase2|phrase3|phrase4|phrase5|phrase6" myzippedfile.1.gz
at the command line yields:
phrase1 29 29.00 29
phrase2 48 48.00 48
phrase3 35 35.00 35
phrase4 67 67.00 67
phrase5 99 99.00 99
phrase6 54 54.00 54
This output is correct, whereas the output generated from the bash script for loop is missing the first and last lines. How can this be?
Does anyone see any issues in my bash script? Why would the first and last lines be missing from the output?
The first and last elements in your alternatives contain literal " characters. So you'll only match phrase1 if there's a " before it, and only match phrase6 if there's a " after it. You shouldn't have those double quotes in the pattern, it should be:
zgrep -E 'phrase1|phrase2|phrase3|phrase4|phrase5|phrase6' $i;
You don't have the extra quotes when you execute the zgrep command by hand.
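The effect of the extra quotes can be reproduced without any gzipped files. The following is a minimal sketch that uses plain grep -E on made-up sample text (the sample variable is an assumption for illustration; zgrep takes the pattern the same way):

```shell
# Made-up sample standing in for the decompressed file contents
sample='phrase1 29
phrase2 48
phrase6 54'

# Single quotes preserve the inner double quotes, so they become part of the regex:
# the alternatives are  "phrase1  |  phrase2  |  phrase6"
with_quotes=$(printf '%s\n' "$sample" | grep -E '"phrase1|phrase2|phrase6"')

# Without the inner double quotes every alternative is a plain phrase
without_quotes=$(printf '%s\n' "$sample" | grep -E 'phrase1|phrase2|phrase6')

printf 'pattern with literal quotes:\n%s\n' "$with_quotes"
printf 'pattern without them:\n%s\n' "$without_quotes"
```

With the literal quotes only the phrase2 line is matched here, because no line in the sample contains a " character.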
I have 45 lines of code that need a sed command. Due to the recent change in GNU all my scripts are breaking and need -std=legacy & -fallow-invalid-boz. The only way I know how to do this is with sed. I'm not a computer programmer and sed is simple and easy to understand.
These are a sample of my sed commands.
Is there a way to run all these sed commands in a loop, or with sed itself? If another editor makes this easier, I can try to learn that too.
I have tried this
for X in [24,28,32,36,40,45,49,53,56,60,64,68,69,73,74,79]
sed -i '$Xs/= /= -std=legacy -fallow-invalid-boz /g' $HOME/WRF/Downloads/NCEPlibs/macros.make.linux.gnu
done
But I get the error:
$ for X in [24,28,32,36,40,45,49,53,56,60,64,68,69,73,74,79] sed -i '$Xs/= /= -std=legacy -fallow-invalid-boz /g' $HOME/WRF/Downloads/NCEPlibs/macros.make.linux.gnu done
bash: syntax error near unexpected token `sed'
bash: syntax error near unexpected token `done'
Doing it like this works better:
y="24 28 32 36 40 45 49 53 56 60 64 68 69 73 74 79"
for X in $y; do
sed -i "${X}s/= /= -std=legacy -fallow-invalid-boz /g" $HOME/WRF/Downloads/NCEPlibs/macros.make.linux.gnu
done
First of all, you forgot the do keyword in the for loop, so the loop fails with a syntax error before it can execute anything.
Second, [24,28,32,36,40,45,49,53,56,60,64,68,69,73,74,79] is not a valid list, because for splits its values on whitespace (spaces, tabs, or newlines), reading from left to right. The bracketed, comma-separated string is treated as a single word.
And last but not least, $X does not work in this example. Inside single quotes bash does not expand variables at all, and inside double quotes $Xs would be read as a variable named Xs. Writing ${X} inside "" instead of '' lets the line number actually be substituted.
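As a variation on the loop, the per-line expressions can be collected into one sed script so that sed runs only once. This is a rough sketch, not the original poster's setup: the temporary file and its three lines are made up, and it prints the result instead of using -i so that nothing is edited in place.

```shell
# Temporary stand-in for macros.make.linux.gnu (contents are made up)
demo=$(mktemp)
printf 'FFLAGS = -O2\nCFLAGS = -O2\nF77FLAGS = -O3\n' > "$demo"

# Build one sed script with a substitution per line number
script=''
for n in 1 3; do
  # ${n} keeps the variable name separate from the following "s"
  script="${script}${n}s/= /= -std=legacy -fallow-invalid-boz /;"
done

# Run sed once with the combined script and capture the edited text
result=$(sed "$script" "$demo")
rm -f "$demo"
printf '%s\n' "$result"
```

Only lines 1 and 3 of the sample receive the extra flags; line 2 is left untouched.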
I have a .txt file with bookmarks and all bookmarks above 100 have to be placed 2 pages down from where they are now, because I added two pages in the document. How do I write a bash script that adds 2 to all integers it finds in the document?
I'm new to writing code in general, but I already know that I should make a for loop to read each line, then determine if each word is an integer or not and then with an if statement add 2 to each integer above 100.
The problem is that I don't know exactly how to access (read and write) the file, and I also don't know how to determine whether something is a number or not.
Here is the link to the .txt file. A small sample:
The Tortle Package; 24
Tortle; 25
Elemental Evil Player's Companion; 27
Aarakocra; 28
Deep Gnome (gnome subrace); 30
Eberron\: Rising from the Last War; 84
Changelings; 85
Gnomes; 91
Goblinoids; 92
Bugbear; 93
Goblin; 94
Hobgoblin; 94
Half-Elves; 94
I did some research and this is the code I've come up with:
#!/bin/bash
cd /home/dexterdy/Documents/
i=$(grep -ho '[0-9]*' bookmarks.txt)
if [ "$i" -gt 100 ]; then
i += 2
fi
It seems that the grep variable outputs one large string with all the numbers. I also can't get the if-statement to work for some reason and I don't know how to actually write the numbers into the file.
From the shape of your input file, I suggest the following magic:
awk 'BEGIN{FS=OFS=";"}($NF>100){$NF+=2}1' input_file > output_file
This will remove the space just after the ; on the lines it changes. To put it back, use:
awk 'BEGIN{FS=OFS=";"}($NF>100){$NF=" "($NF+2)}1' input_file > output_file
If you want to ensure that malformed lines such as
foo;20
bar\; car;105
are all correctly converted into
foo; 20
bar\; car; 107
You have to do:
awk 'BEGIN{FS=OFS=";"}{$NF=" "($NF+($NF>100?2:0))}1' input_file > output_file
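To see what the last command does, it can be tried on a few lines fed through a pipe. This is a small sketch; the three sample lines are made up in the style of the bookmark file, and shifted is just an illustrative variable name.

```shell
# Made-up sample: one page below 100, one above, and one malformed line
sample='Goblin; 94
Half-Elves; 101
bar\; car;105'

# Same awk program as above: the last ;-separated field is the page number;
# add 2 when it is greater than 100 and always re-insert the leading space
shifted=$(printf '%s\n' "$sample" |
  awk 'BEGIN{FS=OFS=";"}{$NF=" "($NF+($NF>100?2:0))}1')

printf '%s\n' "$shifted"
```

Because only the last field is touched, the escaped \; inside a title is left alone.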
I want to get the longest line in /etc/sudoers.
For wc.
sudo wc -L /etc/sudoers
90 /etc/sudoers
For awk
sudo awk '{if(length>L){L=length;s=$0}}END{print NR,L,s}' /etc/sudoers
27 83 Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
awk treats the 8 columns of whitespace in the 27th line as just 1 character: 8-1=7 and 90-83=7.
So the number reported by wc is 7 larger than the one reported by awk.
How can I make awk count the whitespace in the 27th line as 8 columns instead of 1?
How can I fix the command sudo awk '{if(length>L){L=length;s=$0}}END{print NR,L,s}' /etc/sudoers so that it outputs the following result?
27 90 Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
This is probably happening because the file contains tabs.
With the `--max-line-length' option, `wc' prints the length of the
longest line per file, and if there is more than one file it prints
the maximum (not the sum) of those lengths. The line lengths here are
measured in screen columns, according to the current locale and
assuming tab positions in every 8th column.
So wc assumes each tab to be of width 8, and we have to emulate this in awk. The special case where the tab occurs only at the beginning of the line is easy to handle by replacing it with 8 spaces:
sudo awk '{sub("\t","        ") ;
if(length>L){L=length;s=$0}}END{print NR,L,s}' /etc/sudoers
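For tabs anywhere in the line, the width can be computed by advancing to the next multiple of 8 at each tab, which is how the wc documentation quoted above describes it. The following is a sketch under that assumption; longest_width is a hypothetical helper name, and it counts every other character as one column, so wide or multibyte characters may be measured differently than by wc.

```shell
# Hypothetical helper: print "line_number width text" for the widest line,
# treating a tab as a jump to the next tab stop (every 8th column)
longest_width() {
  awk '{
    col = 0
    n = length($0)
    for (i = 1; i <= n; i++) {
      if (substr($0, i, 1) == "\t")
        col += 8 - (col % 8)    # advance to the next multiple of 8
      else
        col++                   # any other character counts as one column
    }
    if (col > max) { max = col; line = NR; text = $0 }
  } END { print line, max, text }' "$@"
}

# Example: "Defaults" (8 columns), a tab (to column 16), then one character
printf 'Defaults\tx\nab\n' | longest_width
```

It can be called as sudo-free longest_width somefile, or fed from a pipe as in the example. Unlike the commands above, it prints the number of the longest line rather than the total line count.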
The longest line in /etc/sudoers contains a tab, as you can see from a hexdump of the file:
$ sudo cat /etc/sudoers | hexdump -C
...
...
00000110 73 73 0a 44 65 66 61 75 6c 74 73 09 73 65 63 75 |ss.Defaults.secu|
...
...
You can see 09 between the two s characters (hex 73). According to the ASCII table, 09 is a horizontal tab.
As @Dmitri Chubarov's answer explains, wc treats the length of '\t' as 8.
An alternative way to make awk output the same result as wc -L is:
$ sudo awk '{if(length>L){L=7*(match($0,/\t/)?1:0)+length;s=$0}}END{print NR,L,s}' /etc/sudoers
30 100 Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"
The expression match($0,/\t/)?1:0 evaluates to 1 if the line contains a \t, in which case 7 is added to the length.
I want to find top CPU usage pid, my script is here:
#!/bin/sh
ppid=`top -n 1 |sed -n 8p |awk '{print $1}'`
echo $ppid
ps aux|grep $ppid
but I get an error:
grep: Unmatched [ or [^
Why? How can I fix it?
OK! I found it! The problem is that top is including terminal control sequences in its output. So you don't actually see it in the echo, but I noticed because it had put my terminal session into bold output and later I was trying to figure out how it happened and I traced it back to my testing for this question. So, the [ that grep was complaining about is in the escape sequence that ppid gets set to. I got a work around by adding |tr -dc 0-9 after the awk, i.e.
ppid=`top -n 1 |sed -n 8p |awk '{print $1}'|tr -dc 0-9`
That will delete anything that's not a digit at the end. But the sed needs to be adjusted, too, I think. And, I suspect some of the digits may be from the escape sequence, so you need to come up with a cleaner way to excise the escape sequence.
But, in the final analysis this will be pretty useless. The highest user of the CPU every time I ran that pipeline was the top process that's part of it. In retrospect that's probably not surprising.
When all else fails, examine the input:
$ top -n 1 | awk 'NR==8 {print $1 ": " $2}' | hexdump -C
00000000 1b 28 42 1b 5b 6d 1b 5b 31 6d 31 38 37 31 35 3a |.(B.[m.[1m18715:|
00000010 20 6a 6b 6c 6f 77 64 65 6e 0a | jklowden.|
0000001a
(I shortened your command. Nearly every command that combines sed and awk can be better expressed with just awk. Then I added the second field, so we could see what's going on.)
The result is nondeterministic. top will highlight changed lines; to do so, it emits an ANSI escape sequence. If you capture one, you'll capture that sequence -- esc(Besc[mesc[1m -- which will look very weird indeed to ps. The brackets in that sequence doubtless provoked your error message.
To fix that, your top probably has a batch mode. In mine, top -n 1 -b does the trick.
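If batch mode is not available, the escape sequences can be filtered out before the value is used. This is only a sketch: strip_ansi is a hypothetical helper, and it removes just the two kinds of sequence visible in the hexdump above (ESC ( B and ESC [ ... letter), not every possible terminal control sequence.

```shell
# Build a literal ESC character in a portable way
esc=$(printf '\033')

# Hypothetical helper: drop "ESC ( B" and "ESC [ ... letter" sequences from stdin
strip_ansi() {
  sed -e "s/${esc}(B//g" -e "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# The bytes from the hexdump above: ESC(B ESC[m ESC[1m followed by the PID
clean=$(printf '\033(B\033[m\033[1m18715\n' | strip_ansi)
printf '%s\n' "$clean"
```

In the original pipeline this filter would sit between top and awk, but asking top for plain output in the first place remains the simpler fix.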
How to use ps to find a command line?
I'm afraid the best answer is RTFM. ps is one of those commands with a lot of variation across systems. My GNU version likes this:
$ ps -c -f -p $(top -n 1 -b | awk 'NR==8 {print $1}')
UID PID PPID CLS PRI STIME TTY TIME CMD
root 1300 1 TS 19 Jun30 ? 00:14:46 /usr/bin/python /usr/bin/la
Other observations:
Prefer $() to backticks
While this kind of thing is fine for learning, look for solutions that don't involve parsing output from interactive utilities. Under account in the manual, you'll find ways to capture much more information than just command line of what happens to be on top at the moment.
HTH.
This happens because your ppid variable ends up empty.
Change your command to retrieve the ppid value as below:
top -n 1 | sed -n 8p | awk -F " " '{print$2}'
Another possible cause is that the top process ends immediately.
This is on my Ubuntu 14.04 system.
I'm looking for the best way to use pipes when the input program doesn't write to stdout. Specifically I want to pipe objcopy into hexdump like this
objcopy -I ifmt -O binary ifile - | hexdump -C
but objcopy doesn't accept '-' as a file meaning 'write to stdout' as some programs do.
At the moment I am doing
objcopy -I ifmt -O binary ifile tmpfile; hexdump -C tmpfile; rm tmpfile
but was wondering if there was a better way.
I am using bash 4.1.10 on cygwin.
I wrote a comment prescribing a process substitution, but it can't work for objcopy since objcopy will try to open a seekable file (as it may need to move back and forth in the file).
In short: objcopy can't write to a stream as stdout, that's why its output must be a file that can be seeked. Your solution is very likely the only reasonable possibility.
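The temporary-file approach can be wrapped in a small function so that the cleanup is not forgotten. This is a minimal sketch: to_stdout is a hypothetical helper, and it assumes the command takes its output file as the last argument, as objcopy does in the question.

```shell
# Hypothetical wrapper: run a command that needs a seekable output file,
# then copy that file to stdout and remove it
to_stdout() {
  ts_tmp=$(mktemp) || return 1
  # "$@" is the command; the temp file path is appended as its last argument
  if "$@" "$ts_tmp"; then
    cat "$ts_tmp"
    ts_status=0
  else
    ts_status=$?
  fi
  rm -f "$ts_tmp"
  return "$ts_status"
}

# Intended use with the command from the question:
#   to_stdout objcopy -I ifmt -O binary ifile | hexdump -C
```

The data still goes through a real file, so the command can seek in it; only the bookkeeping is hidden.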
To answer your question
I'm looking for the best way to use pipes when the input program doesn't write to stdout
in a more general fashion (but that's not applicable with objcopy or any command that requires seeking the file), in Bash you can use process substitution: if mycommand takes a parameter that is an output file, and doesn't accept - for standard output and doesn't write to standard output by default, you can use it as:
mycommand >(cat)
or if you want to pipe it through, e.g., hexdump -C:
mycommand >(hexdump -C)
With this, mycommand will see an argument of the form /dev/fd/42 (where 42 may differ), and will be able to open it for writing as if it were a regular file (but not seekable), and hexdump will receive the written data on its standard input.
You can experiment with process substitution like so: call the following script mycommand:
#!/bin/bash
if [[ $1 ]]; then
echo "Hi, this is mycommand, and I was called with first argument: \`$1'"
echo "I'm outputting this to the file given as argument" > "$1"
else
echo >&2 "Please provide an argument (file to write to)"
exit 1
fi
This script makes sure that you give a non-empty argument (otherwise it shows an error message), prints this argument to standard output, and writes a short line to the file whose name is given as the argument.
Then chmod +x mycommand and play with it:
$ ./mycommand
Please provide an argument (file to write to)
$ ./mycommand -
Hi, this is mycommand, and I was called with first argument: `-'
$ ls
- mycommand
$ rm ./-
$ ./mycommand >(cat)
Hi, this is mycommand, and I was called with first argument: `/dev/fd/63'
I'm outputting this to the file given as argument
$ ./mycommand >(tr -d e)
Hi, this is mycommand, and I was called with first argument: `/dev/fd/63'
I'm outputting this to th fil givn as argumnt
$ ./mycommand >(hexdump -C)
Hi, this is mycommand, and I was called with first argument: `/dev/fd/63'
00000000 49 27 6d 20 6f 75 74 70 75 74 74 69 6e 67 20 74 |I'm outputting t|
00000010 68 69 73 20 74 6f 20 74 68 65 20 66 69 6c 65 20 |his to the file |
00000020 67 69 76 65 6e 20 61 73 20 61 72 67 75 6d 65 6e |given as argumen|
00000030 74 0a |t.|
00000032
$ ./mycommand >(cat) > /dev/null
I'm outputting this to the file given as argument
Actually, there is a way if you use zsh:
(){objcopy -O binary -j .text <input file> $1; cat $1} =(touch -c .) | hexdump -C
Explanation
=() is zsh process substitution backed by a temporary file, which supports the seeking that objcopy needs.
$1 inside the function body refers to the file created by =(touch -c .).
A note on the command touch -c .:
-c is --no-create, so the command itself does nothing. We are simply using it for the side effect of =(), which is a temporary file that zsh creates containing the output of touch -c . (which is nothing) and deletes once the command is finished.
(){} is anonymous function syntax which keeps the file created by =() inside the function body around long enough to be cat'ed to stdout.