I'm having trouble finding a way to isolate the PIDs from top using pipelines, without being able to use awk or perl. So far I can filter the output down to the users I want (anyone except my own username and root), but I'm not sure how to move on from here. I've tried cut and several other options, but nothing works. Here's my work so far:
top -n 1 | tail -n +8 | grep -Ev '\broot\b | \bmyUserName\b'
This outputs all the information minus the heading, and I need to remove everything else but the PIDs... Could anyone help at all?
EDIT: Also, right now what seems to work is just adding | cut -c 4-11, which shows only the PID, because there is only one other user that is not root on the system. I'm not sure it will work if there are more. Are there any better ideas for how to make it work?
In theory:
top -n 1 | tail -n +8 | grep -Ev ' root | myUserName ' |
sed -e 's/^[ ]*\([0-9][0-9]*\) .*/\1/'
The sed command looks for start of line, optional blanks followed by a number and a blank and trailing garbage. However, this doesn't work because top generates screen control characters:
28433 jleffler 20 0 1511m 403m 31m S 2 1.3 70:35.76 chrome
looks OK, but when pushed through a hex dump, the output is:
0x0000: 1B 28 42 1B 5B 6D 32 38 34 33 33 20 6A 6C 65 66 .(B.[m28433 jlef
0x0010: 66 6C 65 72 20 20 32 30 20 20 20 30 20 31 35 31 fler 20 0 151
0x0020: 30 6D 20 34 30 34 6D 20 20 33 31 6D 20 53 20 20 0m 404m 31m S
0x0030: 20 20 34 20 20 31 2E 33 20 20 37 30 3A 33 37 2E 4 1.3 70:37.
0x0040: 38 37 20 63 68 72 6F 6D 65 20 20 20 20 20 20 20 87 chrome
To suppress that, use top -b (batch mode):
top -b -n 1 | tail -n +8 | grep -Ev ' root | myUserName ' |
sed -e 's/^[ ]*\([0-9][0-9]*\) .*/\1/'
This should generate a list of PIDs; it did for me.
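To see what the grep and sed stages do without depending on a live top, here is a minimal sketch that runs the same filter over a few canned lines. The sample rows are made up for illustration and only imitate the layout of top -b process rows.

```shell
# Made-up rows imitating `top -b` process lines (not real output).
sample='28433 jleffler  20   0 1511m 403m  31m S    2  1.3  70:35.76 chrome
    1 root      20   0 24412 2304 1340 S    0  0.0   0:01.21 init
  512 myUserName 20   0 10000 1000  900 S    0  0.0   0:00.10 bash'

# Drop rows owned by root or myUserName, then keep only the leading number.
pids=$(printf '%s\n' "$sample" |
    grep -Ev ' root | myUserName ' |
    sed -e 's/^[ ]*\([0-9][0-9]*\) .*/\1/')

echo "$pids"
```

Only the jleffler row survives the grep, so the sed stage has a single PID left to print.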
If you were allowed awk, you might simplify that to:
top -b -n 1 | awk 'NR<=7 || $2~/^(root|myUserName)$/ {next} {print $1}'
And all this is predicated on 'Using top is a good way to go', rather than using ps (which is the normal tool to use for gathering PIDs).
If you want the PIDs of all processes not spawned by root or some_user, then you could list these processes using ps with -U user and the negation option -N:
ps -U root -U some_user -N -o pid
The -o option specifies that we're only interested in the PID in the output.
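Now you can easily do something with these PIDs in a loop or similar (note pid= rather than pid, which suppresses the PID header line so that it is not treated as a process ID):
for pid in $(ps -U root -U some_user -N -o pid=); do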
# something to $pid
done
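The -N negation may not be available in every ps implementation. Where it is missing, a minimal sketch of the same idea is to ask ps for PID and owner and filter in the shell. The helper name filter_pids is hypothetical, and the canned input below stands in for the output of ps -e -o pid= -o user=.

```shell
# Hypothetical helper: reads "PID USER" pairs on stdin and prints the PIDs
# whose owner is neither root nor some_user.
filter_pids() {
    while read -r pid user; do
        case $user in
            root|some_user) ;;            # skip excluded owners
            *) printf '%s\n' "$pid" ;;    # keep everything else
        esac
    done
}

# Canned input standing in for: ps -e -o pid= -o user=
kept=$(printf '    1 root\n   42 alice\n   77 some_user\n  101 bob\n' | filter_pids)
echo "$kept"
```

In real use the pipeline would be ps -e -o pid= -o user= | filter_pids.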
Grep doesn't seem to match certain strings from man output. It seems to be random in that I can't work out any rhyme or reason as to whether a string will match or not.
man sed | head -7:
SED(1) BSD General Commands Manual SED(1)
NAME
sed -- stream editor
SYNOPSIS
$ man sed | head -7 | grep sed # no match
$ man sed | head -7 | grep stream # match on "stream"
sed -- stream editor
$ man sed | head -7 | grep '\-\-' # match on "--"
sed -- stream editor
$ man sed | head -7 | grep NAME # no match
$ man sed | head -7 | grep SYNOPSIS # no match
This also happens when redirecting the output to a file and grepping that
$ man sed | head -7 > /tmp/sed.man
$ cat /tmp/sed.man | grep sed # no match
$ cat /tmp/sed.man | grep stream # match on "stream"
sed -- stream editor
$ grep sed /tmp/sed.man # no match
$ grep stream /tmp/sed.man # match on "stream"
sed -- stream editor
grep: grep (BSD grep) 2.5.1-FreeBSD
man: version 1.6c
macOS: 10.14.6 Beta
bash: GNU bash, version 5.0.7(1)-release (x86_64-apple-darwin18.5.0)
$ man sed | head -7 | hexdump -C
00000000 0a 53 45 44 28 31 29 20 20 20 20 20 20 20 20 20 |.SED(1) |
00000010 20 20 20 20 20 20 20 20 20 20 20 42 53 44 20 47 | BSD G|
00000020 65 6e 65 72 61 6c 20 43 6f 6d 6d 61 6e 64 73 20 |eneral Commands |
00000030 4d 61 6e 75 61 6c 20 20 20 20 20 20 20 20 20 20 |Manual |
00000040 20 20 20 20 20 20 20 20 20 53 45 44 28 31 29 0a | SED(1).|
00000050 0a 4e 08 4e 41 08 41 4d 08 4d 45 08 45 0a 20 20 |.N.NA.AM.ME.E. |
00000060 20 20 20 73 08 73 65 08 65 64 08 64 20 2d 2d 20 | s.se.ed.d -- |
00000070 73 74 72 65 61 6d 20 65 64 69 74 6f 72 0a 0a 53 |stream editor..S|
00000080 08 53 59 08 59 4e 08 4e 4f 08 4f 50 08 50 53 08 |.SY.YN.NO.OP.PS.|
00000090 53 49 08 49 53 08 53 0a |SI.IS.S.|
00000098
Googling is hard for this problem as any combination of "man" or "grep" doesn't mention my problem that strings (with no special characters) are not matching.
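Man pages are written in the roff format (https://man.openbsd.org/roff), and the formatted output renders bold text by overstriking: each character is printed, followed by a backspace (^H, hex 08) and the same character again. That is why the plain string never appears in the byte stream. Do the following: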
man sed > sed.man
vi sed.man
so you see:
SED(1) BSD General Commands Manual SED(1)
N^HNA^HAM^HME^HE
s^Hse^Hed^Hd -- stream editor
To convert a man page to text without the ^H sequences, have a look at http://www.schweikhardt.net/man_page_howto.html#q10
Create a Perl script called strip-headers with the following content:
#!/usr/bin/perl -wn
# make it slurp the whole file at once:
undef $/;
# delete first header:
s/^\n*.*\n+//;
# delete last footer:
s/\n+.*\n+$/\n/g;
# delete page breaks:
s/\n\n+[^ \t].*\n\n+(\S+).*\1\n\n+/\n/g;
# collapse two or more blank lines into a single one:
s/\n{3,}/\n\n/g;
# see what is left...
print;
Make the Perl script executable with chmod 750 strip-headers and run it with:
man sed | ./strip-headers | col -bx > sed.man
or
man sed | ./strip-headers | col -bx | head -7 | grep sed
macOS man doesn't support the --ascii flag, so I used col -bx to strip the annoying formatting from man for piping into other commands.
man sed | col -bx | grep SYNOPSIS
col -b: Do not output any backspaces, printing only the last character written to each column position.
col -x: Output multiple spaces instead of tabs.
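The overstriking and its removal can be reproduced without man at all. The sketch below builds the same byte pattern the hexdump shows for NAME, confirms that grep does not find the plain word, and then strips each character-plus-backspace pair with sed. The sed variant is my own alternative for systems where col is not installed; col -b remains the simpler choice.

```shell
# A literal backspace character (hex 08).
bs=$(printf '\b')

# "NAME" in overstruck bold, as in the hexdump: N ^H N A ^H A M ^H M E ^H E
overstruck=$(printf 'N\bNA\bAM\bME\bE')

# grep -c prints the number of matching lines; "|| true" absorbs the
# non-zero exit status grep returns when nothing matches.
hits=$(printf '%s\n' "$overstruck" | grep -c 'NAME' || true)

# Remove every "any character followed by a backspace" pair.
plain=$(printf '%s\n' "$overstruck" | sed "s/.$bs//g")

echo "matches before stripping: $hits"
echo "after stripping: $plain"
```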
Notes:
I've read that man is meant to detect whether you're piping to another command or into a file, etc., but that was not my experience, at least for man 1.6c, the default for macOS.
Solution using col: https://unix.stackexchange.com/a/15866
Thanks @Cyrus - I didn't know about hexdump
Thanks @Oliver Gaida - I didn't know cat and vi would display it differently
I am trying to extract lines from file genome.gff that contain a line from file suspicious.txt. suspicious.txt was derived from genome.gff and every line should match.
Using grep on a single line from suspicious.txt works as expected:
grep 'gene10002' genome.gff
NC_007082.3 Gnomon gene 1269632 1273520 . + . ID=gene10002;Dbxref=BEEBASE:GB54789,GeneID:409846;Name=bur;gbkey=Gene;gene=bur;gene_biotype=protein_coding
NC_007082.3 Gnomon mRNA 1269632 1273520 . + . ID=rna21310;Parent=gene10002;Dbxref=GeneID:409846,Genbank:XM_393336.5,BEEBASE:GB54789;Name=XM_393336.5;gbkey=mRNA;gene=bur;product=burgundy;transcript_id=XM_393336.5
But every variation on using grep from a file that I've been able to think of or find online produces no output or an empty file:
grep -f suspicious.txt genome.gff
grep -F -f suspicious.txt genome.gff
while read line; do grep "$line" genome.gff; done<suspicious.txt
while read line; do grep '$line' genome.gff; done<suspicious.txt
while read line; do grep "${line}" genome.gff; done<suspicious.txt
cat suspicious.txt | while read line; do grep '$line' genome.gff; done
cat suspicious.txt | while read line; do grep '$line' genome.gff >> suspicious.gff; done
cat suspicious.txt | while read line; do grep -e "${line}" genome.gff >> suspicious.gff; done
cat "$(cat suspicious_bee_geneIDs_test.txt)" | while read line; do grep -e "${line}" genome.gff >> suspicious.gff; done
Running it as a script also produces an empty file:
#!/bin/bash
SUSP=$1
GFF=$2
while read -r line; do
grep -e "${line}" $GFF >> suspicious_bee_genes.gff
done<$SUSP
This is what the files look like:
head genome.gff
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build Amel_4.5
#!genome-build-accession NCBI_Assembly:GCF_000002195.4
##sequence-region NC_007070.3 1 29893408
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=7460
NC_007070.3 RefSeq region 1 29893408 . + . ID=id0;Dbxref=taxon:7460;Name=LG1;gbkey=Src;genome=chromosome;linkage- group=LG1;mol_type=genomic DNA;strain=DH4
NC_007070.3 Gnomon gene 181 211962 . - . ID=gene0;Dbxref=BEEBASE:GB42164,GeneID:726912;Name=cort;gbkey=Gene;gene=cort;gene_biotype=protein_coding
NC_007070.3 Gnomon mRNA 181 71559 . - . ID=rna0;Parent=gene0;Dbxref=GeneID:726912,Genbank:XM_006557348.1,BEEBASE:GB42164;Name=XM_006557348.1;gbkey=mRNA;gene=cort;product=cortex%2C transcript variant X2;transcript_id=XM_006557348.1
wc -l genome.gff
457742
head suspicious.txt
gene10002
gene1001
gene1003
gene10038
gene10048
gene10088
gene10132
gene10134
gene10181
gene10209
wc -l suspicious.txt
928
Does anyone know what's going wrong here?
This can happen when the input file is in DOS format: each line will have a trailing CR character at the end, which will break the matching.
One way to check if this is the case is using hexdump, for example (just the first few lines):
$ hexdump -C suspicious.txt
00000000 67 65 6e 65 31 30 30 30 32 0d 0a 67 65 6e 65 31 |gene10002..gene1|
00000010 30 30 31 0d 0a 67 65 6e 65 31 30 30 33 0d 0a 67 |001..gene1003..g|
00000020 65 6e 65 31 30 30 33 38 0d 0a 67 65 6e 65 31 30 |ene10038..gene10|
In the ASCII representation at the right, notice the .. after each gene. These dots correspond to 0d and 0a. The 0d is the CR character.
Without the CR character, the output should look like this:
$ hexdump -C <(tr -d '\r' < suspicious.txt)
00000000 67 65 6e 65 31 30 30 30 32 0a 67 65 6e 65 31 30 |gene10002.gene10|
00000010 30 31 0a 67 65 6e 65 31 30 30 33 0a 67 65 6e 65 |01.gene1003.gene|
00000020 31 30 30 33 38 0a 67 65 6e 65 31 30 30 34 38 0a |10038.gene10048.|
Just one . after each gene, corresponding to 0a, and no 0d.
Another way to see the DOS line endings is in the vi editor. If you open the file with vi, the status line will show [dos], or you can run the ex command :set ff? to make it tell you the file format (the status line will say fileformat=dos).
You can remove the CR characters on the fly like this:
grep -f <(tr -d '\r' < suspicious.txt) genome.gff
Or you could remove them in vi by running the ex command :set ff=unix and then saving the file. There are other command line tools that can remove DOS line endings too.
Another possibility is that instead of a trailing CR character, you might have trailing whitespace. The output of hexdump -C should make that perfectly clear. After the trailing whitespace characters are removed, the grep -f should work as expected.
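The failure and the fix can be reproduced on two tiny files. This is a minimal sketch with made-up file contents. It writes the cleaned pattern list to a temporary file instead of using process substitution, so it does not depend on bash.

```shell
tmp=$(mktemp -d)

# Pattern file with DOS (CRLF) line endings, and a two-line stand-in for the GFF.
printf 'gene10002\r\ngene1001\r\n' > "$tmp/suspicious.txt"
printf 'ID=gene10002;Name=bur\nID=gene5;Name=other\n' > "$tmp/genome.gff"

# With the CR still attached, each pattern is "geneNNN<CR>", which matches nothing.
before=$(grep -c -F -f "$tmp/suspicious.txt" "$tmp/genome.gff" || true)

# Strip the CRs, then match again.
tr -d '\r' < "$tmp/suspicious.txt" > "$tmp/clean.txt"
after=$(grep -c -F -f "$tmp/clean.txt" "$tmp/genome.gff" || true)

echo "matching lines before: $before, after: $after"
rm -rf "$tmp"
```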
I'm issuing a query to MySQL and it's returning, say, 1,000 rows, but each iteration of the program could return a different number of rows. I need to break up this result set (without using a MySQL LIMIT) into chunks of 100 rows that I can then programmatically iterate through.
So
MySQLOutPut="1 2 3 4 ... 10,000"
I need to turn that into an array that looks like
array[1]="1 2 3 ... 100"
array[2]="101 102 103 ... 200"
etc.
I have no clue how to accomplish this elegantly
Using Charles' data generation:
MySQLOutput=$(seq 1 10000 | tr '\n' ' ')
# the sed command will add a newline after every 100 words
# and the mapfile command will read the lines into an array
mapfile -t MySQLOutSplit < <(
sed -r 's/([^[:blank:]]+ ){100}/&\n/g; $s/\n$//' <<< "$MySQLOutput"
)
echo "${#MySQLOutSplit[@]}"
# 100
echo "${MySQLOutSplit[0]}"
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
echo "${MySQLOutSplit[99]}"
# 9901 9902 9903 9904 9905 9906 9907 9908 9909 9910 9911 9912 9913 9914 9915 9916 9917 9918 9919 9920 9921 9922 9923 9924 9925 9926 9927 9928 9929 9930 9931 9932 9933 9934 9935 9936 9937 9938 9939 9940 9941 9942 9943 9944 9945 9946 9947 9948 9949 9950 9951 9952 9953 9954 9955 9956 9957 9958 9959 9960 9961 9962 9963 9964 9965 9966 9967 9968 9969 9970 9971 9972 9973 9974 9975 9976 9977 9978 9979 9980 9981 9982 9983 9984 9985 9986 9987 9988 9989 9990 9991 9992 9993 9994 9995 9996 9997 9998 9999 10000
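If the chunks only need to be processed one after another and the array itself is not required, a different technique is to let xargs -n 100 regroup the words into lines of 100 and read those lines in a loop. This is a minimal sketch of that alternative, not the array approach itself. It uses 1,000 generated values as stand-in data and assumes the values contain no whitespace or quote characters.

```shell
# Stand-in data: 1000 space-separated values.
MySQLOutput=$(seq 1 1000 | tr '\n' ' ')

n=0
last=
# $MySQLOutput is deliberately unquoted so the shell splits it into words;
# xargs -n 100 then echoes them back 100 per line.
while read -r chunk; do
    n=$((n + 1))
    last=$chunk        # "process" the chunk; here we only remember the last one
done <<EOF
$(printf '%s\n' $MySQLOutput | xargs -n 100)
EOF

echo "chunks: $n"
echo "last chunk runs from ${last%% *} to ${last##* }"
```

A final chunk shorter than 100 items is handled automatically, since xargs simply emits whatever is left.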
Something like this:
# generate content
MySQLOutput=$(seq 1 10000 | tr '\n' ' ') # seq is awful, don't use in real life
# split into a large array, each item stored individually
read -r -a MySQLoutArr <<<"$MySQLOutput"
# add each batch of 100 items into a new array entry
batchSize=100
MySQLoutSplit=( )
for ((i=0; i<${#MySQLoutArr[@]}; i+=batchSize)); do
MySQLoutSplit+=( "${MySQLoutArr[*]:i:batchSize}" )
done
To explain some of the finer points:
read -r -a foo reads contents into an array named foo, split on IFS, up to the next character specified by read -d (none given here, thus reading only a single line). If you wanted each line to be a new array entry, consider IFS=$'\n' read -r -d '' -a foo, which will read each line into an array, terminated at the first NUL in the input stream.
"${foo[*]:i:batchSize}" expands to a list of items in array foo, starting at index i, and taking the next batchSize items, concatenated into a single string with the first character in $IFS used as a separator.
Hello I am having a problem with checking two variables to see whether or not they are both equal. I have the following script:
Output=$(sudo defaults read /System/Library/User\ Template/English.lproj/Library/Preferences/com.apple.SetupAssistant | grep -o "DidSeeCloudSetup = 1")
Output2=$(sudo defaults read /System/Library/User\ Template/English.lproj/Library/Preferences/com.apple.SetupAssistant | grep -o "LastSeenCloudProductVersion")
Check="DidSeeCloudSetup = 1"
Check2="LastSeenCloudProductVersion"
echo "$Output"
echo "$Check"
if [ "$Output" = "$Check" ]
then
echo "OK"
else
echo "FALSE"
fi
Even though they both contain the same thing it always comes out false... any ideas why?
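There is a stray control character (hex 10, DLE) between $ and Check in your if clause, so the right-hand side is not an expansion of $Check at all but a literal string containing that character: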
00000000 69 66 20 5b 20 22 24 4f 75 74 70 75 74 22 20 3d |if [ "$Output" =|
00000010 20 22 24 10 43 68 65 63 6b 22 20 5d 0a | "$.Check" ].|
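Invisible characters like this can be made visible with cat -v, which prints control characters in caret notation (hex 10 shows up as ^P). The sketch below recreates the offending line in a temporary file, which is an assumption about how the character sits in the script, then displays it and removes the character with tr.

```shell
tmp=$(mktemp)

# Recreate the if line with a DLE (octal 020, hex 10) between $ and Check.
printf 'if [ "$Output" = "$\020Check" ]\n' > "$tmp"

# cat -v renders the control character visibly as ^P.
visible=$(cat -v "$tmp")

# tr -d deletes it, leaving the line as it was meant to be typed.
cleaned=$(tr -d '\020' < "$tmp")

echo "$visible"
echo "$cleaned"
rm -f "$tmp"
```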
I'm trying to get a list of all the directories I'm currently running a program in, so I can keep track of the numerous jobs I have running at the moment.
When I run the commands individually, they all seem to work, but when I chain them together, something goes wrong... (ll is just the regular ls -l alias)
for pid in `top -n 1 -u will | grep -iP "(programs|to|match)" | awk '{print $1}'`;
do
ll /proc/$pid/fd | head -n 2 | tail -n 1;
done
Why is it that when I have ll /proc/31353/fd inside the for loop, it cannot access the file, but when I use it normally it works fine?
And piped through hexdump -C:
$ top -n 1 -u will |
grep -iP "(scatci|congen|denprop|swmol3|sword|swedmos|swtrmo)" |
awk '{print $1}' | hexdump -C
00000000 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d 32 31 33 35 |.(B.[m.(B.[m2135|
00000010 33 0a 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d 32 39 |3..(B.[m.(B.[m29|
00000020 33 33 31 0a 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d |331..(B.[m.(B.[m|
00000030 33 30 39 39 36 0a 1b 28 42 1b 5b 6d 1b 28 42 1b |30996..(B.[m.(B.|
00000040 5b 6d 32 36 37 31 38 0a |[m26718.|
00000048
chepner had the right hunch. The output of top is designed for humans, not for parsing. The hexdump shows that top is emitting terminal escape sequences. These escape sequences end up in the first field of each line, so the resulting file name is something like /proc/\e(B\e[m\e(B\e[m21353/fd instead of /proc/21353/fd, where \e is the escape character.
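To make the effect concrete, this sketch rebuilds the first field exactly as the hexdump shows it and compares it with the bare PID. The sed clean-up only removes the two sequences seen in that dump and is there purely for illustration, not as a way to make top parseable.

```shell
esc=$(printf '\033')

# First field as emitted by top: two escape-sequence pairs, then the PID.
field=$(printf '\033(B\033[m\033(B\033[m21353')

# 12 bytes of escape sequences plus 5 digits.
echo "field length: ${#field}"

# Strip the two sequences that appear in the dump: ESC ( B and ESC [ m.
clean=$(printf '%s\n' "$field" | sed "s/$esc(B//g; s/$esc\[m//g")
echo "clean PID: $clean"
```

The path built from $field contains all 17 characters, which is why the kernel reports that no such entry exists under /proc.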
Use ps, pgrep or pidof instead. Under Linux, you can use the -C option to ps to match an exact program name (repeat the option to allow multiple names). Use the -o option to control the display format.
for pid in $(ps -o pid= -C scatci -C congen -C denprop -C swmol3 -C sword -C swedmos -C swtrmo); do
ls -l /proc/$pid/fd | head -n 2 | tail -n 1
done
If you want to sort by decreasing CPU usage:
for pid in $(ps -o %cpu= -o pid= \
-C scatci -C congen -C denprop -C swmol3 -C sword -C swedmos -C swtrmo |
sort -k 1gr |
awk '{print $2}'); do
ls -l /proc/$pid/fd | head -n 2 | tail -n 1
done
Additionally, use dollar-parenthesis instead of backticks for command substitution: quotes inside backticks behave somewhat bizarrely, and it's easy to make a mistake there. Quoting inside dollar-parenthesis is intuitive.
Try to use "cut" instead of "awk", something like this:
for pid in `top -n 1 -u will | grep -iP "(scatci|congen|denprop|swmol3|sword|swedmos|swtrmo)" | sed 's/  */ /g' | cut -d ' ' -f2`; do echo /proc/$pid/fd | head -n 2 | tail -n 1; done