I am having problems removing the last character. I have this command:
df -h | awk 'NR>1 {print $1, $2, $3, $4}'
which yields:
directoryname1 40K 0 9.0G
directoryname2 90K 5.0M 78G
directorynamen 0 62M 70G
I found out that I can remove the unit letters (K, M, G) by appending +0.
This command:
df -h | awk 'NR>1 {print $3+0}'
yields:
0
5.0
62
How do I get the same output as my first command, but without the K, M, G suffixes?
You have already answered your own question:
df -h | awk 'NR>1 {print $1, $2+0, $3+0, $4+0}'
It works because awk coerces a string to a number when it is used in arithmetic, silently dropping the trailing unit letter. For fields that do not start with a number, strip the last character with substr instead:
df -h | awk 'NR>1 {print $1, substr($2,1,length($2)-1),
substr($3,1,length($3)-1),
substr($4,1,length($4)-1)}'
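With the sample from the question, either variant should print roughly:
directoryname1 40 0 9.0
directoryname2 90 5.0 78
directorynamen 0 62 70
(the +0 variant may drop a trailing .0, since awk prints the numeric value rather than the original string).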
I'm with @AdrianFrühwirth: why would you want to do that? Or rather, what are you trying to do?
If you're trying to get all the df numbers in the same unit and without the unit in the output, simply use df without -h (by default it reports 1K blocks; use -B1 for bytes). If you want KiB, MiB, GiB or some other measure, you can use -BK, -BM or -BG, respectively, or see the SIZE format information in man df.
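For example, a sketch of both (assuming GNU df; the awk part just keeps the same columns as in the question):
df -B1 | awk 'NR>1 {print $1, $2, $3, $4}'
df -BM | awk 'NR>1 {print $1, $2+0, $3+0, $4+0}'
The first prints everything in bytes with no suffix; the second prints everything in MiB and uses the +0 trick to drop the trailing M.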
I have this situation.
In my script, I have to use the hdparm command on a specific partition and extract the calculated MB/s values.
I'm able to achieve this thanks to grep and a regex; so, if with
sudo hdparm -tT /dev/xvda1
the output is:
/dev/xvda1:
Timing cached reads: 12596 MB in 1.99 seconds = 6320.55 MB/sec
Timing buffered disk reads: 594 MB in 3.01 seconds = 197.12 MB/sec
with
sudo hdparm -tT /dev/xvda1 | grep -Po '.* \K[0-9.]+'
the results are:
6320.55
197.12
Now, the next request is to print the data in a different way.
The desired output is:
/dev/xvda1: 6320.55 MB/sec, 197.12 MB/sec
But I don't know how to obtain this; in summary, what is requested is to print the partition and, on a single line, the extracted MB/s values.
Seems like your last question was an XY problem.
If you want to append MB/sec anyway, there is no need to remove it in the first place. Extracting 6320.55 MB/sec would have been a lot easier than extracting just 6320.55.
Anyway, an awk script is probably the best solution here:
awk -F' = ' '{a[NR]=$NF} END {printf "%s %s, %s\n", a[1], a[2], a[3]}'
If you don't need exactly that format, the script can be simplified to:
awk -F' = ' '{printf "%s ", $NF}'
which prints /dev/xvda1: 6320.55 MB/sec 197.12 MB/sec (with a trailing space and no final newline).
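If the exact formatting does not matter, a grep-only sketch (keeping the unit this time; assumes GNU grep for -P) would be:
sudo hdparm -tT /dev/xvda1 | grep -Po '= \K.*'
which prints 6320.55 MB/sec and 197.12 MB/sec on separate lines, but without the /dev/xvda1: prefix.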
I have a file like this,
Filesystem State 1024-blocks Used Avail Capacity Mounted on
$ZPMON.DELETEMESTARTED 71686344 58788360 12897984 82% /deleteme
In this file I want to read the 1st column and the 5th column without using the grep command.
I tried this command, but instead of the 5th column it shows the 6th column's output:
df -k DELETEME | awk '{print $1 $5 }'
FilesystemAvail
$ZPMON.DELETEMESTARTED82%.
The expected output is:
Avail
12897984
With a single GNU df command:
df -k --output=avail DELETEME
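If your df lacks --output, an awk sketch that counts fields from the end of the data line (assuming the line always ends with Avail, Capacity and the mount point, so the merged filesystem/state field does not matter) would be:
df -k DELETEME | awk 'NR>1 {print $(NF-2)}'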
Is there a shell command that simply converts back and forth between a number string in bytes and the "human-readable" number string offered by some commands via the -h option?
To clarify the question: ls -l without the -h option (some output suppressed)
> ls -l
163564736 file1.bin
13209 file2.bin
gives the size in bytes, while with the -h option (some output suppressed)
> ls -lh
156M file1.bin
13K file2.bin
the size is human-readable, in kilobytes and megabytes.
Is there a shell command that simply turns 163564736 into 156M and 13209 into 13K, and also does the reverse?
Use numfmt (part of GNU coreutils).
To:
echo "163564736" | numfmt --to=iec
From:
echo "156M" | numfmt --from=iec
There is no standard (cross-platform) tool to do it, but a solution using awk is described here.
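A minimal awk sketch of the bytes-to-human direction (my own rough version, not necessarily the linked one; IEC units, rounded to whole numbers):
echo 163564736 | awk '{
  split("B K M G T P", u, " ")
  i = 1
  while ($1 >= 1024 && i < 6) { $1 /= 1024; i++ }
  printf "%.0f%s\n", $1, u[i]
}'
This prints 156M for the value above and 13K for 13209.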
grep -i -A 5 -B 5 'db_pd.Clients' eightygigsfile.sql
This has been running for an hour on a fairly powerful Linux server that is otherwise not overloaded.
Is there any alternative to grep? Is there anything about my syntax that can be improved (are egrep or fgrep better)?
The file is actually in a directory that is shared via a mount with another server, but the actual disk space is local, so that shouldn't make any difference, should it?
The grep is using up to 93% CPU.
Here are a few options:
1) Prefix your grep command with LC_ALL=C to use the C locale instead of UTF-8.
2) Use fgrep because you're searching for a fixed string, not a regular expression.
3) Remove the -i option, if you don't need it.
So your command becomes:
LC_ALL=C fgrep -A 5 -B 5 'db_pd.Clients' eightygigsfile.sql
It will also be faster if you copy your file to a RAM disk.
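A sketch of the RAM-disk idea (only viable if the machine has enough free memory to hold the whole file; /dev/shm is a common tmpfs mount on Linux):
cp eightygigsfile.sql /dev/shm/
LC_ALL=C fgrep -A 5 -B 5 'db_pd.Clients' /dev/shm/eightygigsfile.sql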
If you have a multicore CPU, I would really recommend GNU parallel. To grep a big file in parallel use:
< eightygigsfile.sql parallel --pipe grep -i -C 5 'db_pd.Clients'
Depending on your disks and CPUs it may be faster to read larger blocks:
< eightygigsfile.sql parallel --pipe --block 10M grep -i -C 5 'db_pd.Clients'
It's not entirely clear from your question, but other options for grep include:
Dropping the -i flag.
Using the -F flag for a fixed string.
Disabling NLS with LANG=C.
Setting a maximum number of matches with the -m flag.
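A sketch combining all four (assuming the pattern really is a fixed string, exact-case matches are acceptable, and the limit of 100 matches is an arbitrary example value):
LANG=C grep -F -m 100 -A 5 -B 5 'db_pd.Clients' eightygigsfile.sql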
Some trivial improvements:
Remove the -i option if you can; case-insensitive matching is quite slow.
Replace the . with \.
A bare dot is the regex metacharacter for "any character", which is also slow to match.
Two lines of attack:
Are you sure you need the -i, or is there a way to get rid of it?
Do you have more cores to play with? grep is single-threaded, so you might want to start several instances at different offsets.
< eightygigsfile.sql parallel -k -j120% -n10 -m grep -F -i -C 5 'db_pd.Clients'
If you need to search for multiple strings, grep -f strings.txt saves a ton of time. The above is a translation of something that I am currently testing; the -j and -n option values seemed to work best for my use case. The -F also made a big difference.
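A sketch of the multi-pattern variant (strings.txt is a hypothetical file containing one fixed string per line):
grep -F -f strings.txt -C 5 eightygigsfile.sql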
Try ripgrep
It usually provides much better performance than grep.
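An equivalent invocation would look something like this (rg takes -F for fixed strings and -C for context, much like grep):
rg -F -i -C 5 'db_pd.Clients' eightygigsfile.sql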
All the above answers were great. What really helped me on my 111 GB file was using LC_ALL=C fgrep -m <maxnum> fixed_string filename.
However, sometimes a pattern may repeat zero or more times, in which case the maxnum cannot be calculated. The workaround is to use the start and end patterns for the event(s) you are trying to process, and then work on the line numbers between them. Like so:
startline=$(grep -n -m 1 "$start_pattern" file | awk -F: '{print $1}')
endline=$(grep -n -m 1 "$end_pattern" file | awk -F: '{print $1}')
logs=$(tail -n "+$startline" file | head -n $((endline - startline + 1)))
Then work on this subset of logs!
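A sed-based sketch of the same slice, assuming startline and endline have already been computed as above:
logs=$(sed -n "${startline},${endline}p" file)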
Hmm, what speeds do you need? I created a synthetic 77.6 GB file with nearly 525 million rows and plenty of Unicode:
rows = 524759550. | UTF8 chars = 54008311367. | bytes = 83332269969.
and randomly selected rows at an average rate of 1 in every 3^5 (using rand(), not just NR % 243) to place the string db_pd.Clients at a random position in the middle of the existing text, totaling 2.16 million rows where the regex pattern hits:
rows = 2160088. | UTF8 chars = 42286394. | bytes = 42286394.
% dtp; pvE0 < testfile_gigantic_001.txt|
mawk2 '
_^(_<_)<NF { print (__=NR-(_+=(_^=_<_)+(++_)))<!_\
?_~_:__,++__+_+_ }' FS='db_pd[.]Clients' OFS=','
in0: 77.6GiB 0:00:59 [1.31GiB/s] [1.31GiB/s] [===>] 100%
out9: 40.3MiB 0:00:59 [ 699KiB/s] [ 699KiB/s] [ <=> ]
524755459,524755470
524756132,524756143
524756326,524756337
524756548,524756559
524756782,524756793
524756998,524757009
524757361,524757372
And mawk2 took just 59 seconds to extract a list of the row ranges it needs. From there it should be relatively trivial; some ranges may overlap.
At throughput rates of 1.3 GiB/s, as calculated above by pv, it might even be detrimental to use utilities like parallel to split the task.
I would like to know how much physical memory is available on the system, excluding any swap. Is there a method to get this information in Ruby?
If you are using Linux, you can use the free command to get physical memory (RAM) details on the system:
output = %x(free)
output will look something like the following string:
" total used free shared buffers cached\nMem: 251308 201500 49808 0 3456 48508\n-/+ buffers/cache: 149536 101772\nSwap: 524284 88612 435672\n"
You can extract the information you need using simple string manipulation, like:
output.split(" ")[7] will give total memory
output.split(" ")[8] will give used memory
output.split(" ")[9] will give free memory
Slightly slicker version of AndrewKS's answer:
total_memory_usage_in_k = `ps -Ao rss=`.split.map(&:to_i).inject(&:+)
Well, the Unix command "top" doesn't seem to work in Ruby, so try this:
# In Kilobytes
memory_usages = `ps -A -o rss=`.split("\n")
total_mem_usage = memory_usages.inject { |a, e| a.to_i + e.strip.to_i }
This "seems" correct. I don't guarantee it. Also, this takes a lot more time than the system will so by the time it's finished the physical memory would have changed.
You can use this gem to get various system information: http://threez.github.com/ruby-vmstat/
This answers for both Ruby and Bash, and probably also for Python and the rest:
#!/usr/bin/env bash
# This file is in public domain.
# Initial author: martin.vahi#softf1.com
#export S_MEMINFO_FIELD="Inactive"; \
export S_MEMINFO_FIELD="MemTotal"; \
ruby -e "s=%x(cat /proc/meminfo | grep $S_MEMINFO_FIELD: | \
gawk '{gsub(/MemTotal:/,\"\");print}' | \
gawk '{gsub(/kB/,\"*1024\");print}' | \
gawk '{gsub(/KB/,\"*1024\");print}' | \
gawk '{gsub(/KiB/,\"*1024\");print}' | \
gawk '{gsub(/MB/,\"*1048576\");print}' | \
gawk '{gsub(/MiB/,\"*1048576\");print}' | \
gawk '{gsub(/GB/,\"*1073741824\");print}' | \
gawk '{gsub(/GiB/,\"*1073741824\");print}' | \
gawk '{gsub(/TB/,\"*1099511627776\");print}' | \
gawk '{gsub(/TiB/,\"*1099511627776\");print}' | \
gawk '{gsub(/B/,\"*1\");print}' | \
gawk '{gsub(/[^1234567890*]/,\"\");print}' \
); \
s_prod=s.gsub(/[\\s\\n\\r]/,\"\")+\"*1\";\
ar=s_prod.scan(/[\\d]+/);\
i_prod=1;\
ar.each{|s_x| i_prod=i_prod*s_x.to_i};\
print(i_prod.to_s+\" B\")"
The thing to notice here is that the "\" at the end of each line is a Bash line continuation. Basically it should all be a single line, with the exception of the commented-out lines, which must be deleted. The colon at the end of the
grep $S_MEMINFO_FIELD:
is important, because
cat /proc/meminfo | grep Inactive
prints multiple lines, but the rest of the script assumes that grep outputs only a single line. According to
https://unix.stackexchange.com/questions/263881/convert-meminfo-kb-to-bytes
/proc/meminfo uses a multiplier of 1024 regardless of whether the unit is written "kB" or "KB". I did not have any info about the GB/GiB and TB/TiB parts, but I assume they follow the same style, i.e. that 1 MB = 1024 * 1024 bytes.
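For the common case, a much smaller sketch does the same job (it assumes the field is reported in kB, which is what /proc/meminfo uses for MemTotal):
awk '/^MemTotal:/ {print $2 * 1024 " B"}' /proc/meminfo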
I created a Linux-specific Bash script that takes a /proc/meminfo field name as its first command-line argument and prints the value in bytes:
http://longterm.softf1.com/2016/comments/stackoverflow_com/2016_03_07_mmmv_proc_meminfo_filter_t1.bash
(archival copy: https://archive.is/vjcNf )
Thank You for reading my comment.
I hope that it helps. :-)
You may use the total gem (I'm the author):
require 'total'
puts Total::Mem.new.bytes