Parse output of command and store in variables - bash

I need to parse the output of the mmls command and store multiple values in variables using a BASH script.
Specifically, I need to store the sector size (512 in the example below) and the start values (0, 0, 63, 224910, 240975 in the example below). Since the second set of values represents partitions, the number of values captured could vary.
mmls /mnt/E01Mnt/RAW/ewf1
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors
     Slot     Start        End          Length       Description
000: Meta     0000000000   0000000000   0000000001   Primary Table (#0)
001: -------  0000000000   0000000062   0000000063   Unallocated
002: 000:000  0000000063   0000224909   0000224847   NTFS / exFAT (0x07)
003: 000:001  0000224910   0000240974   0000016065   DOS FAT12 (0x01)
004: -------  0000240975   0000250878   0000009904   Unallocated

Here's a start:
$ awk '/^Units/{print $4+0} /^[0-9]/{print $3+0}' file
512
0
0
63
224910
240975
Try to solve the rest yourself and then let us know if you have questions.
Explanation: file is a file containing your sample input. You can replace awk '{script}' file with command | awk '{script}' if your input is coming from the output of some command rather than being stored in a file.
^ is the universal regexp metacharacter for start of string while /.../ in awk means "find this regexp". So the above is looking for lines that start with the text shown (i.e. Units or digits) and then printing the 4th or 3rd space-separated field after adding zero to it to remove any trailing non-digits or leading zeros. man awk.

You need a bit of awk to start with.
values=( $(mmls /mnt/E01Mnt/RAW/ewf1 | awk '
/^Units are in/{match($4,/^[[:digit:]]+/,ss); print ss[0]}
NR>6{print $4}'
) )
Now you have a values array which contains both the sector size (first element) and the start values (subsequent elements). We can do some array manipulation to separate the individual elements.
secsize=${values[0]} # size of sector
declare -a sv # sv for start values
for ((i = 1; i < ${#values[@]}; i++))
do
    sv+=( "${values[i]}" )
done
echo "${sv[@]}" # print start values
unset values # You don't need values anymore.
Note: Requires GNU awk.
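A portable sketch that avoids gawk's three-argument match() by reusing the first answer's $4+0 trick to strip the "-byte" suffix; the sample mmls output is recreated in a file named file so the sketch is self-contained:

```shell
#!/usr/bin/env bash
# Recreate the sample mmls output shown above:
cat > file <<'EOF'
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors
Slot Start End Length Description
000: Meta 0000000000 0000000000 0000000001 Primary Table (#0)
001: ------- 0000000000 0000000062 0000000063 Unallocated
002: 000:000 0000000063 0000224909 0000224847 NTFS / exFAT (0x07)
003: 000:001 0000224910 0000240974 0000016065 DOS FAT12 (0x01)
004: ------- 0000240975 0000250878 0000009904 Unallocated
EOF

# $4+0 strips "-byte" from "512-byte"; $3+0 drops Start's leading zeros.
values=( $(awk '/^Units/{print $4+0} /^[0-9]/{print $3+0}' file) )
secsize=${values[0]}      # first element: sector size
sv=( "${values[@]:1}" )   # remaining elements: start values, one per slot
echo "sector size: $secsize"
echo "start values: ${sv[*]}"
```

For the sample above this prints a sector size of 512 and start values 0 0 63 224910 240975; the ${values[@]:1} slice replaces the explicit copy loop.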

Related

search and replace specific positions of a fixed-length file

I have several fixed length files where every position or position range is reserved for a particular field. The first few fields are year, term, name, DOB, gender...
Year starts in position 1 and is of length 2
Term starts in position 3 and is of length 1
Name starts in position 4 and is of length 35
DOB starts in position 39 and is of length 6
Gender starts in position 45 and is of length 1
...
This is true for all files. Not all fields are always present. Example, Name field may be 35 blanks/white spaces since it was not reported. The same may be true about other fields.
I need to search the Name field (whether it has a value or not) and replace its contents with a dummy string, which could be 'xxxxxxxx', but the length should not exceed 35, and after the replacement the positions of all fields should not have changed.
All files have 80 fields.
Sample file containing 3 lines. Each line begins with 182:
182 1 405080711 001 0425594
07 5 4170000000000000 00000000000000000000000000000000000000000000 0000
9 05000002
182 1 205080712 001 0480201
07 5 3300000000000000 00000000000000000000000000000000000000000000 0000
05000004
182 2 005080713 001 0425824
07 5 3080000000000000 00000000000000000000000000000000000000000000 0000
05000005
I am using the following sed command to replace a blank Name field with the below string.
However, this overwrites all fields prior to Name, which starts at position 4.
sed -E 's/^(.{3})(.{36})/First Name of the student-Last Name/' File_name
Open to use any other command such as awk etc.
Actual whitespace between fields may not be showing here due to auto-formatting.
In sample line 1 above there are actually 41 spaces between "182" and "1".
Appreciate any help.
Using perl:
perl -pe 'BEGIN { $name = sprintf "% -35.35s", "xxxxxxxx" }
substr($_, 3, 35) = $name' input.txt
perl uses 0-based indexes, so this replaces the 35 characters of each line of the input file starting at the 4th character with the value xxxxxxxx padded with enough trailing spaces to make 35 characters total (And if xxxxxxxx is more than 35 characters long, truncates it to 35). The modified line is then printed to standard output. Use perl -i -pe '...' input.txt to modify the file in place.
Or a similar awk version:
awk 'BEGIN { name = sprintf("% -35.35s", "xxxxxxxx") }
{ printf "%s%s%s\n", substr($0, 1, 3), name, substr($0, 39) }' input.txt
awk's substr doesn't have a way to replace part of a string like perl's does, so this one extracts the parts before and after the name field and prints them all out with the new name value. Not as elegant but gets the job done.
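For completeness, a bash-only sketch of the same replacement under the stated layout (Name at positions 4-38); the sample line and the input.txt/output.txt names here are made up for illustration:

```shell
#!/usr/bin/env bash
# A hypothetical fixed-width sample: "182", a 35-char Name field, the rest:
printf '182%-35sREST\n' 'Some Student Name' > input.txt

# Pad (or truncate) the dummy value to exactly 35 characters:
name=$(printf '%-35.35s' 'xxxxxxxx')
while IFS= read -r line; do
    # keep chars 1-3, substitute the 35-char Name field, keep chars 39 onward
    printf '%s\n' "${line:0:3}${name}${line:38}"
done < input.txt > output.txt
cat output.txt
```

Because the replacement is padded to exactly 35 characters, every field after Name keeps its original position.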

In bash, I want to generate, based on a set of words, a fixed 4-character output for each word that always matches

I got these words
Frank_Sinatra
Dean_Martin
Ray_Charles
I want to generate 4 characters which will always match with those words and never change.
e.g.:
frk ) Frank_Sinatra
dnm ) Dean_Martin
Ray ) Ray_Charles
and it shall always match these 4 characters when I run it again (not random)
note:
Something like this:
String   32-bit checksum   8-bit checksum
ABC      326  0x146        70   0x46
ACB      410  0x19A        154  0x9A
BAC      350  0x15E        94   0x5E
BCA      450  0x1C2        194  0xC2
CAB      399  0x18F        143  0x8F
CBA      256  0x100        0    0x00
http://www.flounder.com/checksum.htm
Look at this command --->
echo -n Frank_Sinatra | md5sum
d0f7287be11d7bbfe53809088ea3b009 -
but instead of that long string, I wanted just 4 unique characters.
I did it like this:
echo -n "Frank_Sinatra" | md5sum > foo ; sed -i 's/./&\n#/4' foo
grep -v "#" foo > bar
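The temporary files aren't needed; cut can take the first four characters directly (a sketch; note that a truncated hash is not guaranteed to be unique across arbitrary word lists, so collisions should still be checked for):

```shell
# First 4 hex characters of the md5 digest (d0f7 for the output above):
id=$(printf '%s' 'Frank_Sinatra' | md5sum | cut -c1-4)
echo "$id"
```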
I'm not going to write the entire program for you, but I can share an algorithm that can accomplish this. I can't guarantee that it is the most efficient algorithm.
Problem
Generate a 3-letter identifier for each line in a text file that is unique, such that grep will only match with the intended line.
Assumption
There exists a 3-letter identifier for each line such that grep will only match that line.
Algorithm
For every line in text file
Grab a permutation of the line, run grep on the file using that permutation.
If grep returns more than one line, get a new permutation of the line and go back to the previous step.
If grep returns only one line and that line matches our current line, we found a proper identifier. Store this identifier.
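The steps above can be sketched in bash. As a simplifying assumption (my own, not part of the answer), candidates are drawn from 3-character substrings of the word rather than arbitrary permutations:

```shell
#!/usr/bin/env bash
# find_id WORD FILE: print the first 3-char substring of WORD that grep
# matches on exactly one line of FILE (necessarily WORD's own line).
find_id() {
    local word=$1 file=$2 cand i
    for (( i = 0; i <= ${#word} - 3; i++ )); do
        cand=${word:i:3}
        # unique identifier: fixed-string grep hits exactly one line
        if [ "$(grep -ciF -- "$cand" "$file")" -eq 1 ]; then
            printf '%s ) %s\n' "$cand" "$word"
            return 0
        fi
    done
    return 1   # no unique 3-char identifier exists for this word
}

printf '%s\n' Frank_Sinatra Dean_Martin Ray_Charles > words.txt
for w in Frank_Sinatra Dean_Martin Ray_Charles; do
    find_id "$w" words.txt
done
```

For the sample list this yields Fra, Dea and Ray; since every candidate is re-tested with grep against the whole file, a stored identifier can never match two lines.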

how to add one to all fields in a file

suppose I have a file containing numbers like:
1 4 7
2 5 8
and I want to add 1 to all these numbers, making the output like:
2 5 8
3 6 9
is there a simple one-line command (e.g. awk) to realize this?
Try the following:
awk '{for(i=1;i<=NF;i++){$i=$i+1}} 1' Input_file
EDIT: As per the OP's request for a version without a loop, here is a solution (written for the shown sample only).
With the number of fields hardcoded:
awk -v RS='[ \n]' '{ORS=NR%3==0?"\n":" ";print $0+1}' Input_file
OR
Without hardcoding the number of fields:
awk -v RS='[ \n]' -v col=$(awk 'FNR==1{print NF}' Input_file) '{ORS=NR%col==0?"\n":" ";print $0+1}' Input_file
Explanation: in the first EDIT solution the number of fields is hardcoded as 3; in the OR solution a variable named col reads the number of fields from the very first line of Input_file. As for the code itself: the record separator is set to a space or newline, so every number becomes its own record and can be incremented without a loop; ORS then emits a space after each incremented value and a newline whenever the record number is evenly divisible by col (which is why the field count is captured in the -v col section).
In native bash (no awk or other external tool needed):
#!/usr/bin/env bash
while read -r -a nums; do # read a line into an array, splitting on spaces
out=( ) # initialize an empty output array for that line
for num in "${nums[@]}"; do # iterate over the input array...
out+=( "$(( num + 1 ))" ) # ...and add n+1 to the output array.
done
printf '%s\n' "${out[*]}" # then print that output array with a newline following
done <in.txt >out.txt # with input from in.txt and output to out.txt
You can do this using GNU awk:
awk -v RS="[[:space:]]+" '{$0++; ORS=RT} 1' file
2 5 8
3 6 9
If you don't mind Perl:
perl -pe 's/(\d+)/$1+1/eg' file
Substitute any number composed of multiple digits (\d+) with that number ($1) plus 1. /e means to execute the replacement calculation, and /g means globally throughout the file.
As mentioned in the comments, the above only works for positive integers - per the OP's original sample file. If you wanted it to work with negative numbers, decimals and still retain text and spacing, you could go for something like this:
perl -pe 's/([-]?[.0-9]+)/$1+1/eg' file
Input file
Some column headers # words
1 4 7 # a comment
2 5 cat dog # spacing and stray words
+5 0 # plus sign
-7 4 # minus sign
+1000.6 # positive decimal
-21.789 # negative decimal
Output
Some column headers # words
2 5 8 # a comment
3 6 cat dog # spacing and stray words
+6 1 # plus sign
-6 5 # minus sign
+1001.6 # positive decimal
-20.789 # negative decimal

Convert Mainframe SORT to Shell Script

Is there any easy way to convert JCL SORT to Shell Script?
Here is the JCL SORT:
OPTION ZDPRINT
SORT FIELDS=(15,1,CH,A)
SUM FIELDS=(16,8,25,8,34,8,43,8,52,8,61,8),FORMAT=ZD
OUTREC BUILD=(14X,15,54,13X)
Only bytes 15 onward, for a length of 54, are relevant from the input data: the key and the source values for the summation. Other bytes from the input are not important.
Assuming the data is printable.
The data is sorted on the one-byte key, and the values for records with the same key are summed, separately, for each of the six numbers. A single record is written per key, with the summed values and with other data (those one-byte separators in between and at the end) from the first record. The sort is "unstable" (meaning that the order of records presented to the summation is not reproducible from one execution to the next), so the byte values should theoretically be the same on all records, or be irrelevant.
The output, for each key, is presented as a record containing 14 blanks (14X), then the 54 bytes starting at position 15 (which is the one-byte key), and then 13 blanks (13X). The numbers should be right-aligned and left-zero-filled [OP to confirm, and amend sample data and expected output].
Assuming the sums will only contain positive numbers and will not be signed, and that any number less than 999999990 will have leading zeros in its unused positions (numbers are character, right-aligned and left-zero-filled).
Assuming the one-byte key will only be alphabetic.
The data has already been converted to ASCII from EBCDIC.
Sample Input:
00000000000000A11111111A11111111A11111111A11111111A11111111A111111110000000000000
00000000000000B22222222A22222222A22222222A22222222A22222222A222222220000000000000
00000000000000C33333333A33333333A33333333A33333333A33333333A333333330000000000000
00000000000000A44444444B44444444B44444444B44444444B44444444B444444440000000000000
Expected Output:
A55555555A55555555A55555555A55555555A55555555A55555555
B22222222A22222222A22222222A22222222A22222222A22222222
C33333333A33333333A33333333A33333333A33333333A33333333
(14 preceding blanks and 13 trailing blanks)
Expected volume: tens of thousands.
I have figured out an answer:
awk -v FIELDWIDTHS="14 1 8 1 8 1 8 1 8 1 8 1 8 13" \
'{if(!($2 in a)) {a[$2]=$2; c[$2]=$4; e[$2]=$6; g[$2]=$8; i[$2]=$10; k[$2]=$12} \
b[$2]+=$3; d[$2]+=$5; f[$2]+=$7; h[$2]+=$9; j[$2]+=$11; l[$2]+=$13;} END \
{for(id in a) printf("%14s%s%s%s%s%s%s%s%s%s%s%s%s%13s\n","",a[id],b[id],c[id],d[id],e[id],f[id],g[id],h[id],i[id],j[id],k[id],l[id],"");}' input
Explanation:
1) Split the string
awk -v FIELDWIDTHS="14 1 8 1 8 1 8 1 8 1 8 1 8 13"
2) Let $2 be the key; $4, $6, $8, $10 and $12 are set only the first time each key is seen
{if(!($2 in a)) {a[$2]=$2; c[$2]=$4; e[$2]=$6; g[$2]=$8; i[$2]=$10; k[$2]=$12}
3) Others will be summed up
b[$2]+=$3; d[$2]+=$5; f[$2]+=$7; h[$2]+=$9; j[$2]+=$11; l[$2]+=$13;} END
4) Print for each key
{for(id in a) printf("%14s%s%s%s%s%s%s%s%s%s%s%s%s%13s\n","",a[id],b[id],c[id],d[id],e[id],f[id],g[id],h[id],i[id],j[id],k[id],l[id],"");}
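The same logic can be written in portable awk, using substr instead of gawk's FIELDWIDTHS (a sketch under the stated assumptions: one-byte key at position 15, six 8-digit numbers nine bytes apart, separators kept from the first record; %08d keeps the sums right-aligned and left-zero-filled). The sample input is recreated inline so it can be run as-is:

```shell
#!/usr/bin/env bash
cat > input <<'EOF'
00000000000000A11111111A11111111A11111111A11111111A11111111A111111110000000000000
00000000000000B22222222A22222222A22222222A22222222A22222222A222222220000000000000
00000000000000C33333333A33333333A33333333A33333333A33333333A333333330000000000000
00000000000000A44444444B44444444B44444444B44444444B44444444B444444440000000000000
EOF

awk '{
    key = substr($0, 15, 1)
    if (!(key in seen)) {               # first record for this key:
        seen[key] = 1; order[++n] = key # remember encounter order...
        for (i = 1; i < 6; i++)         # ...and the separator bytes
            sep[key, i] = substr($0, 15 + 9 * i, 1)
    }
    for (i = 0; i < 6; i++)             # six 8-digit numbers, 9 bytes apart
        sum[key, i] += substr($0, 16 + 9 * i, 8)
} END {
    for (j = 1; j <= n; j++) {
        key = order[j]; line = key
        for (i = 0; i < 6; i++)
            line = line sprintf("%08d", sum[key, i]) ((i < 5) ? sep[key, i + 1] : "")
        printf "%14s%s%13s\n", "", line, ""
    }
}' input > summed
cat summed
```

For the sample input this reproduces the expected output above (A55555555..., B22222222..., C33333333..., each with 14 leading and 13 trailing blanks).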
Okay, I have tried something.
1) Extract duplicate keys from the file and store them in a duplicates file.
awk '{k=substr($0,1,15);a[k]++}END{for(i in a)if(a[i]>1)print i}' sample > duplicates
OR
awk '{k=substr($0,1,15);print k}' sample | sort | uniq -c | awk '$1>1{print $2}' > duplicates
2) For the duplicates, do the calculation and create newfile in the specified format.
while read line
do
grep ^$line sample | awk -F[A-Z] -v key=$line '{for(i=2;i<=7;i++)f[i]=f[i]+$i}END{printf("%14s"," ");for(i=2;i<=7;i++){printf("%s%.8s",substr(key,15,1),f[i]);if(i==7)printf("%13s\n"," ")}}' > newfile
done < duplicates
3) For the unique ones, format and append to newfile.
grep -v -f duplicates sample | sed 's/0/ /g' >> newfile ## incorrect if a 0 occurs within the data rather than only at the start and end of a row.
OR
grep -v -f duplicates sample | awk '{printf("%14s%s%13s\n"," ",substr($0,15,54)," ")}' >> newfile
if you have any doubt, let me know.

How to replace all matches with an incrementing number in BASH?

I have a text file like this:
AAAAAA this is some content.
This is AAAAAA some more content AAAAAA. AAAAAA
This is yet AAAAAA some more [AAAAAA] content.
I need to replace all occurrence of AAAAAA with an incremented number, e.g., the output would look like this:
1 this is some content.
This is 2 some more content 3. 4
This is yet 5 some more [6] content.
How can I replace all of the matches with an incrementing number?
Here is one way of doing it:
$ awk '{for(x=1;x<=NF;x++)if($x~/AAAAAA/){sub(/AAAAAA/,++i)}}1' file
1 this is some content.
This is 2 some more content 3. 4
This is yet 5 some more [6] content.
A perl solution:
perl -pe 'BEGIN{$A=1;} s/AAAAAA/$A++/ge' test.dat
This might work for you (GNU sed):
sed -r ':a;/AAAAAA/{x;:b;s/9(_*)$/_\1/;tb;s/^(_*)$/0\1/;s/$/:0123456789/;s/([^_])(_*):.*\1(.).*/\3\2/;s/_/0/g;x;G;s/AAAAAA(.*)\n(.*)/\2\1/;ta}' file
This is a toy example, perl or awk would be a better fit for a solution.
The solution only acts on lines which contain the required string (AAAAAA).
The hold buffer is used as a place to keep the incremented integer.
In overview: when a required string is encountered, the integer in the hold space is incremented, appended to the current line, swapped for the required string, and the process is repeated until all occurrences of the string are accounted for.
Incrementing an integer simply swaps the last digit (other than trailing 9's) for the next integer in sequence, i.e. 0 to 1, 1 to 2 ... 8 to 9. Where trailing 9's occur, each trailing 9 is replaced by a non-integer character, e.g. '_'. If the number being incremented consists entirely of trailing 9's, a 0 is added to the front of the number so that it can be incremented to 1. Following the increment operation, the trailing 9's (now _'s) are replaced by '0's.
As an example, say the integer 9 is to be incremented:
9 is replaced by _, a 0 is prepended (0_), the 0 is swapped for 1 (1_), and the _ is replaced by 0, resulting in the number 10.
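The increment machinery can be watched in isolation (GNU sed, the same substitutions as in the answer): 199 gains a trailing _ per 9, the remaining last digit is bumped via the appended :0123456789 lookup, and the _s become 0s:

```shell
echo 199 |
sed -r ':b; s/9(_*)$/_\1/; tb; s/^(_*)$/0\1/; s/$/:0123456789/; s/([^_])(_*):.*\1(.).*/\3\2/; s/_/0/g'
# prints 200
```

The same pipeline turns 9 into 10, exercising the prepend-0 branch.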
See comments directed at #jaypal for further notes.
Maybe something like this:
#!/bin/bash
NR=1
while read -r line; do
    while [[ $line == *AAAAAA* ]]; do
        line=${line/AAAAAA/$NR}   # replace the first remaining occurrence
        NR=$((NR + 1))
    done
    echo "$line"
done < filename
Perl did the job for me
perl -pi -e 's/\b'DROP'\b/$&.'_'.++$A /ge' /folder/subfolder/subsubfolder/*
Input:
DROP
drop
$drop
$DROP
$DROP="DROP"
$DROP='DROP'
$DROP=$DROP
$DROP="DROP";
$DROP='DROP';
$DROP=$DROP;
$var="DROP_ACTION"
drops
DROPS
CODROP
'DROP'
"DROP"
/DROP/
Output:
DROP_1
drop
$drop
$DROP_2
$DROP_3="DROP_4"
$DROP_5='DROP_6'
$DROP_7=$DROP_8
$DROP_9="DROP_10";
$DROP_11='DROP_12';
$DROP_13=$DROP_14;
$var="DROP_ACTION"
drops
DROPS
CODROP
'DROP_15'
"DROP_16"
/DROP_17/