How to extract phone number and Pin from each text line - bash
Sample Text from the log file
2021/08/29 10:25:37 20210202GL1 Message Params [userid:user1] [timestamp:20210829] [from:TEST] [to:0214736848] [text:You requested for Pin reset. Your Customer ID: 0214736848 and PIN: 4581]
2021/08/27 00:03:18 20210202GL2 Message Params [userid:user1] [timestamp:20210827] [from:TEST] [to:0214736457] [text:You requested for Pin reset. Your Customer ID: 0214736457 and PIN: 6193]
2021/08/27 10:25:16 Thank you for joining our service; Your ID is 0214736849 and PIN is 5949
Other wording and formatting can change but ID and PIN don't change
Expected output for each line
0214736848#4581
0214736457#6193
0214736849#5949
Below is what I have tried using bash, though I am currently only able to extract every numeric value on the line:
while read p; do
    NUM=''
    counter=1;
    text=$(echo "$p" | grep -o -E '[0-9]+')
    for line in $text
    do
        if [ "$counter" -eq 1 ] #if is equal to 1
        then
            NUM+="$line" #concatenate string
        else
            NUM+="#$line" #concatenate string
        fi
        let counter++ #Increment counter
    done
    printf "$NUM\n"
done < logfile.log
Current output, which is not what I expect:
2021#08#29#00#03#18#20210202#2#1#20210826#0214736457#0214736457#6193
2021#08#27#10#25#37#20210202#1#1#20210825#0214736848#0214736848#4581
2021#08#27#10#25#16#0214736849#5949
Another variation using gawk and 2 capture groups, matching 1 or more digits per group:
awk '
match($0, /ID: ([0-9]+) and PIN: ([0-9]+)/, m) {
print m[1]"#"m[2]
}
' file
Output
0214736848#4581
0214736457#6193
For the updated question, you could match either ":" or " is" if you want a more precise match; the values you need are then in capture groups 2 and 4.
awk '
match($0, /ID(:| is) ([0-9]+) and PIN(:| is) ([0-9]+)/, m) {
print m[2]"#"m[4]
}
' file
Output
0214736848#4581
0214736457#6193
0214736849#5949
Using sed capture groups you can do:
sed 's/.* Your Customer ID: \([0-9]*\) and PIN: \([0-9]*\).*/\1#\2/g' file.txt
With your shown samples, please try the following awk code, which does the job with several field separators. "Customer ID: ", " and PIN: " and a trailing "]" are all treated as field separators, so the 2nd and 3rd fields are the ID and the PIN, and they are printed joined by "#" as required by the OP.
awk -v FS='Customer ID: | and PIN: |]$' '{print $2"#"$3}' Input_file
With bash and a regex:
while IFS='] ' read -r line; do
[[ "$line" =~ ID:\ ([^\ ]+).*PIN:\ ([^\ ]+)] ]]
echo "${BASH_REMATCH[1]}#${BASH_REMATCH[2]}"
done <file
Output:
0214736848#4581
0214736457#6193
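The bash regex above only accepts the "ID: ... PIN: ..." wording, so it does not match the third sample line from the updated question. A minimal sketch of the same approach that accepts either ":" or " is" follows; the function name extract_id_pin is made up for illustration, and it assumes the ID and PIN always consist of digits only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log lines on stdin and prints ID#PIN for each
# line that contains both values. Lines without a match are skipped.
extract_id_pin() {
    # Keeping the pattern in a variable avoids quoting problems inside [[ ]].
    local re='ID(:| is) ([0-9]+).*PIN(:| is) ([0-9]+)'
    local line
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            # Groups 1 and 3 hold the ":" / " is" alternatives,
            # groups 2 and 4 hold the digits.
            printf '%s#%s\n' "${BASH_REMATCH[2]}" "${BASH_REMATCH[4]}"
        fi
    done
}
```

It could be called as extract_id_pin < logfile.log. Checking the result of [[ ... ]] before printing means a line with no ID/PIN pair produces no output instead of a bare "#".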
Given the updated input in your question, using any sed in any shell on every Unix box:
$ sed 's/.* ID[: ][^0-9]*\([0-9]*\).* PIN[: ][^0-9]*\([0-9]*\).*/\1#\2/' file
0214736848#4581
0214736457#6193
0214736849#5949
Original answer:
Using any awk in any shell on every Unix box:
$ awk -v OFS='#' '{print $18, $21+0}' file
0214736848#4581
0214736457#6193
Related
How to send shell script output in a tablular form and send the mail
I have a shell script which gives a few lines of output. Below is the output I am getting from the shell script. The script flow is: first it checks whether the file exists. If it does, it should give me the file name and modified date. If it does not, it should give me the file name and "Not Found". The result should be in tabular form and sent by email. It should also add a header to the output.
CMC_daily_File.xlsx Not Found
CareOneHMA.xlsx Jun 11
Output
File Name            Modified Date
CMC_daily_File.xlsx  Not Found
CareOneHMA.xlsx      Jun 11
UPDATE sample of script
#!/bin/bash
if [ -e /saddwsgnas/radsfftor/coffe/COE_daily_File.xlsx ]; then
    cd /sasgnas/radstor/coe/
    ls -la COE_daily_File.xlsx | awk '{print $9, $6"_"$7}'
else
    echo "CMC_COE_daily_File.xlsx Not_Found"
fi
Output
CMC_COE_daily_File.xlsx Jun_11
I thought I might offer you some options with a slightly modified script. I use the stat command to obtain the file modification time in a more expansive format, and I specify an arbitrary, pre-defined spacer character to divide the column data. That way, you can focus on displaying the content in its original, untampered form. This also allows filenames which contain spaces to be reported without affecting the logic for formatting and aligning columns.
The column command is told about that spacer character and adjusts the width of each column to its widest content. (I only wish that it also allowed you to specify a column divider character to be printed, but that is not part of its features.)
I also added the extra awk action, on the chance that you might be interested in making the results stand out more.
#!/bin/sh
#QUESTION: https://stackoverflow.com/questions/74571967/how-to-send-shell-script-output-in-a-tablular-form-and-send-the-mail
SPACER="|"
SOURCE_DIR="/saddwsgnas/radsfftor/coe"
SOURCE_DIR="."
{
    printf "File Name${SPACER}Modified Date\n"
    #for file in COE_daily_File.xlsx
    for file in test_55.sh awkReportXmlTagMissingPropertyFieldAssignment.sh test_54.sh
    do
        if [ -e "${SOURCE_DIR}/${file}" ]; then
            cd "${SOURCE_DIR}"
            #ls -la "${file}" | awk '{print $9, $6"_"$7}'
            echo "${file}${SPACER}"$(stat --format "%y" "${file}" | cut -f1 -d\. | awk '{ print $1, $2 }' )
        else
            echo "${file}${SPACER}Not Found"
        fi
    done
} | column -x -t -s "|" |
awk '{
    ### Refer to:
    # https://man7.org/linux/man-pages/man4/console_codes.4.html
    # https://www.ecma-international.org/publications-and-standards/standards/ecma-48/
    if( NR == 1 ){
        printf("\033[93;3m%s\033[0m\n", $0) ;
    }else{
        print $0 ;
    } ;
}'
Without that last awk command, the output session for that script was as follows:
ericthered@OasisMega1:/0__WORK$ ./test_55.sh
File Name                                         Modified Date
test_55.sh                                        2022-11-27 14:07:15
awkReportXmlTagMissingPropertyFieldAssignment.sh  2022-11-05 21:28:00
test_54.sh                                        2022-11-27 00:11:34
ericthered@OasisMega1:/0__WORK$
With that last awk command, the header line is printed in italic bright yellow (the \033[93;3m escape sequence), so it stands out from the rows below it.
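Where the column command is not available, the same two-column layout can be approximated with printf padding. The following is only a sketch: file_report is a hypothetical helper name, the column width of 40 is arbitrary, and it assumes a date command that accepts -r FILE to print a file's modification time (GNU coreutils does; BSD date interprets -r differently).

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a header plus one row per file name given as
# an argument. Existing files get their modification time, others "Not Found".
file_report() {
    local file mtime
    # %-40s left-aligns the name in a 40-character column, so names with
    # spaces do not disturb the alignment.
    printf '%-40s %s\n' 'File Name' 'Modified Date'
    for file in "$@"; do
        if [ -e "$file" ]; then
            # Assumption: GNU-style "date -r FILE" (modification time of FILE).
            mtime=$(date -r "$file" '+%Y-%m-%d %H:%M:%S')
            printf '%-40s %s\n' "$file" "$mtime"
        else
            printf '%-40s %s\n' "$file" 'Not Found'
        fi
    done
}
```

A call such as file_report COE_daily_File.xlsx CareOneHMA.xlsx writes the table to standard output, so it can be redirected to a file or piped into whatever mail command is installed on the system.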
How to get a number with variable number of digits from a string in a file using bash script?
I have the following file:
APP_VERSION.ts
export const APP_VERSION = 1;
This is the only content of that file, and the APP_VERSION variable will be incremented as needed. So, the APP_VERSION could be a single digit number or multiple digit number, like 15 or 999, etc.
I need to use that value in one of my bash scripts.
use-app-version.sh
APP_VERSION=`cat src/constants/APP_VERSION.ts`
echo $APP_VERSION
I know I can read it with cat. But how can I parse that string so I can get exactly the APP_VERSION value, whether it's 1 or 999, for example?
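sed -En 's/(^.*APP_VERSION[^[:digit:]]*)([[:digit:]]+)(;.*$)/\2/p' src/constants/APP_VERSION.ts
Using sed, split the line into three sections, each delimited by an opening and a closing parenthesis. The first section must stop before the first digit; a greedy .* there would swallow all but the last digit of a multi-digit version. Substitute the whole line with the second section (the version value) and print it.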
You may use this awk:
app_ver=$(awk -F '[[:blank:];=]+' '$(NF-2) == "APP_VERSION" {print $(NF-1)}' src/constants/APP_VERSION.ts)
echo "$app_ver"
1
You can chain some commands to remove everything else:
APP_VERSION=`cat src/constants/APP_VERSION.ts | awk -F '=' '{print $2}' | tr -d ' ' | tr -d ';'`
1 - cat gets the whole file content
2 - awk gets everything after '='
3 - Remove spaces
4 - Remove ;
A simple APP_VERSION=$(grep --text -Eo '[0-9]+' src/constants/APP_VERSION.ts) should be enough
With bash only:
APP_VERSION=$(cat src/constants/APP_VERSION.ts)
APP_VERSION=${APP_VERSION%;}
APP_VERSION=${APP_VERSION/*= }
Line 2 removes the trailing ';', line 3 removes everything before "= ".
Alternatively, you could set APP_VERSION as an array, take the 5th element, and remove the trailing ';'.
Or, another solution, using IFS:
IFS='=;' read a APP_VERSION < src/constants/APP_VERSION.ts
In this version, the space will remain before the version number.
Assuming that the task can be rephrased to "extract the digits from a file", there are a few options:
Delete all characters that aren't digits with tr:
version=$(tr -cd '[:digit:]' < infile)
Use grep to match all digits and retain nothing but the match:
version=$(grep -Eo '[[:digit:]]+' infile)
Read the file into a string and delete all non-digits with just Bash:
contents=$(< infile)
version=${contents//[![:digit:]]}
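The digit-only approaches above rely on the file containing no other digits. If that cannot be guaranteed, one possible variation is to anchor the match on the APP_VERSION assignment itself. This is a sketch only: read_app_version is a hypothetical name, and it assumes the assignment sits on a single line in the form shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the integer assigned to APP_VERSION in the
# given file, or returns non-zero if no such assignment is found.
read_app_version() {
    local file=$1 line
    local re='APP_VERSION[[:space:]]*=[[:space:]]*([0-9]+)'
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
            return 0
        fi
    done < "$file"
    return 1
}
```

In a script it might be used as APP_VERSION=$(read_app_version src/constants/APP_VERSION.ts) || exit 1, so a missing or malformed file stops the script instead of leaving the variable empty.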
Wrong search result in a file through Bash script
I am searching for an event field in a file, but it gives the wrong output. I am looking for the gpio-keys event among the input devices, and I have written a script for that, but nothing ends up in the output file (in my case I write to a buttonDevice file, and it is always empty). Please help me figure this out. What am I doing wrong in the script?
Bash script:
#!/bin/bash
if grep -q "gpio-keys" /proc/bus/input/devices ; then
    EVENT=$(cat /proc/bus/input/devices | grep "Handlers=kbd")
    foo= `echo $EVENT | awk '{for(i=1;i<=NF;i++) if($i=="evbug")printf($(i-1))}'`
    #foo=${EVENT:(-7)}
    echo -n $foo > /home/ubuntu/Setups/buttonDevice
fi
"i am still not able to get anything in buttondevce"
That's no wonder, since the input line
H: Handlers=kbd event0
does not contain the evbug your awk script is looking for.
"In my case it is event0 but it may vary also depends on how kernel allows."
If it is event0 or similar, then it makes no sense to look for evbug. Change the statement
if($i=="evbug")printf($(i-1))
to
if ($i~"event") print $i
(using a regular expression match).
"I have rewritten my script like above. but through it, I have got two events(event0, event3) but … my input devices are many but i want the gpio-keys event"
Aha - in order to take only the handler line from the gpio-keys section, you can use sed with an address range:
EVENT=`sed -n '/gpio-keys/,/Handlers=kbd/s/.*Handlers=kbd //p' </proc/bus/input/devices`
Prakash, I don't have access to your Google Drive, but I want to give you some suggestions.
foo= `echo $EVENT | awk '{for(i=1;i<=NF;i++) if($i=="evbug")printf($(i-1))}'`
Backticks are the old style. Prefer $( ), and note that there must be no space after the = in an assignment:
foo=$(echo $EVENT | awk '{for(i=1;i<=NF;i++) if($i=="evbug")printf($(i-1))}')
Also, always use double quotes when echoing a variable:
echo -n "$foo" > /home/ubuntu/Setups/buttonDevice
Try the code below; it should work for you:
#!/bin/bash
if grep "gpio-keys" /proc/bus/input/devices >/dev/null ; then
    cat /proc/bus/input/devices | grep "Handlers=kbd" | awk '{for(i=1;i<=NF;i++){ if($i ~ /eve/){printf "%s \n", $i} } }' > /home/ubuntu/Setups/buttonDevice
fi
The output in buttonDevice would be:
event0
event1
. . . .
event100
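To pick only the handler that belongs to the gpio-keys device, another option is awk's paragraph mode, since /proc/bus/input/devices separates device blocks with blank lines. The sketch below is illustrative rather than a drop-in fix: gpio_keys_event is a made-up function name, and the optional path argument exists only so the function can be tried against a sample file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the first eventN handler found in the device
# block that mentions gpio-keys. Reads /proc/bus/input/devices by default.
gpio_keys_event() {
    # RS='' puts awk in paragraph mode: each blank-line-separated block
    # (one input device) becomes a single record.
    awk -v RS='' '
        /gpio-keys/ {
            if (match($0, /event[0-9]+/)) {
                print substr($0, RSTART, RLENGTH)
                exit
            }
        }
    ' "${1:-/proc/bus/input/devices}"
}
```

The result could then be stored with something like gpio_keys_event > /home/ubuntu/Setups/buttonDevice, which leaves the file empty when no gpio-keys block exists.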
appending text to specific line in file bash
So I have a file that contains some lines of text separated by ','. I want to create a script that counts how many parts a line has, and if the line contains 16 parts I want to add a new one. So far it's working great. The only thing that is not working is appending the ',' at the end. See my example below:
Original file:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
Expected result:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
This is my code:
while read p; do
    if [[ $p == "HEA"* ]]
    then
        IFS=',' read -ra ADDR <<< "$p"
        echo ${#ADDR[@]}
        arrayCount=${#ADDR[@]}
        if [ "${arrayCount}" -eq 16 ]; then
            sed -i "/$p/ s/\$/,xx/g" $f
        fi
    fi
done <$f
Result:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
,xx
What am I doing wrong? I'm sure it's something small but I can't find it.
It can be done using awk:
awk -F, 'NF==16{$0 = $0 FS "xx"} 1' file
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
-F, sets the input field separator to a comma
NF==16 is the condition: execute the block inside { and } if the number of fields is 16
$0 = $0 FS "xx" appends ,xx to the end of the line
1 is an always-true pattern, so awk applies its default action and prints the line
With sed, the approach would be as follows:
Use the ${line_number}s/..../..../ format. To target a specific line, you need to find out its line number first.
Use the special character & to refer to the matched string.
The sed statement should look like the following:
sed -i "${line_number}s/.*/&,xx/" "$f"
I would prefer to leave it to you to play around with it, but if you like I can give you a full working sample.
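The awk one-liner shown earlier writes its result to standard output, while the question edits the file in place with sed -i. One way to combine the two is to let awk write to a temporary file and then copy the result back. This is a sketch under assumptions: append_field_if_16 is a hypothetical name, the field count of 16 and the value xx come from the question, and mktemp is assumed to be available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends ",xx" to every line of FILE that has exactly
# 16 comma-separated fields, rewriting FILE in a single pass.
append_field_if_16() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    if awk -F, 'NF == 16 { $0 = $0 FS "xx" } 1' "$file" > "$tmp"; then
        # Copy the content back instead of using mv, so the original file
        # keeps its permissions and ownership.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Because the file is processed once as a whole, this avoids running sed -i from inside a loop that is still reading the same file.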
How to split a string in bash delimited by tab
I'm trying to split a tab-delimited field in bash.
I am aware of this answer: how to split a string in shell and get the last field
But that does not cover the tab character.
I want to get the part of a string before the tab character, so I'm doing this:
x=`head -1 my-file.txt`
echo ${x%\t*}
But the \t is matching on the letter 't' and not on a tab. What is the best way to do this?
Thanks
If your file looks something like this (with tab as separator):
1st-field    2nd-field
you can use cut to extract the first field (it operates on tab by default):
$ cut -f1 input
1st-field
If you're using awk, there is no need to use tail to get the last line. Changing the input to:
1:1st-field    2nd-field
2:1st-field    2nd-field
3:1st-field    2nd-field
4:1st-field    2nd-field
5:1st-field    2nd-field
6:1st-field    2nd-field
7:1st-field    2nd-field
8:1st-field    2nd-field
9:1st-field    2nd-field
10:1st-field    2nd-field
Solution using awk:
$ awk 'END {print $1}' input
10:1st-field
Pure bash solution:
#!/bin/bash
while read a b;do last=$a; done < input
echo $last
outputs:
$ ./tab.sh
10:1st-field
Lastly, a solution using sed:
$ sed '$s/\(^[^\t]*\).*$/\1/' input
10:1st-field
Here, $ is the address of the last line, i.e. the substitution operates on the last line only.
For your original question, use a literal tab (often typed as Control-V Tab in a terminal) where the gap appears below, both inside the quotes and inside the expansion:
x="1st-field    2nd-field"
echo ${x%    *}
outputs:
1st-field
Use $'ANSI-C' strings in the parameter expansion:
$ x=$'abc\tdef\tghi'
$ echo "$x"
abc     def     ghi
$ echo ">>${x%%$'\t'*}<<"
>>abc<<
read field1 field2 <<< ${tabDelimitedField}
or
read field1 field2 <<< $(command_producing_tab_delimited_output)
Use awk.
echo $yourfield | awk '{print $1}'
or, in your case, for the first field from the last line of a file
tail yourfile | awk '{x=$1}END{print x}'
There is an easy way for a tab-separated string: convert it to an array.
Create a string with tabs ($ added before the quotes for '\t' interpretation):
AAA=$'ABC\tDEF\tGHI'
Split the string into an array using parentheses:
BBB=($AAA)
Get access to any element:
echo ${BBB[0]}
ABC
echo ${BBB[1]}
DEF
echo ${BBB[2]}
GHI
x=first$'\t'second
echo "${x%$'\t'*}"
See QUOTING in man bash
The answer from https://stackoverflow.com/users/1815797/gniourf-gniourf hints at the use of built-in field parsing in bash, but does not really complete the answer. Using the IFS shell parameter to set the input field separator completes the picture and makes it possible to parse tab-delimited files with a fixed number of fields in pure bash.
echo -e "a\tb\tc\nd\te\tf" > myfile
while IFS='<literaltab>' read f1 f2 f3;do echo "$f1 = $f2 + $f3"; done < myfile
a = b + c
d = e + f
Where, of course, <literaltab> is replaced by a real tab, not \t. Often, Control-V Tab does this in a terminal.
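The literal tab in the loop above can also be written with ANSI-C quoting, which avoids the Control-V step and survives copy and paste. A minimal sketch, assuming three fields per line as in the example; print_tsv_rows is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-delimited lines on stdin and prints the
# first three fields of each line in the "f1 = f2 + f3" form used above.
print_tsv_rows() {
    local f1 f2 f3
    # IFS=$'\t' applies only to this read, so spaces inside a field are
    # preserved; -r keeps backslashes literal.
    while IFS=$'\t' read -r f1 f2 f3; do
        printf '%s = %s + %s\n' "$f1" "$f2" "$f3"
    done
}
```

For example, print_tsv_rows < myfile. One caveat: tab counts as IFS whitespace, so consecutive tabs are treated as a single delimiter and empty fields in the middle of a line are not preserved.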