Bash split word with same characters - bash

How can I split a string that contains the same delimiter character more than once?
For example, given name=John:adress=London, I need name as the variable and John:adress=London as the value.
I have no idea how to do this. Thanks.

You can use cut.
# print first field
echo "name=John:#(ADDRESS=(LONDON=(STREET=XY)))" | cut -d = -f 1
# print remaining fields
echo "name=John:#(ADDRESS=(LONDON=(STREET=XY)))" | cut -d = -f 2-

You can use cut with command substitution:
INPUT='name=#(ADDRESS=(LONDON=(STREET=XY)))'
NAME=$(echo "$INPUT" | cut -d '=' -f 1)
INFO=$(echo "$INPUT" | cut -d '=' -f 2-)
The single quotes in the first line make bash treat every special character in the string literally. The variable NAME receives the output of a command substitution, written $(...). Inside it, $INPUT is echoed into cut, where the -d flag sets the delimiter to = and the -f flag selects the first field.
Next, INFO is assigned the output of a second command substitution that selects the second field through the end of the line. The dash after the two in -f 2- tells cut to select everything after the first = sign to the end.
The first equals sign itself will not be in $INFO.
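The same split can also be done without starting cut at all. The following is a minimal sketch using bash parameter expansion and the read builtin; the variable names (input, key, value, key2, value2) are illustrative and not from the original answers, and the here-string (<<<) assumes bash rather than plain sh.

```shell
#!/bin/bash
# Sample input taken from the question; variable names are illustrative.
input='name=John:adress=London'

# ${input%%=*} removes the longest suffix that starts at an "=",
# leaving only the text before the first "=".
key="${input%%=*}"
# ${input#*=} removes the shortest prefix that ends in "=",
# leaving everything after the first "=".
value="${input#*=}"

printf 'key=%s\nvalue=%s\n' "$key" "$value"

# Alternative: read splits on IFS and stores the unsplit remainder
# in the last variable, so later "=" signs stay in value2.
IFS='=' read -r key2 value2 <<< "$input"
printf 'key2=%s\nvalue2=%s\n' "$key2" "$value2"
```

Both variants keep every later = sign in the value, which is the behaviour the question asks for.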

Related

Need to split 1st string before delimiter which is comma(,)

Need to split the 1st string before delimiter comma.
For example
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
We need to extract 1st string before the comma(,) and the result should be like this -
B="ABC:20.10.0-5"
After this, I need to extract the numbers after colon(:) and before the dash(-). So the final value should be -
C="20.10.0"
It can be done with simple shell substitution:
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
B="${A%%,*}" # Remove everything after the first comma and the comma itself
nodash="${B%%-*}" # Remove everything after the dash and the dash itself
C="${nodash##*:}" # Remove everything before the colon and the colon itself
You can refer to the code below. I wasn't sure where exactly you are running it, since there is only one tag, or whether you actually wanted the double quotes printed in the output, so I added them as well. I am assuming you are familiar with the cut command.
-bash-4.2$ cat test2.sh
#!/bin/bash
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
echo "A=\"$A\""
B=$(echo "$A" | cut -d"," -f1)
echo "B=\"$B\""
C=$(echo "$B" | cut -d":" -f2 | cut -d"-" -f1)
echo "C=\"$C\""
-bash-4.2$ ./test2.sh
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
B="ABC:20.10.0-5"
C="20.10.0"
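If only the final value is needed, the three delimiters can be handled in one step. This is a sketch that assumes the first entry always has the shape NAME:VERSION-BUILD; the names name, version, and rest are illustrative, and the intermediate value B from the question is not produced.

```shell
#!/bin/bash
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"

# Split on ":", "-" and "," at once: the first field is the name,
# the second is the version, and everything else lands in "rest".
IFS=':-,' read -r name version rest <<< "$A"

C="$version"
echo "C=\"$C\""
```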

Duplicate the output of bash script

Below is part of my bash script. I want the script to print its output twice on one line.
This is how my script runs:
#bash check_script -a used_memory
Output is: used_memory: 812632
Desired Output: used_memory: 812632 | used_memory: 812632
get_vals() {
metrics=`command -h $hostname -p $port -a $pass info | grep -w $opt_var | cut -d ':' -f2 > ${filename}`
}
output() {
get_vals
if [ -s ${filename} ];
then
val1=`cat ${filename}`
echo "$opt_var: $val1"
# rm $filename;
exit $ST_OK;
else
echo "Parameter not found"
exit $ST_UK
fi
}
But when I used echo "$opt_var: $val1 | $opt_var: $val1", the output became: | used_memory: 812632
$opt_var is an argument.
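I had a similar problem when capturing results from cat with Windows-formatted text files. The value most likely ends with a carriage return, which moves the cursor back to the start of the line so that the second half of the output overwrites the first. One way to circumvent this issue is to pipe your result to dos2unix, e.g.: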
val1=`cat ${filename} | dos2unix`
Also, if you want to duplicate lines, you can use sed:
sed 's/^\(.*\)$/\1 | \1/'
Then pipe it to your echo command:
echo "$opt_var: $val1" | sed 's/^\(.*\)$/\1 | \1/'
The sed expression works like this:
's/<before>/<after>/' means that you want to substitute <before> with <after>.
On the <before> side: ^.*$ is a regular expression that matches the entire line; ^\(.*\)$ is the same regex, but it also captures the line (capturing is done by the \( \) pair).
On the <after> side: \1 | \1 writes the first captured group (\1), then a space, the pipe character, another space, and then the first captured group again.
So it captures your entire line and duplicates it with a "|" separator in the middle.
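As a sketch of the same idea without dos2unix or sed, the trailing carriage return can be stripped with parameter expansion and the line built once and printed twice. The value of val1 below is simulated (the real one comes from the file in the question), and the names line and duplicated are illustrative.

```shell
#!/bin/bash
opt_var="used_memory"
# Simulated value with a trailing carriage return, as a Windows-style
# line ending would leave behind (assumption for this demonstration).
val1=$'812632\r'

# Remove one trailing carriage return if present; otherwise unchanged.
val1="${val1%$'\r'}"

# Build the line once, then repeat it with the separator in between.
line="$opt_var: $val1"
duplicated="$line | $line"
printf '%s\n' "$duplicated"
```

Stripping the carriage return first matters: if it stays in val1, the terminal still redraws over the first half of the line no matter how the duplication is done.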

cut a string after a specified pattern (comma)

I want to cut a string and assign it to a variable after first occurrence of comma.
my_string="a,b,c,d,e,f"
Output expected:
output="b,c,d,e,f"
When I use the command
output=`echo $my_string | cut -d ',' -f2`
I am getting only b as output.
Adding a dash '-' to the end of your -f2 will output the remainder of the string.
$ echo "a,b,c,d,e,f,g"|cut -d, -f2-
b,c,d,e,f,g
With parameter expansion instead of cut:
$ my_string="a,b,c,d,e,f"
$ output="${my_string#*,}"
$ echo "$output"
b,c,d,e,f
${my_string#*,} stands for "remove everything up to and including the first comma from my_string" (see the Bash manual).
You must add the minus sign (-) after the position you are looking for.
a=`echo $my_string|cut -d "," -f 2-`
echo $a
b,c,d,e,f

How to process large csv files efficiently using shell script, to get better performance than that for following script?

I have a large csv file input_file with 5 columns. I want to do two things to second column:
(1) Remove last character
(2) Append leading and trailing single quote
Following are the sample rows from input_file.dat
420374,2014-04-06T18:44:58.314Z,214537888,12462,1
420374,2014-04-06T18:44:58.325Z,214537850,10471,1
281626,2014-04-06T09:40:13.032Z,214535653,1883,1
Sample output would look like :
420374,'2014-04-06T18:44:58.314',214537888,12462,1
420374,'2014-04-06T18:44:58.325',214537850,10471,1
281626,'2014-04-06T09:40:13.032',214535653,1883,1
I have written the following code to do this.
#!/bin/sh
inputfilename=input_file.dat
outputfilename=output_file.dat
count=1
while read line
do
echo $count
count=$((count + 1))
v1=$(echo $line | cut -d ',' -f1)
v2=$(echo $line | cut -d ',' -f2)
v3=$(echo $line | cut -d ',' -f3)
v4=$(echo $line | cut -d ',' -f4)
v5=$(echo $line | cut -d ',' -f5)
v2len=${#v2}
v2len=$((v2len -1))
newv2=${v2:0:$v2len}
newv2="'$newv2'"
row=$v1,$newv2,$v3,$v4,$v5
echo $row >> $outputfilename
done < $inputfilename
But it's taking a lot of time.
Is there a more efficient way to achieve this?
You can do this with awk
awk -v q="'" 'BEGIN{FS=OFS=","} {$2=q substr($2,1,length($2)-1) q}1' input_file.dat
How it works:
BEGIN{FS=OFS=","} : set input and output field separator (FS, OFS) to ,.
-v q="'" : assign a literal single quote to the variable q (to avoid complex escaping in the awk expression)
{$2=q substr($2,1,length($2)-1) q} : Replace the second field ($2) with a single quote (q) followed by the value of the 2nd field without the last character (substr(string, start, length)) and appending a literal single quote (q) at the end.
1 : Just invoke the default action, which is print the current (edited) line.
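The loop in the question is slow mainly because it starts five cut processes for every input line, while awk reads the whole file in a single process. The following sketch runs the awk command above on two sample rows from the question; the temporary files and the names input_file, output_file, and first_row are only for this demonstration.

```shell
#!/bin/bash
# Two sample rows from the question, written to a temporary file.
input_file=$(mktemp)
output_file=$(mktemp)
printf '%s\n' \
    '420374,2014-04-06T18:44:58.314Z,214537888,12462,1' \
    '281626,2014-04-06T09:40:13.032Z,214535653,1883,1' > "$input_file"

# One awk process handles every row; the output is redirected once
# instead of being appended line by line.
awk -v q="'" 'BEGIN{FS=OFS=","} {$2 = q substr($2, 1, length($2)-1) q} 1' \
    "$input_file" > "$output_file"

first_row=$(head -n 1 "$output_file")
cat "$output_file"
rm -f "$input_file" "$output_file"
```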

Using grep to get the line number of first occurrence of a string in a file

I am using a bash script for testing purposes. During my testing I have to find the line number of the first occurrence of a string in a file. I have tried both awk and grep, but neither of them returns the value.
Awk example
#!/bin/bash
....
VAR=searchstring
...
cpLines=$(awk '/$VAR/{print NR}' $MYDIR/Configuration.xml)
This does not expand $VAR. If I write the value of VAR directly it works, but I want to use VAR.
Grep example
#!/bin/bash
...
VAR=searchstring
...
cpLines=grep -n -m 1 $VAR $MYDIR/Configuration.xml |cut -f1 -d:
This gives the error: line 20: -n: command not found
grep -n -m 1 SEARCH_TERM FILE_PATH |sed 's/\([0-9]*\).*/\1/'
grep switches
-n = include line number
-m 1 = match one
sed options (stream editor):
's/X/Y/' - replace X with Y
\([0-9]*\) - a regular expression that matches zero or more digits; the escaped parentheses capture the match, which becomes \1 in Y (the replacement string)
\([0-9]*\).* - the trailing .* matches any character zero or more times, so the rest of the line is consumed and replaced
You need command substitution, $(...), to capture the output of grep in a variable:
cpLines=$(grep -n -m 1 "$VAR" "$MYDIR/Configuration.xml" | cut -f1 -d:)
Try something like:
awk -v search="$var" '$0~search{print NR; exit}' inputFile
In awk, the text between / / is taken as a literal regular expression, so a variable name placed there is not expanded. You need to use the match operator (~) instead. Here we test each input line against the variable; if it matches, we print the line number stored in NR and exit.
-v creates an awk variable (search in the example above) and assigns it the value of your bash variable ($var).
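Putting the grep and awk approaches side by side, a minimal sketch might look like this. The helper name first_match_line and the temporary sample file are hypothetical; the real script would pass $MYDIR/Configuration.xml instead.

```shell
#!/bin/bash
# Hypothetical helper: print the line number of the first line that
# matches the pattern, or nothing if there is no match.
first_match_line() {
    local pattern=$1 file=$2
    grep -n -m 1 -- "$pattern" "$file" | cut -d: -f1
}

# Small sample file standing in for Configuration.xml.
tmpfile=$(mktemp)
printf '%s\n' '<config>' '  <item>alpha</item>' \
    '  <item>searchstring</item>' '  <item>searchstring</item>' \
    '</config>' > "$tmpfile"

VAR=searchstring
# grep variant: command substitution captures the pipeline's output.
cpLines=$(first_match_line "$VAR" "$tmpfile")
# awk variant: the shell variable is passed in with -v.
awkLine=$(awk -v search="$VAR" '$0 ~ search {print NR; exit}' "$tmpfile")

echo "grep: $cpLines awk: $awkLine"
rm -f "$tmpfile"
```

Both variants treat the search string as a regular expression; for a literal match, grep -F is the usual choice.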
grep -n -m 1 SEARCH_TERM FILE_PATH | grep -Po '^[0-9]+'
explanation:
-Po = -P -o
-P use perl regex
-o only print matched string (not the whole line)
Try piping; note that this counts the matching lines rather than reporting the line number of the first match:
grep -P 'SEARCH TERM' fileName.txt | wc -l
