Insert with sed n repeated characters - bash

Creating a printout file from a mysql query, I insert a separation line after every TOTAL string with:
sed -i /^TOTAL/i'-------------------------------------------------- ' file.txt
Is there a more elegant way to repeat n "-" characters instead of typing them?
For instance, if I had to simply generate a line without finding/inserting, I would use:
echo -$___{1..50} | tr -d ' '
but don't know how to do something similar with sed into a file.
Thanks!

Just combine the two:
sed -i /^TOTAL/i"$(echo -$___{1..50} | tr -d ' ')" file.txt

With perl, you can repeat a character N times, see :
perl -pe 's/^TOTAL.*/"-"x50 . "\n$&"/e' file.txt
or :
perl -pe 's/^TOTAL.*/sprintf("%s\n%s", "-"x50, $&)/e' file.txt
and you keep a syntax close to sed.

Another way using builtin printf and bash brace expansion :
sed -i "/^TOTAL/i $(printf '%.0s-' {1..50})" file.txt
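If the separator is needed in more than one place, the repetition can be wrapped in a small function. This is only a sketch: the name repeat_char is made up for illustration, and it assumes bash.

```shell
#!/usr/bin/env bash
# repeat_char CHAR N: print CHAR repeated N times, followed by a newline.
# (hypothetical helper name, not part of sed or bash)
repeat_char() {
    local char=$1 count=$2 out='' i
    for ((i = 0; i < count; i++)); do
        out+=$char          # append one copy per iteration
    done
    printf '%s\n' "$out"
}

# Possible use with GNU sed, as in the answers above:
#   sed -i "/^TOTAL/i $(repeat_char - 50)" file.txt
```

The command substitution strips the trailing newline, so the sed command receives just the 50 dashes.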


Getting last X fields from a specific line in a CSV file using bash

I'm trying to get the list of users from my CSV file into a bash variable. The problem is that the number of users varies from 1 to 5.
Example CSV file:
"record1_data1","record1_data2","record1_data3","user1","user2"
"record2_data1","record2_data2","record2_data3","user1","user2","user3","user4"
"record3_data1","record3_data2","record3_data3","user1"
I would like to get something like
list_of_users="cat file.csv | grep "record2_data2" | <something> "
echo $list_of_users
user1,user2,user3,user4
I'm trying this:
cat file.csv | grep "record2_data2" | awk -F, -v OFS=',' '{print $4,$5,$6,$7,$8 }' | sed 's/"//g'
My result is:
user2,user3,user4,,
Question:
How to remove all "," from the end of my result? Sometimes it is just one but sometimes can be user1,,,,
Can I do it in a better way? Users always start after the 3rd column in my file.
This will do what your code seems to be trying to do (print the users for a given string record2_data2 which only exists in the 2nd field):
$ awk -F',' '{gsub(/"/,"")} $2=="record2_data2"{sub(/([^,]*,){3}/,""); print}' file.csv
user1,user2,user3,user4
but I don't see how that's related to your question subject of Getting last X records from CSV file using bash so idk if it's what you really want or not.
Better to use a bash array, and join it into a CSV string when needed:
#!/usr/bin/env bash
readarray -t listofusers < <(cut -d, -f4- file.csv | tr -d '"' | tr ',' $'\n' | sort -u)
IFS=,
printf "%s\n" "${listofusers[*]}"
cut -d, -f4- file.csv | tr -d '"' | tr ',' $'\n' | sort -u is the important bit - it first only prints out the fourth and following fields of the CSV input file, removes quotes, turns commas into newlines, and then sorts the resulting usernames, removing duplicates. That output is then read into an array with the readarray builtin, and you can manipulate it and the individual elements however you need.
GNU sed solution, let file.csv content be
"record1_data1","record1_data2","record1_data3","user1","user2"
"record2_data1","record2_data2","record2_data3","user1","user2","user3","user4"
"record3_data1","record3_data2","record3_data3","user1"
then
sed -n -e 's/"//g' -e '/record2_data/ s/[^,]*,[^,]*,[^,]*,// p' file.csv
gives output
user1,user2,user3,user4
Explanation: -n turns off automatic printing. The first expression deletes every " by substituting it globally with an empty string. The second applies only to lines containing record2_data: it substitutes (s) everything up to and including the 3rd comma with an empty string, i.e. deletes it, and prints (p) the changed line.
(tested in GNU sed 4.2.2)
awk -F',' '
/record2_data2/{
for(i=4;i<=NF;i++) o=sprintf("%s%s,",o,$i);
gsub(/"|,$/,"",o);
print o
}' file.csv
user1,user2,user3,user4
This might work for you (GNU sed):
sed -E '/record2_data/!d;s/"([^"]*)"(,)?/\1\2/4g;s///g' file
Delete all records except for that containing record2_data.
Remove double quotes from the fourth field onward.
Remove any double quoted fields.
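For completeness, the same lookup can be done with bash builtins only. A minimal sketch, assuming fields never contain embedded commas or escaped quotes (true for the sample data, not for CSV in general); the function name users_for_record is invented for this example.

```shell
#!/usr/bin/env bash
# users_for_record KEY FILE: for each row whose 2nd field equals KEY,
# print fields 4..end joined by commas. (hypothetical helper name)
users_for_record() {
    local key=$1 file=$2 line
    local -a fields
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line//\"/}                      # drop all double quotes
        IFS=, read -r -a fields <<< "$line"    # split the row on commas
        if [[ ${fields[1]} == "$key" ]]; then
            # "${fields[*]:3}" joins elements 3.. with the first char of IFS
            (IFS=,; printf '%s\n' "${fields[*]:3}")
        fi
    done < "$file"
}

# Possible use:
#   list_of_users=$(users_for_record record2_data2 file.csv)
```

Because only the existing fields are joined, there are no trailing commas to strip afterwards.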

printing only specific lines with sed

I have following File (wishlist.txt):
Alligatoah Musik_ist_keine_lösung;https:///uhfhf
Alligatoah STRW;https:///uhfhf?i
Amewu Entwicklungshilfe;https:///uhfhf?i
and want to have the first word of line n.
so for n = 1:
Alligatoah
What I have so far is:
sed -e 's/\s.*//g' wishlist.txt
Is there an elegant way to get rid of all lines except n?
Edit:
How to pass a bash variable "$i" to sed since
sed -n '$is/ .*//p' $wishlist
and
sed -n "\`${i}\`s/ .*//p" $wishlist
doesn't work
A couple of other techniques to get the first word of the 3rd line:
awk -v line=3 'NR == line {print $1; exit}' file
or
head -n 3 file | tail -n 1 | cut -d ' ' -f 1
Something like this. For the 1st word of the 3rd line.
sed -n '3s/\s.*//p' wishlist.txt
To use a variable: Note: Double quotes.
line=3; sed -n "${line}s/\s.*//p" wishlist.txt
sed supports "addresses", so you can tell it what lines to operate on. To print only the first line, you can use
sed -e '1!d; s/\s.*//'
where 1!d means: on lines other than 1, delete the line.
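Putting the address and the shell variable together, a small wrapper could look like the sketch below. The function name first_word_of_line is made up here, and the digit check is an extra precaution so that an empty or non-numeric value never ends up inside the sed script.

```shell
#!/usr/bin/env bash
# first_word_of_line N FILE: print the first word of line N.
# (hypothetical helper name)
first_word_of_line() {
    local n=$1 file=$2
    if ! [[ $n =~ ^[1-9][0-9]*$ ]]; then
        echo "line number must be a positive integer" >&2
        return 1
    fi
    # On line N: cut everything from the first whitespace, print, then quit
    # so the rest of the file is not read.
    sed -n "${n}{s/[[:space:]].*//;p;q;}" "$file"
}

# Possible use:
#   i=3; first_word_of_line "$i" wishlist.txt
```

Double quotes around the script are what let ${n} expand; inside single quotes sed would see the literal text.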

Replace pipe character "|" with escaped pipe character "\|" in string in bash script

I am trying to replace a pipe character in a string with the escaped character:
Input: "text|jdbc"
Output: "text\|jdbc"
I tried different things with tr:
echo "text|jdbc" | tr "|" "\\|"
...
But none of them worked.
Any help would be appreciated.
Thank you,
tr is good for one-to-one mapping of characters (read "translate").
\| is two characters, you cannot use tr for this. You can use sed:
echo 'text|jdbc' | sed -e 's/|/\\|/'
This example replaces one |. If you want to replace multiple, add the g flag:
echo 'text|jdbc' | sed -e 's/|/\\|/g'
An interesting tip by @JuanTomas is to use a different separator character for better readability, for example:
echo 'text|jdbc' | sed -e 's_|_\\|_g'
You can take advantage of the fact that | is a special character in bash, which means the %q modifier used by printf will escape it for you:
$ printf '%q\n' "text|jdbc"
text\|jdbc
A more general solution that doesn't require | to be treated specially is
$ f="text|jdbc"
$ echo "${f//|/\\|}"
text\|jdbc
${f//foo/bar} expands f and replaces every occurrence of foo with bar. The operator here is /; when followed by another /, it replaces all occurrences of the search pattern instead of just the first one. For example:
$ f="text|jdbc|two"
$ echo "${f/|/\\|}"
text\|jdbc|two
$ echo "${f//|/\\|}"
text\|jdbc\|two
You can try with awk:
echo "text|jdbc" | awk -F'|' '$1=$1' OFS="\\\|"
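The parameter-expansion approach is easy to wrap for reuse. A minimal sketch, assuming bash; escape_pipes is a name invented for this example.

```shell
#!/usr/bin/env bash
# escape_pipes STRING: print STRING with every | replaced by \|
# (hypothetical helper name)
escape_pipes() {
    local s=$1
    # Inside double quotes, \\ is a single literal backslash.
    printf '%s\n' "${s//|/\\|}"
}

# Possible use:
#   escaped=$(escape_pipes "text|jdbc")
```

Unlike printf '%q', this touches only the pipe characters; %q also escapes spaces and other shell metacharacters, which may or may not be what you want.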

sed, capture only the number

I have this text file:
some text A=10 some text
some more text A more text
some other text A=30 other text
I'm trying to use sed to capture only the numeric value of A. Using this
cat textfile | sed -r 's/.*A=(\S+).*/\1/'
I get:
10
some more text A more text
30
But what I really need is:
10
0
30
If the string A= does not exist output a 0. How can I accomplish this?
I cannot think of a one-liner, so this is my approach:
while read line
do
grep -Po '(?<=A=)\d+' <<< "$line" || echo "0"
done < file
I am using the look-behind grep to get any number after A=. In case there is none, the || (else) will print a 0.
I love code-golf!
sed -e 's/^/A=0 /; s/.*\<A=\([0-9]\+\).*/\1/'
This prepends A=0 to the line before substituting.
try this one-liner:
awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}' file
with your data:
kent$ echo "some text A=10 some text
some more text A more text
some other text A=30 other text"|awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}'
10
0
30
gawk
awk '{$0=gensub(/^.*A=?([[:digit:]]+).*$/, "\\1", "g"); print($0+0)}' file.txt
This might work for you (GNU sed):
sed '/.*A=\([0-9][0-9]*\).*/s//\1/;t;s/.*/0/' file
Look for the string A= followed by one or more numbers and if it occurs replace the whole line by the back reference. Otherwise replace the whole of the line by 0.
I think the best way is to do two different commands - the first replaces lines without 'A=' with the line 'A=0', the second does what you did.
So
cat textfile | sed -r 's/^([^A]|A[^=])*$/A=0/' | sed -r 's/.*A=(\S+).*/\1/'
How about:
sed -r -e 's/.*A=(\S+).*/\1/' -e 's/.*A.*/0/'
Some grep-sed-cut combination:
grep -o 'A=\?[0-9]*' input | sed 's/A$/A=0/' | cut -d= -f2
Produces:
10
0
30
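The same "number or 0" logic can also be written with bash's own regex matching, without calling grep for every line. A sketch only: the function name value_of_a is made up, and it assumes the value after A= is a run of digits, as in the sample.

```shell
#!/usr/bin/env bash
# value_of_a: read lines on stdin; for each, print the digits following
# the first "A=" or 0 if the line has none. (hypothetical helper name)
value_of_a() {
    local line re='A=([0-9]+)'
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"   # first capture group
        else
            printf '0\n'
        fi
    done
}

# Possible use:
#   value_of_a < textfile
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.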

Only get hash value using md5sum (without filename)

I use md5sum to generate a hash value for a file.
But I only need to receive the hash value, not the file name.
md5=`md5sum ${my_iso_file}`
echo ${md5}
Output:
3abb17b66815bc7946cefe727737d295 ./iso/somefile.iso
How can I 'strip' the file name and only retain the value?
A simple array assignment works... Note that the first element of a Bash array can be addressed by just the name without the [0] index, i.e., $md5 contains only the 32 characters of md5sum.
md5=($(md5sum file))
echo $md5
# 53c8fdfcbb60cf8e1a1ee90601cc8fe2
Using AWK:
md5=`md5sum ${my_iso_file} | awk '{ print $1 }'`
You can use cut to split the line on spaces and return only the first such field:
md5=$(md5sum "$my_iso_file" | cut -d ' ' -f 1)
On Mac OS X:
md5 -q file
md5="$(md5sum "${my_iso_file}")"
md5="${md5%% *}" # remove the first space and everything after it
echo "${md5}"
Another way is to do:
md5sum filename | cut -f 1 -d " "
cut splits the line at each space and returns only the first field.
By leaning on head:
md5_for_file=`md5sum ${my_iso_file}|head -c 32`
One way:
set -- $(md5sum $file)
md5=$1
Another way:
md5=$(md5sum $file | while read sum file; do echo $sum; done)
Another way:
md5=$(set -- $(md5sum $file); echo $1)
(Do not try that with backticks unless you're very brave and very good with backslashes.)
The advantage of these solutions over other solutions is that they only invoke md5sum and the shell, rather than other programs such as awk or sed. Whether that actually matters is then a separate question; you'd probably be hard pressed to notice the difference.
If you need to print it and don't need a newline, you can use:
printf $(md5sum filename)
md5=$(md5sum < $file | tr -d ' -')
md5=`md5sum ${my_iso_file} | cut -b-32`
md5sum puts a backslash before the hash if there is a backslash in the file name. The first 32 characters or anything before the first space may not be a proper hash.
It will not happen when using standard input (file name will be just -), so pixelbeat's answer will work, but many others will require adding something like | tail -c 32.
if you're concerned about screwy filenames :
md5sum < "${file_name}" | awk NF=1
f244e67ca3e71fff91cdf9b8bd3aa7a5
other messier ways to deal with this :
md5sum "${file_name}" | awk NF=NF OFS= FS=' .*$'
or
| awk '_{ exit }++_' RS=' '
f244e67ca3e71fff91cdf9b8bd3aa7a5
to do it entirely inside awk :
mawk 'BEGIN {
__ = ARGV[ --ARGC ]
_ = sprintf("%c",(_+=(_^=_<_)+_)^_+_*++_)
RS = FS
gsub(_,"&\\\\&",__)
( _=" md5sum < "((_)(__)_) ) | getline
print $(_*close(_)) }' "${file_name}"
f244e67ca3e71fff91cdf9b8bd3aa7a5
Well, I had the same problem today, but I was trying to get the file MD5 hash when running the find command.
I took the most-voted answer and wrapped it in a function called md5 to run in the find command. My goal was to calculate the hash for all files in a folder and output it as hash:filename.
md5() { md5sum "$1" | awk '{ printf "%s",$1 }'; }
export -f md5
find -type f -exec bash -c 'md5 "$0"' {} \; -exec echo -n ':' \; -print
So, I'd got some pieces from here and also from 'find -exec' a shell function in Linux
For the sake of completeness, a way with sed using a regular expression and a capture group:
md5=$(md5sum "${my_iso_file}" | sed -r 's:\\*([^ ]*).*:\1:')
The regular expression is capturing everything in a group until a space is reached. To get a capture group working, you need to capture everything in sed.
(More about sed and capture groups here: How can I output only captured groups with sed?)
As the delimiter in sed, I use colons instead of slashes; any character works as long as it does not appear in the expression itself.
Another way:
md5=$(md5sum "${my_iso_file}" | sed 's/ .*//')
md5=$(md5sum < index.html | head -c -4)
