Can the cut command accept newline as a delimiter? - bash

I have a file, foo.txt, with 2 lines:
foo bar
hello world
I am trying to output the second line using:
cut -d$'\n' -f2 foo.txt
But it is printing both lines instead.
Is cut able to accept a newline as a delimiter?

Background
cut operates on each line of the input separately, so a newline cannot work as a field delimiter.
From cut(1p) of POSIX.1-2017:
NAME
cut - cut out selected fields of each line of a file
SYNOPSIS
cut -b list [-n] [file...]
cut -c list [file...]
cut -f list [-d delim] [-s] [file...]
DESCRIPTION
The cut utility shall cut out bytes (-b option), characters (-c option), or
character-delimited fields ( -f option) from each line in one or more
files, concatenate them, and write them to standard output.
Example:
$ printf 'foo bar1\nfoo bar2\n'
foo bar1
foo bar2
$ printf 'foo bar1\nfoo bar2\n' | cut -f 2 -d ' '
bar1
bar2
Solution
To print only the second line of input, sed can be used:
$ printf 'foo bar1\nfoo bar2\n'
foo bar1
foo bar2
$ printf 'foo bar1\nfoo bar2\n' | sed '2p;d'
foo bar2
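For comparison, a few other common one-liners that print just the second line (short options used here since the GNU long options aren't portable):

```shell
# Each of these prints only line 2 of its input.
printf 'foo bar1\nfoo bar2\n' | sed -n '2p'        # -n suppresses auto-print; 2p prints line 2
printf 'foo bar1\nfoo bar2\n' | awk 'NR == 2'      # NR is awk's current line number
printf 'foo bar1\nfoo bar2\n' | head -n 2 | tail -n 1
```

All three print `foo bar2`.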

It's not very clear what you want, but if the goal is to split the input on blank lines and join the lines within each block, awk's paragraph mode can do it:
awk 'BEGIN {RS=""} {gsub("\n",""); print}'
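A quick sketch of what that does: `RS=""` puts awk in paragraph mode, so each blank-line-separated block becomes one record, and the gsub deletes the newlines inside it:

```shell
# Two blank-line-separated blocks; each is collapsed onto one line.
printf 'a\nb\n\nc\nd\n' | awk 'BEGIN {RS=""} {gsub("\n",""); print}'
# ab
# cd
```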

This might not be a very elegant solution, but I often just use head and tail, since they typically outperform sed or awk for this kind of line selection:
#!/bin/bash
input='test0\ntest1\ntest2\ntest3\ntest4\ntest5'
echo -e "$input" | head --lines 3
# test0
# test1
# test2
echo -e "$input" | tail --lines 3
# test3
# test4
# test5
get_index() {
echo -e "$1" |
head --lines "$(( $2 + 1 ))" |
tail --lines "1"
}
get_index "$input" "0"
# test0
get_index "$input" "4"
# test4
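The same zero-indexed lookup can be sketched with sed or awk instead of head + tail (hypothetical helper names; `printf '%b'` expands the `\n` escapes the way `echo -e` does):

```shell
# Zero-indexed line lookup, same semantics as get_index above.
get_index_sed() {
    printf '%b\n' "$1" | sed -n "$(( $2 + 1 ))p"
}
get_index_awk() {
    printf '%b\n' "$1" | awk -v n="$(( $2 + 1 ))" 'NR == n'
}

input='test0\ntest1\ntest2\ntest3\ntest4\ntest5'
get_index_sed "$input" 2
# test2
get_index_awk "$input" 4
# test4
```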

Related

How to use awk to find a char in a string in bash

I have a char variable called sign and a given string sub. I need to find out how many times this sign appears in sub, and I cannot use grep.
For example:
sign = c
sub = mechanic cup cat
echo "$sub" | awk <code i am asking for> | wc -l
And the output should be 4, because c appears 4 times. What should go inside the <>?
sign=c
sub='mechanic cup cat'
echo "$sub" |
awk -v sign="$sign" -F '' '{for (i=1;i<=NF;i++){if ($i==sign) cnt++}} END{print cnt}'
Edit:
Changes for the requirements in the comment:
Test if the length of sign is 1 (no = present). If true, change sign and sub to lowercase to ignore the case.
Use ${sign:0:1} to only pass the first character to awk.
sign=c
sub='mechanic Cup cat'
if [ "${#sign}" -eq 1 ]; then
sign=${sign,,}
sub=${sub,,}
fi
echo "$sub" |
awk -v sign="${sign:0:1}" -F '' '{for (i=1;i<=NF;i++){if ($i==sign) cnt++}} END{print cnt}'
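A pure-bash sketch of the same count, using parameter expansion to delete every character that is not `$sign` and then measuring what remains (assumes bash 4+ for the `,,` lowercasing):

```shell
sign=c
sub='mechanic Cup cat'
# Lowercase both, strip every character that is NOT $sign;
# the length of what is left is the number of occurrences.
lc_sign=${sign,,}
lc_sub=${sub,,}
only=${lc_sub//[!"$lc_sign"]/}
echo "${#only}"
# 4
```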
A combination of Quasimodo's comment and Freddy's lower-case example:
$ sign=c
$ sub='mechanic Cup cat'
A tr + wc solution if ${sign} is a single character.
Count the number of times ${sign} shows up in ${sub}, ignoring case:
$ tr -cd [${sign,,}] <<< ${sub,,} | wc -c
4
Where:
${sign,,} & ${sub,,} - convert to all lowercase
tr -cd [...] - -d says to delete/remove the listed characters, while -c says to take the complement (ie, remove all but the characters in the brackets), so -cd [${sign,,}] says to remove all but the character stored in ${sign}
<<< ... - here-string (passes a variable/string to tr on stdin)
wc -c - count the number of characters
NOTE: This only works if ${sign} contains a single character.
A sed solution that should work regardless of the number of characters in ${sign}.
$ sub='mechanic Cup cat'
First we embed a new line character before each occurrence of ${sign,,}:
$ sign=c
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,}
me
chani
c
cup
cat
$ sign=cup
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,}
mechanic
cup cat
Where:
\(${sign,,}\) - find the pattern that matches ${sign} (all lowercase) and assign to position 1
\n\1 - place a newline (\n) in the stream just before our pattern in position 1
At this point we just want the lines that start with ${sign,,}, which is where tail +2 comes into play (ie, display lines 2 through n):
$ sign=c
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,} | tail +2
chani
c
cup
cat
$ sign=cup
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,} | tail +2
cup cat
And now we pipe to wc -l to get a line count (ie, count the number of times ${sign} shows up in ${sub} - ignoring case):
$ sign=c
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,} | tail +2 | wc -l
4
$ sign=cup
$ sed "s/\(${sign,,}\)/\n\1/g" <<< ${sub,,} | tail +2 | wc -l
1
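For completeness, gsub in awk returns the number of substitutions it made, so the count also falls out of a single awk call that, like the sed pipeline, copes with a multi-character ${sign} (a sketch; note gsub treats sign as a regular expression):

```shell
sign=cup
sub='mechanic Cup cat'
# gsub(sign, "") deletes every match and returns how many it deleted.
echo "${sub,,}" | awk -v sign="${sign,,}" '{ n += gsub(sign, "") } END { print n }'
# 1
```

With `sign=c` the same command prints 4.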

Bash - Counter for multiple parameters in file

I created a command which works, but not exactly as I want, so I would like to upgrade it to produce the right output.
My command:
awk '{print $1}' ios-example.com.access | sort | uniq -c | sort -nr
Output of my command:
8 192.27.69.191
2 82.202.69.253
Input file:
https://pajda.fit.vutbr.cz/ios/ios-19-1-logs/blob/master/ios-example.com.access.log
Output I need(hashtags instead of numbers):
198.27.69.191 (8): ########
82.202.69.253 (2): ##
cat ios-example.com.access | sort | uniq -c | awk 'ht="#"{for(i=1;i<$1;i++){ht=ht"#"} str=sprintf("%s (%d): %s", $2,$1, ht); print str}'
expecting file with content like:
ipadress1
ipadress1
ipadress1
ipadress2
ipadress2
ipadress1
ipadress2
ipadress1
Using xargs with sh and printf. Comments in between the lines. Live version at tutorialspoint.
# sorry cat
cat <<EOF |
8 192.27.69.191
2 82.202.69.253
EOF
# for each 2 arguments
xargs -n2 sh -c '
# format the output as "$2 ($1): "
printf "%s (%s): " "$2" "$1"
# repeat the character `#` $1 times
seq "$1" | xargs printf "#%.0s"
# lastly a newline
printf "\n"
' --
I think we could shorten that a bit with:
xargs -n2 sh -c 'printf "%s (%s): %s\n" "$2" "$1" $(printf "#%.0s" $(seq $1))' --
or maybe just echo, if the input is sufficiently safe:
xargs -n2 sh -c 'echo "$2 ($1): $(printf "#%.0s" $(seq $1))"' --
You can upgrade your command by adding another awk to the list, or you can just use a single awk for the whole thing:
awk '{a[$1]++}
END { for(i in a) {
printf "%s (%d): ", i, a[i]
for(j=0;j<a[i];++j) printf "#"; printf "\n"
}
}' file
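If you'd rather keep your `sort | uniq -c` pipeline, the hash bars can also be drawn by a plain shell loop (a sketch; `print_histogram` is a hypothetical helper name, and the input layout is the two-column count/address output of `uniq -c`):

```shell
# Read "count address" pairs and print "address (count): ###...".
print_histogram() {
    while read -r count addr; do
        printf '%s (%s): ' "$addr" "$count"
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '#'
            i=$((i + 1))
        done
        printf '\n'
    done
}

printf '8 192.27.69.191\n2 82.202.69.253\n' | print_histogram
# 192.27.69.191 (8): ########
# 82.202.69.253 (2): ##
```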

Pipe stdout to command which itself needs to read from own stdin

I would like to get the stdout from a process into another process not using stdin, as that one is used for another purpose.
In short I want to accomplish something like that:
echo "a" >&4
cat | grep -f /dev/fd/4
I got it running using an file as source for file descriptor 4, but that is not what I want:
# Variant 1
cat file | grep -f /dev/fd/4 4<pattern
# Variant 2
exec 4<pattern
cat | grep -f /dev/fd/4
exec 4<&-
My best try is that, but I got the following error message:
# Variant 3
cat | (
echo "a" >&4
grep -f /dev/fd/4
) <&4
Error message:
test.sh: line 5: 4: Bad file descriptor
What is the best way to accomplish that?
You don't need to use multiple streams to do this:
$ printf foo > pattern
$ printf '%s\n' foo bar | grep -f pattern
foo
If instead of a static file you want to use the output of a command as the input to -f you can use a process substitution:
$ printf '%s\n' foo bar | grep -f <(echo foo)
foo
For POSIX shells that lack process substitution (e.g. dash, ash, yash), if the command accepts a string argument (grep does) and the string of search targets isn't especially large (i.e. it doesn't exceed the command-line length limit), there's always command substitution:
$ printf '%s\n' foo bar baz | grep $(echo foo)
foo
Or, if the input file is multi-line, separating quoted search items with '\n' works the same as grep's OR operator \|:
$ printf '%s\n' foo bar baz | grep "$(printf "%s\n" foo bar)"
foo
bar
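When the pattern list is too large for command substitution, a named pipe gives the same effect portably (a sketch; mkfifo is POSIX, and the /tmp path here is an illustrative scratch location):

```shell
# Feed patterns to grep -f through a named pipe instead of a
# process substitution; works in dash/ash as well as bash.
fifo="/tmp/patterns.$$"
mkfifo "$fifo" || exit 1
printf '%s\n' foo > "$fifo" &   # writer runs in the background
printf '%s\n' foo bar | grep -f "$fifo"
# foo
wait
rm -f "$fifo"
```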

Extract data between delimiters from a Shell Script variable

I have this shell script variable, var. It holds 3 entries separated by newlines. From this variable var, I want to extract 2 and 0.078688, just these two numbers.
var="USER_ID=2
# 0.078688
Suhas"
This is the code I tried:
echo "$var" | grep -o -P '(?<=\=).*(?=\n)' # For extracting 2
echo "$var" | awk -v FS="(# |\n)" '{print $2}' # For extracting 0.078688
None of the above is working. What is the problem here, and how do I fix it?
Just use tr alone, retaining the numerical digits, the dot (.) and the space, and removing everything else:
tr -cd '0-9. ' <<<"$var"
2 0.078688
From the tr man page, on the usage of the -c and -d flags:
tr [OPTION]... SET1 [SET2]
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
To store it in variables,
IFS=' ' read -r var1 var2 < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "$var1"
2
printf "%s\n" "$var2"
0.078688
Or in an array as
IFS=' ' read -ra numArray < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "${numArray[#]}"
2
0.078688
Note: The -c and -d flags in tr are POSIX compliant and will work on any system that has tr installed.
echo "$var" |grep -oP 'USER_ID=\K.*'
2
echo "$var" |grep -oP '# \K.*'
0.078688
Your solution is close to perfect; you just need to change \n to $, which represents end of line.
echo "$var" |awk -F'# ' '/#/{print $2}'
0.078688
echo "$var" |awk -F'=' '/USER_ID/{print $2}'
2
You can do it with pure bash using a regex:
#!/bin/bash
var="USER_ID=2
# 0.078688
Suhas"
[[ ${var} =~ =([0-9]+).*#[[:space:]]([0-9\.]+) ]] && result1="${BASH_REMATCH[1]}" && result2="${BASH_REMATCH[2]}"
echo "${result1}"
echo "${result2}"
With awk:
First value:
echo "$var" | grep 'USER_ID' | awk -F "=" '{print $2}'
Second value:
echo "$var" | grep '#' | awk '{print $2}'
Assuming the data follows the format of your sample:
# For extracting 2
echo "$var" | sed -e '/.*=/!d' -e 's///'
echo "$var" | awk -F '=' 'NR==1{ print $2}'
# For extracting 0.078688
echo "$var" | sed -e '/.*#[[:blank:]]*/!d' -e 's///'
echo "$var" | awk -F '#' 'NR==2{ print $2}'

Get the third element of a line into a file with script shell

I'm writing a shell script and I want to read data from a file. In the file, I have something like:
/path/to/file1 something 0
/path/to/file2 something2 1
/path/to/file3 something3 2
What I want is to get the third element of each line, but I don't know how to do it.
In my code, I have:
while read line;
do
//must echo the third element of the line
done < file | sort -n -k 2 -t " "
I already tried awk but it didn't work. How should I do this?
This works if fields are separated by space:
$ echo 'foo bar baz' | cut --delimiter=' ' --fields=3
baz
This works for most whitespace separators:
$ echo 'foo bar baz' | awk '{print $3}'
baz
You can try something like this:
while read line;
do
path=$(echo $line | awk '{print $1}')
secondColumn=$(echo $line | awk '{print $2}')
thirdColumn=$(echo $line | awk '{print $3}')
echo $path
echo $secondColumn
echo $thirdColumn
done < test
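Since the loop already uses read, the shell can split the columns itself, which avoids spawning three awk processes per line (a sketch assuming whitespace-separated columns as in the sample):

```shell
# read splits each line on $IFS (whitespace by default), so the
# three columns land directly in three variables.
while read -r path second third; do
    echo "$third"
done <<'EOF'
/path/to/file1 something 0
/path/to/file2 something2 1
/path/to/file3 something3 2
EOF
# 0
# 1
# 2
```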
