Users who are logged on, in alphabetical order, printed on one line - bash

Users who are logged on, in alphabetical order, printed on one line.
What is the minimum number of changes needed to get this to work as a bash script?
this is the given script:
for name in $@
do
who | grep -w "^name" | sed 's/ .*//' | uniq
done | sort | tr '\n' ' '
echo

Single line commands:
who | awk '{print $1}' | sort | uniq | tr '\n' ' '
who: list of logged in users.
awk '{print $1}': keep only the first word of each line, which is the username.
sort: put the usernames in alphabetical order.
uniq: remove duplicates.
tr '\n' ' ': replace each newline with a space.
Example:
$ who
steve tty7 Mar 5 16:25 (:0)
bernard tty7 Mar 5 16:25 (:0)
sarah tty7 Mar 5 16:25 (:0)
$ who | awk '{print $1}' | sort | uniq | tr '\n' ' '
bernard sarah steve
Your code did grep -w "^name", which tells grep to output the lines that start with the literal string "name", not the lines that begin with the value of the variable name. For that you would need grep -w "^$name".
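For reference, here is the original script with just that one change applied (a minimal sketch; it keeps the unquoted $@ and the other quirks discussed in the next answer):
#!/bin/bash
# Minimal fix: interpolate the loop variable into the grep pattern
for name in $@
do
who | grep -w "^$name" | sed 's/ .*//' | uniq
done | sort | tr '\n' ' '
echo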

Try this Shellcheck-clean code:
for name in "$@"
do
who | sed 's/[[:space:]].*//' | grep -xF -- "$name"
done | sort -u | paste -sd ' '
$@ should always have double quotes on it ("$@"). See Accessing bash command line args $@ vs $*. Shellcheck correctly complains if the double quotes are not used.
sed 's/[[:space:]].*//' removes the first whitespace character, and everything after it, on every input line. Using [[:space:]] instead of a literal space character means that the code will still work if the who output uses tabs as separators. It may be easier to read too. The sed command is run first to ensure that usernames occupy whole lines so it's easier to avoid spurious matches at the next pipeline stage.
grep -xF -- "$name" searches for whole lines in the input that are the "$name" string. The -x option forces matching of whole lines. That prevents, for instance, the username mary matching the username mary.jane (a valid username on at least some Linux systems). The -F option means that regular expression patterns in "$name" are treated as literal strings. That prevents, for instance, the name t.m matching the name tim. The -- prevents a leading hyphen in "$name" being treated as a grep option. No system that I know of allows usernames to have leading hyphens, but there's nothing to stop such an invalid name being provided as a command line argument to the code. The -w option to grep wouldn't be useful here because valid names may contain non-word characters (e.g. t.m).
sort -u takes the output of the for loop (an unsorted list of usernames, one per line, possibly with repetitions) and sorts it. The -u option causes it to remove duplicates (like piping to uniq, but saves a process creation).
paste -sd ' ' puts all the lines of its input on a single line, separated by spaces (specified by the -d option and its ' ' (space) option argument), and terminated with a newline character. tr '\n' ' ' would have a similar effect, but it produces an unterminated line with a trailing space character.
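To see that difference at the byte level, here is a small illustration, using printf to stand in for the loop output (both streams are 10 bytes, but tr's ends with an invisible trailing space and no newline, while paste's ends with \n):
$ printf 'alice\nbob\n' | tr '\n' ' ' | od -c
0000000   a   l   i   c   e       b   o   b
0000012
$ printf 'alice\nbob\n' | paste -sd ' ' | od -c
0000000   a   l   i   c   e       b   o   b  \n
0000012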

All you need is:
who | sort -k1,1 -u | awk '{u=u s $1; s=OFS} END{print u}'
That will output a blank-separated list of all logged in users, all on one line, with a terminating newline to make it a valid POSIX text file, and without an undesirable trailing blank character.
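The same one-liner unrolled with comments, as a sketch of how it builds the line (sort -k1,1 -u dedupes on the username field before awk sees it):
who | sort -k1,1 -u | awk '
{
    u = u s $1   # append this username to the accumulated line
    s = OFS      # from the second name on, separate with a single blank
}
END { print u }  # print supplies the terminating newline
'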

Related

Counting number of different words in a txt file in Bash

Well, I do not know much about bash programming; I'm new to it, so I'm struggling to write code that iterates over all the lines in a txt file and counts how many different words there are.
Example: If a txt file has "Nory was a Catholic because her mother was a Catholic"
So the result must be 7
$ grep -o '[^[:space:]]*' file | sort -u | wc -l
7
Sure. I assume you are ok with defining "words" as things that are separated by space? In which case, try something like this:
cat filename | sed -r -e "s/[ ]+/ /g" -e "s/ /\n/g" | sort -u | wc -l
This command says:
Dump contents of filename
Replace multiple spaces with a single space
Replace spaces with newline
Sort and "uniquify" the list
Print out the count of lines
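Running the first part of that pipeline on the sample sentence shows the intermediate list (LC_ALL=C pins the sort order; piping the seven resulting lines on to wc -l then yields 7):
$ echo "Nory was a Catholic because her mother was a Catholic" | sed -r -e "s/[ ]+/ /g" -e "s/ /\n/g" | LC_ALL=C sort -u
Catholic
Nory
a
because
her
mother
was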
Per the comment, you can technically get away without using cat if you'd like, with the following:
sed -r -e "s/[ ]+/ /g" -e "s/ /\n/g" filename | sort -u | wc -l
Further, from another comment, you could optionally use tr (importantly with its -s flag to handle repeated spaces) instead of sed, with something like:
tr -s " " "\n" < filename | sort -u | wc -l
The moral of the story is there are several ways this kind of thing can be accomplished, not to mention the other full answers that are given here :-) My personal favorite answer at this point is Ed Morton's which I've upvoted accordingly.
You could also lowercase the text so that words are compared regardless of casing.
Also, filter words with the [:alnum:] character class rather than [a-zA-Z0-9_], which is only valid for US-ASCII and will fail dramatically with Greek or Turkish.
#!/usr/bin/env bash
echo "The uniq words are the words that appears at least once, regardless of casing." |
# Turn text to lowercase
tr '[:upper:]' '[:lower:]' |
# Split alphanumeric with newlines
tr -sc '[:alnum:]' '\n' |
# Sort uniq words
sort -u |
# Count lines of unique words
wc -l
I would do it like so, with comments:
echo "Nory was a Catholic because her mother was a Catholic" |
# tr replace
# -s - squeeze
# -c - complementary
# a-zA-Z0-9_ - all letters, numbers and underscore
# but complementary set, so all non letters, not numbers and not underscores.
# replace them by newline
tr -sc 'a-zA-Z0-9_' '\n' |
# and sort unique and display count
sort -u | wc -l
Tested on repl bash.
Decided to use [a-zA-Z0-9_], because this is how GNU sed \w extension matches a word.
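A quick check of that equivalence (GNU sed; in the C locale \w matches exactly [A-Za-z0-9_]):
$ echo 'foo-bar_baz9' | sed 's/\w/X/g'
XXX-XXXXXXXX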
cat yourfile.txt | xargs -n1 | sort | uniq -c > youroutputfile.txt
xargs -n1 = put one word per line
sort = sorts
uniq -c = counts occurrences of distinct values
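Note that uniq -c gives per-word occurrence counts rather than the single number the question asks for; to count how many different words there are, count the unique lines instead:
cat yourfile.txt | xargs -n1 | sort -u | wc -l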

Delete words in a line using grep or sed

I want to delete three words with a special character on a line such as
Input:
\cf4 \cb6 1749,1789 \cb3 \
Output:
1749,1789
I have tried a couple of sed and grep statements, but so far none have worked, mainly due to the character \.
My unsuccessful attempt:
sed -i 's/ [.\c ] //g' inputfile.ext >output file.ext
Awk accepts a regex Field Separator (in this case, comma or space):
$ awk -F'[ ,]' '$0 = $3 "." $4' <<< '\cf4 \cb6 1749,1789 \cb3 \'
1749.1789
-F'[ ,]' - Use a single character from the set space/comma as Field Separator
$0 = $3 "." $4 - If we can set the entire line $0 to Field 3 ($3) followed by a literal period "." followed by Field 4 ($4), do the default behavior (print the entire line)
Replace <<< 'input' with file if every line of that file has the same delimiters (spaces/comma) and number of fields. If your input file is more complex than the sample you shared, please edit your question to show actual input.
The backslash is a special meta-character that confuses bash.
We treat it like any other meta-character, by escaping it, with--you guessed it--a backslash!
But first, we need to grep this pattern out of our file
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file # Close enough! (-E, because + is not special in basic regular expressions)
Now, just sed out those pesky backslashes
| sed -e 's/\\//g' # Don't forget the g, otherwise it'll only strip out one backslash
Now, finally, sed out the clusters of 2 alpha followed by a number and a space!
| sed -e 's/[a-z][a-z][0-9] //g'
And, finally....
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file | sed -e 's/\\//g' | sed -e 's/[a-z][a-z][0-9] //g'
Output:
1749,1789
My guess is you are having trouble because you have backslashes in input and can't figure out how to get backslashes into your regex. Since backslashes are escape characters to shell and regex you end up having to type four backslashes to get one into your regex.
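For instance, matching the literal backslash at the start of \cf4 takes four backslashes inside double quotes, but only two inside single quotes (a quick demonstration):
$ echo '\cf4' | grep "\\\\cf4"
\cf4
$ echo '\cf4' | grep '\\cf4'
\cf4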
Ben Van Camp already posted an answer that uses single quotes to make the escaping a little easier; however I shall now post an answer that simply avoids the problem altogether.
grep -o '[0-9]*,[0-9]*' | tr , .
Locks on to the comma and selects the digits on either side and outputs the number. Alternately if comma is not guaranteed we can do it this way:
egrep -o ' [0-9,]*|^[0-9,]*' | tr , . | tr -d ' '
Both of these assume there's only one usable number per line.
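Applied to the sample line, the first variant looks like this:
$ echo '\cf4 \cb6 1749,1789 \cb3 \' | grep -o '[0-9]*,[0-9]*' | tr , .
1749.1789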
$ awk '{sub(/,/,".",$3); print $3}' file
1749.1789
$ sed 's/\([^ ]* \)\{2\}\([^ ]*\).*/\2/; s/,/./' file
1749.1789

Using sed to extract strings from a text file

I have text data in this form:
^Well/Well[ADV]+ADV ^John/John[N]+N ^has/have[V]+V+3sg+PRES ^a/a[ART]
^quite/quite[ADV]+ADV ^different/different[ADJ]+ADJ ^not/not[PART]
^necessarily/necessarily[ADV]+ADV ^more/more[ADV]+ADV
^elaborated/elaborate[V]+V+PPART ^theology/theology[N]+N *edu$
And I want it to be processed to this form:
Well John have a quite different not necessarily more elaborate theology
Basically, I need every string between the starting character / and the ending character [.
Here is what I tried, but I just get empty files...
#!/bin/bash
for file in probe/*.txt
do sed '///,/[/d' $file > $file.aa
mv $file.aa $file
done
awk to the rescue!
$ awk -F/ -v RS=^ -v ORS=' ' '{print $1}' file
Well John has a quite different not necessarily more elaborated theology
Explanation: set the record separator (RS) to ^ to separate your logical groups, set the field separator (FS) to / and print the first field, as your requirement. Finally, setting the output record separator (ORS) to space (instead of the default newline) keeps the extracted fields on the same line.
With GNU grep and Perl compatible regular expressions (-P):
$ echo $(grep -Po '(?<=/)[^[]*' infile)
Well John have a quite different not necessarily more elaborate theology
-o retains just the matches, (?<=/) is a positive look-behind ("make sure there is a /, but don't include it in the match"), and [^[]* is "a sequence of characters other than [".
grep -Po prints one match per line; by using the output of grep as arguments to echo, we convert the newlines into spaces (could also be done by piping to tr '\n' ' ').
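If you would rather not rely on echo and word splitting, the newline-to-space join can also be done with paste, which terminates the output line properly (a variant sketch):
grep -Po '(?<=/)[^[]*' infile | paste -sd ' '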
cat file | grep -oE "\/[^\[]*\[" | sed -e 's#^/##' -e 's/\[$//' | tr -s "\n" " "

How to grep all characters in file

I have a CSV file with these lines:
----------+79975532211,----------+79975532212
4995876655,4995876658
I try to grep these lines in a Bash script:
#!/bin/bash
config='/test/config.conf'
sourcecsv=/test/sourse.csv
cat $sourcecsv | while read line
do
Oldnumber=$(echo $line | cut -d',' -f1)
cat $config | grep "\\$Oldnumber" -B 8
done
But when the script greps the value 4995876655 I get an error:
grep: Invalid back reference
How can I grep all values in my file?
Instead of:
cat $config | grep "\\$Oldnumber" -B 8
You should do:
grep -B 8 -F -- "$Oldnumber" "$config"
If you really mean to grep for all strings between commas, you can do it all in one go.
tr ',' '\n' </test/sourse.csv |
grep -F -f - -B 8 /test/config.conf
If you need to obtain the matches in sequence (all matches for the first string followed by all matches for the second, etc) then maybe loop over them with a proper while loop:
tr ',' '\n' </test/sourse.csv |
while read -r Oldnumber; do
grep -F -B 8 -e "$Oldnumber" /test/config.conf
done
Keeping the file names in variables does not seem to offer any advantage here.
If you mean to search for the strings preceded by a literal backslash, you can add it back; the -F option I added turns all strings into literals. If you need metacharacters, and take out the -F option, you need to double the backslashes (inside double quotes, a single backslash needs to be represented as double; and to get a literal backslash in a regular expression, you need two of them).
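Concretely (hypothetical: only relevant if the config really does contain a backslash before the number), that doubling looks like this:
# The shell turns \\\\ into \\, and the regex engine turns \\ into one
# literal backslash, so grep searches for a \ followed by $Oldnumber.
grep -B 8 -e "\\\\$Oldnumber" "$config"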

shell replace cr\lf by comma

I have input.txt
1
2
3
4
5
I need to get such output.txt
1,2,3,4,5
How to do it?
Try this:
tr '\n' ',' < input.txt > output.txt
With sed, you could use:
sed -e 'H;${x;s/\n/,/g;s/^,//;p;};d'
The H appends the pattern space to the hold space (saving the current line in the hold space). The ${...} surrounds actions that apply to the last line only. Those actions are: x swap hold and pattern space; s/\n/,/g substitute embedded newlines with commas; s/^,// delete the leading comma (there's a newline at the start of the hold space); and p print. The d deletes the pattern space - no printing.
You could also use, therefore:
sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}'
The -n suppresses default printing so the final d is no longer needed.
This solution assumes that the CRLF line endings are the local native line ending (so you are working on DOS) and that sed will therefore generate the local native line ending in the print operation. If you have DOS-format input but want Unix-format (LF only) output, then you have to work a bit harder - but you also need to stipulate this explicitly in the question.
It worked OK for me on MacOS X 10.6.5 with the numbers 1..5, and 1..50, and 1..5000 (23,893 characters in the single line of output); I'm not sure that I'd want to push it any harder than that.
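A quick check of the -n variant with the sample input:
$ printf '%s\n' 1 2 3 4 5 | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}'
1,2,3,4,5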
In response to @Jonathan's comment to @eumiro's answer:
tr -s '\r\n' ',' < input.txt | sed -e 's/,$/\n/' > output.txt
tr and sed used to be very good, but when it comes to file parsing and regex you can't beat perl
(Not sure why people think that sed and tr are closer to shell than perl... )
perl -pe 's/\n/,/' your_file
If you want pure shell to do it, then look at string substitution:
${string/#substring/replacement}
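For example, a pure-bash sketch (assumes bash 4+ for mapfile; "${lines[*]}" joins the array elements with the first character of IFS):
# Read the lines into an array, then join them with commas
mapfile -t lines < input.txt
(IFS=,; printf '%s\n' "${lines[*]}")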
Use the paste command. Here it is using pipes:
echo -e "1\n2\n3\n4\n5" | paste -s -d, /dev/stdin
Here it is using a file:
echo -e "1\n2\n3\n4\n5" > /tmp/input.txt
paste -s -d, /tmp/input.txt
Per the man page, -s concatenates all lines and -d defines the delimiter character.
Awk versions:
awk '{printf("%s,",$0)}' input.txt
awk 'BEGIN{ORS=","} {print $0}' input.txt
Output - 1,2,3,4,5,
Since you asked for 1,2,3,4,5 as compared to 1,2,3,4,5, (note the comma after 5; most of the solutions above also include the trailing comma), here are two more versions with Awk (with wc and sed) to get rid of the last comma:
i='input.txt'; awk -v c=$(wc -l $i | cut -d' ' -f1) '{printf("%s",$0);if(NR<c){printf(",")}}' $i
awk '{printf("%s,",$0)}' input.txt | sed 's/,\s*$//'
printf "1\n2\n3" | tr '\n' ','
if you want to output that to a file just do
printf "1\n2\n3" | tr '\n' ',' > myFile
if you have the content in a file do
cat myInput.txt | tr '\n' ',' > myOutput.txt
python version:
python -c 'import sys; print(",".join(sys.stdin.read().splitlines()))'
Doesn't have the trailing comma problem (because join works that way), and splitlines splits data on native line endings (and removes them).
cat input.txt | sed -e 's|$|,|' | xargs -I{} echo -n "{}"
