BASH HackerRank solution works on PC, but not on HackerRank

Trying to solve the HackerRank problem "Equalize the Array" using bash. Here's my solution:
read size
mostfreq=$(tr "[:space:]" '\n' | sort -n | uniq -c | sort -r -k1 | head -c7 | tr -d "[:space:]")
expr $size - $mostfreq
It passes all test cases except this one:
22
51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51 51
When I run it on my PC, it produces the expected output 0 (0 = the minimum number of deletions needed to leave an array of equal elements). However, when I run it on the HackerRank platform, it gives me a runtime error. I was wondering if anyone has had a similar problem when using bash on the HackerRank platform.

Basically, on HackerRank you will get an error whenever your expression evaluates to 0: POSIX expr exits with status 1 when the result is null or zero, and since expr is the last command in your script, the script itself exits non-zero, which HackerRank reports as a runtime error. So any test whose array elements are all equal will fail, even the simplest one, consisting of a single element:
1
The way I overcame this issue was to evaluate the expression inside backticks (command substitution) and then print the result using echo, as below; echo exits 0 regardless of the value it prints:
result=`expr $size - $mostfreq`
echo $result
Hence, the full code may look as follows:
read size
mostfreq=$(tr "[:space:]" '\n' | sort -n | uniq -c | sort -r -k1 | head -c7 | tr -d "[:space:]")
result=`expr $size - $mostfreq`
echo $result
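Alternatively, a minimal sketch of the same solution using bash arithmetic expansion, which avoids expr entirely (echo always exits with status 0):
read size
mostfreq=$(tr "[:space:]" '\n' | sort -n | uniq -c | sort -r -k1 | head -c7 | tr -d "[:space:]")
# $(( ... )) performs the subtraction in the shell; echo exits 0 even when it prints 0
echo $((size - mostfreq))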

Related

How to add the elements in a for loop

So basically my code looks through the data and greps whatever each line begins with, and I've been trying to figure out a way to add those values together.
the sample input is
35 45 75 76
34 45 53 55
33 34 32 21
my code:
for id in $(awk '{ print $1 }' < $3); do echo $id; done
I'm printing it right now to see the values, but basically what's outputted is
35
34
33
I'm trying to add them all together but I can't figure out how; some help would be appreciated.
My desired output would be
102
Lots of ways to do this, a few ideas ...
$ cat numbers.dat
35 45 75 76
34 45 53 55
33 34 32 21
Tweaking OP's current code:
$ sum=0
$ for id in $(awk '{ print $1 }' < numbers.dat); do ((sum+=id)); done
$ echo "${sum}"
102
Eliminating awk:
$ sum=0
$ while read -r id rest_of_line; do sum=$((sum+id)); done < numbers.dat
$ echo "${sum}"
102
Using just awk (looks like Aivean beat me to it):
$ awk '{sum+=$1} END {print sum}' numbers.dat
102
awk '{ sum += $1 } END { print sum }'
Test:
35 45 75 76
34 45 53 55
33 34 32 21
Result:
102
(sum(35, 34, 33) = 102, that's what you want, right?)
Here is the detailed explanation of how this works:
$1 is the first column of the input.
sum is the variable that holds the sum of all the values in the first column.
END { print sum } is the action to be performed after all the input has been processed.
So the awk program is basically summing up the first column of the input and printing the result.
This answer was partially generated by Davinci Codex model, supervised and verified by me.

While loop in bash getting duplicate result

$ cat grades.dat
santosh 65 65 65 65
john 85 92 78 94 88
andrea 89 90 75 90 86
jasper 84 88 80 92 84
santosh 99 99 99 99 99
Script:
#!/usr/bin/bash
filename="$1"
while read line
do
a=`grep -w "santosh" $1 | awk '{print$1}' |wc -l`
echo "total is count of the file is $a";
done <"$filename"
Output:
total is count of the file is 2
total is count of the file is 2
total is count of the file is 2
total is count of the file is 2
total is count of the file is 2
The real output should be just one line:
total is count of the file is 2
Please let me know where I am going wrong in the above script.
Whilst others have shown you better ways to solve your problem, the answer to your question is in the following line:
a=`grep -w "santosh" $1 | awk '{print$1}' |wc -l`
You read each line of the file into the variable "line" through the while loop, but it is never used. Instead, your loop always looks for "santosh", which does appear twice; and because you run the same query for all 5 lines of the file being searched, you get 5 lines of the exact same output.
You could alter your current script like so:
a=$(grep -w "$line" "$filename" | awk '{print$1}' | wc -l)
The above is not the ideal approach, as others have pointed out, but it does solve your immediate issue.
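For completeness, a minimal sketch of the corrected loop, assuming you want a per-name count; reading the first field into its own variable is my addition, not part of the answer above:
#!/usr/bin/bash
filename="$1"
while read -r name _; do
    # count whole-word occurrences of the current name in the file
    a=$(grep -cw "$name" "$filename")
    echo "total count for $name is $a"
done < "$filename"
Note it still scans the whole file once per input line, which is why the better approaches mentioned above avoid the loop entirely.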

Count duplicated couple of lines

I have a configuration file with this format:
cod 11
loc1 23
pto1 33
loc2 55
pto2 66
cod 12
loc1 55
pto1 66
loc2 88
pto2 77
...
I want to count how many times a pair of numbers appears in a loc/pto sequence (independently of the loc/pto index). In the example, the couple 55/66 appears 2 times (once as loc1/pto1 and once as loc2/pto2).
I have googled around and tried some combinations of grep, uniq and awk, but I only managed to count duplicated single lines or numbers. I read the man documentation of those commands without finding any clue relevant to my problem.
You could use the following:
$ sort file | uniq -f1 -dc
2 loc1 55
2 pto1 66
-f1 skips the first field when comparing lines
-dc prints each duplicated line with its associated count
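Note that the whole-line sort only groups equal numbers when their labels happen to sort next to each other; sorting on the second field instead makes the grouping independent of the label (my variant, not part of the answer above):
$ sort -k2,2n file | uniq -f1 -dc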
Despite no visible effort on the part of the OP, this was an interesting question to work out.
awk '{for (i=1 ; i < 10 ; i++) if (NR == i) array[i]=$2} END {for (i=1 ; i < 10 ; i++) print array[i] "," array[i+1]}' file | sort | uniq -c
Output-
1 11,23
1 12,55
1 23,33
1 33,55
2 55,66
1 66,12
1 66,88
1 88,
The output tells you that 55 is followed by 66 twice. Other pairs only occur once.
Explanation-
I define an array in awk whose elements are the ith numbers in the second column. The part after END concatenates the ith and (i+1)th elements. Then there is a sort | uniq -c to see whether these pairs occur more than once. (Note that the hard-coded loop bound of 10 ties the script to this 10-line example.)
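If you want the same pair counting without the hard-coded bound, a streaming sketch (my generalization, not the original answer's code) keeps only the previous value and works for any number of lines:
awk 'NR > 1 { print prev "," $2 } { prev = $2 }' file | sort | uniq -c
This also avoids the trailing "88," artifact in the output above.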
If you want to know how many times a duplicate number appeared in the file:
awk '{print $2}' <filename> | sort | uniq -dc
Output:
2 55
2 66
If you want to know how many times a number appeared in the file regardless of being duplicate or not:
awk '{print $2}' <filename> | sort | uniq -c
Output:
1 11
1 12
1 23
1 33
2 55
2 66
1 77
1 88
If you want to print the full line on duplicate match based on second column:
awk '{print $2}' <filename> | sort | uniq -d | grep -F -f - <filename>
Output:
loc2 55
pto2 66
loc1 55
pto1 66

Print out the value with the highest number of occurrences in a file

In a bash shell script, I want to go through a list of numbers and then print out the number that occurs most often. If several different numbers appear an equal number of times, I want to print the highest one. For example, in a file like this:
10
10
10
15
15
20
20
20
20
I want to print the value 20.
How can I achieve this?
If the numbers are in a file, one per line:
sort < myfile | uniq -c | sort -r | head -1
without the count:
A=$(sort < myfile | uniq -c | sort -r | head -1)
set $A    # word-split "count value" into the positional parameters $1 and $2
echo $2   # print just the value
You can use this command -
echo 10 10 10 15 15 20 20 20 20 | sed 's/ /\n/g' | sort | uniq -c | sort -V | tail -n 1 | awk '{print $2}'
It will print the number you want.
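Both one-liners above rely on the lexical ordering of the uniq -c output for the tie-break. A sketch that sorts explicitly numerically on the count and then on the value, so that ties resolve to the highest number:
sort -n myfile | uniq -c | sort -k1,1n -k2,2n | tail -n 1 | awk '{print $2}'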

Ways to speed up my bash script?

I know it's a lot faster than doing things by hand, but is there any way to speed this script up? Multi-threading or something? I'm new to Unix and this is my first script =). Open to suggestions or any changes. The script seems to pause randomly for a long time on certain generated domains.
#!/bin/bash
for domain in $(pwgen -1A0B 2 10000);
do
whois $domain.com | egrep -q '^No match|^NOT FOUND|^Not fo|AVAILABLE|^No Data Fou|has not been regi|No entri'
if [ $? -eq 0 ]; then
echo "$domain.com : available"
else
echo "$domain.com"
fi
done
Before splitting and distributing the work, a warning: this approach may not be useful. You are asking pwgen to build 10,000 lines formed of two characters between a and z, but there are only $((26*26)) = 676 possibilities (in fact, as pwgen tries to build pronounceable words, only 625 distinct combinations are ever produced).
pwgen -1A0B 2 10000 | sort | uniq -c | sort -n | tail
27 ju
27 mu
27 vs
27 xt
27 zx
28 df
28 sy
28 zc
29 dp
29 zd
So with this command you will end up doing the same lookup up to 29 times.
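A quick mitigation, as a sketch: deduplicate the generated list before doing any whois lookups, so no domain is queried twice:
pwgen -1A0B 2 10000 | sort -u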
Running pwgen -1A0B 2 10000 ten times, printing how many different combinations are proposed on each run and which combinations are proposed the most and the fewest times:
for ((i=10;i--;)); do
    echo $(
        (
            (
                pwgen -1A0B 2 10000 |
                    sort |
                    uniq -c |           # count how often each pattern occurs
                    sort -n |           # order by count, least frequent first
                    tee /dev/fd/6 |     # copy the sorted counts to fd 6
                    wc -l >/dev/fd/7    # count distinct patterns, send to fd 7
            ) 6>&1 | (
                head -n1                # least frequent pattern with its count
                tail -n1                # most frequent pattern with its count
            )
        ) 7>&1
    )
done
6 bd 625 31 bn
3 bj 625 29 sq
6 je 625 30 ey
4 ac 625 30 sz
5 ds 625 29 wf
4 xw 625 28 qb
4 jj 625 30 pa
6 io 625 29 sg
4 vw 625 30 kb
5 fz 625 31 os
This prints:
| | | | |
| | | | \- max (most often) proposed pattern
| | | \---- number of times the max pattern was proposed
| | \-------- number of different patterns proposed
| \----------- min (least often) proposed pattern
\-------------- number of times the min pattern was proposed
Create a file with desired domain names first. Call this domains.lst:
pwgen -1A0B 2 10000 > domains.lst
Then create smaller files out of this:
split --lines=100 domains.lst domains.lst.
Then create a script which takes a file name and processes that file using whois, writing its results to a corresponding output file (e.g. input.out).
Then create another script that uses & to start the above script in the background for all the small chunks, and merge the outputs after all background tasks finish, as sketched below.
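A minimal sketch of those two scripts; the name check_chunk.sh, the .out suffix, and results.txt are my own choices, not part of the original answer. First, the per-chunk worker:
#!/bin/bash
# check_chunk.sh: run the whois check for every domain in one chunk file
chunk="$1"
while read -r domain; do
    if whois "$domain.com" | grep -Eq '^No match|^NOT FOUND|^Not fo|AVAILABLE|^No Data Fou|has not been regi|No entri'; then
        echo "$domain.com : available"
    else
        echo "$domain.com"
    fi
done < "$chunk" > "$chunk.out"
Then the driver that fans the chunks out to background jobs and merges the results (domains.lst.?? matches split's default two-letter suffixes):
#!/bin/bash
# one background job per chunk produced by split above
for chunk in domains.lst.??; do
    ./check_chunk.sh "$chunk" &
done
wait    # block until every background whois job has finished
cat domains.lst.??.out > results.txt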
