Shell script to print the lines which contains a word added by the user - bash

I have a file named data.txt, which contains the following:
1440;150;1000000;pizza;hamburger
1000;180;56124;coke;sprite;water;juice
566;40;10000;cake;pizza;coke
I want to make a program which asks the user for input and then prints the lines which contain the given word.
For example:
If I enter coke, it should print out the second and third line. If I enter hamburger, it should only print out the first line.
Here is the code that I tried but it doesn't work. Can anybody help me please?
echo "Enter a word"
read word
while read line; do
numbersinthefile=$(echo $line | cut -d';' -f4);
if [ $numbersinthefile -eq $num ]; then
echo $line;
fi
done
Earlier I forgot to mention that I want the program to allow multiple inputs from the user. Example:
If I type in "pizza sprite", it gives me the first and second line.

That's a simple grep, isn't it?
read -p "Enter a word: " word
grep -F "$word" file
Add -w so that coke matches only as a whole word, and not inside a longer string such as cokes.
read -p "Enter a word: " word
grep -Fw "$word" file
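For the follow-up about several words at once, one possible sketch passes each word to grep with its own -e option, so a line is printed when it contains any of them. The function name match_any is hypothetical, and "any word matches" is an assumption about what the asker wants.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of a file that contain
# any of the given words as whole, literal words.
match_any() {
    local file=$1; shift
    (( $# )) || return 1          # no words given: nothing to search for
    local args=() w
    for w in "$@"; do
        args+=(-e "$w")           # one -e option per search word
    done
    grep -Fw "${args[@]}" -- "$file"
}

# Interactive use (left commented out so the sketch does not prompt when sourced):
# read -r -p "Enter one or more words: " -a words
# match_any data.txt "${words[@]}"
```

Each matching line is printed once, even when it contains more than one of the words.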

Could you please try following once.
cat script.ksh
echo "Please enter word which you want to look for in Input_file:"
read value
awk -v val="$value" '$0 ~ val' Input_file
Running the script above works like this:
./script.ksh
Please enter word which you want to look for in Input_file:
coke
1000;180;56124;coke;sprite;water;juice
566;40;10000;cake;pizza;coke
EDIT: If you want to pass multiple values to the script, how about passing them as arguments to the program itself?
cat script.ksh
for var in "$@"
do
awk -v val="$var" '$0 ~ val' Input_file
done
Then run script in following fashion.
script.ksh test coke cake etc
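Because the loop above starts awk once per argument, a line that contains two of the words is printed twice. A minimal single-pass sketch, assuming literal substring matching is acceptable and the words contain no spaces or backslashes (match_words is a hypothetical name):

```shell
#!/usr/bin/env bash
# Hypothetical helper: scan the file once and print each line that
# contains at least one of the given words as a literal substring.
match_words() {
    local file=$1; shift
    awk -v words="$*" '
        BEGIN { n = split(words, w, " ") }     # words arrive space-separated
        {
            for (i = 1; i <= n; i++)
                if (index($0, w[i])) {         # literal substring test, no regex
                    print
                    next                       # print each line only once
                }
        }
    ' "$file"
}
```

Called as match_words Input_file coke cake, it prints every line containing coke or cake exactly once.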

Here is one in awk that accepts partial matches:
$ awk '
BEGIN {
FS=";" # file field sep
printf "Feed me text: " # text for prompt
if((getline s < "-")<=0) # read user input
exit # exit if unsuccessful
}
{
for(i=4;i<=NF;i++) # iterate fields from file records >= 4
if($i~s) { # if match (~ since there was a space in eof NR==3)
print
next # only output each matching record once
}
}' file
Output
Feed me text: coke
1000;180;56124;coke;sprite;water;juice
566;40;10000;cake;pizza;coke

Related

How to skip repeated entries in a .csv file

I'm new to bash scripting. I have a text file containing a list of subdomains (URLs) and I'm creating a .csv file (subdomainIP.csv) that has 2 columns: the 1st column contains subdomains (Subdomain) and the 2nd one contains IP addresses (IP). The columns are separated by ",". My code intends to read each line of URLs.txt, finds its IP address and enter the selected subdomain and its IP address in the .csv file.
Whenever I find the IP address of a domain and I want to add it as a new entry to .csv file, I want to check the previous entries of the 2nd column. If there is a similar IP address, I don't want to add the new entry, but if there isn't any similar case, I want to add the new entry. I have done this by adding these lines to my code:
awk '{ if ($IP ~ $ipValue) print "No add"
else echo "${line}, ${ipValue}" >> subdomainIP.csv}' subdomainIP.csv
but I receive this error:
awk: cmd. line:2: else echo "${line}, ${ipValue}" >> subdomainIP.csv}
awk: cmd. line:2: ^ syntax error
What's wrong?
Would you please try the following:
declare -A seen # memorize the appearance of IPs
echo "Subdomain,IP" > subdomainIP.csv # let's overwrite, not appending
while IFS= read -r line; do
ipValue= # initialize the value
while IFS= read -r ip; do
if [[ $ip =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
ipValue+="${ip}-" # append the results with "-"
fi
done < <(dig +short "$line") # assuming the result has multi-line
ipValue=${ipValue%-} # remove trailing "-" if any
if [[ -n $ipValue ]] && (( seen[$ipValue]++ == 0 )); then
# if the IP is not empty and not in the previous list
echo "$line,$ipValue" >> subdomainIP.csv
fi
done < URLs.txt
The associative array seen is the key here. It is indexed by an arbitrary string (the IP address in this case) and remembers the value associated with that string, which makes it suitable for checking whether an IP address has already appeared on an earlier input line.
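The seen idiom can be tried in isolation, without any DNS lookups. The sketch below assumes bash 4 or later and input lines of the form subdomain,ip on standard input; first_per_ip is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "subdomain,ip" lines from stdin and keep
# only the first line seen for each IP (needs bash 4+ for declare -A).
first_per_ip() {
    local -A seen=()
    local subdomain ip
    while IFS=, read -r subdomain ip; do
        # seen[$ip]++ yields the previous count, which is 0 only the first time
        if [[ -n $ip ]] && (( seen[$ip]++ == 0 )); then
            printf '%s,%s\n' "$subdomain" "$ip"
        fi
    done
}
```

For example, printf 'a.example.com,10.0.0.1\nb.example.com,10.0.0.1\n' | first_per_ip prints only the first of the two lines.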
There are some issues in your code. Here's a few of them.
If the awk script is in single quotes, as in awk 'script' file, any variables $var in script will not expand. If you want to perform variable expansion, use double quotes. Compare echo hello | awk "{ print \"$PATH\" }" vs echo hello | awk '{ print "$PATH" }'.
However, if you do so, then the shell will try to expand $0, $1, $NF, ... and this is certainly not what you want. Therefore you can concatenate single- and double-quoted strings as needed, e.g. echo hello | awk '{ print "$0:"$0 >> "log"; print "$PATH:'"$PATH"'" >> "log" }'
Based on what I see from O'Reilly's sed & awk, when you redirect to file from within an awk script, you have to quote the file name, as I've done in the command above for the file named log.
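An alternative to splicing quotes is awk's -v option, which hands a shell value to awk as an awk variable while the script stays in single quotes. Here is a minimal sketch of the "append only if the IP is new" check, assuming a CSV with no spaces around the comma; add_if_new and its parameters are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "subdomain,ip" to a CSV file unless
# the IP already appears in the second column.
add_if_new() {
    local csv=$1 subdomain=$2 ip=$3
    # -v copies the shell value into the awk variable ip; $2 remains an awk field.
    # awk exits with status 0 when the IP was found and 1 when it was not.
    if ! awk -F, -v ip="$ip" '$2 == ip { found = 1; exit } END { exit !found }' "$csv"; then
        printf '%s,%s\n' "$subdomain" "$ip" >> "$csv"
    fi
}
```

The file is only read by awk and only written by the shell, so there is no redirection inside the awk script at all.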

Printing number of lines with in shell with echo

I know that the simplest way to print out the specific value of line/bytes/words is to use wc -l < filename.sh, but when i try to use it in conjunction with the echo command, it's printing the physical command itself and not the output.
My intended output is "this file has x lines", with x being number of lines, but when i try to do things like echo "this line has" wc -l < filename.sh "lines", it's printing the command itself. I've also tried this without breaking the quotation, among several other things.
is it just the command itself that's not applicable alongside echo, or am i missing something extremely obvious here?
echo "this line has $(wc -l < filename.sh) lines"
printf is versatile:
printf 'this file has %s lines\n' $(wc -l < filename.sh)
$(command) converts the output of command into an argument.
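The same substitution can be wrapped in a small function. This is only a sketch; line_report is a hypothetical name, and the whitespace stripping is there because some wc implementations pad the number with spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many lines a file has.
line_report() {
    local file=$1 count
    count=$(wc -l < "$file")          # command substitution captures wc's output
    count=${count//[[:space:]]/}      # drop any padding around the number
    printf 'this file has %s lines\n' "$count"
}
```

Usage: line_report filename.sh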
Try this one:
echo "this file has `wc -l < filename.sh | awk '{print $1}'` lines"
Explanation:
wc -l < filename.sh retrieves the number of lines in the file
awk '{print $1}' prints the number without any blanks
`` means executing the command first in order to get the result
Without any subshell or pipe: awk has a built-in variable NR which holds the number of records read so far. The print goes inside the END block so that the total is printed once at the end; otherwise it would print the running line number for every line.
awk 'END{print "This file has " NR " lines" }' file

Bash Shell: Infinite Loop

The problem is the following: I have a file in which each line has this form:
id|lastName|firstName|gender|birthday|joinDate|IP|browser
I want to sort all the first names in that file alphabetically and print them one per line, each name only once.
I have written the following program, but for some reason it runs into an infinite loop:
array1=()
while read LINE
do
if [ ${LINE:0:1} != '#' ]
then
IFS="|"
array=($LINE)
if [[ "${array1[@]}" != "${array[2]}" ]]
then
array1+=("${array[2]}")
fi
fi
done < $3
echo ${array1[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort
NOTES
if [ ${LINE:0:1} != '#' ] : this command is used because there are comments in the file that i dont want to print
$3 : filename
array1 : is used for all the separate names
Wow, there's a MUCH simpler and cleaner way to achieve this, without having to mess with the IFS variable or using arrays. You can use "for" to do this:
First I created a file with the same structure as yours:
$ cat file
id|lastName|Douglas|gender|birthday|joinDate|IP|browser
id|lastName|Tim|gender|birthday|joinDate|IP|browser
id|lastName|Andrew|gender|birthday|joinDate|IP|browser
id|lastName|Sasha|gender|birthday|joinDate|IP|browser
#id|lastName|Carly|gender|birthday|joinDate|IP|browser
id|lastName|Madson|gender|birthday|joinDate|IP|browser
Here's the script I wrote using "for":
#!/bin/bash
for LINE in `cat file | grep -v "^#" | awk -F'|' '{print$3}' | sort -u`
do
echo $LINE
done
And here's the output of this script:
$ ./script.sh
Andrew
Douglas
Madson
Sasha
Tim
Explanation:
for LINE in `cat file`
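Creates a loop over the output of the commands between the backticks. Backticks are command substitution: the shell runs the enclosed commands and substitutes their output, so, for example, VARDATE=`date` stores the current date in a variable. Note that for splits that output on whitespace, not on line breaks; it behaves like a per-line loop here only because each first name is a single word.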
grep -v "^#"
The option -v is used to exclude results matching the pattern, in this case the pattern is "^#". The "^" character means "line begins with". So grep -v "^#" means "exclude lines beginning with #".
awk -F'|' '{print$3}'
The -F option switches the column delimiter from the default (runs of whitespace) to whatever you put between the quotes after it, in this case the "|" character.
The '{print$3}' prints the 3rd column.
sort -u
Finally, "sort -u" sorts the names alphabetically and, because of -u, prints each name only once.
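The loop is not strictly needed, since the pipeline already prints one name per line. A minimal sketch of the same idea as a function, with the comment filter moved into awk (unique_first_names is a hypothetical name):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the sorted, de-duplicated third field of
# a "|"-separated file, skipping comment lines that start with "#".
unique_first_names() {
    awk -F'|' '!/^#/ && NF >= 3 { print $3 }' "$1" | sort -u
}
```

Usage: unique_first_names file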

search lines in bash for specific character and display line

I am trying to search for a string in bash and echo the line that contains the + character, followed by the text "is a special case". The code does run, but I get both lines of the input file displayed. Thank you :)
#!/bin/bash
printf "Please enter the variant the following are examples"
echo " c.274G>T or c.274-10G>A"
printf "variant(s), use a comma between multiple: "; IFS="," read -a variant
for ((i=0; i<${#variant[@]}; i++))
do printf "NM_000163.4:%s\n" ${variant[$i]} >> c:/Users/cmccabe/Desktop/Python27/input.txt
done
awk '{for(i=1;i<=NF;++i)if($i~/+/)print $i}' input.txt
echo "$i" "is a special case"
input.txt
NM_000163.4:c.138C>A
NM_000163.4:c.266+83G>T
desired output ( this line contains a + in it)
NM_000163.4:c.266+83G>T is a special case
edit:
looks like I need to escape the + and that is part of my problem
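You can change your awk script as below and get rid of the echo. Putting the + in a bracket expression makes it a literal character in every awk implementation.
$ awk '/[+]/{print $0,"is a special case"}' file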
NM_000163.4:c.266+83G>T is a special case
As far as I understand your problem, you can do it with a single sed command:
sed -n '/+/ {s/$/ is a special case/; p;}' input.txt
On lines containing +, it replaces the end ($) with your text, thus appending it. After that the line is printed.
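If the check has to stay inside the bash script, a glob pattern avoids regular-expression escaping altogether, because + is an ordinary character in *+*. A minimal sketch, with flag_special as a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line of a file that contains a
# literal "+", followed by the text "is a special case".
flag_special() {
    local line
    while IFS= read -r line; do
        if [[ $line == *+* ]]; then        # glob match, "+" is literal here
            printf '%s is a special case\n' "$line"
        fi
    done < "$1"
}
```

Usage: flag_special input.txt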

Split file by multiple line breaks

Let's say you have the following input file
Some text. It may contain line
breaks.

Some other part of the text

Yet an other part of
the text
And you want to iterate over each text part (separated by two line breaks (\n\n)), so that
in the first iteration I would only get:
Some text. It may contain line
breaks.
In the second iteration I would get:
Some other part of the text
And in the last iteration I would get:
Yet an other part of
the text
I tried this, but it doesn't seem to work because IFS only supports one character?
cat $inputfile | while IFS=$'\n\n' read part; do
# do something with $part
done
This is the solution of anubhava in pure bash:
#!/bin/bash
COUNT=1; echo -n "$COUNT: "
while read LINE
do
[ "$LINE" ] && echo "$LINE" || { (( ++COUNT )); echo -n "$COUNT: " ;}
done
Use awk with null RS:
awk '{print NR ":", $0}' RS= file
1: Some Text. It may contains line
breaks.
2: Some Other Part of the Text
3: Yet an other Part of
the Text
You can clearly see that your input file has 3 records now (each record is printed with record # in output).
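To actually run a command on each part from bash, as the question asks, one possible sketch collects lines until a blank line is reached and then hands the collected text to a callback. Both for_each_part and handle_part are hypothetical names, and the callback here only prints the part in brackets.

```shell
#!/usr/bin/env bash
# Hypothetical callback: replace the body with whatever should be
# done with each part.
handle_part() {
    printf '[%s]\n' "$1"
}

# Hypothetical helper: call handle_part once per blank-line-separated
# block of the file given as the first argument.
for_each_part() {
    local line part='' nl=$'\n'
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ -n $line ]]; then
            part+="${part:+$nl}$line"      # join the lines of one part with newlines
        elif [[ -n $part ]]; then
            handle_part "$part"            # blank line ends the current part
            part=''
        fi
    done < "$1"
    if [[ -n $part ]]; then                # last part has no trailing blank line
        handle_part "$part"
    fi
}
```

Several consecutive blank lines are treated as a single separator, which matches awk's behaviour with an empty RS.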

Resources