Empty-file test should count only letters or digits as content - bash

I have two files. file1 contains this data:
content
file2 was created with vi; I only pressed Enter a few times, so it has two or three rows but no actual data.
The test below does not work for me when the file has rows added but nothing else.
if [ ! -s file2 ]; then
echo "file2 is empty"
else
echo "file2 has content"
fi
In that case it returns: file2 has content
The idea is to treat the file as having content only if it contains a letter or a digit; anything else, such as spaces or newlines, should count as empty.

if perl -ne'exit 1 if /\S/' file ; then
echo 'Only contains blank lines'
fi
Come to think of it, grep would also do the trick.
if ! grep -q '[^[:space:]]' file ; then
echo 'Only contains blank lines'
fi
These are better than anubhava's solution because they consider lines containing only spaces and tabs to be blank lines. It's never a good idea to assign significance to trailing whitespace.
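The grep test can be wrapped in a small function so it is easy to reuse. This is only a sketch: the name is_blank and the sample files are made up for illustration, and the readability check is an added assumption so that a missing file is not reported as blank.

```shell
# is_blank FILE: succeed if FILE is readable and has no non-whitespace character.
# (is_blank is a hypothetical helper name, not something from the question.)
is_blank() {
    [ -r "$1" ] && ! grep -q '[^[:space:]]' "$1"
}

# Sample files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf '\n\n \t\n' > "$tmpdir/blank.txt"     # newlines, a space and a tab
printf '\n\nabc\n' > "$tmpdir/content.txt"   # blank lines plus real content

for f in "$tmpdir/blank.txt" "$tmpdir/content.txt"; do
    if is_blank "$f"; then
        echo "${f##*/}: only whitespace"
    else
        echo "${f##*/}: has content"
    fi
done
```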

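You can use $(<file2) instead to check for a file with no content:
if [[ -z "$(<file2)" ]]; then
echo "file2 is empty"
else
echo "file2 has content"
fi
[[ -z "$(<file2)" ]] will only be true for a zero-length file or a file containing only newlines, because command substitution strips trailing newlines. A file that contains spaces or tabs still counts as having content.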
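A quick experiment shows how $(<file) behaves with a newline-only file and with a file that also holds spaces and tabs. The function names check and check_ws and the sample files are assumptions made for this sketch; check_ws is one possible bash-only variant that also treats spaces and tabs as empty.

```shell
tmpdir=$(mktemp -d)
printf '\n\n\n' > "$tmpdir/newlines.txt"   # only newlines
printf ' \n\t\n' > "$tmpdir/spaces.txt"    # a space and a tab as well

# check FILE: the test from the answer; trailing newlines are stripped by $(<...).
check() {
    if [[ -z "$(<"$1")" ]]; then echo empty; else echo content; fi
}

# check_ws FILE: report content only if some non-whitespace character is present.
check_ws() {
    local data
    data=$(<"$1")
    if [[ $data =~ [^[:space:]] ]]; then echo content; else echo empty; fi
}

check "$tmpdir/newlines.txt"
check "$tmpdir/spaces.txt"
check_ws "$tmpdir/spaces.txt"
```

The second call reports content because only trailing newlines are removed, not the space and the tab.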

awk way
awk 'NF{x++}END{print x?"File has content":"File empty"}' file
You could also stop processing the file when content is found this way
awk 'x+=NF{exit}END{print x?"File has content":"File empty"}' file

Related

Check file empty or not

I have a file which does not have any data in it.
I need to check the scenarios below and report whether the file is empty or not:
If the file contains no data, only spaces, report FILE IS EMPTY.
If the file contains no data, only tabs, report FILE IS EMPTY.
If the file contains no data, only empty lines, report FILE IS EMPTY.
Will the code below satisfy all of these cases? Or is there a better approach that handles them all in one go?
if [ -s /d/dem.txt ]
then
echo "FILE IS NOT EMPTY AS SOME DATA"
else
echo "FILE IS EMPTY NOT DATA AVAILABLE"
fi
You may use this awk for this:
awk 'NF {exit 1}' file && echo "empty" || echo "not empty"
The condition NF is true only for a line that contains a non-whitespace character, so awk exits with status 1 at the first such line.
Your description is a bit unclear (what do you want to do with a file that contains spaces, tabs, and newlines?), but it sounds like you just want to know if the file contains any non-whitespace characters. So:
if grep -q '[^[:space:]]' "$file"; then
printf "%s\n" "$file is not empty";
else
printf "%s\n" "$file contains only whitespace"
fi
If you had run your code you would have seen that the answer is no: -s treats files containing only spaces, tabs and/or newlines as not empty. I would do it like this:
myfile="some_file.txt"
# Delete every whitespace character; trailing newlines are dropped by $(...)
T=$(sed -e 's/[[:space:]]//g' "$myfile")
if [ -n "$T" ]; then
echo "$myfile is NOT empty"
else
echo "$myfile is empty"
fi

Reformatting a csv file, script is confused by ' %." '

I'm using bash on cygwin.
I have to take a .csv file that is a subset of a much larger set of settings and shuffle the new csv settings (same keys, different values) into the 1000-plus-line original, making a new .json file.
I have put together a script to automate this. The first step in the process is to "clean up" the csv file by extracting lines that start with "mme " and "sms ". Everything else is to pass through cleanly to the "clean" .csv file.
This routine is as follows:
# clean up the settings, throwing out mme and sms entries
cat extract.csv | while read -r LINE; do
if [[ $LINE == "mme "* ]]
then
printf "$LINE\n" >> mme_settings.csv
elif [[ $LINE == "sms "* ]]
then
printf "$LINE\n" >> sms_settings.csv
else
printf "$LINE\n" >> extract_clean.csv
fi
done
My problem is that this thing stubs its toe on the following string at the end of one entry: 100%." When it's done with the line, it simply elides the %." and the new-line marker following it, and smears the two lines together:
... 100next.entry.keyname...
I would love to reach in and simply manually delimit the % sign, but it's not a realistic option for my use case. Clearly I'm missing something. My suspicion is that I am in some wise abusing cat or read in the first line.
If there is some place I should have looked to find the answer before bugging you all, by all means point me in that direction and I'll sod off.
Syntax for printf is :
printf format [argument]...
In the printf format string, a % followed by a conversion character is a format specifier, so a % inside your data is interpreted instead of being printed. What you would like to do is:
while IFS= read -r line; do # lowercase name: all-uppercase names are conventionally used for environment and shell variables
if [[ "$line" = "mme "* ]] # the unquoted * matches anything that comes next
then
printf "%s\n" "$line" >> mme_settings.csv
elif [[ "$line" = "sms "* ]]
then
printf "%s\n" "$line" >> sms_settings.csv
else
printf "%s\n" "$line" >> extract_clean.csv
fi
done < extract.csv # avoids the useless use of cat
As pointed out, your problem is expanding a parameter containing a formatting instruction in the formatting argument of printf, which can be solved by using echo instead or moving the parameter to be expanded out of the formatting string, as demonstrated in other answers.
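The difference between the two ways of calling printf can be reproduced with a single made-up line (the sample text is an assumption, not taken from the real extract.csv):

```shell
line='volume is 100%."'

# Data used as the format string: printf tries to interpret the % sequence.
bad=$(printf "$line\n" 2>/dev/null)

# Data passed as an argument to a fixed format: it is printed verbatim.
good=$(printf '%s\n' "$line")

echo "as format:   $bad"
echo "as argument: $good"
```

Only the second form is guaranteed to reproduce the line unchanged, whatever characters it contains.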
I recommend not looping over your whole file with Bash in the first place, as it's notoriously slow; you're extracting lines starting with certain patterns, which is a job at which grep excels:
grep '^mme ' extract.csv > mme_settings.csv
grep '^sms ' extract.csv > sms_settings.csv
grep -v '^mme \|^sms ' extract.csv > extract_clean.csv
The third command uses the -v option (print lines that don't match) and alternation to exclude lines starting with either mme or sms.
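If reading the file three times is a concern, the same split can be done in one pass with awk. This is a sketch under assumptions: the sample extract.csv below is invented, and only the three output file names come from the question.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up sample input; the real extract.csv is not shown in the question.
printf '%s\n' 'mme a,1' 'key.one,"100%."' 'sms b,2' 'key.two,3' > extract.csv

# Each line goes to exactly one file; awk keeps the output files open
# for the whole run, so ">" truncates once and then appends.
awk '
    /^mme / { print > "mme_settings.csv"; next }
    /^sms / { print > "sms_settings.csv"; next }
            { print > "extract_clean.csv" }
' extract.csv

cat extract_clean.csv
```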

Shell and text manipulation: marking the ends of sentences and controlling space between paragraphs

After processing it should be like
I need to go london, after i reach the; Uk.
But i need five hours? To reach it.
But I get:
I need to go london, after i reach the; Uk.
.
.
.
But i need five hours? To reach it.
It works, but it adds a dot at the first line, whereas I need a dot only at the end of a paragraph that does not already have one. Also, whether there is more than one blank line or no blank line between paragraphs, I need to ensure exactly one blank line between them.
How do I deal with these issues?
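You can collapse runs of blank lines between paragraphs into a single blank line with awk (RS="^$" makes awks that treat a multi-character RS as a regular expression, such as GNU awk, read the whole file as one record):
awk '{gsub(/\n\n+/,"\n\n");printf "%s", $0}' RS="^$" file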
and to avoid dots at beginning of lines, you could change your last sed command to:
/\(^$\)\|\([!?;.,]\s*$\)/! s/\s*$/.&/
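Both requirements (a dot at the end of a paragraph that lacks closing punctuation, and exactly one blank line between paragraphs) can also be sketched with awk's paragraph mode. The function name normalize and the sample text are assumptions for illustration, and the sketch only works when paragraphs are already separated by at least one blank line.

```shell
tmpfile=$(mktemp)
# Made-up sample: paragraphs separated by one and by three blank lines.
printf 'First paragraph.\n\nSecond paragraph\n\n\n\nThird one?\n' > "$tmpfile"

# RS = "" switches awk to paragraph mode: any run of blank lines ends a record.
normalize() {
    awk 'BEGIN { RS = "" }
         { sub(/[ \t\n]+$/, "") }                        # trim the end of the paragraph
         $0 !~ /[.!?;,]$/ { $0 = $0 "." }                # add a dot if no punctuation ends it
         { printf "%s%s\n", (NR > 1 ? "\n" : ""), $0 }   # one blank line before later paragraphs
        ' "$1"
}

normalize "$tmpfile"
```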
You can use awk
awk '{if(NR == 1) print $0"\n"; else if($1 != ".") print $0}' file
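You could also read the file line by line, skip the lines that contain only a dot, and print a blank line before every paragraph after the first:
n=0
while IFS= read -r line; do # read the whole line, not just its first word
((n++))
if [[ $line != . ]]; then
if [ "$n" -eq 1 ]; then
printf '%s\n' "$line"
else
printf '\n%s\n' "$line"
fi
fi
done < file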

using head -n within if condition

I am currently trying to evaluate txt-files in a directory using bash. I want to know if the third line of the txt-file matches a certain string. The file starts with two empty lines, then the target string. I tested the following one liner:
if [[ $(head -n 3 a_txt_file.txt) == "target_string" ]]; then echo yes; else echo no; fi
I can imagine that since head -n 3 also prints out the two empty lines, I have to add them to the if condition. But "\n\ntarget_string" and "\n\ntarget_string\n" also don't work.
How would one do this correctly (And I guess it can be done more elegantly as well)?
Try this instead - it will print only the third line:
sed -n 3p file.txt
Alternatively, take the first three lines and keep only the last of them:
head -n 3 file.txt | tail -n 1
You'll want to use sed instead of head. This gets the third line, tests if it matches, and then you can do whatever you want with it if it does match.
if [[ $(sed '3q;d' test_text.txt ) == "target_string" ]]; then echo yes; else echo no; fi
Besides sed, you can use awk to print the third line:
awk 'NR==3' test_text.txt
A pure bash solution:
if { read -r; read -r; read -r line; } < test_text.txt
[[ $line = target_string ]]
then
echo yes
else
echo no
fi
This takes advantage of the fact that the condition of the if statement can be a sequence of commands. First, read twice from the file to discard the empty lines; the third sets line to the 3rd line. After that, you can test it against the target string.
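As for why the attempts with "\n\ntarget_string" fail: inside double quotes, \n is a backslash followed by n, not a newline, and command substitution strips only trailing newlines, so the two leading ones stay. A sketch with a made-up sample file shows the head version working once the newlines are written with bash's $'...' quoting, next to the simpler third-line comparison:

```shell
tmpfile=$(mktemp)
printf '\n\ntarget_string\nmore text\n' > "$tmpfile"   # made-up sample file

# $'\n' is a real newline; the two leading newlines must be part of the comparison.
if [[ $(head -n 3 "$tmpfile") == $'\n\ntarget_string' ]]; then
    head_result=yes
else
    head_result=no
fi

# Comparing only the third line avoids the leading newlines altogether.
if [[ $(sed -n '3p' "$tmpfile") == "target_string" ]]; then
    sed_result=yes
else
    sed_result=no
fi

echo "head: $head_result, sed: $sed_result"
```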

Parsing a config file in bash

Here's my config file (dansguardian-config):
banned-phrase duck
banned-site allaboutbirds.org
I want to write a bash script that will read this config file and create some other files for me. Here's what I have so far, it's mostly pseudo-code:
while read line
do
# if line starts with "banned-phrase"
# add rest of line to file bannedphraselist
# fi
# if line starts with "banned-site"
# add rest of line to file bannedsitelist
# fi
done < dansguardian-config
I'm not sure if I need to use grep, sed, awk, or what.
Hope that makes sense. I just really hate DansGuardian lists.
With awk:
$ cat config
banned-phrase duck frog bird
banned-phrase horse
banned-site allaboutbirds.org duckduckgoose.net
banned-site froggingbirds.gov
$ awk '$1=="banned-phrase"{for(i=2;i<=NF;i++)print $i >"bannedphraselist"}
$1=="banned-site"{for(i=2;i<=NF;i++)print $i >"bannedsitelist"}' config
$ cat bannedphraselist
duck
frog
bird
horse
$ cat bannedsitelist
allaboutbirds.org
duckduckgoose.net
froggingbirds.gov
Explanation:
In awk, each line is by default split into fields on whitespace, and field i is referred to as $i: the first field on each line is $1, the second is $2, and so on up to $NF, where NF is the variable that holds the number of fields on the current line.
So the script is simple:
Check the first field against our required strings $1=="banned-phrase"
If the first field matched then loop over all the other fields for(i=2;i<=NF;i++) and print each field print $i and redirect the output to the file >"bannedphraselist".
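The field variables described in this explanation can be inspected on a single sample line (taken from the config shown in this answer):

```shell
# Print the field count, the first field, and every remaining field with its index.
fields=$(printf 'banned-phrase duck frog bird\n' |
    awk '{ print NF; print $1; for (i = 2; i <= NF; i++) print i ": " $i }')

echo "$fields"
```

The loop starts at 2 for the same reason as in the script: field 1 is the directive name, and the rest are the values to write out.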
You could do
sed -n 's/^banned-phrase *//p' dansguardian-config > bannedphraselist
sed -n 's/^banned-site *//p' dansguardian-config > bannedsitelist
Although that means reading the file twice. I doubt that the possible performance loss matters though.
You can read multiple variables at once; by default they're split on whitespace.
while read command target; do
case "$command" in
banned-phrase) echo "$target" >>bannedphraselist;;
banned-site) echo "$target" >>bannedsitelist;;
"") ;; # blank line
*) echo >&2 "$0: unrecognized config directive '$command'";;
esac
done < dansguardian-config
Just as an example. A smarter implementation would read the list files first, make sure things weren't already banned, etc.
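One way the duplicate check could look is sketched here. The helper name add_unique and the sample config are assumptions for illustration; nothing in it is specific to DansGuardian.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Made-up config with a repeated entry.
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' \
    'banned-phrase duck' > dansguardian-config

# add_unique FILE ENTRY: append ENTRY unless that exact line is already in FILE.
add_unique() {
    touch "$1"
    # -x matches whole lines, -F treats the entry as a fixed string, not a regex.
    grep -qxF -e "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

while read -r command target; do
    case "$command" in
        banned-phrase) add_unique bannedphraselist "$target" ;;
        banned-site)   add_unique bannedsitelist "$target" ;;
    esac
done < dansguardian-config

cat bannedphraselist
```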
What is the problem with all the solutions that use echo text >> file? strace shows that in every such step the file is opened, positioned at the end, written to, and closed. So 1000 executions of echo text >> file mean 1000 open, lseek, write and close calls. The number of open, lseek and close calls can be reduced a lot in the following way:
while read -r key val; do
case $key in
banned-phrase) echo "$val" >&2;;
banned-site) echo "$val";;
esac
done >bannedsitelist 2>bannedphraselist <dansguardian-config
stdout and stderr are redirected to the files and kept open for as long as the loop runs, so each file is opened once and closed once, with no need for lseek. File caching is also used better this way, because there are no unnecessary close calls flushing the buffers each time.
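A variation on the same idea opens two extra file descriptors for the loop instead of borrowing stderr, which leaves stderr free for real error messages. The descriptor numbers 3 and 4 and the sample config are arbitrary choices for this sketch.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Made-up config for the demonstration.
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' \
    'banned-phrase horse' > dansguardian-config

# Descriptors 3 and 4 are opened once for the whole loop and closed when it ends.
while read -r key val; do
    case $key in
        banned-phrase) printf '%s\n' "$val" >&3 ;;
        banned-site)   printf '%s\n' "$val" >&4 ;;
        "")            ;;                                    # blank line
        *)             echo "unrecognized directive: $key" >&2 ;;
    esac
done < dansguardian-config 3> bannedphraselist 4> bannedsitelist

cat bannedphraselist
```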
while read -r name value
do
# Quote the variables so that blank lines and values with spaces do not break the test
if [ "$name" = banned-phrase ]
then
echo "$value" >> bannedphraselist
elif [ "$name" = banned-site ]
then
echo "$value" >> bannedsitelist
fi
done < dansguardian-config
Better to use awk:
awk '$1 ~ /^banned-phrase/{print $2 >> "bannedphraselist"}
$1 ~ /^banned-site/{print $2 >> "bannedsitelist"}' dansguardian-config
