File without extension: how to detect it in a bash script?

I made a very simple script which tells me a file name and extension.
The script works as follows:
for file in * ; do
if [[ -f $file ]] ; then
filename=${file##*/}
basename=${filename%\.*}
extension=${filename##*.}
if [[ -n $extension ]] ; then
echo "FILE: " $basename " ; ESTENSIONE " $extension
fi
fi
done
The problem is that when I have a file without extension (e.g. Makefile) it says that the extension is the filename itself (e.g. extension= Makefile).
Am I doing something wrong?

Well, the result you get is the expected one; I don't know if that means you're doing something wrong or not.
The way the pattern replacements work is that if the pattern doesn't match, nothing is replaced. Here you have ${filename##*.} which says remove all characters up to and including the final period. But if there's no period in the name, then the pattern doesn't match and nothing is removed, so you simply get the same result as ${filename}.
I should point out that the backslash in ${filename%\.*} is useless: the pattern here is shell globbing not regular expressions, so you don't need to escape a period. You can just write ${filename%.*}.
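Both points can be checked with a minimal sketch. The sample names below are made up for illustration and are not taken from the question's directory.

```bash
#!/usr/bin/env bash
# Illustrative file names (assumptions, not from the question).
with_dot=notes.txt
without_dot=Makefile

# The pattern *. matches everything up to the last period, so it is removed.
ext_with_dot=${with_dot##*.}

# No period means the pattern cannot match; the value comes back unchanged.
ext_without_dot=${without_dot##*.}

# Glob patterns treat . literally, so the escaped and plain forms agree.
base_escaped=${with_dot%\.*}
base_plain=${with_dot%.*}

printf '%s\n' "$ext_with_dot" "$ext_without_dot" "$base_escaped" "$base_plain"
```

This prints txt, Makefile, notes and notes, one per line.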
ETA:
There's no way to do what you want in one step. You have two choices; you can either test to see if the extension is the same as the filename and if so set it to empty:
extension=${filename##*.}
[ "$extension" = "$filename" ] && extension=
or you can strip off the basename, which you already computed, then get rid of any leading periods:
extension=${filename#"$basename"}
extension=${extension##*.}

Extensions don't have any privileged status in Unix file systems; they are just a part of the file name that people treat specially. You'll have to check if the file contains a . first.
basename=${filename%\.*}
if [[ $filename = *.* ]]; then
extension=${filename##*.}
echo "FILE: " $basename " ; ESTENSIONE " $extension
else
extension=""
fi
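The same check can be wrapped in a small function. This is only a sketch: split_name is a hypothetical helper name, and treating hidden files such as .bashrc as having no extension is an assumption that goes beyond the question.

```bash
#!/usr/bin/env bash
# split_name is a hypothetical helper, not part of the original script.
# It prints "base|extension" for one path.
split_name() {
    local filename="${1##*/}"      # drop any leading directories
    local base="$filename" ext=""
    # ?*.* requires at least one character before a period, so a hidden
    # file such as .bashrc counts as having no extension (an assumption).
    if [[ $filename == ?*.* ]]; then
        base=${filename%.*}
        ext=${filename##*.}
    fi
    printf '%s|%s\n' "$base" "$ext"
}

split_name Makefile              # Makefile|
split_name /tmp/archive.tar.gz   # archive.tar|gz
split_name .bashrc               # .bashrc|
```

Only the part after the last period is reported, so archive.tar.gz yields gz rather than tar.gz.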

Related

Finding a file extension in a string using shell script

I have a long string, which contains a filename somewhere in it. I want to return just the filename.
How can I do this in a shell script, i.e. using sed, awk etc?
The following works in python, but I need it to work in a shell script.
import re
def find_filename(string, match):
string_list = string.split()
match_list = []
for word in string_list:
if match in word:
match_list.append(word)
#remove any characters after file extension
fullfilename = match_list[0][:-1]
#get just the filename without full directory
justfilename = fullfilename.split("/")
return justfilename[-1]
mystr = "the string contains a lot of irrelevant information and then a filename: /home/test/this_filename.txt: and then more irrelevant info"
file_ext = ".txt"
filename = find_filename(mystr, file_ext)
print(filename)
this_filename.txt
EDIT adding shell script requirement
I would call shell script like this:
./test.sh "the string contains a lot of irrelevant information and then a filename: /home/test/this_filename.txt: and then more irrelevant info" ".txt"
test.sh
#!/bin/bash
longstring=$1
fileext=$2
echo $longstring
echo $fileext
With bash and a regex:
#!/bin/bash
longstring="$1"
fileext="$2"
regex="[^/]+\\$fileext"
[[ "$longstring" =~ $regex ]] && echo "${BASH_REMATCH[0]}"
Output:
this_filename.txt
Tested only with your example.
See: The Stack Overflow Regular Expressions FAQ
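If you would rather avoid building a regex from the extension, the Python logic can be mirrored with word splitting and parameter expansion. This is a sketch under assumptions: find_filename is a name borrowed from the Python version, the input is a single line, and the file name itself contains no whitespace.

```bash
#!/usr/bin/env bash
# find_filename mirrors the Python helper from the question (a sketch only).
find_filename() {
    local longstring=$1 fileext=$2 word
    local -a words
    # Split the string into whitespace-separated words.
    read -r -a words <<<"$longstring"
    for word in "${words[@]}"; do
        if [[ $word == *"$fileext"* ]]; then
            # Cut everything after the last occurrence of the extension.
            word=${word%"$fileext"*}$fileext
            # Strip the directory part.
            printf '%s\n' "${word##*/}"
            return 0
        fi
    done
    return 1
}

find_filename "a filename: /home/test/this_filename.txt: and more info" ".txt"
```

Quoting "$fileext" inside the patterns makes the extension match literally, so the period needs no escaping here.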
Assuming you want to extract the file name with its extension and then check whether that file exists on the system, you could try the following. It also adds a check that exits the script unless exactly two arguments are passed.
cat script.bash
if [[ "$#" -ne 2 ]]
then
echo "Please pass exactly two arguments as the script requires; exiting now."
exit 1;
fi
fileName=$(echo "$1" | awk -v ext="$2" 'match($0,/\/[^ :]*/){f=substr($0,RSTART,RLENGTH); if (substr(f,length(f)-length(ext)+1)==ext) print f}')
echo "File name with file extension is: $fileName"
if [[ -f "$fileName" ]]
then
echo "File $fileName is present"
else
echo "File $fileName is NOT present."
fi

Unix File extension validation Fails

I am missing something that I cannot figure out and would appreciate some thoughts.
I am trying to check the extensions of the files in a directory. The only extensions I get are .txt and .TXT, but the two must be treated as different because I perform different validations for .txt and .TXT files.
I have the below files
aa.394.63.txt
aa.394.23.TXT
Here is my code
for file in "$SEARCH_DIR"/*; do
extn=$(echo $file | awk -F '.' '{print $NF}')
echo "extn:" $extn
if [ $extn=="txt" ]; then
echo "txt Loop"
elif [$extn=="TXT" ]; then
echo "TXT loop"
fi
But the script always takes the "txt" branch and never reaches the "TXT loop". I thought Unix was case-sensitive and would treat the two as different. Please advise what I am missing.
You're using test in the form of [] to test your conditions. You must include spaces around the brackets and the equality operators.
From the test man page:
[ is a synonym for test but requires a final argument of ]
...
Spaces around the brackets are important - each operator and operand must be a separate argument.
https://ss64.com/bash/test.html
This means that you need to pay careful attention to spaces in your test constructs. You should also note that variables should be quoted when you're testing them with [], as they may have undergone word splitting (not relevant in this case, but probably a good habit).
Because you're using [] to test conditions, rather than the bash [[]] construct, you should use a single = framed with whitespace as a test for string equality.
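To see why the missing spaces make the first branch always succeed, consider this minimal sketch. The variable names broken and fixed are only for illustration.

```bash
#!/usr/bin/env bash
extn=TXT

# Without spaces the shell passes the single word "TXT==txt" to [.
# A one-argument test succeeds whenever that argument is non-empty.
if [ $extn=="txt" ]; then
    broken="matched"
else
    broken="not matched"
fi

# With spaces and quotes, [ receives three arguments: "TXT", "=", "txt".
if [ "$extn" = "txt" ]; then
    fixed="matched"
else
    fixed="not matched"
fi

printf 'without spaces: %s\nwith spaces: %s\n' "$broken" "$fixed"
```

The first line of output reports matched even though the extension is TXT; the second reports not matched. The elif [$extn== form in the question never runs at all, and if it did it would fail differently: without a space after [, the shell looks for a command literally named [TXT==TXT.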
The following is a slightly amended version of your code and should work:
#!/bin/bash
SEARCH_DIR=./search
for file in "$SEARCH_DIR"/*; do
extn=$(echo "$file" | awk -F '.' '{print $NF}')
echo "extn:" "$extn"
if [ "$extn" = "txt" ]; then
echo "txt Loop"
elif [ "$extn" = "TXT" ]; then
echo "TXT loop"
fi
done
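As an alternative sketch, the extension can be taken with parameter expansion instead of awk, and a case statement keeps the two branches apart. The function name classify is hypothetical, and the sketch assumes the nocasematch shell option is off (the default).

```bash
#!/usr/bin/env bash
# classify is a hypothetical helper; it prints which branch a file takes.
classify() {
    local file=$1
    # ${file##*.} keeps only the text after the last period.
    case ${file##*.} in
        txt) echo "txt Loop" ;;
        TXT) echo "TXT loop" ;;
        *)   echo "other" ;;
    esac
}

classify aa.394.63.txt   # txt Loop
classify aa.394.23.TXT   # TXT loop
```

case patterns are case-sensitive by default, so txt and TXT are distinguished without any extra work.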
References
test man page
Comparison operators in bash

bash script not filtering

I'm hoping this is a simple question, since I've never done shell scripting before. I'm trying to filter certain files out of a list of results. While the script executes and prints out a list of files, it's not filtering out the ones I don't want. Thanks for any help you can provide!
#!/bin/bash
# Purpose: Identify all *md files in H2 repo where there is no audit date
#
#
#
# Example call: no_audits.sh
#
# If that call doesn't work, try ./no_audits.sh
#
# NOTE: Script assumes you are executing from within the scripts directory of
# your local H2 git repo.
#
# Process:
# 1) Go to H2 repo content directory (assumption is you are in the scripts dir)
# 2) Use for loop to go through all *md files in each content sub dir
# and list all file names and directories where audit date is null
#
#set counter
count=0
# Go to content directory and loop through all 'md' files in sub dirs
cd ../content
FILES=`find . -type f -name '*md' -print`
for f in $FILES
do
if [[ $f == "*all*" ]] || [[ $f == "*index*" ]] ;
then
# code to skip
echo " Skipping file: " $f
continue
else
# find audit_date in file metadata
adate=`grep audit_date $f`
# separate actual dates from rest of the grepped line
aadate=`echo $adate | awk -F\' '{print $2}'`
# if create date is null - proceed
if [[ -z "$aadate" ]] ;
then
# print a list of all files without audit dates
echo "Audit date: " $aadate " " $f;
count=$((count+1));
fi
fi
done
echo $count " files without audit dates "
First, to address the immediate issue:
[[ $f == "*all*" ]]
is only true if the exact contents of f is the string *all* -- with the wildcards as literal characters. If you want to check for a substring, then the asterisks shouldn't be quoted:
[[ $f = *all* ]]
...is a better-practice solution. (Note the use of = rather than == -- this isn't essential, but is a good habit to be in, as the POSIX test command is only specified to permit = as a string comparison operator; if one writes [ "$f" == foo ] by habit, one can get unexpected failures on platforms with a strictly compliant /bin/sh).
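The difference between the quoted and unquoted pattern can be seen in a short sketch. The path below is an invented example.

```bash
#!/usr/bin/env bash
f=./docs/all-pages.md   # invented sample path

# Quoted: compared against the literal five characters *all*.
if [[ $f == "*all*" ]]; then quoted=match; else quoted=no-match; fi

# Unquoted: * acts as a wildcard, so this is a substring test.
if [[ $f = *all* ]]; then unquoted=match; else unquoted=no-match; fi

printf 'quoted: %s\nunquoted: %s\n' "$quoted" "$unquoted"
```

The quoted form reports no-match and the unquoted form reports match.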
That said, a ground-up implementation of this script intended to follow best practices might look more like the following:
#!/usr/bin/env bash
count=0
while IFS= read -r -d '' filename; do
aadate=$(awk -F"'" '/audit_date/ { print $2; exit; }' <"$filename")
if [[ -z $aadate ]]; then
(( ++count ))
printf 'File %q has no audit date\n' "$filename"
else
printf 'File %q has audit date %s\n' "$filename" "$aadate"
fi
done < <(find . -not '(' -name '*all*' -o -name '*index*' ')' -type f -name '*md' -print0)
echo "Found $count files without audit dates" >&2
Note:
An arbitrary list of filenames cannot be stored in a single bash string (because all characters that might otherwise be used to determine where the first name ends and the next name begins could be present in the name itself). Instead, read one NUL-delimited filename at a time -- emitted with find -print0, read with IFS= read -r -d ''; this is discussed in BashFAQ #1.
Filtering out unwanted names can be done internal to find.
There's no need to preprocess input to awk using grep, as awk is capable of searching through input files itself.
< <(...) is used to avoid the behavior in BashFAQ #24, wherein content piped to a while loop causes variables set or modified within that loop to become unavailable after its exit.
printf '...%q...\n' "$name" is safer than echo "...$name..." when handling unknown filenames, as printf will emit printable content that accurately represents those names even if they contain unprintable characters or characters which, when emitted directly to a terminal, act to modify that terminal's configuration.
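The BashFAQ #24 behavior mentioned above can be reproduced with a small sketch. It assumes bash's default settings, in particular that the lastpipe option is not enabled.

```bash
#!/usr/bin/env bash
# Piped input: the while loop runs in a subshell, so its counter is lost.
count_piped=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    (( ++count_piped ))
done

# Process substitution: the loop runs in the current shell.
count_procsub=0
while IFS= read -r line; do
    (( ++count_procsub ))
done < <(printf 'a\nb\nc\n')

printf 'piped: %s\nprocess substitution: %s\n' "$count_piped" "$count_procsub"
```

The piped counter is still 0 after the loop, while the process-substitution counter is 3.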
Never mind, I found the answer here:
bash script to check file name begins with expected string
I tried various versions of the wildcard/filename and ended up with:
if [[ "$f" == *all.md ]] || [[ "$f" == *index.md ]] ;
The link above said not to put those in quotes, and removing the quotes did the trick!

Bash - check for a string in file path

How can I check for a string in a file path in bash? I am trying:
if [[$(echo "${filePathVar}" | sed 's#//#:#g#') == *"File.java"* ]]
to replace all forward slashes with a colon (:) in the path. It's not working. Bash is seeing the file path string as a file path and throws the error "No such file or directory". The intention is for it to see the file path as a string.
Example: filePathVar could be
**/myloc/src/File.java
in which case the check should return true.
Please note that I am writing this script inside a Jenkins job as a build step.
Updates as of 12/15/15
The following returns Not found, which is wrong.
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/B.java"
if [[ "${sources}" = ~B.java[^/]*$ ]];
then
echo "Found!!"
else
echo "Not Found!!"
fi
The following returns Found, which is also wrong (removed the space around the comparator =).
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/C.java"
if [[ "${sources}"=~B.java[^/]*$ ]];
then
echo "Found!!"
else
echo "Not Found!!"
fi
The comparison operation is clearly not working.
It is easier to use bash's builtin regex matching facility:
$ filePathVar=/myLoc/src/File.java
if [[ "$filePathVar" =~ File.java[^/]*$ ]]; then echo Match; else echo No Match; fi
Match
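Inside [[...]], the operator =~ does regex matching. The regular expression File.java[^/]*$ matches any string in which File.java is followed only by characters other than /, up to the end of the string. Note that the unescaped . matches any single character; write File\.java if you want to match a literal dot.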
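A slightly stricter variant keeps the regex in a variable and escapes the dot. This is a sketch; matches_file is a hypothetical function name and the sample paths are for illustration.

```bash
#!/usr/bin/env bash
# matches_file is a hypothetical helper wrapping the regex test.
matches_file() {
    local path=$1
    local re='File\.java[^/]*$'   # \. matches a literal dot only
    [[ $path =~ $re ]]
}

matches_file '**/myloc/src/File.java'  && echo "Match"
matches_file '**/myloc/src/FileXjava'  || echo "No match"
matches_file '**/File.java/other.txt'  || echo "No match"
```

Storing the pattern in a variable and expanding it unquoted on the right of =~ avoids surprises with shell quoting inside the regex.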
It worked in a simpler way as below:
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/B.java"
if [[ $sources == *"A.java"* ]]
then
echo "Found!!"
else
echo "Not Found!!"
fi
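For reference, here is a sketch of why the two attempts in the question misbehaved. The function name check is hypothetical, and the regex is simplified to B\.java.

```bash
#!/usr/bin/env bash
# check is a hypothetical helper printing the result of three test forms.
check() {
    local sources=$1 re='B\.java' no_space split_op regex

    # No spaces around =~ : [[ sees one non-empty word, which is always true.
    if [[ "${sources}"=~B.java ]]; then no_space=found; else no_space=missing; fi

    # "= ~B.java": a glob comparison against a pattern that starts with ~.
    if [[ "${sources}" = ~B.java ]]; then split_op=found; else split_op=missing; fi

    # Correct form: =~ written as one operator with spaces on both sides.
    if [[ $sources =~ $re ]]; then regex=found; else regex=missing; fi

    printf '%s %s %s\n' "$no_space" "$split_op" "$regex"
}

check "**/src/TESTS/A.java **/src/TESTS/B.java"   # found missing found
check "**/src/TESTS/A.java **/src/TESTS/C.java"   # found missing missing
```

Only the third form tracks whether B.java is actually present.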

Bash - if and for statements

I am a little unfamiliar with the syntax of the 'if...then...fi' and 'for' statements.
Could anyone explain what "$2/$fn" and "/etc/*release" mean in the code snippets below, specifically the use of the forward slash and the asterisk?
if [ -f "$filename" ]; then
if [ ! -f "$2/$fn" ]; then
echo "$fn is missing from $2"
missing=$((missing + 1))
fi
fi
and
function system_info
{
if ls /etc/*release 1>/dev/null 2>&1; then
echo "<h2>System release info</h2>"
echo "<pre>"
for i in /etc/*release; do
# Since we can't be sure of the
# length of the file, only
# display the first line.
head -n 1 $i
done
uname -orp
echo "</pre>"
fi
} # end of system_info
...thx for the help...
/etc/*release: here the * matches any sequence of characters, including none, so names such as /etc/0release and /etc/asdfasdfr_release match. Simply stated, the pattern expands to all the files in the /etc/ directory whose names end with the string release.
$2 is the second command-line argument to the shell script, and $fn is some other shell variable. After variable substitution, "$2/$fn" forms a single string, and [ -f "$2/$fn" ] tests whether that string is the path of an existing regular file, which is what the -f switch checks. In your snippet the test is negated with !, so the body of the if is executed when no such regular file exists.
The for loop iterates over all the files in the directory /etc whose names end with the string release. At each iteration i contains the next such file name, and the head command in the body displays the first line of that file.
For details, consult the manual with man bash, and for the conditions used in if see man test. Here is a good resource: http://tldp.org/LDP/Bash-Beginners-Guide/html/
The forward slash is the path separator, and the * is a file glob character. $2/$fn is a path where $2 specifies the directory and $fn is the filename. /etc/*release expands to the list of all the files in /etc whose names end in "release", each as a separate word.
The dollar sign marks a variable. The "-f" operator means "file exists and is a regular file".
So,
[ -f "$filename" ]
checks whether there is a regular file whose name is the value of the $filename variable.
Similarly, if we assume that $2 = "some_folder" and $fn = "some_file", the expression
[ ! -f "$2/$fn" ]
returns true if the file some_folder/some_file doesn't exist.
Now, about the asterisk: it stands for "zero or more of any characters". So, the expression:
for i in /etc/*release; do
will iterate through all files whose names match that pattern, for example:
/etc/release, /etc/666release, /etc/wtf_release...
I hope this helps.
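The glob expansion can be tried safely in a scratch directory instead of /etc. This is a sketch; the file names created below are invented for the demonstration.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory so nothing under /etc is touched.
tmp=$(mktemp -d)
touch "$tmp/os-release" "$tmp/lsb-release" "$tmp/hosts"

# The pattern expands to every name in $tmp that ends in "release".
matches=()
for i in "$tmp"/*release; do
    # -f confirms the path is an existing regular file.
    if [ -f "$i" ]; then
        matches+=("${i##*/}")
    fi
done

printf '%s\n' "${matches[@]}"
rm -rf "$tmp"
```

Only lsb-release and os-release are listed; hosts does not match. When nothing matches, bash leaves the pattern as a literal string unless the nullglob option is set, which is why the -f check inside the loop is a useful guard.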
