I am having a hard time with this issue.
I've got several files in a folder with the general name format:
[ file type ] file name ( date ) file specs . file extension
File type can be anything such as: Webinar, Presentation, Symposium, Oral Talk etc... Please notice that it is surrounded by [] and might include spaces.
File name can be anything, please notice it might include spaces.
Date is in the general format dd_mm_yyyy. Please notice that it is surrounded by ().
File specs gives general information about the file attributes (and is not important).
I want to write a script so I could rename all files inside the folder to the following format:
date [ file type ] file name . file extension
() around date should be discarded, the date should change to the format ddmmyyyy, [] around file type should be maintained, and file specs should be ignored.
Example:
[Oral Talk] Prospects for future research (13_11_2017) 1080p 320kpbs.mp4
should change to:
13112017 [Oral Talk] Prospects for future research.mp4
But then, this should be iterated for all files in the folder.
#!/usr/bin/env bash
# use a regular expression with match groups to pick out pieces of the name
re='\[(.+)\] (.+) [(]([[:digit:]_]+)[)](.*)[.]([[:alnum:]]+)'
# 1 2 3 4 5
# if we were passed a directory name, change to it before running
if [[ $1 ]]; then cd -- "$1" || exit; fi
for name in *.*; do
if [[ $name =~ $re ]]; then
date=${BASH_REMATCH[3]}
ext=${BASH_REMATCH[5]}
type=${BASH_REMATCH[1]}
topic=${BASH_REMATCH[2]}
new_name="${date//_/} [$type] $topic.$ext"
printf 'Renaming %q to %q\n' "$name" "$new_name" >&2
[[ $dry_run ]] || mv -- "$name" "$new_name"
else
printf 'Filename %q does not match pattern; ignoring\n' "$name" >&2
fi
done
If saved under the name renaming-script and run as:
dry_run=1 ./renaming-script directory-to-rename
...this will merely print a report of which contents it would rename. With the dry_run=1 removed, it actually performs those rename operations.
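To see what the match groups capture, here is a minimal sketch that applies the same regular expression to the example filename from the question, without touching any files. The function name compute_new_name is hypothetical and exists only for illustration.

```shell
#!/usr/bin/env bash
# Same regular expression as in the script above.
re='\[(.+)\] (.+) [(]([[:digit:]_]+)[)](.*)[.]([[:alnum:]]+)'

# compute_new_name is a hypothetical helper: it prints the new name for a
# matching filename and returns non-zero for a name that does not match.
compute_new_name() {
  local name=$1
  [[ $name =~ $re ]] || return 1
  local type=${BASH_REMATCH[1]} topic=${BASH_REMATCH[2]}
  local date=${BASH_REMATCH[3]} ext=${BASH_REMATCH[5]}
  # Group 4 (the file specs) is deliberately dropped.
  printf '%s [%s] %s.%s\n' "${date//_/}" "$type" "$topic" "$ext"
}

compute_new_name '[Oral Talk] Prospects for future research (13_11_2017) 1080p 320kpbs.mp4'
```

Keeping the name computation in a function that only prints makes it easy to check the pattern against a few sample names before any mv is run.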
I'm running this in bash and even though there is a .txt file it prints out "no new folders to create" in the terminal.
Am I missing something?
FILES=cluster-02/*/*
for f in $FILES
do
if [[ $f == *."txt" ]]
then
cat $f | xargs mkdir -p
else
echo "No new folders to create"
fi
done;
As mentioned in the first comment, the behaviour is indeed as you might expect from your script: you run through all files, text files and other ones. In case your file is a text file, you perform the if-case and in case your file is another type of file, you perform the else-case.
To solve this, you can leave the other files out of the loop and handle only text files, for example:
FILES=cluster-02/*/*.txt
You're looping over multiple files, so the first result may trigger the if and the second can show the else.
You could save the wildcard result in an array, check if there's something in it, and loop if so:
shopt -s nullglob
FILES=( foo/* )
if (( ${#FILES[@]} )); then
for f in "${FILES[@]}"; do
if [[ $f == *."txt" ]]; then
echo $f
fi
done
else
echo "No new folders to create"
fi
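As a rough illustration of why nullglob matters for the "is there anything in the array" check, the sketch below expands the same glob in an empty scratch directory with the option off and then on. The directory comes from mktemp and the variable names are made up for this example.

```shell
#!/usr/bin/env bash
# Scratch directory with no files in it.
empty_dir=$(mktemp -d)

shopt -u nullglob
without=( "$empty_dir"/* )   # no match: the pattern is kept as a literal string

shopt -s nullglob
with=( "$empty_dir"/* )      # no match: the pattern expands to nothing

printf 'without nullglob: %d element(s)\n' "${#without[@]}"
printf 'with nullglob:    %d element(s)\n' "${#with[@]}"

rmdir -- "$empty_dir"
```

Without nullglob the array holds one element (the unexpanded pattern), so a length check alone cannot tell "no files" from "one file".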
#!/usr/bin/env bash
# Create an array containing a list of files
# This is safer to avoid issues with files having special characters such
# as spaces, glob-characters, or other characters that might be cumbersome
# Note: if no files are found, the array contains a single element with the
# string "cluster-02/*/*"
file_list=( cluster-02/*/* )
# loop over the content of the file list
# ensure to quote the list to avoid the same pitfalls as above
for _file in "${file_list[@]}"
do
[ "${_file%.txt}" == "${_file}" ] && continue # skip, not a txt
[ -f "${_file}" ] || continue # check if the file exists
[ -r "${_file}" ] || continue # check if the file is readable
[ -s "${_file}" ] || continue # skip if the file is empty
< "${_file}" xargs mkdir -p -- # add -- to avoid issues with entries starting with -
_c=1
done;
[ "${_c}" ] || echo "No new folders to create"
I'm hoping this is a simple question, since I've never done shell scripting before. I'm trying to filter certain files out of a list of results. While the script executes and prints out a list of files, it's not filtering out the ones I don't want. Thanks for any help you can provide!
#!/bin/bash
# Purpose: Identify all *md files in H2 repo where there is no audit date
#
#
#
# Example call: no_audits.sh
#
# If that call doesn't work, try ./no_audits.sh
#
# NOTE: Script assumes you are executing from within the scripts directory of
# your local H2 git repo.
#
# Process:
# 1) Go to H2 repo content directory (assumption is you are in the scripts dir)
# 2) Use for loop to go through all *md files in each content sub dir
# and list all file names and directories where audit date is null
#
#set counter
count=0
# Go to content directory and loop through all 'md' files in sub dirs
cd ../content
FILES=`find . -type f -name '*md' -print`
for f in $FILES
do
if [[ $f == "*all*" ]] || [[ $f == "*index*" ]] ;
then
# code to skip
echo " Skipping file: " $f
continue
else
# find audit_date in file metadata
adate=`grep audit_date $f`
# separate actual dates from rest of the grepped line
aadate=`echo $adate | awk -F\' '{print $2}'`
# if create date is null - proceed
if [[ -z "$aadate" ]] ;
then
# print a list of all files without audit dates
echo "Audit date: " $aadate " " $f;
count=$((count+1));
fi
fi
done
echo $count " files without audit dates "
First, to address the immediate issue:
[[ $f == "*all*" ]]
is only true if the exact contents of f is the string *all* -- with the wildcards as literal characters. If you want to check for a substring, then the asterisks shouldn't be quoted:
[[ $f = *all* ]]
...is a better-practice solution. (Note the use of = rather than == -- this isn't essential, but is a good habit to be in, as the POSIX test command is only specified to permit = as a string comparison operator; if one writes [ "$f" == foo ] by habit, one can get unexpected failures on platforms with a strictly compliant /bin/sh).
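A small sketch of the difference, using a made-up path: the quoted right-hand side is compared as a literal string, while the unquoted one is treated as a glob pattern.

```shell
#!/usr/bin/env bash
f='./docs/all-pages.md'   # hypothetical path, for illustration only

# Quoted: matches only if $f is literally the five characters *all*
if [[ $f == "*all*" ]]; then quoted=match; else quoted='no match'; fi

# Unquoted: the asterisks act as wildcards, so this is a substring test
if [[ $f = *all* ]]; then unquoted=match; else unquoted='no match'; fi

printf 'quoted pattern:   %s\n' "$quoted"
printf 'unquoted pattern: %s\n' "$unquoted"
```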
That said, a ground-up implementation of this script intended to follow best practices might look more like the following:
#!/usr/bin/env bash
count=0
while IFS= read -r -d '' filename; do
aadate=$(awk -F"'" '/audit_date/ { print $2; exit; }' <"$filename")
if [[ -z $aadate ]]; then
(( ++count ))
printf 'File %q has no audit date\n' "$filename"
else
printf 'File %q has audit date %s\n' "$filename" "$aadate"
fi
done < <(find . -not '(' -name '*all*' -o -name '*index*' ')' -type f -name '*md' -print0)
echo "Found $count files without audit dates" >&2
Note:
An arbitrary list of filenames cannot be stored in a single bash string (because all characters that might otherwise be used to determine where the first name ends and the next name begins could be present in the name itself). Instead, read one NUL-delimited filename at a time -- emitted with find -print0, read with IFS= read -r -d ''; this is discussed in [BashFAQ #1].
Filtering out unwanted names can be done internal to find.
There's no need to preprocess input to awk using grep, as awk is capable of searching through input files itself.
< <(...) is used to avoid the behavior in BashFAQ #24, wherein content piped to a while loop causes variables set or modified within that loop to become unavailable after its exit.
printf '...%q...\n' "$name" is safer than echo "...$name..." when handling unknown filenames, as printf will emit printable content that accurately represents those names even if they contain unprintable characters or characters which, when emitted directly to a terminal, act to modify that terminal's configuration.
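The point about < <(...) can be checked with a short experiment. This is only a sketch: it assumes default shell options (in particular, lastpipe is not enabled) and uses a scratch directory from mktemp with two made-up file names.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
touch "$tmpdir/a.md" "$tmpdir/b c.md"

# Piping into the loop: the loop runs in a subshell, so the counter
# incremented inside it is lost once the pipeline finishes.
piped=0
find "$tmpdir" -type f -name '*.md' -print0 |
  while IFS= read -r -d '' f; do (( ++piped )); done

# Process substitution: the loop runs in the current shell, so the
# counter keeps its value after the loop.
direct=0
while IFS= read -r -d '' f; do
  (( ++direct ))
done < <(find "$tmpdir" -type f -name '*.md' -print0)

printf 'piped=%d direct=%d\n' "$piped" "$direct"
rm -rf -- "$tmpdir"
```

Both loops see the same two files; only the second one leaves a usable count behind.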
Never mind, I found the answer here:
bash script to check file name begins with expected string
I tried various versions of the wildcard/filename and ended up with:
if [[ "$f" == *all.md ]] || [[ "$f" == *index.md ]] ;
The link above said not to put those in quotes, and removing the quotes did the trick!
I am trying to get the filename in a folder with only one file in it.
FYI: The $FOLDER_TMP contains a space in it, that is why I use printf
function nameofkeyfile(){
FOLDER_TMP="${PWD%/*/*}/folder/"
FOLDER=$(printf %q "${FOLDER_TMP}")
FILENAME=ls "$FOLDER" # Error: No such file or directory
# or this: FILENAME=$(ls "$FOLDER") # Error: No such file or directory
FNAME=`basename $FILENAME`
}
The problem is the line:
FILENAME=ls "$FOLDER" # Error: No such file or directory
Do you know why - and yes the folder is there?
And if I echo the $FOLDER it gives me the right folder.
I am trying to get the filename in a folder with only one file in it.
You definitely have the wrong approach.
Instead, consider using globbing like so:
The assignment
fname=( "${PWD%/*/*}"/folder/* )
will populate the array fname with the expansion of the given glob: that is, all files in the directory "${PWD%/*/*}"/folder/, if any. If there are no files at all, your array will contain the glob, verbatim.
Hence, a more robust approach is the following:
nameofkeyfile() {
fname=( "${PWD%/*/*}"/folder/* )
# Now check that there's at most one element in the array
if (( ${#fname[@]} > 1 )); then
echo "Oh no, there are too many files in your folder"
return 1
fi
# Now check that there is a file
if [[ ! -f ${fname[0]} ]]; then
echo "Oh no, there are no files in your folder"
return 1
fi
# Here, all is good!
echo "Your file is: $fname"
}
This uses Bash (named) arrays. If you want the function to be POSIX-compliant, it's rather straightforward since POSIX shells have an unnamed array (the positional parameters):
# POSIX-compliant version
nameofkeyfile() {
set -- "${PWD%/*/*}"/folder/*
# Now check that there's at most one element in the array
if [ "$#" -gt 1 ]; then
echo "Oh no, there are too many files in your folder"
return 1
fi
# Now check that there is a file
if [ ! -f "$1" ]; then
echo "Oh no, there are no files in your folder"
return 1
fi
# Here, all is good!
echo "Your file is: $1, I'll store it in variable fname for you"
fname=$1
}
I didn't strip the full path from the filename, but that's really easy (don't use basename for that!) [1]:
fname=${fname##*/}
More precisely: in the Bash version, you'd use:
fname=${fname[0]##*/}
and in the POSIX version you'd use:
fname=${1##*/}
[1] There's a catch when using parameter expansions to get the basename: the case of /. But it seems you won't be in this case, so it's all safe!
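For illustration, here is a minimal sketch of the expansion and of the / catch. The helper name strip_dir and the sample path are hypothetical.

```shell
#!/usr/bin/env bash
# strip_dir is a hypothetical helper that removes everything up to and
# including the last slash, using only parameter expansion.
strip_dir() {
  local path=$1
  printf '%s\n' "${path##*/}"
}

strip_dir '/home/user/my folder/key.pem'   # last path component only
strip_dir '/'                              # the catch: nothing is left
```

For the root directory the expansion yields an empty string, whereas the basename utility prints / in that case.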
To store the output of ls "$FOLDER" in a variable, use command substitution:
FILENAME=$(ls "$FOLDER")
Another problem is the printf.
It adds escaping backslashes in the string,
and when you try to list the directory in the next step,
those backslashes are used literally by the shell.
So drop the printf:
function nameofkeyfile() {
FOLDER="${PWD%/*/*}/folder/"
FILENAME=$(ls "$FOLDER")
FNAME=$(basename "$FILENAME")
}
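The effect of the extra backslashes can be seen in a small experiment. This is a sketch only: the directory name "my folder" is made up and lives under a scratch directory from mktemp.

```shell
#!/usr/bin/env bash
parent=$(mktemp -d)
folder="$parent/my folder"        # hypothetical directory name with a space
mkdir -- "$folder"

# printf %q produces a shell-escaped string, with a backslash before the space
escaped=$(printf %q "$folder")

# Inside double quotes (or [[ ]]) the backslash is no longer an escape;
# it is part of the name being looked up.
[[ -d $folder ]]  && plain_ok=yes   || plain_ok=no
[[ -d $escaped ]] && escaped_ok=yes || escaped_ok=no

printf 'plain:   %s\n' "$plain_ok"
printf 'escaped: %s\n' "$escaped_ok"
rm -rf -- "$parent"
```

Quoting the variable at the point of use is enough to protect the space, so no escaping step is needed.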
Lastly, it's better to use $(...) than `...`.
I have a bunch of files that need to be renamed and the new name is in a text file.
Example file name:
ASBC_Fishbone_Ia.pdf
Example entry in text file:
Ia. Propagation—Design Considerations
Expected new file name:
Ia. Propagation—Design Considerations.pdf
or
Ia._Propagation—Design_Considerations
What would be a good way of going about this using typical linux cli tools? I'm thinking some combination of ls, grep and rename?
You can try:
#!/bin/bash
# Do not allow the script to run if it's not Bash or Bash version is < 4.0 .
[ -n "$BASH_VERSION" ] && [[ BASH_VERSINFO -ge 4 ]] || exit 1
# Do not allow presenting glob pattern if no match is found.
shopt -s nullglob
# Use an associative array.
declare -A MAP=() || exit 1
while IFS=$'\t' read -r CODE NAME; do
# Maps name with code e.g. MAP['Ia']='Propagation—Design Considerations'
MAP[${CODE%.}]=$NAME
done < /path/to/text_file
# Change directory. Not needed if files are in current directory.
cd "/path/to/dir/containing/files" || exit 1
for FILE in *_*.pdf; do
# Get code from filename.
CODE=${FILE##*_} CODE=${CODE%.pdf}
# Skip if no code was extracted from file.
[[ -n $CODE ]] || continue
# Get name from map based from code.
NAME=${MAP[$CODE]}
# Skip if no new name was registered based on code.
[[ -n $NAME ]] || continue
# Generate new name.
NEW_NAME="${CODE}. $NAME.pdf"
# Replace spaces with _ at your preference. Uncomment if wanted.
# NEW_NAME=${NEW_NAME// /_}
# Rename file. Remove echo if you find it correct already.
echo mv -- "$FILE" "$NEW_NAME"
done
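The lookup step can be tried on its own before pointing the script at real files. The sketch below assumes, like the script above, that a tab separates the code from the name in the text file. The two sample entries and the helper name new_name_for are invented for this example.

```shell
#!/usr/bin/env bash
# Requires Bash 4.0 or later for associative arrays.
declare -A map=()

# Hypothetical sample data standing in for the text file; a tab separates
# the code from the name.
while IFS=$'\t' read -r code name; do
  map[${code%.}]=$name
done < <(printf 'Ia.\tDesign Considerations\nIb.\tSite Selection\n')

# new_name_for is a hypothetical helper: it prints the new name for a PDF
# whose trailing code is in the map, and returns non-zero otherwise.
new_name_for() {
  local file=$1 code
  code=${file##*_}
  code=${code%.pdf}
  [[ -n $code ]] || return 1
  [[ -n ${map[$code]:-} ]] || return 1
  printf '%s. %s.pdf\n' "$code" "${map[$code]}"
}

new_name_for 'ASBC_Fishbone_Ia.pdf'
```

If the real text file separates the code and the name with a space rather than a tab, the IFS value in the read loop would need to change accordingly.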
Background:
I have a bunch of filenames named username.sub in single letter directories under script_testing (first letter of username is the folder name). For every username.sub, I need to check if the line user.$username.contacts exists and, if not, append the line followed by a real tab.
Question:
Given the code I have below, why is it not appending to the file? I think I am missing something simple. I keep getting "contacts already subscribed" even if that line is not there.
#!/bin/bash
Path_to_files=/home/user/script_testing/^[A-z]+$/
FULLNAME="${Path_to_files##*/}"
NAME="${FULLNAME%.*}"
if grep 'contacts' $NAME.sub; then
echo 'contacts already subscribed'
else
echo "subscribing to contacts"
echo -e user.$NAME.Contacts \t >> $NAME.sub
fi
You're grepping for the word contacts - which, depending on what else you have in those files, may always be present.
Instead, use grep -q "^user\.$NAME\.Contacts" to look for your line.
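A rough sketch of the difference between the two checks, using a scratch file and a made-up user name alice. It assumes the user name contains no regular-expression metacharacters. The helper has_subscription is hypothetical.

```shell
#!/usr/bin/env bash
# has_subscription is a hypothetical helper: it succeeds only if FILE has a
# line starting with user.NAME.Contacts.
has_subscription() {
  local file=$1 name=$2
  grep -q "^user\.$name\.Contacts" "$file"
}

sub=$(mktemp)
# The file mentions the word "contacts" but has no Contacts subscription.
printf 'user.alice.Calendar\t\n# no contacts yet\n' > "$sub"

if grep -q 'contacts' "$sub"; then loose=found; else loose=missing; fi
if has_subscription "$sub" alice; then strict=found; else strict=missing; fi
printf 'loose=%s strict=%s\n' "$loose" "$strict"

# Append the line (followed by a real tab) only when it is missing.
has_subscription "$sub" alice || printf 'user.%s.Contacts\t\n' alice >> "$sub"
if has_subscription "$sub" alice; then after=found; else after=missing; fi
printf 'after append: %s\n' "$after"
rm -f -- "$sub"
```

Using printf with \t in the format string writes a real tab without depending on echo -e.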
Fixed with the following:
#!/bin/bash
#testing directory
p=$HOME/script_testing
for f in "$p"/*/*.sub ; do
# if this is a file
if [ -f "$f" ]; then
# define variables
F="${f##*/}"
u="${F%%.*}"
cont=$(grep "user.$u.Contacts" "$f")
cal=$(grep "user.$u.Calendar" "$f")
# if our file doesn't contain Contacts subscription
if [ -z "$cont" ]; then
# add Contacts subscription
echo -e "user.$u.Contacts\t" >> "$f"
#fi
# if our file doesn't contain Calendar subscription
elif [ -z "$cal" ]; then
# add Calendar subscription
echo -e "user.$u.Calendar\t" >> "$f"
fi
fi
done
Also added extra line(s) to append. Please, let me know if there is an issue with this so I can learn, but I haven't encountered any problems.