Generate multiple output files for loop - bash

I'm trying to generate a new output file from each existing file in a directory of .txt files. I want to check each file line by line for two substrings and append the lines that match either substring to the corresponding new output file.
I'm having trouble generating the new files.
This is what I currently have:
#!/bin/sh
# My first Script
success="(Compiling)\s\".*\"\s\-\s(Succeeded)"
failure="(Compiling)\s\".*\"\s\-\s(Failed)"
count_success=0
count_failure=0
for i in ~/Documents/reports/*;
do
while read -r line;
do
if [[$success=~$line]]; then
echo $line >> output_$i
count_success++
elif [[$failure=~$]]; then
echo $line >> output_$i
count_failure++
fi
done
done
echo "$count_success of jobs ran succesfully"
echo "$count_failure of jobs didn't work"
Any help would be appreciated, thanks

Please, use https://www.shellcheck.net/ to check your shell scripts.
If you use Visual Studio Code, you could install "ShellCheck" (by Timon Wong) extension.
About your program:
Assume bash
Define different extensions for input and output files (really important if they are in the same directory)
Loop over the report (input) files only
Clear the output file
Read the input file
if sequence:
if [[ ... ]] with a space after [[ and before ]]
spaces before and after operators (=~)
reverse the operand order for =~ (string to test on the left, pattern on the right)
Prevent globbing and word splitting with "..."
#! /bin/bash
# Input file extension
declare -r EXT_REPORT=".txt"
# Output file extension
declare -r EXT_OUTPUT=".output"
# RE
declare -r success="(Compiling)\s\".*\"\s\-\s(Succeeded)"
declare -r failure="(Compiling)\s\".*\"\s\-\s(Failed)"
# Counters
declare -i count_success=0
declare -i count_failure=0
for REPORT_FILE in ~/Documents/reports/*"${EXT_REPORT}"; do
# Clear output file
: > "${REPORT_FILE}${EXT_OUTPUT}"
# Read input file (see named file in "done" line)
while read -r line; do
# does the line match the success pattern ?
if [[ $line =~ $success ]]; then
echo "$line" >> "${REPORT_FILE}${EXT_OUTPUT}"
count_success+=1
# does the line match the failure pattern ?
elif [[ $line =~ $failure ]]; then
echo "$line" >> "${REPORT_FILE}${EXT_OUTPUT}"
count_failure+=1
fi
done < "$REPORT_FILE"
done
echo "$count_success of jobs ran successfully"
echo "$count_failure of jobs didn't work"
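The points about spacing and operand order can be checked in isolation. This is only a minimal sketch, assuming bash: the sample line is made up for illustration, and the pattern uses the POSIX class [[:space:]] instead of \s, which is a substitution made here for portability rather than something the script above requires.

```bash
#!/usr/bin/env bash
# Hypothetical sample line; real report lines may look different.
line='Compiling "main.c" - Succeeded'
# [[:space:]] replaces \s here (an assumption made for portability).
pattern='Compiling[[:space:]]".*"[[:space:]]-[[:space:]]Succeeded'

# Correct order: string to test on the left, unquoted pattern on the right.
if [[ $line =~ $pattern ]]; then
    correct_order="match"
else
    correct_order="no match"
fi

# Reversed order: the pattern text is tested against the line used as a regex.
if [[ $pattern =~ $line ]]; then
    reversed_order="match"
else
    reversed_order="no match"
fi

echo "correct order:  $correct_order"
echo "reversed order: $reversed_order"
```

With the operands reversed the test silently fails to match, which is why the original script never wrote anything even after the spacing was fixed.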

What about using grep?
success='Compiling\s".*"\s-\sSucceeded'
failure='Compiling\s".*"\s-\sFailed'
count_success=0
count_failure=0
for i in ~/Documents/reports/*.txt; do
# $i holds the full path, so strip the directory before adding the prefix
out="output_${i##*/}"
(( count_success += $(grep -E "$success" "$i" | tee "$out" | wc -l) ))
(( count_failure += $(grep -E "$failure" "$i" | tee -a "$out" | wc -l) ))
done
echo "$count_success of jobs ran successfully"
echo "$count_failure of jobs didn't work"

Related

bash 5 - continue while loop from a called function

I recently moved from bash 4.2 to 5.0 and don't understand why this function doesn't skip to the next iteration when called from a while loop:
alreadyInQueue ()
{
# skip files if already in queue
for task in "$HOME/.encode_queue/queue/"*
do
# Break if no task found
[[ -f "$task" ]] || break
# Sed line two from task file in queue (/dev/null error on empty queue)
line=$( sed '2q;d' "$task" 2>/dev/null )
# Store initial IFS
OLD_IFS=$IFS
# Extract tag and file from line
IFS='|' read nothing tag file <<< "$line"
# Restore IFS
IFS=$OLD_IFS
# Skip files already in queue with same preset (tag)
if [[ "$tag" == "${tag_prst[x]}" && "$file" == "$1" ]]; then
# Silent skip $2 argument: echo only if $2 = 1
[[ "$2" -eq "1" ]] && echo -e "\n** INFO ** Encode Queue, skip file already in queue:\n$i"
# Continue n
continue 2
fi
done
}
while loop calling function
# Find specified files
find "$job_path" "${args_files[@]}" | sort | while read -r i
do
# Extracts var from $i
fileSub
# Skip files already in queue
alreadyInQueue "$i" "1"
echo "i should be skipped"
done
The script echoes: ** INFO ** Encode Queue, skip file already in queue: ...
but doesn't continue to the next file iteration.
When continue is not executed inside a function call, it works:
# Find specified files
find "$job_path" "${args_files[@]}" | sort | while read -r i
do
# Extracts var from $i
fileSub
# Skip files already in queue
#alreadyInQueue "$i" "1"
# skip files if already in queue waiting to be encoded
for task in "$HOME/.encode_queue/queue/"*
do
# Break if no task found
[[ -f "$task" ]] || break
# Sed line two from task file in queue (/dev/null error on empty queue)
line=$( sed '2q;d' "$task" 2>/dev/null )
# Store initial IFS
OLD_IFS=$IFS
# Extract tag and file from line
IFS='|' read nothing tag file <<< "$line"
# Restore IFS
IFS=$OLD_IFS
# Skip files already in queue with same preset (tag)
if [[ "$tag" == "${tag_prst[x]}" && "$file" == "$i" ]]; then
# Silent skip $2 argument: echo only if $2 = 1
[[ "1" -eq "1" ]] && echo -e "\n** INFO ** Encode Queue, skip file already in queue:\n$i"
# Continue n
continue 2
fi
done
echo "i should be skipped"
done
help appreciated
This was a bug fix made in bash 4.4:
xx. Fixed a bug that could allow `break' or `continue' executed from shell
functions to affect loops running outside of the function.
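Since the function can no longer steer the caller's loop, one common way around it is to have the function report its finding through its exit status and let the caller decide to continue. The following is only a sketch of that idea: already_in_queue, queue and the file names are hypothetical stand-ins, not the asker's real function or data.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the queue directory contents.
queue=(b.mkv)

# Returns 0 (success) when the candidate is already queued, 1 otherwise.
already_in_queue() {
    local candidate=$1 task
    for task in "${queue[@]}"; do
        [[ $task == "$candidate" ]] && return 0
    done
    return 1
}

processed=()
for f in a.mkv b.mkv c.mkv; do
    # The continue lives in the loop itself, so it works on any bash version.
    already_in_queue "$f" && continue
    processed+=("$f")
done

printf '%s\n' "${processed[@]}"
```

Applied to the question, that would mean replacing continue 2 with return 0 inside the function and writing alreadyInQueue "$i" "1" && continue in the while loop.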

Read line by line from parameter file in function

I have a function with a parameter file. And I want to read it line by line.
Condition
If the lines are between <?bash and ?> then I do bash -c '$line' else I display the line.
Here my file (file):
<html><head></head><body><p>Hello
<?bash
echo "world !"
?>
</p></body></html>
Here my Bash script (bashtml):
#!/bin/bash
function generation()
{
while read line
do
if [ $line = '<?bash' ]
then
while [ $line != '?>' ]
do
bash -c '$line'
done
else
echo $line
fi
done
}
generation $file
I execute this script:
./bashhtml
I am a novice in Bash scripting and I'm lost.
I think this is what you mean. However, this code is HIGHLY DANGEROUS! Any command inserted into those bash tags would be executed under your user id. It could change your password, delete all your files, read or alter data, and so on. Don't do it!
#!/bin/bash
function generation
{
# If you don't use local (or declare) then variables are global
local file="$1" # Parameter passed to function, in a local variable
local start=false # A flag to indicate tags; must be lowercase, it is run as the true/false command below
local line
while read -r line
do
if [[ $line == '<?bash' ]]
then
start=true
elif [[ $line == '?>' ]]
then
start=false
elif "$start"
then
bash -c "$line" # Double quotes needed here
else
echo "$line"
fi
done < "$file" # Notice how the filename is redirected into read
}
infile="$1" # This gets the filename from the command-line
generation "$infile" # This calls the function
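Running each line with its own bash -c means a construct that spans several lines (a for loop, an if block) cannot work. A possible variation, sketched here under the same security caveat as above, buffers everything between the tags and runs it once. The function name render and the demo template are made up for this example.

```bash
#!/usr/bin/env bash
# WARNING: this still executes whatever appears between the tags.
render() {
    local file=$1 line block="" in_block=false
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == '<?bash' ]]; then
            in_block=true
            block=""
        elif [[ $line == '?>' ]]; then
            in_block=false
            # Run the whole buffered block once; keep it away from our stdin.
            bash -c "$block" < /dev/null
        elif "$in_block"; then
            block+="$line"$'\n'
        else
            printf '%s\n' "$line"
        fi
    done < "$file"
}

# Demo with a throwaway template (hypothetical content).
template=$(mktemp)
cat > "$template" <<'EOF'
<p>Hello
<?bash
for word in world again; do
    echo "$word !"
done
?>
</p>
EOF
render "$template"
rm -f "$template"
```

The /dev/null redirect is a design choice: without it, a command inside the block that reads standard input would consume the rest of the template.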

While loop does not execute

I currently have this code:
listing=$(find "$PWD")
fullnames=""
while read listing;
do
if [ -f "$listing" ]
then
path=`echo "$listing" | awk -F/ '{print $(NF)}'`
fullnames="$fullnames $path"
echo $fullnames
fi
done
For some reason, this script isn't working, and I think it has something to do with the way that I'm writing the while loop / declaring listing. Basically, the code is supposed to pull out the actual names of the files, i.e. blah.txt, from the find $PWD.
read listing does not read a value from the string listing; it sets the value of listing with a line read from standard input. Try this:
# Ignoring the possibility of file names that contain newlines
fullnames=()
while IFS= read -r; do
[[ -f $REPLY ]] || continue
path=${REPLY##*/}
fullnames+=( "$path" )
echo "${fullnames[@]}"
done < <( find "$PWD" )
With bash 4 or later, you can simplify this with
shopt -s globstar
for f in **/*; do
[[ -f $f ]] || continue
paths+=( "$f" )
done
fullnames=( "${paths[@]##*/}" )
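If the listing really has to live in a variable first, as in the question, it can still be fed to read through a here-string. A small sketch, with made-up paths standing in for the output of find:

```bash
#!/usr/bin/env bash
# Hypothetical listing; in the question this would be $(find "$PWD").
listing=$'/tmp/a/blah.txt\n/tmp/b/notes.md'

names=()
while IFS= read -r entry; do
    # Strip everything up to the last slash to keep only the file name.
    names+=( "${entry##*/}" )
done <<< "$listing"

printf '%s\n' "${names[@]}"
```

The here-string (<<<) puts the variable's contents on the loop's standard input, which is the part the original loop was missing.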

Reading a config file from a shell script

I am looking for a shell script analog to something like Python's ConfigParser or Perl's Config::INI. I have sourced files in the past to accomplish this, but I'd prefer to read rather than execute my "config file". Does anyone know of anything comparable to the above modules available for shell (or bash) scripts?
Thanks,
Jerry
You don't want to source it, so you should:
1. read the config, 2. verify the lines, 3. eval them
CONFIGFILE="/path/to/config"
echo "=$ADMIN= =$TODO= =$FILE=" #these variables are not defined here
eval "$(sed '/:/!d;/^ *#/d;s/:/ /;' < "$CONFIGFILE" | while read -r key val
do
#verify here
#...
str="$key='$val'"
echo "$str"
done)"
echo =$ADMIN= =$TODO= =$FILE= #here are defined
sample of config file
ADMIN: root
TODO: delete
var=badly_formatted_line_without_colon
#comment
FILE: /path/to/file
If you run the above sample, you should get (not tested):
== == ==
=root= =delete= =/path/to/file=
I'm sure this is not the best solution; maybe someone will post a nicer one.
You might want to take a look at cfget which can be installed with sudo apt-get install cfget.
#!/bin/bash
# Author: CJ
# Date..: 01/03/2013
## sample INI file save below to a file, replace "^I" with tab
#^I [ SECTION ONE ]
#TOKEN_TWO^I ="Value1 two "
#TOKEN_ONE=Value1 One
#TOKEN_THREE=^I"Value1^I three" # a comment string
#TOKEN_FOUR^I=^I"^IValue1 four"
#
#[SECTION_TWO]
#TOKEN_ONE=Value1 One ^I^I^I# another comment string
#TOKEN_TWO^I ="Value1 two "
#TOKEN_THREE=^I"Value1^I three"
#TOKEN_FOUR^I=^I"^IValue1 four"
## sample INI file
export INI= # allows access to the parsed INI values in toto by children
iniParse() {
# Make word separator Linefeed(\n)
OIFS="${IFS}"
IFS=$(echo)
SECTION=_
while read LINE; do {
IFS="${OIFS}"
# Skip blank lines
TMP="$(echo "${LINE}"|sed -e "s/^[ \t]*//")"
if [ 0 -ne ${#TMP} ]; then
# Ignore comment lines
if [ '#' == "${LINE:0:1}" -o '*' == "${LINE:0:1}" ]; then
continue
fi # if [ '#' == "${LINE:0:1}" -o '*' == "${LINE:0:1}" ]; then
# Section label
if [ "[" == "${LINE:0:1}" ]; then
LINE="${LINE/[/}"
LINE="${LINE/]/}"
LINE="${LINE/ /_}"
SECTION=$(echo "${LINE}")_
else
LINE="$(echo "${LINE}"|sed -e "s/^[ \t]*//")"
LINE="$(echo "${LINE}"|cut -d# -f1)"
TOKEN="$(echo "${LINE:0}"|cut -d= -f1)"
EQOFS=${#TOKEN}
TOKEN="$(echo "${TOKEN}"|sed -e "s/[ \t]*//g")"
VALUE="${LINE:${EQOFS}}"
VALUE="$(echo "${VALUE}"|sed -e "s/^[ \t=]*//")"
VALUE="$(echo "${VALUE}"|sed -e "s/[ \t]*$//")"
if [ "${VALUE:0:1}" == '"' ]; then
echo -n "${SECTION}${TOKEN}=${VALUE}"
echo -e "\r"
else
echo -n "${SECTION}${TOKEN}="\"${VALUE}\"""
echo -e "\r"
fi # if [ "${VALUE:0:1}" == '"' ]; then
fi # if [ "[" == "${LINE:0:1}" ]; then
fi # if [ 0 -ne ${#TMP} ]; then
IFS=$(echo)
} done <<< "$1"
IFS="${OIFS}" # restore original IFS value
} # iniParse()
# call this function with the INI filespec
iniReader() {
if [ -z "$1" ]; then return 1; fi
TMPINI="$(<$1)"
TMPINI="$(echo "${TMPINI}"|sed -e "s/\r//g")"
TMPINI="$(echo "${TMPINI}"|sed -e "s/[ \t]*\[[ \t]*/[/g")"
TMPINI="$(echo "${TMPINI}"|sed -e "s/[ \t]*\][ \t]*/]/g")"
INI=`iniParse "${TMPINI}"`
INI="$(echo "${INI}"|sed -e "s/\r/\n/g")"
eval "${INI}"
return 0
} # iniReader() {
# sample usage
if iniReader $1 ; then
echo INI read, exit_code $? # exit_code == 0
cat <<< "${INI}"
cat <<< "${SECTION_ONE_TOKEN_FOUR}"
cat <<< "${SECTION_ONE_TOKEN_THREE}"
cat <<< "${SECTION_TWO_TOKEN_TWO}"
cat <<< "${SECTION_TWO_TOKEN_ONE}"
else
echo usage: $0 filename.ini
fi # if iniReader $1 ; then
grep based alternative seems to be more readable:
CONFIG_FILE='/your/config/file.ini'
eval "$(grep -v '^\[\|^#' "$CONFIG_FILE" | while read -r line
do echo "$line"
done)"
Where:
-v grep option means exclude matching lines
^\[\|^# selects all lines which start with [ or # (configparser sections and comments)
It will work ONLY if your config file doesn't have spaces around = (if you would like to generate config with Python use space_around_delimiters=False see https://docs.python.org/3/library/configparser.html#configparser.ConfigParser.write)
Supported config example:
FIRST_VAR="a"
[lines started with [ will be ignored]
secondvar="b"
# some comment
anotherVar="c"
You can use bash itself to interpret INI values:
$ source <(grep = file.ini)
Sample file:
[section-a]
var1=value1
var2=value2
See more examples: How do I grab an INI value within a shell script?
Or you can use bash ini-parser which can be found at The Old School DevOps blog site.
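Since the question asks to read the file rather than execute it, here is a rough sketch that stores the values in an associative array instead of calling eval or source. It assumes bash 4 or later and key=value lines without spaces around the equals sign; parse_ini, config and the section.key naming are choices made for this example, not an existing tool.

```bash
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
declare -A config=()
section_re='^\[(.*)\]$'

parse_ini() {
    local line key value section=""
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}                  # tolerate CRLF line endings
        # Skip blank lines and comment lines.
        [[ -z $line || $line == '#'* || $line == ';'* ]] && continue
        if [[ $line =~ $section_re ]]; then
            section=${BASH_REMATCH[1]}
            continue
        fi
        [[ $line == *=* ]] || continue      # ignore lines without '='
        key=${line%%=*}
        value=${line#*=}
        config["$section.$key"]=$value      # stored as data, never executed
    done < "$1"
}

# Demo with the sample file shown above, written to a temporary file.
ini=$(mktemp)
printf '%s\n' '[section-a]' 'var1=value1' 'var2=value2' > "$ini"
parse_ini "$ini"
rm -f "$ini"

echo "${config[section-a.var1]}"
echo "${config[section-a.var2]}"
```

Because nothing is evaluated, a value such as $(rm -rf ~) ends up as a plain string in the array. Trimming whitespace and stripping quotes are left out to keep the sketch short.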

Shell script to validate logger date format in log file

I need to validate my log files:
-All new log lines shall start with date.
-This date will respect the ISO 8601 standard. Example:
2011-02-03 12:51:45,220Z -
Using shell script, I can validate it looping on each line and verifying the date pattern.
The code is below:
#!/bin/bash
processLine(){
# get all args
line="$@"
result=`echo $line | egrep "[0-9]{4}-[0-9]{2}-[0-9]{2} [012][0-9]:[0-9]{2}:[0-9]{2},[0-9]{3}Z" -a -c`
if [ "$result" == "0" ]; then
echo "The log is not with correct date format: "
echo $line
exit 1
fi
}
# Make sure we get file name as command line argument
if [ "$1" == "" ]; then
echo "You must enter a logfile"
exit 0
else
file="$1"
# make sure file exist and readable
if [ ! -f $file ]; then
echo "$file : does not exists"
exit 1
elif [ ! -r $file ]; then
echo "$file: can not read"
exit 2
fi
fi
# Set loop separator to end of line
BAKIFS=$IFS
IFS=$(echo -en "\n\b")
exec 3<&0
exec 0<"$file"
while read -r line
do
# use $line variable to process line in processLine() function
processLine $line
done
exec 0<&3
# restore $IFS which was used to determine what the field separators are
IFS=$BAKIFS
echo SUCCESS
But there is a problem. Some log entries span more than one line. A stack trace is one example, but it can be anything. Stack trace example:
2011-02-03 12:51:45,220Z [ERROR] - File not found
java.io.FileNotFoundException: fred.txt
at java.io.FileInputStream.<init>(FileInputStream.java)
at java.io.FileInputStream.<init>(FileInputStream.java)
at ExTest.readMyFile(ExTest.java:19)
at ExTest.main(ExTest.java:7)
...
will not pass with my script, but is valid!
So if I run my script on a log file that contains stack traces, for example, it will fail, because it checks line by line.
I have the correct pattern and I need to validate the logger date format, but I don't have a pattern for the lines that should be skipped.
I don't know how to solve this problem. Can somebody help me?
Thanks
You need to anchor your search for the date to the start of the line (otherwise the date could appear anywhere in the line - not just at the beginning).
The following snippet will loop over all lines that do not begin with a valid date. You still have to determine if the lines constitute errors or not.
DATEFMT='^[0-9]{4}-[0-9]{2}-[0-9]{2} [012][0-9]:[0-9]{2}:[0-9]{2},[0-9]{3}Z'
grep -Ev "${DATEFMT}" /path/to/log | while IFS= read -r LINE; do
echo "${LINE}" # did not begin with date.
done
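To see what the anchor changes, the same pattern can be tried with bash's own regex operator on a few sample lines. This is just an illustration: starts_with_date is a made-up helper, and the last sample line is invented to show a date that appears mid-line.

```bash
#!/usr/bin/env bash
date_re='^[0-9]{4}-[0-9]{2}-[0-9]{2} [012][0-9]:[0-9]{2}:[0-9]{2},[0-9]{3}Z'

# Succeeds only when the line begins with a timestamp in the expected format.
starts_with_date() {
    [[ $1 =~ $date_re ]]
}

entries=0
others=0
while IFS= read -r line; do
    if starts_with_date "$line"; then
        entries=$((entries + 1))
    else
        others=$((others + 1))
    fi
done <<'EOF'
2011-02-03 12:51:45,220Z [ERROR] - File not found
java.io.FileNotFoundException: fred.txt
    at ExTest.main(ExTest.java:7)
see 2011-02-03 12:51:45,220Z mentioned mid-line
EOF

echo "dated entries: $entries, other lines: $others"
```

Without the leading ^ the last sample line would be counted as a dated entry, because the timestamp appears somewhere in it.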
So just (silently) discard a single stack trace. In somewhat verbose bash:
STATE=idle
while read -r line; do
case $STATE in
idle)
if [[ $line =~ ^java\..*Exception ]]; then
STATE=readingexception
else
processLine "$line"
fi
;;
readingexception)
if ! [[ $line =~ ^' '*'at ' ]]; then
STATE=idle
processLine "$line"
fi
;;
*)
echo "Urk! internal error [$STATE]" >&2
exit 1
;;
esac
done <logfile
This relies on processLine not continuing on error, else you will need to track a tad more state to avoid two consecutive stack traces.
This makes two assumptions:
1. Lines that begin with whitespace (a leading space or a leading tab) are continuations of previous lines.
2. Lines that have non-whitespace characters starting at ^ are new log lines.
If a line matching #2 doesn't match the date format, we have an error, so print the error and include the line number.
count=0
processLine() {
count=$(( count + 1 ))
line="$@"
result=$( echo "$line" | grep -E -a -c '^[0-9]{4}-[0-9]{2}-[0-9]{2} [012][0-9]:[0-9]{2}:[0-9]{2},[0-9]{3}Z' )
if (( result == 0 )); then
# if result = 0, then my line did not start with the proper date.
# if the line starts with whitespace, then it may be a continuation
# of a multi-line log entry (like a java stacktrace)
continues=$( echo "$line" | grep -E -a -c '^[[:space:]]' )
if (( continues == 0 )); then
# if we got here, then the line did not start with a proper date,
# AND the line did not start with white space. This is a bad line.
echo "The line is not with correct date format: "
echo "$count: $line"
exit 1
fi
fi
}
Create a condition to check if the line starts with a date. If not, skip that line as it is part of a multi-line log.
processLine(){
# get all args
line="$@"
result=`echo "$line" | grep -E -a -c "^[0-9]{4}-[0-9]{2}-[0-9]{2} [012][0-9]:[0-9]{2}:[0-9]{2},[0-9]{3}Z"`
if [ "$result" == "0" ]; then
echo "Log entry is multi-lined - continuing."
fi
}

Resources