Changing words in text files using multiple dictionaries


I have a bunch of files which need to be translated using custom dictionaries. Each file contains a line indicating which dictionary to use. Here's an example:
*A:
!
=1
*>A_intro
1r
=2
1r
=3
1r
=4
1r
=5
2A:maj
*-
In the file above, *A: indicates to use dictA.
I can translate this part easily using the following syntax:
sed -f dictA < myfile
My problem is that some files require a change of dictionary half way in the text. For example:
*B:
1B:maj
2E:maj/5
2B:maj
2E:maj/5
*C:
2F:maj/5
2C:maj
2F:maj/5
2C:maj
*-
I would like to write a script to automate the translation process. Using this example, I would like the script to read the first line, select dictB, use dictB to translate each line until it reads *C:, select dictC, and then keep going.

Thanks @Cyrus. That was useful. Here's what I ended up doing.
#!/bin/bash
key="sedDictNull.txt"
while read -r line || [ -n "$line" ] ## Makes sure that the last line is read. See http://stackoverflow.com/questions/12916352/shell-script-read-missing-last-line
do
if [[ $line =~ ^\*[Aa]:$ ]]
then
key="sedDictA.txt"
elif [[ $line =~ ^\*[Aa]#:$ ]]
then
key="sedDictA#.txt"
fi
echo "$line" | sed -f "$key"
done < "$1"
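The if/elif chain above grows by one branch per dictionary. As a rough sketch (not the poster's code), the key can instead be taken from the marker line itself. The sedDict<KEY>.txt naming follows the script above; the function name translate and the optional dictionary directory argument are assumptions made for illustration.

```bash
#!/bin/bash
# translate FILE [DICT_DIR]: apply the sed script named by the most recent
# marker line (*A:, *A#:, ...) to that line and every following line.
# Assumption: dictionaries are stored as DICT_DIR/sedDict<KEY>.txt.
translate() {
    local input=$1 dict_dir=${2:-.} dict="" line key
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            \*[A-Za-z]:|\*[A-Za-z]\#:)
                key=${line#\*}          # drop the leading *
                key=${key%:}            # drop the trailing :
                dict=$dict_dir/sedDict$key.txt
                ;;
        esac
        if [ -n "$dict" ] && [ -f "$dict" ]; then
            printf '%s\n' "$line" | sed -f "$dict"
        else
            printf '%s\n' "$line"       # no usable dictionary yet: pass through
        fi
    done < "$input"
}
```

Called as translate myfile /path/to/dicts, it prints the translated text to standard output. It still starts one sed process per input line, so it is only practical for small files.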

I assume your "dictionaries" are really sed scripts that search and replace, like this:
s/2C/nothing/;
s/2B/something/;
You could reorganize these scripts into sections, like this:
/^\*B:/, /^\*[^B]/ {
s/1B/whatever/;
s/2B/something/;
}
/^\*C:/, /^\*[^C]/ {
s/2C/nothing/;
s/2B/something/;
}
And, of course, you could do that on the fly:
for dict in B C
do echo "/^\\*$dict:/, /^\\*[^$dict]/ {"
cat dict.$dict
echo "}"
done | sed -f- dict.in

Related

Bash script to add double quotes in .CSV comma delimited file

I need to add double quotes to the csv file. My sample data is like this..
378478,COMPLETED,Tracfone,,,"2020/03/29 09:39:22",,2787,,356074101197544,89148000005748235454,75176540
378328,COMPLETED,"Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)",50,"2020/03/29 06:10:01",200890899011202395,0899,0279395,356058102052972,89148000005117597971,67756296
I tried some awk and sed code that I found online, but the result is wrong, as shown below. The first digit of the first number is trimmed: for example, '378478' is displayed as '78478'.
It also adds double quotes around fields that are already quoted. Nothing seems to work perfectly. Please guide me!
"78478","COMPLETED","Tracfone","","",""2020/03/29 09:39:22"","","2787","","356074101197544","89148000005748235454","75176540"
"78328","COMPLETED",""Total Wireless"",""Unlimited Talk"," Text"," & Data (First 25GB High Speed"," then unlimited 2GB)"","50",""2020/03/29 06:10:01"","200890899011202395","0899","0279395","356058102052972","89148000005117597971","67756296"
"78329","COMPLETED",""Cricket Wireless"",""Unlimited Talk"," Text"," & 4G LTE Data w/ 15GB Hotspot"","60",""2020/03/29""
This is the code I am using:
awk -F"'?,'?" -v OFS='","' '{$1=$1; gsub(/^.|$/,"\"")} 1' file # or
sed -E 's/([^,]*) , (.*)/"\1" , "\2"/' file
My full code is below. My intention was to first convert every .xlsx file to .csv, then add double quotes to that csv and save it under the same name. I know the $file.csv part is wrong, so I need some help.
find "$Src_Dir" -type f -iname "*.xlsx" -print>path/temp
cat path/temp | while IFS="" read -r -d $'\0' file;
do
echo $file
ssconvert "${file}" --export-type=Gnumeric_stf:stf_csv
awk -F"'?,'?" -v OFS='","' '{$1=$1; gsub(/^.|$/,"\"")} 1' $file > $file.csv
done

If you want to handle anything other than the simplest CSV files, you should probably move away from sed and awk. There are much better tools available.
For example, if you sudo apt install csvtool (or equivalent) on your favourite distro, you can use its call-per-line functionality to process each line in the input file. See the following script for an example:
#!/bin/bash
function quotify {
# Start empty line, process every field.
line=""
while [[ $# -ne 0 ]] ; do
# Append comma for all but first field, then quoted field.
[[ -n "${line}" ]] && line="${line},"
line="${line}\"$1\""
shift
done
# Output the fully quoted line.
echo "${line}"
}
# Needed to call functions. Also, ensure link: /bin/sh -> /bin/bash.
export -f quotify
# Pretty-print input and output.
echo "Input file:"
sed 's/^/ /' inputFile.csv
echo "Output file:"
csvtool call quotify inputFile.csv | sed 's/^/ /'
Note the quotify function which is called for each line in the CSV file, with the arguments set to each field within that line (sans quotes, whether the original fields had quotes or not).
It basically constructs a string of all the fields in the line, with quotes around them, then writes that to standard output, as shown below in the output from that script:
Input file:
378478,COMPLETED,Tracfone,,,"2020/03/29 09:39:22",,2787,,356074101197544,89148000005748235454,75176540
378328,COMPLETED,"Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)",50,"2020/03/29"
Output file:
"378478","COMPLETED","Tracfone","","","2020/03/29 09:39:22","","2787","","356074101197544","89148000005748235454","75176540"
"378328","COMPLETED","Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)","50","2020/03/29"
Even though using a separate tool is probably the easiest way to go, if you absolutely cannot install other packages, then you're going to have to code up something in a package you already have. The following bash script is a good place to start, as it uses no other tools to achieve its goal.
At the moment, it's tied to a very specific set of rules, as follows:
White space matters. Anything between the commas is considered part of the field. This especially matters when detecting a quoted field, it must have the quote as the first character, no abc, "d,e,f",ghi stuff since the "d,e,f" won't be handled correctly.
Quoted fields are allowed to contain commas, and "" sequences within them are turned into ".
It's probably not a good idea to supply ill-formatted CSV files :-)
But, with that in mind, here we go. I'll offer a brief textual description of each section but hopefully the comments in the code will be enough to figure out what's going on.
First, a function for finding the position of some string within another string, useful for working out the field bounds:
function findPos {
haystack="$1"
needle="$2"
# Remove everything past the needle.
prefix="${haystack%%${needle}*}"
# If nothing was removed, it wasn't found, so supply massive number.
# Otherwise, it was found at the length of the string with removed stuff.
position=999999
[[ ${#prefix} -ne ${#haystack} ]] && position=${#prefix}
echo ${position}
}
Then we can use that in the function that works out the length of the next field. This basically just looks for the next comma for unquoted fields, and does special handling for quoted fields by building up the field from segments (it has to handle quotes within quotes and commas):
function getNextFieldLen {
line="$1"
# Empty line means all work done.
[[ -z "${line}" ]] && echo -1 && return
# Handle unquoted first, this is easy.
[[ "${line:0:1}" != '"' ]] && { echo $(findPos "${line}" ","); return; }
# Now handle quoted. Loop over all segments where a segment is defined as
# the text up to the next <"">, assuming it's before the next <",>.
field=""
nextQuoteComma=$(findPos "${line}" '",')
nextDoubleQuote=$(findPos "${line}" '""')
while [[ ${nextDoubleQuote} -lt ${nextQuoteComma} ]]; do
# Append segment to the field and go back for next segment.
field="${field}${line:0:${nextDoubleQuote}}\"\""
line="${line:${nextDoubleQuote}}"
line="${line:2}"
nextQuoteComma=$(findPos "${line}" '",')
nextDoubleQuote=$(findPos "${line}" '""')
done
# Add final segment (up to the comma) and output entire field.
field="${field}${line:0:${nextQuoteComma}}\""
echo "${#field}"
}
Finally, there's the top-level function which will quotify whatever comes in via standard input:
function quotifyStdIn {
# Process file line by line.
while read -r line; do
# Start with empty output line and non-comma separator.
outLine="" ; sep=""
# Place terminator to make processing easier, start field loop.
line="${line},"
fieldLen=$(getNextFieldLen "${line}")
while [[ ${fieldLen} -ge 0 ]]; do
# Get field and quotify if needed, adjust line (remove field and comma).
field="${line:0:${fieldLen}}"
[[ "${field:0:1}" = '"' ]] || field="\"${field}\""
line="${line:$((fieldLen+1))}"
#line="${line:${fieldLen}}"
#line="${line:1}"
# Append to output line and prepare for next field.
outLine="${outLine}${sep}${field}"; sep=","
fieldLen=$(getNextFieldLen "${line}")
done
# Output built line.
echo "${outLine}"
done
}
And, on the off-chance you want to read directly from a file (though providing a file name that's empty or "-" will use standard input so you can probably just use the file-based function for everything):
function quotifyFile {
file="$1"
# Empty file or "-" means standard input, otherwise take input from real file.
[[ ${#file} -eq 0 ]] && { quotifyStdIn; return; }
[[ "${file}" = "-" ]] && { quotifyStdIn; return; }
quotifyStdIn < "${file}"
}
And, finally, because every program that's not a "Hello, world" one deserves some form of test harness, this is what you can use to test the various capabilities:
(
echo 'paxdiablo,was here'
echo 'and,"then, strangely,",he,was,not'
echo '50,"My name is ""Pax"", and yours is ""Bob""",42'
echo '17,"""Love"" is grand",19'
) > harness.csv
echo "Before:"
sed "s/^/ /" harness.csv
echo "After:"
quotifyFile harness.csv | sed "s/^/ /"
rm -rf harness.csv
And, since a test harness is of little use unless you run the tests, here's the results of the first run:
Before:
paxdiablo,was here
and,"then, strangely,",he,was,not
50,"My name is ""Pax"", and yours is ""Bob""",42
17,"""Love"" is grand",19
After:
"paxdiablo","was here"
"and","then, strangely,","he","was","not"
"50","My name is ""Pax"", and yours is ""Bob""","42"
"17","""Love"" is grand","19"
Hopefully, that will be enough to get you going in the absence of being able to install packages. Of course, if one of the packages you can't install is bash itself, then you have problems that I can't help you with :-)
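To run a quoting step like the ones above over every converted file, the loop from the question needs two changes. File names must be NUL-delimited on both sides (find -print0 paired with read -d ''), and the output name should replace the extension rather than append to it. A minimal sketch, assuming bash; for_each_xlsx and its callback convention are hypothetical names, not part of any tool mentioned here:

```bash
#!/bin/bash
# for_each_xlsx DIR CMD [ARGS...]
# Runs: CMD ARGS... <path/to/name.xlsx> <path/to/name.csv>
# for every .xlsx file below DIR. CMD stands for whatever does the
# conversion and quoting; it is a placeholder supplied by the caller.
for_each_xlsx() {
    local dir=$1 file
    shift
    # File names arrive on fd 3 so CMD can still use standard input.
    while IFS= read -r -d '' file <&3; do
        "$@" "$file" "${file%.*}.csv"   # swap the extension, don't append to it
    done 3< <(find "$dir" -type f -iname '*.xlsx' -print0)
}
```

For example, for_each_xlsx "$Src_Dir" echo prints each source and target pair, which is a cheap way to check the names before plugging in a real conversion command.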

Your starting CSV is not well formed: the two rows have different numbers of columns.
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
| 378478 | COMPLETED | Tracfone | - | - | 2020/03/29 09:39:22 | - | 2787 | - | 356074101197544 | 89148000005748235454 | 75176540 |
| 378328 | COMPLETED | Total Wireless | Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB) | 50 | 2020/03/29 | - | - | - | - | - | - |
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
Using Miller (https://github.com/johnkerl/miller) you could run
mlr --csv --quote-all -N unsparsify input >output
to have
"378478","COMPLETED","Tracfone","","","2020/03/29 09:39:22","","2787","","356074101197544","89148000005748235454","75176540"
"378328","COMPLETED","Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)","50","2020/03/29","","","","","",""
You can use it by downloading the executable from https://github.com/johnkerl/miller/releases/tag/v5.7.0

How can I extract a value from .ini using sed [duplicate]

I have a parameters.ini file, such as:
[parameters.ini]
database_user = user
database_version = 20110611142248
I want to read in and use the database version specified in the parameters.ini file from within a bash shell script so I can process it.
#!/bin/sh
# Need to get database version from parameters.ini file to use in script
php app/console doctrine:migrations:migrate $DATABASE_VERSION
How would I do this?

How about grepping for that line then using awk
version=$(awk -F "=" '/database_version/ {print $2}' parameters.ini)
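With the sample file, where the line reads database_version = 20110611142248, $2 keeps the space that follows the = sign. The pattern also matches any line that merely contains database_version. A hedged sketch that compares the exact key and trims both sides; ini_get_trimmed is a made-up name:

```bash
# ini_get_trimmed FILE KEY: print the value of the first line whose key
# (the text before the first =, with surrounding blanks removed) equals KEY.
ini_get_trimmed() {
    awk -v key="$2" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            eq = index($0, "=")                 # position of the first =
            if (eq == 0) next                   # not a key=value line
            if (trim(substr($0, 1, eq - 1)) == key) {
                print trim(substr($0, eq + 1))
                exit
            }
        }' "$1"
}
```

Usage would look like version=$(ini_get_trimmed parameters.ini database_version). Sections are ignored, so the first matching key in the file wins.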

You can use bash's native parser to interpret ini values:
$ source <(grep = file.ini)
Sample file:
[section-a]
var1=value1
var2=value2
IPS=( "1.2.3.4" "1.2.3.5" )
To access the variables, simply print them: echo $var1. You may also use arrays as shown above (echo ${IPS[@]}).
If you only want a single value just grep for it:
source <(grep var1 file.ini)
For the demo, check this recording at asciinema.
It is simple, as you don't need any external library to parse the data, but it comes with some disadvantages. For example:
If you have spaces around = (between the variable name and the value), you have to trim them first, e.g.
$ source <(grep = file.ini | sed 's/ *= */=/g')
Or if you don't care about the spaces (including in the middle), use:
$ source <(grep = file.ini | tr -d ' ')
To support ; comments, replace them with #:
$ source <(sed "s/;/#/g" foo.ini)
Sections aren't supported (e.g. if you have [section-name], you have to filter it out as shown above with grep =); the same goes for other unexpected input.
If you need to read a specific value under a specific section, use grep -A, sed, awk or ex.
E.g.
source <(grep = <(grep -A5 '\[section-b\]' file.ini))
Note: -A5 sets the number of rows to read after the section header. Replace source with cat to debug.
If you've got any parsing errors, ignore them by adding: 2>/dev/null
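grep -A5 only works when you know how many lines the section has. As an alternative sketch (an assumption, not part of the answer above), awk can print the key=value lines of one section whatever its length; ini_section is a hypothetical helper name:

```bash
# ini_section FILE SECTION: print the key=value lines that appear between
# the [SECTION] header and the next section header (or end of file).
ini_section() {
    awk -v want="[$2]" '
        /^\[/ { in_section = ($0 == want); next }   # any header switches state
        in_section && /=/ { print }
    ' "$1"
}
```

In bash the output can then be loaded with source <(ini_section file.ini section-b), subject to the same caveats about spaces and comments listed above.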
See also:
How to parse and convert ini file into bash array variables? at serverfault SE
Are there any tools for modifying INI style files from shell script

Sed one-liner, that takes sections into account. Example file:
[section1]
param1=123
param2=345
param3=678
[section2]
param1=abc
param2=def
param3=ghi
[section3]
param1=000
param2=111
param3=222
Say you want param2 from section2. Run the following:
sed -nr "/^\[section2\]/ { :l /^param2[ ]*=/ { s/[^=]*=[ ]*//; p; q;}; n; b l;}" ./file.ini
will give you
def

Bash does not provide a parser for these files. Obviously you can use an awk command or a couple of sed calls, but if you are a bash priest and don't want to use anything else, then you can try the following obscure code:
#!/usr/bin/env bash
cfg_parser ()
{
ini="$(<$1)" # read the file
ini="${ini//[/\[}" # escape [
ini="${ini//]/\]}" # escape ]
IFS=$'\n' && ini=( ${ini} ) # convert to line-array
ini=( ${ini[*]//;*/} ) # remove comments with ;
ini=( ${ini[*]/\ =/=} ) # remove tabs before =
ini=( ${ini[*]/=\ /=} ) # remove tabs after =
ini=( ${ini[*]/\ =\ /=} ) # remove anything with a space around =
ini=( ${ini[*]/#\\[/\}$'\n'cfg.section.} ) # set section prefix
ini=( ${ini[*]/%\\]/ \(} ) # convert text2function (1)
ini=( ${ini[*]/=/=\( } ) # convert item to array
ini=( ${ini[*]/%/ \)} ) # close array parenthesis
ini=( ${ini[*]/%\\ \)/ \\} ) # the multiline trick
ini=( ${ini[*]/%\( \)/\(\) \{} ) # convert text2function (2)
ini=( ${ini[*]/%\} \)/\}} ) # remove extra parenthesis
ini[0]="" # remove first element
ini[${#ini[*]} + 1]='}' # add the last brace
eval "$(echo "${ini[*]}")" # eval the result
}
cfg_writer ()
{
IFS=' '$'\n'
fun="$(declare -F)"
fun="${fun//declare -f/}"
for f in $fun; do
[ "${f#cfg.section}" == "${f}" ] && continue
item="$(declare -f ${f})"
item="${item##*\{}"
item="${item%\}}"
item="${item//=*;/}"
vars="${item//=*/}"
eval $f
echo "[${f#cfg.section.}]"
for var in $vars; do
echo $var=\"${!var}\"
done
done
}
Usage:
# parse the config file called 'myfile.ini', with the following
# contents::
# [sec2]
# var2='something'
cfg_parser 'myfile.ini'
# enable section called 'sec2' (in the file [sec2]) for reading
cfg.section.sec2
# read the content of the variable called 'var2' (in the file
# var2=XXX). If your var2 is an array, then you can use
# ${var[index]}
echo "$var2"
Bash ini-parser can be found at The Old School DevOps blog site.

Just source your .ini file in the body of your bash script:
File example.ini:
DBNAME=test
DBUSER=scott
DBPASSWORD=tiger
File example.sh
#!/bin/bash
#Including .ini file
. example.ini
#Test
echo "${DBNAME} ${DBUSER} ${DBPASSWORD}"

All of the solutions I've seen so far also hit on commented-out lines. This one doesn't, if the comment character is ;:
awk -F '=' '{if (! ($0 ~ /^;/) && $0 ~ /database_version/) print $2}' file.ini

You may use crudini tool to get ini values, e.g.:
DATABASE_VERSION=$(crudini --get parameters.ini '' database_version)

One more possible solution:
dbver=$(sed -n 's/.*database_version *= *\([^ ]*.*\)/\1/p' < parameters.ini)
echo $dbver

Display the value of my_key in an ini-style my_file:
sed -n -e 's/^\s*my_key\s*=\s*//p' my_file
-n -- do not print anything by default
-e -- execute the expression
s/PATTERN//p -- display anything following this pattern
In the pattern:
^ -- pattern begins at the beginning of the line
\s -- whitespace character
* -- zero or many (whitespace characters)
Example:
$ cat my_file
# Example INI file
something = foo
my_key = bar
not_my_key = baz
my_key_2 = bing
$ sed -n -e 's/^\s*my_key\s*=\s*//p' my_file
bar
So:
Find a pattern where the line begins with zero or many whitespace characters,
followed by the string my_key, followed by zero or many whitespace characters, an equal sign, then zero or many whitespace characters again. Display the rest of the content on that line following that pattern.
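Note that \s is a GNU sed extension. Here is a sketch of the same substitution using the POSIX class [[:space:]], wrapped in a function so the key can be passed in. ini_get is a hypothetical name, and the key is assumed to contain no regular-expression metacharacters or slashes:

```bash
# ini_get KEY FILE: print whatever follows "KEY =" on lines that start
# with KEY (optionally indented).
ini_get() {
    sed -n -e "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2"
}
```

With the example file above, ini_get my_key my_file prints bar. The lines for not_my_key and my_key_2 are skipped because the key must start the line and be followed directly by optional whitespace and the equal sign.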

Similar to the other Python answers, you can do this using the -c flag to execute a sequence of Python statements given on the command line:
$ python3 -c "import configparser; c = configparser.ConfigParser(); c.read('parameters.ini'); print(c['parameters.ini']['database_version'])"
20110611142248
This has the advantage of requiring only the Python standard library and the advantage of not writing a separate script file.
Or use a here document for better readability, thusly:
#!/bin/bash
python3 << EOI
import configparser
c = configparser.ConfigParser()
c.read('params.txt')
print(c['chassis']['serialNumber'])
EOI
serialNumber=$(python3 << EOI
import configparser
c = configparser.ConfigParser()
c.read('params.txt')
print(c['chassis']['serialNumber'])
EOI
)
echo $serialNumber

sed
You can use sed to parse the ini configuration file, especially when you've section names like:
# last modified 1 April 2001 by John Doe
[owner]
name=John Doe
organization=Acme Widgets Inc.
[database]
# use IP address in case network name resolution is not working
server=192.0.2.62
port=143
file=payroll.dat
so you can use the following sed script to parse above data:
# Configuration bindings found outside any section are given to
# to the default section.
1 {
x
s/^/default/
x
}
# Lines starting with a #-character are comments.
/#/n
# Sections are unpacked and stored in the hold space.
/\[/ {
s/\[\(.*\)\]/\1/
x
b
}
# Bindings are unpacked and decorated with the section
# they belong to, before being printed.
/=/ {
s/^[[:space:]]*//
s/[[:space:]]*=[[:space:]]*/|/
G
s/\(.*\)\n\(.*\)/\2|\1/
p
}
this will convert the ini data into this flat format:
owner|name|John Doe
owner|organization|Acme Widgets Inc.
database|server|192.0.2.62
database|port|143
database|file|payroll.dat
so it'll be easier to parse using sed, awk or read by having section names in every line.
Credits & source: Configuration files for shell scripts, Michael Grünewald
Alternatively, you can use this project: chilladx/config-parser, a configuration parser using sed.
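Once the data is in the flat section|key|value form, read can split each line on |. A minimal sketch of a lookup over that format; lookup_flat is a hypothetical helper, and the flat lines are assumed to arrive on standard input:

```bash
# lookup_flat SECTION KEY: read section|key|value lines from standard input
# and print the value of the first line matching SECTION and KEY.
lookup_flat() {
    local want_section=$1 want_key=$2 section key value
    while IFS='|' read -r section key value; do
        if [ "$section" = "$want_section" ] && [ "$key" = "$want_key" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1    # not found
}
```

Because value is the last variable given to read, it receives the rest of the line, so a value that itself contains | is kept whole.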

For people (like me) looking to read INI files from shell scripts (read shell, not bash) - I've knocked up a little helper library which tries to do exactly that:
https://github.com/wallyhall/shini (MIT license, do with it as you please. I've linked it above rather than including it inline, as the code is quite lengthy.)
It's somewhat more "complicated" than the simple sed lines suggested above - but works on a very similar basis.
Function reads in a file line-by-line - looking for section markers ([section]) and key/value declarations (key=value).
Ultimately you get a callback to your own function - section, key and value.
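The callback pattern described here can be illustrated with a much smaller stand-in. The sketch below is not shini's code or API. parse_ini and print_binding are hypothetical names, and it assumes no whitespace around = and Unix line endings:

```bash
# parse_ini FILE CALLBACK: call CALLBACK <section> <key> <value> once per
# key=value line, tracking the most recent [section] header.
parse_ini() {
    local file=$1 callback=$2 line section=""
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*|';'*) ;;                      # blank line or comment
            '['*']') section=${line#\[}
                     section=${section%\]} ;;     # [name] -> name
            *=*)     "$callback" "$section" "${line%%=*}" "${line#*=}" ;;
        esac
    done < "$file"
}

# Example callback: print each binding as section.key=value
print_binding() { printf '%s.%s=%s\n' "$1" "$2" "$3"; }
```

Calling parse_ini settings.ini print_binding would print one line per binding. A real callback would typically store or act on the three arguments instead.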

Here is my version, which parses sections and populates a global associative array g_iniProperties with it.
Note that this works only with bash v4.2 and higher.
function parseIniFile() { # accepts the name of the file to parse as argument ($1)
    # declare syntax below (-gA) only works with bash 4.2 and higher
    unset g_iniProperties
    declare -gA g_iniProperties
    currentSection=""
    while read -r line
    do
        line=${line%$'\r'} # strip a trailing carriage return (DOS line endings)
        if [[ $line = \[* ]] ; then
            # section header: keep the name without the brackets
            currentSection=$(echo "$line" | tr -d "[]")
        else
            if [[ $line = *=* ]] ; then
                key=$currentSection.${line%%=*} # text before the first =
                value=${line#*=} # text after the first =
                g_iniProperties[$key]=$value
            fi
        fi
    done < "$1"
}
And here is a sample code using the function above:
parseIniFile "/path/to/myFile.ini"
for key in "${!g_iniProperties[@]}"; do
echo "Found key/value $key = ${g_iniProperties[$key]}"
done

Yet another implementation using awk with a little more flexibility.
function parse_ini() {
cat /dev/stdin | awk -v section="$1" -v key="$2" '
BEGIN {
if (length(key) > 0) { params=2 }
else if (length(section) > 0) { params=1 }
else { params=0 }
}
match($0,/#/) { next }
match($0,/^\[(.+)\]$/){
current=substr($0, RSTART+1, RLENGTH-2)
found=current==section
if (params==0) { print current }
}
match($0,/(.+)=(.+)/) {
if (found) {
if (params==2 && key==$1) { print $3 }
if (params==1) { printf "%s=%s\n",$1,$3 }
}
}'
}
You can call it with between 0 and 2 parameters:
cat myfile1.ini myfile2.ini | parse_ini # List section names
cat myfile1.ini myfile2.ini | parse_ini 'my-section' # Prints keys and values from a section
cat myfile1.ini myfile2.ini | parse_ini 'my-section' 'my-key' # Print a single value

complex simplicity
ini file
test.ini
[section1]
name1=value1
name2=value2
[section2]
name1=value_1
name2 = value_2
bash script with read and execute
/bin/parseini
#!/bin/bash
set +a
while read p; do
reSec='^\[(.*)\]$'
#reNV='[ ]*([^ ]*)+[ ]*=(.*)' #Remove only spaces around name
reNV='[ ]*([^ ]*)+[ ]*=[ ]*(.*)' #Remove spaces around name and spaces before value
if [[ $p =~ $reSec ]]; then
section=${BASH_REMATCH[1]}
elif [[ $p =~ $reNV ]]; then
sNm=${section}_${BASH_REMATCH[1]}
sVa=${BASH_REMATCH[2]}
set -a
eval "$(echo "$sNm"=\""$sVa"\")"
set +a
fi
done < $1
then in another script I source the results of the command and can use any variables within
test.sh
#!/bin/bash
source parseini test.ini
echo $section2_name2
finally from command line the output is thus
# ./test.sh
value_2

Some of the answers don't respect comments. Some don't respect sections. Some recognize only one syntax (only ":" or only "="). Some Python answers fail on my machine because of differing capitalization or failing to import the sys module. All are a bit too terse for me.
So I wrote my own, and if you have a modern Python, you can probably call this from your Bash shell. It has the advantage of adhering to some of the common Python coding conventions, and even provides sensible error messages and help. To use it, name it something like myconfig.py (do NOT call it configparser.py or it may try to import itself), make it executable, and call it like
value=$(myconfig.py something.ini sectionname value)
Here's my code for Python 3.5 on Linux:
#!/usr/bin/env python3
# Last Modified: Thu Aug 3 13:58:50 PDT 2017
"""A program that Bash can call to parse an .ini file"""
import sys
import configparser
import argparse
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="A program that Bash can call to parse an .ini file")
    parser.add_argument("inifile", help="name of the .ini file")
    parser.add_argument("section", help="name of the section in the .ini file")
    parser.add_argument("itemname", help="name of the desired value")
    args = parser.parse_args()
    config = configparser.ConfigParser()
    config.read(args.inifile)
    print(config.get(args.section, args.itemname))

I wrote a quick and easy python script to include in my bash script.
For example, your ini file is called food.ini
and in the file you can have some sections and some lines:
[FRUIT]
Oranges = 14
Apples = 6
Copy this small 6-line Python script and save it as read_ini.py (any name other than configparser.py, which would shadow the module it imports):
#!/usr/bin/env python3
import configparser
import sys
config = configparser.ConfigParser()
config.read(sys.argv[1])
print(config.get(sys.argv[2], sys.argv[3]))
Now, in your bash script you could do this, for example:
OrangeQty=$(python3 read_ini.py food.ini FRUIT Oranges)
or
ApplesQty=$(python3 read_ini.py food.ini FRUIT Apples)
echo $ApplesQty
This presupposes:
you have Python installed
you have the configparser library installed (this should come with a std python installation)
Hope it helps
:¬)

An explanation of the sed one-liner approach.
[section1]
param1=123
param2=345
param3=678
[section2]
param1=abc
param2=def
param3=ghi
[section3]
param1=000
param2=111
param3=222
sed -nr "/^\[section2\]/ { :l /^\s*[^#].*/ p; n; /^\[/ q; b l; }" ./file.ini
To understand, it will be easier to format the line like this:
sed -nr "
# start processing when we found the word \"section2\"
/^\[section2\]/ { #the set of commands inside { } will be executed
#create a label \"l\" (https://www.grymoire.com/Unix/Sed.html#uh-58)
:l /^\s*[^#].*/ p;
# move on to the next line. For the first run it is the \"param1=abc\"
n;
# check if this line is beginning of new section. If yes - then exit.
/^\[/ q
#otherwise jump to the label \"l\"
b l
}
" file.ini

This script takes the following parameters:
pars_ini.ksh < path to ini file > < name of Sector in Ini file > < the name in name=value to return >
For example, if your ini has:
[ environment ]
a=x
[ DataBase_Sector ]
DSN = something
then calling:
pars_ini.ksh /users/bubu_user/parameters.ini DataBase_Sector DSN
will retrieve the value "something".
The script "pars_ini.ksh":
#!/bin/ksh
#INI_FILE=path/to/file.ini
#INI_SECTION=TheSection
# BEGIN parse-ini-file.sh
# SET UP THE MINIMUM VARS FIRST
alias sed=/usr/local/bin/sed
INI_FILE=$1
INI_SECTION=$2
INI_NAME=$3
INI_VALUE=""
eval `sed -e 's/[[:space:]]*\=[[:space:]]*/=/g' \
-e 's/;.*$//' \
-e 's/[[:space:]]*$//' \
-e 's/^[[:space:]]*//' \
-e "s/^\(.*\)=\([^\"']*\)$/\1=\"\2\"/" \
< $INI_FILE \
| sed -n -e "/^\[$INI_SECTION\]/,/^\s*\[/{/^[^;].*\=.*/p;}"`
TEMP_VALUE=`echo "$"$INI_NAME`
echo `eval echo $TEMP_VALUE`

This implementation uses awk and has the following advantages:
Will only return the first matching entry
Ignores lines that start with a ;
Trims leading and trailing whitespace, but not internal whitespace
Formatted version:
awk -F '=' '/^\s*database_version\s*=/ {
sub(/^ +/, "", $2);
sub(/ +$/, "", $2);
print $2;
exit;
}' parameters.ini
One-liner:
awk -F '=' '/^\s*database_version\s*=/ { sub(/^ +/, "", $2); sub(/ +$/, "", $2); print $2; exit; }' parameters.ini

You can use the CSV tool xsv to parse INI data.
cargo install xsv
$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
$ xsv select -d "=" - <<< "$( cat /etc/*release )" | xsv search --no-headers --select 1 "DISTRIB_CODENAME" | xsv select 2
xenial
or from a file.
$ xsv select -d "=" - file.ini | xsv search --no-headers --select 1 "DISTRIB_CODENAME" | xsv select 2

My version of the one-liner
#!/bin/bash
#Reader for MS Windows 3.1 Ini-files
#Usage: inireader.sh
# e.g.: inireader.sh win.ini ERRORS DISABLE
# would return value "no" from the section of win.ini
#[ERRORS]
#DISABLE=no
INIFILE=$1
SECTION=$2
ITEM=$3
sed -n "/^\[$SECTION\]/,/^\[.*\]/p" "$INIFILE" | grep "^[[:space:]]*$ITEM[[:space:]]*=" | sed 's/.*=[[:space:]]*//'

Just finished writing my own parser. I tried various parsers found here, but none seemed to work with both ksh93 (AIX) and bash (Linux).
It's old programming style: parsing line by line. It's pretty fast since it uses few external commands, but a bit slower than it could be because of all the eval required for the dynamic name of the array.
The ini supports 3 special syntaxes:
includefile=ini file -->
Load an additional ini file. Useful for splitting an ini into multiple files, or for reusing some piece of configuration
includedir=directory -->
Same as includefile, but include a complete directory
includesection=section -->
Copy an existing section to the current section.
I used all of these syntaxes to build fairly complex, reusable ini files. Useful to install products when installing a new OS - we do that a lot.
Values can be accessed with ${ini[$section.$item]}. The array MUST be defined before calling this.
Have fun. Hope it's useful for someone else!
function Show_Debug {
[[ $DEBUG = YES ]] && echo "DEBUG $@"
}
function Fatal {
echo "$@. Script aborted"
exit 2
}
#-------------------------------------------------------------------------------
# This function load an ini file in the array "ini"
# The "ini" array must be defined in the calling program (typeset -A ini)
#
# It could be any array name, the default array name is "ini".
#
# There is heavy usage of "eval" since ksh and bash do not support
# reference variable. The name of the ini is passed as variable, and must
# be "eval" at run-time to work. Very specific syntax was used and must be
# understood before making any modifications.
#
# It greatly complicates the program, but adds flexibility.
#-------------------------------------------------------------------------------
function Load_Ini {
Show_Debug "$0($@)"
typeset ini_file="$1"
# Name of the array to fill. By default, it's "ini"
typeset ini_array_name="${2:-ini}"
typeset section variable value line my_section file subsection value_array include_directory all_index index sections pre_parse
typeset LF="
"
if [[ ! -s $ini_file ]]; then
Fatal "The ini file is empty or absent in $0 [$ini_file]"
fi
include_directory=$(dirname $ini_file)
include_directory=${include_directory:-$(pwd)}
Show_Debug "include_directory=$include_directory"
section=""
# Since this code support both bash and ksh93, you cannot use
# the syntax "echo xyz|while read line". bash doesn't work like
# that.
# It forces the use of "<<<", introduced in bash and ksh93.
Show_Debug "Reading file $ini_file and putting the results in array $ini_array_name"
pre_parse="$(sed 's/^ *//g;s/#.*//g;s/ *$//g' <$ini_file | egrep -v '^$')"
while read line; do
if [[ ${line:0:1} = "[" ]]; then # Is the line starting with "["?
# Replace [section_name] to section_name by removing the first and last character
section="${line:1}"
section="${section%\]}"
eval "sections=\${$ini_array_name[sections_list]}"
sections="$sections${sections:+ }$section"
eval "$ini_array_name[sections_list]=\"$sections\""
Show_Debug "$ini_array_name[sections_list]=\"$sections\""
eval "$ini_array_name[$section.exist]=YES"
Show_Debug "$ini_array_name[$section.exist]='YES'"
else
variable=${line%%=*} # content before the =
value=${line#*=} # content after the =
if [[ $variable = includefile ]]; then
# Include a single file
Load_Ini "$include_directory/$value" "$ini_array_name"
continue
elif [[ $variable = includedir ]]; then
# Include a directory
# If the value doesn't start with a /, add the calculated include_directory
if [[ $value != /* ]]; then
value="$include_directory/$value"
fi
# go thru each file
for file in $(ls $value/*.ini 2>/dev/null); do
if [[ $file != *.ini ]]; then continue; fi
# Load a single file
Load_Ini "$file" "$ini_array_name"
done
continue
elif [[ $variable = includesection ]]; then
# Copy an existing section into the current section
eval "all_index=\"\${!$ini_array_name[@]}\""
# It's not necessarily fast. Need to go thru all the array
for index in $all_index; do
# Only if it is the requested section
if [[ $index = $value.* ]]; then
# Evaluate the subsection [section.subsection] --> subsection
subsection=${index#*.}
# Get the current value (source section)
eval "value_array=\"\${$ini_array_name[$index]}\""
# Assign the value to the current section
# The $value_array must be resolved on the second pass of the eval, so make sure the
# first pass doesn't resolve it (\$value_array instead of $value_array).
# It must be evaluated on the second pass in case there is special character like $1,
# or ' or " in it (code).
eval "$ini_array_name[$section.$subsection]=\"\$value_array\""
Show_Debug "$ini_array_name[$section.$subsection]=\"$value_array\""
fi
done
fi
# Add the value to the array
eval "current_value=\"\${$ini_array_name[$section.$variable]}\""
# If there's already something for this field, add it with the current
# content separated by a LF (line_feed)
new_value="$current_value${current_value:+$LF}$value"
# Assign the content
# The $new_value must be resolved on the second pass of the eval, so make sure the
# first pass doesn't resolve it (\$new_value instead of $new_value).
# It must be evaluated on the second pass in case there is special character like $1,
# or ' or " in it (code).
eval "$ini_array_name[$section.$variable]=\"\$new_value\""
Show_Debug "$ini_array_name[$section.$variable]=\"$new_value\""
fi
done <<< "$pre_parse"
Show_Debug "exit $0($#)\n"
}

When I store a password in base64, I use ":" as the separator because the base64 string may contain "=". For example (I use ksh):
> echo "Abc123" | base64
QWJjMTIzCg==
In parameters.ini put the line pass:QWJjMTIzCg==, and finally:
> PASS=`awk -F":" '/pass/ {print $2 }' parameters.ini | base64 --decode`
> echo "$PASS"
Abc123
If the line has spaces like "pass : QWJjMTIzCg== " add | tr -d ' ' to trim them:
> PASS=`awk -F":" '/pass/ {print $2 }' parameters.ini | tr -d ' ' | base64 --decode`
> echo "[$PASS]"
[Abc123]
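
One caveat with /pass/ is that it matches any line containing "pass", such as a hypothetical "passive" key. A minimal sketch of a stricter lookup follows, assuming the same key:value layout; the sample file contents and the variable name ini_file are made up for illustration.

```shell
# Hypothetical sample file; the keys and values are assumptions for illustration.
ini_file=$(mktemp)
printf '%s\n' 'user : alice' 'passive : no' 'pass : QWJjMTIzCg== ' > "$ini_file"

# Compare the space-stripped first field to the exact key instead of using
# /pass/, which would also match the "passive" line.
PASS=$(awk -F':' '{ key = $1; gsub(/[ \t]/, "", key) }
                  key == "pass" { gsub(/[ \t]/, "", $2); print $2 }' "$ini_file" |
       base64 --decode)
echo "[$PASS]"
rm -f "$ini_file"
```

Stripping the spaces inside awk also makes the separate tr -d ' ' step unnecessary in this variant.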

This uses the system perl and clean regular expressions:
cat parameters.ini | perl -0777ne 'print "$1" if /\[\s*parameters\.ini\s*\][\s\S]*?\sdatabase_version\s*=\s*(.*)/'

The answer by "Karen Gabrielyan" was the best among the answers here, but some environments, such as a typical busybox, don't have awk, so I changed that answer into the code below.
trim()
{
local trimmed="$1"
# Strip leading space.
trimmed="${trimmed## }"
# Strip trailing space.
trimmed="${trimmed%% }"
echo "$trimmed"
}
function parseIniFile() { #accepts the name of the file to parse as argument ($1)
#declare syntax below (-gA) only works with bash 4.2 and higher
unset g_iniProperties
declare -gA g_iniProperties
currentSection=""
while read -r line
do
if [[ $line = [* ]] ; then
# Section header: strip carriage returns and the surrounding brackets
currentSection=$(echo $line | sed -e 's/\r//g' | tr -d "[]")
else
if [[ $line = *=* ]] ; then
cleanLine=$(echo $line | sed -e 's/\r//g')
# key is stored as "section.name"
key=$(trim $currentSection.$(echo $cleanLine | cut -d'=' -f1))
value=$(trim $(echo $cleanLine | cut -d'=' -f2))
g_iniProperties[$key]=$value
fi
fi;
done < $1
}

If Python is available, the following will read all the sections, keys and values and save them in variables with their names following the format "[section]_[key]". Python can read .ini files properly, so we make use of it.
#!/bin/bash
eval $(python3 << EOP
# SafeConfigParser was removed in Python 3.12; ConfigParser is its replacement
from configparser import ConfigParser
config = ConfigParser()
config.read("config.ini")
for section in config.sections():
    for (key, val) in config.items(section):
        print(section + "_" + key + "=\"" + val + "\"")
EOP
)
echo "Environment_type: ${Environment_type}"
echo "Environment_name: ${Environment_name}"
config.ini
[Environment]
type = DEV
name = D01

If using sections, this will do the job :
Example raw output :
$ ./settings
[section]
SETTING_ONE=this is setting one
SETTING_TWO=This is the second setting
ANOTHER_SETTING=This is another setting
Regexp parsing :
$ ./settings | sed -n -E "/^\[.*\]/{s/\[(.*)\]/\1/;h;n;};/^[a-zA-Z]/{s/#.*//;G;s/([^ ]*) *= *(.*)\n(.*)/\3_\1='\2'/;p;}"
section_SETTING_ONE='this is setting one'
section_SETTING_TWO='This is the second setting'
section_ANOTHER_SETTING='This is another setting'
Now all together :
$ eval "$(./settings | sed -n -E "/^\[.*\]/{s/\[(.*)\]/\1/;h;n;};/^[a-zA-Z]/{s/#.*//;G;s/([^ ]*) *= *(.*)\n(.*)/\3_\1='\2'/;p;}")"
$ echo $section_SETTING_TWO
This is the second setting

I have a nice one-liner (assuming you have php and jq installed):
cat file.ini | php -r "echo json_encode(parse_ini_string(file_get_contents('php://stdin'), true, INI_SCANNER_RAW));" | jq '.section.key'

This thread does not have enough solutions to choose from, thus here my solution, it does not require tools like sed or awk :
grep '^\[section\]' -A 999 config.ini | tail -n +2 | grep -B 999 '^\[' | head -n -1 | grep '^key' | cut -d '=' -f 2
If you expect sections with more than 999 lines, feel free to adapt the example above. Note that you may want to trim the resulting value to remove spaces or a comment string after the value. Remove the ^ if you need to match keys that do not start at the beginning of the line, as in the example of the question. Better still, match explicitly for spaces and tabs in such a case.
If you have multiple values in a given section you want to read, but want to avoid reading the file multiple times:
CONFIG_SECTION=$(grep '^\[section\]' -A 999 config.ini | tail -n +2 | grep -B 999 '^\[' | head -n -1)
KEY1=$(echo ${CONFIG_SECTION} | tr ' ' '\n' | grep key1 | cut -d '=' -f 2)
echo "KEY1=${KEY1}"
KEY2=$(echo ${CONFIG_SECTION} | tr ' ' '\n' | grep key2 | cut -d '=' -f 2)
echo "KEY2=${KEY2}"

shell script compare file with multiple line pattern

I have a file which is created after some manual configuration.
I need to check this file automatically with a shell script.
The file looks like this:
eth0;eth0;1c:98:ec:2a:1a:4c
eth1;eth1;1c:98:ec:2a:1a:4d
eth2;eth2;1c:98:ec:2a:1a:4e
eth3;eth3;1c:98:ec:2a:1a:4f
eth4;eth4;48:df:37:58:da:44
eth5;eth5;48:df:37:58:da:45
eth6;eth6;48:df:37:58:da:46
eth7;eth7;48:df:37:58:da:47
I want to compare it to a pattern like this:
eth0;eth0;*
eth1;eth1;*
eth2;eth2;*
eth3;eth3;*
eth4;eth4;*
eth5;eth5;*
eth6;eth6;*
eth7;eth7;*
If I would only have to check this pattern I could run this loop:
c=0
while [ $c -le 7 ]
do
if [ "$(grep "eth"${c}";eth"${c}";*" current_mapping)" ];
then
echo "eth$c ok"
fi
(( c++ ))
done
There are 6 or more different patterns possible. A pattern could also look like this, for example (depending on specific configuration requests):
eth4;eth0;*
eth5;eth1;*
eth6;eth2;*
eth7;eth3;*
eth0;eth4;*
eth1;eth5;*
eth2;eth6;*
eth3;eth7;*
So I don't think I can run a standard grep per line command in a loop. The eth numbers are not consistently the same.
Is it possible somehow to compare the whole file to pattern like it would be possible with grep for a single line?

Assuming file is your data file and patt is the file that contains the pattern above, you can use grep -f together with sed in a process substitution that replaces * with .* and ? with . to turn each glob into a workable regex:
grep -f <(sed 's/\*/.*/g; s/?/./g' patt) file
eth0;eth0;1c:98:ec:2a:1a:4c
eth1;eth1;1c:98:ec:2a:1a:4d
eth2;eth2;1c:98:ec:2a:1a:4e
eth3;eth3;1c:98:ec:2a:1a:4f
eth4;eth4;48:df:37:58:da:44
eth5;eth5;48:df:37:58:da:45
eth6;eth6;48:df:37:58:da:46
eth7;eth7;48:df:37:58:da:47

I have now written this loop and it does the job (current_mapping being the file with the content in the first code block of the question). I would have to create arrays with the different patterns and use a case for every pattern. I was just wondering if there is something like grep for multiple lines that could do the same without writing this loop.
array=("eth0;eth0;*" "eth1;eth1;*" "eth2;eth2;*" "eth3;eth3;*" "eth4;eth4;*" "eth5;eth5;*" "eth6;eth6;*" "eth7;eth7;*")
c=1
while [ $c -le 8 ]
do
if [ ! "$(sed -n "${c}"p current_mapping | grep "${array[$c-1]}")" ];
then
echo "somethings wrong"
fi
(( c++ ))
done
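
The line-by-line check above can be generalised so that each candidate pattern lives in its own file. The following is only a sketch under assumptions: the helper name matches_pattern_file and the pattern file names are hypothetical, and each pattern line is treated as a shell glob rather than a regex.

```shell
# matches_pattern_file DATA PATTERNS   (hypothetical helper name)
# Succeeds only if both files have the same number of lines and every
# data line matches the glob on the same line of the pattern file.
matches_pattern_file() {
    local line pat
    {
        while IFS= read -r pat <&4; do
            IFS= read -r line <&3 || return 1   # fewer data lines than patterns
            case $line in
                $pat) ;;                        # unquoted: $pat is used as a glob
                *) return 1 ;;
            esac
        done
        # Fail if data lines remain once the patterns are exhausted.
        if IFS= read -r line <&3; then return 1; fi
        return 0
    } 3<"$1" 4<"$2"
}

# Usage with made-up sample files: report which pattern file fits.
tmp=$(mktemp -d)
printf '%s\n' 'eth0;eth0;1c:98:ec:2a:1a:4c' 'eth1;eth1;1c:98:ec:2a:1a:4d' > "$tmp/current_mapping"
printf '%s\n' 'eth0;eth0;*' 'eth1;eth1;*' > "$tmp/pattern_a"
printf '%s\n' 'eth1;eth0;*' 'eth0;eth1;*' > "$tmp/pattern_b"
for p in "$tmp"/pattern_*; do
    if matches_pattern_file "$tmp/current_mapping" "$p"; then
        echo "matches ${p##*/}"
    fi
done
rm -rf "$tmp"
```

Reading both files in lockstep on separate file descriptors avoids calling sed and grep once per line, and adding another layout only requires another pattern file.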

Try any:
grep -P '(eth[0-9]);\1'
grep -E '(eth[0-9]);\1'
sed -n '/\(eth[0-9]\);\1/p'
awk -F';' '$1 == $2'
These are commands only; apply them to a pipe or a file.
Updated the answer after the question was edited.
As we can see the task requirements are as follows:
a file (a set of lines) formatted like ethN;ethM;MAC
examine each line for equality ethN and ethM
if they are equal, output a string ethN ok
If I understand the task correctly we can achieve this using the following code without loops:
awk -F';' '$1 == $2 { print $1, "ok" }'

appending text to specific line in file bash

So I have a file that contains some lines of text separated by ','. I want to create a script that counts how many parts a line has and, if the line contains 16 parts, adds a new one. So far it's working great. The only thing that is not working is appending the ',' at the end. See my example below:
Original file:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
Expected result:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
This is my code:
while read p; do
if [[ $p == "HEA"* ]]
then
IFS=',' read -ra ADDR <<< "$p"
echo ${#ADDR[@]}
arrayCount=${#ADDR[@]}
if [ "${arrayCount}" -eq 16 ];
then
sed -i "/$p/ s/\$/,xx/g" $f
fi
fi
done <$f
Result:
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
,xx
What am I doing wrong? I'm sure it's something small but I can't find it.

It can be done using awk:
awk -F, 'NF==16{$0 = $0 FS "xx"} 1' file
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
b,b,b,b,b,b
a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,xx
-F, sets input field separator as comma
NF==16 is the condition that says execute block inside { and } if # of fields is 16
$0 = $0 FS "xx" appends xx at end of line
1 is an always-true condition; with no action given, awk applies its default action and prints the (possibly modified) line
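
Since the question edits the file in place, here is a sketch of how the awk command could be applied back to the original file. The sample rows are made up, and f stands in for the asker's $f.

```shell
# Stand-in for the asker's $f; the sample rows are made up for illustration.
f=$(mktemp)
printf '%s\n' \
    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16' \
    'b,b,b,b,b,b' \
    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17' > "$f"

# awk writes to stdout, so send the result to a temporary file and
# replace the original only if awk succeeded.
tmp=$(mktemp)
awk -F, 'NF==16{$0 = $0 FS "xx"} 1' "$f" > "$tmp" && mv "$tmp" "$f"

result=$(cat "$f")
printf '%s\n' "$result"
rm -f "$f"
```

Only the first row has exactly 16 fields, so only that row gains the trailing ,xx. Note that mv gives the file the temporary file's permissions, which may matter for shared files.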

With sed, the approach is as follows:
Use the ${line_number}s/..../..../ form to target a specific line; you need to find out the line number first.
Use the special character & to stand for the matched string.
The sed statement should look like the following:
sed -i "${line_number}s/.*/&,xx/" "$f"
I would prefer to leave it to you to play around with it, but if you prefer I can give you a full working sample.
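
As one possible working sample, the sketch below swaps the line-number lookup for a regular expression that matches only lines with exactly 16 comma-separated fields, so sed can select the lines itself. The sample rows are made up, and the -E flag assumes a sed with extended regex support (GNU and BSD sed both have it).

```shell
# Made-up sample rows: 16 fields, 6 fields, 17 fields.
f=$(mktemp)
printf '%s\n' \
    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16' \
    'b,b,b,b,b,b' \
    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17' > "$f"

# ([^,]*,){15} consumes exactly 15 "field," groups and [^,]*$ the last field,
# so the pattern matches a whole line only when it has exactly 15 commas.
# & re-inserts the matched line, followed by the new field.
result=$(sed -E 's/^([^,]*,){15}[^,]*$/&,xx/' "$f")
printf '%s\n' "$result"
rm -f "$f"
```

Adding -i to the sed call would edit the file in place, as in the question.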

Need to pick Latest File From a Dir Using Shell Script

I am new to shell scripting and I have a requirement to pick the latest files from a directory using a shell script.
Directory Name : FTPDIR
The files in this directory look like this:
APC5502015VP072020121826.csv
APC5502015VP082020122314.csv
APC5502015VP092020121451.csv
CBC5502015VP092020122045.csv
CBC5502015VP102020122045.csv
S5502015VP072020121620.csv
S5502015VP072020122314.csv
S5502015VP092020122045.csv
Note: I need to pick the latest file from each group. Below is the output I need to get after executing the shell script:
APC5502015VP092020121451.csv
CBC5502015VP102020122045.csv
S5502015VP092020122045.csv
Ex: In the latest file APC5502015VP092020121451.csv, the number 092020121451 is the date part in the format MMDDYYYYHHMM, and the string part is APC5502015VP (the length of the string part is not fixed).
I need to pick those three files from the directory using a shell script.
Can you help me to resolve this?

It's going to be really problematic to do this safely in just bash. As Jonathan mentioned, "special" characters like spaces or newlines may bung up your script.
If we can assume that there won't be any of those, then we can do most of the job in bash, without involving other tools.
# Make an associative array to record types, in the second loop...
declare -A a
for file in *.csv; do
# First, we convert the filenames into something that can be sorted.
# The next three lines account for your "unknown length" in the first part
# of the filename. We assume the date+time is the 12 chars before ".csv".
new="$(rev <<<"$file")"
new="${new:4:12}"
new="$(rev <<<"$new")"
new="${new:4:4}${new:0:2}${new:2:2}${new:8:4}"
len=$(( ${#file} - 16 ))
echo "$new ${file:0:$len} $file"
done | sort -r | while read date type file; do
# Next, with the newest entries sorted first, we print only the first of each "type"...
if [[ ${a[$type]} -eq 0 ]]; then
a[$type]=1
echo "$file"
fi
# And stop once we have collected three types.
if [[ ${#a[*]} -ge 3 ]]; then
break
fi
done
As I say, this doesn't handle newlines in filenames.
Note also that this uses rev and sort, which are not built in to bash. The rev parts could be done internally, using more code, which might make them execute faster, but you'd only see a difference in very extreme cases. There's not much we can do about sort, since there isn't a built-in within bash.

This Perl script works on the given data. No doubt it could be improved.
#!/usr/bin/env perl
use strict;
use warnings;
my %bases;
while (<>)
{
chomp;
my $name = $_;
my($prefix, $mmdd, $yyyy, $hhmm) = ($name =~ m/(.*)(\d{4})(\d{4})(\d{4})\.csv/);
#print "$name = $prefix $yyyy $mmdd $hhmm\n";
my $stamp = "$yyyy$mmdd$hhmm";
if (!exists($bases{$prefix}) || ($stamp > $bases{$prefix}->{stamp}))
{
$bases{$prefix} = { name => $name, stamp => $stamp };
}
}
foreach my $prefix (sort keys %bases)
{
print "$bases{$prefix}->{name}\n";
}
Output:
APC5502015VP092020121451.csv
CBC5502015VP102020122045.csv
S5502015VP092020122045.csv

This is the awk solution:
cd FTPDIR
ls -1|awk -F"VP" '{split($2,a,".");if(a[1]>b[$1]){b[$1]=$2}}END{for(i in b)print i"VP"b[i]}'
Tested below:
> cat temp
APC5502015VP072020121826.csv
APC5502015VP082020122314.csv
APC5502015VP092020121451.csv
CBC5502015VP092020122045.csv
CBC5502015VP102020122045.csv
S5502015VP072020121620.csv
S5502015VP072020122314.csv
S5502015VP092020122045.csv
> awk -F"VP" '{split($2,a,".");if(a[1]>b[$1]){b[$1]=$2}}END{for(i in b)print i"VP"b[i]}' temp
CBC5502015VP102020122045.csv
S5502015VP092020122045.csv
APC5502015VP092020121451.csv
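
The comparison above works on the raw MMDDYYYYHHMM text, so the month is compared before the year, which is fine for the sample data but would pick the wrong file across a year boundary. A hedged variant that reorders the stamp to YYYYMMDDHHMM first is sketched below; the function name latest_per_group is hypothetical, and it assumes the stamp is always the 12 characters before ".csv".

```shell
# latest_per_group (hypothetical name): reads file names on stdin and
# prints the newest name for each prefix, sorted by name.
latest_per_group() {
    awk '{
        n = length($0)
        stamp  = substr($0, n - 15, 12)   # MMDDYYYYHHMM just before ".csv"
        prefix = substr($0, 1, n - 16)    # everything before the stamp
        # Reorder to YYYYMMDDHHMM so a plain string comparison is chronological.
        key = substr(stamp, 5, 4) substr(stamp, 1, 4) substr(stamp, 9, 4)
        if (!(prefix in best) || key > bestkey[prefix]) {
            bestkey[prefix] = key
            best[prefix] = $0
        }
    }
    END { for (p in best) print best[p] }' | sort
}

# Usage with the file names from the question.
printf '%s\n' \
    APC5502015VP072020121826.csv APC5502015VP082020122314.csv \
    APC5502015VP092020121451.csv CBC5502015VP092020122045.csv \
    CBC5502015VP102020122045.csv S5502015VP072020121620.csv \
    S5502015VP072020122314.csv S5502015VP092020122045.csv | latest_per_group
```

In a real directory the names could be fed in with something like printf '%s\n' *.csv, under the same assumption as above that no file name contains a newline.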
