Exporting function with xargs parallel & psql in bash - bash

I'm trying to run SQL against one-or-many psql-compatible hosts in parallel, with SQL run in sequence on each host, using xargs.
The bash script, which I'm sourcing from another script:
# Define the count of hosts (also the number of parallel processes)
export pe_fpe_hosts_line_count=$(cat $pe_fpe_hosts_file_loc | wc -l)
# Define the function that runs SQL from a file
function func_pe_exec_sql {
while read pe_sql_file; do
psql -q -t -c "\"$pe_sql_file"\"
done <$pe_fpe_sql_file_loc
}
export -f func_pe_exec_sql
# Define the xargs parallel function
function func_pe_parallel {
while read pe_hosts_file; do
echo $pe_hosts_file | xargs -d '\n' -P $pe_fpe_hosts_line_count func_pe_exec_sql
done <$pe_fpe_hosts_file_loc
}
The error I get: xargs: func_pe_exec_sql: No such file or directory. This is weird - I've exported the function!
Example SQL file:
INSERT INTO public.psql_test SELECT 1 as myint, now() as mytime;
INSERT INTO public.psql_test SELECT 2 as myint, now() as mytime;
INSERT INTO public.psql_test SELECT 3 as myint, now() as mytime;
INSERT INTO public.psql_test SELECT 4 as myint, now() as mytime;
INSERT INTO public.psql_test SELECT 5 as myint, now() as mytime;
Example SQL Host file:
--host=myhost1 --port=5432 --dbname=postgres --username=cooluser
--host=myhost2 --port=5432 --dbname=postgres --username=cooluser
pe_fpe_sql_file_loc is the path to the SQL file, and pe_fpe_hosts_file_loc is the path to the SQL Host file.
The SQL must always be run in separate transactions, and each row in the SQL file needs to be inserted separately, one after another. 5 should be in the same row as the greatest of the mytime values.
I am using it as an ETL framework with functions defined in the database though, and not for simple inserts :)

I think your invocation of xargs is incorrect. You are not actually passing the line from pe_hosts_file to the function func_pe_exec_sql.
You need to pass the input from the pipe to the function. To do that, you need a placeholder, which the -I flag of xargs provides.
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names
read from standard input. Also, unquoted blanks do not terminate input items;
instead the separator is the newline character. Implies -x and -L 1.
With that, the xargs part of the pipeline becomes something like:
| xargs -d '\n' -I {} -P "$pe_fpe_hosts_line_count" bash -c 'func_pe_exec_sql "{}"'
where {} is the placeholder for the piped value, which is passed to the function func_pe_exec_sql inside the sub-shell spawned by bash -c. The sub-shell is needed because xargs can only execute programs, not shell functions; the exported function is visible to the child bash. The double quotes around {} keep the substituted line together as a single argument when the function is invoked.
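The same pattern can be tried without a database. This is only a minimal sketch, assuming bash and GNU xargs (for -d); run_on_host and the two host lines are hypothetical stand-ins for func_pe_exec_sql and the hosts file, and echo replaces psql. Here the line is handed to the sub-shell as a positional parameter instead of being pasted into the command string, which sidesteps quoting problems if a line contains quote characters.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for func_pe_exec_sql: reports the host line it received.
run_on_host() {
    echo "would run psql with: $1"
}
# Exported functions are visible to child bash processes, but not to xargs itself.
export -f run_on_host

# Hypothetical host lines; in the question these come from the hosts file.
hosts=$'--host=myhost1 --port=5432\n--host=myhost2 --port=5432'

# xargs starts one `bash -c` per input line (-I {}), up to 2 at a time (-P 2).
# The line arrives in the sub-shell as "$1"; `_` fills the $0 slot.
output=$(printf '%s\n' "$hosts" |
    xargs -d '\n' -P 2 -I {} bash -c 'run_on_host "$1"' _ {} | sort)
echo "$output"
```

The sort is only there to make the output order predictable, since parallel jobs can finish in any order.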

Related

Is there a way to unpack a config file to cli flags in general?

Basically what foo(**bar) does in python, here I’d want something like
foo **bar.yaml
and that would become
foo --bar1=1 --bar2=2
Where bar.yaml would be
bar1: 1
bar2: 2
You could use a combination of sed and xargs:
sed -E 's/^(.+):[[:space:]]+(.+)$/--\1=\2/' bar.yaml | xargs -d '\n' foo
sed converts the format of bar.yaml lines (e.g. bar1: 1 -> --bar1=1) and xargs feeds the converted lines as arguments to foo.
You could of course modify/extend the sed part to support other formats or single-dash options like -v.
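As a quick self-contained check of the pipeline, here is a sketch that assumes GNU sed and GNU xargs; the temporary file holds the two example lines from the question, and echo foo stands in for the real command so the resulting command line is printed instead of executed.

```shell
#!/usr/bin/env bash
# Write a throwaway config file with the example content from the question.
config=$(mktemp)
printf 'bar1: 1\nbar2: 2\n' > "$config"

# Convert "key: value" lines to "--key=value" and pass each line as one argument.
# `echo foo` is a stand-in for the real command and just prints what it would get.
cmdline=$(sed -E 's/^(.+):[[:space:]]+(.+)$/--\1=\2/' "$config" | xargs -d '\n' echo foo)
echo "$cmdline"
rm -f "$config"
```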
To test if this does what you want, you can run this Bash script instead of foo:
#!/usr/bin/env bash
echo "Arguments: $#"
for ((i=1; i <= $#; i++)); do
echo "Argument $i: '${!i}'"
done
Here's a version for zsh. Run this code or add it to ~/.zshrc:
function _yamlExpand {
setopt local_options extended_glob
# 'words' array contains the current command line
# yaml filename is the last value
yamlFile=${words[-1]}
# parse 'key : value' lines from file, create associative array
typeset -A parms=("${(@s.:.)${(f)"$(<${yamlFile})"}}")
# trim leading and trailing whitespace from keys and values
# requires extended_glob
parms=("${(kv@)${(kv@)parms##[[:space:]]##}%%[[:space:]]##}")
# add -- and = to create flags
typeset -a flags
for key val in "${(@kv)parms}"; do
flags+=("--${key}='${val}'")
done
# replace the value on the command line
compadd -QU -- "$flags"
}
# add the function as a completion and map it to ctrl-y
compdef -k _yamlExpand expand-or-complete '^Y'
At the zsh shell prompt, type in the command and the yaml file name:
% print -l -- ./bar.yaml▃
With the cursor immediately after the yaml file name, hit ctrl+y. The yaml filename will be replaced with the expanded parameters:
% print -l -- --bar1='1' --bar2='2' ▃
Now you're set; you can hit enter, or add parameters, just like any other command line.
Notes:
This only supports the yaml subset in your example.
You can add more yaml parsing to the function, possibly with yq.
In this version, the cursor must be next to the yaml filename - otherwise the last value in words will be empty. You can add code to detect that case and then alter the words array with compset -n.
compadd and compset are described in the zshcompwid man page.
zshcompsys has details on compdef; the section on autoloaded files describes another way to deploy something like this.

Bash: escaping variables

I'm writing a bash one liner.
This works (prints date to the console or tries to execute it):
-bash-4.1$ DATE=`$(date --date="2 days ago" +%F)` echo "${DATE}"
But this:
`DATE=$(date --date="2 days ago" +%F)` psql -d some_db -c "select row from table where started >= '${DATE}' and started < ('${DATE}'::date + '1 day'::interval);"
Gives:
ERROR: invalid input syntax for type timestamp: ""
LINE 1: ... table where started >= '' ...
Which means that ${DATE} is empty there.
What should I modify to make it work?
The first command is broken for a couple of reasons. Firstly, you are mixing backticks and $() in such a way that you are executing the output of the date command as if it were the name of another command.
To save the output of a command to a variable, use this syntax:
output=$(command)
Secondly, the variable would be expanded by the shell before it had been assigned a value, so you need to split up the command into two statements:
date=$(date_command); psql -c "select ... '$date'"
...or use a command substitution directly:
psql -c "select ... '$(date_command)'"
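The expansion order can be seen without psql. A small sketch, assuming bash; the variable name when and its value are made up for illustration, and echo stands in for the psql call.

```shell
#!/usr/bin/env bash
# A prefix assignment only reaches the command's environment; the "${when}"
# on the same line is expanded by the shell before that assignment exists.
unset when
same_line=$(when="2 days ago" echo "value: '${when}'")

# Assign first, then use the variable in a separate statement.
when="2 days ago"
two_steps=$(echo "value: '${when}'")

echo "$same_line"   # value: ''
echo "$two_steps"   # value: '2 days ago'
```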
Why do the date math in bash? PG can do it directly:
SELECT ... WHERE started >= (datefield - interval '2 day')
^^^^^^^^^^^^^^^^^

Pass external variable to xidel in bash loop script

I am trying to parse an HTML page using XPath with xidel.
The page has a table with multiple rows and columns.
I need to get values from each row from columns 2 and 5 (IP and port) and store them in csv-like file.
Here is my script
#!/bin/bash
for (( i = 2; i <= 100; i++ ))
do
xidel http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr["'$i'"]/td[2]/span[1]' >> "$i".txt #get value from first column
xidel http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr["'$i'"]/td[5]' >> "$i".txt #get value from second column
sed -i ':a;N;$!ba;s/\n/^/g' "$i".txt #replace newline with custom delimiter
sed -i '/\s/d' "$i".txt #remove blanks
cat "$i".txt >> ip_port_list #create list
zip -m ips.zip "$i".txt #archive unneeded texts
done
Performance is not an issue.
When I manually increment the tr index, the output looks perfect, but not when it comes from the loop variable.
I want to receive a pair of values from each row.
Currently I get only partial data, or even an empty file.
I need to get values from each row from columns 2 and 5 (IP and port) and store them in csv-like file.
xidel -s "https://www.vpngate.net/en/" -e '
(//table[@id="vg_hosts_table_id"])[3]//tr[not(td[@class="vg_table_header"])]/concat(
td[2]/span[@style="font-size: 10pt;"],
",",
extract(
td[5],
"TCP: (\d+)",
1
)
)
'
220.218.70.177,443
211.58.36.54,995
1.239.223.190,1351
[...]
153.207.18.229,1542
(//table[@id="vg_hosts_table_id"])[3]: Select the 3rd table of its kind, which is the one you want.
//tr[not(td[@class="vg_table_header"])]: Select all rows, except the headers.
td[2]/span[@style="font-size: 10pt;"]: Select the 2nd column and the <span> that contains just the IP address.
extract(td[5],"TCP: (\d+)",1): Select the 5th column and extract (via regex) the numerical value after "TCP: ".
Maybe this xidel line will come in handy:
xidel -q http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr[*]/concat(td[2]/span[1],",",substring-after(substring-before(td[5],"UDP:"),"TCP: "))'
This will only do one fetch (so the admins of vpngate won't block you) and it'll also create a CSV output (ip,port)... Hopefully that is what you were looking for?

How to extract the sybase sql query output in a shell script

I am trying to execute a SQL query on SYBASE database using shell script.
A simple query to count the number of rows in a table.
#!/bin/sh
[ -f /etc/bash.bashrc.local ] && . /etc/bash.bashrc.local
. /gi/base_environ
. /usr/gi/bin/environ
. /usr/gi/bin/path
ISQL="isql <username> guest"
count() {
VAL=$( ${ISQL} <<EOSQL
set nocount on
go
set chained off
go
select count(*) from table_name
go
EOSQL
)
echo "VAL : $VAL"
echo $VAL | while read line
do
echo "line : $line"
done
}
count
The above code gives the output as follows
VAL : Password:
-----------
35
line : Password: ----------- 35
Is there a way to get only the value '35'? What am I missing here? Thanks in advance.
The "select count(*)" prints a result set as output, i.e. a column header (here, that's blank), a line of dashes for each column, and the column value for every row. Here you have only 1 column and 1 row.
If you want to get rid of the dashes, you can do various things:
select the count(*) into a variable and just PRINT the variable. This will remove the dashes from the output
perform some additional filtering with things like grep and awk on the $VAL variable before using it
As for the 'Password:' line: you are not specifying a password in the 'isql' command, so 'isql' will prompt for it (since it works, it looks like there is no password). Best specify a password flag to avoid this prompt -- or filter out that part as mentioned above.
Incidentally, it looks like you may be using the 'isql' from the Unix/Linux ODBC installation, rather than the 'isql' utility that comes with Sybase. Best use the latter (check with 'which isql').
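For the filtering option, something along these lines might do. It is only a sketch: the VAL string below imitates the output shown in the question rather than coming from a real isql call, and the awk pattern assumes the count is the only line made up purely of digits.

```shell
#!/usr/bin/env bash
# Simulated isql output modelled on the question (prompt, dashes, value).
VAL=$'Password:\n -----------\n          35'

# Keep only lines that consist of digits (with optional surrounding blanks)
# and print the number itself without the padding.
count=$(printf '%s\n' "$VAL" | awk '/^[[:space:]]*[0-9]+[[:space:]]*$/ { print $1 }')
echo "count: $count"
```

Note that "$VAL" is quoted when it is piped to awk; the unquoted echo $VAL in the question collapses the newlines, which is why everything ended up on one line there.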

bash: calling a scripts with double-quote argument

I have a bash script that takes an argument enclosed in double quotes and creates a shapefile of a map within the given boundaries, e.g.
$ export_map "0 0 100 100"
Within the script, there are two select statements:
select ENCODING in UTF8 WIN1252 WIN1255 ISO-8859-8;
...
select NAV_SELECT in Included Excluded;
Naturally, these two statements require the user to enter a number as input. This can be bypassed by piping the numbers, each followed by a newline, to the script.
In order to save time, I would like to have a script that would create 8 maps - for each combination of ENCODING (4 options) and NAV_SELECT (2 options).
I have written another bash script, create_map, to serve as a wrapper:
#!/bin/bash
for nav in 1 2 3 4;
do
for enc in 1 2;
do
printf "$nav\n$enc\n" | /bin/bash -c "./export_map.sh \"0 0 100 100\""
done
done
This works (thanks, Brian!), but I can't find a way to have the numeric argument "0 0 100 100" passed in from outside the outer script.
Basically, I'm looking for way to accept an argument within double quotes to a wrapper bash script, and pass it - with the double quotes - to an inner script.
CLARIFICATIONS:
export_map is the main script, being called from create_map 8 times.
Any ideas?
Thanks,
Adam
If I understand your problem correctly (which I'm not sure about; see my comment), you should probably add another \n to your printf; printf does not add a trailing newline by default the way that echo does. This will ensure that the second value will be read properly by the select command which I'm assuming appears in export_map.sh.
printf "$nav\n$enc\n" | /bin/bash -c "./export_map.sh \"100 200 300 400\""
Also, I don't think that you need to add the /bin/bash -c and quote marks. The following should be sufficient, unless I'm missing something:
printf "$nav\n$enc\n" | ./export_map.sh "100 200 300 400"
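To see the piping in action without the real script, here is a small sketch, assuming bash. The export_map function is a hypothetical stand-in for export_map.sh that only contains the two select statements from the question and prints what it received.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for export_map.sh: two select prompts, then a report.
export_map() {
    local ENCODING NAV_SELECT
    select ENCODING in UTF8 WIN1252 WIN1255 ISO-8859-8; do break; done
    select NAV_SELECT in Included Excluded; do break; done
    echo "bounds=$1 encoding=$ENCODING nav=$NAV_SELECT"
}

# select prints its menu to stderr and reads the answer from stdin,
# so the piped numbers answer the two prompts in order.
result=$(printf '%s\n' 2 1 | export_map "0 0 100 100" 2>/dev/null)
echo "$result"
```

The quoted "0 0 100 100" arrives in the function as a single argument ($1), with no extra bash -c layer involved.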
edit Thanks for the clarification. In order to pass an argument from your wrapper script into the inner script, keeping it as a single argument, you can pass in "$1", where the quotes indicate that you want to keep this grouped as one argument, and $1 is the first parameter to your wrapper script. If you want to pass all parameters from your outer script in to your inner script, each being kept as a single parameter, you can use "$@" instead.
#!/bin/bash
for nav in 1 2 3 4;
do
for enc in 1 2;
do
printf "$nav\n$enc\n" | ./export_map.sh "$1"
done
done
Here's a quick example of how "$@" works. First, inner.bash:
#!/bin/bash
for str in "$@"
do
echo $str
done
outer.bash:
#!/bin/bash
./inner.bash "$@"
And invoking it:
$ ./outer.bash "foo bar" baz "quux zot"
foo bar
baz
quux zot
