Searching for a solution to write a log file with pg_restore on Windows

I get a conflict error when using -d and -f together. Do you have a solution?
pg_restore -d mydb -h myhost --clean --verbose c:\dba\Manager\tk\Tasks\import-tk-21aug2022\budget-app-sara-21.8_updated_withdata -f c:\dba\manager\tk\restore-tk-21aug2022.log

-f is not for a log file. -f tells pg_restore to write a SQL script that would perform the restore, instead of restoring the dump into a database.
-d tells it to restore the dump into the given database.
They ask pg_restore to do two different things, so they conflict. You have to decide which one you want to do.
See the pg_restore docs.
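If what you want is a log of the restore, note that pg_restore writes its --verbose progress messages to stderr, so (a sketch, assuming you run this from cmd.exe) you can redirect stderr to a file while still restoring with -d:
pg_restore -d mydb -h myhost --clean --verbose c:\dba\Manager\tk\Tasks\import-tk-21aug2022\budget-app-sara-21.8_updated_withdata 2> c:\dba\manager\tk\restore-tk-21aug2022.log
Or, if you want the SQL script instead of an actual restore, drop -d and keep -f:
pg_restore -f c:\dba\manager\tk\restore-tk-21aug2022.sql c:\dba\Manager\tk\Tasks\import-tk-21aug2022\budget-app-sara-21.8_updated_withdata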


docker mkdir won't create a directory

I am trying to run a bash script which should load data into Jena. The script comes from a GitHub repository and was reportedly working on the owner's machine, but on mine it won't run, even though I followed the instructions. Let me first describe what the script does based on my understanding: it should load .nt (RDF) data into Jena with Docker, using the Jena image stain/jena. Here is the script:
#!/bin/bash
files=$(echo $(pwd)/rawdata-bearb/hour/alldata.IC.nt/*.nt | sed "s%$(pwd)/rawdata-bearb/hour/alldata.IC.nt%/var/data/in%g")
mkdir output # added
for file in $files; do
    v=$(echo $file | sed "s/^.*\/\([0-9][0-9]*\)\.nt$/\1-1/" | bc)
    echo "$v"
    mkdir -p /var/data/out/ic/$v
    time docker run \
        -it \
        --rm \
        -v $(pwd)/tdb-bearb-hour/:/var/data/out/ \
        -v $(pwd)/rawdata-bearb/hour/alldata.IC.nt/:/var/data/in/ \
        stain/jena /jena/bin/tdbloader2 \
        --sort-args "-S=16G" \
        --loc /var/data/out/ic/$v $file \
        > output/load-bearb-hour-ic-$v-.txt
done
However, when I execute the script, I get the following message in the saved log file:
13:12:46 INFO -- TDB Bulk Loader Start
mkdir: cannot create directory ‘/var/data/out/ic/0’: No such file or directory
13:12:46 ERROR Failed during data phase
According to the tdbloader2 manual, the --loc parameter should create the directory if it does not exist:
--loc: Sets the location in which the database should be created.
This location must be a directory and must be empty;
if a non-existent path is specified, it will be created as a new directory.
I created the directories /var/data/out/ic/0 through /var/data/out/ic/10 manually and re-executed the script. Still, I got the same error message. My first guess was that tdbloader2 or Docker uses the mkdir command without the -p parameter, but since I created the directories manually, so they existed before the run, and I still got the same error, it must be something else. I am kindly asking for your help.
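For reference, a sketch based only on the -v flags in the script above: /var/data/out/ inside the container is bind-mounted from $(pwd)/tdb-bearb-hour/ on the host, so a directory the loader expects at /var/data/out/ic/$v corresponds on the host to a path created by
mkdir -p "$(pwd)/tdb-bearb-hour/ic/$v"
rather than to /var/data/out/ic/$v on the host filesystem, which lies outside that mount.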

Is there a way to automate the Redshift VACUUM process through a UDF?

I have more than 300 tables in Redshift.
Data gets updated on a daily basis, and I just want to know whether I can create a UDF in Redshift to automate the VACUUM process.
I found a link for automating it with Python, but I am not that great a Python coder, so I am looking for a solution as a SQL script.
Unfortunately, you can't use a UDF for something like this; UDFs are simple input/output functions meant to be used in queries.
Your best bet is to use this open source tool from AWS Labs: AnalyzeVacuumUtility. The great thing about this tool is that it is very smart about only running VACUUM on tables that need it, and it will also run ANALYZE on tables that need it.
It's pretty easy to set up as a cron job. Here is an example of how it can be done:
Pull the amazon-redshift-utils repo in git:
git clone https://github.com/awslabs/amazon-redshift-utils
cd amazon-redshift-utils
Create a script that can be run by cron. In your text editor, create a file called run_vacuum_analyze.sh with the following, and fill in the values for your environment:
export REDSHIFT_USER=<your db user name>
export REDSHIFT_PASSWORD=<your db password>
export REDSHIFT_DB=<your db>
export REDSHIFT_HOST=<your redshift host>
export REDSHIFT_PORT=<your redshift port>
export WORKSPACE=$PWD/src/AnalyzeVacuumUtility
#
# VIRTUALENV
#
rm -rf $WORKSPACE/ve1
virtualenv -p python2.6 "$WORKSPACE/ve1"
# enter virtualenv
source $WORKSPACE/ve1/bin/activate
#
# DEPENDENCIES
#
pip install PyGreSQL
cd $WORKSPACE/run
#
# RUN IT
#
python analyze-vacuum-schema.py --db $REDSHIFT_DB --db-user $REDSHIFT_USER --db-pwd $REDSHIFT_PASSWORD --db-port $REDSHIFT_PORT --db-host $REDSHIFT_HOST
Then create a cron job that will run this script (in this example, I run it daily at 2:30 AM):
chmod +x run_vacuum_analyze.sh
crontab -e
Add the following entry:
30 2 * * * <path-to-the-cloned-repo>/run_vacuum_analyze.sh
You CANNOT use a UDF for this; UDFs cannot run commands that update data.
Yes, I have created an AWS Lambda function in Java and used a CloudWatch Events rule to schedule it with a cron expression. An AWS Lambda function in Java expects a shaded JAR to be uploaded. I created environment variables in the Lambda function for the Redshift connection properties, which are passed into the Java handler.
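For reference, a rough sketch of wiring up such a schedule with the AWS CLI (the function name, rule name, and ARN placeholders below are illustrative, not from the original setup):
aws events put-rule --name redshift-vacuum-schedule --schedule-expression "cron(30 2 * * ? *)"
aws lambda add-permission --function-name my-vacuum-function --statement-id redshift-vacuum-schedule --action lambda:InvokeFunction --principal events.amazonaws.com --source-arn <rule-arn>
aws events put-targets --rule redshift-vacuum-schedule --targets "Id"="1","Arn"="<lambda-function-arn>"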
You can now use auto vacuum; Redshift now provides this option.
Here is my shell script utility to automate this with better control over the table filters.
https://thedataguy.in/automate-redshift-vacuum-analyze-using-shell-script-utility/
Example Commands:
Run vacuum and Analyze on all the tables.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev
Run vacuum and Analyze on the schema sc1, sc2.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -s 'sc1,sc2'
Run VACUUM FULL on all the tables in all schemas except the schema sc1, but don't run ANALYZE.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -k sc1 -o FULL -a 0 -v 1
or
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -k sc1 -o FULL -a 0
Run ANALYZE only, on all the tables except the tables tbl1 and tbl3.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -b 'tbl1,tbl3' -a 1 -v 0
or
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -b 'tbl1,tbl3' -v 0
Use a password on the command line.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -P bhuvipassword
Run vacuum and analyze on the tables where unsorted rows are greater than 10%.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -v 1 -a 1 -x 10
or
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -x 10
Run the Analyze on all the tables in schema sc1 where stats_off is greater than 5.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -v 0 -a 1 -f 5
Run the vacuum only on the table tbl1 which is in the schema sc1 with the Vacuum threshold 90%.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -s sc1 -t tbl1 -a 0 -c 90
Run ANALYZE only on the schema sc1, but set analyze_threshold_percent=0.01.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -s sc1 -t tbl1 -a 1 -v 0 -r 0.01
Do a dry run (generate SQL queries) for analyzing all the tables in the schema sc2.
./vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -s sc2 -z 1
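If you want to schedule this utility the same way as the cron setup shown earlier, a sketch (the path and options are placeholders to adjust for your environment):
30 2 * * * <path-to-the-script>/vacuum-analyze-utility.sh -h endpoint -u bhuvi -d dev -x 10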

GoAccess not recognizing --keep-db-files and others

I have installed goaccess-1.0.2, along with ncurses and the optional dependencies listed in the GoAccess man page, on my Ubuntu 16.04 OS.
I am successfully able to parse logs with the following command: goaccess -f access.log -a. I am able to turn logs into an HTML report with the following command: goaccess -f access.log -a -o report.html.
I am not able to successfully parse data into real-time HTML output with the following command: goaccess -f access.log -o report.html --real-time-html. However, it does start to parse, but at some point it freezes. So I assume the data is too large for memory.
So I wanted to learn how to parse data to disk. I used the following command: goaccess -f access.log --keep-db-files. I received the following feedback: goaccess: unrecognized option '--keep-db-files'
I thought maybe I had missed dependencies, but I checked back through the GoAccess man page and it doesn't seem that I did.
However, it does start to parse, but at some point it freezes. So I assume the data is too large for memory.
If you are successfully parsing the same log with goaccess -f access.log -a, then using --real-time-html shouldn't make any difference. Check your memory usage with top or take a look at dmesg.
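For example (a sketch; the exact messages vary by system), you can check how much memory is free and whether the kernel's OOM killer stopped the process:
$ free -h
$ dmesg | grep -i -E 'killed process|out of memory'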
I used the following command: goaccess -f access.log --keep-db-files. I received the following feedback: goaccess: unrecognized option '--keep-db-files'
You need to compile GoAccess with btree support. To do this, you need to install TokyoCabinet and then build goaccess from source:
$ sudo apt-get install libtokyocabinet-dev
$ curl http://tar.goaccess.io/goaccess-1.0.2.tar.gz | tar xvz
$ cd goaccess-1.0.2/
$ ./configure --enable-utf8 --enable-geoip --enable-tcb=btree
$ make
$ sudo make install
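After rebuilding with --enable-tcb=btree, the on-disk storage options should be recognized, e.g. (a sketch reusing the report command from the question):
$ goaccess -f access.log -a -o report.html --keep-db-files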

Using curl to download a file is having an issue

I am trying to download a file from a remote server using curl:
curl -u username:password -O https://remoteserver/filename.txt
In my case, a file filename.txt is getting created, but the content of the file says "virtual user logged in". It is not downloading the actual file.
I am not sure why this is happening. Any help on why the download is not working would be appreciated.
Try this in terminal:
curl -u username:password -o filedownload.txt https://remoteserver/filename.txt
This command with -o will copy the contents of filename.txt to filedownload.txt in the current working directory.
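If the downloaded file still contains the login banner instead of the expected content, a hedged next step is to watch the full exchange with curl's verbose flag (same URL and credentials as in the question) to see what the server actually returns:
curl -v -u username:password -o filedownload.txt https://remoteserver/filename.txt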

MySQLdump with arguments

Hello to professionals!
There was a good and simple script idea for making a mysqldump of every database, taken from
dump all mysql tables into separate files automagically?
author: https://stackoverflow.com/users/1274838/elias-torres-arroyo
with the script as follows:
#!/bin/bash
# Optional variables for a backup script
MYSQL_USER="root"
MYSQL_PASS="PASSWORD"
BACKUP_DIR="/backup/01sql/"
# Get the database list, exclude information_schema
for db in $(mysql -B -s -u $MYSQL_USER --password=$MYSQL_PASS -e 'show databases' | grep -v information_schema)
do
    # dump each database in a separate file
    mysqldump -u $MYSQL_USER --password=$MYSQL_PASS "$db" | gzip > "$BACKUP_DIR/$db.sql.gz"
done
But the problem is that this script does not "understand" arguments like
--add-drop-database
to perform
mysqldump -u $MYSQL_USER --password=$MYSQL_PASS "$db" --add-drop-database | gzip > "$BACKUP_DIR/$db.sql.gz"
Is there any idea how to force this script to understand the additional arguments listed under
mysqldump --help
because all my tests show it doesn't?
Thank you in advance for any hint to try!
--add-drop-database works only with --all-databases or --databases.
Please see the reference in the docs.
So in your case the mysqldump utility ignores the mentioned parameter, because you are dumping a single database without --databases.
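For example (a sketch that keeps the rest of the loop from the question unchanged), adding --databases before "$db" makes --add-drop-database take effect:
mysqldump -u $MYSQL_USER --password=$MYSQL_PASS --add-drop-database --databases "$db" | gzip > "$BACKUP_DIR/$db.sql.gz"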
