Show runtime of a query on MonetDB

I am testing MonetDB as a columnar storage engine.
I have already installed and started the server,
but when I connect with the client and run a query, the response does not show the time it took to execute the query.
I am connecting as:
mclient -u monetdb -d voc
I have also tried connecting in interactive mode:
mclient -u monetdb -d voc -i
Output example:
sql>select count(*) from voc.regions;
+---------+
| L3 |
+=========+
| 5570699 |
+---------+
1 tuple

As mkersten mentioned, I would read through the options of the mclient utility first.
To get server- and client-side timing measurements, I start mclient with the --timer=performance option.
Inside mclient I then disable the result output with \f trash, so that only the timing is measured, not the printing of results.
Prepend trace to your query and you get your timings like this:
sql>\f trash
sql>trace select count(*) from categories;
sql:0.000 opt:0.266 run:1.713 clk:5.244 ms
sql:0.000 opt:0.266 run:2.002 clk:5.309 ms
The first of the two lines shows you the server timings, the second one the overall timing including passing the results back to the client.

If you use the latest version, MonetDB-Mar18, you have good control over the performance timers, which cover parsing, optimization, and runtime at the server. See mclient --help.

Related

ESQL performance tools

I would like to analyze individual ESQL modules for performance on IBM Integration Bus, rather than the whole application with PerfHarness. I know there are lists of good practices for writing ESQL (for example, this: ESQL code tips).
But is there a tool for performance analysis of just one ESQL module?
You can check through your broker's Web User Interface. Just turn on statistics for your flow (the one with your ESQL code) and it will show how much time the process took in each node.
I know this is rather old but it still covers the basics: https://www.ibm.com/developerworks/websphere/library/techarticles/0406_dunn/0406_dunn.html. The section "Isolate the problem using accounting and statistics" should answer your question, and the part on using trace should help you profile the statements within an ESQL module.
The trace file generated at the debug level shows how long each statement took to execute, down to microsecond precision, helping you find the problematic statement or loop.
To get a trace file, do the following:
Step 1: Start a user trace using the command below:
mqsichangetrace <Node> -u -e <Server> -f <MessageFlowName> -l debug -r
Step 2: Send a message through the message flow.
Step 3: Stop the trace using the following MQSI command:
mqsichangetrace <Node> -u -e <Server> -f "<Message Flow Name>" -l none
Step 4: Read the trace content into a file:
mqsireadlog <Node> -u -e <Server> -f -o flowtrace.xml
Step 5: Format the XML trace file into a user-readable format:
mqsiformatlog -i flowtrace.xml -o flowtrace.txt
Examine the text file.
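Taken together, the steps above can be scripted. This is only a sketch: MYNODE, default, and MyMessageFlow are placeholder names you would substitute with your own broker, execution group, and flow, and the script only writes the commands to a file for review rather than running them, since step 2 (sending a test message) has to happen in between.

```shell
# Sketch: emit the trace commands for review; MYNODE, default and
# MyMessageFlow are placeholders for your broker, execution group and flow.
NODE="MYNODE"; EGROUP="default"; FLOW="MyMessageFlow"

cat > trace_steps.txt <<EOF
mqsichangetrace $NODE -u -e $EGROUP -f "$FLOW" -l debug -r
# ... send a test message through the flow here ...
mqsichangetrace $NODE -u -e $EGROUP -f "$FLOW" -l none
mqsireadlog $NODE -u -e $EGROUP -f -o flowtrace.xml
mqsiformatlog -i flowtrace.xml -o flowtrace.txt
EOF
cat trace_steps.txt
```

Once the command list looks right for your environment, you can run the lines one at a time, pausing after the first to send the test message.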

db2: drop a few databases at once

I recently started to work with DB2 and created a few databases.
To drop a single DB I use db2 drop db demoDB; is there a way to drop all DBs at once?
Thanks
Taking into account the previous answer, this one-liner does the same without creating a script:
db2 list db directory | tail -n +6 | sed 'N;N;N;N;N;N;N;N;N;N;N;s/\n/ /g' | awk '$28 = /Indirect/ {print "db2 drop database "$7}' | source /dev/stdin
It filters the local (Indirect) databases and executes the generated output.
(Only works in an English-language environment.)
First, I don't think there is a native DB2 way to do that, but I usually do the following. To start, you can see the databases on your instance with one of these:
db2 list db directory
db2 list active databases
depending on your need (all DBs, or just the active ones).
I'm sure there are more DB lists you can get (see the DB2 user guide).
The way I usually drop all my DBs is with a shell script:
1. Create a new script, e.g. with 'vi db2_drop_all.sh'.
2. Paste the code:
#!/bin/bash -x
for db_name in $(db2 list db directory | grep Database | \
grep name | cut -d= -f2); do
db2 drop db "$db_name" || true
done
exit 0
3. Save the changes.
4. Run the script (after switching to your instance, of course): sh db2_drop_all.sh
Note that in step 2 you can change the list of DBs as you wish (for example, to db2 list active databases).
Hope it helped you. :)

Can you view historic logs for parse.com cloud code?

On the Parse.com cloud-code console, I can see logs, but they only go back maybe 100-200 lines. Is there a way to see or download older logs?
I've searched their website & googled, and don't see anything.
Using the parse command-line tool, you can retrieve an arbitrary number of log lines:
Usage:
parse logs [flags]
Aliases:
logs, log
Flags:
-f, --follow=false: Emulates tail -f and streams new messages from the server
-l, --level="INFO": The log level to restrict to. Can be 'INFO' or 'ERROR'.
-n, --num=10: The number of the messages to display
Not sure if there is a limit, but I've been able to fetch 5000 lines of log with this command:
parse logs prod -n 5000
To add on to Pascal Bourque's answer, you may also wish to filter the logs by a given range of dates. To achieve this, I used the following:
parse logs -n 5000 | sed -n '/2016-01-10/, /2016-01-15/p' > filteredLog.txt
This fetches up to 5000 log lines, uses sed to keep only the lines between 2016-01-10 and 2016-01-15, and stores the result in filteredLog.txt.
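The sed range trick is easy to verify offline. A sketch against a few fake log lines (the dates and messages are invented; in practice the input would come from parse logs as above):

```shell
# Sketch: keep only log lines between two dates using a sed range.
# sample.log is a hand-made stand-in for `parse logs -n 5000` output.
cat > sample.log <<'EOF'
2016-01-08T10:00:00Z INFO before the range
2016-01-10T09:30:00Z INFO first day we care about
2016-01-12T14:00:00Z ERROR inside the range
2016-01-15T23:59:00Z INFO last day we care about
2016-01-17T08:00:00Z INFO after the range
EOF

# Print from the first line matching the start date through the
# first line matching the end date, inclusive.
sed -n '/2016-01-10/, /2016-01-15/p' sample.log > filteredLog.txt
cat filteredLog.txt
```

One caveat: if the end pattern never matches (no log line on that date), sed prints through the end of input, so pick an end date that actually appears in the log.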

Is it possible to count the number of partitions?

I am working on a test in which I must find the number of partitions of a table and check that it is right. If I use show partitions TableName I get all the partitions by name, but I would like to get just the number of partitions, something along the lines of show count(partitions) TableName (which just returns OK, by the way, so it's no good) and get e.g. 12.
Is there any way to achieve this?
Using Hive CLI
$ hive --silent -e "show partitions <dbName>.<tableName>;" | wc -l
--silent is to enable silent mode
-e tells hive to execute quoted query string
You could use:
select count(distinct <partition key>) from <TableName>;
The command below lists all the partitions and, at the end, shows the number of fetched rows. That row count is the number of partitions:
SHOW PARTITIONS [db_name.]table_name [PARTITION(partition_spec)];
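Whichever listing you use, the counting itself is plain line counting, which you can sanity-check offline. A sketch with a hand-made SHOW PARTITIONS listing (the partition values are invented):

```shell
# Sketch: count partitions from a saved SHOW PARTITIONS listing.
# partitions.txt is a stand-in for the CLI output; values are invented.
cat > partitions.txt <<'EOF'
year=2015/month=11
year=2015/month=12
year=2016/month=01
year=2016/month=02
EOF

wc -l < partitions.txt                          # total partitions: 4
cut -d/ -f1 partitions.txt | sort -u | wc -l    # distinct year=... values: 2
```

The second pipeline also illustrates why count(distinct <partition key>) only matches the partition count for single-column partition keys: with a multi-column key like year/month, counting distinct values of one column undercounts.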
You can use the WebHCat interface to get information like this. This has the benefit that you can run the command from anywhere that the server is accessible. The result is JSON - use a JSON parser of your choice to process the results.
In this example of piping the WebHCat results to Python, only the number 24 is returned, representing the number of partitions for this table. (The server name is the name node.)
curl -s 'http://*myservername*:50111/templeton/v1/ddl/database/*mydatabasename*/table/*mytablename*/partition?user.name=*myusername*' | python -c 'import sys, json; print(len(json.load(sys.stdin)["partitions"]))'
24
In Scala (Spark SQL) you can do the following:
sql("show partitions <table_name>").count()
I used the following:
beeline -silent --showHeader=false --outputformat=csv2 -e 'show partitions <dbname>.<tablename>' | wc -l
Use the following syntax:
show create table <table name>;

PostgreSQL: improving pg_dump, pg_restore performance

When I began, I used pg_dump with the default plain format. I was unenlightened.
Research revealed to me time and file size improvements with pg_dump -Fc | gzip -9 -c > dumpfile.gz. I was enlightened.
When it came time to create the database anew,
# create tablespace dbname location '/SAN/dbname';
# create database dbname tablespace dbname;
# alter database dbname set temp_tablespaces = dbname;
% gunzip dumpfile.gz # to evaluate restore time without a piped uncompression
% pg_restore -d dbname dumpfile # into a new, empty database defined above
I felt unenlightened: the restore took 12 hours to create the database that's only a fraction of what it will become:
# select pg_size_pretty(pg_database_size('dbname'));
47 GB
Because there are predictions this database will be a few terabytes, I need to look at improving performance now.
Please, enlighten me.
First check that you are getting reasonable IO performance from your disk setup. Then check that your PostgreSQL installation is appropriately tuned. In particular, shared_buffers should be set correctly, and during the restore: maintenance_work_mem should be increased, full_page_writes should be off, wal_buffers should be increased to 16MB, checkpoint_segments should be increased to something like 16, autovacuum should be disabled, and you shouldn't have any unreasonable logging on (like logging every statement executed).
If you are on 8.4 also experiment with parallel restore, the --jobs option for pg_restore.
Improve pg_dump & pg_restore
pg_dump: always use the directory format (-Fd) and the -j option
time pg_dump -j 8 -Fd -f /tmp/newout.dir fsdcm_external
pg_restore: always tune postgresql.conf, and use the directory format and the -j option
work_mem = 32MB
shared_buffers = 4GB
maintenance_work_mem = 2GB
full_page_writes = off
autovacuum = off
wal_buffers = -1
time pg_restore -j 8 --format=d -C -d postgres /tmp/newout.dir/
Two issues/ideas:
By specifying -Fc, the pg_dump output is already compressed. The compression is not maximal, so you may find some space savings by using "gzip -9", but I would wager it's not enough to warrant the extra time (and I/O) used compressing and uncompressing the -Fc version of the backup.
If you are using PostgreSQL 8.4.x you can potentially speed up the restore from a -Fc backup with the new pg_restore command-line option "-j n" where n=number of parallel connections to use for the restore. This will allow pg_restore to load more than one table's data or generate more than one index at the same time.
I assume you need backup, not a major upgrade of database.
For backup of large databases you should setup continuous archiving instead of pg_dump.
Set up WAL archiving.
Make your base backups for example every day by using
psql template1 -c "select pg_start_backup('$(date +%F-%T)')"
rsync -a --delete /var/lib/pgsql/data/ /var/backups/pgsql/base/
psql template1 -c "select pg_stop_backup()"
A restore would be as simple as restoring database and WAL logs not older than pg_start_backup time from backup location and starting Postgres. And it will be much faster.
zcat dumpfile.gz | pg_restore -d db_name
Removes the full write of the uncompressed data to disk, which is currently your bottleneck.
As you may have guessed simply by the fact that compressing the backup results in faster performance, your backup is I/O bound. This should come as no surprise as backup is pretty much always going to be I/O bound. Compressing the data trades I/O load for CPU load, and since most CPUs are idle during monster data transfers, compression comes out as a net win.
So, to speed up backup/restore times, you need faster I/O. Beyond reorganizing the database to not be one huge single instance, that's pretty much all you can do.
If you're facing issues with the speed of pg_restore, check whether you dumped your data using INSERT or COPY statements.
If you used INSERT (pg_dump called with the --column-inserts parameter), the restore of the data will be significantly slower.
Using INSERT is good for making dumps that are loaded into non-Postgres databases, but if you are restoring into Postgres, omit the --column-inserts parameter when using pg_dump.
