Parse and filter MySQL Slow Query Logs Using Grafana - grafana-loki

We have MySQL slow query logs being shipped to Grafana 9.3.6 (Loki).
Given a MySQL slow log entry like the following, I'm trying to filter the log output down to queries slower than, say, one second.
# User@Host: kermit[muppets] @ [99.99.99.99]  Id: 54908918
# Schema: frogs Last_errno: 0 Killed: 0
# Query_time: 0.000218 Lock_time: 0.000081 Rows_sent: 1 Rows_examined: 1 Rows_affected: 0 Bytes_sent: 665
# Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 9
# Log_slow_rate_type: query Log_slow_rate_limit: 1000
SET timestamp=1676569875;
select id FROM characters WHERE name='monster';
I've made it this far,
{service="db::muppets"} |~ `Query_time: (\d*\.\d*)`
which correctly highlights the field in the log messages, but now I'd like to use that (\d*\.\d*) capture group to reduce the logs to the queries that take more than one second.
It seems like I need something like this, but this returns no results.
{service="db::muppets"} |~ `Query_time: (?P<query_time>\d*\.\d*)` | query_time > 1
I assume there needs to be some type of text-to-number conversion of the query_time label, but I can't figure that part out.
Grafana is totally new to me.
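A likely fix, sketched on the assumption that these logs are queried with Loki's LogQL: |~ is only a line filter, so its capture groups never become labels. The regexp parser stage does turn named capture groups into labels, and a label filter expression then compares the extracted value as a number:

{service="db::muppets"} | regexp `Query_time: (?P<query_time>\d+\.\d+)` | query_time > 1.0

LogQL converts the label value for numeric comparisons, so no explicit text-to-number cast should be needed.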

Related

Mock time in prometheus alert unit test

Morning everyone,
I have a Prometheus alert that looks like this:
whatever_expr > 0.10 AND ON() absent(hour() >= 2 <= 3)
It excludes alerts during a maintenance window.
Now that I want to unit test it, I can't find any information on mocking the result of the hour() function.
I would like to do something like this in the test.yaml:
- interval: 1m
  input_series:
    - series: 'hour'
      values: '2 2 2 2 2 2 2 2 2 2'
I'm starting to think it's not even possible?
If anyone has an idea, I'd be glad to hear it :)
I had a similar issue since I was using time(). I found this issue on GitHub https://github.com/prometheus/docs/issues/1464 and it appears that all time-related functions start at timestamp 0, which equals Thu Jan 01 1970 00:00:00, and increase based on the interval of the test.
It's not very intuitive, but I managed to provide a series for time based on the following schema:
tests:
  - interval: 1h
    input_series:
      - series: 'series_xyz'
        values: '1+0x5'
    alert_rule_test:
      - eval_time: 0h # => hour would be 0
        alertname: alertXYZ
      - eval_time: 1h # => hour would be 1
        alertname: alertXYZ
      - eval_time: 2h # => hour would be 2
        alertname: alertXYZ
      - eval_time: 3h # => hour would be 3
        alertname: alertXYZ
      - eval_time: 4h # => hour would be 4
        .....
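Applied to the original alert, a sketch of a full test file, assuming whatever_expr is the real metric name and MaintenanceAwareAlert (with an illustrative severity label) stands in for the rule under test. Evaluating at 2h30m puts hour() at 2, inside the maintenance window, so no alert is expected, while at 4h the alert should fire:

rule_files:
  - alerts.yaml
evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # stays above the 0.10 threshold for the whole run
      - series: 'whatever_expr'
        values: '1+0x300'
    alert_rule_test:
      - eval_time: 2h30m    # hour() == 2, inside the 2-3h window: no alert
        alertname: MaintenanceAwareAlert
        exp_alerts: []
      - eval_time: 4h       # hour() == 4, outside the window: alert fires
        alertname: MaintenanceAwareAlert
        exp_alerts:
          - exp_labels:
              severity: warning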

Oracle code to filter and pivot a table

In an Excel sheet, for the input below, I first filter NET to NET = APB, then filter CODE to the values WDL and LRTF, and use a pivot table to get the output with counts.
I need Oracle code that produces the same output:
INPUT:
STTID        AMOUNT  NET  CODE
SVPC12309A     5000  NFS  SOP
SVPC12309A    10000  NFS  WDL
000DHP11291    2500  APB  WDL
SVPC12309A     3000  CMV  LRTF
SVPC12309A     3000  CMV  WDL
DHP12341       4500  APB  LRTF
DHP23451       9500  APB  LRTF
DHP12341       5500  APB  LRTF
OUTPUT AS:
STTID        LRTF  WDL  TOTAL
000DHP11291     0    1      1
DHP12341        2    0      2
DHP23451        1    0      1
It appears you want something like
select sttid,
       sum( case when code = 'LRTF' then 1 else 0 end ) lrtf,
       sum( case when code = 'WDL' then 1 else 0 end ) wdl,
       sum( case when code in ('WDL', 'LRTF') then 1 else 0 end ) total
  from your_table_name
 where net = 'APB'
 group by sttid
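If you want something closer to Excel's pivot table, Oracle 11g and later also have a PIVOT clause. A sketch against the same hypothetical your_table_name:

select sttid, lrtf, wdl, lrtf + wdl as total
  from (
        select sttid, code
          from your_table_name
         where net = 'APB'
           and code in ('WDL', 'LRTF')
       )
 pivot (
        count(*) for code in ('LRTF' as lrtf, 'WDL' as wdl)
       )
 order by sttid;

Each CODE value listed in the IN clause becomes its own count column, mirroring what the Excel pivot does.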

High memory usage on DigitalOcean droplet

I have a Laravel application which I've installed on a 1GB standard droplet running Ubuntu 20.04, nginx, MySQL 8 and PHP 7.4.
The application isn't even live yet and I notice it's already using over 50% memory. Yesterday it was using 80%, and after a system reboot it's returned to around 60% memory usage.
Below is a snapshot of the current high-memory processes. Is this level of memory usage normal for a Laravel application that is not even live, i.e. under limited load?
top - 19:41:00 up 3:46, 1 user, load average: 0.08, 0.04, 0.01
Tasks: 101 total, 1 running, 100 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.7 sy, 0.0 ni, 98.7 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 981.3 total, 90.6 free, 601.4 used, 289.3 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 212.2 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
815 mysql 20 0 1305900 417008 13352 S 0.7 41.5 1:32.39 mysqld
2257 www-data 20 0 245988 44992 30180 S 0.0 4.5 0:04.67 php-fpm7.4
2265 www-data 20 0 243700 42204 29572 S 0.0 4.2 0:04.41 php-fpm7.4
2259 www-data 20 0 243960 42104 30380 S 0.0 4.2 0:04.44 php-fpm7.4
988 root 20 0 125160 36188 10604 S 0.3 3.6 0:09.89 php
388 root 19 -1 84404 35116 33932 S 0.0 3.5 0:01.14 systemd-journ+
741 root 20 0 627300 20936 6656 S 0.0 2.1 0:02.11 snapd
738 root 20 0 238392 18588 12624 S 0.0 1.8 0:00.83 php-fpm7.4
743 root 20 0 31348 18344 3844 S 0.0 1.8 0:02.75 supervisord
544 root rt 0 280180 17976 8184 S 0.0 1.8 0:00.90 multipathd
825 root 20 0 108036 15376 7732 S 0.0 1.5 0:00.10 unattended-up+
736 root 20 0 29220 13200 5544 S 0.0 1.3 0:00.11 networkd-disp+
726 do-agent 20 0 559436 12120 6588 S 0.0 1.2 0:01.78 do-agent
1 root 20 0 101964 11124 8024 S 0.0 1.1 0:02.52 systemd
623 systemd+ 20 0 23912 10488 6484 S 0.0 1.0 0:00.42 systemd-resol+
778 www-data 20 0 71004 9964 5240 S 0.0 1.0 0:02.43 nginx
My concern is that once the application goes live and the load increases, with more database connections it's going to run out of memory. I know I can resize the droplet and increase the memory, or set up some swap space, but is this amount of memory usage normal for an unused application?
How can I optimize the high-memory processes such as MySQL, nginx, and PHP? MySQL 8 appears to be the main culprit hogging the memory. Below are my MySQL settings:
#
# The MySQL database server configuration file.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
# Here is entries for some specific programs
# The following values assume you have at least 32M ram
[mysqld]
#
# * Basic Settings
#
user = mysql
# pid-file = /var/run/mysqld/mysqld.pid
# socket = /var/run/mysqld/mysqld.sock
# port = 3306
# datadir = /var/lib/mysql
# If MySQL is running as a replication slave, this should be
# changed. Ref https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_tmpdir
# tmpdir = /tmp
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address = 127.0.0.1
#
# * Fine Tuning
#
key_buffer_size = 16M
# max_allowed_packet = 64M
# thread_stack = 256K
# thread_cache_size = -1
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover-options = BACKUP
# max_connections = 151
# table_open_cache = 4000
#
# * Logging and Replication
#
# Both location gets rotated by the cronjob.
#
# Log all queries
# Be aware that this log type is a performance killer.
# general_log_file = /var/log/mysql/query.log
# general_log = 1
#
# Error log - should be very few entries.
#
log_error = /var/log/mysql/error.log
#
# Here you can see queries with especially long duration
# slow_query_log = 1
# slow_query_log_file = /var/log/mysql/mysql-slow.log
# long_query_time = 2
# log-queries-not-using-indexes
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
# server-id = 1
# log_bin = /var/log/mysql/mysql-bin.log
# binlog_expire_logs_seconds = 2592000
max_binlog_size = 100M
# binlog_do_db = include_database_name
# binlog_ignore_db = include_database_name
Any tips and advice much appreciated, as this is the first time I'm using a VPS.
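Since mysqld is the largest consumer here, one common approach on a 1 GB droplet is to cap its biggest allocations explicitly. A sketch of additions to the [mysqld] section; the values are illustrative starting points for a small, not-yet-live box, not tuned recommendations:

[mysqld]
# The InnoDB buffer pool is usually MySQL's single largest allocation;
# setting it explicitly keeps the footprint predictable (8.0 defaults to 128M).
innodb_buffer_pool_size = 128M
innodb_log_buffer_size = 8M
# performance_schema instrumentation adds a noticeable chunk of RSS on 8.0.
performance_schema = OFF
# Each connection carries per-thread buffers, so cap concurrency.
max_connections = 40
# Keep in-memory temporary tables small.
tmp_table_size = 16M
max_heap_table_size = 16M

Restart MySQL afterwards (e.g. sudo systemctl restart mysql) and watch the resident size of mysqld in top.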

Spark SQL performance too slow if the number of columns is 100000

I'm testing Spark performance with a very wide table.
What I did is very simple.
Prepare a CSV file which has many columns and only 2 data records.
E.g., the CSV file is like the following:
col000001,col000002,,,,,,,col100000
dtA000001,dtA000002,,,,,,,,dtA100000
dtB000001,dtB000002,,,,,,,,dtB100000
dfdata100000 = sqlContext.read.csv('../datasets/100000c.csv', header='true')
dfdata100000.registerTempTable("tbl100000")
result = sqlContext.sql("select col000001,col100000 from tbl100000")
Then get 1 row with show(1):
%%time
result.show(1)
File sizes are as follows (very small).
The file name shows the number of columns:
$ du -m *c.csv
3 100000c.csv
1 10000c.csv
1 1000c.csv
1 100c.csv
1 20479c.csv
2 40000c.csv
2 60000c.csv
3 80000c.csv
Results are as follows.
As you can see, the execution time increases exponentially.
Example result:
+---------+---------+
|col000001|col100000|
+---------+---------+
|dtA000001|dtA100000|
+---------+---------+
only showing top 1 row
CPU times: user 218 ms, sys: 509 ms, total: 727 ms
Wall time: 53min 22s
Question 1: Is this an acceptable result? Why does the execution time increase exponentially?
Question 2: Is there any other method to make it faster?
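For question 2, one variation worth trying, sketched assuming PySpark 2.x or later and the same file path: go through the SparkSession/DataFrame API and select the two columns directly, instead of registering a temp table and parsing SQL against a 100,000-column schema:

import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wide-csv-benchmark").getOrCreate()

# header=True makes Spark take the 100,000 column names from the first line
df = spark.read.csv('../datasets/100000c.csv', header=True)

start = time.time()
df.select('col000001', 'col100000').show(1)
print('elapsed: %.1fs' % (time.time() - start))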

Caching not Working in Cassandra

I don't seem to have any caching enabled when checking in OpsCenter or cfstats. I'm running Cassandra 1.1.7 with Solandra on Debian. I have set the required global options in cassandra.yaml:
key_cache_size_in_mb: 800
key_cache_save_period: 14400
row_cache_size_in_mb: 800
row_cache_save_period: 15400
row_cache_provider: SerializingCacheProvider
Column Families were created as follows:
create column family example
with column_type = 'Standard'
and comparator = 'BytesType'
and default_validation_class = 'BytesType'
and key_validation_class = 'BytesType'
and read_repair_chance = 1.0
and dclocal_read_repair_chance = 0.0
and gc_grace = 864000
and min_compaction_threshold = 4
and max_compaction_threshold = 32
and replicate_on_write = true
and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
and caching = 'ALL';
OpsCenter shows no data available on the caching graphs, and cfstats doesn't show any cache-related fields:
Column Family: charsets
SSTable count: 1
Space used (live): 5558
Space used (total): 5558
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 61381
Read Latency: 0.123 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 16
Compacted row minimum size: 1917
Compacted row maximum size: 2299
Compacted row mean size: 2299
Any help or suggestions are appreciated.
Sam
The caching stats have been moved from cfstats to info in Cassandra 1.1. If you run nodetool info you should see something like:
Key Cache : size 5552 (bytes), capacity 838860800 (bytes), 38 hits, 47 requests, 0.809 recent hit rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 838860800 (bytes), 0 hits, 0 requests, NaN recent hit rate, 15400 save period in seconds
This is because there are now global caches, rather than per-CF. It seems that Opscenter needs updating for this change - maybe there is a later version available that will work.
