.sst files are increasing daily in CockroachDB at path /var/lib/cockroach/cockroach-data

Lots of .sst files are being created daily in my CockroachDB cluster at the host path /var/lib/cockroach/cockroach-data.
Can I delete them? It's a critical environment, and we cannot just delete those files without knowing the impact.
However, they are driving up disk utilization.
[root@winprd-vck001 cockroach-data]# ls -lrth | grep sst | wc -l
4616
-rw-r----- 1 cockroach cockroach 8.9M May 10 21:27 5266230.sst
-rw-r----- 1 cockroach cockroach 50M May 10 21:35 5266274.sst
-rw-r----- 1 cockroach cockroach 17M May 10 22:00 5266420.sst
-rw-r----- 1 cockroach cockroach 12M May 10 22:00 5266421.sst
-rw-r----- 1 cockroach cockroach 17M May 10 22:11 5266468.sst
-rw-r----- 1 cockroach cockroach 9.0M May 10 23:04 5266745.sst
-rw-r----- 1 cockroach cockroach 17M May 10 23:04 5266746.sst
-rw-r----- 1 cockroach cockroach 18M May 10 23:19 5266806.sst
-rw-r----- 1 cockroach cockroach 31M May 10 23:19 5266807.sst
-rw-r----- 1 cockroach cockroach 11M May 11 00:11 5267051.sst
-rw-r----- 1 cockroach cockroach 28M May 11 00:15 5267083.sst
-rw-r----- 1 cockroach cockroach 64M May 11 00:15 5267089.sst
-rw-r----- 1 cockroach cockroach 20M May 11 00:30 5267173.sst
-rw-r----- 1 cockroach cockroach 1.2M May 11 00:30 5267174.sst
-rw-r----- 1 cockroach cockroach 64M May 11 01:01 5267295.sst
-rw-r----- 1 cockroach cockroach 64M May 11 01:01 5267296.sst
-rw-r----- 1 cockroach cockroach 64M May 11 01:01 5267297.sst
-rw-r----- 1 cockroach cockroach 14M May 11 01:01 5267298.sst
-rw-r----- 1 cockroach cockroach 17M May 11 01:20 5267403.sst
-rw-r----- 1 cockroach cockroach 18M May 11 01:20 5267404.sst
-rw-r----- 1 cockroach cockroach 571K May 11 01:20 5267405.sst
-rw-r----- 1 cockroach cockroach 9.6M May 11 01:57 5267591.sst
-rw-r----- 1 cockroach cockroach 17M May 11 01:57 5267592.sst
-rw-r----- 1 cockroach cockroach 12M May 11 01:57 5267593.sst
-rw-r----- 1 cockroach cockroach 65M May 11 01:58 5267646.sst
-rw-r----- 1 cockroach cockroach 19M May 11 01:58 5267647.sst
-rw-r----- 1 cockroach cockroach 65M May 11 01:58 5267674.sst
-rw-r----- 1 cockroach cockroach 13M May 11 01:58 5267680.sst
-rw-r----- 1 cockroach cockroach 64M May 11 02:33 5267929.sst
-rw-r----- 1 cockroach cockroach 19M May 11 02:33 5267930.sst
-rw-r----- 1 cockroach cockroach 27M May 11 03:44 5268110.sst
-rw-r----- 1 cockroach cockroach 4.5M May 11 03:51 5268155.sst
-rw-r----- 1 cockroach cockroach 34M May 11 03:51 5268156.sst
-rw-r----- 1 cockroach cockroach 2.6M May 11 04:14 5268235.sst
-rw-r----- 1 cockroach cockroach 8.3M May 11 05:17 5268436.sst
-rw-r----- 1 cockroach cockroach 17M May 11 06:00 5268696.sst

Those .sst files are your data; don't delete them.
If your disk utilization is higher than you'd like, but in a relatively steady state after 10 days of uptime, try decreasing your gc.ttlseconds zone configuration so that deleted rows are garbage-collected more frequently. Any value comfortably longer than your slowest query is generally safe, although you'll want to make sure not to interfere with backups or change data capture jobs.
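For example, a minimal sketch (the database name is illustrative; the default in these versions is 90000 seconds, i.e. 25 hours):

    ALTER DATABASE mydb CONFIGURE ZONE USING gc.ttlseconds = 3600;

The same USING clause works at table or index scope if only part of your schema churns quickly.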
If your disk utilization is continuing to creep up over time, one of two things is likely happening:
1. Your data is growing faster than you're deleting rows. This is a good problem to have. You can solve it with money by adding more nodes, or by periodically deleting stale data (or archiving it in an external system). In CockroachDB 22.1 (in beta release as of May 11th, 2022), you can automate the first part of this process by setting ttl_expire_after on a table in a CREATE or ALTER statement (see the sketch after this list). Archiving can be done via scheduled backups or changefeeds.
2. Something is preventing garbage collection. Backups and changefeeds protect old versions of rows from being garbage-collected until they've finished exporting them. If you have a long-running job that's unable to connect to its external sink, it could prevent garbage collection indefinitely; the fix is to fix the job. Other long-running jobs can prevent garbage collection of metadata if something has gone wrong (see the jobs query after this list).
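The 22.1 row-level TTL mentioned in the first case looks like this (a sketch; the table name and interval are illustrative):

    ALTER TABLE events SET (ttl_expire_after = '90 days');

For the second case, a sketch of a query to spot long-running jobs that may be holding garbage collection back:

    SELECT job_id, job_type, status FROM [SHOW JOBS] WHERE status IN ('running', 'paused');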
Finally, if you're still in the first 10 days of a new cluster and you're seeing disk utilization slowly rise, that's expected. Your database is growing, and it's also accumulating time series statistical data for use in monitoring. No need to take action just yet.

Related

How can I segment/split a file in NiFi and get the small pieces?

Good night.
I have 5 files:
[azureuser@ibpoccloudera output]$ pwd
/home/azureuser/logs_auditoria/output
[azureuser@ibpoccloudera output]$ ls -lrth
total 5.1G
-rw-r--r-- 1 nifi nifi 1.2G Oct 6 00:38 auditoria_20200928.txt
-rw-r--r-- 1 nifi nifi 433M Oct 6 00:38 auditoria_20200927.txt
-rw-r--r-- 1 nifi nifi 1.5G Oct 6 00:38 auditoria_20200929.txt
-rw-r--r-- 1 nifi nifi 1.6G Oct 6 00:38 auditoria_20200925.txt
-rw-r--r-- 1 nifi nifi 427M Oct 6 00:38 auditoria_20200926.txt
And I want to split them into smaller pieces and put them in another directory using NiFi. I use this processor sequence:
GetFile -> SegmentContent -> PutFile
[Screenshots: GetFile, SegmentContent, and PutFile processor configurations]
But when I check my output directory (PutFile), I only get the last segment that SegmentContent produced.
Is there any option to get something like Linux split?
[azureuser@ibpoccloudera output]$ split -b 524288000 auditoria_20200929.txt auditoria_20200929
[azureuser@ibpoccloudera output]$ ls -lrth
total 6.5G
-rw-r--r-- 1 nifi nifi 1.2G Oct 6 00:38 auditoria_20200928.txt
-rw-r--r-- 1 nifi nifi 433M Oct 6 00:38 auditoria_20200927.txt
-rw-r--r-- 1 nifi nifi 1.5G Oct 6 00:38 auditoria_20200929.txt
-rw-r--r-- 1 nifi nifi 1.6G Oct 6 00:38 auditoria_20200925.txt
-rw-r--r-- 1 nifi nifi 427M Oct 6 00:38 auditoria_20200926.txt
-rw-rw-r-- 1 azureuser azureuser 500M Oct 6 00:54 auditoria_20200929aa
-rw-rw-r-- 1 azureuser azureuser 500M Oct 6 00:55 auditoria_20200929ab
-rw-rw-r-- 1 azureuser azureuser 500M Oct 6 00:55 auditoria_20200929ac
-rw-rw-r-- 1 azureuser azureuser 14M Oct 6 00:55 auditoria_20200929ad
I solved the problem using SplitText and UpdateAttribute.
I use SplitText because I have a JSON file, so if I use SegmentContent it sometimes cuts a record in half and I get errors.
With UpdateAttribute, I change the name of each file to a UUID, so I am sure I don't have any repeated records (see the sketch below).
[Screenshots: SplitText and UpdateAttribute processor configurations]
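A minimal sketch of that flow, assuming one JSON record per line; the property values are illustrative:

    GetFile -> SplitText -> UpdateAttribute -> PutFile

    SplitText properties:
        Line Split Count: 1000000    (records per output FlowFile, so no record is cut mid-line)
        Header Line Count: 0

    UpdateAttribute property (gives every split a collision-free name):
        filename: ${UUID()}

${UUID()} is NiFi Expression Language evaluated per FlowFile, so each segment lands in PutFile under a unique name.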

BASH script cron scheduled for midnight - running every minute until 12:20am

I have a Digital Ocean Ubuntu (18.04) LAMP server.
On it I have a simple bash script that creates a backup directory, dumps the site's SQL to a file, and makes a tar.gz of all the files.
Crontab Entry
# Backup files and database at midnight every night
0 0 * * * . /var/www/mysite.com/daily-backup.sh
There is only one cron daemon running on this server.
es@rental-link-production:/var/backups/rentallink$ ps aux | grep cron
es 606 0.0 0.0 14856 1100 pts/0 S+ 10:32 0:00 grep --color=auto cron
root 959 0.0 0.0 31852 3160 ? Ss Mar25 0:02 /usr/sbin/cron -f
Yet overnight it must have run almost every minute until 12:20am. There are no other cron scripts on this server.
es@mysite-production:/var/backups/mysite$ ls -lah
total 52K
drwxr-xr-x 13 es es 4.0K Apr 9 12:20 .
drwxr-xr-x 3 root root 4.0K Apr 10 06:25 ..
drwxrwxr-x 4 es es 4.0K Apr 9 12:02 2019-04-09-1202
drwxrwxr-x 4 es es 4.0K Apr 9 12:03 2019-04-09-1203
drwxrwxr-x 4 es es 4.0K Apr 9 12:07 2019-04-09-1207
drwxrwxr-x 4 es es 4.0K Apr 9 12:08 2019-04-09-1208
drwxrwxr-x 4 es es 4.0K Apr 9 12:11 2019-04-09-1211
drwxrwxr-x 4 es es 4.0K Apr 9 12:13 2019-04-09-1213
drwxrwxr-x 4 es es 4.0K Apr 9 12:14 2019-04-09-1214
drwxrwxr-x 4 es es 4.0K Apr 9 12:15 2019-04-09-1215
drwxrwxr-x 4 es es 4.0K Apr 9 12:16 2019-04-09-1216
drwxrwxr-x 4 es es 4.0K Apr 9 12:18 2019-04-09-1218
drwxrwxr-x 4 es es 4.0K Apr 9 12:20 2019-04-09-1220
Oddly enough, on the staging server, which is an identical setup, it ran twice instead of just once at midnight - which is still once more than I was expecting.
es@mysite-staging:/var/backups/mysite$ ls -lah
total 16K
drwxr-xr-x 4 es es 4.0K Apr 9 12:20 .
drwxr-xr-x 3 root root 4.0K Apr 10 06:25 ..
drwxrwxr-x 4 es es 4.0K Apr 9 12:01 2019-04-09-1201
drwxrwxr-x 4 es es 4.0K Apr 9 12:20 2019-04-09-1220
I could almost expect to see this if the job failed and Linux tried to restart it... but it's a couple of mkdirs, a mysqldump call and a tar call. The CPU hasn't gone over 10% in the last 24 hours. It's not like it ran out of resources.
Any ideas?
Check these cases:
1. Do other users call it?
2. Does it call itself?
3. Does it set in motion conditions for repetition?
As a test, append the output of whoami to the backup directory name (see the sketch below).
As another test, use ps -A to check whether crond is running more than once.
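A minimal sketch of that logging test, assuming the script path from the crontab entry; the log file location is illustrative:

    # add as the first line of /var/www/mysite.com/daily-backup.sh
    echo "$(date '+%F %T') user=$(whoami) ppid=$PPID" >> /tmp/daily-backup-invocations.log

If the log shows a user other than the crontab's owner, another user's crontab is calling the script; entries clustered outside 00:00 would mean something other than this cron entry is invoking it.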

What directories does Linux have that macOS doesn't?

I'm a Python programmer trying to make a system of sorts that creates save data for a game I'm making, and I want it to live in different places on something like Ubuntu than on macOS. As I don't have a Mac, and it's impossible to just up and get an ISO for a macOS VM, I can't look through the files and folders to see what it has (or doesn't) that Linux does or doesn't.
I've tried looking all over, to the point of attempting to get hold of an ISO to build a VM in VirtualBox, but haven't been successful at all.
What files or folders does macOS have that Linux doesn't, or vice versa?
To complete the (excellent) answer from @Michael, here is the listing of the root and home directories after a fresh install of the latest stable macOS release (10.14.2):
MacBook-Pro:~ max$ ls -al /
total 37
drwxr-xr-x 26 root wheel 832 Jan 6 19:00 .
drwxr-xr-x 26 root wheel 832 Jan 6 19:00 ..
-rw-rw-r-- 1 root admin 8196 Jan 13 07:11 .DS_Store
drwx------ 5 root admin 160 Jan 6 11:47 .Spotlight-V100
d-wx-wx-wt 2 root wheel 64 Jan 14 06:39 .Trashes
---------- 1 root admin 0 Aug 18 06:53 .file
drwx------ 11 root admin 352 Jan 14 06:39 .fseventsd
drwxr-xr-x 2 root wheel 64 Aug 18 06:53 .vol
drwxrwxr-x+ 39 root admin 1248 Nov 30 12:49 Applications
drwxr-xr-x+ 60 root wheel 1920 Nov 30 12:50 Library
drwxr-xr-x 2 root wheel 64 Aug 18 06:53 Network
drwxr-xr-x@ 5 root wheel 160 Nov 30 12:46 System
drwxr-xr-x 5 root admin 160 Jan 6 18:59 Users
drwxr-xr-x+ 4 root wheel 128 Jan 14 06:39 Volumes
drwxr-xr-x@ 37 root wheel 1184 Nov 30 12:55 bin
drwxrwxr-t 2 root admin 64 Aug 18 06:53 cores
dr-xr-xr-x 3 root wheel 4301 Jan 14 06:39 dev
lrwxr-xr-x@ 1 root wheel 11 Jan 6 18:49 etc -> private/etc
dr-xr-xr-x 2 root wheel 1 Jan 14 06:40 home
-rw-r--r-- 1 root wheel 313 Aug 18 10:03 installer.failurerequests
dr-xr-xr-x 2 root wheel 1 Jan 14 06:40 net
drwxr-xr-x 6 root wheel 192 Nov 30 12:50 private
drwxr-xr-x@ 64 root wheel 2048 Jan 6 18:49 sbin
lrwxr-xr-x@ 1 root wheel 11 Jan 6 18:49 tmp -> private/tmp
drwxr-xr-x@ 9 root wheel 288 Nov 30 12:38 usr
lrwxr-xr-x@ 1 root wheel 11 Jan 6 18:49 var -> private/var
And the home dir:
MacBook-Pro:~ max$ ls -al ~
total 16
drwxr-xr-x+ 15 max staff 480 Jan 14 06:43 .
drwxr-xr-x 5 root admin 160 Jan 6 18:59 ..
-r-------- 1 max staff 7 Jan 6 18:59 .CFUserTextEncoding
drwx------ 2 max staff 64 Jan 14 06:40 .Trash
-rw------- 1 max staff 0 Jan 13 07:11 .bash_history
drwx------ 10 max staff 320 Jan 14 06:40 .bash_sessions
-rw------- 1 max staff 908 Jan 14 06:43 .viminfo
drwx------+ 3 max staff 96 Jan 6 18:59 Desktop
drwx------+ 3 max staff 96 Jan 6 18:59 Documents
drwx------+ 3 max staff 96 Jan 6 18:59 Downloads
drwx------@ 51 max staff 1632 Jan 13 07:11 Library
drwx------+ 3 max staff 96 Jan 6 18:59 Movies
drwx------+ 3 max staff 96 Jan 6 18:59 Music
drwx------+ 3 max staff 96 Jan 6 18:59 Pictures
drwxr-xr-x+ 4 max staff 128 Jan 6 18:59 Public
Application settings on macOS are usually saved somewhere in ~/Library. A common place is ~/Library/Preferences/com.example.mycoolgame.plist for preferences (it should be in plist format, and "com.example.mycoolgame" should be a valid bundle ID that you own, meaning you should own the domain). The advantage/disadvantage of this path is that power users know about this directory and can edit the files there as they wish.
Then you have ~/Library/Caches for cached data. All cached data should be put somewhere under this directory. (Never use it for content that cannot be regenerated or redownloaded though.)
~/Library/Application Support/YourApplicationName/...: here you can basically do anything you like. It would be good if "YourApplicationName" were globally unique, so better make it long. Users usually don't see the filesystem contents of anything below ~/Library, so there is no need for short names.
Of course, you can also put your savegames in ~/Documents/MyCoolGame/savegames and tell the user that you save the games there.
A gotcha (maybe): I'm not sure whether system APIs expand "~" properly. I think probably not: calling fopen with a path that starts with "~" would most likely not do the right thing. The user's home directory is located at something like "/Users/max", so "~" expands to "/Users/max" on the command line if the username is "max".
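Since the asker works in Python, a minimal sketch of picking a per-platform save directory (the app name is illustrative); os.path.expanduser takes care of the "~" expansion mentioned above:

    import os
    import sys

    def save_dir(app_name="MyCoolGame"):
        # Return (and create) a per-user save-data directory for this platform.
        if sys.platform == "darwin":
            # macOS convention: ~/Library/Application Support/<AppName>
            base = os.path.expanduser("~/Library/Application Support")
        else:
            # Linux convention (XDG Base Directory spec): $XDG_DATA_HOME, else ~/.local/share
            base = os.environ.get("XDG_DATA_HOME") or os.path.expanduser("~/.local/share")
        path = os.path.join(base, app_name)
        os.makedirs(path, exist_ok=True)  # "~" is already resolved, so this creates a real path
        return path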
I can't answer your original question "What directories does Linux have that macOS doesn't?" because I don't have a Linux box at hand at the moment, and I don't think it would be helpful for your use case anyway.
For global data, there is also the /Library hierarchy. But normal users don't have write access to this place, so your game would need to ask for admin rights, which would make everything much more complicated and feel user-unfriendly to macOS users. The macOS way is to have a self-contained application bundle and put all user-specific or temporary data into the appropriate place within the user's home folder.

Metabase as a service with an Oracle database

I tried the "Install Metabase as a service" tutorial for setting Metabase up on CentOS. However, I want to run Metabase with Oracle.
[bkmetabase@BK01 MetaBase]$ pwd
/home/bkmetabase/MetaBase
[bkmetabase@BK01 MetaBase]$ ll
total 119488
-rw-rw-r-- 1 bkmetabase bkmetabase 3975 Jun 6 07:52 logger_metabase.log
-rw-rw-r-- 1 bkmetabase bkmetabase 7503872 Jun 6 07:52 metabase.db.mv.db
-rw-rw-r-- 1 bkmetabase bkmetabase 187 May 27 12:12 metabase.db.trace.db
-rw-rw-r-- 1 bkmetabase bkmetabase 114839203 May 20 13:10 metabase.jar
drwxrwxr-x 2 bkmetabase bkmetabase 4096 May 27 12:05 plugins
[bkmetabase@BK01 MetaBase]$ cd plugins/
[bkmetabase@BK01 MetaBase]$ ll
total 3320
-rw-rw-r-- 1 bkmetabase bkmetabase 3397734 May 5 17:13 ojdbc7.jar
Currently, to run it as a process, I start it like below:
~/jdk1.8.0_171/bin/java -jar metabase.jar &> logger_metabase.log &
How can we run it as a background service with Oracle as the database?
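For reference, a minimal systemd unit sketch, assuming the paths shown in the question; Oracle connectivity comes from the ojdbc7.jar already in the plugins directory, which Metabase picks up via the MB_PLUGINS_DIR environment variable:

    # /etc/systemd/system/metabase.service (a sketch; all paths are taken from the question)
    [Unit]
    Description=Metabase
    After=network.target

    [Service]
    User=bkmetabase
    WorkingDirectory=/home/bkmetabase/MetaBase
    # point Metabase at the directory containing ojdbc7.jar
    Environment=MB_PLUGINS_DIR=/home/bkmetabase/MetaBase/plugins
    ExecStart=/home/bkmetabase/jdk1.8.0_171/bin/java -jar /home/bkmetabase/MetaBase/metabase.jar
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

Enable it with systemctl daemon-reload followed by systemctl enable --now metabase; stdout then goes to the journal (journalctl -u metabase) rather than logger_metabase.log.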

Logstash Persistent Queues Not Creating Tail Files

I have just started playing with Logstash 5.4.0 persistent queues.
I have configured Logstash to use persistent queues, but it always writes to the head page and never rolls the head over to a tail file.
My logstash.yml is as follows:
queue.checkpoint.writes: 1
queue.type: persisted
path.queue: /usr/share/logstash/persisted-queues
queue.page_capacity: 1000mb
And it creates:
-rw-r--r-- 1 root root 1048576000 Feb 23 14:14 page.1
-rw-r--r-- 1 root root 34 Feb 23 14:14 checkpoint.head
A few minutes later I get:
-rw-r--r-- 1 root root 1048576000 Feb 23 14:15 page.1
-rw-r--r-- 1 root root 34 Feb 23 14:14 checkpoint.head
The size of the file remains constant, and when I cat the page file I can see its contents changing.
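One plausible explanation (an assumption, not confirmed in the thread): a head page only becomes a tail page once it fills up, and with queue.page_capacity: 1000mb that requires roughly 1 GB of buffered events. A sketch for verifying this with a smaller page size:

    queue.type: persisted
    path.queue: /usr/share/logstash/persisted-queues
    queue.checkpoint.writes: 1
    queue.page_capacity: 64mb    # smaller pages fill, and roll over to tail files, sooner

If tail files appear with the smaller capacity, the original setup was simply never writing enough data to fill a 1000mb page.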
