Disable ElasticSearch logs in Liferay 7 - elasticsearch

I've started using Liferay v7, and am getting a lot of the following log messages:
17:14:12,265 WARN [elasticsearch[Mirage][management][T#1]][decider:157] [Mirage] high disk watermark [90%] exceeded on [fph02E6ISIWnZ5cxWw_mow][Mirage][/Users/randy/FasterPayments/src/eclipse/com.rps.portal/com.rps.portal.backoffice/bundles/data/elasticsearch/indices/LiferayElasticsearchCluster/nodes/0] free: 46gb[9.9%], shards will be relocated away from this node
To be honest, I'd rather not spend time learning about ElasticSearch right now. Is it possible to simply disable ElasticSearch in a Liferay 7 dev environment, or is there some other way to remove these log messages?

Go to Control Panel / Configuration / System Settings / Foundation / Elasticsearch.
Under "Additional Configurations" enter
cluster.routing.allocation.disk.threshold_enabled: True
cluster.routing.allocation.disk.watermark.low: 30gb
cluster.routing.allocation.disk.watermark.high: 20gb
or whatever are appropriate values for your system (there is value in being warned that the disk is almost full).
Save & Restart (the values seem not to be picked up at runtime).
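To see how the absolute values above behave, here is a minimal Python sketch. It assumes that a byte-valued watermark means "minimum free space", which is why the low watermark (30gb) is the larger number and is reached first. The function name, the returned labels, and the exact comparison are illustrative assumptions, not Liferay or Elasticsearch code.

```python
import shutil

GB = 1024 ** 3

def watermark_state(free_bytes, low_free=30 * GB, high_free=20 * GB):
    """Classify free disk space against absolute (free-space) watermarks.

    Assumption: a watermark is considered exceeded once free space drops
    below the configured value.
    """
    if free_bytes < high_free:
        return "high"   # shards would be relocated away from the node
    if free_bytes < low_free:
        return "low"    # no new shards would be allocated to the node
    return "ok"

# Example: check the disk that holds the current directory.
# print(watermark_state(shutil.disk_usage(".").free))
```

With the 46gb of free space reported in the log message from the question, this sketch returns "ok", which is why switching from the percentage default to these absolute values silences the warning on a large, mostly full disk.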

Liferay needs an index/search engine, such as Elasticsearch or Solr. DXP defaults to Elasticsearch, so disabling it makes no sense.
The warnings tell you that disk usage on the node has exceeded the configured high watermark for shard allocation. You can change this setting in your elasticsearch.yml (cluster.routing.allocation.disk.watermark.high).
If the log messages annoy you, you can change your logging settings. I'm not sure if it's still valid in DXP, but have a look at https://dev.liferay.com/es/discover/deployment/-/knowledge_base/6-2/liferays-logging-system.

Related

Solr 8 Performance issue on restart

I am trying to figure out why a Solr core doesn't respond after a restart of the Solr daemon. I have multiple cores in a leader/follower configuration, each core serving certain business needs.
When I restart Solr on the server, the cores that have fewer than 100K documents are available immediately when they are queried.
But there are 2 specific cores, with around 2 to 3M documents each, that take around 2 minutes to become available for querying.
I know about the warm-up and firstSearcher queries, but those queries are commented out, so they should not be running.
I noticed a difference when I set this to "true" (the default value is false):
<useColdSearcher>true</useColdSearcher>
With that change, the core that has more than 2M documents is available immediately after a restart of Solr.
This never happened with Solr 6.6. Is this something new in Solr 8.x?
Can someone who has experienced this shed some light on it?
In Solr 6.x we had the defaults and the cores were available right away. But the same settings in Solr 8.11 don't make the core available right after a restart.
thanks in advance
B
Since I did not get an answer, I tried the following experiments.
I changed useColdSearcher to true and restarted the core. The core then started right away and began serving requests.
I also ran a load test with useColdSearcher set to both true and false, and I did not see much of a difference.
The default value of useColdSearcher in solrconfig.xml is false. With the same index and a similar configuration, Solr 6 started the searcher quickly, but Solr 8 did not until I made the above change.
I also put the question to ChatGPT. Its response was:
The "useColdSearcher" setting in Solr can potentially slow down the process of registering a new searcher in Solr 8.x, but it shouldn't have any effect on Solr 6.x.
It's important to note that useColdSearcher is only available for SolrCloud mode and not for standalone mode.
This setting is not available in Solr 6.x, so it wouldn't have any impact on the registration of new searchers in that version.
Since my setup is leader/follower, I guess I should be fine setting useColdSearcher to true.
You should run the above tests yourself before deciding on a course of action, but it worked for me, so I wanted to post the answer.
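To repeat the restart experiment in a measurable way, a small script can poll a core until it answers. This is a minimal Python sketch: it assumes the core exposes Solr's ping handler at /admin/ping, and the base URL, core name, and helper names are placeholders rather than anything from the setup above.

```python
import time
import urllib.error
import urllib.request

def ping_core(base_url, core, timeout=5):
    """Return True if the core's ping handler answers with HTTP 200.

    Assumption: the ping handler is reachable at <base_url>/<core>/admin/ping.
    """
    url = f"{base_url}/{core}/admin/ping?wt=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False

def seconds_until_ready(probe, max_wait=300.0, interval=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the elapsed seconds, or None if max_wait is reached first.
    clock and sleep are injectable so the loop can be tested offline.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)

# Hypothetical usage right after restarting Solr:
# print(seconds_until_ready(lambda: ping_core("http://localhost:8983/solr", "bigcore")))
```

Running it once with useColdSearcher set to false and once with true gives a number to compare instead of an impression.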

How to turn off Elasticsearch JVM Garbage Collection logs

After settling into our ELK stack log aggregation setup over the past few months, I have noticed that a significant percentage of the logs we are persisting come from Elasticsearch garbage collection.
I have tried to ignore these logs specifically in the Filebeat configuration, but I seem to have been unsuccessful. Is there a way via configuration to turn this logging off until I need it? Or a way to ignore these log files that I am not currently using?
Here is a quote from the official Elasticsearch documentation:
By default, Elasticsearch enables garbage collection (GC) logs. These are configured in jvm.options and output to the same default location as the Elasticsearch logs. The default configuration rotates the logs every 64 MB and can consume up to 2 GB of disk space.
You can reconfigure JVM logging using the command line options described in JEP 158: Unified JVM Logging. Unless you change the default jvm.options file directly, the Elasticsearch default configuration is applied in addition to your own settings. To disable the default configuration, first disable logging by supplying the -Xlog:disable option, then supply your own command line options. This disables all JVM logging, so be sure to review the available options and enable everything that you require.
For more details: GC logging settings
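As a rough illustration of the quoted advice, the Python sketch below writes a small options file that first disables all JVM logging and then restores only warnings on stderr. It assumes an Elasticsearch version that reads extra options from a jvm.options.d directory under the config directory and a JVM that supports the unified -Xlog syntax (JDK 9+); on other setups the same two lines would go into jvm.options itself. The file name and the helper are made up for this example.

```python
from pathlib import Path

# JEP 158 unified-logging options:
#   -Xlog:disable turns off all JVM logging, including the default GC log;
#   the second line restores the JVM default (warnings to stderr).
GC_QUIET_OPTIONS = [
    "-Xlog:disable",
    "-Xlog:all=warning:stderr:utctime,level,tags",
]

def write_gc_options(config_dir, filename="gc.options"):
    """Write the options to <config_dir>/jvm.options.d/<filename>.

    Assumption: the node picks up *.options files from jvm.options.d.
    Returns the path of the written file.
    """
    target = Path(config_dir) / "jvm.options.d" / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("\n".join(GC_QUIET_OPTIONS) + "\n", encoding="utf-8")
    return target

# Hypothetical usage (the path is a placeholder):
# write_gc_options("/etc/elasticsearch")
```

The node has to be restarted for JVM options to take effect, and GC logging can be brought back later by deleting the file.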

Archive old data from Elasticsearch to Google Cloud Storage

I have an Elasticsearch server installed on a Google Compute Engine instance. A huge amount of data is ingested every minute and the underlying disk fills up pretty quickly.
I understand we can increase the size of the disks, but that would cost a lot for storing long-term data.
We need 90 days of data on the Elasticsearch server (Compute Engine disk), and data older than 90 days (up to 7 years) stored in Google Cloud Storage buckets. The older data should be retrievable in case it is needed for later analysis.
One way I know is to take snapshots frequently and delete the indices older than 90 days from the Elasticsearch server using Curator. This way I can keep the disks free and minimize the storage cost.
Is there any other way to do this without building that automation myself?
For example, something provided by Elasticsearch out of the box that archives data older than 90 days and keeps the data files on disk, so that we can then manually move those files from the disk to Google Cloud Storage.
There is no way around it: to make backups of your data you need to use the snapshot/restore API. It is the only safe and reliable option available.
There is a plugin to use Google Cloud Storage as a repository.
If you are using version 7.5+ and Kibana with the basic license, you can configure snapshots directly from the Kibana interface. If you are on an older version or do not have Kibana, you will need to rely on Curator or a custom script run from crontab.
While you can copy the data directory, you would need to stop your entire cluster every time you want to copy the data, and to restore it you would also need to create a new cluster from scratch every time. That is a lot of work and not practical when you have something like the snapshot/restore API.
Look into Snapshot Lifecycle Management and Index Lifecycle Management. They are available with a Basic license.
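As a rough sketch of how these pieces could fit the 90-day / 7-year requirement, the Python below only builds the three REST requests (it does not send them). The repository, bucket, and policy names, the schedule, and the "2555d" (about 7 years) retention are assumptions chosen for illustration. The wait_for_snapshot action is, as far as I know, only available in later 7.x releases, so drop it on older versions.

```python
import json

def gcs_repository_request(repo="gcs_archive", bucket="my-es-archive"):
    """Register a snapshot repository backed by a GCS bucket (needs the GCS repository plugin)."""
    return ("PUT", f"/_snapshot/{repo}",
            {"type": "gcs", "settings": {"bucket": bucket}})

def slm_policy_request(policy="daily-archive", repo="gcs_archive"):
    """SLM policy: one snapshot per day, each kept for roughly 7 years."""
    body = {
        "schedule": "0 30 1 * * ?",          # daily at 01:30 (cron syntax)
        "name": "<daily-{now/d}>",           # date-math snapshot name
        "repository": repo,
        "config": {"indices": ["*"]},
        "retention": {"expire_after": "2555d"},
    }
    return ("PUT", f"/_slm/policy/{policy}", body)

def ilm_policy_request(policy="keep-90-days", slm_policy="daily-archive"):
    """ILM policy: delete an index after 90 days, once the SLM policy has run."""
    body = {"policy": {"phases": {"delete": {
        "min_age": "90d",
        "actions": {
            "wait_for_snapshot": {"policy": slm_policy},
            "delete": {},
        },
    }}}}
    return ("PUT", f"/_ilm/policy/{policy}", body)

# Print the requests so they can be pasted into Kibana Dev Tools or sent with curl.
for method, path, body in (gcs_repository_request(),
                           slm_policy_request(),
                           ilm_policy_request()):
    print(method, path)
    print(json.dumps(body, indent=2))
```

The ILM policy only applies to indices that reference it (for example through index.lifecycle.name in an index template), and restoring old data means restoring the relevant snapshot from the bucket.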

Can I disable the bootstrap checks in Elasticsearch 5.4?

I'm running Elasticsearch on a non-production RHEL6 server. I only have a regular user account with no root access. I'm in a very locked-down corporate environment so getting root will be time-consuming and I need a work-around.
When I start the process I get these errors:
max file descriptors [8192] for elasticsearch process is too low, increase to at least [65536]
max number of threads [1024] for user [salimfadhley] is too low, increase to at least [2048]
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
Is there a way to make ElasticSearch ignore these error conditions and just start up?
I'm fully aware that ignoring errors is normally considered unwise, however on this occasion I just need to get ES up and running so that I can devote my attention to other aspects of this project: Getting the system limits raised will take more time than I currently have available.
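For reference, the three limits can be inspected from a regular user account without root. This is a minimal Python sketch for Linux: the minimums are copied from the error messages above, and the helper names are made up for this example.

```python
import resource

# Minimums taken from the three error messages above.
REQUIRED = {
    "max file descriptors": 65536,
    "max number of threads": 2048,
    "vm.max_map_count": 262144,
}

def read_current_limits():
    """Read the soft limits of the current user/process (Linux)."""
    limits = {
        "max file descriptors": resource.getrlimit(resource.RLIMIT_NOFILE)[0],
        "max number of threads": resource.getrlimit(resource.RLIMIT_NPROC)[0],
    }
    try:
        with open("/proc/sys/vm/max_map_count") as fh:
            limits["vm.max_map_count"] = int(fh.read().strip())
    except OSError:
        pass  # not available on this system; leave the entry out
    return limits

def shortfalls(current, required=REQUIRED):
    """Return {name: (current, minimum)} for every limit that is too low."""
    result = {}
    for name, minimum in required.items():
        value = current.get(name)
        if value is None or value == resource.RLIM_INFINITY:
            continue  # unknown or unlimited
        if value < minimum:
            result[name] = (value, minimum)
    return result

# Example: print(shortfalls(read_current_limits()))
```

An empty result means the current account already meets the minimums; anything listed still has to be raised by an administrator.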
@ThomasDecaux is correct that you can technically disable bootstrap checks, but only in some situations. You can disable them when:
you have configured your elasticsearch.yml with discovery.type: single-node
or your N nodes are all using zen discovery on localhost
If you need to run N nodes on N machines as one cluster, then no, you cannot disable bootstrap checks.
When you configure your elasticsearch.yml to use an external interface and you don't have discovery.type: single-node, bootstrap checks cannot be disabled. I tried.
I had a machine failing the bootstrap checks, but I didn't have sudo permission to fix it. I tried to disable the checks by passing -Des.enforce.bootstrap.checks=false in the JVM options, but the bootstrap checks were still enabled.
Here is a GitHub issue from 2018 where the developers say you can't disable bootstrap checks (https://github.com/elastic/elasticsearch/issues/31933):
"There is no command line option for disabling bootstrap checks. The es.enforce.bootstrap.checks option is used to enable them when they are disabled due to Elasticsearch not detecting that it is being used in production (single node, only reachable through localhost or using single-node discovery)."
Yes you can!
This is very dirty, but I found that if you configure discovery.type as single-node, no bootstrap checks will run.
Yes, that means you cannot test a cluster on your laptop.
See https://github.com/elastic/elasticsearch/issues/21655
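The rule described in this thread can be summarised in a few lines. The Python below is only a simplified model of that rule, not Elasticsearch code: the function and parameter names are invented, and it assumes that "reachable only through localhost" means every bound address is a loopback address.

```python
from ipaddress import ip_address

def _is_loopback(host):
    """True for 'localhost', the '_local_' placeholder, or a loopback IP."""
    if host in ("localhost", "_local_"):
        return True
    try:
        return ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are treated as external

def bootstrap_checks_enforced(discovery_type=None,
                              bind_hosts=("localhost",),
                              enforce_flag=False):
    """Simplified model of when bootstrap checks fail startup.

    enforce_flag models es.enforce.bootstrap.checks=true, which can only
    turn enforcement on, never off.
    """
    if enforce_flag:
        return True
    if discovery_type == "single-node":
        return False
    return not all(_is_loopback(h) for h in bind_hosts)

# Examples:
# bootstrap_checks_enforced(bind_hosts=("192.168.1.10",))                  -> True
# bootstrap_checks_enforced("single-node", bind_hosts=("192.168.1.10",))   -> False
```

This matches the behaviour reported above: setting the flag to false changes nothing, while single-node discovery or a localhost-only binding keeps the checks as warnings.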

SonarQube 5.1 too busy due to ElasticSearch

I have recently migrated from SonarQube 3.7.2 to SonarQube 5.1. The update was successful and I was able to run analysis.
However, now I cannot reach the server, and from the log it seems ElasticSearch is slowly eating away my disk space.
I tried restarting the server and deleting the data/es directory, but nothing helped.
sonar.log is full of these lines:
...
2015.05.18 00:00:13 WARN es[o.e.c.r.a.decider] [sonar-1431686361188] high disk watermark [10%] exceeded on [Jbz_O0pFRKecav4NT3DWzQ][sonar-1431686361188] free: 5.6gb[3.8%], shards will be relocated away from this node
2015.05.18 00:00:13 INFO es[o.e.c.r.a.decider] [sonar-1431686361188] high disk watermark exceeded on one or more nodes, rerouting shards
...
There are just a few Java projects, but two of them are around a couple million lines of code (LOC).
Your server does not have enough available disk space to feed its internal Elasticsearch indices.
Note that an external volume can be used by setting the property sonar.path.data (see conf/sonar.properties).
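To keep an eye on how much space is left before moving the data directory, the free-space figures can be pulled out of the warning lines themselves. This is a minimal Python sketch; the regular expression is written against the log format shown in the question and may need adjusting for other versions.

```python
import re

# Matches e.g. "high disk watermark [10%] exceeded on [...] free: 5.6gb[3.8%]"
DECIDER_RE = re.compile(
    r"high disk watermark \[(?P<watermark>[\d.]+)%\] exceeded on .*?"
    r"free: (?P<free>[\d.]+)(?P<unit>[a-z]+)\[(?P<free_pct>[\d.]+)%\]"
)

def parse_watermark_warning(line):
    """Extract watermark and free-space figures from a decider WARN line.

    Returns None for lines that do not carry these figures.
    """
    m = DECIDER_RE.search(line)
    if m is None:
        return None
    return {
        "watermark_pct": float(m["watermark"]),
        "free": float(m["free"]),
        "unit": m["unit"],
        "free_pct": float(m["free_pct"]),
    }
```

For the WARN line above this yields 5.6 gb free (3.8%) against a 10% watermark, while the INFO line that follows it carries no figures and returns None.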
