Solr 8 Performance issue on restart - performance

I am trying to figure out why a Solr core doesn't respond after a restart of the Solr daemon. I have multiple cores in a leader/follower configuration, each core serving certain business needs.
When I restart Solr on the server, the cores that have <100K documents show up immediately when they are queried.
But there are 2 specific cores with around 2 to 3 million documents that take around 2 minutes to become available for querying.
I know about the warmup / firstSearcher queries, etc., but those queries are commented out, so the firstSearcher queries should not be running.
I noticed that when I set this to "true" (the default value is false):
<useColdSearcher>true</useColdSearcher>
the core that has 2M+ documents shows up immediately on a restart of Solr.
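For context, a minimal sketch of where this setting lives (a hypothetical solrconfig.xml excerpt; the element and attribute names are the standard ones, the values are illustrative, not a recommendation):

```xml
<query>
  <!-- Serve queries immediately on restart instead of blocking until the
       first searcher has finished warming (the default is false) -->
  <useColdSearcher>true</useColdSearcher>

  <!-- firstSearcher/newSearcher warmup listeners are registered in this same
       section; in my setup those listeners are commented out -->
</query>
```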
This never happened in the Solr 6.6 world. Is this something new in Solr 8.x?
Can someone who has experienced this shed some light on it?
In Solr 6.x we had the defaults and the cores were available right away, but the same settings in Solr 8.11 don't make the core available after a restart.
thanks in advance
B

Since I did not get an answer, I tried the following experiments.
I changed useColdSearcher to true and restarted the core; the core started right away and began serving requests.
I also ran a load test with useColdSearcher=true and did not see much of a difference; I tried the load test with both true and false.
The default in solrconfig.xml is useColdSearcher=false, so with the same index and a similar configuration Solr 6 started the searcher quickly, but Solr 8 did not until I made the above change.
I experimented with questions on ChatGPT as well. Its response:
The "useColdSearcher" setting in Solr can potentially slow down the process of registering a new searcher in Solr 8.x, but it shouldn't have any effect on Solr 6.x.
It's important to note that useColdSearcher is only available for SolrCloud mode and not for standalone mode.
This setting is not available in Solr 6.x, so it wouldn't have any impact on the registration of new searchers in that version.
Since my setup is leader -> follower, I guess I should be fine setting useColdSearcher to true.
One should try the above tests before deciding on a course of action, but it worked for me, so I wanted to post the answer.

Related

Designing ElasticSearch Migration from 6.8 to 7.16 along with App Deployment

I have a Spring Boot application that uses Elasticsearch 6.8, and I would like to migrate it to Elasticsearch 7.16 with the least downtime. I can do a rolling update, but the problem is that when I migrate my ES cluster from version 6 to 7, some features in my application fail because of breaking changes (for example, the total hits response change).
I also upgraded my Elasticsearch client to version 7 in a separate branch and I can deploy it as well, but that client doesn't work with ES version 6, so I cannot first release the application and then do the ES migration. I thought about doing the application deployment and the ES migration at the same time with a few hours of downtime, but if something goes wrong, a rollback may take too much time (we have >10TB of data in PROD).
I still couldn't find a good solution to this problem. I'm thinking of migrating only the ES data nodes to 7.16 and keeping the master nodes on 6.8, then doing the application deployment and migrating the Elasticsearch master nodes together with a small downtime. Has anyone tried this? Would running the data and master nodes of my Elasticsearch cluster on different versions (6.8 and 7.16) cause problems?
Any help / suggestion is much appreciated
The breaking change you mention can be alleviated by using the query-string parameter rest_total_hits_as_int=true in your client code in order to keep getting the total hit count as in version 6 (mentioned in the same link you shared).
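To illustrate the breaking change itself: ES 6.x (or 7.x with rest_total_hits_as_int=true) returns the total hit count as a plain integer, while 7.x by default returns an object. A small, hedged sketch of a client-side normalizer that tolerates both shapes (the response field names follow the documented 6.x/7.x search response formats):

```python
def total_hits(resp):
    """Normalize the total-hit count across ES 6.x and 7.x response shapes.

    ES 6.x (or 7.x with rest_total_hits_as_int=true):
        {"hits": {"total": 42, ...}}
    ES 7.x default:
        {"hits": {"total": {"value": 42, "relation": "eq"}, ...}}
    """
    total = resp["hits"]["total"]
    if isinstance(total, int):
        return total
    return total["value"]

# Both shapes yield the same count:
v6 = total_hits({"hits": {"total": 42}})                                  # 42
v7 = total_hits({"hits": {"total": {"value": 42, "relation": "eq"}}})     # 42
print(v6, v7)
```

Shipping a shim like this in the 6.8 client branch is one way to make the same application code survive both cluster versions during the rolling upgrade.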
Running master and data nodes with different versions is not supported, and I would not venture into it. If you have a staging environment where you can test this upgrade procedure, all the better.
Since 6.8 clients are compatible with 7.16 clusters, you can add that small bit to your 6.8 client code, and then you should be able to upgrade your cluster to 7.16.
When your ES server is upgraded, you can upgrade your application code to use the 7.16 client and you'll be good.
As usual with upgrades, since you cannot revert them once started, you should test this in a test environment first.

elasticsearch temporary crash when heavily adding/updating

I am running Elasticsearch on a dedicated server on a SaaS platform. The problem is that when cron jobs execute and massively update/insert new values into Elasticsearch, the front office (the site) fails to connect to Elasticsearch (the connection returns false).
Does anyone know what the problem can be and how it can be fixed? We are running the latest stable Elasticsearch version.
This happens on and off: when I refresh the page in the front office, sometimes it cannot connect to Elasticsearch; after another refresh it works again, and so on, until the heavy load passes.
We have NVMe drives, and Elasticsearch runs on that server alone (a single node, not a multi-node cluster).
When i say heavily, I mean 1000-2000 updates per second.
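No answer was posted for this question, but a common client-side mitigation for intermittent connection failures during bulk loads is to retry with a short backoff (the official Elasticsearch clients also expose built-in retry options such as max_retries). A minimal, library-agnostic Python sketch:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5):
    """Call fn(); on ConnectionError, retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))

# Simulate a front-office query that fails twice while the bulk load runs.
calls = {"n": 0}
def flaky_search():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("cluster busy")
    return {"hits": 10}

result = with_retries(flaky_search, base_delay=0.01)
print(result)  # {'hits': 10} after two retries
```

This only papers over the symptom; throttling the cron jobs' bulk request rate or batch size is the more direct fix for the overload itself.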

Why SolrCloud slows down when one instance killed?

I am using SolrCloud (version 4.7.1) with 4 instances and embedded ZooKeeper (test environment).
When I simulate failure of one of the instances, the indexing speed goes from 4 seconds to 17 seconds.
It goes back to 4 seconds after the instance is brought back to life.
Search speed is not affected.
Our production environment shows similar behavior (only the configuration is more complex).
Is this normal or did I miss some configuration option?
It is due to having ZooKeeper embedded in the Solr cluster.
Please try an external ZooKeeper; that setup gives the expected results.
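A sketch of what "external ZooKeeper" looks like in practice. Note the -cloud/-z flags below are from the modern bin/solr launcher; Solr 4.x-era installs passed -DzkHost to start.jar instead, so adapt to your version. Host names are placeholders:

```shell
# Run a standalone 3-node ZooKeeper ensemble (configured separately in zoo.cfg),
# then point every Solr node at it instead of using the embedded ZooKeeper:
bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181
```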

Apache Solr requires regular restart

I have Solr installed and set up on my Drupal 7 site. Most of the time it works as expected. However, every so often, maybe every other day at least, the search will suddenly stop working and according to the Drupal error log I get:
"0" Status: Request failed: Connection refused.
The Type column says Apache Solr. To fix this I just restart the Solr service. Is there something I can do to prevent this issue from occurring again? I suspect it's some Solr configuration that needs adjusting.
I'm kind of new to Solr, so any tips would be appreciated.
Thanks
How busy is the Solr server? If it's not very busy, check whether you have a firewall between your Drupal and Solr servers; some firewalls kill connections when no traffic is going through.
One way to test would be to access the Solr admin interface. If you can, the server itself is fine and only Drupal's connection died.
I am assuming the Solr client library in Drupal tries to maintain a persistent connection. If that's not the case, the above does not apply.
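One way to run that test from the command line (the ping handler path is standard; "collection1" and the host are placeholders for your core name and server):

```shell
# If this succeeds while Drupal reports "Connection refused", Solr itself is
# fine and only Drupal's (possibly firewalled) persistent connection died.
curl "http://localhost:8983/solr/collection1/admin/ping?wt=json"
```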
I ended up reducing the number of documents to be indexed during cron from 200 to 50. That seemed to resolve the issue, as I have not had any Solr outages over the last couple of weeks.

Rails - searching and configuration of Solr

I use Solr for searching content in my app. What I don't like is that every time I restart my computer I have to start Solr manually, and whenever there is new content in the app I have to reindex it, because otherwise Solr won't find the new data.
This is not very comfortable. What does working with Solr look like on a server, e.g. on Heroku? Do I have to keep starting Solr all the time and reindexing the data over and over again, as I do on my localhost?
Or is there a better solution for searching than Solr?
You are using the included server, right?
You can choose to deploy it in Tomcat: copy your files to Tomcat and register your Solr application in the Tomcat configuration. Tomcat runs as a service. Alternatively, you can use a script to start Jetty on startup.
A professional Solr service tries to keep your Solr application alive and your data safe against any cause, such as crashed software, a failed server, or even a datacenter that went down.
Check what Heroku (or other hosted Solr solutions) promise in their terms. They would do a much better job than an individual (no restarting Solr instances frequently!).
When you add something to Solr, it is persisted to disk; when committed, it becomes available for search. If a document changes, you reindex it to reflect the new changes.
When you restart Solr, the same persisted data is available. What is your exact trouble?
There is the DIH (Data Import Handler) if you want to index automatically from a database.
I'm happy with Solr so far.
As far as starting a Solr instance after restarting your computer goes, you can write a bash script that does it for you, or declare an alias that starts both Solr and your app server.
As for re-indexing: new and updated records should be re-indexed automatically, unless you manipulate your data from the console.
For alternative solutions, check out Thinking Sphinx.
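A minimal sketch of such an alias, assuming the sunspot_rails gem (which bundles a development Solr and provides these rake tasks); the path and app name are placeholders:

```shell
# Start Solr and the Rails app server together:
alias start_app='cd ~/myapp && bundle exec rake sunspot:solr:start && rails server'
# Rebuild the whole index after changes made outside ActiveRecord callbacks:
# bundle exec rake sunspot:reindex
```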
