Why is Solr logging an error when adding documents? (spring-boot)

I am getting an error when adding a document. My process deletes the document by id and then adds the whole document again.
I am using Spring Boot with Solr for these operations.
Most of the time the document is added, but sometimes I see it is missing.
In the Solr 9 logs I see this error, even though I don't have a groups value in my document at all:
Unknown operation for the an atomic update: GROUPS
schema.xml
<field name="groups" type="string" indexed="true" stored="true" multiValued="true"/>
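A likely cause (an assumption, since the indexing code isn't shown): Solr treats a field whose value arrives as a nested map as an atomic update, and interprets the map's keys as operations (set, add, inc, ...). Any key that isn't a known operation produces exactly this "Unknown operation" error, so a document builder that accidentally wraps the groups value in a map keyed by the field name would trigger it. A minimal sketch of that decision logic:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AtomicUpdateCheck {
    // Operations Solr accepts inside a map-valued field, per the atomic update docs.
    static final Set<String> ATOMIC_OPS =
            Set.of("set", "add", "add-distinct", "inc", "remove", "removeregex");

    // Mimics Solr's decision: a Map value is parsed as atomic-update operations,
    // and any key that is not a known operation raises the "Unknown operation" error.
    static String checkFieldValue(Object value) {
        if (!(value instanceof Map)) {
            return "plain value: indexed normally";
        }
        for (Object op : ((Map<?, ?>) value).keySet()) {
            if (!ATOMIC_OPS.contains(String.valueOf(op).toLowerCase())) {
                return "Unknown operation for the an atomic update: "
                        + String.valueOf(op).toUpperCase();
            }
        }
        return "atomic update: applied";
    }

    public static void main(String[] args) {
        // A plain multiValued field value is indexed normally...
        System.out.println(checkFieldValue(List.of("a", "b")));
        // ...but a nested map like {"GROUPS": [...]} is parsed as an atomic update.
        System.out.println(checkFieldValue(Map.of("GROUPS", List.of("a", "b"))));
    }
}
```

If this is the cause, the fix is to send the field as a plain list of strings rather than a nested structure.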

Related

BigQuery to Elasticsearch (avoid adding duplicate documents to Elasticsearch)

I am trying to sync data between BigQuery and Elasticsearch using the job template provided in GCP. The issue is that BigQuery sends all the documents every time the job is run; since Elasticsearch uses the document id as _id, this creates duplicate documents.
Is there a way to configure the _id field while sending data from BigQuery to Elasticsearch?
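One common workaround, independent of whatever id options the template itself exposes: derive _id deterministically from the row's business key, so repeated job runs overwrite the same document instead of duplicating it. A sketch (the key column values passed in are hypothetical):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StableId {
    // Derives a deterministic Elasticsearch _id from the row's business key,
    // so re-running the BigQuery export overwrites instead of duplicating.
    static String stableId(String... keyColumns) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (String col : keyColumns) {
                sha.update(col.getBytes(StandardCharsets.UTF_8));
                sha.update((byte) 0); // separator so ("ab","c") differs from ("a","bc")
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        // The same row always maps to the same _id across job runs.
        System.out.println(stableId("customer-42", "order-2024-001"));
    }
}
```

With a stable _id, each export run becomes an upsert rather than an append.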

Error occurred when adding a new field in Elasticsearch

I am using Elasticsearch as a database. Due to changing requirements, I added a new field in ES. After adding it, my NodeJS application threw an error on the line where I added sortable.
My expectation is to update the index mappings for all old documents when I add a new field in ES.
Thank you very much for helping me!
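Some context on what ES does here (the index and field names below are hypothetical): adding a field to a mapping does not rewrite old documents; existing documents simply lack the field until they are updated or reindexed, which can break code that assumes the field exists. The mapping itself can be extended in place, e.g. in Kibana console syntax:

```
PUT /my-index/_mapping
{
  "properties": {
    "sortable": { "type": "keyword" }
  }
}
```

To backfill the field into old documents, they still have to be reindexed or touched, for example with _update_by_query.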

Elasticsearch configuration using NLog

I'm using NLog to write logs to Elasticsearch, which works just fine. The problem is that aggregations don't work because fielddata on the property I try to aggregate is disabled by default. The error I get reads as follows:
illegal_argument_exception Reason: "Fielddata is disabled on text
fields by default. Set fielddata=true on [level] in order to load
fielddata in memory by uninverting the inverted index"
Since the index is created by NLog, I would like it to map certain properties in a way that allows them to be aggregated later. Is it possible to configure NLog so that the error goes away and aggregations start working?
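Rather than setting fielddata=true on the text field (which loads the field into heap memory), the usual approach is to map the property as a keyword so aggregations run on doc values. Since NLog creates the index, an index template can apply that mapping automatically; a sketch, where the template name and index pattern are assumptions that must match your NLog target configuration:

```
PUT _template/nlog-template
{
  "index_patterns": ["logstash-*"],
  "mappings": {
    "properties": {
      "level": { "type": "keyword" }
    }
  }
}
```

After the template is in place, the mapping only applies to newly created indexes, so the fix takes effect with the next index rollover (or after reindexing).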

Why do Elasticsearch scroll or search sometimes return a set of doc ids which cannot be individually retrieved?

I am seeing a strange problem where the Elasticsearch scroll or search API returns a set of documents that I can no longer get by their ids. I am using Elassandra (Cassandra + ES), which uses Elasticsearch as a secondary index store. There are TTLs on the Cassandra records, which are dropped when the TTL expires, but the ids are still there in Elasticsearch. Why this strange behaviour? I did a refresh and forcemerge of the corresponding index on Elasticsearch, but it didn't help.
Okay, I found the problem. The TTL on Cassandra deletes the record in Cassandra, but the custom secondary index Elassandra builds on Elasticsearch doesn't get deleted by that mechanism. In fact, TTL is no longer available in higher versions of ES. The documents need to be deleted explicitly from ES, or we need time-partitioned indexes in ES so that old indexes can simply be deleted.
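The time-partitioned approach from the answer above can be sketched as follows (the base name and monthly granularity are assumptions; pick whatever period matches the TTL):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class PartitionedIndex {
    // Routes each document to a monthly index, e.g. "events-2024-05", so that
    // expired data is removed by deleting whole indexes instead of per-doc TTL.
    static String indexFor(String base, LocalDate day) {
        return base + "-" + day.format(DateTimeFormatter.ofPattern("yyyy-MM"));
    }

    public static void main(String[] args) {
        System.out.println(indexFor("events", LocalDate.of(2024, 5, 17)));
    }
}
```

Deleting an entire index is a cheap metadata operation in Elasticsearch, whereas deleting documents one by one leaves tombstones until segments merge.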

Relevancy boosting very slow in Solr

I have a Solr index with about 2.5M items in it, and I am trying to use an ExternalFileField to boost relevancy. Unfortunately, it's VERY slow when I try to do this, despite the machine being beefy and Solr having lots of memory available.
In the external file I have contents like:
747501=3.8294805903e-07
747500=3.8294805903e-07
1718770=4.03292174724e-07
1534562=3.8294805903e-07
1956010=3.8294805903e-07
747509=3.8294805903e-07
747508=3.8294805903e-07
1718772=3.8294805903e-07
1391385=3.8294805903e-07
2089652=3.8294805903e-07
1948271=3.8294805903e-07
108368=3.84404072186e-06
Each line is a document ID and its corresponding boosting factor.
In my query I'm using edismax, with the boost parameter set to pagerank. The entire query is here.
In my schema I have:
<!-- External File Field Type-->
<fieldType name="pagerank"
keyField="id"
stored="false"
indexed="true"
omitNorms="false"
class="solr.ExternalFileField"
valType="float"/>
and
<field name="pagerank"
type="pagerank"
indexed="true"
stored="true"
omitNorms="false"/>
But the performance is just plain bad. Am I missing a setting or something?
According to the javadoc
The external file may be sorted or unsorted by the key field, but it
will be substantially slower (untested) if it isn't sorted.
And as I see it, the ids in your file are unsorted. Can you sort them and test whether that helps?
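Building on the javadoc note above, a quick sketch of sorting the external file's lines numerically by key (assuming the ids are numeric, as in the sample above):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortExternalFile {
    // Sorts external-file-field lines ("docid=boost") numerically by docid,
    // which the ExternalFileField javadoc says loads substantially faster.
    static List<String> sortByKey(List<String> lines) {
        return lines.stream()
                .sorted(Comparator.comparingLong(
                        (String line) -> Long.parseLong(line.substring(0, line.indexOf('=')))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "747501=3.8294805903e-07",
                "108368=3.84404072186e-06",
                "1718770=4.03292174724e-07");
        sortByKey(lines).forEach(System.out::println);
    }
}
```

On Linux, a one-liner like `sort -t'=' -k1,1n external_pagerank` should do the same for the on-disk file.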
