ElasticSearch Updating Document - elasticsearch

This is a general question about how updating documents in Elasticsearch works.
When sending a document to Elasticsearch for indexing, do I have to include all fields for that document or just the ones I want to update?
If a document with the same document ID already exists, will the new request overwrite all data in that document, or only update the fields listed in the request? Thanks!

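From the docs: "The update API also supports passing a partial document, which is merged into the existing document."
So indexing a document under an existing ID replaces the whole document. To change only some fields, send a partial document through the update API.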
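To make the difference concrete, here is a minimal in-memory sketch in Python. It does not talk to Elasticsearch. `index_document`, `merge` and `partial_update` are hypothetical helpers that only model the two behaviours, assuming that indexing under an existing ID replaces the stored document and that a partial update is merged recursively (nested objects merged, other values overwritten).

```python
import copy

def index_document(store, doc_id, doc):
    """Model of indexing: the stored document is replaced as a whole."""
    store[doc_id] = copy.deepcopy(doc)

def merge(existing, partial):
    """Recursively merge `partial` into a copy of `existing`."""
    merged = copy.deepcopy(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)  # merge nested objects
        else:
            merged[key] = copy.deepcopy(value)       # overwrite everything else
    return merged

def partial_update(store, doc_id, partial):
    """Model of a partial update: fails if the document does not exist."""
    if doc_id not in store:
        raise KeyError(doc_id)
    store[doc_id] = merge(store[doc_id], partial)

store = {}
index_document(store, "1", {"name": "Ann", "age": 30,
                            "address": {"city": "Oslo", "zip": "0150"}})

# Partial update: only the listed fields change.
partial_update(store, "1", {"age": 31, "address": {"city": "Bergen"}})
after_update = copy.deepcopy(store["1"])

# Indexing again under the same ID: fields that are not supplied are lost.
index_document(store, "1", {"age": 32})
after_reindex = store["1"]

print(after_update)
print(after_reindex)
```

In this model the partial update keeps `name` and `address.zip`, while the second `index_document` call leaves only `age`.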

Related

Updating fields of a Couchbase document if it exists by Go

I am using the gocb library. I want to update a specific field of a document.
However, if the document does not exist, I don't want to do anything other than produce an error message.
I know I could first retrieve the full document, update it, and then write it back. But I would rather use a ready-made method for this, if there is one, since I don't want to retrieve the document just to update some of its fields.
Is there a way to do this in the gocb library?
If you want to update parts of the document, you can look at sub-document operations. They transmit only the accessed sections of the document over the network, which makes them more efficient for small changes.
Example: https://couchbase.live/examples/basic-go-subdoc-mutate
If you want to rewrite the entire document, you are looking for Replace() which replaces an existing document with a new one. It is similar to Upsert() except that it can only replace existing documents & not create new ones.
General reference: https://docs.couchbase.com/server/current/guides/updating-data.html

Elasticsearch: How to add "created_at" and "updated_at" timestamps?

I have a database where I should store created_at and updated_at fields for each document.
The created_at field should be created once on first document insert.
The updated_at field should be created on first document insert and should be updated via Bulk API on each update, even if none of the document fields are changed.
The question is: how to add those timestamps?
I believe that Elasticsearch used to have a feature to add these automatically, but it was removed in later versions to improve performance. You would have to add these fields to your mapping and then implement a process to set those fields. One way to do this is with an ingest pipeline, which someone explains in the ES forum. You will want to check the docs for how to implement pipelines with your particular version of Elasticsearch.
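As an alternative to a pipeline, one possible approach is to set the timestamps on the client and send Bulk API `update` actions that carry both `doc` and `upsert`: the `upsert` body is used when the document does not exist yet, and `doc` is merged into the document otherwise. The sketch below only builds the newline-delimited request body and sends nothing. The helper names, the `products` index, and the sample field are assumptions for illustration.

```python
import json
from datetime import datetime, timezone

def bulk_update_lines(index, doc_id, fields, now=None):
    """Return the action line and body line for one bulk `update` action."""
    now = now or datetime.now(timezone.utc).isoformat()
    action = {"update": {"_index": index, "_id": doc_id}}
    body = {
        # Merged into the document when it already exists:
        # created_at is left untouched, updated_at is refreshed.
        "doc": {**fields, "updated_at": now},
        # Used as the initial document when it does not exist yet.
        "upsert": {**fields, "created_at": now, "updated_at": now},
    }
    return [json.dumps(action), json.dumps(body)]

def bulk_body(items, index="products", now=None):
    """Build the NDJSON body for a bulk request; it must end with a newline."""
    lines = []
    for doc_id, fields in items:
        lines.extend(bulk_update_lines(index, doc_id, fields, now))
    return "\n".join(lines) + "\n"

payload = bulk_body([("1", {"title": "Lamp"})], now="2024-01-01T00:00:00+00:00")
print(payload)
```

Because `updated_at` changes on every call, the update should not be treated as a no-op even when the other fields are unchanged. Check this against the docs for your Elasticsearch version.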
Suggestion: You should always check the forums for Elasticsearch questions. The community seems to be more active on there, and devs will also often respond to questions.

What's the difference between indexing and updating a document in Elasticsearch?

As we know, when we update an existing document, Elasticsearch reindexes the document and marks the previous version as deleted. But the RESTful API call is the same in both cases. So I guess Elasticsearch checks whether a document with that unique ID already exists and then either updates or indexes it.
So my question is: we don't need to care about the difference between index and update, because both the RESTful API and the Java client PUT to the same endpoint. Am I right?
The main difference between PUT and POST for documents in Elasticsearch:
POST without an ID creates a new document with a new, auto-generated unique ID.
PUT with an ID replaces the document stored under that ID (or creates it if it does not exist) and leaves the ID unchanged.
So if the ID matters in your context, use PUT to update the document and keep that ID.
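The request shapes can be laid out side by side. This is a minimal sketch that assumes the typeless endpoints of Elasticsearch 7 and later. `request_for` is a hypothetical helper that only returns the HTTP method and path and sends nothing. Note that a partial update uses its own `_update` endpoint rather than the index endpoint.

```python
def request_for(operation, index, doc_id=None):
    """Return (method, path) for a document operation (illustrative only)."""
    if operation == "index_auto_id":
        # Create a new document; Elasticsearch generates the ID.
        return ("POST", f"/{index}/_doc")
    if doc_id is None:
        raise ValueError("doc_id is required for this operation")
    if operation == "index":
        # Create the document, or replace the whole document under this ID.
        return ("PUT", f"/{index}/_doc/{doc_id}")
    if operation == "partial_update":
        # Merge a partial document into the existing one.
        return ("POST", f"/{index}/_update/{doc_id}")
    raise ValueError(f"unknown operation: {operation}")

for op, doc_id in [("index_auto_id", None), ("index", "1"), ("partial_update", "1")]:
    print(op, request_for(op, "products", doc_id))
```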

GSA feed - adding multiple documents to a single record

I am working on a product catalog whose search is powered by GSA. Each product is a single entry, but may have many associated documents. As far as I can see, only one content node is allowed in the feed XML. Is there a way to add multiple files to the same record in the feed XML? Any suggestions?
Your best bet, when submitting each record to the feed, is to extract the content of the associated documents and add it as additional metadata before sending the record to the feed.
Alternately, you can submit each record and each document, with some meta-data that references the record it's associated with. When returning search results, you could customize the front-end results to display related content (records or other attached documents).
You can try putting the additional document content into new metadata (1500-character limit) or into the document body.

Multiple routing field in elasticsearch

I am a newbie to Elasticsearch and need a clarification. I understand how routing works, but I have a question.
Can I create a routing value for a document from multiple fields? If so, can I search the data using a single routing value? Can anyone provide an example?
Imagine I have 5 fields: [username,id,age,dept,salary]. Now I need to create a routing value for this document. Can I do so using the username and id fields?
Thanks in advance.
In answer to your question: no, you can't automatically use multiple fields for a routing value when indexing a document. You can choose one and only one field, and that field must contain a single value.
However, you could manually concatenate the username and id field and pass it in the indexing request:
PUT /index/type/id?routing=username_id
{ body }
That said, routing is a feature for more advanced users. It is very useful but does make life more complicated. You say that you're a newbie, so I'd suggest not playing with routing just yet. That can follow when you're running a 50 node cluster.
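The manual concatenation can be sketched on the client side as follows. The helper names and sample values are hypothetical, the index path follows the `/index/type/id` form used above, and nothing is sent to a cluster. The same combined value would then have to be passed as `routing` when searching, so the sketch builds that path too.

```python
from urllib.parse import quote

def routing_value(username, user_id, sep="_"):
    """Combine two fields into one routing value (hypothetical scheme)."""
    return f"{username}{sep}{user_id}"

def index_path(index, doc_type, doc_id, routing):
    """Path for indexing a document with an explicit routing value."""
    return (f"/{quote(index)}/{quote(doc_type)}/{quote(str(doc_id))}"
            f"?routing={quote(routing, safe='')}")

def search_path(index, routing):
    """Path for a search restricted to the same routing value."""
    return f"/{quote(index)}/_search?routing={quote(routing, safe='')}"

routing = routing_value("alice", 42)
put_path = index_path("index", "type", 42, routing)
get_path = search_path("index", routing)
print(put_path)
print(get_path)
```

One design caveat: if usernames can contain the separator, two different username and id pairs could produce the same routing value, so pick a separator that cannot occur in either field.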
