I'm creating an index with mappings the following way:
_elasticClient.Indices.Create(name, c => c.Map(/* omitted for brevity */));
_elasticClient.BulkAll(data, b => b
.Index(name)
.BackOffTime("30s")
.BackOffRetries(2)
.RefreshOnCompleted()
.MaxDegreeOfParallelism(Environment.ProcessorCount)
.Size(50)).Wait(TimeSpan.FromMinutes(15), _ => { });
_elasticClient.Indices.PutAlias(name, Alias);
I'm just trying to understand how the bulk update works and make sure I do this correctly. Even if I remove _elasticClient.Indices.Create, an index still seems to be created. Does a POST to index_v1/_bulk create the index if it doesn't exist, but update it with data if I create it first with my first line?
By default, if data is indexed into an index that does not yet exist, Elasticsearch will create the index and infer the mapping for documents based on the first document it sees. This can be useful, but typically for a search use case, you want to explicitly control the mapping for documents to apply specific analyzers, etc., so creating the index with the mapping you desire is the preferred approach.
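As a minimal sketch of that preferred approach (the `MyDocument` type and its properties are placeholders standing in for the mapping omitted in the question, not part of the original code):

```csharp
// Hypothetical document type standing in for the omitted mapping
public class MyDocument
{
    public int Id { get; set; }
    public string Title { get; set; }
}

// Create the index with an explicit mapping up front, so field types
// are controlled rather than inferred from the first bulk-indexed document
var createResponse = _elasticClient.Indices.Create(name, c => c
    .Map<MyDocument>(m => m
        .Properties(p => p
            .Keyword(k => k.Name(d => d.Title)))));
```

With the index and mapping in place before `BulkAll` runs, the bulk requests only index data; they never trigger dynamic index creation.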
Can I create an index on more than one property for a given label? I am running a query like:
MATCH (n:Node) WHERE n.id = "xyz" AND n.source = "abc" RETURN n;
and I want to know whether I can create multiple property indexes, and, if not, whether there is a good way to query for nodes matching multiple properties.
Memgraph does not support composite indexes, but you can create multiple label-property indexes. For example, run
CREATE INDEX ON :Node(id);
CREATE INDEX ON :Node(source);
To check that they were created properly, run SHOW INDEX INFO;.
Use EXPLAIN or PROFILE on the query (see the docs on inspecting and profiling queries) to compare the different plans and test performance; a single label-property index may be good enough.
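For example, to see which index the planner actually uses for the query from the question (the property values here are placeholders):

```cypher
// Profile the query with both indexes in place; the plan output shows
// which label-property index (if any) is used to scan :Node
PROFILE MATCH (n:Node)
WHERE n.id = "xyz" AND n.source = "abc"
RETURN n;
```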
Currently I have an Elasticsearch index that rolls over periodically. We have an index mapping applied to a certain index pattern, and we want to update the field type of the index for subsequent indices that get rolled over.
If we change the mapping of a field from a string type to a number for newly rolled-over indices, what happens in the index pattern when it is refreshed?
Would the index pattern show the field as one type or the other?
There is only one version of an index template at any given time. When you update it (i.e. change some mapping type), all the existing indices matching its pattern remain unchanged; only indices created after the change (e.g. by the next rollover) will get the modification (i.e. the new field mapping type).
What you need to be aware of is that you'll end up with old indices containing documents whose field has the old mapping type and new indices where the same field has the new mapping type. Depending on the change you make, some of your queries running across both old and new indices might not work correctly afterwards, so make sure your queries still behave as expected with that mapping change.
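As a sketch, the mapping change for future rolled-over indices would go into the index template that matches your rollover pattern. The template name, index pattern, and field name below are assumptions for illustration, and this uses the composable template API available from Elasticsearch 7.8 (older versions use the legacy `_template` endpoint):

```json
PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "mappings": {
      "properties": {
        "status_code": { "type": "long" }
      }
    }
  }
}
```

Indices created by rollover after this update will map `status_code` as a number; existing indices keep the old string mapping.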
I have been evaluating elasticsearch 5.1.1. My data upload happens via NEST. I have used two different types and different index names while testing. Now that I have a better understanding of the API, I have settled on a type. I deleted all the indices and created a new one.
My documents have their own ID and I have fluent code as follows
config.InferMappingFor<SearchFriendlyIssue>(ib => ib.IdProperty(p => p.Id));
When I upload documents, the API comes back with "Updated". This is strange, since I just created a new index. Worse, my new index contains only one document; what I expected is a Created response. The code to add data follows the API documentation:
var searchObject = new SearchFriendlyIssue(issue);
var response = Client.Index(searchObject, idx => idx.Index(Index));
Console.WriteLine(response.Result.ToString());
I think I am missing something about how types and indices interact. How do I get rid of my unreachable documents? More specifically, how do I get them into my index so they can be deleted or dealt with?
Looks like my assumption that I had unreachable documents was wrong. Instead, the declaration for the ID property wasn't working, and I was overwriting the same document over and over again. My bad!
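One way to catch this early is to check the write result on each response, which reports whether the document was created or overwrote an existing one (a sketch against the code above; the diagnostic message is illustrative):

```csharp
var response = Client.Index(searchObject, idx => idx.Index(Index));

// On a fresh index, the first write of each document should report
// Created; repeated Updated results suggest every document is
// resolving to the same inferred ID
if (response.Result == Result.Updated)
{
    Console.WriteLine($"Overwrote existing document with id {response.Id}");
}
```

Logging `response.Id` also shows at a glance which ID NEST actually inferred for each document.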
I have a document whose structure changes often, how can I index nested documents inside it without changing the mapping on ElasticSearch?
Yes, you can index documents in Elasticsearch without providing a mapping.
However, Elasticsearch decides the type of a field when the first document containing a value for that field arrives. If document 1 has a field called item_code, and in document 1 item_code is a string, Elasticsearch will set the type of the field "item_code" to string. If document 2 has an integer value in item_code, Elasticsearch will already have fixed the type as string.
Basically, field type is index-dependent, not document-dependent.
This is mainly because of Apache Lucene and the way it handles this information.
If you have a case where part of the data structure changes while the rest doesn't, you could use an object type: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-object-type.html.
You can even go as far as setting "enabled": false on it, which makes Elasticsearch just store the data without indexing it. You can no longer search on those fields, but maybe you don't actually need to.
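A sketch of such a mapping, using current typeless mapping syntax (the index and field names are hypothetical): the `payload` object is stored and returned in `_source`, but none of its contents are indexed or searchable, so its internal structure can change freely:

```json
PUT my_index
{
  "mappings": {
    "properties": {
      "item_code": { "type": "keyword" },
      "payload": {
        "type": "object",
        "enabled": false
      }
    }
  }
}
```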
I was able to use a Sphinx RT index successfully, but I have two issues.
The first is: how do I use auto-increment for the ID in an RT index?
The second is: how do I get the text fields back? The documentation says "you should explicitly enumerate all the text fields", and I'm not sure how to do that.
I'm using PHP to query the RT index, and I can see the results except for the text fields. I'm using the same index as in the Sphinx docs:
index rt
{
type = rt
path = /usr/local/sphinx/data/rt
rt_field = title
rt_field = content
rt_attr_uint = gid
}
Sphinx doesn't have auto-increment IDs. You could run a query to find the max id and then add one, but that's not safe if you have multiple clients inserting; there is no index lock.
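A sketch of that workaround in SphinxQL (unsafe under concurrent writers, as noted; the inserted values are placeholders):

```sql
-- find the current highest id
SELECT id FROM rt ORDER BY id DESC LIMIT 1;

-- compute next_id = id + 1 client-side, then insert with it
INSERT INTO rt (id, title, content, gid) VALUES (42, 'My title', 'Body text', 123);
```

With multiple clients, two of them can read the same max id and try to insert the same next id, which is exactly why this isn't safe.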
Fields are not stored in the index, so you CAN'T get them back out. They are tokenized and indexed, but not stored.
The 'enumerate' comment means you need to list all fields in the index definition (unlike disk indexes, which automatically make a column a field if it's not defined as an attribute).
Attributes, on the other hand, ARE stored and can be retrieved. If you want a column to be both searchable and retrievable, you need to insert it twice: once as a field, and again as an attribute.
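For example, extending the index definition from the question with a string attribute that mirrors the field (the `content_stored` name is hypothetical):

```
index rt
{
    type = rt
    path = /usr/local/sphinx/data/rt

    rt_field       = title
    rt_field       = content

    # store the content a second time as a string attribute
    # so it can be retrieved in query results
    rt_attr_string = content_stored
    rt_attr_uint   = gid
}
```

When inserting, write the same text into both `content` and `content_stored`; queries then match against the `content` field, while SELECT results return `content_stored`.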
(Note that Sphinx is not really intended to be a database, but rather just an index to one, so it's designed around the case of mirroring the data.)