Create nested object if not exists in a Logstash pipeline - elasticsearch

In a Logstash pipeline, I'm trying to verify if a nested object exists. If it doesn't, I want to create it and initialize its fields.
Here is my pipeline output:
output {
  elasticsearch {
    hosts => ["${ELASTICSEARCH_HOST:localhost:9200}"]
    user => "${ELASTICSEARCH_USERNAME:elastic}"
    password => "${ELASTICSEARCH_PASSWORD:password}"
    index => "shape_idx"
    action => "update"
    document_id => "%{shapeId}"
    script_type => "inline"
    scripted_upsert => true
    document_type => "shape"
    script => '
      if (ctx._source["realViews"]) {
        ctx._source.realViews.all = 0;
        ctx._source.realViews.desktop = 0;
        ctx._source.realViews.phone = 0;
        ctx._source.realViews.tablet = 0;
      }
      ctx._source.realViews.all = (ctx._source.realViews.all ?: 0) + params.event.deviceCounter.all;
      ctx._source.realViews.desktop = (ctx._source.realViews.desktop ?: 0) + params.event.deviceCounter.desktop;
      ctx._source.realViews.phone = (ctx._source.realViews.phone ?: 0) + params.event.deviceCounter.phone;
      ctx._source.realViews.tablet = (ctx._source.realViews.tablet ?: 0) + params.event.deviceCounter.tablet;
    '
  }
}
Unfortunately, my pipeline fails silently. Any idea?
For the record, I'm using Logstash 7.1.1 and Elasticsearch 5.6.16.

Mutations should happen in a mutate filter: add a check there and add the field when needed within your filter.
Try to remind yourself that you have input / filter / output, and make each part do what it is supposed to do.
Let the input just feed data into the pipeline, maybe adding a field if you really want to (useful if you use multiple inputs in the same pipeline running through the same filter).
Let the filter do whatever manipulation you want, which includes checking whether fields exist and then acting on them, as sketched below.
Let the output just output (if statements for conditional outputs can be part of it, but no further processing).
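For example, a minimal sketch of such a filter for the pipeline above, assuming the incoming events carry the deviceCounter object referenced by the script (the field names come from the pipeline; everything else is illustrative):
filter {
  # Only create the counters if the event does not already have them
  if ![deviceCounter] {
    mutate {
      add_field => {
        "[deviceCounter][all]" => "0"
        "[deviceCounter][desktop]" => "0"
        "[deviceCounter][phone]" => "0"
        "[deviceCounter][tablet]" => "0"
      }
    }
    # add_field creates string values, so convert them in a second mutate
    mutate {
      convert => {
        "[deviceCounter][all]" => "integer"
        "[deviceCounter][desktop]" => "integer"
        "[deviceCounter][phone]" => "integer"
        "[deviceCounter][tablet]" => "integer"
      }
    }
  }
}
The output section then only has to ship the event; any checks on the event's fields stay in the filter stage.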

Related

How to merge elasticsearch results into one field or how to explain which field results were found in

Follow me on this one...
If I've got a DB of movies and I want to search on multiple fields and return the results into a single field, how would I accomplish this?
Let me set up an example...
My documents have a title and artists.name (an array). I want the user to be able to search in both title and artist at the same time, so that the results are in the same field. This would be implemented in an 'autocomplete' search scenario where you get results as you type.
So if a user types 'mike', I want to be able to search for actors (artists.name) with the name mike and titles with the word mike in them. In this case, you might return 'Magic Mike' and 'Mike Myers' in the same autocomplete result set. (imdb.com has this implementation.)
I understand how to search both of those fields, but how do I return them into one? I believe I'd have to have some knowledge of where my 'hit' came from - title or artists.name. So maybe that's the larger question here - how do I tell which field the hit came from?
I don't think there are any direct ways to determine which field(s) a query matched on. I can think of a few "workaround" approaches that may do it for you: one is using the multi search API and executing separate queries on each field; another is using highlighting, which will return the fields that a match was found in.
Example using multi search:
var response = client.MultiSearch(ms => ms
    .Search<Artist>("name", s => s.Query(q => q.Match(m => m.OnField(a => a.Name).Query("mike"))))
    .Search<Artist>("titles", s => s.Query(q => q.Match(m => m.OnField(a => a.Titles).Query("mike")))));

response.GetResponse<Artist>("name");   // <-- Contains search results from matching on Name
response.GetResponse<Artist>("titles"); // <-- Contains search results from matching on Titles
Example using highlighting:
var response = client.Search<Artist>(s => s
    .Query(q => q
        .MultiMatch(m => m
            .OnFields(a => a.Name, a => a.Titles)
            .Query("mike")))
    .Highlight(h => h
        .OnFields(fs => fs.OnField(a => a.Name),
                  fs => fs.OnField(a => a.Titles))));
You can then inspect the Highlights object of each hit, or the Highlights object of the response to determine what field the match came from.
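For instance, assuming the hit-level Highlights collection is keyed by field name as described above (a sketch; the exact shape differs between NEST versions):
foreach (var hit in response.Hits)
{
    // The keys of the highlight collection are the fields that actually matched,
    // e.g. "name", "titles", or both.
    var matchedFields = hit.Highlights.Keys;
}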
There is also the explain API, and you can add explain to your query, but that will return a lot of irrelevant scoring info which you would have to parse through. Probably too cumbersome for your needs.
As a side note: for autocomplete functionality, if possible I would really try to leverage the completion suggester instead of the above solutions. These are pre-computed suggestions, built up as FSTs when you index your documents; this will increase your indexing time as well as your index size, but in return provides extremely fast suggestions.

Elasticsearch Nest not honoring index = not_indexed on POCO field?

I just found out through the documentation that we should be telling it to use the mapping attributes and manually creating the index before we index.
However, the documentation is not consistent with the newest (pre-release) version of the code.
http://nest.azurewebsites.net/nest/indices/put-mapping.html
var response = this.ConnectedClient.Map<ElasticSearchProject>();
In the new code, the Map() call above takes one argument, while the documentation doesn't pass any arguments.
What should be contained in that method? It seems like there are many options, but I'm unclear on which ones to use.
Have a look at the Create Indices documentation. I think something like this will work for what you are trying to accomplish. Plus it will create the index and apply the mapping all in one call to your Elasticsearch instance.
client.CreateIndex("myindexname", c => c
    .NumberOfReplicas(0)
    .NumberOfShards(1)
    .Settings(s => s
        .Add("merge.policy.merge_factor", "10")
        .Add("search.slowlog.threshold.fetch.warn", "1s")
    )
    .AddMapping<ElasticSearchProject>(m => m.MapFromAttributes())
    .AddMapping<Person>(m => m.MapFromAttributes())
);
The .AddMapping<ElasticSearchProject>(m => m.MapFromAttributes()) line tells NEST to grab all of the attribute settings (ElasticType and ElasticProperty) on the ElasticSearchProject class and use those to create the mapping.
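For reference, a rough sketch of the kind of attribute-decorated POCO that MapFromAttributes() reads (the property names here are illustrative, and exact attribute and enum names differ between NEST versions):
using Nest;

[ElasticType(Name = "elasticsearchprojects")]
public class ElasticSearchProject
{
    public int Id { get; set; }

    // Plain string property: analyzed by default
    public string Name { get; set; }

    // Mapped as not_analyzed so it is only usable for exact matches
    [ElasticProperty(Index = FieldIndexOption.NotAnalyzed)]
    public string Country { get; set; }
}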

Cannot index document

I have written some code using the Elasticsearch.Net & NEST client library that should index a document without using a POCO for mapping fields, as I have many different documents.
Question 1) Is this the correct way to create an index, and does .AddMapping<string>(mapping => mapping.Dynamic(true)) create the mapping based on the document passed in?
var newIndex = client.CreateIndex(indexName, index => index
    .NumberOfReplicas(replicas)
    .NumberOfShards(shards)
    .Settings(settings => settings
        .Add("merge.policy.merge_factor", "10")
        .Add("search.slowlog.threshold.fetch.warn", "1s")
    )
    .AddMapping<string>(mapping => mapping.Dynamic(true))
);
Question 2) Is this possible?
string document = "{\"name\": \"Mike\"}";
var newIndex = client.Index(document, indexSelector => indexSelector
    .Index(indexName)
);
When I run the code in "Question 2" it returns:
{"Unable to perform request: 'POST ' on any of the nodes after retrying 0 times."}
NEST only deals with typed objects; in this case, passing a string will cause it to index the document into /{indexName}/string/{id}.
Since it can't infer an id from a string and you do not pass one, it will fail on that, or on the fact that it can't serialize a plain string. I'll update the client to throw a better exception in this case.
If you want to index a document as a string, use the exposed Elasticsearch.Net client like so:
client.Raw.Index(indexName, typeName, id, stringJson);
If you want Elasticsearch to come up with an id, you can use:
client.Raw.Index(indexName, type, stringJson);
client is the NEST client, and the Raw property is an Elasticsearch.Net client with the same connection settings.
Please note that I might rename Raw to LowLevel in the next beta update; still debating that.
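Putting that together with the document string from the question, it would look roughly like this (the type name and id are illustrative):
string document = "{\"name\": \"Mike\"}";

// Let Elasticsearch generate the id
var withGeneratedId = client.Raw.Index(indexName, "mydocument", document);

// Or supply the id yourself
var withExplicitId = client.Raw.Index(indexName, "mydocument", "1", document);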

Rails 4 and Mongoid: programmatically build query to search for different conditions on the same field

I'm building an advanced search feature and, thanks to the help of some Ruby fellows on SO, I've already been able to combine AND and OR conditions programmatically on different fields of the same class.
I ended up writing something similar to the accepted answer mentioned above, which I report here:
query = criteria.each_with_object({}) do |(field, values), query|
  field = field.in if values.is_a?(Array)
  query[field] = values
end
MyClass.where(query)
Now, what might happen is that someone wants to search on a certain field with multiple criteria, something like:
"all the users where names contains 'abc' but not contains 'def'"
How would you write the query above?
Please note that I already have the regexes to do what I want (see below); my question is mainly about how to combine them.
# contains
Regexp.new('.*' + val + '.*')
# not contains
Regexp.new('^((?!' + val + ').)*$')
Thanks for your time!
* UPDATE *
I was playing with the console and this is working:
MyClass.where(name: /.*abc.*/).and(name: /^((?!def).)*$/)
My question remains: how do I do that programmatically? I shouldn't end up with more than two conditions on the same field but it's something I can't be sure of.
You could use an :$and operator to combine the individual queries:
MyClass.where(:$and => [
  { name: /.*abc.*/ },
  { name: /^((?!def).)*$/ }
])
That would change the overall query builder to something like this:
components = criteria.map do |field, value|
  field = field.in if value.is_a?(Array)
  { field => value }
end
query = components.length > 1 ? { :$and => components } : components.first
You build a list of the individual components and then, at the end, either combine them with :$and or, if there aren't enough components for :$and, just unwrap the single component and call that your query.
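For example, feeding that builder two conditions on the same field (using the regexes from the question) would come out as follows; this assumes criteria is an array of field/value pairs rather than a hash, so the same field can appear more than once:
criteria = [
  [:name, /.*abc.*/],       # contains "abc"
  [:name, /^((?!def).)*$/]  # does not contain "def"
]

components = criteria.map do |field, value|
  field = field.in if value.is_a?(Array)
  { field => value }
end

query = components.length > 1 ? { :$and => components } : components.first
# => { :$and => [ { :name => /.*abc.*/ }, { :name => /^((?!def).)*$/ } ] }

MyClass.where(query)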

Mongodb update document through ruby not working

I want to update documents in MongoDB via Ruby code. I have the document ids of the documents, and I want to update only a specific field. I tried the following code.
collection.update({"_id".to_s => doc_id},{"$set"=> {"selected" => "false"}})
and also
collection.update({"_id".to_s => doc_id},{"selected" => "false"})
Both commands execute without any error but the database remains unaffected.
According to the documentation, the ways to update a document are:
collection.update({"_id" => id}, doc)
or
collection.update({"_id" => id}, {"$set" => {"name" => "MongoDB Ruby"}})
The ID is expected to be a valid id. I'm not sure whether a string is accepted; just in case, make sure to convert it to a BSON::ObjectId.
collection.update({"_id" => BSON::ObjectId.from_string(doc_id) }, {"$set" => {"selected" => "false" }})
Make sure to check that the command returns true.
Also note that if you are using a driver version < 1.8, you should be using :safe => true.
By default, the driver sends a getlasterror command after every write to ensure that the write succeeded. Prior to version 1.8 of the driver, writes were not acknowledged by default, and it was necessary to explicitly specify the option ':safe => true' for write confirmation. This is no longer the case.
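With a pre-1.8 driver that would look something like this (a sketch reusing the update from above):
# Pre-1.8 mongo Ruby driver: ask for write acknowledgement explicitly
collection.update(
  { "_id" => BSON::ObjectId.from_string(doc_id) },
  { "$set" => { "selected" => "false" } },
  :safe => true
)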