Elasticsearch: automatic parameter propagation in documents

Let's say I have the following documents (containing logs) in Elasticsearch index:
PUT logs/_doc/1
{
"commonId" : "111111",
"comment" : "abc",
"phase" : "start"
}
PUT logs/_doc/2
{
"commonId" : "111111",
"comment" : "cde",
"customerNumber" : "234-333"
}
PUT logs/_doc/3
{
"commonId" : "222222",
"comment" : "efg",
"phase" : "stop"
}
PUT logs/_doc/4
{
"commonId" : "222222",
"comment" : "jkl",
"customerNumber" : "234-555"
}
The attribute common to all logs is commonId.
Problem is:
I want to process the logs so that all logs with the same commonId exchange their missing attributes with each other. So log 1 should gain "customerNumber" : "234-333", and log 2 should gain "phase" : "start". The same goes for logs 3 and 4.
Is it possible to do this with any Elasticsearch query? Generally I'm not interested in any paid X-Pack option.
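Elasticsearch has no built-in (non-X-Pack) query that updates one document from the contents of another, so a common workaround is to do the merge client-side: fetch the logs, group them by commonId, fill in each document's missing attributes from its siblings, and bulk-reindex the results. The grouping/merging step could be sketched like this in plain Python (merge_by_common_id is a hypothetical helper; the fetch and bulk-update parts are omitted):

```python
from collections import defaultdict

def merge_by_common_id(docs):
    """Group docs by commonId and fill each doc's missing attributes
    from its siblings (first sibling value wins on conflicts)."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["commonId"]].append(doc)
    merged = []
    for group in groups.values():
        # Union of all attributes seen anywhere in the group.
        combined = {}
        for doc in group:
            for key, value in doc.items():
                combined.setdefault(key, value)
        for doc in group:
            filled = dict(combined)
            filled.update(doc)  # the doc's own values take precedence
            merged.append(filled)
    return merged
```

Each merged document would then be written back, e.g. with a bulk update.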


How to form index stats API?

ES Version : 7.10.2
I have a requirement to show index statistics, and I came across the Index Stats API, which does fulfill my requirement.
But the issue is that I don't necessarily need all the fields for a particular metric.
For example: curl -XGET "http://localhost:9200/order/_stats/docs"
It shows the response below (omitted for brevity):
"docs" : {
"count" : 7,
"deleted" : 0
}
But I only want the "count" field, not "deleted".
So, in the Index Stats API documentation, I came across a query param:
fields:
(Optional, string) Comma-separated list or wildcard expressions of fields to include in the statistics.
Used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters
As per the above, when I perform curl -XGET "http://localhost:9200/order/_stats/docs?fields=count"
it throws an exception:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "request [/order/_stats/docs] contains unrecognized parameter: [fields]"
}
],
"type" : "illegal_argument_exception",
"reason" : "request [/order/_stats/docs] contains unrecognized parameter: [fields]"
},
"status" : 400
}
Am I understanding the usage of fields correctly?
If yes/no, how can I achieve the above requirement?
Any help is much appreciated :)
You can use the filter_path argument, like:
curl -XGET "http://localhost:9200/order/_stats?filter_path=_all.primaries.docs.count"
This will return you only one field, like:
{
"_all" : {
"primaries" : {
"docs" : {
"count" : 10
}
}
}
}
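filter_path does its filtering server-side, but the same dotted-path extraction is easy to emulate when post-processing a stats response on the client. A minimal sketch (extract_path is a hypothetical helper, not part of any client library):

```python
def extract_path(response, dotted_path):
    """Walk a filter_path-style dotted path into a nested dict and
    return the value at the end of it (raises KeyError if absent)."""
    node = response
    for key in dotted_path.split("."):
        node = node[key]
    return node
```

Note this only handles plain keys; real filter_path also supports wildcards, which the server resolves for you.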

Using `push` to create a field array in laravel

I want to add an array field to MongoDB in Laravel. I used push to create the array field, and I did it the following way:
public function addFieldByUser($id)
{
$product = Product::find($id);
if (empty($product)) {
abort(404);
}
if (!isset($product->image['productC'])) {
$product->push("image.productC", []);
}
}
I have successfully added the productC field in MongoDB, and here is the result:
{
"_id" : ObjectId("6295eb7210b4ec7cb72c9426"),
"name" : "test",
"image" : {
"productA" : [
{
"_id" : "6295eb6c8e88fb54231e66c3",
"link" : "https://example.com",
"image" : "dc1c94e421d4ab6a592bcae33ec97345.jpg",
"externalLink" : true
}
],
"productB" : [
{
"_id" : "6295eb957cb6f9350f0f0de5",
"link" : "https://example.com",
"image" : "1ccd4b1d7601a3beb213eb5b42c5d9bf.jpg",
"externalLink" : true
}
],
"productC" : [
[]
]
}
}
But inside productC there is one more sub-array, while my original intention was to add productC as an empty array.
I would like to have a result like this:
{
"_id" : ObjectId("6295eb7210b4ec7cb72c9426"),
"name" : "test",
"image" : {
"productA" : [
{
"_id" : "6295eb6c8e88fb54231e66c3",
"link" : "https://example.com",
"image" : "dc1c94e421d4ab6a592bcae33ec97345.jpg",
"externalLink" : true
}
],
"productB" : [
{
"_id" : "6295eb957cb6f9350f0f0de5",
"link" : "https://example.com",
"image" : "1ccd4b1d7601a3beb213eb5b42c5d9bf.jpg",
"externalLink" : true
}
],
"productC" : []
}
}
Can anyone please give me suggestions? I'm new to MongoDB, so it's really giving me a hard time. Thank you very much.
push doesn't make sense here: push implies that you already have an array you want to add a value to, but here you just want to initialise an array, so you'll have to use another method.
(Also, notably, if you use push on an Eloquent model it'll also save it to the database.)
You could do something like $product->image['productC'] = []
and then $product->save() if you want to save it.

Firebase Realtime Database: Do I need extra container for my queries?

I am developing an app where I retrieve data from a Firebase Realtime Database.
On the one hand, I have my objects. There will be around 10,000 entries when it is finished. For each property, such as "Blütenfarbe" (flower color), a user can select one (or more) characteristics, and will then get the plants for which these constraints hold. Each property has 2-10 characteristics.
Is querying powerful enough here to get fast results? If not, my thought would be to also set up a container for each characteristic and put the ID of every plant with that characteristic into it.
This is my first project, so any tip for better structure is welcome. I don't want to create this database and realize afterwards that it is not well enough structured.
Thanks for your help :)
{
"Pflanzen" : {
"Objekt" : {
"00001" : {
"Belaubung" : "Sommergrün",
"Blütenfarbe" : "Gelb",
"Blütezeit" : "Februar",
"Breite" : 20,
"Duftend" : "Ja",
"Frosthärte" : "Ja",
"Fruchtschmuck" : "Nein",
"Herbstfärbung" : "Gelb",
"Höhe" : 20,
"Pflanzengruppe" : "Laubgehölze",
"Standort" : "Sonnig",
"Umfang" : 10
},
"00002" : {
"Belaubung" : "Sommergrün",
"Blütenfarbe" : "Gelb",
"Blütezeit" : "März",
"Breite" : 25,
"Duftend" : "Nein",
"Frosthärte" : "Ja",
"Fruchtschmuck" : "Nein",
"Herbstfärbung" : "Ja",
"Höhe" : 10,
"Pflanzengruppe" : "Nadelgehölze",
"Standort" : "Schatten",
"Umfang" : 10
},
"Eigenschaften" : {
"Belaubung" : {
"Sommergrün" : [ "00001", "00002" ],
"Wintergrün" : ["..."]
},
"Blütenfarbe" : {
"Braun": ["00002"],
"Blau" : [ "00001" ]
}
}
}
}
}
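The "container per characteristic" idea boils down to set intersection: within each selected property, take the union of the ID lists of the chosen characteristics, then intersect across properties. A client-side sketch of that lookup against the Eigenschaften structure above (match_plants is a hypothetical helper):

```python
def match_plants(index, selections):
    """index: {property: {characteristic: [plant ids]}}
    selections: {property: [chosen characteristics]}.
    OR within a property, AND across properties."""
    result = None
    for prop, chosen in selections.items():
        ids = set()
        for characteristic in chosen:
            # Union: any of the chosen characteristics matches.
            ids.update(index.get(prop, {}).get(characteristic, []))
        # Intersection: every selected property must match.
        result = ids if result is None else result & ids
    return sorted(result) if result else []
```

Since Realtime Database queries can only order/filter on a single child key, this kind of multi-property filtering is typically done client-side or against such per-characteristic ID containers.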

How to store nested document as String in elastic search

Context:
1) We are building a CDC pipeline (using Kafka & the Connect framework)
2) We are using Debezium for capturing MySQL transaction logs
3) We are using the Elasticsearch connector to add documents to an ES index
Sample change event generated by Debezium:
{
"source" : {
"before" : {
"Id" : 97,
"name" : "Northland",
"code" : "NTL",
"country_id" : 6,
"is_business_mapped" : 0
},
"after" : {
"Id" : 97,
"name" : "Northland",
"code" : "NTL",
"country_id" : 6,
"is_business_mapped" : 1
},
"source" : {
"version" : "0.7.5",
"name" : "__",
"server_id" : 252639387,
"ts_sec" : 1547805940,
"gtid" : null,
"file" : "mysql-bin-changelog.000570",
"pos" : 236,
"row" : 0,
"snapshot" : false,
"thread" : 614,
"db" : "bazaarify",
"table" : "state"
},
"op" : "u",
"ts_ms" : 1547805939683
}
}
What we want:
We want to visualize only 3 columns in Kibana:
1) before - containing the nested JSON as a string
2) after - containing the nested JSON as a string
3) source - containing the nested JSON as a string
I can think of the following possibilities here:
a) converting the nested JSON to a string, or
b) combining the column data in Elasticsearch
I am a newbie to Elasticsearch. Can someone please guide me on how to do that?
I tried defining a custom mapping as well, but it is giving me an exception.
You can always view your document as raw JSON in Kibana.
You don't need to manipulate it before indexing in Elasticsearch.
As this is related to visualization, handle it in Kibana only.
Check this link for a screenshot.
Refer to this to add the columns which you want to see in the results.
I don't fully understand your use case, but if you would like to turn some JSON objects into their string representations, you can use Logstash for that, or even Elasticsearch's ingest capabilities, to convert an object (JSON) to a string.
From the link above, an example:
PUT _ingest/pipeline/my-pipeline-id
{
  "description": "converts the content of the source field to a string",
  "processors": [
    {
      "convert": {
        "field": "source",
        "type": "string"
      }
    }
  ]
}
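If you preprocess the Debezium events yourself (for example in a small Kafka consumer) instead of using an ingest pipeline, the same flattening is one json.dumps per field. A minimal sketch (stringify_nested is a hypothetical helper, not part of any connector):

```python
import json

def stringify_nested(event, fields=("before", "after", "source")):
    """Serialize the given nested objects to JSON strings so each can
    be indexed (and shown in Kibana) as a single keyword/text field."""
    out = dict(event)
    for field in fields:
        if isinstance(out.get(field), dict):
            out[field] = json.dumps(out[field], sort_keys=True)
    return out
```

Scalar fields such as op and ts_ms pass through unchanged.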

elastic search filter by documents count in nested document

I have this schema in elastic search.
[
'ID' : '1233',
Geomtries:[{
'doc1' : 'F1',
'doc2' : 'F2'
},
(optional for some of the documents)
{
'doc2' : 'F1',
'doc3' : 'F2'
}]
]
the Geometries is a nested element.
I want to get all of the documents that have one object inside Geometries.
Tried so far :
"script" : {"script" : "if (Geomtries.size < 2) return true"}
But i get exceptions : no such property GEOMTRIES
If you have the field as type nested in the mapping, the typical doc[fieldkey].values.size() approach does not seem to work. I found the following script to work:
{
"from" : 0,
"size" : <SIZE>,
"query" : {
"filtered" : {
"filter" : {
"script" : {
"script" : "_source.containsKey('Geomtries') && _source['Geomtries'].size() == 1"
}
}
}
}
}
NB: You must use _source instead of doc.
The problem is in the way you access fields in your script; use:
doc['Geomtries'].size()
or
_source.Geomtries.size()
By the way, for performance reasons, I would denormalize and add a GeometryNumber field. You can use the transform mapping to compute the size at index time.
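For reference, the predicate the script expresses (exactly one object inside the nested field) is easy to state client-side, which helps when sanity-checking documents before or after indexing. A small sketch (has_single_geometry is a hypothetical helper, not an Elasticsearch API):

```python
def has_single_geometry(doc, field="Geomtries"):
    """True when the doc has exactly one object inside the nested field."""
    value = doc.get(field)
    return isinstance(value, list) and len(value) == 1
```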
