Elasticsearch - Create a Query That Shows Different Properties Between Two Indexes

I am trying to create an Elasticsearch query that shows the properties that two indexes do not have in common. Say the first index is:
{
  "myFirstIndex" : {
    "mappings" : {
      "properties" : {
        "CAT" : {
          "type" : "keyword",
          "ignore_above" : 256
        },
        "DATE_OF_BIRTH" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "ID" : {
          "type" : "keyword",
          "ignore_above" : 256
        },
        "NAME" : {
          "type" : "text"
        },
        "timestamp" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        }
      }
    }
  }
}
and the second one is:
{
  "mySecondIndex" : {
    "mappings" : {
      "properties" : {
        "CAT" : {
          "type" : "keyword",
          "ignore_above" : 256
        },
        "DATE_OF_BIRTH" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "ID" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }
  }
}
I have never run a query across indexes, so I am not sure how to do this. I don't care much about whether the properties are nested. For my purposes, finding the non-common properties at the top level is sufficient.
Grateful for any assistance. Thank you

Based on your clarifications, you can't do that natively in Elasticsearch.
You'd need to fetch the mappings from some code and then compare the two indices in that code.
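The client-side comparison could look roughly like the Python sketch below. It assumes the two mappings have already been fetched (for example with GET myFirstIndex/_mapping) and parsed into dictionaries, and that they use the typeless layout shown in the question. The function names are hypothetical, and the sample mappings are abbreviated copies of the ones above.

```python
def top_level_properties(mapping_response: dict) -> set:
    """Collect top-level property names from a GET <index>/_mapping response."""
    names = set()
    # The response is keyed by index name; each value holds a "mappings" object.
    for index_body in mapping_response.values():
        properties = index_body.get("mappings", {}).get("properties", {})
        names.update(properties.keys())
    return names


def diff_top_level_properties(first: dict, second: dict) -> dict:
    """Return the property names that appear in only one of the two mappings."""
    a = top_level_properties(first)
    b = top_level_properties(second)
    return {"only_in_first": sorted(a - b), "only_in_second": sorted(b - a)}


# Abbreviated versions of the mappings from the question (assumed already fetched).
first_mapping = {"myFirstIndex": {"mappings": {"properties": {
    "CAT": {"type": "keyword"},
    "DATE_OF_BIRTH": {"type": "date"},
    "ID": {"type": "keyword"},
    "NAME": {"type": "text"},
    "timestamp": {"type": "date"},
}}}}
second_mapping = {"mySecondIndex": {"mappings": {"properties": {
    "CAT": {"type": "keyword"},
    "DATE_OF_BIRTH": {"type": "date"},
    "ID": {"type": "keyword"},
}}}}

result = diff_top_level_properties(first_mapping, second_mapping)
print(result)
```

For the two indexes in the question this reports NAME and timestamp as present only in the first index. Nested properties are deliberately ignored, which matches the "base level" requirement.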

Related

Unable to get data in a scripted field in kibana/elastic

Kibana Version : 7.4.2
My Index: cls-docker-logs*
Existing mapping in index cls-docker-logs-*
GET cls-docker-logs-*/_mapping
"stack_trace" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
Updated the mapping to:
"stack_trace" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 2560000
    }
  }
}
I created a scripted field in which I basically want to read the value and perform some string manipulation on it:
return doc['stack_trace.keyword'].value;
I don't see any value after I run this.
What am I doing wrong here?

Elastic: How to correct the auto generated mapping?

I'm using Elastic Cloud hosted in Azure, with NEST as the client. Part of the auto-generated mapping needs to change from
"Bonus" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
to
"Bonus" : {
  "properties" : {
    "Amount" : {
      "properties" : {
        "Value" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "PayrollSyncDateTime" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }
  }
},
When I tried to do it, I got an illegal_argument_exception error with the message "can't merge a non object mapping [activityData.Bonus] with an object mapping". How can I correct the auto-generated mapping?

ES index mapping has "analyzer" parameter

One of my indices has this mapping (a lot of unrelated content is omitted for simplicity):
{
  "properties" : {
    "analyzer" : {
      "type" : "keyword",
      "ignore_above" : 2048
    },
    "query" : {
      "properties" : {
        "bool" : {
          "properties" : {
            "filter" : {
              "properties" : {
                "range" : {
                  "properties" : {
                    "publish_time" : {
                      "properties" : {
                        "gte" : {
                          "type" : "date",
                          "format" : "yyyy-MM-dd"
                        },
                        "lte" : {
                          "type" : "date",
                          "format" : "yyyy-MM-dd"
                        }
                      }
                    }
                  }
                }
              }
            },
            "must" : {
              "properties" : {
                "match" : {
                  "properties" : {
                    "content" : {
                      "type" : "keyword",
                      "ignore_above" : 2048
                    }
                  }
                },
                "term" : {
                  "properties" : {
                    "topic_id" : {
                      "type" : "keyword",
                      "ignore_above" : 2048
                    }
                  }
                }
              }
            }
          }
        },
        "match" : {
          "properties" : {
            "doc_id" : {
              "type" : "keyword",
              "ignore_above" : 2048
            }
          }
        },
        "match_all" : {
          "type" : "object"
        }
      }
    }
  }
}
I don't know why "properties" has a field named "analyzer". I have read the official docs and searched Google for quite a while, but found almost nothing that helps.
Actually, I don't know why there is a "query" field either, but fortunately I found the answer in the post "ES index mapping has "query" parameter".
I guess there may be three possible explanations:
"analyzer" parameter is indeed the keyword that specifies the analyzer used when indexing all fields of the current index (I guess this has a medium possibility);
"analyzer" parameter is the result of incorrect use of some command, which is also why there is a "query" parameter (I guess this has a medium possibility);
"analyzer" parameter is a user-defined field (I guess this has a lower possibility).
So, could someone help me with this issue please?

ElasticSearch Tokenizer keywords

I'm wondering how Elasticsearch tokenizes keywords.
Example:
So I'm using a search box for searching keywords in comments.
When I search for "Zelle", only comments in Spanish show up.
But if I search for "Zell", all comments containing "Zelle" show up, with "Zell" highlighted.
Can anyone please tell me why, when I search for some keywords, only comments in specific languages show up?
Edit1:
The mapping is like this:
{
  "comments" : {
    "mappings" : {
      "ios" : {
        "properties" : {
          "content" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "country" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "date" : {
            "type" : "date"
          },
          "language" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "product_id" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "product_version" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "rating" : {
            "type" : "long"
          },
          "title" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "user_language" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
and it does not contain any information about the tokenizer.
How can I find out which tokenizer ES uses for searching?
I recommend you read the Mapping chapter of the official book; it will help you a lot.
To answer your question, we need to know the mapping of your documents, specifically the mapping of the field you search in.
By the look of it, you are not using the default analyzer (called "standard"), because "Zell" would not match "Zelle" with it.
In Elasticsearch, an analyzer tokenizes your content the way you want. Some analyzer appears to be set up in your mapping, because "Zelle" and "Zell" are matching.
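To illustrate why whole-token matching would not let "Zell" find "Zelle", here is a rough Python sketch. It only approximates the standard analyzer by lowercasing and splitting on non-word characters (the real analyzer uses Unicode text segmentation), and the function names and the sample comment are made up for illustration.

```python
import re


def approx_standard_tokens(text: str) -> list:
    """Lowercase the text and split it on runs of non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def matches(query: str, document: str) -> bool:
    """Match-style check: true if any query token equals a document token."""
    doc_tokens = set(approx_standard_tokens(document))
    return any(token in doc_tokens for token in approx_standard_tokens(query))


comment = "I sent the money with Zelle yesterday."  # hypothetical comment text
zelle_hit = matches("Zelle", comment)  # whole token "zelle" is present
zell_hit = matches("Zell", comment)    # "zell" is only a prefix of a token
print(zelle_hit, zell_hit)
```

Under this approximation "Zelle" matches and "Zell" does not, so a hit on "Zell" suggests that something other than plain whole-token analysis is involved (for example a stemmer or an n-gram filter). To see the tokens Elasticsearch actually produces for a field, the _analyze API can be called with the field name and some sample text.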

Is there a way to Search through Elastic Search to get all results that have an ID contained in an array of IDs?

I've been trying to find a way to do this for a couple of days now. I've looked through the 'bool', 'constant_score', and 'filtered' queries, none of which seems to produce the result I want.
One that HAS come close is the 'ids' query (it does exactly what I described in the title of this question). The one problem is that the key I'm trying to search on is not the '_id' value of the Elasticsearch index. Instead it is 'posterId' in the document below:
"_index": "activity",
"_type": "activity",
"_id": "<unique string id>",
"_score": null,
"_source": {
  ...
  misc keys
  ...
  "posterId": "<QUERY BASED ON THIS VALUE>",
  "time": 20171007173623
}
Query that returns based on the _id value:
ids : {
  type : "activity",
  values : ["<unique string id>", ...]
}
as seen here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
How I want my query to work:
posterId : {
  type : "activity",
  values : [<list of posterIds>]
}
Returning all documents that have posterIds contained in "<list of posterIds>"
Edit: I'm trying to do this in one query as opposed to looping through each member of my list of posterIds, because I also need to sort based on the time key and be able to page the query.
So, does anyone know of a built-in query that does this, or a workaround?
Side note: if you feel like you're about to downvote this please just comment why, I'm about to be banned and I've read through all the guidelines and I feel like I'm following them but my questions rarely perform well. :( It would be much appreciated
Edit:
{
  "activity" : {
    "aliases" : { },
    "mappings" : {
      "activity" : {
        "properties" : {
          "-Kvp7f3epvW_dXSONzKj" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "actionId" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "actionType" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "activityType" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "attachedId" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "attachedType" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "cardType" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "noteTitleDict" : {
            "properties" : {
              "noun" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "subject" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "verb" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          },
          "posterId" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "segueType" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "time" : {
            "type" : "long"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1507678305995",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "<id>",
        "version" : {
          "created" : "5010199"
        },
        "provided_name" : "activity"
      }
    }
  }
}
I think what you are looking for is a Terms Query:
{
  "query": {
    "constant_score" : {
      "filter" : {
        "terms" : { "user" : ["kimchy", "elasticsearch"]}
      }
    }
  }
}
This finds documents whose user field contains the exact term kimchy or elasticsearch. You can read more about this here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
In your case you need to replace:
user with posterId.keyword
kimchy and elasticsearch with all your posterIds
Keep in mind that a terms query is case sensitive and the keyword field does not lowercase its values (they are indexed in the same case they were received).
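Putting the replacement together with the sorting and paging requirement from the question, the request body could be built roughly as in the Python sketch below. The function name build_poster_query, the sample IDs, the page size, and the descending sort direction are assumptions for illustration; "sort", "from", and "size" are the standard search body keys.

```python
import json


def build_poster_query(poster_ids, page=0, page_size=20):
    """Build a search body matching documents whose posterId is in poster_ids."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    # Exact, case-sensitive match against the keyword sub-field.
                    "terms": {"posterId.keyword": list(poster_ids)}
                }
            }
        },
        # Sort direction is an assumption; the question only says "sort by time".
        "sort": [{"time": {"order": "desc"}}],
        # Offset-based paging: skip the documents of the earlier pages.
        "from": page * page_size,
        "size": page_size,
    }


body = build_poster_query(["poster-1", "poster-2"], page=2, page_size=10)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request against the activity index, so filtering, sorting, and paging all happen in a single query.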
