Elasticsearch: how to search multiple words in a copy_to field?

I am currently learning Elasticsearch and am stuck on the issue described below.
On an existing index (I don't know if that matters) I added this new mapping:
PUT user-index
{
"mappings": {
"properties": {
"common_criteria": { -- new property which aggregates other properties by copy_to
"type": "text"
},
"name": { -- already existed before this mapping
"type": "text",
"copy_to": "common_criteria"
},
"username": { -- already existed before this mapping
"type": "text",
"copy_to": "common_criteria"
},
"phone": { -- already existed before this mapping
"type": "text",
"copy_to": "common_criteria"
},
"country": { -- already existed before this mapping
"type": "text",
"copy_to": "common_criteria"
}
}
}
}
The goal is to search ONE or MORE values only on common_criteria.
Say that we have:
{
"common_criteria": ["John Smith","johny","USA"]
}
What I would like to achieve is exact-match searching on multiple values of common_criteria:
We should get a result if we search with John Smith, with USA + John Smith, with johny + USA, with USA, with johny, or with John Smith + USA + johny (the order of the values does not matter).
If we search with multiple values such as John Smith + Germany or johny + England, we should not get a result.
I am using Spring Data Elastic to build my query:
NativeSearchQueryBuilder nativeSearchQuery = new NativeSearchQueryBuilder();
BoolQueryBuilder booleanQuery = QueryBuilders.boolQuery();
String valueToSearch = "johny";
nativeSearchQuery.withQuery(booleanQuery.must(
        QueryBuilders.matchQuery("common_criteria", valueToSearch)
                .fuzziness(Fuzziness.AUTO)
                .operator(Operator.AND)));
Logging the request sent to Elastic I have:
{
"bool" : {
"must" :
{
"match" : {
"common_criteria" : {
"query" : "johny",
"operator" : "AND",
"fuzziness" : "AUTO",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
},
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
With that request I get 0 results. I suspect the request is not correct because of the must/match condition, and maybe the common_criteria field is not well defined either.
Thanks in advance for your help and explanations.
EDIT: After trying multi_match query.
Following @rabbitbr's suggestion I tried the multi_match query, but it does not seem to work. This is an example of a request sent to Elastic (with 0 results):
{
  "bool" : {
    "must" : {
      "multi_match" : {
        "query" : "John Smith USA",
        "fields" : [
          "name^1.0",
          "username^1.0",
          "phone^1.0",
          "country^1.0"
        ],
        "type" : "best_fields",
        "operator" : "AND",
        "slop" : 0,
        "fuzziness" : "AUTO",
        "prefix_length" : 0,
        "max_expansions" : 50,
        "zero_terms_query" : "NONE",
        "auto_generate_synonyms_phrase_query" : true,
        "fuzzy_transpositions" : true,
        "boost" : 1.0
      }
    },
    "adjust_pure_negative" : true,
    "boost" : 1.0
  }
}
That request does not return a result.

I would try the multi_match query before creating a field to store all the others in one place.
The multi_match query builds on the match query to allow multi-field queries.
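
For the requirement in the question (every searched value must be present, in any order), here is a rough sketch of two request shapes that may be worth trying. It only builds the request bodies as Python dicts; the helper names all_values_query and cross_fields_query are hypothetical, and the field names are taken from the mapping in the question. Two things to keep in mind: copy_to is applied at index time, so documents indexed before the mapping change will not have common_criteria populated until they are reindexed; and a best_fields multi_match with operator AND requires all terms to appear in a single field, which would explain 0 hits for "John Smith USA".

```python
import json


def all_values_query(values, field="common_criteria"):
    """Build a bool/must query with one match_phrase clause per searched value.

    Every value has to be found in the field, in any order.
    """
    return {
        "query": {
            "bool": {
                "must": [{"match_phrase": {field: value}} for value in values]
            }
        }
    }


def cross_fields_query(text, fields=("name", "username", "phone", "country")):
    """Build a multi_match query that treats the listed fields as one combined field."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "type": "cross_fields",
                "operator": "and",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(all_values_query(["John Smith", "USA"]), indent=2))
    print(json.dumps(cross_fields_query("John Smith USA"), indent=2))
```

Note that match_phrase matches the phrase anywhere inside a value, so it is looser than a strict whole-value match, and that the fuzziness parameter from the original request is left out because it cannot be combined with the cross_fields type.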

Related

Add new suggestion context - ElasticSearch Suggestions

I have this search context in my index mapping
Index: region
"place_suggest": {
"type" : "completion",
"analyzer" : "simple",
"preserve_separators" : true,
"preserve_position_increments" : true,
"max_input_length" : 50,
"contexts" : [
{
"name" : "place_type",
"type" : "CATEGORY",
"path" : "place_type"
}
]
}
And I want to add a new context to this mapping
{
"name": "restricted",
"type": "CATEGORY",
"path": "restricted"
}
I've tried using Update Mapping API to add this new context like this:
PUT region_test/_mapping/
{
  "properties" : {
    "place_suggest" : {
      "contexts": [
        {
          "name": "restricted",
          "type": "CATEGORY",
          "path": "restricted"
        }
      ]
    }
  }
}
I'm using Kibana dev tools for running this query.
You will not be able to edit your field by adding the new context.
You need to create a new mapping and re-index your index.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html#change-existing-mapping-parms
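
A minimal sketch of that approach, written as Python dicts for the two request bodies involved: the body used to create a new index whose place_suggest field declares both contexts, and the body for the _reindex API. The target index name region_v2 and the helper names are assumptions made for illustration, a typeless (7.x or later) mapping is assumed, and the other fields of the real mapping are omitted.

```python
import json

# Context already present in the question's mapping.
EXISTING_CONTEXT = {"name": "place_type", "type": "CATEGORY", "path": "place_type"}
# Context the question wants to add.
NEW_CONTEXT = {"name": "restricted", "type": "CATEGORY", "path": "restricted"}


def new_index_body(contexts):
    """Body for creating the new index (e.g. PUT region_v2, a hypothetical name)."""
    return {
        "mappings": {
            "properties": {
                "place_suggest": {
                    "type": "completion",
                    "analyzer": "simple",
                    "preserve_separators": True,
                    "preserve_position_increments": True,
                    "max_input_length": 50,
                    "contexts": list(contexts),
                }
            }
        }
    }


def reindex_body(source, dest):
    """Body for POST _reindex, copying documents from source to dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


if __name__ == "__main__":
    print(json.dumps(new_index_body([EXISTING_CONTEXT, NEW_CONTEXT]), indent=2))
    print(json.dumps(reindex_body("region", "region_v2"), indent=2))
```

The first body would be sent when creating the new index and the second to the _reindex endpoint afterwards; searches can then be pointed at the new index.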

Elasticsearch search by keywords and boost

I'm using Spring Boot 2.0.5, Spring Data Elasticsearch 3.1.0 and Elasticsearch 6.4.2
I have loaded ElasticSearch with a set of articles. For each article, I have a keywords field with a string list of keywords e.g.
"keywords": ["Football", "Barcelona", "Cristiano Ronaldo", "Real Madrid", "Zinedine Zidane"],
For each user using the application, they can specify their keyword preferences with a weight factor.
e.g.
User 1:
keyword: Football, weight:3.0
keyword: Tech, weight:1.0
keyword: Health, weight:2.0
What I would like to do is find articles based on their keyword preferences and display them based on their weight factor preference (I think this relates to elastic search boost) and sort by latest article time.
This is what I have so far (only for one keyword):
public Page<Article> getArticles(String keyword, float boost, Pageable pageable) {
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.matchQuery("keywords", keyword).boost(boost))
.build();
return articleRepository.search(searchQuery);
}
As a user may have n number of keyword preferences, what would I need to change in the above code to support this?
Any suggestions would be highly appreciated.
Solution
OK, I enabled logging so I could see the Elasticsearch query being produced. Then I updated the getArticles method to the following:
public Page<Article> getArticles(List<Keyword> keywords, Pageable pageable) {
    BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
    List<FilterFunctionBuilder> functions = new ArrayList<FilterFunctionBuilder>();
    for (Keyword keyword : keywords) {
        queryBuilder.should(QueryBuilders.termsQuery("keywords", keyword.getKeyword()));
        functions.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                QueryBuilders.termQuery("keywords", keyword.getKeyword()),
                ScoreFunctionBuilders.weightFactorFunction(keyword.getWeight())));
    }
    FunctionScoreQueryBuilder functionScoreQueryBuilder = QueryBuilders.functionScoreQuery(queryBuilder,
            functions.toArray(new FunctionScoreQueryBuilder.FilterFunctionBuilder[functions.size()]));
    NativeSearchQueryBuilder searchQuery = new NativeSearchQueryBuilder();
    searchQuery.withQuery(functionScoreQueryBuilder);
    searchQuery.withPageable(pageable);
    // searchQuery.withSort(SortBuilders.fieldSort("createdDate").order(SortOrder.DESC));
    return articleRepository.search(searchQuery.build());
}
This produces the following elastic search query:
{
"from" : 0,
"size" : 20,
"query" : {
"function_score" : {
"query" : {
"bool" : {
"should" : [
{
"terms" : {
"keywords" : [
"Football"
],
"boost" : 1.0
}
},
{
"terms" : {
"keywords" : [
"Tech"
],
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"functions" : [
{
"filter" : {
"term" : {
"keywords" : {
"value" : "Football",
"boost" : 1.0
}
}
},
"weight" : 3.0
},
{
"filter" : {
"term" : {
"keywords" : {
"value" : "Tech",
"boost" : 1.0
}
}
},
"weight" : 1.0
}
],
"score_mode" : "multiply",
"max_boost" : 3.4028235E38,
"boost" : 1.0
}
},
"version" : true
}
What you are looking for is the function_score query. Something along the lines of:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {"term": {"keywords": "Football"}},
            {"term": {"keywords": "Tech"}},
            {"term": {"keywords": "Health"}}
          ]
        }
      },
      "functions": [
        {"filter": {"term": {"keywords": "Football"}}, "weight": 3},
        {"filter": {"term": {"keywords": "Tech"}}, "weight": 1},
        {"filter": {"term": {"keywords": "Health"}}, "weight": 2}
      ]
    }
  }
}
See here for API help https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-compound-queries.html#java-query-dsl-function-score-query
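
To show how the same request could be generated for any number of preferences, here is a small sketch that builds the function_score body from a mapping of keyword to weight. The helper name keyword_preference_query is hypothetical. It assumes the keywords field is indexed without analysis (term queries do not analyze their input), and the createdDate sort field is borrowed from the commented-out line in the Java code above. The score_mode of "sum" is a design choice made here, not something the thread prescribes; the logged query above uses "multiply".

```python
import json


def keyword_preference_query(weights, field="keywords", score_mode="sum", sort_field=None):
    """Build a function_score request body from {keyword: weight} preferences."""
    # One should clause per keyword: an article matches if it has any of them.
    should = [{"term": {field: keyword}} for keyword in weights]
    # One weight function per keyword, applied only to articles carrying that keyword.
    functions = [
        {"filter": {"term": {field: keyword}}, "weight": weight}
        for keyword, weight in weights.items()
    ]
    body = {
        "query": {
            "function_score": {
                "query": {"bool": {"should": should}},
                "functions": functions,
                "score_mode": score_mode,
            }
        }
    }
    if sort_field is not None:
        # Order by score first, then newest first among equal scores.
        body["sort"] = ["_score", {sort_field: {"order": "desc"}}]
    return body


if __name__ == "__main__":
    preferences = {"Football": 3.0, "Tech": 1.0, "Health": 2.0}
    print(json.dumps(keyword_preference_query(preferences, sort_field="createdDate"), indent=2))
```

With "multiply", an article tagged with both Football and Tech would have its function weights combined as 3 x 1; with "sum" they combine as 3 + 1, which may be closer to the intent of additive preferences.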

Exact phrase match in ElasticSearch

I'm trying to achieve exact search by phrase in Elastic, using my existing index (full-text). When user is searching, say, "Sanity Testing", the result should bring all the docs with "Sanity Testing" (case-insensitive), but not "Sanity tested".
My mapping:
{
"doc": {
"properties": {
"file": {
"type": "attachment",
"path": "full",
"fields": {
"file": {
"type": "string",
"term_vector":"with_positions_offsets",
"analyzer":"o3analyzer",
"store": true
},
"title" : {"store" : "yes"},
"date" : {"store" : "yes"},
"keywords" : {"store" : "yes"},
"content_type" : {"store" : "yes"},
"content_length" : {"store" : "yes"},
"language" : {"store" : "yes"}
}
}
}
}
}
As I understand it, there's a way to add another index with a "raw" analyzer, but I'm not sure this will work because the search needs to be case-insensitive. I also don't want to rebuild the indexes, as there are hundreds of machines with tons of documents already indexed, so it may take ages.
Is there a way to run such a query? I'm now trying to search using the following query:
{
  "query": {
    "match_phrase": {
      "file": "Sanity Testing"
    }
  }
}
and it brings me both "Sanity Testing" and "Sanity Tested".
Any help appreciated!

Elasticsearch Completion in middle of the sentence

Is it possible to perform completion in Elasticsearch and get results even if the text is from the middle of the input?
For instance:
"TitleSuggest" : {
"type" : "completion",
"index_analyzer" : "simple",
"search_analyzer" : "simple",
"payloads" : true,
"preserve_position_increments" : false,
"preserve_separators" : false
}
That's my current mapping and my query is
{
"passport": {
"text": "Industry Overview",
"completion": {
"field": "TitleSuggest",
"fuzzy": {
"edit_distance": 2
}
}
}
}
But nothing is returned, even though I have documents that contain Industry Overview in their input. For instance, if I search only for Industry, I get:
{
"text" : "Industry",
"offset" : 0,
"length" : 8,
"options" : [{
"text" : "Airline Industry Sees Recovery in 2014",
"score" : 16
}, {
"text" : "Alcoholic Drinks Industry Overview",
"score" : 16
}, {
"text" : "Challenges in the Pet Care Industry For 2014",
"score" : 16
}
]
}
I can achieve that by using nGrams, but I'd like to get this done using completion suggesters.
So my goal would be to get this if I type in Industry Overview:
{
"text" : "Industry Overview",
"offset" : 0,
"length" : 8,
"options" : [{
"text" : "Alcoholic Drinks Industry Overview",
"score" : 16
}
]
}
I've tried using a shingle analyzer; that didn't solve the problem, and I didn't find anything useful on Google.
ES Version : 1.5.1
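
The thread has no answer. Because the completion suggester matches from the start of each input, one workaround that is sometimes used is to index every word-boundary suffix of the title as an additional input, so that "Industry Overview" becomes a prefix of one of them. The sketch below only generates those inputs and the field value to index; the function names are hypothetical, and whether this behaves as desired on ES 1.5.1 with the mapping above has not been verified here.

```python
def suffix_inputs(title):
    """Return the title plus every suffix that starts at a word boundary."""
    words = title.split()
    return [" ".join(words[i:]) for i in range(len(words))]


def title_suggest_value(title):
    """Value for the TitleSuggest field: many inputs, one displayed output."""
    return {"input": suffix_inputs(title), "output": title}


if __name__ == "__main__":
    for candidate in suffix_inputs("Alcoholic Drinks Industry Overview"):
        print(candidate)
```

The trade-off is a larger suggester structure, since each title contributes one input per word.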

How should I query Elastic Search given my mapping and using keywords?

I have a very simple mapping which looks like this (I streamlined the example a bit):
{
"location" : {
"properties": {
"name": { "type": "string", "boost": 2.0, "analyzer": "snowball" },
"description": { "type": "string", "analyzer": "snowball" }
}
}
}
Now I index a lot of locations using some random values which are based on real English words.
I'd like to be able to search for locations that match any of the given keywords in either the name or the description field (name is more important, hence the boost I gave it). I tried a few different queries and they don't return any results.
{
"fields" : ["name", "description"],
"query" : {
"terms" : {
"name" : ["savage"],
"description" : ["savage"]
},
"from" : 0,
"size" : 500
}
}
Considering there are locations which have the word savaged in the description it should get me some results (savage is the stem of savaged). It yields 0 results using the above query. I've been using curl to query ES:
curl -XGET -d @query.json http://localhost:9200/myindex/locations/_search
If I use query string instead:
curl -XGET http://localhost:9200/fieldtripfinder/locations/_search?q=description:savage
I actually get one result (of course now it would be searching the description field only).
Basically I am looking for a query that will do a OR kind of search using multiple keywords and compare them to the values in both the name and the description field.
Snowball stems "savage" into "savag", which is why the term "savage" didn't return any results. However, when you specify "savage" in the URL, it gets analyzed and you get results. Depending on your intention, you can either use the correct stem ("savag") or have your terms analyzed by using a "match" query instead of "terms":
{
  "fields" : ["name", "description"],
  "query" : {
    "bool" : {
      "should" : [
        {"match" : {"name" : "savage"}},
        {"match" : {"description" : "savage"}}
      ]
    }
  },
  "from" : 0,
  "size" : 500
}
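
To extend this to several keywords, a small sketch that generates one match clause per keyword and field may help. The helper name any_keyword_query is hypothetical; the field names come from the mapping in the question, and from and size are placed at the top level of the request body, next to query.

```python
import json


def any_keyword_query(keywords, fields=("name", "description"), size=500):
    """Build a bool/should query: a document matches if any keyword matches any field."""
    should = [
        {"match": {field: keyword}}
        for keyword in keywords
        for field in fields
    ]
    return {
        "query": {"bool": {"should": should}},
        "from": 0,
        "size": size,
    }


if __name__ == "__main__":
    print(json.dumps(any_keyword_query(["savage", "river"]), indent=2))
```

Because match analyzes its input with the field's analyzer, "savage" is stemmed the same way as the indexed text, so no manual stemming is needed.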
