Elastic Search multi match - elasticsearch

I'm new to Elasticsearch. I have a requirement to return all results that contain a given keyword.
public class People {
public string UserId {get; set;}
public string FirstName {get; set;}
public string LastName {get; set;}
}
I want to find all People where any of the three fields contains the keyword, similar to SQL's LIKE "%keyword%".
For example, I have a People instance:
var people = new People() {
    UserId = "lastname.middlename.firstname",
    FirstName = "firstname",
    LastName = "lastname"
};
How can I retrieve this People by searching for the keyword ddl? How should I set up the index, and how should I query it?
I have tried querying with NEST as shown below:
var keyword = "ddl";
var result = await _client.SearchAsync<People>(s =>
s.Query(q => q.MultiMatch(m => m.Fields(f => f.Field(ff => ff.UserId).Field(ff => ff.FirstName).Field(ff => ff.LastName)).Query(keyword)))
);
It doesn't work. It only works when I change the keyword to firstname, lastname, or lastname.middlename.firstname.
Is there any way to meet the requirement?

The short answer is that you would want to configure an analyzer for each of the target fields that tokenizes terms into trigrams, probably using the ngram token filter with min_gram and max_gram set to 3. This analysis will generate a ddl token for middlename that would then match your query.
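To make the trigram idea concrete, here is a minimal plain-Java sketch of what a trigram filter (min_gram = max_gram = 3) would emit for a single term. It is only an illustration of the tokenization, not Elasticsearch code; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: mimics the tokens a trigram filter emits for one term.
final class TrigramSketch {

    // Returns every contiguous substring of length n, in order of position.
    static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [mid, idd, ddl, dle, len, ena, nam, ame]
        System.out.println(ngrams("middlename", 3));
    }
}
```

Because ddl is one of the emitted tokens, a query for ddl has an indexed term to match against.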
The longer answer is that you'll want to read about Analysis and how to write and test analyzers with the .NET client. You may also want to go through the example repository that builds a NuGet search application. It's a fairly involved walkthrough that covers a number of concepts, including analysis.

To search on parts of your fields, you should use an ngram tokenizer in your mapping.
It will tokenize your fields using windows of different size.
This should solve your problem, but there are several points to watch:
You will most likely want to use this analyzer only at index time. Using this tokenizer at both index and search time is likely to generate a LOT of irrelevant results.
Don't set the minimum window size (the min_gram parameter) too low. In your case 3 will work.
The size of your index can grow substantially.
Another solution, simpler to implement but usually less efficient, is to use wildcard queries in a query string. This is very similar to the LIKE operator in SQL.
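The point about using the ngram analyzer only at index time can be seen in a small plain-Java simulation. This is a sketch under assumptions: it uses a gram range of 3 to 5 and treats "any shared gram" as a match, which roughly mirrors OR-style matching; all names are invented for the illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: compares keeping the keyword whole vs. n-gramming it at search time.
final class IndexTimeNgramSketch {

    // All substrings of the term whose length is between min and max (inclusive).
    static Set<String> ngrams(String term, int min, int max) {
        Set<String> grams = new HashSet<>();
        for (int n = min; n <= max; n++) {
            for (int i = 0; i + n <= term.length(); i++) {
                grams.add(term.substring(i, i + n));
            }
        }
        return grams;
    }

    // Search keeps the keyword whole: it matches only if it is itself an indexed gram.
    static boolean matchesWholeKeyword(Set<String> indexed, String keyword) {
        return indexed.contains(keyword);
    }

    // Search also n-grams the keyword: any single shared gram counts as a match.
    static boolean matchesNgrammedKeyword(Set<String> indexed, String keyword, int min, int max) {
        for (String gram : ngrams(keyword, min, max)) {
            if (indexed.contains(gram)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = ngrams("middlename", 3, 5);
        // "ddlx" is not a substring of "middlename", so only the n-grammed search matches it.
        System.out.println(matchesWholeKeyword(indexed, "ddlx"));
        System.out.println(matchesNgrammedKeyword(indexed, "ddlx", 3, 5));
    }
}
```

Note that keeping the keyword whole only works for keywords whose length falls inside the gram range, which is one reason the choice of min_gram and max_gram matters.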

Related

How to query upon the given parameters in Mongo repository

I have a Mongo repository query as below. If we provide both name and price, it returns a response. I want to get a response if we give only name, only price, or both. How can I make those parameters optional? If we provide both name and price I want the combined result; otherwise I want to search on just the field that was given. I'd much appreciate your help.
List<Response> findByNameAndPrice(String name, int price);
In such scenarios you need either a custom query or QueryDSL.
1) A custom query such as the one below. You will need to change the query whenever you want to add new optional parameters.
@Query(value = "{$or:[{name: ?0}, {?0: null}], $or:[{price: ?1}, {?1: null}]}")
List<Response> findByNameAndPrice(String name, Integer price);
2) The QueryDSL approach, where you can add as many optional parameters as you like without modifying your code. It generates the query automatically.
Refer to this link for more: Spring Data MongoDB Repository - JPA Specifications like
I don't believe you'll be able to do that with the method name approach to query definition. From the documentation (reference):
There is a JIRA ticket regarding this which is still under investigation by the Spring team.
You can try this way
In repository
List<Response> findByName(String name);
List<Response> findByPrice(int price);
List<Response> findByNameAndPrice(String name, int price);
In your service file
public List<Response> findByNameAndPrice(String name, int price) {
    // No name supplied: search by price only.
    if (name == null) {
        return repository.findByPrice(price);
    }
    // No price supplied (0 is treated as "not set"): search by name only.
    if (price == 0) {
        return repository.findByName(name);
    }
    // Both supplied: search on both fields.
    return repository.findByNameAndPrice(name, price);
}
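As the number of optional parameters grows, one branch per combination gets unwieldy. A possible alternative is to collect only the supplied parameters into a filter and build the query from that. The sketch below shows just the collecting step in plain Java; the class name, the method name, and the use of Integer (so that "not supplied" is distinguishable from 0) are assumptions for the illustration, not part of the original code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: keeps just the parameters the caller actually supplied.
final class OptionalFilterSketch {

    // Builds a field -> value filter; null (or empty) parameters are left out.
    static Map<String, Object> buildFilter(String name, Integer price) {
        Map<String, Object> filter = new LinkedHashMap<>();
        if (name != null && !name.isEmpty()) {
            filter.put("name", name);
        }
        if (price != null) {
            filter.put("price", price);
        }
        return filter;
    }

    public static void main(String[] args) {
        System.out.println(buildFilter("pen", null)); // only name is present
        System.out.println(buildFilter(null, 10));    // only price is present
        System.out.println(buildFilter("pen", 10));   // both are present
    }
}
```

With Spring Data MongoDB, each entry could then be turned into a Criteria.where(field).is(value) clause and run through MongoTemplate, though that wiring is not shown here.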
Here is a simple solution I found using a MongoDB regex. We can write a query like this:
@Query("{name : {$regex : ?0}, type : {$regex : ?1}}")
List<String> findbyNameAndType(String name, String type);
The trick is that when we want to query on only some of the parameters, we set the other parameters to a default value. For example, suppose we provide only name. Then we set type to match all possible values by giving it the default pattern ".*".
This regex pattern matches any string.
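The defaulting step could look something like the following plain-Java sketch. The helper names are hypothetical, and java.util.regex with find() is used here only to imitate unanchored regex matching; it is not the MongoDB engine.

```java
import java.util.regex.Pattern;

// Illustration only: substitutes a match-everything pattern for missing parameters.
final class RegexDefaultSketch {

    static final String MATCH_ALL = ".*";

    // Returns the caller's pattern, or ".*" when the parameter was not supplied.
    static String orMatchAll(String pattern) {
        return (pattern == null || pattern.isEmpty()) ? MATCH_ALL : pattern;
    }

    // Unanchored match: true if the (defaulted) pattern occurs anywhere in the value.
    static boolean matches(String pattern, String value) {
        return Pattern.compile(orMatchAll(pattern)).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(matches(null, "stationery")); // missing parameter matches anything
        System.out.println(matches("pen", "pencil"));    // supplied parameter is applied
        System.out.println(matches("pen", "book"));
    }
}
```

If the parameters come from user input and should be treated literally, wrapping them with Pattern.quote before use avoids regex metacharacters being interpreted.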

JHipster - Elasticsearch examples for search field

Having an example entity like
entity Vehicle {
id Long,
manufacturer String,
model String
}
and using Elasticsearch, what do queries in the search input field of the front end provided by the JHipster application have to look like?
A search might involve only one column or a combination of the given columns.
Let's say:
id = 1
manufacturer = Ferrari
manufacturer = Ferrari AND model = Model 1
or involving other query keywords such as LIKE.
How do I specify those queries in the search input field?

Optimize Sitecore Lucene / Solr query

I am trying to optimize some Lucene/Solr queries that are done via the Sitecore ContentSearch API. Specifically, when it comes to searching a MultiListField.
Environment:
Sitecore 8.1u2, Solr
I have the following method for querying the values of a multilist field:
public static Expression<Func<SearchResultItem, bool>> MultiFieldContainsExpression(IEnumerable<string> fieldNames, IEnumerable<string> ids)
{
    // fieldNames = ["field_A", "field_X"]
    // ids = [GUID_A, GUID_X]
    Expression<Func<SearchResultItem, bool>> expression = PredicateBuilder.True<SearchResultItem>();
    foreach (string fieldName in fieldNames)
    {
        foreach (string id in ids)
        {
            // OR in one "field contains normalized GUID" clause per field/id pair.
            expression = expression.Or(i => i[fieldName].Contains(IdHelper.NormalizeGuid(id, true)));
        }
    }
    return expression;
}
The resulting Lucene query looks like this:
((field_A:(*GUID_A*) OR field_A:(*GUID_X*) OR field_X:(*GUID_A*) OR field_X:(*GUID_X*)))
I want the query to be more like this (or even better if possible):
((field_A:(*GUID_A* OR *GUID_X*)) OR (field_X:(*GUID_A* OR *GUID_X*)))
Basically, to check if the array of values in the field contains any value from another array. Thank you very much in advance.
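To pin down the grouped shape being asked for, here is a small plain-Java sketch that only assembles the query text: one clause per field, with all values ORed inside it. It is string construction for illustration, not Sitecore ContentSearch code, and the class and method names are invented.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: builds the grouped query text, one clause per field.
final class GroupedQuerySketch {

    static String build(List<String> fields, List<String> ids) {
        // The wildcarded values are the same for every field, so build them once.
        String values = ids.stream()
                .map(id -> "*" + id + "*")
                .collect(Collectors.joining(" OR "));
        // Wrap the values in a single group per field and OR the fields together.
        return fields.stream()
                .map(field -> field + ":(" + values + ")")
                .collect(Collectors.joining(" OR ", "(", ")"));
    }

    public static void main(String[] args) {
        // Prints (field_A:(*GUID_A* OR *GUID_X*) OR field_X:(*GUID_A* OR *GUID_X*))
        System.out.println(build(List.of("field_A", "field_X"), List.of("GUID_A", "GUID_X")));
    }
}
```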
Sitecore by default indexes a multilist field as a space-separated list of lowercase GUIDs (Guid.ToString("N")). It could be useful to denormalize the relationship and store the item name or content within the item document through the use of a computed field. With a computed field, you can iterate over the referenced items and turn them into a single field containing their names. You'll still want to keep the GUID field around for cases where you know the exact GUID and want to limit the results to one particular referenced item.

Spring data elastic search - Query - Full text search

I am trying to use Elasticsearch for full-text search and Spring Data to integrate Elasticsearch with my application.
For example,
There are 6 fields to be indexed.
1)firstName
2)lastName
3)title
4)location
5)industry
6)email
http://localhost:9200/test/_mapping/
I can see these fields in the mapping.
Now I would like to search against these fields with a single search input.
For example, when I search for "mike 123", it has to search against all 6 fields.
In Spring data repository,
The below method works to search only in firstName.
Collection<Object> findByFirstNameLike(String searchInput)
But, I would like to search against all the fields.
I tried,
Collection<Object> findByFirstNameLikeOrLastNameLikeOrTitleLikeOrLocationLikeOrIndustryLikeOrEmailLike(String searchInput, String searchInput1, String searchInput2, String searchInput3, String searchInput4, String searchInput5)
Here, even though the input string is the same, I need to pass it as 6 separate parameters. The method name also becomes very long with multiple fields.
Is there any way to make it simpler with @Query or something similar?
For example:
Collection<Object> findByInput(String inputString)
Also, one of the fields should be boosted.
For example,
when I search for "mike mat", any match in firstName should come first in the results, even if there are exact matches in the other fields.
Thanks
Let's suppose your search term is in the variable query. You can use the search method in ElasticsearchRepository:
repo.search(queryStringQuery(query));
To use queryStringQuery, add the following import:
import static org.elasticsearch.index.query.QueryBuilders.queryStringQuery;
I found a way to achieve this and am posting it here. Hope this helps.
QueryBuilder queryBuilder = boolQuery().should(
queryString("Mike Mat").analyzeWildcard(true)
.field("firstName", 2.0f).field("lastName").field("title")
.field("location").field("industry").field("email"));
Thanks
I'm not a Spring Data Elasticsearch expert, but I see two directions you can go. The first would be to use the @Query option. That way you can create your own query. The second would be to use the example in the Filter builder section:
http://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#elasticsearch.misc.filter
Within Elasticsearch you would want to use the multi_match query:
http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-multi-match-query.html
In java such a query would look like this:
QueryBuilder qb = multiMatchQuery(
"kimchy elasticsearch",
"user", "message"
);
Example coming from: http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/query-dsl-queries.html#multimatch
We can write our own custom query as below.
We can specify the index and a routing value (the latter is used if an alias is used):
SearchQuery searchQuery = new NativeSearchQueryBuilder().withIndices(INDEX)
.withRoute(yourQueryBuilderHelper.getRouteValue())
.withQuery(yourQueryBuilderHelper.buildQuery(yourSearchFilterRequestObject))
.withFilter(yourQueryBuilderHelper.buildFilter(yourSearchFilterRequestObject)).withTypes(TYPE)
.withSort(yourQueryBuilderHelper.buildSortCriteria(yourSearchFilterRequestObject))
.withPageable(yourQueryBuilderHelper.buildPaginationCriteria(yourSearchFilterRequestObject)).build();
FacetedPage<YourDocumentEntity> searchResults = elasticsearchTemplate.queryForPage(searchQuery, YourDocumentEntity.class);
It's good to use your own query builder helper, which separates the query-building responsibility from your Elasticsearch service.
Hope this helps
Thanks
The QueryBuilder class is helpful for querying Elasticsearch from a Spring DAO:
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryBuilder;
// Chain all clauses on one bool query; only the final call ends with a semicolon.
QueryBuilder qb = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery("state", "KA"))
    .must(QueryBuilders.termQuery("content", "test4"))
    .mustNot(QueryBuilders.termQuery("content", "test2"))
    .should(QueryBuilders.termQuery("content", "test3"));
Try it like this; you can even set the importance (boost) of each field:
QueryBuilder queryBuilder = QueryBuilders.multiMatchQuery(query)
.field("name", 2.0f)
.field("email")
.field("title")
.field("jobDescription", 3.0f)
.type(MultiMatchQueryBuilder.Type.PHRASE_PREFIX);
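As a rough intuition for what the per-field weights do, here is a toy plain-Java scorer. This is emphatically not how Elasticsearch or Lucene computes relevance; it is an assumed, simplified model in which each matching field contributes its boost to the total. All names are invented for the illustration.

```java
import java.util.Locale;
import java.util.Map;

// Toy model only: each field that contains the term adds its boost to the score.
final class FieldBoostSketch {

    static double score(Map<String, String> doc, Map<String, Float> boosts, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        double total = 0;
        for (Map.Entry<String, Float> boost : boosts.entrySet()) {
            String text = doc.get(boost.getKey());
            // Case-insensitive "contains" stands in for a real analyzed match.
            if (text != null && text.toLowerCase(Locale.ROOT).contains(needle)) {
                total += boost.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = Map.of("name", 2.0f, "title", 1.0f);
        Map<String, String> matchInName = Map.of("name", "Mike Mat", "title", "Engineer");
        Map<String, String> matchInTitle = Map.of("name", "John", "title", "Assistant to Mike");
        // The document matching in the boosted field gets the higher toy score.
        System.out.println(score(matchInName, boosts, "mike"));
        System.out.println(score(matchInTitle, boosts, "mike"));
    }
}
```

In a real index a boost multiplies a field's contribution to the relevance score, so it makes matches in that field rank higher on average but does not by itself guarantee they always come first.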
Another way is to use a query string query:
// yourQueryHere is a placeholder for the String holding your search text.
Query searchQuery = new StringQuery(
    "{\"query\":{\"query_string\":{\"query\":\"" + yourQueryHere + "\"}}}");
SearchHits<Product> products = elasticsearchOperations.search(
    searchQuery,
    Product.class,
    IndexCoordinates.of(PRODUCT_INDEX_NAME));
This will search all the fields of the documents in the specified index.

Substring with spacebar search in RavenDB

I'm using such a query:
var query = "*" + QueryParser.Escape(input) + "*";
session.Query<User, UsersByEmailAndName>().Where(x => x.Email.In(query) || x.DisplayName.In(query));
With the support of a simple index:
public UsersByEmailAndName()
{
Map = users => from user in users
select new
{
user.Email,
user.DisplayName,
};
}
Here I've read that:
"By default, RavenDB uses a custom analyzer called
LowerCaseKeywordAnalyzer for all content. (...) The default values for
each field are FieldStorage.No in Stores and FieldIndexing.Default in
Indexes."
The index contains fields:
DisplayName - "jarek waliszko" and Email - "my_email@domain.com"
And finally the thing is:
If the query is something like *_email@* or *ali*, the result is fine. But when I use a space inside the query, e.g. *ek wa*, nothing is returned. Why, and how can I fix it?
Btw: I'm using RavenDB - Build #960
Change the Index option for the fields you want to search on to be Analyzed, instead of Default
Also, take a look here:
http://ayende.com/blog/152833/orders-search-in-ravendb
Lucene’s query parser interprets the space in the search term as a break in the actual query, and doesn’t include it in the search.
Any part of the search term that appears after the space is also disregarded.
So you should escape the space character by prepending a backslash to it.
Try to query *jarek\ waliszko*.
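The escaping step itself is a one-line string transformation. Here is a minimal sketch of it in plain Java (the class and method names are made up, and this is not RavenDB client code); it only shows what the escaped wildcard term looks like.

```java
// Illustration only: backslash-escapes whitespace and wraps the term in wildcards.
final class WhitespaceEscapeSketch {

    // Puts a backslash in front of every whitespace character.
    static String escapeWhitespace(String term) {
        // In the replacement, "\\\\" yields one literal backslash and $0 is the matched character.
        return term.replaceAll("\\s", "\\\\$0");
    }

    // Builds a contains-style wildcard term such as *ek\ wa*.
    static String wildcard(String term) {
        return "*" + escapeWhitespace(term) + "*";
    }

    public static void main(String[] args) {
        // Prints *ek\ wa*
        System.out.println(wildcard("ek wa"));
    }
}
```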
So, I've come up with an idea for how to do it. I don't know if this is the "right way", but it works for me.
The query changes to:
var query = string.Format("*{0}*", Regex.Replace(QueryParser.Escape(input), @"\s+", "-"));
The index changes to:
public UsersByEmailAndName()
{
Map = users => from user in users
select new
{
user.Email,
DisplayName = user.DisplayName.Replace(" ", "-"),
};
}
I've just changed whitespace to dashes in the user input text and spaces to dashes in the indexed display name. The query now gives the expected results. Nothing else really changed; I'm still using LowerCaseKeywordAnalyzer as before.

Resources