Filtered Query in Elasticsearch Java API - elasticsearch

I am a little confused about creating a filtered query with the Elasticsearch Java API.
The SearchRequestBuilder class has a setPostFilter method, and its javadoc clearly says that the filter is applied after the query is executed.
However, there is no setFilter method, or any other method that would apply a filter before the
query is executed. How do I create a filtered query (which basically applies the filter before the query is executed) here? Am I missing something?

FilteredQueryBuilder builder = QueryBuilders.filteredQuery(
    QueryBuilders.termQuery("test", "test"),
    FilterBuilders.termFilter("test", "test"));
This builds the filtered query. The first argument to filteredQuery is the query and the second argument is the filter.
Update: the filtered query is deprecated in Elasticsearch 2.0+.
Hope it helps!

QueryBuilders.filteredQuery is deprecated in API v. 2.0.0 and later.
Instead, filters and queries have "equal rights". FilterBuilders no longer exists and all filters are built using QueryBuilders.
To implement the query with filter only (in this case, geo filter), you would now do:
QueryBuilder query = QueryBuilders.geoDistanceQuery("location")
    .point(center.getLatitude(), center.getLongitude())
    .distance(radius, DistanceUnit.METERS);
// and then execute the request (setQuery alone only returns the request builder)
SearchResponse resp = client.prepareSearch(index).setQuery(query).execute().actionGet();
If you want to query by two terms, you would need to use boolQuery with must:
QueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery("user", "ben"))
    .must(QueryBuilders.termQuery("rank", "mega"));
// and then execute the request (setQuery alone only returns the request builder)
SearchResponse resp = client.prepareSearch(index).setQuery(query).execute().actionGet();

If you just want to apply a filter without a query, you can do it like this:
FilteredQueryBuilder builder = QueryBuilders.filteredQuery(
    QueryBuilders.matchAllQuery(),
    FilterBuilders.termFilter("username", "stackoverflow"));
Ref: Filtering without a query

It seems this can be achieved by wrapping the query (a BoolQueryBuilder object): pass it as the argument to boolQuery().filter(..) and then set the result with setQuery(), as you suggested. The "score" key in the response is always 0 (though documents are found).
Be careful here! A score of 0 on every hit is what a bool query returns when it contains only filter clauses: clauses in filter context do not contribute to the score. If you want every hit to carry a fixed score of 1.0 instead (still without any score calculation), wrap the query in a constant_score query.
To create a query in filter context, where no score is calculated (score = 1.0), do the following (maybe there is a better way?):
QueryBuilder qb = QueryBuilders.constantScoreQuery(QueryBuilders.boolQuery().must(QueryBuilders.matchQuery("name", "blub")));
This returns the same results as:
GET /indexName/typeName/_search
{
    "filter": {
        "query": {
            "bool": {
                "must": [
                    { "match": {
                        "name": "blub"
                    }}
                ]
            }
        }
    }
}
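The two score values discussed above can be illustrated with a toy model in plain Java. This is not Elasticsearch code: FilterContextSketch, boolFilterOnly and constantScore are hypothetical names, and documents are reduced to a single name string. The sketch only mirrors the behaviour described in the Elasticsearch documentation, namely that a bool query containing nothing but filter clauses gives every hit a score of 0, while a constant_score query gives every hit its boost, which is 1.0 by default.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model only: none of these names exist in the Elasticsearch client.
final class FilterContextSketch {

    /** Models a bool query with only filter clauses: every hit gets score 0. */
    static Map<String, Float> boolFilterOnly(List<String> names, Predicate<String> filter) {
        return scoreMatches(names, filter, 0.0f);
    }

    /** Models a constant_score query: every hit gets the default boost of 1.0. */
    static Map<String, Float> constantScore(List<String> names, Predicate<String> filter) {
        return scoreMatches(names, filter, 1.0f);
    }

    /** The filter decides only whether a document is a hit; the score is a fixed value. */
    private static Map<String, Float> scoreMatches(List<String> names, Predicate<String> filter, float score) {
        Map<String, Float> hits = new LinkedHashMap<>();
        for (String name : names) {
            if (filter.test(name)) {
                hits.put(name, score);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("blub", "other");
        Predicate<String> isBlub = "blub"::equals;
        System.out.println(boolFilterOnly(names, isBlub));
        System.out.println(constantScore(names, isBlub));
    }
}
```

In both cases the same documents are returned; only the fixed score attached to each hit differs.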

Since FilteredQueryBuilder is deprecated in the recent versions, one can use the QueryBuilders.boolQuery() instead, with a must clause for the query and a filter clause for the filter.
import static org.elasticsearch.index.query.QueryBuilders.*;
QueryBuilder builder = boolQuery()
    .must(termQuery("test", "test"))
    .filter(boolQuery().must(termQuery("test", "test")));
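For readers who send the request over REST instead of through the Java client, a bool query with a must clause and a filter clause corresponds roughly to the request body built below. This is a simplified sketch, assuming Elasticsearch 2.0 or later: the filter holds the term clause directly rather than wrapping it in a second bool, the helper names term and boolQuery are hypothetical, and the JSON is assembled by hand without escaping, so field names and values are assumed to contain no quotes.

```java
// Hand-rolled JSON for illustration; a real application would use a JSON library.
final class BoolFilterBodySketch {

    /** Renders a term clause; assumes field and value need no JSON escaping. */
    static String term(String field, String value) {
        return "{\"term\":{\"" + field + "\":\"" + value + "\"}}";
    }

    /** Renders a bool query with one scored must clause and one unscored filter clause. */
    static String boolQuery(String mustClause, String filterClause) {
        return "{\"query\":{\"bool\":{"
                + "\"must\":[" + mustClause + "],"
                + "\"filter\":[" + filterClause + "]"
                + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(boolQuery(term("test", "test"), term("test", "test")));
    }
}
```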

Related

Elastic search duplicate aggregation section in a query with NEST

I am creating a simple Elasticsearch request with NEST.
var searchRequest = new SearchRequest()
{
    Aggregations = new AggregationDictionary()
    {
    }
};
When I serialize the query with elasticClient.RequestResponseSerializer.SerializeToString(searchRequest), the result looks like this:
{"aggs":{},"aggregations":{}}
The aggregations section is created twice.
What should I do?
Is this a problem?
My NEST and Elasticsearch versions are both 7.6.2.
Thanks.

What is replacement for FullTextQuery.setCriteriaQuery() in Hibernate Search 6?

I am migrating from Hibernate Search 5 to Hibernate Search 6.
Although the documentation is really helpful, I cannot find an alternative for the Criteria query in Hibernate Search 6, and I could not work it out from the documentation.
This is the Hibernate Search 5 query that I am trying to convert,
final Criteria criteria = entityManager.unwrap(Session.class).createCriteria(KnowledgeData.class);
criteria.add(Restrictions.eq("deleted", knowledgeSearchRequest.isDeleted()));
if (knowledgeSearchRequest.isPublished()) {
    criteria.add(Restrictions.eq("published", knowledgeSearchRequest.isPublished()));
}
if (!allDesk) {
    criteria.add(Restrictions.eq("deskId", deskId));
    knowledgeSearchRequest.setDesk(deskId);
} else {
    Disjunction orJunction = Restrictions.disjunction();
    for (String desk : knowledgeSearchRequest.getDeskIds()) {
        orJunction.add(Restrictions.eq("deskId", desk));
    }
    criteria.add(orJunction);
}
if (knowledgeSearchRequest.getLang() != null && knowledgeSearchRequest.getLang().size() > 0) {
    criteria.createAlias("language", "lan");
    Disjunction disJunction = Restrictions.disjunction();
    for (String lang : knowledgeSearchRequest.getLang()) {
        disJunction.add(Restrictions.eq("lan.elements", lang));
    }
    criteria.add(disJunction);
}
if (knowledgeSearchRequest.getTags() != null && knowledgeSearchRequest.getTags().size() > 0) {
    criteria.createAlias("tags", "tag");
    Disjunction disJunction = Restrictions.disjunction();
    for (String tag : knowledgeSearchRequest.getTags()) {
        disJunction.add(Restrictions.eq("tag.elements", tag));
    }
    criteria.add(disJunction);
}
criteria.add(Restrictions.ne("dataType", DataType.FOLDER));
// if (userProvider.getCurrentUser().isSystemUser() || visibleToUser) {
final List<DataVisibility> visibility = new ArrayList<>();
visibility.add(DataVisibility.PUBLIC);
if (knowledgeSearchRequest.isAddCpUserDocs()) {
    visibility.add(DataVisibility.ALL_USERS_OF_CUSTOMER_PORTAL_ONLY);
}
if (knowledgeSearchRequest.isIncludeCpDocs()) {
    visibility.add(DataVisibility.CUSTOMER_PORTAL);
    visibility.add(DataVisibility.ALL_SIGNED_IN_USERS_OF_CUSTOMER_PORTAL_ONLY);
    visibility.add(DataVisibility.ALL_USERS_OF_CUSTOMER_PORTAL_ONLY);
}
criteria.add(Restrictions.in("visibility", visibility));
// }
if (knowledgeSearchRequest.isPublished()) {
    final long now = System.currentTimeMillis();
    criteria.add(Restrictions.or(
        Restrictions.and(Restrictions.isNotNull("validFrom"), Restrictions.lt("validFrom", now)),
        Restrictions.isNull("validFrom")));
    criteria.add(Restrictions.or(
        Restrictions.and(Restrictions.isNotNull("validTo"), Restrictions.gt("validTo", now)),
        Restrictions.isNull("validTo")));
}
And the predicate that I have built so far is:
searchPredicateFactory.bool(
    f -> f.should(searchPredicateFactory.phrase().field(KnowledgeData.STANDARD_FIELD_NAME_NAME).boost(3)
            .field(KnowledgeData.STANDARD_FIELD_NAME_DISPLAY_NAME).boost(3)
            .field("description").boost(2).field("content").matching(resultantQuery))
        .should(searchPredicateFactory.wildcard().field(KnowledgeData.STANDARD_FIELD_NAME_NAME).boost(3)
            .field(KnowledgeData.STANDARD_FIELD_NAME_DISPLAY_NAME).boost(3)
            .field("description").boost(2).field("content").matching(resultantQuery))).toPredicate();
Any leads are appreciated.
You wrote: "This is the Hibernate Search 5 query that I am trying to convert".
I'll nitpick a bit: this is not a Hibernate Search 5 query, this is a Hibernate (ORM) Criteria query. Those restrictions are executed against the database, not against the search indexes.
From the title of your question, I'll assume you are adding those restrictions to your Hibernate Search query using FullTextQuery.setCriteriaQuery(). Be aware that the documentation in Hibernate Search 5 states "using restriction (ie a where clause) on your Criteria query should be avoided" and the javadoc goes even further by stating "No where restriction can be defined".
Regardless... it seems it used to work in Hibernate Search 5, at least in some cases.
Now, to migrate this to Hibernate Search 6+, there is a detailed migration guide, with a section specifically about your problem:
Hibernate Search 6 does not allow adding a Criteria object to a search query.
[...]
If your goal is to apply a filter expressed by an SQL "where" clause executed in-database, rework your query to project on the entity ID, and execute a JPA/Hibernate ORM query after the search query to filter the entities and load them.
So in short, do something like this:
List<Long> ids = Search.session(entityManager).search(MyEntity.class)
    .select(f -> f.id(Long.class))
    .where(f -> ...)
    .fetchHits(20);
criteria.add(Restrictions.in("id", ids));
List<MyEntity> hits = criteria.list();
Note this is only a quick fix: just like setCriteriaQuery in Hibernate Search 5, this can perform very badly, plays very badly with pagination, and can result in incorrect hit counts.
I would recommend indexing the properties you use in your Criteria query, and defining your whole query using Hibernate Search only, so as to avoid running the query once against Elasticsearch and then once again against your database.
See also https://hibernate.atlassian.net/browse/HSEARCH-3630
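Why the two-step approach interacts badly with pagination and hit counts can be seen in a small in-memory sketch. Nothing here uses Hibernate Search or Hibernate ORM: searchPage stands in for the index query, filterInDatabase for the Criteria restrictions, and the ids, page size and "odd ids are deleted" rule are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// In-memory stand-ins only; no Hibernate API is involved.
final class PostFilterPagingSketch {

    /** Stand-in for the search query: returns one page of ids, in relevance order. */
    static List<Long> searchPage(List<Long> rankedIds, int offset, int pageSize) {
        int from = Math.min(offset, rankedIds.size());
        int to = Math.min(offset + pageSize, rankedIds.size());
        return new ArrayList<>(rankedIds.subList(from, to));
    }

    /** Stand-in for the database query: keeps only ids that pass the restrictions. */
    static List<Long> filterInDatabase(List<Long> ids, Predicate<Long> dbFilter) {
        List<Long> kept = new ArrayList<>();
        for (Long id : ids) {
            if (dbFilter.test(id)) {
                kept.add(id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> rankedIds = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L);
        Predicate<Long> notDeleted = id -> id % 2 == 0; // pretend odd ids are deleted

        List<Long> page = searchPage(rankedIds, 0, 4);           // the index returns 4 hits
        List<Long> visible = filterInDatabase(page, notDeleted); // the database keeps fewer
        System.out.println("requested 4, got " + visible.size() + ": " + visible);
    }
}
```

The page comes back short even though more matching entities exist further down the ranking, and the hit count reported by the search side still includes the entities that the database filter later removes.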

How to create a search with or clause using Elasticsearch

I want to create a query with the Elasticsearch Java API, but I don't know how to create an OR clause. What I want to query is:
SELECT *
FROM USERS
WHERE (user.name = "admin") AND (user.message LIKE "test*") AND (user.age = "30" OR user.status = "major")
I have created the query below, but I don't know how to create an OR clause like the one in the SQL query:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("name", "admin"));
boolQueryBuilder.must(QueryBuilders.matchQuery("message", "test*"));
boolQueryBuilder.must(QueryBuilders.matchQuery("age", "30"));
boolQueryBuilder.must(QueryBuilders.matchQuery("status","major"));
You simply need to capture the OR condition inside another bool/should query:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("name", "admin"));
boolQueryBuilder.mustNot(QueryBuilders.matchQuery("message", "test*"));
BoolQueryBuilder orQuery = QueryBuilders.boolQuery();
orQuery.should(QueryBuilders.matchQuery("age", "30"));
orQuery.should(QueryBuilders.matchQuery("status","major"));
orQuery.minimumNumberShouldMatch(1);
boolQueryBuilder.must(orQuery);
PS: not sure why you have a mustNot for your second constraint.
The "should" clause in a bool query works like "OR", but in Elasticsearch such queries are not strict boolean matches, because every clause also contributes to the score.
Have a look at:
https://www.elastic.co/guide/en/elasticsearch/guide/current/bool-query.html#bool-query
Also there are very good examples in the book Elasticsearch in Action.
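The nesting is easier to see when the boolean logic is written out with plain Java predicates. This sketch does not call Elasticsearch and ignores scoring and text analysis: must is modelled as "all clauses match" and should with a minimum of 1 as "at least one clause matches". The User class and the helper names are hypothetical, and the sketch follows the SQL in the question, so the message condition is a must clause and LIKE "test*" is approximated with startsWith.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Plain-Java model of the boolean structure; not an Elasticsearch query.
final class NestedBoolSketch {

    /** Hypothetical document type with the four fields from the SQL example. */
    static final class User {
        final String name;
        final String message;
        final int age;
        final String status;

        User(String name, String message, int age, String status) {
            this.name = name;
            this.message = message;
            this.age = age;
            this.status = status;
        }
    }

    /** Models bool/must: every clause has to match (AND). */
    @SafeVarargs
    static Predicate<User> must(Predicate<User>... clauses) {
        return user -> Arrays.stream(clauses).allMatch(clause -> clause.test(user));
    }

    /** Models bool/should with a minimum of 1: at least one clause has to match (OR). */
    @SafeVarargs
    static Predicate<User> should(Predicate<User>... clauses) {
        return user -> Arrays.stream(clauses).anyMatch(clause -> clause.test(user));
    }

    /** name = admin AND message starts with "test" AND (age = 30 OR status = major). */
    static Predicate<User> query() {
        Predicate<User> nameIsAdmin = user -> user.name.equals("admin");
        Predicate<User> messageStartsWithTest = user -> user.message.startsWith("test");
        Predicate<User> ageIs30 = user -> user.age == 30;
        Predicate<User> statusIsMajor = user -> user.status.equals("major");
        // The OR group is itself one clause of the outer must, exactly like orQuery above.
        return must(nameIsAdmin, messageStartsWithTest, should(ageIs30, statusIsMajor));
    }

    public static void main(String[] args) {
        System.out.println(query().test(new User("admin", "test message", 25, "major")));
        System.out.println(query().test(new User("admin", "test message", 25, "minor")));
    }
}
```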

java: how to limit score results in mongo

I have this mongo query (java):
TextQuery.queryText(textCriteria).sortByScore().limit(configuration.getSearchResultSize())
which performs a text search and sorts by score.
I gave different weights to different fields in the document, and now I'd like to retrieve only those results with a score lower than 10.
Is there a way to add that criterion to the query?
This didn't work:
query.addCriteria(Criteria.where("score").lt(10));
If the only way is to use aggregation, I need a mongoTemplate example for that.
In other words, how do I translate the following mongo shell aggregate command into a Spring mongoTemplate call?
I can't find anywhere how to use the aggregation match() API with the $text search component (the $text index covers several different fields):
db.text.aggregate(
    [
        { $match: { $text: { $search: "read" } } },
        { $project: { title: 1, score: { $meta: "textScore" } } },
        { $match: { score: { $lt: 10.0 } } }
    ]
)
Thanks!
Please check the code sample below, which does a MongoDB search with pagination in Java:
BasicDBObject query = new BasicDBObject();
query.put(column_name, new BasicDBObject("$regex", searchString).append("$options", "i"));
DBCursor cursor = dbCollection.find(query);
cursor.skip((pageNum - 1) * limit);
cursor.limit(limit);
Write a loop that calls the code above with pageNum running from 1 to n, and with limit set according to your requirements. After each call, check whether the cursor is empty: if it is, leave the loop; if not, continue with the next page.
Hope this will be helpful.
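The loop described in this answer might look like the following sketch. To keep it runnable without a MongoDB server, fetchPage is a hypothetical stand-in that applies the same skip and limit arithmetic to an in-memory list; in real code it would wrap the find, skip and limit calls shown above. Note that this only pages through results and does not filter by text score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory stand-in for the cursor-based paging; no MongoDB driver is used.
final class PagingLoopSketch {

    /** Returns page number pageNum (starting at 1) using skip = (pageNum - 1) * limit. */
    static List<String> fetchPage(List<String> all, int pageNum, int limit) {
        if (pageNum < 1 || limit < 1) {
            throw new IllegalArgumentException("pageNum and limit must be at least 1");
        }
        int from = Math.min((pageNum - 1) * limit, all.size());
        int to = Math.min(from + limit, all.size());
        return all.subList(from, to);
    }

    /** Requests page after page and stops at the first empty one. */
    static List<String> fetchAll(List<String> all, int limit) {
        List<String> collected = new ArrayList<>();
        for (int pageNum = 1; ; pageNum++) {
            List<String> page = fetchPage(all, pageNum, limit);
            if (page.isEmpty()) {
                break; // an empty page means there is nothing left to read
            }
            collected.addAll(page);
        }
        return collected;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(fetchPage(docs, 3, 2));
        System.out.println(fetchAll(docs, 2));
    }
}
```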

Elasticsearch 2.x index mapping _id

I ran ElasticSearch 1.x (happily) for over a year. Now it's time for some upgrading - to 2.1.x. The nodes should be turned off and then (one-by-one) on again. Seems easy enough.
But then I ran into trouble. The major problem is the field _uid, which I set myself so that I knew the exact location of a document from any other one (by hashing a value). This way I knew that only that exact document would be returned. During the upgrade I got:
MapperParsingException[Field [_uid] is a metadata field and cannot be added inside a document. Use the index API request parameters.]
But when I try to map my former _uid to _id (which should also be good enough) I get something similar.
The reason why I used the _uid param is because the lookup time is a lot lower than a termsQuery (or the like).
How can I still use the _uid or _id field in each document for fast (and exact) lookup of specific documents? Note that I have to fetch thousands of exact documents at a time, so I need an ID-like query. It may also happen that the _uid or _id of a document does not exist (in that case I want, as now, a 'false-like' result).
Note: The upgrade from 1.x to 2.x is pretty big (Filters gone, no dots in names, no default access to _xxx)
Update (no avail):
Updating the mapping of _uid or _id using:
final XContentBuilder mappingBuilder = XContentFactory.jsonBuilder()
    .startObject().startObject(type).startObject("_id")
    .field("enabled", "true").field("default", "xxxx")
    .endObject().endObject().endObject();
CLIENT.admin().indices().prepareCreate(index).addMapping(type, mappingBuilder)
    .setSettings(Settings.settingsBuilder()
        .put("number_of_shards", nShards).put("number_of_replicas", nReplicas))
    .execute().actionGet();
results in:
MapperParsingException[Failed to parse mapping [XXXX]: _id is not configurable]; nested: MapperParsingException[_id is not configurable];
Update: Changed the name to _id instead of _uid, since the latter is built out of _type#_id. So I would need to be able to write to _id.
Since there appears to be no way around setting the _uid and _id, I'll post my solution. I mapped every document that had a _uid to uid (for internal referencing). At some point it occurred to me that you can set the relevant id yourself when indexing.
To bulk insert documents with an id you can:
final BulkRequestBuilder builder = client.prepareBulk();
for (final Doc doc : docs) {
    builder.add(client.prepareIndex(index, type, doc.getId()).setSource(doc.toJson()));
}
final BulkResponse bulkResponse = builder.execute().actionGet();
Notice the third argument: it may be null (or you can use the two-argument form), in which case the id will be generated by ES.
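If the id is derived by hashing a value, as described in the question, it can be computed on the client and passed as that third argument. A minimal sketch, assuming SHA-256 with hex encoding (the original post does not say which hash was used) and a hypothetical helper named idFor:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a stable document id from an arbitrary value.
final class DocumentIdSketch {

    /** Returns the SHA-256 hash of the value as 64 lowercase hex characters. */
    static String idFor(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same input always yields the same id, so the document can be found again.
        System.out.println(idFor("some value"));
    }
}
```

Because the same input always produces the same string, the id can be recomputed later and used for a get or multi-get without storing it anywhere else.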
To then get some documents by id you can:
final List<String> uids = getUidsFromSomeMethod(); // ids of the documents to get
final MultiGetRequestBuilder builder = CLIENT.prepareMultiGet();
builder.add(index_name, type, uids);
final MultiGetResponse multiResponse = builder.execute().actionGet();
for (final MultiGetItemResponse response : multiResponse.getResponses()) {
    if (only_want_to_know_whether_it_exists) {
        // in this case I simply want to know whether the doc exists
        final boolean exists = response.getResponse().isExists();
        exist.add(exists);
    } else {
        // retrieve the doc as JSON; the source is read from each item response, not from the builder
        final String string = response.getResponse().getSourceAsString();
        // handle JSON
    }
}
If you only want one document:
final GetResponse response = client.prepareGet().setIndex(index).setType(type).setId(id).execute().actionGet();
The equivalent using the REST API, from the mapping-id-field documentation (note: exact copy):
# Example documents
PUT my_index/my_type/1
{
    "text": "Document with ID 1"
}
PUT my_index/my_type/2
{
    "text": "Document with ID 2"
}
GET my_index/_search
{
    "query": {
        "terms": {
            "_id": [ "1", "2" ]
        }
    },
    "script_fields": {
        "UID": {
            "script": "doc['_id']"
        }
    }
}
