Elasticsearch - creating exceptions for fuzzy terms

I have a simple Elasticsearch query that does a text field search with a fuzziness distance of one:
GET /jobs/_search
{
  "query": {
    "fuzzy": {
      "attributes.title": {
        "value": "C#",
        "fuzziness": 1
      }
    }
  }
}
The above query does exactly what it is told to do, but I have cases where I don't want a word to resolve (with fuzziness) to another specific word. In this case, I don't want C# to also return C++ results. Similarly, I don't want cat to return car results.
However, I still need the fuzziness option in case someone actually misspells cat. In that case it can return results for both cat and car.

I think this is possible with some bool query combination. It should be something like this:
bool:
//should
//match query without fuzzy
//bool
//must
//must with fuzzy query
//must_not with match query
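One way to spell out the outline above is sketched below in Python, building the request body as a plain dictionary. This is only a sketch under assumptions: the helper name `fuzzy_with_exceptions` and the `excluded_terms` list are hypothetical, and the caller is assumed to know which words a given term must not resolve to (for example, `car` when searching for `cat`).

```python
import json


def fuzzy_with_exceptions(field, term, excluded_terms, fuzziness=1):
    """Build a bool query: exact match OR (fuzzy match AND NOT excluded words).

    `excluded_terms` is a hypothetical, caller-supplied list of words that
    the fuzzy branch must not return for this search term.
    """
    # Branch 1: plain match without fuzziness, so exact hits are always kept.
    exact = {"match": {field: term}}

    # Branch 2: fuzzy hits, minus documents matching any excluded word.
    fuzzy_minus_exclusions = {
        "bool": {
            "must": [
                {"fuzzy": {field: {"value": term, "fuzziness": fuzziness}}}
            ],
            "must_not": [{"match": {field: word}} for word in excluded_terms],
        }
    }

    return {
        "query": {
            "bool": {
                "should": [exact, fuzzy_minus_exclusions],
                "minimum_should_match": 1,
            }
        }
    }


body = fuzzy_with_exceptions("attributes.title", "cat", ["car"])
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of `GET /jobs/_search`. Whether it behaves as intended for a term like C# also depends on how the field is analyzed, which this sketch does not address.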

Related

Elastic Search - Accessing a member of an element inside a list

I'm relatively new to Elasticsearch and have a question about accessing a member of an element inside a list. The structure is as follows:
{
'TestA':'1',
'TestB':{
'TestC':'2',
'TestD':[
{
'TestE':'3',
'TestF':'4'
},
{
'TestE':'5',
'TestF':'6'
}
]
}
}
Given this structure, I want to return all the results in which TestF has a value of 6. I was wondering if this is possible with the following query template.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "TestB.TestD.TestF": "6"
          }
        }
      ]
    }
  }
}
Would {"match": {"TestB.TestD.TestF": "6"}} be able to search through each element of TestD, or would I need to use some other command to iterate through the list? This is with Elasticsearch 5.0. Thanks in advance!
Yes, your match query should find the results you are looking for. Elasticsearch flattens arrays when it puts them in the inverted index. For more information, check out the docs:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html#_how_arrays_of_objects_are_flattened
Arrays of inner object fields do not work the way you may expect. Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values.
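To illustrate the flattening described in the quoted docs, here is a small Python sketch. It is not Elasticsearch code: `flatten` is a hypothetical helper written for this example, and it only mimics the idea of turning an object hierarchy into dotted field names with lists of values.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into {dotted.field.name: [values]}."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            for name, values in flatten(value, path).items():
                flat.setdefault(name, []).extend(values)
    elif isinstance(obj, list):
        # Array elements share the same field name; their values are merged.
        for item in obj:
            for name, values in flatten(item, prefix).items():
                flat.setdefault(name, []).extend(values)
    else:
        flat.setdefault(prefix, []).append(obj)
    return flat


doc = {
    "TestA": "1",
    "TestB": {
        "TestC": "2",
        "TestD": [
            {"TestE": "3", "TestF": "4"},
            {"TestE": "5", "TestF": "6"},
        ],
    },
}

flat = flatten(doc)
print(flat["TestB.TestD.TestF"])  # ['4', '6']
```

Because "6" ends up among the values of TestB.TestD.TestF, a match on that field finds the document. The sketch also shows what is lost: nothing records that "5" and "6" came from the same array element, which is the situation the nested type in the linked docs is meant for.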

Format reading ElasticSearch dates

This is my mapping for one of the properties in my ElasticSearch model:
"timestamp":{
"type":"date",
"format":"dd-MM-yyyy||yyyy-MM-dd'T'HH:mm:ss.SSSZ||epoch_millis"
}
I'm not sure if I'm misunderstanding the documentation. It clearly says:
The first format will also act as the one that converts back from milliseconds to a string representation.
And that is exactly what I want. I would like to be able to read directly (if possible) the dates as dd-MM-yyyy.
Unfortunately, when I look at the document itself (that is, by accessing the Elasticsearch endpoint directly, not via the application layer), I still get:
"timestamp" : "2014-01-13T15:48:25.000Z",
What am I missing here?
As @Val mentioned, you get the value back in the format in which it was indexed.
However, if you want to view the date in a particular format regardless of the format in which it was indexed, you can make use of Script Fields. Note that the script is applied at query time.
The query below should do what you want.
POST <your_index_name>/_search
{
"query":{
"match_all":{ }
},
"script_fields":{
"timestamp":{
"script":{
"inline": "def sf = new SimpleDateFormat(\"dd-MM-yyyy\");def dt = new Date(doc['timestamp'].value);def mydate = sf.format(dt);return mydate;"
}
}
}
}
Let me know how it goes.
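As a point of comparison, the same conversion the script performs (milliseconds since the epoch to a dd-MM-yyyy string) can be sketched on the client side in Python. This is only an illustration: `format_epoch_millis` is a hypothetical helper, and it assumes the date should be rendered in UTC, whereas the script above uses whatever default time zone the server applies.

```python
from datetime import datetime, timezone


def format_epoch_millis(millis, pattern="%d-%m-%Y"):
    """Render epoch milliseconds as a date string (UTC is an assumption here)."""
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return moment.strftime(pattern)


# The timestamp from the question, 2014-01-13T15:48:25.000Z, as epoch millis.
millis = int(datetime(2014, 1, 13, 15, 48, 25, tzinfo=timezone.utc).timestamp() * 1000)
print(format_epoch_millis(millis))  # 13-01-2014
```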

java: how to limit score results in mongo

I have this mongo query (java):
TextQuery.queryText(textCriteria).sortByScore().limit(configuration.getSearchResultSize())
which performs a text search and sorts by score.
I gave different weights to different fields in the document, and now I'd like to retrieve only those results with a score lower than 10.
Is there a way to add that criterion to the query?
This didn't work:
query.addCriteria(Criteria.where("score").lt(10));
If the only way is to use aggregation, I need a mongoTemplate example for that.
In other words, how do I translate the following mongo shell aggregate command into a Spring mongoTemplate call?
I can't find anywhere how to use the aggregation match() API with the $text search component (the $text index covers several different fields):
db.text.aggregate(
[
{ $match: { $text: { $search: "read" } } },
{ $project: { title: 1, score: { $meta: "textScore" } } },
{ $match: { score: { $lt: 10.0 } } }
]
)
Thanks!
Please check the code sample below, which shows a MongoDB search with pagination in Java:
BasicDBObject query = new BasicDBObject();
// Case-insensitive regex search on the given column
query.put(column_name, new BasicDBObject("$regex", searchString).append("$options", "i"));
DBCursor cursor = dbCollection.find(query);
// Skip the documents of the previous pages, then return at most one page
cursor.skip((pageNum - 1) * limit);
cursor.limit(limit);
Write a loop that calls the above code, passing pageNum values from 1 to n, with limit set according to your requirements. On each iteration, check whether the cursor is empty: if it is, exit the loop; otherwise keep calling the code above.
Hope this will be helpful.
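The loop described above can be sketched as follows. This is a minimal Python illustration, not MongoDB code: a plain list stands in for the collection, and `fetch_page` and `iterate_pages` are hypothetical helpers that mirror the skip and limit arithmetic of the Java sample.

```python
def fetch_page(items, page_num, limit):
    """Return one page; page_num starts at 1, as in the Java sample."""
    skip = (page_num - 1) * limit  # same arithmetic as cursor.skip(...)
    return items[skip:skip + limit]  # same role as cursor.limit(limit)


def iterate_pages(items, limit):
    """Yield pages until an empty page signals that there is nothing left."""
    page_num = 1
    while True:
        page = fetch_page(items, page_num, limit)
        if not page:
            break
        yield page
        page_num += 1


results = list(range(7))  # stand-in for the matching documents
for page in iterate_pages(results, limit=3):
    print(page)
```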

elastic4s - control the analyzer to use in term query

I want to control the analyzer in my search query.
At the moment my code looks like this:
client.execute(search in indexName / documentType query {
  bool {
    must(
      termQuery("email", email),
      termQuery("name", name)
    )
  }
})
How can I control the analyzer here?
Note that a term query does not analyze the search terms, so what you're looking for is probably a match query instead. It would go like this:
client.execute(search in indexName / documentType query {
  bool {
    must(
      termQuery("email", email),
      matchQuery("name", name)       // changed from termQuery to matchQuery
        .analyzer(StandardAnalyzer)  // added: sets the analyzer for the search terms
    )
  }
})
The test cases are a good source of information as well. In the SearchDslTest.scala file you'll find how to set all possible properties of a match query.
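The difference between the two query types can be illustrated with a toy model in Python. This is a simplification, not how Elasticsearch is implemented: `analyze` is a hypothetical stand-in analyzer that only lowercases and splits on whitespace, and the two helper functions just model "look up as given" versus "analyze first, then look up".

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def term_matches(indexed_tokens, search_term):
    # A term query looks up the search term exactly as given, with no analysis.
    return search_term in indexed_tokens


def match_matches(indexed_tokens, search_text, analyzer=analyze):
    # A match query analyzes the search text, then looks up the resulting tokens.
    return any(token in indexed_tokens for token in analyzer(search_text))


indexed = analyze("John Smith")  # tokens stored at index time

print(term_matches(indexed, "John"))   # False: "John" is not the stored token "john"
print(match_matches(indexed, "John"))  # True: analyzed to "john" before lookup
```

This is also why the choice of analyzer only makes sense for the match query: the term query never runs one on the search input.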

ElasticSearch - specify range for a string field

I am trying to retrieve the mentions of years between 1933 and 1949 from a string field called text. However, I cannot seem to find a working range query for that. What I have tried so far crashes:
{"query":
{"query_string":
{
"text": [1933 TO 1949]
}
}
}
I have also tried it like this:
{"query":
{"filtered":
{"query":{"match_all":{}},
"filter":{"range":{"text":[1933 TO 1949]}
}
}
}
but it still crashes.
A sample text field looks like the one below, containing a mention of the year 1933:
"Primera División 1933 (Argentinië), seizoen in de Argentijnse voetbalcompetitie\n* Primera División 1933 (Chili), seizoen in de Chileense voetbalcompetitie\n* Primera División 1933 (Uruguay), seizoen in de Uruguayaanse voetbalcompetitie\n \n "
However, I also have documents not containing any years inside, and I would like to filter all the documents to preserve only the ones mentioning years in a given period. I read here http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html that the range query can be applied to text fields as well, and I don't want to use any intermediate solution to identify dates inside texts.
What I basically want to achieve is to be able to get the same results as when using a search URI query:
urltomyindex/_search?q=text:%7B1933%20TO%201949%7D%27
which works perfectly.
Is it still possible to achieve my goal? Any help much appreciated!
This should do it:
GET index1/type1/_search
{
"query": {
"filtered": {
"filter": {
"terms": {
"fieldNameHere": [
"1933",
"1934",
"1935",
"1936",
"1937",
"1938",
"1939",
"1940",
"1941",
"1942",
"1943",
"1944",
"1945",
"1946",
"1947",
"1948",
"1949"
]
}
}
}
}
}
If you know you're going to be needing this kind of search frequently it would be much better to create a new field "yearPublished" or something like that so you can search it as a number vs a text field.
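Rather than typing every year by hand, the request body can be generated. The Python sketch below builds the same terms filter as the answer, plus a query_string variant that uses the Lucene range syntax from the working URI search in the question. Both helper names are hypothetical, and the field name is whatever your mapping uses.

```python
import json


def year_terms_filter(field, first_year, last_year):
    """Build the terms filter from the answer, with both end years included."""
    years = [str(year) for year in range(first_year, last_year + 1)]
    return {"query": {"filtered": {"filter": {"terms": {field: years}}}}}


def year_range_query_string(field, first_year, last_year):
    """Build a query_string body using Lucene range syntax.

    Square brackets make both bounds inclusive; curly braces, as in the
    URI search from the question, would make them exclusive.
    """
    lucene = f"{field}:[{first_year} TO {last_year}]"
    return {"query": {"query_string": {"query": lucene}}}


print(json.dumps(year_terms_filter("text", 1933, 1949)))
print(json.dumps(year_range_query_string("text", 1933, 1949)))
```

Keep in mind that a range on a string field compares terms as strings rather than as numbers, so a dedicated numeric field remains the more robust option.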
