Odd behavior with AND condition in Elasticsearch

When I run the query (tags:("a")) in elasticsearch, I get 0 results. My query URL looks like:
http://127.0.0.1:9200/haystack/_search?q=(tags%3A(%22a%22))
That is to be expected, since no objects have a tag set to "a".
Now when I change the condition, and add an AND, (org:("1") AND tags:("a")), I get 3 results back! The query URL looks like:
http://127.0.0.1:9200/haystack/_search?q=(org%3A(%221%22)%20AND%20tags%3A(%22a%22))
Getting more results back does not make any sense to me. I would expect that kind of behavior with the OR operator, but AND? What is going on?
Edit: This is caused by the snowball analyzer. See here
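One plausible reading of the edit above, offered as an assumption rather than a confirmed diagnosis: the snowball analyzer includes a stop-word filter, and "a" is an English stop word. If the tags clause analyzes to nothing, it would be dropped and the AND query would reduce to org:1 alone. The toy model below is plain Python, not Elasticsearch code; STOP_WORDS, analyze and surviving_clauses are hypothetical names used only for illustration.

```python
# Toy model only: this is not Elasticsearch code.
# STOP_WORDS is a tiny assumed subset of an English stop-word list.
STOP_WORDS = {"a", "an", "the"}


def analyze(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def surviving_clauses(clauses):
    """Keep only the (field, terms) clauses that still have terms after analysis."""
    kept = []
    for field, text in clauses:
        terms = analyze(text)
        if terms:  # a clause with no terms left has nothing to match on
            kept.append((field, terms))
    return kept


# tags:("a") alone: the only clause disappears, so nothing is searched for.
only_tags = surviving_clauses([("tags", "a")])

# org:("1") AND tags:("a"): only the org clause survives.
org_and_tags = surviving_clauses([("org", "1"), ("tags", "a")])

print(only_tags)
print(org_and_tags)
```

Under this assumption the second query behaves like org:("1") on its own, which would explain why it returns more hits than the first.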

Related

How do I search Kibana field.keyword?

I'm on Kibana v7.4.2
I have 2 fields setup: message and message.keyword
If I just go to the Discover tab and search message.keyword:* it shows everything message:* does.
I want to search for a specific substring such as "[Special Message]". As I understand it, I have to use message.keyword because message:"[Special Message]" will always be fuzzy. It will return entries that contain "this is a special message" and not only entries that contain exactly "here is [Special Message]", which is insufficient for me.
However, message.keyword:"[Special Message]" always returns "No results match your search criteria".
The behavior is the same in both KQL and Lucene.
How do I search field.keyword for something?
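A keyword field is indexed as one unanalyzed term, so a quoted value only matches when the whole message equals it. Matching a substring generally needs wildcards on both sides, with the special characters inside the fragment escaped. The sketch below only builds such a Lucene-syntax query string; escape_lucene and substring_query are hypothetical helper names, and whether Kibana accepts the result depends on its settings (leading wildcards can be slow and may be disabled).

```python
# Characters that Lucene query-string syntax treats as special, plus the space.
LUCENE_SPECIAL = set('+-!(){}[]^"~*?:\\/ ')


def escape_lucene(value):
    """Backslash-escape every special character in a literal value."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in value)


def substring_query(field, fragment):
    """Build a wildcard query that looks for `fragment` anywhere in `field`."""
    return f"{field}:*{escape_lucene(fragment)}*"


query = substring_query("message.keyword", "[Special Message]")
print(query)
```

The resulting string is meant to be pasted into the search bar with the Lucene syntax selected, not KQL.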

Why do Grafana Elasticsearch queries work when hard coded, but fail when using Grafana variable value substitution, and how can that be fixed?

ElasticSearch query works when hard coded, but fails when using Grafana variable value substitution:
Query: +nginx.access.upstream.response: [*, 1**, 2**, 3**, 4**, 5**, 500]
Each of these queries works when you hard code those values in the query.
Example Query: +nginx.access.upstream.response: 1**
^That works: it shows a table of data instead of "No data to show".
Although that works, it's better to use a variable with 7 values. That lets 1 panel display the same data that would otherwise need 7 hard-coded panels, so you end up with a cleaner user interface.
The problem is that once you've switched the hard-coded values to variable-populated values, the query no longer works.
The plugged in variable values [* and 500] work
The plugged in variable values [1**, 2**, 3**, 4**, 5**] don't work / result in "No data to show" as seen above.
There's something funny going on when the values get substituted into the query.
Q1.) What's the best tool/method to debug the true value of the variable after substitution and figure out why it's failing?
Q2.) What's a method of fixing it/achieving the desired end result?
Q1.) What's the best tool/method to debug the true value of the variable after substitution and figure out why it's failing?
Answer 1: Query Inspector
1** --when substituted becomes--> 1\\*\\*
which explains why it didn't work
Q2.) What's a method of fixing it/achieving the desired end result?
Answer 2: What worked for me was to avoid using the special character * in the variable values.
I renamed the variable to HTTP Code Prefix and used the values [*,1,2,3,4,5]
I then used the Query: +nginx.access.upstream.response: $http_code_prefix*
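The difference between the two approaches can be sketched in plain Python. This models only the 1** case reported by the Query Inspector; how Grafana treats a lone * is not modelled, and reading the doubled backslashes in the Inspector output as JSON escaping of a single backslash is an assumption. All function names are hypothetical.

```python
FIELD = "nginx.access.upstream.response"


def escape_wildcards(value):
    """Backslash-escape '*' in a variable value, as the Inspector output suggests."""
    return value.replace("*", "\\*")


def query_with_wildcard_in_variable(value):
    # Original approach: the variable value itself carries the wildcards,
    # so they are escaped and matched as literal asterisks.
    return f"+{FIELD}: {escape_wildcards(value)}"


def query_with_prefix_variable(prefix):
    # Workaround: the variable holds only the prefix and the '*' is part of
    # the query text, so it is never escaped.
    return f"+{FIELD}: {prefix}*"


broken = query_with_wildcard_in_variable("1**")
fixed = query_with_prefix_variable("1")
print(broken)
print(fixed)
```

In the first query the asterisks are literal characters, which no response code contains; in the second the asterisk still acts as a wildcard.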

Python's Elasticsearch-DSL filter for exactly one match from list of values

I saw some related posts, but none of them match my exact issue.
Using Python 2.7 with Elasticsearch-dsl (6.3, that is also my Elasticsearch version).
I want to do something like,
s = Search(using=elastic_conn, index='my_index').filter("match", service_name=['exmp_name1', 'exmp_name2'])
This syntax doesn't work though.
I wish to get back all documents with service_name == 'exmp_name1' OR service_name == 'exmp_name2'
I prefer to use the filter context rather than the query context because, from my understanding, it's faster, and scoring really isn't important to me, just an absolute match (or mismatch).
How can I achieve this behavior?
Thanks
OK. All I needed was to filter by terms rather than match.
The terms syntax supports several values.
Working code:
s = Search(using=elastic_conn, index='audit').filter("terms", service_name=['exmp_name1', 'exmp_name2'])
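For reference, a terms filter like the one above corresponds roughly to the request body below, assuming the usual bool/filter wrapping in the query DSL. The sketch builds the body as a plain dict so it runs without a cluster or the elasticsearch-dsl package; terms_filter_body is a hypothetical helper.

```python
import json


def terms_filter_body(field, values):
    """Build a search body that keeps documents whose field equals any value."""
    return {
        "query": {
            "bool": {
                # Filter context: no scoring, just match or no match.
                "filter": [{"terms": {field: list(values)}}]
            }
        }
    }


body = terms_filter_body("service_name", ["exmp_name1", "exmp_name2"])
print(json.dumps(body, indent=2))
```

A terms filter compares against the indexed terms exactly, so on an analyzed text field it may be necessary to target an unanalyzed (keyword) sub-field instead.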

Solr query conundrum

I've recently swapped from using Lucene for Sitecore to Solr.
For the most part it has been smooth, but the way I was writing some queries (using the Sitecore.ContentSearch.Linq abstraction) now doesn't seem to be compatible.
Specifically, I have a situation where I've got "global" content and "regional" content, like so:
Home (ID: 000)
  X
  Y
  Z
  Regions (ID: 111)
    Region 1 (ID: 221)
      A
      B
    Region 2 (ID: 222)
      D
My code worked on Lucene, but now doesn't on Solr. It should find all "global" content and a single region's content, excluding all other regions' content. So as an example, if the user's current region was Region 1, I'd want the query to return content X, Y, Z, A, B.
Sitecore's Item Crawler has a field for each item in the index called "_path", which is a multivalued string field of IDs, so as an example, Region 1's _path field value would be [000, 111, 221].
When I write this using the Linq abstraction, it comes out as below, which doesn't return results.
-_path:(111) OR _path:(221)
But _path:(111) does return results. Mind blown.
When I use the Solr interface and wrap each side of the OR in extra brackets like below (which I'd consider redundant) it works! Mind blown v2.
(-_path:(111)) OR (_path:(221))
Firstly, what's the difference between those queries?
Secondly, my real problem is that I can't add these extra brackets, as I'm working in the Linq abstraction, so the brackets will be "optimized" out.
Any advice would be awesome! Cheers.
The problem here is that Lucene's negative queries don't work the way you think they do. They only remove results from what has already been found. -_path:111 doesn't find all documents that aren't in 111; it doesn't find anything at all. It only removes results. So you are finding all results with path "221", then removing any that also have path "111", which, from your hierarchy, I assume is all of them. See my answer here for a bit more on that topic.
The OR makes it seem like it ought to work, but really -_path:(111) OR _path:(221) is the same as -_path:(111) _path:(221). The moral here is: Don't use Lucene's AND/OR/NOT syntax, if you can help it. Use +/-. +/- syntax actually expresses how the query operates, AND/OR/NOT doesn't. It attempts to shoehorn it into a different, SQL-like retrieval model and leads to some unexpected behavior like this.
So, what about: (-_path:(111)) OR (_path:(221))
Well, first, does it actually work? Or does it just get some results?
If it just gets some results, and they seem to be the same results as _path:221, the reason is that -_path:111 gets no results, so your query is, in practice, something like (nothing) OR (_path:221), which is equivalent to _path:221.
If it really does get the results you expect (I'm guessing it probably does), something is translating your query into something like (*:* -_path:111) (_path:221). Solr does have some logic along these lines, though I'm not quite sure in this case. Essentially, it puts a match-all in front of any lonely negative queries it finds, allowing them to do what you were expecting. If the implicit *:* makes you nervous about performance, well, it should. Lucene is an inverted index, and it does well at finding matches on a term quickly. Getting everything that doesn't match goes against the grain of that retrieval model, and will pretty much have to do a full scan of the index.
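The retrieval model described above can be illustrated with a toy inverted index in plain Python. This is a sketch of the idea, not of Lucene's or Solr's implementation; DOCS mirrors the hierarchy from the question with only ancestor IDs in each path, and search is a hypothetical helper.

```python
from collections import defaultdict

# Each document's _path, simplified to the IDs of its ancestors.
DOCS = {
    "X": ["000"], "Y": ["000"], "Z": ["000"],
    "A": ["000", "111", "221"], "B": ["000", "111", "221"],
    "D": ["000", "111", "222"],
}

# Inverted index: path ID -> set of documents containing it.
INDEX = defaultdict(set)
for doc, path in DOCS.items():
    for path_id in path:
        INDEX[path_id].add(doc)


def search(optional=(), prohibited=(), match_all=False):
    """Collect matches for the optional terms, then subtract the prohibited ones."""
    found = set(DOCS) if match_all else set()
    for term in optional:
        found |= INDEX.get(term, set())
    for term in prohibited:
        found -= INDEX.get(term, set())  # negatives only remove, never add
    return found


# -_path:111 on its own finds nothing, because nothing was collected first.
negative_only = search(prohibited=["111"])

# -_path:(111) OR _path:(221) behaves like -_path:111 _path:221.
flat_query = search(optional=["221"], prohibited=["111"])

# (*:* -_path:111) (_path:221): each group is evaluated, then unioned.
grouped_query = search(match_all=True, prohibited=["111"]) | search(optional=["221"])

print(sorted(negative_only))
print(sorted(flat_query))
print(sorted(grouped_query))
```

Only the grouped form with an explicit match-all returns the global items together with Region 1's items.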

SolrNet query assistance and some debugging options

How would I use SolrNet to execute a GREATER THAN/LESS THAN query?
Example:
My documents have a field called "minimumDays" and I only want to return docs where that field is LESS THAN OR EQUAL TO the number I pass into the query.
I currently have this, but am not sure it's correct.
int requestedDays = 3;
var minimumNightsQuery = new SolrQueryByRange<int>("minimumDays", 0, requestedDays, true);
Am I on the right track?
The second part: is there some way to better understand the query that is being passed into Solr from SolrNet? A debugging value or something where I can inspect the "q" variable, for instance.
Thanks again for your help
You can use SolrQueryByRange for the first part of your question. Your code does look good. Debugging your query and results might help, as I have found that SolrNet does some odd things: http://code.google.com/p/solrnet/wiki/Facets#Arbitrary_facet_queries
For the second part, you can intercept the ISolrConnection and put your own in between. For a good start, check this out: http://code.google.com/p/solrnet/source/browse/trunk/SampleSolrApp/LoggingConnection.cs?r=513
I have one that logs the query and the results, and if a config setting is on it appends the debug param and logs that result also. It's great info to have, and one of the only ways to get it.
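As a rough picture of what such logging would show, the sketch below builds the Solr range syntax for an inclusive range and the request parameters with Solr's debugQuery flag turned on. It is plain Python rather than SolrNet; the assumption is that an inclusive range query ends up in the standard field:[low TO high] form, and range_query is a hypothetical helper.

```python
from urllib.parse import urlencode


def range_query(field, low, high, inclusive=True):
    """Build a Solr range clause; None means an open bound ('*')."""
    lower = "*" if low is None else low
    upper = "*" if high is None else high
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_b}{lower} TO {upper}{close_b}"


requested_days = 3
q = range_query("minimumDays", 0, requested_days)

# debugQuery=true asks Solr to include parsing details in the response.
params = urlencode({"q": q, "debugQuery": "true"})

print(q)
print(params)
```

If documents can have a minimumDays below 0, passing None as the lower bound gives an open-ended "less than or equal to" range instead.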
