Elasticsearch URI based query with AND operator

How do I specify AND operation in URI based query? I'm looking for something like:
http://localhost:9200/_search?q="profiletype:student AND username:s*"

For URI search in Elasticsearch, you can use the AND operator as you describe, just without the quotes:
_search?q=profiletype:student AND username:s*
You can also set the default operator to AND
_search?q=profiletype:student username:s*&default_operator=AND
Or you can use the + operator for terms that must be present, i.e. you would use +profiletype:student +username:s* as the query string. This doesn't work without URL encoding, though. In URL encoding, + is %2B and space is %20, so the alternative would be:
_search?q=%2Bprofiletype:student%20%2Busername:s*
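For completeness, any of these can also be sent with curl; the index name test below is just a placeholder, and the space is percent-encoded so the shell passes the URL through untouched:
# sketch only: "test" is a placeholder index name
curl -XGET 'http://localhost:9200/test/_search?q=profiletype:student%20AND%20username:s*&pretty'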

According to the documentation, it should work as you described. See http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html
That said, you can also use the following:
http://localhost:9200/_search?q="+profiletype:student +username:s*"

You can try the lines below.
http://localhost:9200/_search?q=profiletype:student%20AND%20username:s*
http://localhost:9200/_search?q=profiletype:student AND username:s*

I had to combine the default operator and the + sign to make it work:
curl -XGET 'localhost:9200/apm/inventory/_search?default_operator=AND&q=tenant_id:t2+_all:node_1'

Related

elasticsearch - fulltext search for words with special/reserved characters

I am indexing documents that may contain any special/reserved characters in their fulltext body. For example
"PDF/A is an ISO-standardized version of the Portable Document Format..."
I would like to be able to search for pdf/a without having to escape the forward slash.
How should I analyze my query string, and what type of query should I use?
The default standard analyzer will tokenize a string like that so that "PDF" and "A" become separate tokens. The "A" token might get cut out by the stop token filter (see Standard Analyzer). So without any custom analyzers, you will typically match any document that simply contains "PDF".
You can try creating your own analyzer, modeled on the standard analyzer, that includes a Mapping Char Filter. The idea would be that "PDF/A" gets transformed into something like "pdf_a" at both index and query time, so a simple match query will work just fine. This is a fairly simplistic approach, though; you may want to consider how '/' characters are used in your content and use slightly more complex regex-based filters, which are also not perfect solutions.
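As a rough sketch of that idea (the index name docs, the analyzer name pdfa_analyzer, and the single mapping rule below are all made up for illustration), the index settings could look something like:
# sketch only: index name, analyzer name and mapping rule are placeholders
curl -XPUT 'http://localhost:9200/docs' -d '{
  "settings": {
    "analysis": {
      "char_filter": {
        "pdfa_mapping": {
          "type": "mapping",
          "mappings": ["PDF/A => PDF_A"]
        }
      },
      "analyzer": {
        "pdfa_analyzer": {
          "type": "custom",
          "char_filter": ["pdfa_mapping"],
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'
You would then assign this analyzer to the fulltext field in your mapping so it is applied both when indexing and when querying.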
Sorry, I completely missed your point about having to escape the character. Can you elaborate on your use case if this turns out to not be helpful at all?
To support queries containing reserved characters, I now use the Simple Query String Query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html).
Since it does not use a query parser it is a bit limited (e.g. no field queries like id:5), but it serves the purpose.
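For reference, a minimal request of that kind might look like the following; the index name docs and the field name body are assumptions for this example:
# sketch only: index and field names are placeholders
curl -XGET 'http://localhost:9200/docs/_search?pretty' -d '{
  "query": {
    "simple_query_string": {
      "query": "pdf/a",
      "fields": ["body"],
      "default_operator": "and"
    }
  }
}'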

Remove specific parts from a URL

Let's suppose I have a URL like this:
https://www.youtube.com/watch/3e4345?v=rwmEkvPBG1s
What is the best and shortest way to get only the 3e4345 part?
Sometimes the URL doesn't contain any additional params after the ?.
I don't want to use any gems.
What I did was:
url = url.split('/watch/')
url = url[1].split('/')[0].split('?')[0]
Is there a better way? Thanks
Possibly the safest and best way is to use URI:
URI("https://www.youtube.com/watch/34345?v=rwmEkvPBG1s").path.split("/").last
For more, refer to How to extract URL parameters from a URL with Ruby or Rails?
You could do the following, using the match function to find a match based on a regular expression. The value at [1] is the first capture from the regular expression. I have included a breakdown from regexper.com to help illustrate what the expression is accomplishing.
You will notice the parentheses around the \d+, which capture the digits out of the URL when it matches.
url.to_s.match(/\/watch\/(\d+).*$/)[1]
x = "https://www.youtube.com/watch/34345?v=rwmEkvPBG1s"
File.basename(URI(x).path)
=> "34345"

Search for a string that start with a wildcard in ElasticSearch

I am building a kibana dashboard that displays information about X509 certificates. I would like to build a pie chart of certificates that contain a wildcard in their CN or SAN attributes, but I cannot find a query syntax that works.
To match a string like subject.cn: "*.example.net", I tried the following kibana queries:
subject.cn:/\*./
subject.cn:/^\*./
subject.cn:\*\.
subject.cn:\*.
subject.cn:*.
Could someone point me to the proper syntax? Is this even something ES/Lucene supports?
Analysing *.example.net with the standard analyser will give you a single term of example.net - i.e. the asterisk and first "." have been stripped.
Using not_analyzed will store the complete field *.example.net (as expected!)
If the wildcard is always at the beginning of the CN name then using a simple prefix query will work (I've simplified the field name):
curl -XGET 'http://localhost:9200/mytest/certificates/_search?pretty' -d '{
  "query": {
    "prefix": { "cn.raw": "*" }
  }
}'
However if you want to search against different levels of the domain name you'll need to change the analyser you're using.
E.g. use the pattern analyser and define "." as your delimiter, or possibly create a custom analyzer that calls the path hierarchy tokenizer - it's going to depend on how users want to search your data.
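As a sketch of the pattern-analyser idea (the index and analyzer names here are placeholders), you could define an analyzer that splits domain names on "." when creating the index:
# illustrative only: index and analyzer names are placeholders
curl -XPUT 'http://localhost:9200/mytest' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "domain_analyzer": {
          "type": "pattern",
          "pattern": "\\.",
          "lowercase": true
        }
      }
    }
  }
}'
Each label of the domain then becomes its own term, so you can match on example or net individually.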
Thanks to Olly's answer, I was able to find a solution that works. Once the raw fields are defined, the trick is to escape the wildcard to treat it as a literal character, and to surround it with unescaped wildcards to accept surrounding characters:
ca:false AND (subject.cn.raw:*\** OR x509v3Extensions.subjectAlternativeName.raw:*\**)
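If you need the same filter outside Kibana, the query bar uses Lucene query syntax, so (assuming the same field names) it should translate to a query_string query along these lines; note the extra backslash needed inside the JSON string:
# sketch only: index, type and field names follow the examples above
curl -XGET 'http://localhost:9200/mytest/certificates/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "ca:false AND (subject.cn.raw:*\\** OR x509v3Extensions.subjectAlternativeName.raw:*\\**)"
    }
  }
}'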

MarkLogic: Forward slash in attribute

Some of my attributes have a forward slash in the value. I have XQuery that attempts to match on the attribute. However, I recently changed some indexing options, and now the XQuery won't match any attributes containing the forward slash. I don't know which index setting might have affected the comparison. Help!
This used to work, but no longer does:
fn:doc()//model[@id='model/books/20']
This works fine:
fn:doc()//model[@id='model1']
Looks like it was a wildcard option that was set.
UPDATE:
Due to downvotes, here is the explicit name of the setting that fixed the issue for me:
"trailing wildcard searches" set to "false"

Convert string with whitespace into URL

I'm using Ruby and Google's reverse geocode YQL table to automate some search queries. The problem I hit is turning the query into a legal URL format: the encoding I'm using is returning illegal URLs. The query I'm running is as follows:
query="select * from google.geocoding where q='40.714224,-73.961452'"
pQuery= CGI::escape(query)
The eventual output for the processed query looks like this:
http://query.yahooapis.com/v1/public/yql?q=select+%2A+from+google.geocoding+where+q%3D%2740.3714224%2C--73.961452%27+format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=
Alas, the URL is illegal. When checking what the query should look like in the YQL console, I get the following:
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20google.geocoding%20where%20q%3D%2240.714224%2C-73.961452%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=
As you can hopefully see :), the encoding is all wrong. Does anyone know how I can go about generating correct URLs?
If you want to escape a URI, you should use URI::escape:
require 'uri'
URI.escape("select * from google.geocoding where q='40.714224,-73.961452'")
# => "select%20*%20from%20google.geocoding%20where%20q='40.714224,-73.961452'"
