I'm using Meteor (so JavaScript, Node, npm, etc.) and would like to provide a simple text input for users to search via Elasticsearch. I would like to be able to use modifiers on the text like + and "" and to search within a specific field. I'm looking for something that can convert plain text input into Elasticsearch Query DSL.
These would be some example queries:
This query would mean that the keyword "tatooine" must exist:
stormtrooper +tatooine
This would mean that "death star" should be one keyword:
stormtrooper "death star"
This would search for the keyword "bloopers" only in the category field:
stormtrooper category=bloopers
Is there a library that can do this? Can a generic solution exist or is this why I can't find any existing answers to this?
simple_query_string would support your query syntax out of the box, except for category=bloopers which should be category:bloopers instead, but otherwise it should work:
curl -XPOST localhost:9200/your_index/_search -d '{
"query": {
"simple_query_string": {
"query": "stormtrooper category:bloopers"
}
}
}'
curl -XPOST localhost:9200/your_index/_search -d '{
"query": {
"simple_query_string": {
"query": "stormtrooper +tatooine"
}
}
}'
You can also send the query in the query string directly like this:
curl -XPOST 'localhost:9200/your_index/_search?q=stormtrooper%20%22death%20star%22'
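For the field-restricted case, here is a minimal sketch that uses query_string, the query type whose documented syntax includes field:value. The index name your_index and the category field come from the question; the helper name build_query and the ES_URL variable are hypothetical. Keep in mind that query_string is stricter than simple_query_string: malformed user input produces an error instead of being ignored, so you may want to validate or escape input first.

```shell
#!/bin/sh
# Hypothetical helper: wrap raw user input in a query_string request body.
# Backslashes and double quotes are escaped so the result stays valid JSON.
build_query() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"query":{"query_string":{"query":"%s"}}}' "$escaped"
}

BODY=$(build_query 'stormtrooper +tatooine category:bloopers "death star"')
printf '%s\n' "$BODY"

# To send it to a cluster (ES_URL is an assumption, e.g. http://localhost:9200):
# curl -XPOST "$ES_URL/your_index/_search?pretty" \
#   -H 'Content-Type: application/json' -d "$BODY"
```

The body is built separately from the curl call so it can be inspected or logged before anything is sent.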
We have an Elastic Search structure that specifies fields in a multi_match query like this:
"multi_match": {
"query": "find this string",
"fields": ["*_id^20", "*_name^20", "*"]
}
This works great - except under certain circumstances like when query is "Find NOWAK". This is because "NOW" is a reserved word for date searching and field "*" matches fields that are defined as dates.
So what I would like to do is ignore fields that match "*_at".
Is there a way to tell Elastic Search to ignore certain fields in a multi_match query?
If the answer to that is "no", then the follow-up question is how to escape the search term so that it won't trigger keywords.
Running version 6.7
Try listing the fields explicitly via "fields", as in this answer:
Exclude a field on an Elasticsearch query
curl -XGET 'localhost:9200/testidx/items/_search?pretty=true' -d '{
"query" : {
"query_string": {
"fields": ["title", "field2", "field3"],
"query": "Titulo"
}},
"_source" : {
"exclude" : ["*.body"]
}
}'
Apparently the answer is "No: there is not a way to tell ElasticSearch to ignore certain fields in a multi_match query"
For my particular issue I found an inexpensive way to find the necessary white-listed fields (this is performed outside the scope of ElasticSearch otherwise I would post it here) and list those in place of the "*" when building the query.
I am hopeful someone will tell me I'm wrong, but I don't think I am.
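The whitelist approach described above could look roughly like this. It is only a sketch: the field names in ALL_FIELDS are hypothetical, and how you collect them (your own schema, the index mapping, etc.) is left open, as in the answer.

```shell
#!/bin/sh
# Hypothetical field names; in practice they would come from your own schema
# or from the index mapping.
ALL_FIELDS="user_id user_name title created_at updated_at"

# Build a JSON array of every field that does not end in "_at",
# keeping the ^20 boost from the question for *_id and *_name fields.
FIELDS_JSON=""
for f in $ALL_FIELDS; do
  case "$f" in
    *_at) continue ;;              # skip date fields
    *_id|*_name) f="$f^20" ;;      # boosted fields
  esac
  FIELDS_JSON="${FIELDS_JSON:+$FIELDS_JSON,}\"$f\""
done

BODY=$(printf '{"query":{"multi_match":{"query":"%s","fields":[%s]}}}' \
  "Find NOWAK" "$FIELDS_JSON")
printf '%s\n' "$BODY"
```

Because the date fields never reach the query, a term such as "NOWAK" is only matched against text-like fields.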
I ran into a scenario in Elasticsearch where proximity search is not working as expected. Let me explain.
When I tried the search term "apple samsung"~1, it returned around 10 results from my local cluster. But when the proximity term is "samsung apple"~1, it returned only 2 results.
According to the Elasticsearch documentation at the URL below, both terms should return the same number of results:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_proximity_searches
Could anybody help me with this?
Thanks in advance,
Manoj
It's going to depend on your data - if your query is:
curl -XGET 'http://localhost:9200/test/message/_search?pretty' -d '{
"query": {
"query_string": { "query":"\"apple samsung\"~1"}
}
}'
this is a slop of one, so it would match
"apple samsung"
"apple xxxx samsung"
To match documents where "samsung" appears first, you'd need to specify a slop of 2, because swapping two adjacent terms counts as two position moves:
curl -XGET 'http://localhost:9200/test/message/_search?pretty' -d '{
"query": {
"query_string": { "query":"\"apple samsung\"~2"}
}
}'
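The same proximity search can also be written with match_phrase and its slop parameter, which avoids the nested quoting. This is a sketch only: the field name text is an assumption (the question does not name the field), and phrase_query is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: $1 = field, $2 = phrase, $3 = slop.
phrase_query() {
  printf '{"query":{"match_phrase":{"%s":{"query":"%s","slop":%d}}}}' "$1" "$2" "$3"
}

# Slop 1: terms in query order, with at most one position of slack between them.
IN_ORDER=$(phrase_query text "apple samsung" 1)
# Slop 2: additionally covers the two adjacent terms in swapped order.
ANY_ORDER=$(phrase_query text "apple samsung" 2)

printf '%s\n%s\n' "$IN_ORDER" "$ANY_ORDER"

# To run one of them (index and type names taken from the answer's examples):
# curl -XGET 'http://localhost:9200/test/message/_search?pretty' \
#   -H 'Content-Type: application/json' -d "$ANY_ORDER"
```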
I'm running a fuzzy search, and need to see which words were matched. For example, if I am searching for the query testing, and it matches a field with the sentence The boy was resting, I need to be able to know that the match was due to the word resting.
I tried setting the parameter explain = true, but it doesn't seem to contain the information I need. Any thoughts?
Alright, this is what I was looking for:
After a bit of research, I found the Highlighting feature of elasticsearch.
By default it returns a snippet of context surrounding the match, but you can set the fragment size to the query length to return only the exact match. For example:
{
query : query,
highlight : {
"fields" : {
'text' : {
"fragment_size" : query.length
}
}
}
}
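Once highlighting is enabled, the matched word can be pulled out of a returned fragment. By default Elasticsearch wraps each match in <em> tags, so a small sketch like the following is enough. The FRAGMENT value is an illustrative string written by hand, not real cluster output.

```shell
#!/bin/sh
# Illustrative highlight fragment (hand-written, not real output).
# Matches are wrapped in <em>...</em> unless custom tags are configured.
FRAGMENT='The boy was <em>resting</em>'

# Keep only the tagged spans, then strip the tags themselves.
MATCHED=$(printf '%s\n' "$FRAGMENT" \
  | grep -o '<em>[^<]*</em>' \
  | sed -e 's/<em>//' -e 's#</em>##')

printf '%s\n' "$MATCHED"
```

If a fragment contains several matches, each one ends up on its own line in MATCHED.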
Using explain should give you some clues, although they are not very easy to extract.
If you run the following, also available at https://www.found.no/play/gist/daa46f0e14273198691a , you should see e.g. description: "weight(text:nesting^0.85714287 in 1) […], description: "weight(text:testing in 1) [PerFieldSimilarity] […] and so on in the hit's _explanation.
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"text":"The boy was resting"}
{"index":{"_index":"play","_type":"type"}}
{"text":"The bird was testing while nesting"}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"match": {
"text": {
"query": "testing",
"fuzziness": 1
}
}
},
"explain": true
}
'
Is it possible to use ElasticSearch to do keyword searches, exactly like in a search engine?
Let me rephrase:
As far as I understand, an ElasticSearch term query requires specifying which field(s) to search for keywords.
Given that ElasticSearch can be "schemaless", I wish I could declare a query that can search for keywords in any field.
Is there a syntax for that?
You're looking for the behavior provided by the _all-field, which happens to be on by default:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html
Here's a runnable example: https://www.found.no/play/gist/14688f48c75b9931272b
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"foo":"bar"}
{"index":{"_index":"play","_type":"type"}}
{"something_else":"foo bar"}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"match": {
"_all": {
"query": "bar"
}
}
}
}
'
I am fairly new to ElasticSearch and have a question on stop words. I have an index that contains state names for the USA, e.g. New York/NY, California/CA, Oregon/OR. I believe Oregon's abbreviation, 'OR', is a stop word, so after I insert the state data into the index, I cannot search on 'OR'. Is there a way to set up custom stop words for this, or am I doing something wrong?
Here is how I am building the index:
curl -XPUT http://localhost:9200/test/state/1 -d '{"stateName": ["California","CA"]}'
curl -XPUT http://localhost:9200/test/state/2 -d '{"stateName": ["New York","NY"]}'
curl -XPUT http://localhost:9200/test/state/3 -d '{"stateName": ["Oregon","OR"]}'
A search for 'NY' works fine:
curl -XGET 'http://localhost:9200/test/state/_search?pretty=1' -d '
{
"query" : {
"match" : {
"stateName" : "NY"
}
}
}'
But a search for 'OR' returns zero hits:
curl -XGET 'http://localhost:9200/test/state/_search?pretty=1' -d '
{
"query" : {
"match" : {
"stateName" : "OR"
}
}
}'
I believe this search returns no results because OR is a stop word, but I don't know how to work around this. Thanks for your help.
You can (and definitely should) control the way you index data by modifying your mapping according to your data and the way you want to search against it.
In your case I would disable stopwords for that specific field rather than modifying the stopword list, but you could do the latter too if you wish to. The point is that you're using the default mapping which is great to start with, but as you can see you need to tweak it depending on your needs.
For each field, you can specify what analyzer to use. An analyzer defines the way you split your text into tokens (tokenizer) that will be indexed and also additional changes you can make to each token (even remove or add new ones) using token filters.
You can specify your mapping either while creating your index or update it afterwards using the put mapping API (as long as the changes you make are backwards compatible).
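As a rough illustration of disabling stop words for one field, the index could be created with a custom analyzer that tokenizes and lowercases but has no stop filter. Several things here are assumptions: the analyzer name no_stop is made up, the typed mapping under state mirrors the URLs in the question, and the text field type assumes a 5.x or 6.x cluster (older releases use string instead). The analyzer of an existing field cannot be changed in place, so the index has to be created this way before the documents are indexed.

```shell
#!/bin/sh
# Index settings plus mapping: "no_stop" (hypothetical name) is a custom
# analyzer with the standard tokenizer and a lowercase filter, but no stop filter.
SETTINGS='{
  "settings": {
    "analysis": {
      "analyzer": {
        "no_stop": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase"] }
      }
    }
  },
  "mappings": {
    "state": {
      "properties": {
        "stateName": { "type": "text", "analyzer": "no_stop" }
      }
    }
  }
}'
printf '%s\n' "$SETTINGS"

# To create the index with these settings:
# curl -XPUT 'http://localhost:9200/test' \
#   -H 'Content-Type: application/json' -d "$SETTINGS"
```

With this analyzer, "OR" is indexed as the token or, and a match query on stateName for 'OR' is analyzed the same way, so the two line up.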