How can I achieve this type of query in Elasticsearch?

I have added a document like this to my index
POST /analyzer3/books
{
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
And then I do queries like this
GET /analyzer3/_analyze
{
"analyzer": "english",
"text": "\"The * day I went with my * to the\""
}
And it successfully returns the previously added document.
My idea is to have quotes so that the query becomes exact, but also wildcards that can replace any word. Google has this exact functionality: you can search for something like "I'm * the university" and it will return pages that contain text such as "I'm studying in the university right now".
However I want to know if there's another way to do this.
My main concern is that this doesn't seem to work with other languages like Japanese and Chinese. I've tried with many analyzers and tokenizers to no avail.
Any answer is appreciated.

Exact matches on tokenized (text) fields are not that straightforward. If you have such requirements, it is better to also store your field as keyword.
Additionally, the keyword data type supports the wildcard query, which can help with your wildcard searches.
So just create a sub-field of type keyword and use the wildcard query on it.
Your search query will look something like this:
GET /_search
{
"query": {
"wildcard" : {
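"title.keyword" : "The * day I went with my * to the*"
}
}
}
The query above assumes that the title field has a sub-field named keyword of data type keyword. A wildcard pattern has to match the entire keyword value, hence the trailing * that covers the rest of the sentence.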
More on wildcard query can be found here.
If you still want to do exact searches on text data type, then read this
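To see why a wildcard pattern on a keyword field has to cover the whole stored value, here is a rough Python sketch. It uses `fnmatch.fnmatchcase` only as a stand-in for Elasticsearch's wildcard matching (an assumption: both treat `*` as "any run of characters" and both are case-sensitive here, but `fnmatch` also gives `[` a special meaning that Elasticsearch does not). The helper name `keyword_wildcard_matches` is made up for illustration.

```python
from fnmatch import fnmatchcase

TITLE = "The other day I went with my mom to the pool and had a lot of fun"


def keyword_wildcard_matches(value: str, pattern: str) -> bool:
    """Approximate a wildcard query on a keyword field.

    The pattern is compared against the whole value, not a substring of it.
    """
    return fnmatchcase(value, pattern)


# Without a trailing *, the pattern would have to end exactly at "to the".
print(keyword_wildcard_matches(TITLE, "The * day I went with my * to the"))   # False
# With a trailing *, the rest of the sentence is covered as well.
print(keyword_wildcard_matches(TITLE, "The * day I went with my * to the*"))  # True
```

Note that on a keyword field each * can span several words, so this is looser than "exactly one word per wildcard".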

Elasticsearch doesn't have Google like search out of the box, but you can build something similar.
Let's assume when someone quotes a search text what they want is a match phrase query. Basically remove the \" and search for the remaining string as a phrase.
PUT test/_doc/1
{
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
GET test/_search
{
"query": {
"match_phrase": {
"title": "The other day I went with my mom to the pool and had a lot of fun"
}
}
}
For the * it's getting a little more interesting. You could just make multiple phrase searches out of this and combine them. Example:
GET test/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"title": "The"
}
},
{
"match_phrase": {
"title": "day I went with my"
}
},
{
"match_phrase": {
"title": "to the"
}
}
]
}
}
}
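If you want to build that bool query from what the user typed, a minimal sketch in Python could look like the following. The function name `phrases_to_bool_query` is hypothetical, and the sketch assumes the wildcard is always a standalone `*` and that the surrounding quotes should simply be dropped.

```python
def phrases_to_bool_query(user_query: str, field: str) -> dict:
    """Split a quoted query on * and require every remaining fragment as a phrase."""
    # Drop surrounding whitespace and the double quotes the user typed.
    text = user_query.strip().strip('"')
    # Each piece between wildcards becomes its own phrase; empty pieces are skipped.
    fragments = [part.strip() for part in text.split("*")]
    clauses = [{"match_phrase": {field: part}} for part in fragments if part]
    return {"query": {"bool": {"must": clauses}}}


body = phrases_to_bool_query('"The * day I went with my * to the"', "title")
print(body)
```

Keep in mind that the separate phrases are only required to occur somewhere in the field; this form does not enforce their order or how many words sit between them.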
Or you could use slop in the phrase search. All the terms in your search query have to be there (unless they are removed by the tokenizer or as stop words), but the matched phrase can contain additional words. Here each * stands for one other word, so a slop of 2 in total. If you want to allow more than one word in place of each *, you will need to pick a higher slop:
GET test/_search
{
"query": {
"match_phrase": {
"title": {
"query": "The * day I went with my * to the",
"slop": 2
}
}
}
}
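The slop can be derived from the number of wildcards. A small Python sketch, assuming (as above) that the analyzer drops the standalone `*` tokens and that each wildcard should allow a fixed number of words; `phrase_with_slop` and `words_per_wildcard` are names invented for this example:

```python
def phrase_with_slop(user_query: str, field: str, words_per_wildcard: int = 1) -> dict:
    """Build a match_phrase query whose slop grows with the number of * wildcards."""
    # Drop surrounding whitespace and the double quotes the user typed.
    text = user_query.strip().strip('"')
    # Count only standalone * tokens, one per wildcard position.
    wildcard_count = text.split().count("*")
    return {
        "query": {
            "match_phrase": {
                field: {
                    "query": text,
                    "slop": wildcard_count * words_per_wildcard,
                }
            }
        }
    }


print(phrase_with_slop('"The * day I went with my * to the"', "title"))
```

Slop is an upper bound on how far terms may be moved, so a phrase with no extra word in a wildcard position would still match.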
Another alternative might be shingles, but this is a more advanced concept and I would start off with the basics for now.

Related

Elasticsearch: Must include all words in search if all exist, but ignore one or two if they don't?

I hope what I'm trying to explain makes sense, and there is a way that I could achieve it.
Currently I am searching in 40 million documents, with a query like this:
GET /all/_search
{
"query": {
"match": {
"full_text": {
"query": "insert ten or twelve words here to search",
"operator": "and"
}
}
}
}
Now I want to return only docs whose 'full_text' includes all of the words in the query. I am able to achieve that with the above snippet.
My question is: when there is no match at all, but removing "ten", for example, would yield one result, is there a way to configure my search to do that? I.e. to tell ES "aim for a 100% match, but if nothing is found, 90% would do just fine"!
Hope this is clear :)
You can use the minimum_should_match parameter with the match query:
{
"query": {
"match": {
"full_text": {
"query": "insert ten or twelve words here",
"minimum_should_match":"90%"
}
}
}
}
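A single request applies one fixed threshold, so "try 100% first, then relax" would need to be done by the client. Here is a minimal Python sketch of that fallback. It assumes a `search` callable that takes a request body and returns the usual response dict with `hits.hits` (for example a thin wrapper around whatever client you use); `build_match` and `search_with_fallback` are hypothetical helper names.

```python
from typing import Callable, Sequence


def build_match(text: str, minimum_should_match: str, field: str = "full_text") -> dict:
    """Build a match query with the given minimum_should_match threshold."""
    return {
        "query": {
            "match": {
                field: {"query": text, "minimum_should_match": minimum_should_match}
            }
        }
    }


def search_with_fallback(
    search: Callable[[dict], dict],
    text: str,
    thresholds: Sequence[str] = ("100%", "90%"),
) -> dict:
    """Try the strictest threshold first and relax it only when nothing is found."""
    response = {"hits": {"hits": []}}
    for threshold in thresholds:
        response = search(build_match(text, threshold))
        if response["hits"]["hits"]:
            break  # stop at the first threshold that returns any document
    return response
```

The cost is one extra round trip whenever the strict query comes back empty.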

Search in two fields on elasticsearch with kibana

Assuming I have an index with two fields, title and loc, I would like to search in these two fields and get the "best" match. So if I have three items:
{"title": "castle", "loc": "something"},
{"title": "something castle something", "loc": "something,pontivy,something"},
{"title": "something else", "loc": "something"}
... I would like to get the second one, which has "castle" in its title and "pontivy" in its loc. I have simplified the example; the real data is a bit more complicated. I tried this query, but the results do not seem accurate (it's a feeling, not really easy to explain):
GET merimee/_search
{
"query": {
"multi_match" : {
"query": "castle pontivy",
"fields": [ "title", "loc" ]
}
}
}
Is this the right way to search in several fields and get the document that matches in all of them?
Not sure my question is clear enough, I can edit if required.
EDIT:
The story is: the user types "castle pontivy" and I want to get the "best" result for this query, which is the second one because it contains "castle" in "title" and "pontivy" in "loc". In other words, I want the result that matches best across both fields.
As the other poster suggested, you could use a bool query, but that might not work for your use case since you have a single search box that you want to query against multiple fields.
I recommend looking at a Simple Query String query as that will likely give you the functionality you're looking for. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html
So you could do something similar to this:
{
"query": {
"simple_query_string" : {
"query": "castle pontivy",
"fields": ["title", "loc"],
"default_operator": "and"
}
}
}
So this will try to give you the best documents that match both terms in either of those fields. The default operator is set as AND here because otherwise it is OR which might not give you the expected results.
It is worthwhile to experiment with the other options available for this query type as well. You might also explore the Query String query, as it gives more flexibility, but the Simple Query String query works very well for most cases.
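To make the "every term must match in one of the fields" idea concrete, here is a toy Python model run against the three sample documents from the question. It is only a sketch of the AND semantics: the tokenizer is a simple regex, there is no scoring, and `tokens` and `matches_all_terms` are names made up for this example, not anything Elasticsearch provides.

```python
import re

DOCS = [
    {"title": "castle", "loc": "something"},
    {"title": "something castle something", "loc": "something,pontivy,something"},
    {"title": "something else", "loc": "something"},
]


def tokens(text: str) -> set:
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_all_terms(doc: dict, query: str, fields: list) -> bool:
    """True when every query term occurs in at least one of the given fields."""
    field_tokens = set()
    for name in fields:
        field_tokens |= tokens(doc.get(name, ""))
    return all(term in field_tokens for term in tokens(query))


matching = [d for d in DOCS if matches_all_terms(d, "castle pontivy", ["title", "loc"])]
print(matching)
```

Only the second document contains both "castle" and "pontivy" across title and loc, so it is the only one kept.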
This can be done with a bool query that matches each field separately.
GET _search
{
"query": {
"bool": {
"must": [
{ "match": { "title": "castle" } },
{ "match": { "loc": "pontivy" } }
]
}
}
}

Search within the results got from elasticsearch

Is it possible to search within the results that I get from elasticsearch?
To achieve that currently I need to run & wait for two searches on elasticsearch: the first search is
{ "match": { "title": "foo" } }
It takes 5 seconds and returns 500 docs. Then comes a second search:
{
"bool": {
"must": [
{ "match": { "title": "foo" } },
{ "match": { "title": "bar" } }
]
}
}
It takes another 5 seconds and returns 200 docs, which basically has nothing to do with the first search from elasticsearch's perspective.
Instead of doing it this way, I'd like to offer a "search further within the result" option to my users. With this option, users could run a second search with additional keywords, based on the result returned from the first search.
So my scenario is that a user makes a first search with the keyword "foo", gets 500 results on the webpage, then selects "search further within the result" to make a second search within those 500 results, and hopes to get refined results really quickly.
How can I achieve it? Thanks!
What you could do is use the IDS query. Collect all document IDs from the first request, and then post them with a new Bool query that includes an IDS query in a must clause next to the original query. You could efficiently collect the IDs in the first request using the Scroll API. Since you will return the second result sorted anyway, it does not make sense to do any sorting in the first request, so you can speed up the first request.
See:
Scroll API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
IDS Query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
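As a rough illustration of the second request, here is a Python sketch that builds the bool query with an ids clause next to the keyword matches. The function name `refine_within_ids` and the example IDs are made up; the IDs are assumed to have been collected from the first response already.

```python
def refine_within_ids(doc_ids, field: str, keywords) -> dict:
    """Build a bool query limited to known document IDs plus the given keywords."""
    # One match clause per keyword, for example the original term and the refinement.
    must = [{"match": {field: keyword}} for keyword in keywords]
    # The ids query restricts the search to documents from the first result set.
    must.append({"ids": {"values": list(doc_ids)}})
    return {"query": {"bool": {"must": must}}}


# Hypothetical IDs taken from the first search for "foo".
body = refine_within_ids(["1", "7", "42"], "title", ["foo", "bar"])
print(body)
```

Since the ids clause does not need to contribute to the score, it could probably sit in a filter clause instead of must; that is a variation on the suggestion above, not part of it.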
A post_filter is a way to search within the results of another search.
In your case:
GET _search
{
"query": {
"match": {
"title": "foo"
}
},
"post_filter": {
"match": {
"title": "bar"
}
}
}
post_filter will be executed on the query result.

Elasticsearch: filter by any field

I am playing with filters in elasticsearch (we use the old version 1.3.1), and I need to filter my search results by any field. With a query, this can be done like this:
"query": {
"query_string": {
"query": "_all:test"
}
}
But filters do not seem to work with the _all field. What can I do? Would a newer elasticsearch version solve my problem?
Thanks in advance!
PS: I need to search exact values, so I cannot use queries. There is difference between queries and filters - if you search for my brown, then you can expect results like:
my brown
This is my brown dog.
someone stolen my brown wallet
But filter will return only my brown, and that is what I need.
You might want to read up a little on the distinction between queries and filters. What you're doing there is a query string query.
If you do actually want to filter against exact text tokens (read up on analysis if you don't know what I mean by "tokens"), AND you have your mapping set up such that the "_all" field behaves as you're expecting then try something like this:
POST /test_index/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"_all": "test"
}
}
}
}
}
If, on the other hand, you want to allow some analysis (so that "Test" is tokenized to "test", for example), you may want this instead:
POST /test_index/_search
{
"query": {
"match": {
"_all": "Test"
}
}
}
Here is some code I used to play around with it:
http://sense.qbox.io/gist/44adf2c2ade8abd6758f0e08ed2e40434850fc1c

elasticsearch - confused about how to search for items where a field contains a string

This query works fine and returns only one item, "steve_jobs".
{
"query": {
"constant_score": {
"filter": {
"term": {
"name":"steve_jobs"
}
}
}
}
}
So, now I want to get all people with name prefix steve_. So I try this:
{
"query": {
"constant_score": {
"filter": {
"term": {
"name": "steve_"
}
}
}
}
}
This is returning nothing. Why?
I'm confused about when to use term query / term filter / terms filter / querystring query.
What you need is a Prefix Query.
If you are indexing your document like so:
POST /testing_nested_query/class/
{
"name": "my name is steve_jobs"
}
And you are using the default analyzer, then the problem is that steve_jobs will be indexed as one term. So your Term Query will never find any docs matching the term steve_, as there is no such term in the index. A Prefix Query solves your problem by searching for a prefix in all the indexed terms.
You can solve the same problem by creating custom analyzers (read this and this) so that steve_jobs is stored as steve and jobs.
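The difference between the two lookups can be sketched in a few lines of Python. This is a toy model, not how Elasticsearch is implemented: `analyze` is a stand-in that lowercases and splits on non-word characters (an assumption that happens to keep steve_jobs as one token, because `\w` includes the underscore), and the last line shows what a prefix query body for the name field from the question might look like.

```python
import re


def analyze(text: str) -> list:
    """Very rough stand-in for an analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def term_matches(terms: list, value: str) -> bool:
    """A term lookup only succeeds when the exact term exists in the index."""
    return value in terms


def prefix_matches(terms: list, prefix: str) -> bool:
    """A prefix lookup succeeds when any indexed term starts with the prefix."""
    return any(term.startswith(prefix) for term in terms)


INDEXED_TERMS = analyze("my name is steve_jobs")

print(term_matches(INDEXED_TERMS, "steve_"))    # False: no such term was indexed
print(prefix_matches(INDEXED_TERMS, "steve_"))  # True: steve_jobs starts with steve_

# Assumed request body for the prefix query, using the "name" field from the question.
prefix_query = {"query": {"prefix": {"name": "steve_"}}}
```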
