Elasticsearch wildcard fails when there are numbers in the search string

I have indexed some strings using this mapping:
{
  "mappings": {
    "record": {
      "properties": {
        "my_suggest": {
          "type": "completion"
        }
      }
    }
  }
}
In my index there are these values:
my_suggest = foo1
my_suggest = bar
my_suggest = something2
If I query:
{
  "query": {
    "wildcard": { "my_suggest": "*foo*" }
  }
}
record number 1 is returned.
If I do this query:
{
  "query": {
    "wildcard": { "my_suggest": "*foo1*" }
  }
}
I get no results back, but I am expecting record number 1.
Why does this happen?
Thanks.

The completion field type uses the simple analyzer by default, which removes any non-letter characters.
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-simple-analyzer.html
Use a different built-in analyzer or a custom analyzer, depending on your requirements.
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-analyzers.html
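If you cannot change the analysis of the completion field itself, one option is to index the value into a second, unanalyzed field and run the wildcard query against that. A minimal sketch, assuming an index named my_index and a hypothetical extra field my_suggest_raw (keyword fields are not analyzed, so the digit in foo1 survives):

PUT my_index
{
  "mappings": {
    "record": {
      "properties": {
        "my_suggest": { "type": "completion" },
        "my_suggest_raw": { "type": "keyword" }
      }
    }
  }
}

GET my_index/_search
{
  "query": {
    "wildcard": { "my_suggest_raw": "*foo1*" }
  }
}

Existing documents would need to be reindexed so that both fields are populated.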

Related

How can I find entries in Elasticsearch where a specific value is present in an array?

Ok, this is the schema:
id
generated {
  status {
    myStatuses[]
  }
}
So, given these entries:
id=1
generated.status.myStatuses=['busy', 'free']
id=2
generated.status.myStatuses=['busy']
id=3
generated.status.myStatuses=['free']
I want to match all the documents where "generated.status.myStatuses" contains the word "free".
In the example above I would find id=1 and id=3.
There's no dedicated array datatype in ES, so you can treat your keyword arrays as keywords. This means either
GET generated/_search
{
  "query": {
    "match": {
      "generated.status.myStatuses": "free"
    }
  }
}
or, for exact matches,
GET generated/_search
{
  "query": {
    "term": {
      "generated.status.myStatuses.keyword": "free"
    }
  }
}
You need to use a match or term query based on your data mapping.
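If you are unsure which of the two applies, you can check the field's mapping first; whether a .keyword sub-field exists depends on how the index was created. A quick sketch, assuming the index is named generated as above, with a term query against the field itself for the case where it is mapped as keyword only:

GET generated/_mapping

GET generated/_search
{
  "query": {
    "term": {
      "generated.status.myStatuses": "free"
    }
  }
}

If the field is mapped as text with a keyword sub-field (the dynamic-mapping default), both queries shown above work as-is.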

Elasticsearch exact match on analyzed field of integers

I want to find exact matches on an (analyzed string) field in ES. All values are integers but are mapped as strings. Unfortunately, I cannot change the mapping, and using
query: {
  match: {
    fieldName: '1234'
  }
}
also gives me 0 hits.
I cannot figure out if it's the standard analyzer working in a bizarre way when the mapping is
index: {
  type: {
    properties: {
      fieldName: {
        type: string
      }
    }
  }
}
and data is
{fieldName: '12345'}
or whether there is something in the match query that I'm missing.
Thanks :)
Change the quotation marks around the fieldName value from single quotes ' to double quotes ". Trying your query with the correct quotes returns the expected results on my end.
{
  "query": {
    "match": {
      "fieldName": "1234"
    }
  }
}
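If you are sending the request over HTTP rather than through a client library, a curl sketch with the corrected quoting (the index name my_index is a placeholder; the Content-Type header is required on Elasticsearch 6 and later):

curl -XGET 'localhost:9200/my_index/_search?pretty' -H 'Content-Type: application/json' -d '
{
  "query": {
    "match": {
      "fieldName": "1234"
    }
  }
}'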

How to return all documents where a string occurs in the document at least N times

If I wanted to return all documents that contain the term beetlejuice, I could use a query like
{
  "bool": {
    "should": [
      {
        "term": {
          "description": "beetlejuice"
        }
      }
    ]
  }
}
What's not clear is how to return all documents where the description field contains the string beetlejuice at least 3 times within it. I see minimum_should_match, but I think that is to be used for separate queries in a bool. How can I craft a query to match when a word occurs at least N times within the document's description field?
You can use scripting to achieve this.
Basically, all you need is the term frequency of the desired term in the document's field, and you can access that value from a script:
_index['FIELD']['TERM'].tf()
Sample filter script:
"filter": {
  "script": {
    "script": "_index['description']['beetlejuice'].tf() > N",
    "params": {
      "N": 2
    }
  }
}
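For context, a sketch of how that filter plugs into a full request using the older filtered-query DSL this answer assumes (my_index is a placeholder, and accessing _index from a script requires the relevant scripting settings to be enabled). With N set to 2, it keeps documents where beetlejuice occurs at least 3 times:

GET my_index/_search
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "_index['description']['beetlejuice'].tf() > N",
          "params": { "N": 2 }
        }
      }
    }
  }
}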

Elasticsearch: Contains query

I have a column in my mapping that holds an array of strings
col1
["asd","fgh","wer"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk","fsdfd"]
["asd","trth","fdf"]
The column col1 is not analyzed in the index, and I do not want to change the mapping.
"col1": {
  "type": "string",
  "index": "not_analyzed"
}
Now, I want to retrieve all records where the string asd appears, so in this case I want the first and fourth records. I tried using the query
query: {
  wildcard: {
    "col1": "asd"
  }
}
with
POST localhost:9200/indexName/test/_search
but that gives me empty results. Which query should I use in this case?
Edit
So I was able to solve the above problem. Here is a follow-up. Consider that this is my data:
col1
["asd fd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
So now the array contains some strings that have multiple words. I still want to return the first and fourth records. If I go with the solution I posted, I only get the fourth one. How can I apply the contains logic to each element of the array in col1?
Note
A partial solution is
{ "query": { "match_phrase_prefix": { "col1": "asd" } } }
So again, for the data
col1
["asd fd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
it returns the first and fourth records. However, if I have
col1
["fd asd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
then once again it only returns the fourth one, which is understandable, since asd is no longer a prefix for that value in the first record.
Is there a way to do a contains-type match instead of just a prefix match?
You can use a simple terms query and it should work
POST localhost:9200/indexName/test/_search
{
  "query": {
    "terms": { "col1": ["asd"] }
  }
}
So, here is the proper query:
{
  fields: ["col1", "col2"],
  query: {
    filtered: {
      query: {
        match_all: {}
      },
      filter: {
        terms: {
          col1: ["asd"]
        }
      }
    }
  }
}
Final Answer
query: {
  wildcard: {
    col1: {
      value: "*asd*"
    }
  }
}
:)
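For completeness, a sketch of that final answer as a full request against the same endpoint used earlier in the question; note that patterns with a leading * force Elasticsearch to scan a large part of the terms dictionary, so this can be slow on big indices:

POST localhost:9200/indexName/test/_search
{
  "query": {
    "wildcard": {
      "col1": {
        "value": "*asd*"
      }
    }
  }
}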

ElasticSearch query failing due to state codes "in" and "or" being reserved words

I'm querying for states using the state code as the query string, and "in" and "or" (Indiana and Oregon) are failing, presumably because they're reserved words.
I can confirm that the data exists in the index correctly, because when I run:
curl -XGET 'localhost:9200/state/_search?size=200&pretty=true' -d '{"query" : {"match_all" : {}}}' > out.txt
I can see the data there for both the working states and the non-working states. Plus, if I change the state code of a non-working state in CouchDB to something like XYZ, I can verify that the change makes it to ES by running the above command and searching for XYZ. So I know I'm looking at the right data and it's indexing fine.
The problem is the query. Right now, here's what my entire query object looks like:
var q = {
  size: 0,
  query: {
    filtered: {
      query: { term: { postcode: 'or' } },
      filter: { term: { version: 2 } }
    }
  },
  facets: {
    version: { terms: { field: "version" } },
    count: { statistical: { field: "latestValues.enroll" } }
  }
};
If I run that query, I get no results. If I swap the "or" out for "tn" or "tx" or "sc" etc., then it works fine.
I looked for a way to escape reserved words and found this link, but it doesn't seem to work for me when running the following query:
var q = {
  size: 0,
  query: {
    filtered: {
      query: { match_all: {} },
      filter: { term: { version: 2, postcode: 'or' } }
    }
  },
  facets: {
    version: { terms: { field: "version" } },
    count: { statistical: { field: "latestValues.enroll" } }
  }
};
(Note that that query also works when swapping "or" out for a non-reserved-word state, so I know it's not a problem with the query itself.)
Any ideas?
This is not about "reserved" words, it's about stop words. You are using an analyzer that removes stop words (the default analyzer up to a more recent version of Elasticsearch).
You'll need to change the analyzer for the field; see here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html
This change will require reindexing, though.
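A minimal sketch of one way to do that for this case, assuming the index and type are both named state (the type name is a guess from the curl command above) and that state codes should be matched exactly: map postcode as not_analyzed (a keyword field on newer versions), so no stop-word filtering is applied to it at all, and then reindex:

PUT state
{
  "mappings": {
    "state": {
      "properties": {
        "postcode": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

After reindexing, term queries for 'in' and 'or' match like any other state code.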
