Query DSL regexp pattern doesn't work with some strings - elasticsearch

I have a pattern ".TP-V." which returns strings like "SSTP-VPN". But the pattern ".SSH." does not return anything, although there are lines like "core:Login:SSH:Cisco". I have no idea what pattern is needed.

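You need to use ".*SSH.*" instead of ".SSH.". A regexp query pattern has to match the entire term, and "." matches exactly one character, so ".SSH." can only match five-character terms.
Adding a working example: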
Index Data:
{
"name":"core:Login:SSH:Cisco"
}
{
"name":"SSTP-VPN"
}
Search Query:
{
"query": {
"regexp": {
"name.keyword": {
"value": ".*SSH.*"
}
}
}
}
Search Result:
"hits": [
{
"_index": "68015371",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"name": "core:Login:SSH:Cisco"
}
}
]
Search Query:
{
"query": {
"regexp": {
"name.keyword": {
"value": ".*TP-V.*"
}
}
}
}
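The reason the shorter pattern fails can be illustrated outside Elasticsearch. The following is a minimal sketch, assuming Python's re.fullmatch is a close enough stand-in for anchored regexp matching on these simple patterns (the two regex dialects differ in other respects). The helper name matches_whole_term is made up for illustration.

```python
import re

def matches_whole_term(pattern: str, term: str) -> bool:
    # The regexp query is anchored: the pattern must cover the entire term.
    # re.fullmatch is used here only as an approximation of that behaviour.
    return re.fullmatch(pattern, term) is not None

terms = ["core:Login:SSH:Cisco", "SSTP-VPN"]

for pattern in [".SSH.", ".*SSH.*", ".*TP-V.*"]:
    hits = [t for t in terms if matches_whole_term(pattern, t)]
    print(pattern, "->", hits)
```

With the two sample values above, ".SSH." finds nothing, while ".*SSH.*" and ".*TP-V.*" each find one value.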

Related

Elasticsearch - unify search results from different indexes

I want to perform a search query on different indexes with different search queries and unify the results.
I know there is a multi-target syntax, which allows me to run one specific query over multiple indexes.
What I want is a different query for each index, and then something like a UNION (SQL) of the results.
Is there a way to achieve that?
You can use the _index metadata field. It lets you query multiple indexes with a different query for each one.
Adding a working example with index data, search query and search result:
Index Data
POST /index1/_doc/1
{
"name":"foo"
}
POST /index2/_doc/1
{
"name":"bar"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"term": {
"_index": "index1"
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"name": "bar"
}
},
{
"term": {
"_index": "index2"
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "index1",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "foo"
}
},
{
"_index": "index2",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "bar"
}
}
]
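When there are more than two indexes, the same request body can be generated instead of written by hand. This is a minimal sketch that only builds the JSON body; the function name build_union_query and its input format are assumptions made for illustration, and sending the body to the cluster is left out.

```python
import json

def build_union_query(per_index_queries):
    # One bool/must pair per index: the index-specific query AND a term
    # filter on the _index metadata field. The pairs are OR-ed via should.
    should = [
        {"bool": {"must": [query, {"term": {"_index": index}}]}}
        for index, query in per_index_queries.items()
    ]
    return {"query": {"bool": {"should": should}}}

body = build_union_query({
    "index1": {"match": {"name": "foo"}},
    "index2": {"match": {"name": "bar"}},
})
print(json.dumps(body, indent=2))
```

The printed body has the same structure as the search query above and would be sent to a search that targets both indexes.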

Elastic search partial match but strict phrase matching

I'm looking for a way to do fuzzy partial matching against a field where the words match, but I also want strict phrase matching.
For example, say I have field values such as:
foo bar
bar foo
I would like to achieve the following search behaviour:
If I search foo, I would like to return back both results.
If I search ba, I would like to return back both results.
If I search bar foo, I would like to only return back one result.
If I search bar foo foo, I don't want to return any results.
I would also like to add single-character fuzziness, so that if foo is mistyped as fbo, both results are still returned.
My current search and index analyzer uses an edge_ngram tokenizer and works fairly well, except that if any gram matches, it returns the result regardless of whether the following words match. For example, the search bar foo buzz returns both of the following results:
foo bar
bar foo
My tokenizer:
ngram_tokenizer: {
type: "edge_ngram",
min_gram: "2",
max_gram: "15",
token_chars: ['letter', 'digit', 'punctuation', 'symbol'],
},
My analyzer:
nGram_analyzer: {
filter: [
"lowercase",
"asciifolding"
],
type: "custom",
tokenizer: "ngram_tokenizer"
},
My field mapping:
type: "search_as_you_type",
doc_values: false,
max_shingle_size: 3,
analyzer: "nGram_analyzer"
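The behaviour described in the question can be reproduced with a rough simulation. This sketch assumes the tokenizer splits on whitespace and emits prefixes of 2 to 15 characters, that the lowercase filter is applied, and that the search combines tokens with OR; the asciifolding filter is ignored, and the function names are invented for illustration.

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    # Approximation of the edge_ngram tokenizer plus lowercase filter:
    # every word contributes its prefixes of length min_gram..max_gram.
    grams = []
    for word in text.lower().split():
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            grams.append(word[:n])
    return grams

def any_gram_matches(search, value):
    # OR semantics: one shared gram is enough for a hit.
    return bool(set(edge_ngrams(search)) & set(edge_ngrams(value)))

print(edge_ngrams("bar foo buzz"))
for value in ["foo bar", "bar foo"]:
    print(value, any_gram_matches("bar foo buzz", value))
```

Under these assumptions the grams of buzz match nothing, but the grams of bar and foo are enough for both values to be returned.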
One way to meet all of these requirements is to use the span_near query.
Span near queries are much more verbose, but they are suitable for phrase matching combined with a fuzziness parameter.
Adding a working example with index data, search queries and search results:
Index Mapping:
{
"mappings": {
"properties": {
"title": {
"type": "text"
}
}
}
}
Index Data:
{
"title":"bar foo"
}
{
"title":"foo bar"
}
Search Queries:
If I search foo, I would like to return back both results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "67205552",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "bar foo"
}
},
{
"_index": "67205552",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar"
}
}
]
If I search ba, I would like to return back both results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "ba",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "67205552",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "bar foo"
}
},
{
"_index": "67205552",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar"
}
}
]
If I search bar foo foo, I don't want to return any results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "bar",
"fuzziness": 2
}
}
}
}
},
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
},
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result will be empty
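Since each query above repeats the same clause once per search word, the body can be generated from the search phrase. This is a minimal sketch that only builds the JSON; the function name build_fuzzy_phrase_query and its defaults are assumptions for illustration, and the words are passed through as typed because the fuzzy query does not analyze its input.

```python
import json

def build_fuzzy_phrase_query(phrase, field="title", fuzziness=2):
    # One span_multi/fuzzy clause per whitespace-separated word.
    clauses = [
        {"span_multi": {"match": {"fuzzy": {field: {"value": word, "fuzziness": fuzziness}}}}}
        for word in phrase.split()
    ]
    # slop 0 and in_order true require the words to be adjacent and ordered.
    span_near = {"span_near": {"clauses": clauses, "slop": 0, "in_order": True}}
    return {"query": {"bool": {"must": [span_near]}}}

body = build_fuzzy_phrase_query("bar foo foo")
print(json.dumps(body, indent=2))
```

For bar foo foo this produces the three-clause query shown above; a single word such as foo produces the one-clause form.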

Elastic Multimatch string with dash (or other symbol)

I am trying to match dashes (and other symbols) in my elastic query.
It is a fuzzy search on all the fields using the whitespace analyzer.
My query:
function_score: {
query: {
multi_match: {
query: string,
analyzer: "whitespace",
fuzziness: 1
}
}
}
However, this gives unexpected results with dash characters. For example, Central-Park doesn't work with this.
Dashes only work well when I use a phrase match and strip out the double quotes, but then there is no fuzziness.
Does anyone know how I can get fuzzy search to work normally with dashes, please?
Adding a working example with index mapping, index data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"place": {
"type": "text",
"analyzer":"whitespace"
}
}
}
}
Index Data:
{
"place": "Cwntral-Park"
}
{
"place": "Central-Park"
}
{
"place": "Central-Area"
}
Search Query:
{
"query": {
"bool": {
"should": {
"match": {
"place": {
"query": "Central-Park",
"fuzziness": 1
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65605120",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"place": "Central-Park"
}
},
{
"_index": "65605120",
"_type": "_doc",
"_id": "3",
"_score": 0.8990934,
"_source": {
"place": "Cwntral-Park"
}
}
]
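The result above is consistent with the whitespace analyzer keeping Central-Park as a single term, so that fuzziness applies to the whole string. The sketch below illustrates this under two stated assumptions: whitespace splitting stands in for the analyzer, and plain Levenshtein distance stands in for the edit distance Elasticsearch uses (which also counts a transposition as one edit by default). The function names are illustrative only.

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_hits(query, values, fuzziness=1):
    # Whitespace "analysis": split on spaces only, so dashes stay in the term.
    query_terms = query.split()
    return [v for v in values
            if any(edit_distance(q, t) <= fuzziness
                   for q in query_terms for t in v.split())]

places = ["Cwntral-Park", "Central-Park", "Central-Area"]
print(fuzzy_hits("Central-Park", places))
```

Under these assumptions Central-Area is too many edits away from the query term, which agrees with it being absent from the search result.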

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
]
}
}
How would it be possible to modify the index and associate the results so that the above query also matches foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help
You can use a term query instead of a match query. It is well suited to keyword fields:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Adding a working example with index data and search query. (Using the default mapping)
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
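If the set of associated filenames is known up front, the query can be built from a lookup table. Note that this sketch swaps the two match clauses above for a single terms query on the keyword sub-field, which is an alternative rather than what the answer itself shows. The ALIASES table and the function name are hypothetical.

```python
import json

# Hypothetical association table: search value -> exact filenames to return.
ALIASES = {
    "foo-bar": ["foo-bar", "foo-bar-baz"],
}

def build_exact_query(filename, field="fileName.keyword"):
    # Fall back to the value itself when no association is configured.
    values = ALIASES.get(filename, [filename])
    # terms matches documents whose keyword equals any listed value exactly.
    return {"query": {"terms": {field: values}}}

print(json.dumps(build_exact_query("foo-bar"), indent=2))
```

Keeping the association on the query side means the index does not have to be changed when a new variation is added.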

elasticSearch: bool query with multiple values on one field

This works:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "12343"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
But I would like to capture multiple prids in one call.
This does not work however:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "[12343, 11234, 13421]"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
Any hints?
Because you are using must in your bool query, the clauses are combined with a logical AND: every document that matches on the prid field must also match "repoSlug": "com.xxx.vserver".
If none of those documents match "repoSlug": "com.xxx.vserver", no results are returned.
And if only 2 documents match both clauses, only those 2 are returned in the search result, not all the documents that match on prid.
Adding a working example with sample docs and search query:
Index Sample Data :
{
"id":"1",
"message":"hello"
}
{
"id":"2",
"message":"hello"
}
{
"id":"3",
"message":"hello-bye"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"id": "[1, 2, 3]"
}
},
{
"match": {
"message": "hello"
}
}
]
}
}
}
Search Result :
"hits": [
{
"_index": "foo14",
"_type": "_doc",
"_id": "1",
"_score": 1.5924306,
"_source": {
"id": "1",
"message": "hello"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "3",
"_score": 1.4903541,
"_source": {
"id": "3",
"message": "hello-bye"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "2",
"_score": 1.081605,
"_source": {
"id": "2",
"message": "hello"
}
}
]
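The AND behaviour described above can be checked with a small in-memory simulation. This is a sketch under stated assumptions: both fields are analyzed text, a lowercase split on non-word characters approximates the analyzer, and match uses its default OR between query tokens. The function names are invented for illustration.

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer: lowercase, split on non-word characters.
    return re.findall(r"\w+", text.lower())

def match(doc, field, query):
    # match with the default OR operator: any shared token is a hit.
    return bool(set(analyze(query)) & set(analyze(doc[field])))

def search(docs, id_query, message_query):
    # bool.must: a document is returned only if every clause matches.
    return [d["id"] for d in docs
            if match(d, "id", id_query) and match(d, "message", message_query)]

docs = [
    {"id": "1", "message": "hello"},
    {"id": "2", "message": "hello"},
    {"id": "3", "message": "hello-bye"},
]

print(search(docs, "[1, 2, 3]", "hello"))
print(search(docs, "[1, 2, 3]", "bye"))
```

With hello as the second clause all three ids come back; with bye only document 3 satisfies both clauses, even though the first clause still matches all three.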

Resources