Delete all documents with a certain property value? - elasticsearch

I'm trying to delete all documents where a certain property has a certain value. The code below is my best attempt, but the ES API returns a parse error:
const userProperty = "couchDbOrigin";
client.deleteByQuery({
index: "_all",
body: { query: { bool: { must: [{ terms: { [userProperty]: user } }] } } }
});
What is wrong with this code?

The terms query expects its criteria as an array, so you should use:
client.deleteByQuery({
index: "_all",
body: { query: { bool: { must: [{ terms: { [userProperty]: [user] } }] } } }
});
However, if you delete documents for one user at a time, use a term query instead: it expects a single value and can perform better.
client.deleteByQuery({
index: "_all",
body: { query: { bool: { must: [{ term: { [userProperty]: user } }] } } }
});
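The choice between term and terms can be folded into a small helper. The following is a minimal sketch, not part of the original answer: buildDeleteBody is a hypothetical name, and placing the clause under filter rather than must is an assumption made here because a delete does not need relevance scores.

```javascript
// Hypothetical helper (not part of the Elasticsearch client):
// builds the request body for client.deleteByQuery().
function buildDeleteBody(field, value) {
  // terms expects an array of values; term expects a single value.
  const clause = Array.isArray(value)
    ? { terms: { [field]: value } }
    : { term: { [field]: value } };
  // Assumption: filter context is used because scoring is irrelevant for a delete.
  return { query: { bool: { filter: [clause] } } };
}

// Example bodies; pass one as `body` to client.deleteByQuery({ index: "_all", body }).
const singleUser = buildDeleteBody("couchDbOrigin", "alice");
const manyUsers = buildDeleteBody("couchDbOrigin", ["alice", "bob"]);
console.log(JSON.stringify(singleUser));
console.log(JSON.stringify(manyUsers));
```

The user names are placeholder values; only the shape of the generated query matters.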

Related

Kibana: vega-lite visualizations if data result is empty

I'm trying to create a vega-lite visualization from a query. But if the query returns an empty result, I get the message "cannot read property 'xx' undefined". Part of my visualization code is below:
{
$schema: https://vega.github.io/schema/vega-lite/v2.6.0.json
data: {
name: our_data
url: {
index: index-7.0.1-index*
body: {
query: {
bool: {
filter: [
{
match_all: {}
}
{
match_all: {}
}
]
should: []
must_not: []
}
}
aggs: {
xx: {
top_hits: {
docvalue_fields: [
{
field: someField
format: use_field_mapping
}
]
_source: ["someField"]
size: 1
sort: [
{
@timestamp: {order: "desc"}
}
]
}
}
}
}
}
format: {property: "aggregations.xx.hits.hits"}
}
Is there any way to avoid the "cannot read property 'xx' undefined" message? If the query returns no data, I just want the vega-lite visualization to be blank.
Thank you.

Elasticsearch return only results that match array of ids

Is it possible to use elastic search to query only within a set of roomIds?
I tried using bool and should:
query: {
bool: {
must: [
{
multi_match: {
operator: 'and',
query: keyword,
fields: ['content'],
type: 'most_fields'
}
},
{ term: { users: caller } },
{
bool: {
should:
term: {
room: [list of roomIds]
}
}
}
]
}
},
It works but when I have more than 1k roomIds I get "search_phase_execution_exception".
Is there a better way to do this? Thanks.
To match against an array of values, use the terms query instead of term:
query: {
bool: {
must: [
{
multi_match: {
operator: 'and',
query: keyword,
fields: ['content'],
type: 'most_fields'
}
},
{ term: { users: caller } },
{
bool: {
should: {
terms: {
room: [list of roomIds]
}
}
}
}
]
}
},
From the documentation:
By default, Elasticsearch limits the terms query to a maximum of
65,536 terms. This includes terms fetched using terms lookup. You can
change this limit using the index.max_terms_count setting.
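As a rough sketch of the answer above, the query can be built by a function that also checks the list against the default limit quoted from the documentation. buildRoomQuery and DEFAULT_MAX_TERMS are hypothetical names, and moving the two exact-match clauses into filter is an assumption made here (they do not need to contribute to the score), not something the answer prescribes.

```javascript
// Default value of index.max_terms_count, as quoted from the documentation.
const DEFAULT_MAX_TERMS = 65536;

// Hypothetical helper: full-text match on content, restricted to a user and a set of rooms.
function buildRoomQuery(keyword, caller, roomIds, maxTerms = DEFAULT_MAX_TERMS) {
  if (roomIds.length > maxTerms) {
    throw new RangeError(`terms query limited to ${maxTerms} values, got ${roomIds.length}`);
  }
  return {
    bool: {
      must: [
        {
          multi_match: {
            operator: "and",
            query: keyword,
            fields: ["content"],
            type: "most_fields"
          }
        }
      ],
      // Assumption: exact-match restrictions go in filter context (no scoring).
      filter: [
        { term: { users: caller } },
        { terms: { room: roomIds } } // one terms clause instead of one term clause per id
      ]
    }
  };
}

// Placeholder values for illustration only.
console.log(JSON.stringify(buildRoomQuery("hello", "user-1", ["room-1", "room-2"]), null, 2));
```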

Elasticsearch: Order by date field (descending): gauss or field_value_factor?

I have an issue with modifying a document's score according to its creation date. I have tried the gauss function and field_value_factor.
The first one is (the whole query clause):
@search_definition[:query] = {
function_score:{
query: {
bool: {
must: [
{
query_string: {
query: <query_term>,
fields: %w( field_1^2
field_2^3
...
field_n^2),
analyze_wildcard: true,
auto_generate_phrase_queries: false,
analyzer: 'brazilian',
default_operator: 'AND'
}
}
],
filter: {
bool: {
should: [
{ term: {"boolean_field": false}},
{ terms: {"array_field_1": options[:key].ids}},
{ term: {"array_field_2.id": options[:key].id}}
]
}
}
}
},
gauss:{
date_field: {
scale: "1d",
decay: "0.5"
}
}
}
}
With this configuration, I am telling Elasticsearch that the most recent documents should get a higher score. When I execute the query, the result is the exact opposite: the oldest documents are returned first. Even if I change the origin to
origin: "2010-05-01 00:00:00"
which is the date of the first document, the oldest ones are still retrieved first. What am I doing wrong?
With field_value_factor, things are better, but still not what I expect (the whole query clause is):
@search_definition[:query] = {
function_score:{
query: {
bool: {
must: [
{
query_string: {
query: <query_term>,
fields: %w( field_1^2
field_2^3
...
field_n^2),
analyze_wildcard: true,
auto_generate_phrase_queries: false,
analyzer: 'brazilian',
default_operator: 'AND'
}
}
],
filter: {
bool: {
should: [
{ term: {"boolean_field": false}},
{ terms: {"array_field_1": options[:key].ids}},
{ term: {"array_field_2.id": options[:key].id}}
]
}
}
}
},
field_value_factor: {
field: "date_field",
factor : 100,
modifier: "sqrt"
}
}
}
With this other configuration, the documents from 2016 and 2015 are returned first. However, many documents from 2016 receive a lower score than others from 2015, even though I set the modifier "sqrt" with factor: 100.
I suppose the gauss function would be the appropriate solution. How can I invert the gauss result? Or how can I increase the field_value_factor so that 2016 comes before 2015?
Thanks a lot,
Guilherme
You might want to try putting the gauss function inside the functions parameter and giving it a weight, as in the following query. I also think the scale is too low, which could be making a lot of documents score close to zero. I have also increased decay to 0.8 and given a higher weight to recent documents. You could also use the explain API to see how the scoring is done.
{
"function_score": {
query: {
bool: {
must: [{
query_string: {
query: <query_term>,
fields: %w(field_1^2 field_2^3
... field_n^2),
analyze_wildcard: true,
auto_generate_phrase_queries: false,
analyzer: 'brazilian',
default_operator: 'AND'
}
}],
filter: {
bool: {
should: [{
term: {
"boolean_field": false
}
}, {
terms: {
"array_field_1": options[:key].ids
}
}, {
term: {
"array_field_2.id": options[:key].id
}
}]
}
}
}
},
"functions": [{
"gauss": {
"date_field": {
"origin": "now",
"scale": "30d",
"decay": "0.8"
}
},
"weight": 20
}]
}
}
Also, the origin should be the latest date, so rather than origin: "2010-05-01 00:00:00", try
origin: "2016-05-01 00:00:00"
Does this help?
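To see why a scale of "1d" can push most scores towards zero, it may help to compute the decay by hand. The sketch below assumes the gauss formula from the function_score documentation, exp(-max(0, |value - origin| - offset)^2 / (2 * sigma^2)) with sigma^2 = -scale^2 / (2 * ln(decay)). gaussDecay is a hypothetical helper, and distances are expressed in days purely for illustration.

```javascript
// Hypothetical helper: gauss decay multiplier for a document that is
// `distance` days away from the origin. All arguments use the same unit (days here).
function gaussDecay(distance, scale, decay, offset = 0) {
  // Distances within the offset are not penalised at all.
  const d = Math.max(0, Math.abs(distance) - offset);
  // sigma^2 is chosen so that the multiplier equals `decay` at distance == scale.
  const sigmaSq = -(scale * scale) / (2 * Math.log(decay));
  return Math.exp(-(d * d) / (2 * sigmaSq));
}

// A document 10 days away from the origin, under the two configurations discussed above.
console.log("scale 1d, decay 0.5:", gaussDecay(10, 1, 0.5));
console.log("scale 30d, decay 0.8:", gaussDecay(10, 30, 0.8));
```

With the first configuration the multiplier for a ten-day-old document is vanishingly small, so nearly all documents end up with almost the same (near-zero) contribution from the date; with the second it stays close to 1 and falls off gradually.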

Filter query with or without keyword

I have a table with a name column and a category_id column. The query below returns a BadRequest error. I think this is the most meaningful part of the returned error blob:
Expected field name but got START_OBJECT
*note: "something*" is a keyword I pass in to search for.
I am looking for a query that will return results filtered by category_id with or without a keyword search.
{
query: {
bool: {must: [{
wildcard: {name: "something*"}
]},
filtered: {
filter: {
bool: {
must: {
term: {category_id: 1}
}
}
}
}
},
sort: [{
_geo_distance: {
store_location: {:lat=>0, :lon=>0},
order: "asc",
unit: "miles"
}
}]
}
What is wrong with this query?
If I were to guess (i.e. this is untested), I'd say it's because your bool is outside of your filtered query. Move it into a query clause inside the filtered query, like so:
{
query: {
filtered: {
query: {
bool: {must: [{
wildcard: {name: "something*"}
}]}
},
filter: {
bool: {
must: {
term: {category_id: 1}
}
}
}
}
},
sort: [{
_geo_distance: {
store_location: {:lat=>0, :lon=>0},
order: "asc",
unit: "miles"
}
}]
}
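Note that the filtered query was removed in Elasticsearch 5.0; on versions where bool accepts a filter clause (2.0 and later), the same intent can be expressed with bool alone. The following is a minimal sketch of "filter by category_id, with or without a keyword" under that assumption. buildCategoryQuery is a hypothetical name and the sort section is left out for brevity.

```javascript
// Hypothetical helper: always filters on category_id and adds the
// wildcard clause only when a keyword is supplied.
function buildCategoryQuery(categoryId, keyword) {
  const bool = {
    filter: [{ term: { category_id: categoryId } }]
  };
  if (keyword) {
    bool.must = [{ wildcard: { name: keyword } }];
  }
  return { query: { bool } };
}

console.log(JSON.stringify(buildCategoryQuery(1, "something*")));
console.log(JSON.stringify(buildCategoryQuery(1)));
```

Without a keyword the query contains only the filter, so every document in the category matches.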

Multiple types in Elasticsearch Type Filter

I have a filtered query like this
query: {
filtered: {
query: {
bool: {
should: [{multi_match: {
query: @query,
fields: ['title', 'content']
}
},{fuzzy: {
content: {
value: @query,
min_similarity: '1d',
}
}}]
}
},
filter: {
and: [
type: {
value: @type
}]
}}}
That works fine if @type is a string, but does not work if @type is an array. How can I search for multiple types?
This worked, but I'm not happy with it:
filter: {
or: [
{ type: { value: 'blog'} },
{ type: { value: 'category'} },
{ type: { value: 'miscellaneous'} }
]
}
I'd love to accept a better answer.
You can easily specify multiple types in your search request's URL, e.g. http://localhost:9200/twitter/tweet,user/_search, or with type in the header if using _msearch, as documented here.
These are then added as filters for you by Elasticsearch.
Also, you usually want to use bool to combine filters, for the reasons described in this article: all about elasticsearch filter bitsets
This worked for me:
Within the filter parameter, wrap multiple type queries as should clauses of a bool query, e.g.:
{
"query": {
"bool": {
"must": {
"term": { "foo": "bar" }
},
"filter": {
"bool": {
"should": [
{ "type": { "value": "myType" } },
{ "type": { "value": "myOtherType" } }
]
}
}
}
}
}
Suggested by dadoonet in the Elasticsearch github issue Support multiple types in Type Query
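Since the original question was about a value that may be either a string or an array, the should clauses can be generated rather than written out by hand. This is a minimal sketch of that idea; buildTypeFilter is a hypothetical name, and it assumes an Elasticsearch version that still supports the type query.

```javascript
// Hypothetical helper: accepts one type name or an array of type names
// and returns a bool filter with one type clause per name.
function buildTypeFilter(types) {
  const list = Array.isArray(types) ? types : [types];
  return {
    bool: {
      // With no must or filter clauses, at least one should clause has to match.
      should: list.map((name) => ({ type: { value: name } }))
    }
  };
}

console.log(JSON.stringify(buildTypeFilter("blog")));
console.log(JSON.stringify(buildTypeFilter(["blog", "category", "miscellaneous"])));
```

The returned object can be placed under the filter key of the bool query shown above.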
