ArangoDB indexes and slow query performance

I'm using the latest version of ArangoDB, 3.1.25.
I'm trying to insert documents, but it is taking forever.
I thought of creating an index by running:
db.col1.ensureIndex({ type: "hash", fields: [ "Entry" ] })
Yet I'm getting an error saying:
Query: syntax error, unexpected identifier near 'db.Protein_H.ensureIndex({ type:...' at position 1:1 (while parsing)
What is wrong?
Why is it taking so much time to load?
How do I fix this?
Thank you.
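For what it's worth, ensureIndex is part of the arangosh JavaScript API rather than AQL, which is why the AQL parser reports a syntax error at position 1:1 when the command is pasted into the query editor. As a hedged sketch (the python-arango driver, connection details, and credentials below are assumptions, not from the question), the same hash index could be created from Python like this:

from arango import ArangoClient  # assumed driver: python-arango

# Placeholder connection details.
client = ArangoClient(hosts="http://localhost:8529")
db = client.db("_system", username="root", password="")

# Equivalent of the arangosh call from the question:
#   db.col1.ensureIndex({ type: "hash", fields: [ "Entry" ] })
db.collection("col1").add_hash_index(fields=["Entry"])

Note that a hash index speeds up lookups on Entry; it will not by itself make inserts faster, since every index adds a little write overhead.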

Related

Elasticsearch search_phase_execution_exception error

I don't understand the error I get when scrolling through the Elasticsearch API. I sometimes get an error when running an Airflow DAG and sometimes I don't, which confuses me; I would like to understand the error.
This is the error I get:
elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [605457312]')
data = es.search(index="xyz/xyz", body=match, scroll='1m', request_timeout=30)
where match is just filtering over a range and the size is set to 50.
Any comments would be highly appreciated.
Assuming that you are trying to iterate over the data retrieved by the es.search method, I suppose the scroll timeframe is not enough to get through all the results.
scroll='1m' means Elasticsearch will discard the search context if it is not used again within one minute of the previous request. So my first suggestion is to increase the scroll value.
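As a minimal sketch of the full pattern, assuming the elasticsearch-py client (the range field and per-document handling below are hypothetical stand-ins for the question's details): each es.scroll call renews the context for another window, so the scroll value only has to outlive the processing of a single page.

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Hypothetical stand-in for the question's range filter.
match = {"query": {"range": {"timestamp": {"gte": "now-1d"}}}}

# Index name kept as in the question.
resp = es.search(index="xyz/xyz", body=match, scroll="5m", size=50, request_timeout=30)
scroll_id = resp["_scroll_id"]
hits = resp["hits"]["hits"]

while hits:
    for hit in hits:  # stand-in for real per-document processing
        print(hit["_source"])
    # Each scroll call renews the search context for another 5 minutes.
    resp = es.scroll(scroll_id=scroll_id, scroll="5m")
    scroll_id = resp["_scroll_id"]
    hits = resp["hits"]["hits"]

es.clear_scroll(scroll_id=scroll_id)  # free the search context when done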

How to aggregate between multiple indices in Elasticsearch

I was wondering if anyone could help me with my problem:
I have a template, and a number of indices are created from it, i.e.
cdr_xyz_1234
cdr_xyz_5689
cdr_xyz_9876
I run a search query on all the indices and it works fine
GET cdr_xyz_*/_search
but if I run an aggregate search query
GET cdr_xyz_*/_serach
I get "request [/cdr_xyz_*/_serach/] contains unrecognized parameter: [expand_wildcards]"
I even created a user with a role that has all privileges on "cdr_xyz_*", but I still get the same error.
Could you please tell me how to resolve this issue?
Many thanks
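For what it's worth, aggregations are served by the same _search endpoint as the working query above; the quoted error shows the request going to _serach, so the spelling is worth double-checking. As a hedged sketch (assuming the elasticsearch-py client; the aggregation itself is hypothetical), an aggregation across the wildcard pattern could look like:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Hypothetical aggregation: document counts per concrete index behind the wildcard.
resp = es.search(
    index="cdr_xyz_*",
    body={
        "size": 0,  # aggregation results only, no hits
        "aggs": {"per_index": {"terms": {"field": "_index"}}},
    },
)
for bucket in resp["aggregations"]["per_index"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])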

Indices with nested property in both Kibana visualization and index queries

So I have the following problem, which I've been trying to solve for the last two days. I have a Python script which parses logs and inserts data into Elasticsearch, dynamically creating indices via the bulk function.
The problem is that my mapping has one "type": "nested" property, something like a "users" field. When I add only "type": "nested" to this property, I can't query those objects from Kibana or create any visualization (because nested objects are indexed as separate documents, if I'm not mistaken).
The first thing I tried was adding an additional "include_in_parent": true parameter to the users field, but as a result I got "wrong" queries: running something like +users.name: 'test' +users.age: 30 would match ANY document containing those two values somewhere among its users, rather than requiring ONE user object with both. The visualizations were obviously wrong too.
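To make that trade-off concrete, here is a hedged sketch of such a mapping (elasticsearch-py client assumed; the index name and field details are illustrative, and older Elasticsearch versions would nest this under a type name). With include_in_parent: true, the nested fields are also flattened into the parent document, which is exactly what allows the cross-object matches described above:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Illustrative index name and fields; typeless mapping as in recent versions.
es.indices.create(
    index="logs",
    body={
        "mappings": {
            "properties": {
                "users": {
                    "type": "nested",
                    # Also copies users.* into the flat parent document,
                    # which is what permits cross-object matches.
                    "include_in_parent": True,
                    "properties": {
                        "name": {"type": "keyword"},
                        "age": {"type": "integer"},
                    },
                }
            }
        }
    },
)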
The second solution I found was adding a parent-child relationship. But this could potentially be a waste of time, as I don't know whether it will result in correct queries. So I'm asking: would that be a reasonable solution to my problem?
I found out that Kibana doesn't support nested objects.
But ppadovani made this fork, which adds that feature:
https://github.com/homeaway/kibana/tree/nestedSupport-4.5.4

How to do a 'getNearest' geospatial query in RethinkDB?

I was going through the RethinkDB docs and found out about geospatial queries. So I thought, why not give it a try and build an Uber-style database to get Drivers near a User.
Here is how I approached it:
Creating the Customer and Driver
r.table('customer').insert({
  name: "John",
  currentLocation: [77.627108, 12.927923]
})
r.table('driver').insert({
  name: "Carl",
  currentLocation: [77.612319, 12.934784]
})
Creating a geospatial index on the driver table, as Customers will be searching for the Drivers nearest to them.
r.table('driver').indexCreate('currentLocation', {geo: true})
According to their docs, we can find the nearest point using the getNearest API:
r.table('driver').getNearest(r.point(77.627108, 12.927923),
  {index: 'currentLocation', maxDist: 2000, unit: 'm'}
)
(r.point(77.627108, 12.927923) is the Customer's location. Right now I am not concerned with querying the customer table and turning that into a ReQL geometry object.)
Theoretically, the above query should work, but it doesn't: it returns an empty array. Am I missing something?
ANSWER
Just found it: I had missed this important line in the docs: "A geospatial index field should contain only geometry objects. It will work with geometry ReQL terms (getIntersecting and getNearest) as well as index-specific terms (indexStatus, indexWait, indexDrop and indexList)."
All the queries are fine, except that I have to make a small modification to the driver insert:
r.table('driver').insert({
  name: "Carl",
  currentLocation: r.point(77.612319, 12.934784)
})
The currentLocation attribute needs to hold a geometry object, not an array. After making this change, everything worked fine.
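If documents with plain arrays had already been inserted, they could also be rewritten in place. A hedged sketch using the RethinkDB Python driver (the post itself uses the JavaScript API; the connection details are placeholders):

from rethinkdb import RethinkDB  # assumed: the rethinkdb Python driver

r = RethinkDB()
conn = r.connect("localhost", 28015)

# Rewrite [longitude, latitude] arrays as geometry objects so the
# geospatial index can use them; assumes rows still hold plain arrays.
r.table("driver").update(
    lambda row: {
        "currentLocation": r.point(
            row["currentLocation"][0], row["currentLocation"][1]
        )
    }
).run(conn)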

ELK Type Conversion - Not a number but a string

I'm trying to set up an ELK dashboard to see some numbers like total bytes, average load time, etc. I'm forcing some conversions in Logstash to make sure these fields aren't strings:
convert => [ "bytes", "integer" ]
convert => [ "seconds", "float" ]
convert => [ "milliseconds", "integer" ]
Those Logstash conversions are working. See this excerpt from my logstash.log: the status code is a string, while bytes, seconds and milliseconds are numbers.
"http_statuscode" => "200",
"bytes" => 2731,
"seconds" => 0.0,
"milliseconds" => 9059,
But when I try to build my dashboard with avg, min, max and total bytes, for instance, Elasticsearch logs this:
Facet [stats]: field [bytes] isn't a number field, but a string
Am I missing some kind of conversion or something? Has anybody experienced this behavior before?
Thanks guys and regards, Sebastian
One possible issue is that the mapping for fields in an index is set when the first document is inserted in the index. Changing the mapping will not update any old documents in the index, nor affect any new documents that are inserted into that index.
If you're in development, the easiest thing is to drop the index (thus deleting your earlier data). Any new documents would then use your new mapping.
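A minimal sketch of that cleanup, assuming the elasticsearch-py client and a hypothetical daily Logstash index name:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Hypothetical daily index; deleting it discards that day's data, and the
# next document indexed will establish the new mapping in a fresh index.
es.indices.delete(index="logstash-2015.01.29")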
If you can't toss the old data, you can wait for tomorrow, when Logstash will create a new daily index that uses the new mapping.
If necessary, you can also rebuild the index, but I've always felt it to be a pain.
One other possibility is that you have the same field name with different mappings in different types in the same index. [ repeat that a few times and it will make sense ]. Field [foo] must have the same mapping definition in each type of the same index.
I recently solved this problem (I mean using bytes or request time as numbers in Kibana; I use v4 beta 3, and you?). The following three points might help you:
How do you parse your logs? Using a grok filter? If yes, you can try matching your logs with a pattern like %{INT:bytes:int} instead of using the convert filter.
Did you "reload field list" (yellow button) in Kibana 4 (settings->indices) after you've done your changes ?
If you have old indexes in your ES cluster, did you correctly remove them? If not, you might have conflicts between old mappings and new ones.
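As a hedged sketch of checking that last point (elasticsearch-py client assumed), you can ask every matching index how it actually mapped the field:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Any index that still reports "string" for "bytes" here is the one
# breaking the numeric stats in Kibana.
resp = es.indices.get_field_mapping(fields="bytes", index="logstash-*")
for index_name, field_mapping in resp.items():
    print(index_name, field_mapping)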
Hope it will help.
