Kibana (Elasticsearch visualization) - how to add a plot based on a substring of a field?


I have a field in my logs called json_path containing data like /nfs/abc/123/subdir/blah.json, and I want to create a count plot on one part of that string: abc here, i.e. the third chunk when splitting on the / token. I have tried all sorts of online answers, but they are all partial (nothing I can easily understand how to use or integrate). I tried running POST/GET queries in the Console, which all failed with syntax errors I couldn't debug (they complained about newline control characters, when there were none that I could see, even in a text editor that explicitly shows control characters). I also tried Management -> Index Patterns -> Scripted Field, but after adding my code there, basically the whole of Kibana crashed (stopped working temporarily) until I removed that scripted field.
All this Elasticsearch and Kibana stuff is annoyingly difficult; all the docs expect you to be an expert in their tool, rather than just an engineer needing to visualize some data.
I don't really want to add a new data field in my log-generation code, because then all my old logs will be unsupported (they have the relevant data, it just needs that bit of string processing before data viz). I know I could probably back-annotate the old logs, but the whole Kibana/Elasticsearch experience is just frustrating and I don't use it enough to justify learning such detailed procedures (I actually learned a bunch of this stuff a year ago, and then promptly forgot it due to lack of use).

You cannot plot on a substring of a field unless you extract that substring into a new field. I can understand the frustration of learning a new product, but to achieve what you want you need to have that substring value in a field of its own. Scripted fields are generally used to modify a field. To extract a substring from a field, I'd recommend using an ingest node processor such as the grok processor. This will add a new field which you can use in Kibana visualizations.
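As a rough sketch of the grok approach described above, the following Python builds an ingest pipeline body and simulates what the pattern would capture. The target field name path_group and the pattern itself are assumptions made for illustration, not something from the original answer; the simulation uses Python's re module as a stand-in for grok, where DATA corresponds to a lazy ".*?".

```python
import json
import re

# Assumed grok pattern: skip the first path segment, capture the second one
# into a new field. "path_group" is a hypothetical field name.
GROK_PATTERN = "^/%{DATA}/%{DATA:path_group}/"

# Body of an ingest pipeline with a single grok processor on json_path.
pipeline_body = {
    "description": "Extract the second path segment of json_path",
    "processors": [
        {"grok": {"field": "json_path", "patterns": [GROK_PATTERN]}}
    ],
}


def simulate_grok(json_path):
    """Approximate the capture locally by translating DATA to a lazy regex."""
    regex = GROK_PATTERN.replace("%{DATA:path_group}", "(.*?)")
    regex = regex.replace("%{DATA}", ".*?")
    match = re.search(regex, json_path)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(json.dumps(pipeline_body, indent=2))
    print(simulate_grok("/nfs/abc/123/subdir/blah.json"))
```

The printed JSON is what would be sent as the body of a PUT to _ingest/pipeline/ followed by a pipeline id of your choosing. Documents that are already indexed are not changed by a new pipeline; they would need to be run through it, for example with _update_by_query and its pipeline parameter.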

Related

Show last state of boolean field from elastic in grafana's stat panel

I have an Elasticsearch index that gets a new document every X minutes. One of its fields is a boolean that indicates whether a server is up or down (true or false; and no, Prometheus can't help, because the logic behind it is complex and already implemented by another team).
I'm trying to create a simple Grafana stat panel which will be green or red based on the value of the most recent document returned by the query.
I tried many transformations and options with no success. I don't mind using value mapping or another solution, as long as I can get a red/green panel (I also tried this direction with no success).
The only way I managed to do it was with a logs table, using a group by transformation on the first timestamp and the is_up field, and showing only the is_up value using organize fields.
But this solution is ugly and bad, and for some reason I couldn't color the cell even after using value mapping (true->1, false->0).
EDIT: I managed to do it (and will post it as an answer if no one is able to improve on it). I used the logs metric, chose first as the calculation, and used the filter by name transform to show only the is_up field.
But now my problem is that I cannot color the fields based on the state. I tried using value mapping, which helped me before, but now for some reason it doesn't work at all.
EDIT2: It worked after changing the true/false values to 1/0; then everything behaved as I wanted in the panel. I'll post this solution as an answer after the bounty ends.

Posting my solution as an answer because this is the best way I managed to solve it and no one has answered yet, so I hope it helps other people in the future.
The solution:
Pick logs as your metric.
Use first as the calculation in Display options within Panel options.
Add a Filter by Name transformation, and only show the field you're interested in (is_up in my question).
If you want to color the panel correctly, you will have to change the way you create new documents in your index: instead of a boolean field, use an int with 0 as false and 1 as true. Then you will be able to use thresholds correctly.
If you still want to show true/false or up/down as the stat of the panel, use value mapping and change 1 to up/true and 0 to down/false.
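The boolean-to-int change in the document-creation step could look roughly like this. It is only a sketch: the function name and the @timestamp field are assumptions, and the value mapping dictionary simply mirrors the mapping configured in the panel.

```python
def build_status_doc(timestamp, is_up):
    """Build the document to index, storing is_up as 0/1 instead of a boolean."""
    return {
        "@timestamp": timestamp,      # assumed name of the time field
        "is_up": int(bool(is_up)),    # True -> 1, False -> 0
    }


# Mirror of the value mapping set up in the panel, for display only.
VALUE_MAPPING = {1: "up", 0: "down"}

if __name__ == "__main__":
    doc = build_status_doc("2024-01-01T00:00:00Z", True)
    print(doc, VALUE_MAPPING[doc["is_up"]])
```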

Partial Indexing of an XML file (Bleve)

I am evaluating a couple different libraries to see which one will best fit what I need.
Right now I am looking at Bleve, but I am happy to use any library.
I am looking to index full files except specific ones which are in XML format. For those I only want Bleve to index specific tags as most of the tags are worthless to search. I am trying to evaluate if this is possible but, being new to Bleve, I am not sure what part I need to customize.
The documentation is very good, but I can't seem to find this answer. All I need is an explanation with keywords and steps; no code is required. I just need a push, as I have spent hours spinning my wheels with Google searches and I am getting nowhere.

There are probably many ways to approach this. Here's one.
Bleve indexes documents which are collections of key/value metadata pairs.
In your case, a document could be represented by 2 key/value pairs: name of .xml file (to uniquely identify the document) and content of the file.
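// Doc is the unit submitted to Bleve for indexing.
type Doc struct {
	Name string // name of the .xml file, uniquely identifies the document
	Body string // content of the file
}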
The issue is that the body is XML, and Bleve doesn't support XML out of the box.
One way to address this is to pre-process the XML file, stripping the unwanted tags and content. You can do that with the encoding/xml package from the standard library.
For an example of a similar task you can see the code of https://github.com/blevesearch/fosdem-search/
There, they index a file in a custom format (https://github.com/blevesearch/fosdem-search/blob/master/fosdem.ical) by parsing it into a form they can submit to Bleve for indexing (https://github.com/blevesearch/fosdem-search/blob/master/ical.go).
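The pre-processing step described above could be sketched as follows. This is a Python illustration using xml.etree.ElementTree rather than Go's encoding/xml, purely to show the idea; the tag names and the sample document are made up. The resulting string is what would go into the Body field before indexing.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_source, wanted_tags):
    """Return the text of the wanted tags only, joined in document order."""
    root = ET.fromstring(xml_source)
    parts = []
    for elem in root.iter():
        # Keep only elements whose tag is on the allow-list and has text.
        if elem.tag in wanted_tags and elem.text:
            text = elem.text.strip()
            if text:
                parts.append(text)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical input; <id> is treated as a tag not worth searching.
    sample = (
        "<talk><title>Search in Go</title><id>42</id>"
        "<abstract>Indexing with Bleve</abstract></talk>"
    )
    print(extract_text(sample, {"title", "abstract"}))
```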

Elastic Greeklish to Greek conversion

I am new to Elastic and I am trying to find a way to convert Greeklish characters to Greek when a search executes.
e.g. the word "papoutsia" should be searched as "παπουτσια" (shoes)
While searching, I found the following plugins:
elasticsearch-analysis-greeklish
elasticsearch-skroutz-greekstemmer
I applied the filters to my index as in the example, but my queries still hit nothing.
Do I have to apply the filter in some way in every query, or write a special query?
Sorry if this question has a very large/broad answer.
I have been trying for a couple of days to figure out how the whole filtering thing works, to understand whether I am even heading in the right direction or have to find another way to solve this.

Unfortunately, the intention of the greeklish plugin / char filter is the inverse of what you want to achieve:
Using this filter, you can retrieve greek text from a document, using a query that is written in latin characters ("greeklish").
So, for your example, you can add a document with the text παπούτσια and retrieve it using the terms papoutsia, papoutsi, etc.
We have prepared a detailed text pipeline example in the repo's wiki for future reference.

ELK Dashboard showing wrong data/results

I'm new to the ELK Stack and trying to set up a dashboard to analyze my Apache access logs. Setting up the environment and displaying data from my log files all worked. But it seems like Kibana is mistakenly using spaces (and, in another dashboard, colons and minuses) as separators.
The first two screenshots show that the information inside my attribute "server_node" is correct.
Sadly, this one shows that every space is used as a separator. So instead of "Tomcat Website Prod 1" or "Tomcat Website Prod 2" as seen in server_node, there are too many entries, which distorts my graph.
This is my widget setting. As mentioned, I'm new to ELK and hence don't have much knowledge of how to set up good dashboards.
Does anyone have experience with setting up Kibana to analyze Apache access logs, and can you give me a hint on how to set up expressive dashboards or a sample dashboard to use as a model?
Thanks for your help and time and regards, Sebastian

The basic problem you are running into is that strings are analyzed by default -- which is what you want in a text search engine, but not what you want in an analytics type of situation. You need to set the field to not_analyzed before loading it in.
If you are using logstash 1.3.1 or later to load your data, you should be able to change your field to server_node.raw (see http://www.elasticsearch.org/blog/logstash-1-3-1-released/):
Most folks, in this situation, sit and scratch their heads, right? I know I did the first time. I’m pretty certain “docs” and “centralized” aren’t valid paths on the logstash.net website! The problem here is that the pie chart is built from a terms facet. With the default text analyzer in elasticsearch, a path like “/docs/1.3.1/filters/” becomes 3 terms {docs, 1.3.1, filters}, so when we ask for a terms facet, we only get individual terms back!
Index templates to the rescue! The logstash index template we provide adds a “.raw” field to every field you index. These “.raw” fields are set by logstash as “not_analyzed” so that no analysis or tokenization takes place – our original value is used as-is! If we update our pie chart above to instead use the “request.raw” field, we get the following:
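To make the effect described above concrete, here is a small Python simulation of a terms count over analyzed versus not_analyzed values. The tokenizer is only a rough stand-in for the default analyzer (it lowercases and splits on whitespace, slashes, colons and hyphens), and the mapping fragment at the end assumes the legacy pre-5.x string type that the answer refers to.

```python
import re
from collections import Counter


def analyzed_terms(value):
    """Rough approximation of default analysis: lowercase and split into terms."""
    return {t for t in re.split(r"[\s/:\-]+", value.lower()) if t}


def terms_counts(values, analyzed):
    """Count documents per term, as a terms facet/aggregation would."""
    counts = Counter()
    for value in values:
        counts.update(analyzed_terms(value) if analyzed else {value})
    return counts


# Legacy mapping fragment that keeps server_node as a single term.
not_analyzed_mapping = {
    "properties": {
        "server_node": {"type": "string", "index": "not_analyzed"}
    }
}

if __name__ == "__main__":
    nodes = ["Tomcat Website Prod 1", "Tomcat Website Prod 2", "Tomcat Website Prod 1"]
    print(terms_counts(nodes, analyzed=True))   # one bucket per word
    print(terms_counts(nodes, analyzed=False))  # one bucket per server_node value
```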

CouchDB, all_docs and filter design documents with endkey

First, this question (filtering design documents out of _all_docs) already seemed to be solved as described here:
https://plus.google.com/+JasonDeRose/posts/1iP5tu3wVqw
/mydb/_all_docs?endkey=%22_%22
and it worked at first. However, in a different setup (actually just a different deployment), the query suddenly returns only an empty collection []. It seems the ordering changed: without endkey="_" the full collection is returned (including design documents). I tried various combinations of endkey/startkey but cannot manage to filter out the design documents again.
In the end I added a filter and switched to _changes?include_docs=true to load the initial documents. I also thought about defining a view, but I don't like that this results in data replication and some inconveniences with the changes feed (needed in another context). The filter, on the other hand, will be executed for every document.
Is it a bug that endkey=%22_%22 doesn't work anymore, and is there a more convenient way that still works?

/_all_docs is a special case for CouchDB. Instead of the normal Unicode Collation, it uses ASCII collation.
The '_' character in ASCII order shows up between uppercase letters and lowercase letters. So if your doc id starts with lowercase letters (default behaviour), they will show up after any design docs. If your doc ids start with uppercase letters, they will show up before design docs.
Try creating a document with an id of "ABC". You will see it show up before the design doc, and your trick to filter design docs would work in this case.
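The ordering described above can be checked with a few lines of Python, whose default string comparison is by code point and therefore matches ASCII order for these ids. The document ids are made up, and all_docs_with_endkey is only a local model of the endkey filter, not a CouchDB call.

```python
def all_docs_with_endkey(doc_ids, endkey):
    """Model of _all_docs?endkey=...: ids in ASCII order, up to and including endkey."""
    return [doc_id for doc_id in sorted(doc_ids) if doc_id <= endkey]


if __name__ == "__main__":
    ids = ["abc", "_design/app", "ABC", "xyz"]
    # '_' (0x5F) sorts after 'A'-'Z' and before 'a'-'z'.
    print(sorted(ids))
    # Only the uppercase id sorts before "_", so it is the only one returned.
    print(all_docs_with_endkey(ids, "_"))
    # With lowercase ids only, nothing sorts before "_": the empty result from the question.
    print(all_docs_with_endkey(["abc", "_design/app", "xyz"], "_"))
```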
However, I recommend you stop using the _all_docs view altogether. Instead, use the normal view functionality. When you create a view, CouchDB automatically skips design docs for you. So if your view looked like:
function (doc) {
  // Emit every document under its own id; no value is needed.
  emit(doc._id, null);
}
You could query this with no start or end key, and get all docs without design docs.
Also, please look at Unicode Collation order. This is the order all your other views will be in, and it's important to understand as you work with CouchDB. You can read all about it here:
http://docs.couchdb.org/en/stable/ddocs/views/collation.html
