ELK Dashboard showing wrong data/results - elasticsearch

I'm new to the ELK Stack and trying to set up a dashboard to analyze my Apache access logs. Setting up the environment and displaying data from my logfiles all worked. But it seems like Kibana is mistakenly using spaces (and, in another dashboard, colons and minuses) as separators.
The first two screenshots show that the information inside my attribute "server_node" is correct.
Sadly, this one shows that every space is used as a separator. So instead of "Tomcat Website Prod 1" or "Tomcat Website Prod 2", as seen in server_node, there are far too many entries, which falsifies my graph.
This is my widget setting. As mentioned, I'm new to ELK and hence don't have much knowledge about setting up good dashboards.
Does any of you have experience with setting up Kibana to analyze Apache access logs and can give me a hint on how to build expressive dashboards, or a sample dashboard to use as a model?
Thanks for your help and time and regards, Sebastian

The basic problem you are running into is that strings are analyzed by default -- which is what you want in a text search engine, but not what you want in an analytics situation. You need to set the field to not_analyzed before loading the data in.
If you are using logstash 1.3.1 or later to load your data, you should be able to change your field to server_node.raw (see http://www.elasticsearch.org/blog/logstash-1-3-1-released/):
Most folks, in this situation, sit and scratch their heads, right? I know I did the first time. I’m pretty certain “docs” and “centralized” aren’t valid paths on the logstash.net website! The problem here is that the pie chart is built from a terms facet. With the default text analyzer in elasticsearch, a path like “/docs/1.3.1/filters/” becomes 3 terms {docs, 1.3.1, filters}, so when we ask for a terms facet, we only get individual terms back!
Index templates to the rescue! The logstash index template we provide adds a “.raw” field to every field you index. These “.raw” fields are set by logstash as “not_analyzed” so that no analysis or tokenization takes place – our original value is used as-is! If we update our pie chart above to instead use the “request.raw” field, we get the following:
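To make the effect concrete, here is a simplified, hand-written sketch of such a template in the pre-2.x mapping syntax the blog post refers to (the real logstash template uses a dynamic template to add a ".raw" sub-field to every string field automatically; on Elasticsearch 5+ you would use a keyword sub-field instead of "index": "not_analyzed"):

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "server_node": {
          "type": "string",
          "fields": {
            "raw": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
```

With this in place, server_node stays full-text searchable, while server_node.raw keeps "Tomcat Website Prod 1" as one unbroken term for terms facets and aggregations.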


Show last state of boolean field from elastic in grafana's stat panel

I have an Elastic index that gets a new document every X minutes; one of its fields is a boolean that indicates whether a server is up or down (true or false; and no, Prometheus can't help, because the logic behind it is complex and already implemented by another team).
I'm trying to create a simple Grafana stat panel that will be green or red based on the value of the most recent document returned by the query.
I tried many transformations and options with no success. I don't mind using value mapping or another solution, as long as I can get a red/green panel (I also tried that direction with no success).
The only way I managed to do it is with a logs table, using a group-by transformation on the first timestamp and the is_up field, and showing only the is_up value via Organize fields.
But this solution is ugly and bad, and for some reason I couldn't color the cell even after using value mapping (true -> 1, false -> 0).
EDIT: I managed to do it (and will post it as an answer if no one improves on it): I used the Logs metric, chose First as the calculation, and used a Filter by name transformation to show only the is_up field.
But now my problem is that I cannot color the field based on the state. I tried using value mapping, which helped me before, but now for some reason it doesn't work at all.
EDIT 2: It worked after changing the true/false values to 1/0; then everything behaved as I wanted in the panel. I'll post this solution as an answer after the bounty ends.
Posting my solution as an answer because this is the best way I managed to solve it, and no one else has answered yet, so I hope it helps other people in the future:
The solution:
Pick logs as your metric
Use first as the calculation in Display options within Panel options
Add Filter by Name transformation, and only show the field you're interested in (is_up in my question)
If you want to color the panel correctly, you will have to change the way you create new documents in your index: instead of using a boolean field, change it to an int and use 0 as false and 1 as true; then you will be able to use thresholds correctly.
If you still want to show true/false or up/down as the stat of the panel, use value mapping and map 1 to up/true and 0 to down/false.
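For the coloring and mapping steps, this is roughly what ends up in the panel JSON model in recent Grafana versions (a sketch only; the exact keys vary by Grafana version, and the UP/DOWN labels are just examples):

```json
{
  "type": "stat",
  "fieldConfig": {
    "defaults": {
      "mappings": [
        {
          "type": "value",
          "options": {
            "0": { "text": "DOWN", "color": "red", "index": 0 },
            "1": { "text": "UP", "color": "green", "index": 1 }
          }
        }
      ],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          { "color": "red", "value": null },
          { "color": "green", "value": 1 }
        ]
      }
    }
  }
}
```

The value mappings relabel 1/0 as UP/DOWN, while the thresholds drive the panel background color from the same numeric value.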

Kibana (elasticsearch visualization) - how add plot based on sub-string of field?

I have a field in my logs called json_path containing data like /nfs/abc/123/subdir/blah.json, and I want to create a count plot on part of that string (the abc here, i.e. the third chunk when splitting on the / token). I have tried all sorts of online answers, but they're all partial (nothing I could easily understand how to use or integrate). I've tried running POST/GET queries in the Console, which all failed with syntax errors I couldn't manage to debug (they complained about newline control characters when there were none I could see, even in a text editor that explicitly shows control characters). I also tried Management -> Index Patterns -> Scripted Field, but after adding my code there, Kibana basically crashed (stopped working temporarily) until I removed that scripted field.
All this Elasticsearch and Kibana stuff is annoyingly difficult; the docs all expect you to be an expert in the tool, rather than just an engineer needing to visualize some data.
I don't really want to add a new data field in my log-generation code, because then all my old logs would be unsupported (they have the relevant data, it just needs that bit of string processing before visualization). I know I could probably back-annotate the old logs, but the whole Kibana/Elasticsearch experience is just frustrating, and I don't use it enough to justify learning such detailed procedures (I actually learned a bunch of this stuff a year ago, and then promptly forgot it through lack of use).
You cannot plot on a substring of a field unless you extract that substring into a new field. I can understand the frustration of learning a new product, but to achieve what you want, you need that substring value in a new field. Scripted fields are generally used to modify a field. To extract a substring from a field, I'd recommend using an ingest node processor such as the grok processor. This will add a new field which you can then use in Kibana visualizations.
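As a concrete sketch of that suggestion (the json_path field name is from the question; the json_path_part target field and the pattern are illustrative), an ingest pipeline created with PUT _ingest/pipeline/extract_json_path_part could look like:

```json
{
  "description": "Copy the abc-style segment of json_path into its own field",
  "processors": [
    {
      "grok": {
        "field": "json_path",
        "patterns": ["^/%{DATA}/%{DATA:json_path_part}/%{GREEDYDATA}"]
      }
    }
  ]
}
```

Documents indexed through this pipeline get a json_path_part field (abc for /nfs/abc/123/subdir/blah.json) that Kibana can aggregate on directly; old documents would need a reindex through the same pipeline.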

I have an AWS Elasticsearch instance that I want to change the delimiter used when tokenizing

I'm currently using Jest to communicate with an AWS Elasticsearch instance running Elasticsearch 5.3. One of the fields is a URL, but I don't think a single period without following whitespace is treated as a delimiter by default when Elasticsearch tokenizes. As a result, I can't find "www.google.com" by searching for "google", for example.
I'd really like to be able to add a single period to the delimiter pattern. I've seen documentation on Elasticsearch's website about how to alter the delimiter when using Elasticsearch natively, but I haven't seen anyone change it through Jest. Is this possible, and if so, how would I go about doing so?
I'd like to configure it using some client in a Java application if possible.
I believe a pattern tokenizer could help. See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-analyzer.html
Or a char filter where you replace the dot with a space? See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-mapping-charfilter.html
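As a sketch of the char filter approach (the index analysis settings below use hypothetical names; with Jest you would send this JSON as the settings body of a CreateIndex request):

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "dot_to_space": {
          "type": "mapping",
          "mappings": [". => \u0020"]
        }
      },
      "analyzer": {
        "url_analyzer": {
          "type": "custom",
          "char_filter": ["dot_to_space"],
          "tokenizer": "standard"
        }
      }
    }
  }
}
```

Any field mapped with url_analyzer then tokenizes www.google.com as www, google, com, so a search for "google" matches. Note this also splits periods in ordinary text, so it is best applied only to the URL field.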

Skip common/duplicate parts while indexing web pages with ElasticSearch

I don't have any experience with Elasticsearch yet, but from what I've read, I think it suits most of my needs. I have a web scraper which scrapes pages of certain domains.
I want to feed these pages into the search engine and offer a front-end interface to search the scraped content. I'm building a sort of vertical search engine.
But as we all know, web pages of one host often only contain a little bit of unique content, a great part of the pages are common. Footer, header, menu etc. are the same on every page.
Does Elasticsearch have some built-in intelligence that can filter out the common parts and search only the real content?
It's not terribly difficult to pump web content into Elastic, so I'll assume you have that down. =)
I think this article is fantastic for understanding how to index/search web pages:
http://blog.urx.com/urx-blog/2014/9/4/the-science-of-crawl-part-1-deduplication-of-web-content
It's a complex problem and they have some great detail. There is nothing I know of natively in Elastic that has intelligence to help you eliminate duplicates etc.
The strategy to adopt here is to create a unique key per document. Taking a checksum with SHA-1 or a similar algorithm will do the job of producing that unique key. Make it the document ID so that only one copy of a page exists at any point in time. Use the _create API for indexing if you don't want new duplicates to be indexed (more efficient); if you want the newest version to become the document, use normal indexing.
If you need to modify the original document when a duplicate is discovered, use an upsert.
I have explained a great deal of this in this blog.
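A minimal sketch of the checksum-as-ID idea in Python (the doc_id_for helper is my own name; the commented client call assumes the official Elasticsearch Python client and a running cluster):

```python
import hashlib

def doc_id_for(content: str) -> str:
    """Derive a stable document ID from the page content, so identical
    pages always map to the same ID and can never be indexed twice."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()

page = "<html><body>unique article text</body></html>"

# Re-scraping the same page yields the same ID every time.
assert doc_id_for(page) == doc_id_for(page)

# With the official client (illustrative; needs a running cluster):
#   es.create(index="pages", id=doc_id_for(page), document={"body": page})
# _create fails with a 409 conflict if the ID already exists, whereas
# es.index(...) with the same ID overwrites it (the "normal indexing" case).
```

Making the ID a pure function of the content is what lets _create reject duplicates for free, with no extra lookup before indexing.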

Multiple synonym(s) for search terms

I need more than one synonym for a search term in Magento (version 1.4.2.0, which I can't upgrade for now), but all my attempts to add multiple synonyms have failed.
I've been looking around without finding a solution. Have any of you had a similar need and managed to solve it?
Thanks for any help,
Mat.
So you have people looking for 'doodad' or 'dodad' and you want to show them the 'macguffin' instead.
So far you have tried to add these search terms in on the back-end but it has not worked.
The fix-workaround is surprisingly simple.
Type in 'dodad' in the frontend - no result given.
Now type 'doodad' in the frontend - again no results.
Now go into the backend and go to the last page of the search terms.
The entries for 'dodad' and 'doodad' will be in there. You can now put 'macguffin' in the synonym box.
Now go to the front and type in 'dodad' or 'doodad' into the search box and it will take you straight to the 'macguffin' item.
