Scale y-Axis in Kibana 4.1.0

In Kibana 4.1.0, is there some way to scale the Y-axis? I have an average metric on a field that is in seconds, but I want it to be shown in hours; for example, 25,000 seconds should be shown as 25000/3600, or something like that.
Kibana 4 provides a JSON Input feature whose content should be merged with the existing configuration, but I cannot make it work. I saw that this script should work:
{'script':'(_value)/3600'}
but it doesn't; it throws an error:
Visualize: Request to Elasticsearch failed: {"error":"SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[leNBGA9VRmuUiPaMidqeVw][logstash-2014.05.20][0]: SearchPar...
Any ideas?

I found a solution for this here:
https://www.elastic.co/guide/en/elasticsearch/reference/current//modules-scripting.html
You have to add this to your elasticsearch.yml:
script.inline: on
script.indexed: on
and use this: {'script':'(_value)/3600'}
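To make the arithmetic behind the script concrete, here is a minimal Python sketch of the same conversion. The helper name seconds_to_hours is hypothetical, and building the JSON input with double quotes is an assumption about what a strict JSON parser expects; the answer above uses single quotes.

```python
import json

SECONDS_PER_HOUR = 3600

def seconds_to_hours(value_seconds):
    """Mirror the (_value)/3600 script: convert a value in seconds to hours."""
    return value_seconds / SECONDS_PER_HOUR

# JSON input carrying the same script expression (illustrative construction only).
json_input = json.dumps({"script": "(_value)/3600"})

print(round(seconds_to_hours(25000), 2))
print(json_input)
```

With this conversion, the 25,000 seconds from the question come out at a little under 7 hours.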

Related

Wazuh - Filebeat - Elasticsearch non-zero metrics

Could you please help me solve this Filebeat error?
This is on the Wazuh manager server. Everything is working: I can connect to the Kibana web UI, open the Wazuh app, and see my three Wazuh agents connected and active.
I want FIM monitoring. If I change a file on an agent server, an alert is created and I can see that alert in alert.log on the manager server. The issue is that Filebeat won't send this alert to Elasticsearch, so I can't see it in the Kibana web UI.
Wazuh manager:
Wazuh 4.2.5
Filebeat 7.14.2
Elasticsearch 7.14.2
Kibana 7.14.2
Wazuh alert log - /var/ossec/logs/alerts/2022/Feb/ and /var/ossec/logs/alerts
systemctl status filebeat shows it as active, but I can see lines like this there:
WARN [elasticsearch] elasticsearch/client.go:405 Cannot>
This is the output from filebeat -e:
2022-02-03T12:46:20.386+0100 INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"id":"session-248447.scope","mem":{"limit":{"bytes":9223372036854771712},"usage":{"bytes":622415872}}}},"cpu":{"system":{"ticks":70,"time":{"ms":72}},"total":{"ticks":300,"time":{"ms":311},"value":300},"user":{"ticks":230,"time":{"ms":239}}},"handles":{"limit":{"hard":262144,"soft":1024},"open":9},"info":{"ephemeral_id":"641d7fdd-47a0-4b10-bda9-36f29c29fdef","uptime":{"ms":98413},"version":"7.14.2"},"memstats":{"gc_next":18917616,"memory_alloc":14197072,"memory_sys":75383816,"memory_total":71337840,"rss":115638272},"runtime":{"goroutines":11}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":2,"starts":2},"reloads":1,"scans":1},"output":{"events":{"active":0},"type":"elasticsearch"},"
And here is the error found in /var/log/messages:
Feb 3 10:27:54 filebeat[2531915]: 2022-02-03T10:27:54.707+0100#011WARN#011[elasticsearch]#011elasticsearch/client.go:405#011Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xc07705e669760167, ext:958857091513, loc:(*time.Location)(0x5620964fb2a0)}, Meta:{"pipeline":"filebeat-7.14.0-wazuh-alerts-pipeline"}, Fields:{"agent":{"ephemeral_id":"33cb9baa-af71-4b44-99a6-1379c747722f","hostname":"xlc","id":"03fb57ca-9940-4886-9e6e-a3b3e635cd35","name":"xlc","type":"filebeat","version":"7.14.0"},"ecs":{"version":"1.10.0"},"event":{"dataset":"wazuh.alerts","module":"wazuh"},"fields":{"index_prefix":"wazuh-monitoring-"},"fileset":{"name":"alerts"},"host":{"name":"xlc"},"input":{"type":"log"},"log":{"file":{"path":"/var/ossec/logs/alerts/alerts.json"},"offset":122695554},"message":"{\"timestamp\":\"2022-02-03T10:27:52.438+0100\",\"rule\":{\"level\":5,\"description\":\"Registry Value Integrity Checksum Changed\",\"id\":\"750\",\"mitre\":{\"id\":[\"T1492\"],\"tactic\":[\"Impact\"],\"technique\":[\"Stored Data Manipulation\"]},\"firedtimes\":7,\"mail\":false,\"groups\":[\"ossec\",\"syscheck\",\"syscheck_entry_modified\",\"syscheck_registry\"],\"pci_dss\":[\"11.5\"],\"gpg13\":[\"4.13\"],\"gdpr\":[\"II_5.1.f\"],\"hipaa\":[\"164.312.c.1\",\"164.312.c.2\"],\"nist_800_53\":[\"SI.7\"],\"tsc\":[\"PI1.4\",\"PI1.5\",\"CC6.1\",\"CC6.8\",\"CC7.2\",\"CC7.3\"]},\"agent\":{\"id\":\"006\",\"name\":\"CPP\",\"ip\":\"10.74.37.3\"},\"manager\":{\"name\":\"xlc\"},\"id\":\"1643880472.68132386\",\"full_log\":\"Registry Value '[x32] HKEY_LOCAL_MACHINE\\\\System\\\\CurrentControlSet\\\\Services\\\\W32Time\\\\Config\\\\LastKnownGoodTime' modified\\nMode: scheduled\\nChanged attributes: md5,sha1,sha256\\nOld md5sum was: '5df5b1598b729d98734105148103abf2'\\nNew md5sum is : '361334bf60bdd83e30894c4f313d16ec'\\nOld sha1sum was: 'c233c8ccb56fbd363c44b51a9d51c7fa32512474'\\nNew sha1sum is : '7163cffa48f1a7c0bcb4a3ddff6278ae9a4895a6'\\nOld sha256sum was: 
'3aad3da22f2d53e8ac33c46c73f40c3e8f5db07188d166e24957d8a20b62b5f1'\\nNew sha256sum is : 'bee8072335d870a1624a541cb13ca5085ba85646a8417d4d894deff71c3f4a92'\\n\",\"syscheck\":{\"path\":\"HKEY_LOCAL_MACHINE\\\\System\\\\CurrentControlSet\\\\Services\\\\W32Time\\\\Config\",\"mode\":\"scheduled\",\"arch\":\"[x32]\",\"value_name\":\"LastKnownGoodTime\",\"size_after\":\"8\",\"md5_before\":\"5df5b1598b729d98734105148103abf2\",\"md5_after\":\"361334bf60bdd83e30894c4f313d16ec\",\"sha1_before\":\"c233c8ccb56fbd363c44b51a9d51c7fa32512474\",\"sha1_after\":\"7163cffa48f1a7c0bcb4a3ddff6278ae9a4895a6\",\"sha256_before\":\"3aad3da22f2d53e8ac33c46c73f40c3e8f5db07188d166e24957d8a20b62b5f1\",\"sha256_after\":\"bee8072335d870a1624a541cb13ca5085ba85646a8417d4d894deff71c3f4a92\",\"changed_attributes\":[\"md5\",\"sha1\",\"sha256\"],\"event\":\"modified\"},\"decoder\":{\"name\":\"syscheck_registry_value_modified\"},\"location\":\"syscheck\"}","service":{"type":"wazuh"}}, Private:file.State{Id:"native::1049-64776", PrevId:"", Finished:false, Fileinfo:(*os.fileStat)(0xc000fc9380), Source:"/var/ossec/logs/alerts/alerts.json", Offset:122697450, Timestamp:time.Time{wall:0xc07704f6d4cb3764, ext:510354422, loc:(*time.Location)(0x5620964fb2a0)}, TTL:-1, Type:"log", Meta:map[string]string(nil), FileStateOS:file.StateOS{Inode:0x419, Device:0xfd08}, IdentifierName:"native"}, TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"illegal_argument_exception","reason":"data_stream [<wazuh-monitoring-{2022.02.03||/d{yyyy.MM.dd|UTC}}>] must not contain the following characters [ , \", *, \\, <, |, ,, >, /, ?]"}
Could you please help with this? I tried Google, but with no success. Thank you.
Filebeat reads from alerts.json; you can check this file to see whether the alerts are being generated. Judging from the log you provided, it looks like Filebeat cannot send some logs to Elasticsearch (Cannot index event publisher.Event), but we would need more details about the complete error and the source logs causing it. The output of the command # journalctl -f -u filebeat would be useful for providing further assistance.
Based on previous experience, the problem could be that you have reached the maximum number of open shards; by default this limit is set to 1000. If this is the case, you will see an error like the following: {"type":"validation_exception","reason":"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"}
If that's the case, you can either reduce the number of shards or increase the limit to solve the situation right now. I'd recommend the first approach if you only have one Elasticsearch node; having 1000 shards is not healthy for such an environment.
To reduce the number of shards, check the shard setting in /etc/filebeat/wazuh-template.json and change it to "1", then restart Filebeat. These changes will affect indices from now on, but checking this guide can help you with cases like this one.
Also, you can try to remove old indices. I would first check which indices you have stored. I suppose some of them are related to statistics or other things, so I would first try to remove those before the actual data (wazuh-alerts-).
You can use:
GET /_cat/indices
Since indices are stored per day by default, you can remove, for instance, those older than one month, so that only one month of indices is kept.
To prevent this from happening in the future, you may try implementing an Index Management Policy after you solve the issue at hand.
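As a rough sketch of the "older than one month" selection described above, the following Python function picks index names by their trailing date. The index names and the yyyy.MM.dd suffix format are assumptions for illustration; in practice the names would come from the GET /_cat/indices output, and the function only selects candidates, it does not delete anything.

```python
from datetime import date, datetime, timedelta

def indices_older_than(index_names, today, max_age_days=30):
    """Return the names whose trailing yyyy.MM.dd date is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    old = []
    for name in index_names:
        suffix = name[-10:]  # e.g. "2022.02.03"
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # no date suffix: leave this index alone
        if day < cutoff:
            old.append(name)
    return old

# Hypothetical index names, for illustration only.
names = [
    "wazuh-alerts-4.x-2021.12.15",
    "wazuh-alerts-4.x-2022.02.01",
    "wazuh-monitoring-2021.11.30",
    ".kibana_1",
]
print(indices_older_than(names, today=date(2022, 2, 3)))
```

Indices without a date suffix are skipped on purpose, so system indices are never proposed for removal by this sketch.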

Logstash pipeline terminating before timeout duration of aggregation filter plugin

I have a Logstash pipeline that uses the elasticsearch input plugin and the aggregate filter plugin. The aggregate filter has a timeout of 15 seconds. Here is the filter plugin configuration:
aggregate {
  task_id => "%{[test.region.keyword]}"
  code => "
    map['total_count_per_region'] ||= 0;
    map['total_count_per_region'] += event.get('Count');
  "
  push_map_as_event_on_timeout => true
  timeout_task_id_field => "test.region.keyword"
  timeout => 15
  timeout_tags => ['_aggregatetimeout']
  timeout_code => "
    event.set('total_count_per_region', event.get('total_count_per_region'));
  "
}
My pipeline fetches documents from an index as input, and the filter performs an aggregation using the aggregate plugin as shown above. This should generate the aggregation result as an event after 15 seconds.
When I start my pipeline, it starts successfully without issues. However, before the aggregation events are generated (after 15 seconds), the pipeline is terminated, and hence I don't get those events in my output.
[2021-03-23T14:12:24,610][INFO ][logstash.javapipeline ][pTest] Starting pipeline {:pipeline_id=>"pTest", "pipeline.workers"=>3, "pipeline.batch.size"=>1500, "pipeline.batch.delay"=>600, "pipeline.max_inflight"=>4500, "pipeline.sources"=>["/etc/logstash/conf.d/test1.conf"], :thread=>"#<Thread:0x7eabfdb5 run>"}
[2021-03-23T14:12:24,689][INFO ][logstash.agent ] Pipelines running {:count=>2, :running_pipelines=>[:".monitoring-logstash", :pTest], :non_running_pipelines=>[ ]}
[2021-03-23T14:12:28,601][INFO ][logstash.javapipeline ][pTest] Pipeline terminated {"pipeline.id"=>"pTest"}
Please help me resolve this.
Assuming you are running 7.7 or 7.8 you could either upgrade to 7.9.1, or disable the java execution engine. This can be done in logstash.yml using pipeline.java_execution: false or on the command line using --java-execution false.
The aggregate timeout is triggered by a "periodic flusher", that runs every 5 seconds. When a pipeline shuts down a "final flusher" triggers the timeout early. A bug was introduced that prevented any of the flushers being called by the java engine. This was fixed in 7.9.1.
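The interaction between the aggregation map, the timeout, and the flushers can be sketched in Python. This is a simplified model and not the plugin's implementation: the class AggregateSketch, its method names, and the idea of passing the current time explicitly are all assumptions made for illustration.

```python
class AggregateSketch:
    """Toy model: sum counts per region, emit a map once its timeout has elapsed."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.totals = {}   # region -> running total
        self.created = {}  # region -> time the map was created

    def add(self, region, count, now):
        if region not in self.totals:
            self.totals[region] = 0
            self.created[region] = now
        self.totals[region] += count

    def flush(self, now, final=False):
        """Periodic flush emits expired maps; a final flush emits everything."""
        expired = [r for r, t in self.created.items()
                   if final or now - t >= self.timeout_s]
        events = []
        for region in expired:
            events.append({"region": region,
                           "total_count_per_region": self.totals.pop(region)})
            del self.created[region]
        return events

agg = AggregateSketch(timeout_s=15)
agg.add("eu", 2, now=0)
agg.add("eu", 3, now=1)
agg.add("us", 1, now=2)
for tick in (5, 10, 15):           # periodic flusher, every 5 seconds
    print(tick, agg.flush(now=tick))
print(agg.flush(now=16, final=True))  # shutdown: remaining maps are pushed early
```

In this model, if flush is never called, no aggregated event is ever emitted, which is the effect the answer attributes to the bug.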
I figured out the reason for the issue I was facing. I was using the elasticsearch input plugin without a schedule attribute, and hence my pipeline ran once when I started or restarted Logstash and then terminated after the run. I had to schedule it to run once per day (which was my requirement as well) using the schedule attribute, and that made sure that my pipeline stayed in a running state.
This resolved the issue I posted about. When the aggregation event is generated 15 seconds after my pipeline runs each day, the pipeline is still in a running state and the events go through to the output successfully.
Thank you

Error working with "ScrollElasticSearchHttp" processor in NiFi

I am trying to retrieve data from an index in ElasticSearch. I configured the "QueryElasticSearchHttp" processor and it works just fine. However when I try to use the ScrollElasticsearchHttp processor with the same URL, query, index properties and set the 'scroll' to default 1 minute, it doesn't work.
I get an error response of 404 : "Elasticsearch returned code 404 with message Not found".
I am also tailing the log on the ES cluster and I see this error:
[DEBUG][o.e.a.s.TransportSearchScrollAction] [2] Failed to execute query phase
org.elasticsearch.transport.RemoteTransportException:[127.0.0.1:9300][indices:data/read/search[phase/query+fetch/scroll]]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [2]
at org.elasticsearch.search.SearchService.getExecutor(SearchService.java:457) ~[elasticsearch-7.5.2.jar:7.5.2]
I am on Apache NiFi 1.10.0
Here is the config for the processor:
I should see a total of 441 hits, and with page size 20 I should see 23 queries being made to ES.
But I don't get a single result back. I have tried higher values for "scroll" and also played around with "page size" to no avail.
I also noticed that even though the ScrollElasticsearchHttp processor is set to run every 1m, on the ES log I don't see any error log repeated every minute.
Update:
When I cleared the state via UI: "View state" -> "Clear State", I was able to make a single call, that returned a page full of hits in one flowfile.
However, there are more pages to be retrieved. How do I make the processor fetch the next page?
My understanding was that the single invocation of the ScrollElasticsearchHttp will page through all the result sets and bring in each page as one flowfile. Is this not correct?
Please decrease the scheduling time to around 10-20 sec, so that every 10-20 sec the processor fetches the next set of records based on your page size.
You can check the state value while the fetching process is in progress; you will find a scroll ID in it. Once the fetching process is complete, the state value changes to "finishedQuery" : true.
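The one-page-per-run behavior described above can be modeled with a small Python simulation. This is not NiFi code: run_once, the in-memory hits list, and the offset key standing in for the server-side scroll position are assumptions for illustration; only the "finishedQuery" flag and the presence of a scroll ID in the state come from the answer.

```python
import math

def pages_needed(total_hits, page_size):
    """Number of non-empty pages required to read all hits."""
    return math.ceil(total_hits / page_size)

def run_once(state, hits, page_size):
    """One scheduled run: return (page, new_state)."""
    if state.get("finishedQuery"):
        return [], state
    offset = state.get("offset", 0)  # stand-in for the scroll position
    page = hits[offset:offset + page_size]
    if not page:
        return [], {"finishedQuery": True}
    return page, {"scrollId": "hypothetical-scroll-id", "offset": offset + len(page)}

hits = list(range(441))  # 441 hits, as in the question
state, pages = {}, 0
while not state.get("finishedQuery"):
    page, state = run_once(state, hits, page_size=20)
    if page:
        pages += 1

print(pages, pages_needed(441, 20))
```

Each call to run_once corresponds to one scheduled trigger of the processor, which is why a shorter scheduling interval drains the result set sooner.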

How to Fix Read timed out in Elasticsearch

I used Elasticsearch-1.1.0 to index tweets.
The indexing process is okay.
Then I upgraded the version. Now I use Elasticsearch-1.3.2, and I get this message randomly:
Exception happened: Error raised when there was an exception while talking to ES.
ConnectionError(HTTPConnectionPool(host='127.0.0.1', port=8001): Read timed out. (read timeout=10)) caused by: ReadTimeoutError(HTTPConnectionPool(host='127.0.0.1', port=8001): Read timed out. (read timeout=10)).
Snapshot of the randomness:
Happened --33s-- Happened --27s-- Happened --22s-- Happened --10s-- Happened --39s-- Happened --25s-- Happened --36s-- Happened --38s-- Happened --19s-- Happened --09s-- Happened --33s-- Happened --16s-- Happened
--XXs-- = after XX seconds
Can someone point out how to fix the Read timed out problem?
Thank you very much.
It's hard to give a direct answer, since the error you're seeing might be associated with the client you are using. However, a solution might be one of the following:
1. Increase the default timeout globally when you create the ES client by passing the timeout parameter. Example in Python:
es = Elasticsearch(timeout=30)
2. Set the timeout per request made by the client. Taken from the Elasticsearch Python docs below.
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)
The above will give the cluster some extra time to respond
Try this:
es = Elasticsearch(timeout=30, max_retries=10, retry_on_timeout=True)
It might not fully avoid ReadTimeoutError, but it minimizes them.
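To illustrate what retrying on a timeout means, here is a generic Python sketch. It is not how the Elasticsearch client implements max_retries or retry_on_timeout; call_with_retries, flaky_search, and the use of the built-in TimeoutError are assumptions for illustration.

```python
import time

def call_with_retries(func, max_retries=10, backoff_s=0.0):
    """Call func(); on TimeoutError, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return func()
        except TimeoutError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the timeout to the caller
            attempt += 1
            time.sleep(backoff_s * attempt)  # optional linear backoff

calls = {"n": 0}

def flaky_search():
    """Hypothetical request that times out twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] <= 2:
        raise TimeoutError("read timed out")
    return "ok"

result = call_with_retries(flaky_search)
print(result, calls["n"])
```

Retries only hide intermittent timeouts; a request that always exceeds the timeout will still fail once the retries are used up.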
Read timeouts can also happen when query size is large. For example, in my case of a pretty large ES index size (> 3M documents), doing a search for a query with 30 words took around 2 seconds, while doing a search for a query with 400 words took over 18 seconds. So for a sufficiently large query even timeout=30 won't save you. An easy solution is to crop the query to the size that can be answered below the timeout.
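A minimal sketch of cropping a query by word count, assuming whitespace-separated words; the function name crop_query and the default limit of 30 words are illustrative choices, not values from any library.

```python
def crop_query(query, max_words=30):
    """Keep only the first max_words whitespace-separated words of the query."""
    words = query.split()
    return " ".join(words[:max_words])

long_query = " ".join(f"word{i}" for i in range(400))
cropped = crop_query(long_query)
print(len(cropped.split()))
```

The right limit depends on the index size and the timeout in use, so it is worth measuring query latency for a few query lengths before fixing a value.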
For what it's worth, I found that this seems to be related to a broken index state.
It's very difficult to reliably recreate this issue, but I've seen it several times; operations run as normal except certain ones which periodically seem to hang ES (specifically refreshing an index it seems).
Deleting an index (curl -XDELETE http://localhost:9200/foo) and reindexing from scratch fixed this for me.
I recommend periodically clearing and reindexing if you see this behaviour.
Increasing various timeout options may immediately resolve issues, but does not address the root cause.
Provided the Elasticsearch service is available and the indexes are healthy, try increasing the Java minimum and maximum heap sizes: see https://www.elastic.co/guide/en/elasticsearch/reference/current/jvm-options.html .
TL;DR: edit the -Xms1g and -Xmx1g settings in /etc/elasticsearch/jvm.options.
You should also check whether everything is fine with Elasticsearch. Some shards can be unavailable; here is a nice doc about possible reasons for unavailable shards: https://www.datadoghq.com/blog/elasticsearch-unassigned-shards/

logstash with elasticsearch_http

Apparently the Logstash OnDemand account did not work when I wanted to post an issue.
Anyway, I have a Logstash setup with Redis, Elasticsearch, and Kibana. My Logstash instances are collecting logs from several files and putting them in Redis just fine.
Logstash version 1.3.3
Elasticsearch version 1.0.1
The only thing I have in elasticsearch_http for Logstash is the host name. The whole setup seems to fit together just fine.
The problem is that elasticsearch_http is not consuming the Redis entries as they come. What I have seen by running it in debug mode is that it flushes about 100 entries every 1 min (the default values of flush_size and idle_flush_time). As I understand the documentation, however, it should force a flush when the flush_size of 100 is not reached (for example, if we had 10 messages in the last 1 min). But it seems to work the other way: it flushes about 100 messages only once every 1 min. I changed the size to 2000 and it flushes 2000 every minute or so.
Here is my logstash-indexer.conf
input {
  redis {
    host => "1xx.xxx.xxx.93"
    data_type => "list"
    key => "testlogs"
    codec => json
  }
}
output {
  elasticsearch_http {
    host => "1xx.xxx.xxx.93"
  }
}
Here is my elasticsearch.yml
cluster.name: logger
node.name: "logstash"
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["1xx.xxx.xxx.93:9300"]
discovery.zen.ping.multicast.enabled: false
#discovery.zen.ping.unicast.enabled: true
network.bind_host: 1xx.xxx.xxx.93
network.publish_host: 1xx.xxx.xxx.93
The indexer, elasticsearch, redis, and kibana are on same server. The log collection from file is done on another server.
So I'm going to suggest a couple of different approaches to solve your problem. Logstash, as you are discovering, can be a bit quirky, so I've found these approaches useful in dealing with unexpected behavior from Logstash.
1. Use the elasticsearch output instead of elasticsearch_http. You can get the same functionality by using the elasticsearch output with protocol set to http. The elasticsearch output is more mature (milestone 2 vs milestone 3) and I've seen this change make a difference before.
2. Set the values for idle_flush_time and flush_size. There have been issues with Logstash defaults previously, and I've found it to be a lot safer to set them explicitly. idle_flush_time is in seconds; flush_size is the number of records to flush.
3. Upgrade to a more recent version of Logstash. There is enough of a change in how Logstash is deployed with version 1.4.X (http://logstash.net/docs/1.4.1/release-notes) that I'd bite the bullet and upgrade. It's also significantly easier to get attention if you still have a problem with the most recent stable major release.
4. Make certain your Redis version matches those supported by your Logstash version.
5. Experiment with setting the batch, batch_events and batch_timeout values for the Redis output. You are using the list data_type. list supports various batch options, and as with some other parameters it's best not to assume the defaults are always being set correctly.
6. Do all of the above. In addition to trying the first set of suggestions, I'd try all of them together in various combinations. Keep careful records of each test run. It seems obvious, but between all the variations above it's easy to lose track, so I'd keep careful records and try to change only one variation at a time.
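The flush behavior the question expects (flush when flush_size is reached, or when idle_flush_time has passed with events still buffered, whichever comes first) can be modeled with a short Python sketch. This models the documented semantics as described in the question, not Logstash's actual code; the class FlushBuffer and the values used are assumptions for illustration.

```python
class FlushBuffer:
    """Toy model of a size-or-time triggered output buffer."""

    def __init__(self, flush_size, idle_flush_time):
        self.flush_size = flush_size            # number of events per batch
        self.idle_flush_time = idle_flush_time  # seconds
        self.items = []
        self.last_flush = 0.0

    def add(self, item, now):
        self.items.append(item)
        return self._maybe_flush(now)

    def tick(self, now):
        """Periodic timer check with no new event."""
        return self._maybe_flush(now)

    def _maybe_flush(self, now):
        full = len(self.items) >= self.flush_size
        idle = bool(self.items) and now - self.last_flush >= self.idle_flush_time
        if full or idle:
            batch, self.items = self.items, []
            self.last_flush = now
            return batch
        return []

buf = FlushBuffer(flush_size=100, idle_flush_time=60)
for i in range(10):
    buf.add(f"event-{i}", now=float(i))
print(len(buf.tick(now=60.0)))  # partial batch flushed by the idle timer
```

If the observed behavior differs from this model, for example a full batch waiting for the timer, that is a hint that the effective settings are not what the configuration suggests, which is why setting them explicitly is worth trying.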
