Logs older than 30d ingested into Google Cloud Logging are discarded despite custom retention - google-cloud-logging

We have an application that migrates about a year's worth of existing application logs into Google Cloud Logging.
I have created a custom logging bucket with a retention period of 400d and a custom sink that routes all requests for a given logName into this bucket.
I also excluded this logName from the _DEFAULT bucket.
Now if I write log entries for the given logName with timestamps spread over the last year, all entries that are older than 30d are discarded.
I even increased the retention period of the _DEFAULT bucket to e.g. 60d, but still no logs older than 30d can be written.
That is, the write succeeds for all log entries, but the entries older than 30d do not show up in the Logs Explorer.
According to Routing - Log Retention, the bucket should define the retention period:
Cloud Logging retains logs according to retention rules applying to the log bucket type where the logs are held.
Also there doesn't seem to be any quota that should limit this.
Does anybody know why entries with a timestamp older than 30d are silently discarded despite a properly configured logging bucket and sink?
Or is there a better solution for importing logs into GCL without having to write a custom app to do so?
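For reference, the routing setup looks roughly like this with the google-cloud-logging Python client (project, bucket, sink name and logName are placeholders; the 400d bucket already exists):

```python
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # placeholder project ID

# Destination is the custom log bucket with the 400d retention
# (assumed to have been created beforehand, e.g. via the console).
destination = (
    "logging.googleapis.com/projects/my-project/locations/global/buckets/imported-logs"
)

# Sink that routes all entries for the given logName into that bucket.
sink = client.sink(
    "route-imported-logs",  # placeholder sink name
    filter_='logName="projects/my-project/logs/imported-app"',  # placeholder logName
    destination=destination,
)
if not sink.exists():
    sink.create()
```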

Cloud Logging currently has time bounds on the timestamps of the LogEntries it can ingest into its storage: logs can only be ingested if their timestamps are within the last 30 days or at most 1 day in the future. This applies even if your bucket's retention period is set to 60 days or more.
This is a current limitation and may change in the future.
Disclaimer: I work in Cloud Logging
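As a rough illustration with the google-cloud-logging Python client (project and logger names are placeholders): both calls below return successfully, but only the first entry is actually stored.

```python
from datetime import datetime, timedelta, timezone
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # placeholder project ID
logger = client.logger("imported-app")               # placeholder logName

# Timestamp within the last 30 days: accepted and stored in the routed bucket.
logger.log_text(
    "recent entry",
    timestamp=datetime.now(timezone.utc) - timedelta(days=10),
)

# Timestamp older than ~30 days: the write call does not fail,
# but the entry is not ingested and never shows up in the Logs Explorer.
logger.log_text(
    "historical entry",
    timestamp=datetime.now(timezone.utc) - timedelta(days=90),
)
```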

Related

Archive old data from Elasticsearch to Google Cloud Storage

I have an Elasticsearch server installed on a Google Compute Engine instance. A huge amount of data is being ingested every minute, and the underlying disk fills up pretty quickly.
I understand we can increase the size of the disks, but this would cost a lot for storing long-term data.
We need 90 days of data on the Elasticsearch server (Compute Engine disk) and data older than 90 days (up to 7 years) to be stored in Google Cloud Storage buckets. The older data should be retrievable in case it is needed for later analysis.
One way I know of is to take snapshots frequently and delete the indices older than 90 days from the Elasticsearch server using Curator. This way I can keep the disks free and minimize the storage cost.
Is there any other way this can be done without manually automating the above-mentioned idea?
For example, something provided by Elasticsearch out of the box that archives data older than 90 days itself and keeps the data files on disk; we could then manually move these files from the disk to Google Cloud Storage.
There is no other way around it: to make backups of your data you need to use the snapshot/restore API; it is the only safe and reliable option available.
There is a plugin to use Google Cloud Storage as a snapshot repository.
If you are using version 7.5+ and Kibana with the basic license, you can configure snapshots directly from the Kibana interface; if you are on an older version or do not have Kibana, you will need to rely on Curator or a custom script run from a crontab scheduler.
While you can copy the data directory, you would need to stop your entire cluster every time you want to copy the data, and to restore it you would also need to create a new cluster from scratch each time. This is a lot of work and not practical when you have something like the snapshot/restore API.
Look into Snapshot Lifecycle Management and Index Lifecycle Management. They are available with a Basic license.
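As a rough sketch of that setup against the Elasticsearch REST API (cluster endpoint, repository, bucket and policy names are placeholders; authentication is omitted):

```python
import requests

ES = "http://localhost:9200"  # placeholder cluster endpoint

# Register a GCS snapshot repository (requires the repository-gcs plugin and
# GCS credentials configured in the Elasticsearch keystore).
resp = requests.put(
    f"{ES}/_snapshot/gcs_archive",  # placeholder repository name
    json={"type": "gcs", "settings": {"bucket": "my-es-archive", "client": "default"}},
)
resp.raise_for_status()

# SLM policy (7.4+, Basic license): snapshot all indices daily and keep the
# snapshots for ~7 years. Deleting indices older than 90 days from the cluster
# is handled separately, e.g. with ILM or Curator.
resp = requests.put(
    f"{ES}/_slm/policy/daily-archive",  # placeholder policy name
    json={
        "schedule": "0 30 1 * * ?",
        "name": "<daily-snap-{now/d}>",
        "repository": "gcs_archive",
        "config": {"indices": ["*"]},
        "retention": {"expire_after": "2555d"},
    },
)
resp.raise_for_status()
```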

Send multiple logs from filebeat to logstash in a timely manner

I have a server where all logs are present in a directory.
Now these files are separated by date. How can I set up Filebeat (and how do I configure the receiving side) so that all of these log files are sent to Kibana on the other server and arrive there in chronological order, in a single file?
For example, on server A I have 40 log files for the last 40 days of logs.
I want these 40 logs delivered in chronological order, from oldest to newest, into a single file on the other server.
Also, the file with today's date will keep being updated with new logs.
I have configured Filebeat and Logstash such that syncing is maintained, but the logs do not arrive in chronological order, which causes problems for some of my processing logic.
glob pattern
/directory to logs/*.log
If you are asking how to remotely sync a set of log files into a single file in time-sorted order using Filebeat and Logstash, then...
If you set harvester_limit to 1, so that only one file is processed at a time, then I think you can use scan.order and scan.sort to get Filebeat to send the data in the right order. Logstash is more of a problem. In the current version you can disable the Java execution engine ('pipeline.java_execution: false' in logstash.yml) and set '--pipeline.workers 1', in which case Logstash will preserve order.
In future releases I do not foresee Elastic maintaining two execution engines, so once the Ruby execution engine is retired it will not be possible to prevent events from being re-ordered in the pipeline (the Java engine routinely re-orders events in the pipeline in reproducible but unpredictable ways).
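Something like the following sketch, using Filebeat 6.x/7.x option names (the glob comes from the question; everything else is illustrative and untested):

```yaml
# filebeat.yml (sketch)
filebeat.inputs:
  - type: log
    paths:
      - /directory to logs/*.log
    harvester_limit: 1    # read only one file at a time
    scan.order: asc       # oldest first
    scan.sort: filename   # or modtime, depending on how the files are named

# logstash.yml (sketch)
# pipeline.java_execution: false
# and start Logstash with --pipeline.workers 1
```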

Time mismatch in kibana

We have an ELK setup with Kibana version 5.6.10. We are facing a time mismatch when displaying logs from different servers.
We are fetching logs from 8 IIS servers and parsing them via Logstash into Elasticsearch/Kibana. When filtering logs for the past hour, we noticed that only 2 servers' logs were displayed. We have checked the Filebeat configuration on each IIS server and found the same setup; we also verified the IIS log time format and other configuration. We can see that indexing is happening properly in Elasticsearch, but filtering the display for one hour only returns results for 2 servers. If we filter for four hours, we can see multiple servers with different time values in the display.
I would like to know whether anyone is facing a similar issue; any solution is welcome.
I have had the same issue. The issue is with the time zone. Kibana works on UTC by default. Please check whether the time zone in your ES docs is the same as that of Kibana. You can do so via (Kibana) Management Tab -> Advanced Settings -> dateFormat:tz.
If the time zone is different, please use 'Today' in the Kibana time window to check your recent documents.
Alternatively, you can also index your timestamp field with the UTC timezone (or your desired time zone) in ES, then set up Kibana with the same timezone as ES to check your documents.
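To illustrate why the offset matters, a small example (the time zone and timestamp are hypothetical): if a server logs local wall-clock time without a timezone designator, Elasticsearch treats the value as UTC, so the document lands outside Kibana's "Last 1 hour" window.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# Hypothetical server in UTC+05:30 logging "2018-07-01 14:00:00" with no zone info.
as_indexed = datetime(2018, 7, 1, 14, 0, tzinfo=ZoneInfo("UTC"))  # how ES interprets it
actual_utc = datetime(2018, 7, 1, 14, 0, tzinfo=ZoneInfo("Asia/Kolkata")).astimezone(ZoneInfo("UTC"))

print(as_indexed - actual_utc)  # prints 5:30:00: the document appears 5.5 hours ahead of when it really happened
```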
The issue is the timezone. The servers whose logs are not being displayed are most probably in a different timezone than Kibana's. This is an issue with Kibana; it doesn't handle timezones globally. Here is the issue reported on GitHub; you can keep track of it.
https://discuss.elastic.co/t/kibana-timestamp-in-browser-local-time-but-incoming-logs-utc/57501
https://github.com/elastic/kibana/issues/1600

Elasticsearch - missing data

I have been planning to use ELK for our production environment and seem to be running into a weird problem:
while loading a sample of the production log file I realized that there is a huge mismatch between the number of events being published by Filebeat and what we see in Kibana. My first suspicion was Filebeat, but I could verify that all the events were successfully received by Logstash.
I also checked Logstash (by enabling debug mode) and could see that all the events were received and processed successfully (I am using the date and json filters).
But when I search in Kibana, I only see a fraction of the logs actually published (e.g. only 16,000 out of 350K). There is no exception or error in either the Logstash or Elasticsearch logs.
I have tried zapping all the data by doing the following so far:
Stopped all processes for Elasticsearch, Logstash and Kibana.
Deleted all the index files, cleared the cache, deleted the mappings.
Stopped Filebeat and deleted the registry file (since it's running on Windows).
Restarted Elasticsearch, Logstash and Filebeat (in that order).
But the results are the same: I get only 2 out of 8 records (in the shortened file) and even fewer when I use the full file.
I tried increasing the time window in Kibana to 10 years (:)) to see if the events are being pushed to the wrong year, but got nothing.
I have read almost all the threads related to missing data, but nothing seems to work.
Any pointers would help!
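(A direct count against Elasticsearch, something like the sketch below with a placeholder endpoint and index pattern, would rule out Kibana's time window entirely.)

```python
import requests

ES = "http://localhost:9200"  # placeholder endpoint
INDEX = "logstash-*"          # placeholder index pattern

# Count every document in the index, regardless of @timestamp, and compare
# the number with what Filebeat/Logstash report as shipped.
resp = requests.get(f"{ES}/{INDEX}/_count", json={"query": {"match_all": {}}})
resp.raise_for_status()
print(resp.json()["count"])
```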

VMSS Autoscaling: WADPerformanceCounters

I've added the autoscaling settings to my Service Fabric template and, after deploying it, the portal shows that autoscale is configured. What I am not able to see is the WADPerformanceCounters table (mentioned in the documentation) in my storage account. So how is the autoscaling executed without the information about the counters?
Thanks.
If autoscale cannot find the data it's configured to look at, it will set your capacity equal to the "default" configured in the autoscale rule.
As for what could explain the behavior you're seeing, here are a couple hypotheses:
1) There are two types of metrics in Azure today: host and guest; host metrics live in Azure-internal data stores and as such don't require a storage account to store data in. Guest metrics, however, do live in a storage account. So depending on how you added autoscale, you might have added host metrics instead of guest metrics? For more info, see this doc: https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/insights-autoscale-common-metrics
2) As you can see in this template using guest metrics, for guest metrics the scale set must have the WAD extension configured to point to the storage account; it's probably worth checking that the storage account specified in the WAD extension config is the same storage account you looked for the table in.
For host metrics, you can find the list of supported metrics here:
https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics#microsoftcomputevirtualmachinescalesets
For guest metrics, as mentioned above, you need to configure the Windows Azure Diagnostics (WAD) extension correctly on your VMSS. Specifically, the autoscale engine will query the WAD{value}PT1M{value} tables in your configured diagnostics storage account. These tables contain the local 1-minute aggregation of the performance counter data.
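A quick way to check whether guest metrics are actually landing in the diagnostics storage account is to list its tables (connection string is a placeholder; uses the azure-data-tables package):

```python
from azure.data.tables import TableServiceClient

# Placeholder connection string for the diagnostics storage account.
conn_str = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"

service = TableServiceClient.from_connection_string(conn_str)
wad_tables = [t.name for t in service.list_tables() if t.name.startswith("WAD")]

# Expect tables matching WAD{value}PT1M{value} if the WAD extension is writing
# the 1-minute performance counter aggregations.
print(wad_tables)
```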
