rsyslog: parse message for each unique name in one pattern?

Currently, I have log messages like the following, based on different names/ids (they are generated according to a specific pattern that is set in the code).
For example:
Logs from one source named "container-ABC":
"This is the example log, id: 123abc, name:container-ABC"
"This is the example log, id: 456def, name:container-ABC"
So in rsyslog, how can I split this source into multiple log files? Note that the id is generated from a preset pattern (regex) and is not a fixed value:
message has "id: 123abc" -> into 123abc.log
message has "id: 456def" -> into 456def.log
Thank you.
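One possible direction, as an untested sketch: rsyslog's RainerScript can pull the id out of each message with re_extract() and write to a per-id file through a dynamic omfile template. The template name, variable name, and output path below are placeholders, not anything taken from the question:
# hedged sketch, assuming rsyslog 8.x RainerScript
template(name="PerIdFile" type="string" string="/var/log/per-id/%$!log_id%.log")
if ($msg contains "id: ") then {
    # capture the token after "id: " (e.g. 123abc) into a message variable
    set $!log_id = re_extract($msg, "id: ([0-9a-zA-Z]+)", 0, 1, "unknown");
    action(type="omfile" dynaFile="PerIdFile")
    stop
}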

Related

Amazon Cloudwatch JSON format not correct

I am logging Amazon skill requests and replies in the Lambda function of my Alexa skill. These are JSON objects that I am logging as follows:
logger.debug('Incoming request:\n' + JSON.stringify(event, null, 2));
logger.debug('Final response:\n' + JSON.stringify(alexaResponse, null, 2) + '\n\n');
When viewing the logs in Cloudwatch with Expand all = Row I see this:
Cloudwatch Log with Expand all set to Row
When I set Expand all to Text, the result is slightly better, but the leading spaces are trimmed, causing the indentation structure of the JSON document to be lost.
Cloudwatch Log with Expand all set to Text
I got this code from a tutorial. It is possible that changes in Amazon Cloudwatch have made code that worked in the tutorial fail now.
In the tutorial, the output looks like this:
Tutorial screen shot
That's the log output I want. How do I get it?
My CloudWatch logs automatically display JSON strings the way you want them without using the optional 2nd and 3rd arguments of JSON.stringify().
So try removing the null and 2 from your log statement's JSON.stringify.
logger.debug('Incoming request: \n' + JSON.stringify(event));
logger.debug('Final response: \n '+ JSON.stringify(alexaResponse));
I'm not at my computer to test, but I think the newline (\n) should give you the desired effect in CloudWatch, placing the JSON on its own line that you can expand. It's the extra whitespace that is making CloudWatch create new rows in the JSON.
In my limited experience, each new line is converted into a separate log entry.
IMHO it is behaving as a line logger instead of an entry logger.
So what you can do is remove the new lines from log entries when sending to CloudWatch.
It should then automatically show the log JSON in a formatted structure.
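For illustration, a minimal sketch of what that would look like, keeping the same logger calls as in the question but with no indentation argument and no embedded newlines:
// keep each JSON object on a single log line so CloudWatch can detect and pretty-print it itself
logger.debug('Incoming request: ' + JSON.stringify(event));
logger.debug('Final response: ' + JSON.stringify(alexaResponse));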

Parse email subject office 365 flows

I am trying to get some data parsed out of a subject line in Office 365 Flows. I have an email that has a consistent format:
Help Desk [Ticket #12345]
I want to get the number '12345' for use in later steps in the flow. So far, I've attempted to use the substring expression in a compose connector:
substring(triggerBody()?['Subject'], 20, 5)
But I get an error about the string being null.
Besides the index being slightly off (to retrieve '12345' from Help Desk [Ticket #12345] you need substring(value, 19, 5), since the index is 0-based and the digits start at position 19), the expression looks correct. But you can take a step-by-step approach to see what is wrong.
To start, take a look at the flow run to see exactly what the trigger outputs are.
If you see the Subject field there, create a variable containing that value only, to make sure that you don't have any typo.
If it works correctly, then you should see the subject in the flow run.
If everything is still good at that point, create a new variable with the substring that you want, and again check the value.
If you got to this point, then you should be able to retrieve the ticket id.
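As a sketch, the two variable expressions could look like this (assuming the subject prefix "Help Desk [Ticket #" is fixed, so the five digits start at 0-based index 19):
Variable 1 (the raw subject, to confirm the trigger output):
triggerBody()?['Subject']
Variable 2 (the ticket number pulled out of the subject):
substring(triggerBody()?['Subject'], 19, 5)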

Facing Issue while sending data from Filebeats to Multiple Logstash files

To be precise, I am handling a log file that has almost millions of records. Since it is a billing summary log, customer information is recorded in no particular order.
I am using customized grok patterns and the Logstash XML filter plugin to extract the data that is sufficient for tracking. To track individual customer activities, I am using "Customer_ID" as a unique key, so even though I use multiple Logstash files and multiple grok patterns, all of a customer's information can be bound/aggregated using the "Customer_ID" (unique key).
Here is a sample of my log file:
7-04-2017 08:49:41 INFO abcinfo (ABC_RemoteONUS_Processor.java52) - Customer_Entry :::<?xml version="1.0" encoding="UTF-8"?><ns2:ReqListAccount xmlns:ns2="http://vcb.org/abc/schema/"/"><Head msgId="1ABCDEFegAQtQOSuJTEs3u" orgId="ABC" ts="2017-04-27T08:49:51+05:30" ver="1.0"/><Cust id="ABCDVFR233cd662a74a229002159220ce762c" note="Account CUST Listing" refId="DCVD849512576821682" refUrl="http://www.ABC.org.in/" ts="2017-04-27T08:49:51+05:30"
My grok pattern:
grok {
patterns_dir => "D:\elk\logstash-5.2.1\vendor\bundle\jruby\1.9\gems\logstash-patterns-core-4.0.2\patterns"
match => [ "message" , "%{DATESTAMP:datestamp} %{LOGLEVEL:Logseverity}\s+%{WORD:ModuleInfo} \(%{NOTSPACE:JavaClass}\)%{ABC:Customer_Init}%{GREEDYDATA:Cust}"]
add_field => { "Details" => "Request" }
remove_tag => ["_grokparsefailure"]
}
My customized pattern, which is stored inside the patterns_dir:
ABC ( - Customer_Entry :::)
My XML filter plugin:
xml {
  source => "Cust"
  store_xml => false
  xpath => [
    "//Head/@ts", "Cust_Req_time",
    "//Cust/@id", "Customer_ID",
    "//Cust/@note", "Cust_note"
  ]
}
So whatever details come after " - Customer_Entry :::", I am able to extract using the XML filter plugin (stored similarly to a multiline codec). I have written 5 different Logstash files to extract different customer activities with 5 different grok patterns, which cover:
1.Customer_Entry
2.Customer_Purchase
3.Customer_Last_Purchase
4.Customer_Transaction
5.Customer_Authorization
All of the above grok patterns extract a different set of information, which is grouped by Customer_ID as I said earlier.
I am able to extract the information and visualize it clearly in Kibana without any flaw by using my customized patterns with the different log files.
Since I have hundreds of log files to feed into Logstash every day, I opted for Filebeat, but Filebeat runs with only one port, 5044. I tried to run 5 different ports for 5 different Logstash files, but that was not working; only one of the 5 Logstash configs was getting loaded, and the rest sat idle.
Here is a sample of my Filebeat output configuration:
output.logstash:
hosts: ["localhost:5044"]
output.logstash:
hosts: ["localhost:5045"]
output.logstash:
hosts: ["localhost:5046"]
I couldn't add all the grok patterns in one Logstash config file, because the XML filter plugin takes its source from "GREEDYDATA"; in that case I would have 5 different source => settings for 5 different grok patterns.
I tried that too, but it was not working.
I am looking for a better approach.
Sounds like you're looking for scale, with parallel ingestion. As it happens, Filebeat supports something called load balancing, which sounds like what you're looking for.
output.logstash:
hosts: [ "localhost:5044", "localhost:5045", "localhost:5046" ]
loadbalance: true
That's for the outputs. Though, I believe you wanted multithreading on the input. Filebeat is supposed to track all files specified in the prospector config, but you've found limits. Globbing or specifying a directory will single-thread the files in that glob/directory. If your file names support it, creative globbing may get you better parallelism by defining multiple globs in the same directory.
Assuming your logs are coming in by type:
- input_type: log
paths:
- /mnt/billing/*entry.log
- /mnt/billing/*purchase.log
- /mnt/billing/*transaction.log
This would enable prospectors on multiple threads, reading files here in parallel.
If your logs were coming in with random names, you could use a similar setup
- input_type: log
paths:
- /mnt/billing/a*
- /mnt/billing/b*
- /mnt/billing/c*
[...]
- /mnt/billing/z*
If you are processing lots of files with unique names that never repeat, adding the clean_inactive config option to your prospectors will keep your Filebeat running fast.
- input_type: log
ignore_older: 18h
clean_inactive: 24h
paths:
- /mnt/billing/a*
- /mnt/billing/b*
- /mnt/billing/c*
[...]
- /mnt/billing/z*
This will remove all state for files older than 24 hours, and won't bother processing any file more than 18 hours old.

What does the Mikrotik :execute command return and what can I do with it?

When I run the following command from Mikrotik SSH, I see an asterisk followed by a hex value, which will look something like *4e.
:put [:execute { :ping localhost count=10 }]
According to the Mikrotik wiki, the :execute command executes commands in the background as a script job. When I print out the currently running jobs, I don't see any relation between the output above with the output of this.
[admin@MikroTik] > /system script job print
# SCRIPT OWNER STARTED
0 admin jun/05/2017 16:58:09
1 admin jun/05/2017 17:07:31
So, what does the output represent and how can I use it?
The return value that you get is a MikroTik RouterOS internal ID; in this case, it is the internal ID of the running job. Internal IDs are immutable references to the object which they represent and have the advantage of staying the same between command executions, even if concurrent operations are taking place that would change the object's # as shown by print.
You can identify an internal ID by the * followed by a hexadecimal number. In some cases, where there are multiple items to return as is common with find, you will get a ; delimited list of IDs such as:
*d;*1;*18;*3;*19;*1a;*20
Internal IDs can be used when running other commands just like the printable number, and are distinguished by the preceding *. Where a list of multiple internal IDs is given, the command will run on each item.
Information on the internal ID data type can be found in the Scripting Manual. Internal IDs do not follow any ordering and you should not assume that they do. To get the internal ID of an object from its ID shown in a print output, you can use :put [get <id>] where <id> is the printable id; the output will show the internal ID as the .id property.
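As a rough, untested sketch (assuming a RouterOS version where :execute returns the job's internal ID, as in the question), the returned value can be stored and then passed to commands that accept item references:
:local jobID [:execute { :ping localhost count=10 }]
# the internal ID can be used wherever an item reference is accepted,
# e.g. to read a property of the still-running background job:
:put [/system script job get $jobID owner]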

validating ip address in datastage

I have a source file that contains two fields: IP_ADDRESS and USER_NAME. I want to check whether the IP address is correct or not before loading it into the data warehouse using DataStage. How do I do this?
I was browsing Stack Overflow and think I might have a solution to your question.
1. Create a job to grab all of the IP_ADDRESS values from the file and send them to a BASIC transformer (search for "BASIC transformer" in DataStage; it is NOT the one that is normally on the palette). In that transformer, set the stage variable to SetUserStatus() and write the column out to a Peek stage (you don't need the output at all; the SetUserStatus is the important part). This allows you to pass the command output (the list of IP addresses) up to a sequence.
2. In the sequence, start with the job you just created (the BASIC transformer job) and link it to a User Variables Activity. In the User Variables Activity stage, set the name to something like 'IP Address' and the expression to IP_ADDRESS.$UserStatus.
3. Use a Loop to take that output, which is now a list, and send each individual IP address to an Execute stage with a ping command to see if it returns a valid IP address.
4. If it does return a valid IP, have the job that writes USER_NAME and IP_ADDRESS do a 'Select' statement where IP_ADDRESS = the valid IP address. For the ones that aren't valid, send them down a different path and write them out to a '.txt' file somewhere so you know which ones weren't valid.
I'm sure you will need a few more steps in there, but that should be the gist of it.
Hope my quick stab at your issue helps.
Yes, you can use a transformer, or a transformer and a filter, to do that, depending on the version of DataStage you're using. If you're using PX, just encode the validation logic in a transformer stage and then, on the output link, set up a filter that doesn't allow rows to pass forward if they didn't pass the validation logic.
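As a rough, untested sketch of what that validation logic could look like as an output-link constraint in a PX transformer (the link name lnk_in is a placeholder, and exact function availability may vary by DataStage version), a dotted-quad IPv4 check might be:
Count(lnk_in.IP_ADDRESS, ".") = 3 And
Num(Field(lnk_in.IP_ADDRESS, ".", 1)) And Field(lnk_in.IP_ADDRESS, ".", 1) >= 0 And Field(lnk_in.IP_ADDRESS, ".", 1) <= 255 And
Num(Field(lnk_in.IP_ADDRESS, ".", 2)) And Field(lnk_in.IP_ADDRESS, ".", 2) >= 0 And Field(lnk_in.IP_ADDRESS, ".", 2) <= 255 And
Num(Field(lnk_in.IP_ADDRESS, ".", 3)) And Field(lnk_in.IP_ADDRESS, ".", 3) >= 0 And Field(lnk_in.IP_ADDRESS, ".", 3) <= 255 And
Num(Field(lnk_in.IP_ADDRESS, ".", 4)) And Field(lnk_in.IP_ADDRESS, ".", 4) >= 0 And Field(lnk_in.IP_ADDRESS, ".", 4) <= 255
Rows that fail the constraint can then be routed to a reject link instead of the warehouse load.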
