Sending logs to Datadog with a custom formatter shows only the first of multiple logs - ruby

I am trying to send logs to DD (Datadog) in such a way that they are received as JSON and therefore shown properly in the portal with attributes.
My logger is a simple Logger.new(STDOUT, level: Logger::INFO).
If I stick to its standard formatter, the output will be of the form
I, [2022-07-30T22:43:35.216846 #1] INFO -- my-app: {"user":"1234"}
which DD cannot really parse, since it is not proper JSON. In this case, however, all the logs at least appear on the DD portal.
Now I am trying to format the logs as JSON in this way:
def self.logger
  @logger ||= Logger.new(STDOUT, level: Logger::INFO)
  @logger.progname = 'my-app'
  @logger.formatter = proc do |severity, datetime, progname, msg|
    {timestamp: datetime.to_s, progname: progname, severity: severity, correlation: Datadog::Tracing.log_correlation, message: msg}.to_json
  end
  @logger
end
This is my logger, and thanks to it the logs are seen properly in DD and parsed correctly, because my app formats them as proper JSON.
The problem with this approach, though, seems to be that the logs are sent in one full block, meaning that only the very first log is visible. Let's say that I want to log this:
my_hash = {"message" => '1', "prop" => '1234'}.to_json
logger.info(my_hash)
my_hash = {"message" => '2', "prop" => '12345'}.to_json
logger.info(my_hash)
only the first log will be shown correctly on the DD portal, parsed correctly with its message and prop attributes, but nothing about the second log appears.
Here is the thing: if I look at the output of my app locally in the console, I see this:
{"timestamp":"2022-07-31 01:15:39 +0200","progname":"my-app","severity":"INFO","correlation":"dd.service=my-app dd.trace_id=2976451780376429536 dd.span_id=0","message":"{"message":"1","prop":"1234"}"}{"timestamp":"2022-07-31 01:15:39 +0200","progname":"my-app","severity":"INFO","correlation":"dd.service=my-app dd.trace_id=2976451780376429536 dd.span_id=0","message":"{"message":"2","prop":"12345"}"}127.0.0.1 - - [31/Jul/2022:01:15:39 +0200] "GET /controller/test_controller HTTP/1.1" 200 - 0.0024
so the 2nd log actually gets output! But DD somehow sees only the first log.
(I know there is even a 3rd entry shown here, but that's just Sinatra's automatic logging for every HTTP call reaching the API.) What do you think the problem is?
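For reference, the console output above shows the JSON documents running together with no separator: Ruby's Logger writes the formatter's return value verbatim, and unlike the default formatter this proc never appends a newline, so everything arrives as one block. A minimal sketch of a newline-terminated variant of the same formatter:

@logger.formatter = proc do |severity, datetime, progname, msg|
  {
    timestamp: datetime.to_s,
    progname: progname,
    severity: severity,
    correlation: Datadog::Tracing.log_correlation,
    message: msg
  }.to_json + "\n" # one JSON document per line, so each entry is parsed separately
end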

How to raise a timeout error in unit testing

This is the first time I am touching Ruby, so I am not sure about the correct terminology. I have tried searching for multiple things, but couldn't find a solution.
I have this code block
domain_response = MyDomain::Api::MyApi::Api.new(parameters: message.to_domain_object, timeout: 1000)
# :nocov:
case (response = domain_response.response)
when MyDomain::Api::MyApi::SuccessResponse
  ## do something
when Domain::ErrorResponses::TimeoutResponse
  ## do something.
end
Now I am trying to test the TimeoutResponse case, and I have written (tried) this:
it "when api call timesout" do
expect(MyDomain::Api::MyApi::Api).to{
receive(:new)
} raise_error(MyDomain::ErrorResponses::TimeoutResponse)
end
This gave me an "unexpected identifier" error.
I have also tried it without receive, and that gave me an error saying a block is expected.
What's the proper way to raise an error that I can test?
Update:
Here is where I am stuck now:
it "when api call timesout" do
# 1
expect(MyDomain::Api::MyApi::Api).to(
receive(:new),
).and_return(domain_api_instance)
# 2
expect(domain_api_instance.response).to receive(:response).and_raise(Domain::ErrorResponses::TimeoutResponse)
expect(domain_api_instance.response).to eq(ApiError::Timeout)
end
But with this code I am getting this error:
1) Rpc::Package::SubPackage::V1::PackageService#first_test testing when api call timesout
Failure/Error: expect(domain_api_instance.response).to receive(:response).and_raise(Domain::ErrorResponses::TimeoutResponse)
#<InstanceDouble(MyDomain::Api::MyApi::Api) (anonymous)> received unexpected message :response with (no args)
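For what it's worth, the failure message itself points at the issue: expect(domain_api_instance.response) calls .response on the double while the expectation is being set up, which is the unexpected :response message. The stub belongs on the double itself, and since the case statement matches TimeoutResponse as a returned value rather than a raised exception, and_return fits more naturally than and_raise. A minimal sketch under those assumptions (call_service is a hypothetical stand-in for whatever invokes the code under test):

it "when api call times out" do
  domain_api_instance = instance_double(MyDomain::Api::MyApi::Api)
  allow(MyDomain::Api::MyApi::Api).to receive(:new).and_return(domain_api_instance)
  # Stub :response on the double itself, not on its return value.
  allow(domain_api_instance).to receive(:response)
    .and_return(Domain::ErrorResponses::TimeoutResponse.new)

  # call_service is hypothetical; replace with the real entry point.
  result = call_service(message)

  expect(result).to eq(ApiError::Timeout)
end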

The "error_marshaling_enabled" setting seems to be always enabled

When I run this code:
$client->evaluate('
    box.session.settings.error_marshaling_enabled = false
    box.error{code = 42, reason = "Foobar", type = "MyError"}
');
regardless of the value of error_marshaling_enabled I always get a response with a new (extended) error format:
[
    49 => 'Foobar',
    82 => [
        0 => [
            0 => [
                0 => 'CustomError',
                2 => 3,
                1 => 'eval',
                3 => 'Foobar',
                4 => 0,
                5 => 42,
                6 => [
                    'custom_type' => 'MyError',
                ],
            ],
        ],
    ],
],
Why is that?
Short answer.
The error_marshaling_enabled option affects only how error objects are encoded in the response body (48, IPROTO_DATA). It does not affect how they are returned as exceptions in the response header (82, IPROTO_ERROR).
Long answer.
In Tarantool an error object can be returned in 2 ways: as an exception and as an object. For example, this is how to throw an error as exception:
function throw_error()
    box.error({code = 1000, reason = "Error message"})
    -- Or
    error('Some error string')
end
This is how to return it as an object:
function return_error()
    return box.error.new({code = 1000, reason = "Error message"})
end
If the function is called remotely, using the IPROTO protocol via a connector like netbox, the PHP connector, or any other one, the way the error is returned affects how it is encoded into the MessagePack response packet. When the function throws, and the error reaches the top stack frame without being caught, it is encoded as IPROTO_ERROR (82) and IPROTO_ERROR_24 (49).
When the error object is returned as a regular value, not as an exception, it is also encoded as a regular value, inside IPROTO_DATA (48), just like a string, a number, a tuple, etc.
With encoding as IPROTO_ERROR/IPROTO_ERROR_24 there is not much room for configuration. The format of these values can't be changed. IPROTO_ERROR is always returned as a MessagePack map with a stack of errors in it. IPROTO_ERROR_24 is always the error message. The IPROTO_ERROR_24 field is kept for compatibility with connectors for Tarantool versions < 2.4.1.
With encoding as part of IPROTO_DATA you can choose the serialization via the error_marshaling_enabled option. When it is true, errors are encoded as the MessagePack extension type MP_EXT and contain the whole error stack, encoded exactly like the IPROTO_ERROR value. When the option is false (the default in 2.4.1), the error is encoded as a string, MP_STR, containing the error's message. If there is a stack of errors, only the newest error is encoded.
The error_marshaling_enabled option exists for backward compatibility, in case your application on Tarantool needs to stay compatible with old connectors which don't support MP_EXT-encoded errors.
In Tarantool < 2.4.1, errors were encoded into the result MessagePack as a string with the error message, and error stacks didn't exist at all. So when the new format and the error-stack feature were introduced, making the new format the default would have been too radical a change, breaking the old connectors.
Consider these examples of how error marshaling affects results. I use the Tarantool 2.4.1 console here, and the built-in netbox connector. The code below can be copy-pasted into the console.
First instance:
box.cfg{listen = 3313}
box.schema.user.grant('guest', 'super')

function throw_error()
    box.error({code = 1000, reason = "Error message"})
end

function return_error()
    return box.error.new({code = 1000, reason = "Error message"})
end
Second instance:
netbox = require('net.box')
c = netbox.connect(3313)
Now I try to call the function on the second instance:
tarantool> c:call('throw_error')
---
- error: Error message
...
The c:call('throw_error') call threw an exception. If I catch it using the pcall() Lua function, I can see the error object.
tarantool> ok, err = pcall(c.call, c, 'throw_error')
tarantool> err:unpack()
---
- code: 1000
  base_type: ClientError
  type: ClientError
  message: Error message
  trace:
  - file: '[string "function throw_error()..."]'
    line: 2
...
As you can see, I didn't set error_marshaling_enabled, but I got the full error. Now I will call the other function, which returns the error without throwing. This time the error object won't be full.
tarantool> err = c:call('return_error')
tarantool> err
---
- Error message
...
tarantool> err:unpack()
---
- error: '[string "return err:unpack()"]:1: attempt to call method ''unpack'' (a nil
value)'
...
The error was returned as a mere string, the error message, not as an error object. Now I will turn marshaling on:
tarantool> c:eval('box.session.settings.error_marshaling_enabled = true')
---
...
tarantool> err = c:call('return_error')
---
...
tarantool> err:unpack()
---
- code: 1000
  base_type: ClientError
  type: ClientError
  message: Error message
  trace:
  - file: '[C]'
    line: 4294967295
...
Now the same function returned the error in the new, more featured format.
To summarize: error_marshaling_enabled affects only returned errors, not thrown errors.

Cloudwatch to Elasticsearch parse/tokenize log event before push to ES

Appreciate your help in advance.
In my scenario, CloudWatch multiline logs need to be shipped to the Elasticsearch service.
ECS --awslogs--> CloudWatch --Lambda--> ES domain
(This is the basic flow, though I am very open to changing how data is shipped from CW to ES.)
I was able to solve the multi-line issue using multi_line_start_pattern, BUT
the main issue I am experiencing now is that my logs are in the ODL format (shown below):
[yyyy-mm-ddThh:mm:ss.SSS-Z][ProductName-Version][Log Level]
[Message ID][LoggerName][Key Value Pairs][[
Message]]
AND I would like to parse and tokenize log events before storing them in ES (vs. the complete log line).
For example:
[2018-05-31T11:08:49.148-0400] [glassfish 4.1] [INFO] [] [] [tid: _ThreadID=43 _ThreadName=Thread-8] [timeMillis: 1527692929148] [levelValue: 800] [[
[] INFO : (DummyApplicationFunctionJPADAO) EntityManagerFactory located under resource lookup name [null], resource name=AuthorizationPU]]
Needs to be parsed and tokenized using this format:
timestamp            2018-05-31T11:08:49.148-0400
ProductName-Version  glassfish 4.1
LogLevel             INFO
MessageID
LoggerName
KeyValuePairs        tid: _ThreadID=43 _ThreadName=Thread-8
Message              [] INFO : (DummyApplicationFunctionJPADAO)
                     EntityManagerFactory located under resource lookup name
                     [null], resource name=AuthorizationPU
In the above, the key-value pairs repeat and are variable; for simplicity I can store them all as one long string.
As far as I can gather about CloudWatch, Subscription Filter pattern regex support is very limited, and I am really not sure how to fit the above pattern into it. For a Lambda function that pushes the data to ES, I have not seen AWS docs or examples that use Lambda as a means to parse and push to ES.
I will appreciate it if someone can guide me on what/where the best option is to parse CW logs before they get into ES: the Subscription Filter pattern vs. the Lambda function, or any other way.
Thank you.
From what I can see your best bet is what you're suggesting: a CloudWatch-log-triggered Lambda that reformats the logged data into your preferred ES format and then posts it into ES.
You'll need to subscribe this Lambda to your CloudWatch logs. You can do this in the Lambda console or the CloudWatch console (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html).
The Lambda's event payload will be { "awslogs": { "data": "encoded-logs" } }, where encoded-logs is a Base64 encoding of gzipped JSON.
For example, the sample event (https://docs.aws.amazon.com/lambda/latest/dg/eventsources.html#eventsources-cloudwatch-logs) can be decoded in node, for example, using:
const zlib = require('zlib');
const data = event.awslogs.data;
const gzipped = Buffer.from(data, 'base64');
const json = zlib.gunzipSync(gzipped);
const logs = JSON.parse(json);
console.log(logs);
/*
{ messageType: 'DATA_MESSAGE',
owner: '123456789123',
logGroup: 'testLogGroup',
logStream: 'testLogStream',
subscriptionFilters: [ 'testFilter' ],
logEvents:
[ { id: 'eventId1',
timestamp: 1440442987000,
message: '[ERROR] First test message' },
{ id: 'eventId2',
timestamp: 1440442987001,
message: '[ERROR] Second test message' } ] }
*/
From what you've outlined, you'll want to extract the logEvents array and parse each entry into its fields; a rough sketch follows below. I'm happy to give some more help on this too if you need it (but I'll need to know what language you're writing your lambda in; there are libraries for tokenizing ODL, so hopefully it's not too hard).
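Building on the decoded logs object above, here is a minimal Node sketch of that tokenizing step, tuned to the sample line in the question (tokenizeOdl is a hypothetical helper, and the regexes are only a starting point; in particular, a literal ]] inside the message body would break the first match):

// Pull the [[ ... ]] message block out of one ODL line, then collect the
// remaining [ ... ] fields in order.
function tokenizeOdl(line) {
  const msgMatch = line.match(/\[\[([\s\S]*?)\]\]\s*$/);
  const message = msgMatch ? msgMatch[1].trim() : null;
  const head = msgMatch ? line.slice(0, msgMatch.index) : line;
  const fields = [...head.matchAll(/\[([^\]]*)\]/g)].map((m) => m[1]);
  // The first five fields are positional; anything after is key-value pairs.
  const [timestamp, product, level, messageId, logger, ...keyValuePairs] = fields;
  return { timestamp, product, level, messageId, logger,
           keyValuePairs: keyValuePairs.join(' '), message };
}

const docs = logs.logEvents.map((event) => tokenizeOdl(event.message));
console.log(docs);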
At this point you can then POST these new records directly into your AWS ES domain. Somewhat cryptically, the S3-to-ES guide gives a good outline of how to do this in Python: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-aws-integrations.html#es-aws-integrations-s3-lambda-es
You can find a full example for a lambda that does all this (by someone else) here: https://github.com/blueimp/aws-lambda/tree/master/cloudwatch-logs-to-elastic-cloud

Logstash: Attaching to previous line using multiline attaches somewhere else

I have a filter that looks like so:
multiline {
  pattern => "(^.+Exception.*)|(^\tat .+)"
  negate => false
  what => "previous"
}
But for some reason, it's not attaching to the previous line for lines matching ^\tat. Sometimes it does, but most of the time it doesn't; it attaches to a line much further back. I don't see anything wrong with my code.
Does anyone know if this is a bug?
Edit: This worked properly just now, but a couple of minutes later it stopped working again. Is it a buffer overflow? How would I debug this?
Edit: Example of success:
2014-06-20 09:09:07,989 http-bio-8080-exec-629 WARN com.rubiconproject.rfm.adserver.filter.impl.PriorityFilter - Request : NBA_DIV=Zedge_Tier1_App_MPBTAG_320x50_ROS_Android&NBA_APPID=4E51A330AD7A0131112022000A93D4E6&NBA_PUBID=111657&NBA_LOCATION_LAT=&NBA_LOCATION_LNG=&NBA_KV=device_id_sha-1_key=5040e46d15bd2f37b3ba58860cc94c1308c0ca4b&_v=2_0_0&id=84472439740784460, Response : Unable to Score Ads.. Selecting first one and Continuing...
java.lang.IndexOutOfBoundsException: Index: 8, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
Edit: Example of failure:
2014-06-20 09:02:31,139 http-bio-8080-exec-579 WARN com.rubiconproject.rfm.adserver.web.AdRequestController - Request : car=vodafone UK&con=0&model=iPhone&bdl=com.racingpost.general&sup=adm,dfp,iAd&id=8226846&mak=Apple&sze=320x50&TYP=1&rtyp=json&app=F99D88D0FDEC01300BF5123139244773&clt=MBS_iOS_SDK_2.4.0&dpr=2.000000&apver=10.4&osver=7.1&udid=115FC62F-D4FF-44E0-8D92-5A060043EFDD&pub=111407&tud=3&osn=iPhone OS&, Response : No Ad Selected to Serve..Exiting
at java.util.ArrayList.get(ArrayList.java:382)
My file has 13000+ lines, and when it fails, it attaches to a line a couple of hundred lines back. Strangely, each failure attaches to a line with the exact same offset in between (by offset I mean the couple of hundred lines that it skips).
Your logs are Java stack traces. You can try this pattern instead: use the date, which begins each log entry, as the pattern.
input {
  stdin {}
}
filter {
  multiline {
    pattern => "^(?>\d\d){1,2}-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])"
    what => "previous"
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
This pattern matches the date; if a line does not start with a date, Logstash will merge it into the previous event.
I have tried it with both of your log samples, and it worked on both.
Hope this helps.

With Zabbix API, how do I get the values of items/resources rather than just the ID's?

I have some data in a Custom Screen in Zabbix, and would like to pull the data from the screen via the API. I'm using this Ruby gem: https://github.com/express42/zabbixapi
I'm able to successfully connect and query, but the results I'm getting are not very useful:
p zbx.query(
  :method => "item.get",
  :params => {
    :itemids => "66666",
    :output => "extend"
  }
)
# [{"itemid"=>"66666", "type"=>"0", "snmp_community"=>"", "snmp_oid"=>"", "hostid"=>"77777", "name"=>"Fro Packages", "key_"=>"system.sw.packages[davekey1|davekey2|davekey3|davekey4]", "delay"=>"300", "history"=>"90", "trends"=>"365", "status"=>"0", "value_type"=>"1", "trapper_hosts"=>"", "units"=>"", "multiplier"=>"0", "delta"=>"0", "snmpv3_securityname"=>"", "snmpv3_securitylevel"=>"0", "snmpv3_authpassphrase"=>"", "snmpv3_privpassphrase"=>"", "formula"=>"1", "error"=>"", "lastlogsize"=>"0", "logtimefmt"=>"", "templateid"=>"88888", "valuemapid"=>"0", "delay_flex"=>"", "params"=>"", "ipmi_sensor"=>"", "data_type"=>"0", "authtype"=>"0", "username"=>"", "password"=>"", "publickey"=>"", "privatekey"=>"", "mtime"=>"0", "flags"=>"0", "filter"=>"", "interfaceid"=>"25", "port"=>"", "description"=>"", "inventory_link"=>"0", "lifetime"=>"30", "snmpv3_authprotocol"=>"0", "snmpv3_privprotocol"=>"0", "state"=>"0", "snmpv3_contextname"=>""}]
You can see that it's returning a bunch of IDs for the item, including the correct keys, but I can't seem to get the actual plain-text values, which are the data I'm interested in.
I started with the screen_id, then got the screenitem_id, now the item_id, but I don't seem to be getting any closer to what I want!
Thanks for any help.
Getting items or getting hosts means getting their description, not the data. You are after history. Reading the actual Zabbix user manual and API docs is highly recommended.
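For instance, here is a minimal sketch with the same gem, using history.get. The history parameter must match the item's value_type, which is "1" (character data) in the output above; the item ID is the one from the question:

p zbx.query(
  :method => "history.get",
  :params => {
    :itemids => "66666",
    :history => 1,            # must match the item's value_type ("1" = character)
    :sortfield => "clock",
    :sortorder => "DESC",     # newest values first
    :limit => 5,
    :output => "extend"
  }
)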
