Newlines in context data do not output valid JSON - Laravel

Monolog version 2
Noticed this when using Monolog in a Laravel project and trying to ingest logs into ELK.
Something like this:
Log::debug('Testing new lines in log context.', ['yep' => "This\nhas\nNew Lines."]);
Outputs
[2022-01-25 09:53:22] local.DEBUG: Testing new lines in log context. {"yep":"This
has
New Lines."}
Expecting
[2022-01-25 09:53:22] local.DEBUG: Testing new lines in log context. {"yep":"This\\nhas\\nNew Lines."}
Just wondering if there is a setting I can alter to make it escape the newline characters?
Note: the data containing newlines is perfectly fine, and I want to log this information in its entirety, but the logging library does not output valid JSON.
https://github.com/Seldaek/monolog/issues/1629

Use single quotes and the escape sequences will be preserved: in PHP, '\n' in a single-quoted string is a literal backslash and n, not a newline.
Another option is to json_encode what you want to log if you do not want raw newlines in the output.
php > print_r(['e' => "a\nb"]); <- double quotes
Array
(
    [e] => a
b <- newline is output
)
php > print_r(['e' => 'a\nb']); <- single quotes
Array
(
    [e] => a\nb <- newline is not output
)
php > print_r(json_encode(['e' => "a\nb"])); <- json_encode
{"e":"a\nb"} <- newline is not output
Log::debug('Testing new lines in log context.', ['yep' => 'This\nhas\nNew Lines.']);
Log::debug('Testing new lines in log context.', ['yep' => json_encode("This\nhas\nNew Lines.")]);
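If you would rather have Monolog escape the newlines itself, the behaviour appears to come from Laravel's default LineFormatter, which is constructed with allowInlineLineBreaks enabled. A minimal sketch of turning that off via a channel tap (the EscapeNewlines class name is hypothetical):
// app/Logging/EscapeNewlines.php (hypothetical class name)
namespace App\Logging;

use Monolog\Formatter\LineFormatter;

class EscapeNewlines
{
    public function __invoke($logger): void
    {
        foreach ($logger->getHandlers() as $handler) {
            // Third argument (allowInlineLineBreaks) set to false keeps
            // "\n" escaped in the rendered context JSON.
            $handler->setFormatter(new LineFormatter(null, 'Y-m-d H:i:s', false, true));
        }
    }
}
Then reference it from the channel in config/logging.php with 'tap' => [App\Logging\EscapeNewlines::class].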

Related

Telegraf tail with grok pattern error

I am using Telegraf to get log information from Apache NiFi; for this task I am using this config:
[[inputs.tail]]
  ## files to tail.
  files = ["/var/log/nifi/nifi-app.log"]
  ## Read file from beginning.
  from_beginning = true
  #name_override = "nifi_app"
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "grok"
  grok_patterns = [ "%{DATE:date} %{TIME:time} %{WORD:EventType} \[%{GREEDYDATA:NifiTask} %{NOTSPACE:Thread}\] %{NOTSPACE:NifiEventType} %{GREEDYDATA:EventText} %{NUMBER:EventDuration} %{WORD:EventDurationUnits}" ]
When I try to start Telegraf, it gives me this error:
Error parsing /etc/telegraf/telegraf.conf, toml: line 10: parse error
The pattern I wrote was tested in a Grok debugger with this text:
2018-08-02 10:53:16,976 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 1 heartbeats
in 11863 nanos
These are the results of some testing:
grok_patterns = ["\[%{GREEDYDATA:NifiTask}\]"] ==> toml: line 10: parse error
grok_patterns = ["[%{GREEDYDATA:NifiTask}]"] ==> Invalid data format: grok
grok_patterns = ['\[%{GREEDYDATA:NifiTask}\]'] ==> Invalid data format: grok
grok_patterns = ["\\[%{GREEDYDATA:NifiTask}\\]"] ==> Invalid data format: grok
grok_patterns = ['[%{GREEDYDATA:NifiTask}]'] ==> Invalid data format: grok
The first option looks right to me, but it doesn't work, and the problem seems to be the way the bracket is being escaped.
How can this issue be solved?
To fix the bracket-escaping issue, a "partial" solution is to change the double quotes to single quotes; this way, in my case (Telegraf version 1.13.4), the bracket is correctly escaped by \.
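For example, rewriting the pattern from the question with a TOML literal (single-quoted) string, so the backslashes reach the grok parser untouched:
grok_patterns = ['%{DATE:date} %{TIME:time} %{WORD:EventType} \[%{GREEDYDATA:NifiTask} %{NOTSPACE:Thread}\] %{NOTSPACE:NifiEventType} %{GREEDYDATA:EventText} %{NUMBER:EventDuration} %{WORD:EventDurationUnits}']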
There was more than one problem:
First problem: the grok data format was added to Telegraf in the 1.8 release (ref), so I had to use a nightly build until that version was released.
Second problem: how to escape the brackets. There are problems doing it inline, so what I finally did was put this part of the pattern in a custom pattern file; this way it works perfectly, as sketched below.
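A sketch of the custom-pattern-file approach, assuming Telegraf's grok_custom_pattern_files option; the file path and the NIFI_LOG pattern name are illustrative:
# /etc/telegraf/nifi.patterns -- no TOML string escaping applies inside this file
NIFI_LOG %{DATE:date} %{TIME:time} %{WORD:EventType} \[%{GREEDYDATA:NifiTask} %{NOTSPACE:Thread}\] %{NOTSPACE:NifiEventType} %{GREEDYDATA:EventText} %{NUMBER:EventDuration} %{WORD:EventDurationUnits}

# telegraf.conf
[[inputs.tail]]
  files = ["/var/log/nifi/nifi-app.log"]
  from_beginning = true
  data_format = "grok"
  grok_custom_pattern_files = ["/etc/telegraf/nifi.patterns"]
  grok_patterns = ["%{NIFI_LOG}"]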

Ruby - CSV works while SmarterCSV doesn't

I want to open a csv file using SmarterCSV.process
market_csv = SmarterCSV.process(market)
p "just read #{market_csv}"
The problem is that the data is not read and this prints:
[]
However, if I attempt the same thing with the default CSV library implementation, the content of the file is read (the following print statement prints the file).
CSV.foreach(market) do |row|
p row
end
The content of the file I was reading is of the form:
Date,Close
03/06/15,0.1634
02/06/15,0.1637
01/06/15,0.1638
31/05/15,0.1638
The problem could come from the row separator: line endings differ between Windows and Unix systems ("\r\n" vs "\n"), and old Mac-style files use "\r". Try to identify the character and specify it in the SmarterCSV.process call like this:
market_csv = SmarterCSV.process(market, row_sep: "\r")
p "just read #{market_csv}"
or like this:
market_csv = SmarterCSV.process(market, row_sep: :auto)
p "just read #{market_csv}"

Create a file descriptor in Ruby

I am writing a script that will perform various tasks with DSV or positional files. These tasks vary: for example, creating a DB table for the file, or creating a shell script for parsing it.
As I have envisioned it, my script would receive a "descriptor" file as input. It would then parse this descriptor and perform its tasks accordingly.
I came up with some ideas on how to specify the descriptor file, but didn't really manage to get something robust - probably due to my inexperience with Ruby.
It seems, though, that the best way to parse the descriptor would be to use the Ruby language itself and then somehow catch parsing exceptions and turn them into something more relevant to the context.
Example:
The file I will be reading looks like (myfile.dsv):
jhon,12343535,27/04/1984
dave,53245265,30/03/1977
...
Descriptor file myfile.des contains:
FILE_TYPE = "DSV"
DSV_SEPARATOR = ","
FIELDS = [
name => [:pos => 0, :type => "string"],
phone => [:pos => 1, :type => "number"],
birthdate => [:pos => 2, :type => "date", :mask => "dd/mm/yyyy"]
]
And the usage should be:
ruby script.rb myfile.des --task GenerateTable
So the program script.rb should load and parse the descriptor myfile.des and perform whatever tasks accordingly.
Any ideas on how to perform this?
Use YAML
Instead of rolling your own, use YAML from the standard library.
Sample YAML File
Name your file something like descriptor.yml, and fill it with:
---
:file_type: DSV
:dsv_separator: ","
:fields:
  :name:
    :pos: 0
    :type: string
  :phone:
    :pos: 1
    :type: number
  :birthdate:
    :pos: 2
    :type: date
    :mask: dd/mm/yyyy
Loading YAML
You can read your configuration back in with:
require 'yaml'
settings = YAML.load_file 'descriptor.yml'
This will return a settings Hash like:
{:file_type=>"DSV",
 :dsv_separator=>",",
 :fields=>
  {:name=>{:pos=>0, :type=>"string"},
   :phone=>{:pos=>1, :type=>"number"},
   :birthdate=>{:pos=>2, :type=>"date", :mask=>"dd/mm/yyyy"}}}
which you can then access as needed to configure your application.
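To then drive the script from the descriptor, here is a minimal sketch; the sample line and the printing loop are illustrative, and on Ruby 3.1+ (Psych 4) you may need to permit Symbol keys explicitly:
require 'yaml'

settings = YAML.load_file 'descriptor.yml'
# On Psych 4 (Ruby 3.1+), symbol keys need:
# settings = YAML.load_file 'descriptor.yml', permitted_classes: [Symbol]

line   = 'jhon,12343535,27/04/1984'            # one record from myfile.dsv
values = line.split(settings[:dsv_separator])

settings[:fields].each do |name, spec|
  puts "#{name}: #{values[spec[:pos]]} (#{spec[:type]})"
end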

Logstash: Attaching to previous line using multiline attaches somewhere else

I have a filter that looks like so:
multiline {
  pattern => "(^.+Exception.*)|(^\tat .+)"
  negate => false
  what => "previous"
}
But for some reason, it's not attaching to the previous line for lines matching ^\tat. Sometimes it does, but most of the time it doesn't; it attaches to a line much further back. I don't see anything wrong with my code.
Does anyone know if this is a bug?
Edit: This worked properly just now, but a couple of minutes later it stopped working again. Is it a buffer overflow? How would I debug this?
Edit: Example of success:
2014-06-20 09:09:07,989 http-bio-8080-exec-629 WARN com.rubiconproject.rfm.adserver.filter.impl.PriorityFilter - Request : NBA_DIV=Zedge_Tier1_App_MPBTAG_320x50_ROS_Android&NBA_APPID=4E51A330AD7A0131112022000A93D4E6&NBA_PUBID=111657&NBA_LOCATION_LAT=&NBA_LOCATION_LNG=&NBA_KV=device_id_sha-1_key=5040e46d15bd2f37b3ba58860cc94c1308c0ca4b&_v=2_0_0&id=84472439740784460, Response : Unable to Score Ads.. Selecting first one and Continuing...
java.lang.IndexOutOfBoundsException: Index: 8, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
Edit: Example of failure:
2014-06-20 09:02:31,139 http-bio-8080-exec-579 WARN com.rubiconproject.rfm.adserver.web.AdRequestController - Request : car=vodafone UK&con=0&model=iPhone&bdl=com.racingpost.general&sup=adm,dfp,iAd&id=8226846&mak=Apple&sze=320x50&TYP=1&rtyp=json&app=F99D88D0FDEC01300BF5123139244773&clt=MBS_iOS_SDK_2.4.0&dpr=2.000000&apver=10.4&osver=7.1&udid=115FC62F-D4FF-44E0-8D92-5A060043EFDD&pub=111407&tud=3&osn=iPhone OS&, Response : No Ad Selected to Serve..Exiting
at java.util.ArrayList.get(ArrayList.java:382)
My file has 13000+ lines, and when it errors, it attaches to a line a couple of hundred lines back. But strangely, each failure attaches at exactly the same offset (by offset I mean the couple of hundred lines that it skips).
Your logs are Java stack traces.
You can try this pattern: use the date, which begins each log entry, as the pattern.
input {
  stdin {}
}
filter {
  multiline {
    pattern => "^(?>\d\d){1,2}-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])"
    what => "previous"
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
This pattern matches the date; if a line does not start with a date, Logstash will merge it into the previous event.
I have tried it with both of your log samples, and it worked for both.
Hope this helps.
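Since the input block reads stdin, you can test the config quickly by piping a log file through Logstash; the file names here are illustrative:
bin/logstash -f multiline.conf < app.log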

Cannot read unicode .csv into R

I have a .csv file, which contains the following data:
"Ա","Բ"
1,10
2,20
I cannot read it into R so that the column names are displayed like they are in the file.
d <- read.csv("./Data/1.csv", fileEncoding="UTF-8")
head(d)
Produces the following:
> d <- read.csv("./Data/1.csv", fileEncoding="UTF-8")
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
invalid input found on input connection './Data/1.csv'
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on './Data/1.csv'
> head(d)
[1] X.
<0 rows> (or 0-length row.names)
Meanwhile, doing the same without specifying the fileEncoding produces this:
> d <- read.csv("./Data/1.csv")
> head(d)
Ô. Ô²
1 1 10
2 2 20
When I run the "file" utility to find out the encoding of the file, it says it is UTF-8:
Data\1.csv: UTF-8 Unicode text, with CRLF line terminators
I am using RStudio, Windows 7, R version 2.15.2, 32-bit.
Thanks in advance.
I wrote a longer answer on the same issue here: R on Windows: character encoding hell.
Quick answer: using the parameter encoding instead of fileEncoding should fix your first issue. You may not be able to view it properly in the console or the table view in RStudio, but you will be able to use it in formulas.
d <- read.csv("./Data/1.csv", encoding="UTF-8")
head(d)
Having saved your table into a UTF-8 file:
> test2 <- read.csv("test2.csv", header = FALSE, sep = ",", quote = "\"", dec = ".", fill = TRUE, comment.char = "", encoding = "UTF-8")
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'test2.csv'
This is how it looks in the console and the RStudio view:
> test2
V1 V2
1 <U+0531> <U+0532>
2 1 10
3 2 20
Importantly, however, you are able to manipulate this within R. Thus, in my case, it is possible to see that the script-window input Ա has UTF-8 encoding, and grep correctly finds this encoding in your table.
> Encoding("Ա")
[1] "UTF-8"
> grep("Ա", as.character(test2[1,1]))
[1] 1
You may need to find suitable encoding variants that work with your settings, or possibly change the settings themselves; unfortunately, I am not sure where that is done.
You might not be able to make it pretty at every stage, but it is definitely possible to get it to work in a Windows 7 environment as well.
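One more workaround worth trying if fileEncoding keeps failing: read the raw lines with an explicit encoding mark and hand them to read.csv as text, which sidesteps re-encoding through the file connection (a sketch, assuming the file really is UTF-8):
lines <- readLines("./Data/1.csv", encoding = "UTF-8")
d <- read.csv(text = lines, stringsAsFactors = FALSE)
head(d)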
I tried two ways to replicate your problem.
I copied the characters above into RStudio, saved it to a csv with this code:
write.csv(c("Ա","Բ",
            1,10,
            2,20), "test.csv")
df <- read.csv("test.csv")
This worked fine.
Then I thought, well, maybe R is cheating when I save it to CSV with R? So I just pasted the characters into a text file and saved it as a CSV. This approach doesn't have problems either.
Here's my session info:
sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8
[4] LC_COLLATE=en_CA.UTF-8 LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] party_1.0-9 modeltools_0.2-21 strucchange_1.4-7 sandwich_2.2-10 zoo_1.7-10
[6] GGally_0.4.4 reshape_0.8.4 plyr_1.8 ggplot2_0.9.3.1
loaded via a namespace (and not attached):
[1] coin_1.0-23 colorspace_1.2-2 dichromat_2.0-0 digest_0.6.3
[5] gtable_0.1.2 labeling_0.2 lattice_0.20-23 MASS_7.3-29
[9] munsell_0.4.2 mvtnorm_0.9-9995 proto_0.3-10 RColorBrewer_1.0-5
[13] reshape2_1.2.2 scales_0.2.3 splines_3.0.1 stringr_0.6.2
I had the same problem and found out that the file was corrupted.
I opened the file with OpenOffice and saved it back using the "UTF8" character set (you need to tick the "Edit filter settings" box), then imported it with read.csv() (no encoding or fileEncoding option) and it worked fine.
