How to add code line number using structlog

Using python standard logging module, the line number for the originating log call can be added using: %(lineno)s.
How can this be accomplished using structlog?

EDIT:
Structlog version 21.5.0 introduced the CallsiteParameterAdder processor, so this should be much more straightforward now, as @vitvlkv's answer shows.
I had a similar need and ended up creating a custom processor.
I looked at what structlog does to output the module and line number when it is told to "pretend" to format in a mode compatible with the logging library (meaning: when it's using a regular stdlib.LoggerFactory) and I found inspiration in that. The key was the following sentence...
By using structlog’s structlog.stdlib.LoggerFactory, it is also ensured that variables like function names and line numbers are expanded correctly in your log format.
... from this documentation page
The code seems to keep looking for execution frames until it finds one that is in a non logging-related module.
I have all the setup for structlog inside a module called my_libs.util.logger so I want to get the first frame that is NOT inside that module. In order to do that, I told it to add my logging-related my_libs.util.logger to those exclusions. That's what the additional_ignores in the code below does.
In the example I hardcoded the module's name ('my_libs.util.logger') in the exclusion list for clarity, but if you have a similar setup you'll probably be better off using __name__ instead. What this does is ignore execution frames that exist only because of the logging machinery in place. You can look at it as a way of ignoring calls that occurred as part of the process of actually logging the message. In other words, calls that happened after the logging.info("Foo") in the actual module/line that you do want to output.
Once it finds the right frame, extracting any kind of information (module name, function name, line number... ) is very easy, particularly using the inspect module. I chose to output the module name and the line number, but more fields could be added.
# file my_libs/util/logger.py
import inspect
import logging
import structlog
from structlog._frames import _find_first_app_frame_and_name
def show_module_info_processor(logger, _, event_dict):
    # If by any chance the record already contains a `modline` key
    # (very rare), move that into a 'modline_original' key
    if 'modline' in event_dict:
        event_dict['modline_original'] = event_dict['modline']
    f, name = _find_first_app_frame_and_name(additional_ignores=[
        "logging",
        'my_libs.util.logger',  # could just be __name__
    ])
    if not f:
        return event_dict
    frameinfo = inspect.getframeinfo(f)
    if not frameinfo:
        return event_dict
    module = inspect.getmodule(f)
    if not module:
        return event_dict
    if frameinfo and module:
        # The `if` above is probably redundant, since we already
        # checked for frameinfo and module but... eh... paranoia.
        event_dict['modline'] = '{}:{}'.format(
            module.__name__,
            frameinfo.lineno,
        )
    return event_dict
def setup_structlog(env=None):
    # . . .
    ch.setFormatter(logging.Formatter('%(message)s'))
    logging.getLogger().handlers = [ch]
    processors = [
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        # . . . more . . .
        show_module_info_processor,  # THIS!!!
        structlog.processors.TimeStamper(fmt="%Y-%m-%d %H:%M:%S"),
        structlog.processors.format_exc_info,
        structlog.processors.StackInfoRenderer(),
        # . . . more . . .
    ]
    # . . . more . . .
    structlog.configure_once(
        logger_factory=structlog.stdlib.LoggerFactory(),
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=structlog.threadlocal.wrap_dict(dict),
        processors=processors,
    )
This produces output like:
server_1 | INFO [my_libs.hdfs] 2019-07-01 01:01:01 [info ] Initialized HDFS [my_libs.hdfs] modline=my_libs.hdfs:31
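For anyone curious about the frame-skipping idea itself, here is a minimal standard-library sketch of it. This is not structlog's implementation: first_app_frame is a made-up helper, and the separate logging module is only simulated (via exec with a hypothetical module name) so that the demo fits in one file. It also relies on sys._getframe, which is a CPython detail.

```python
import sys


def first_app_frame(ignored_prefixes):
    """Walk up the stack and return the first frame whose module name
    does not start with any of the ignored prefixes (None if none is found)."""
    frame = sys._getframe(1)  # start at the caller of this helper
    while frame is not None:
        module_name = frame.f_globals.get("__name__", "?")
        # Prefix match: "logging" also skips submodules such as "logging.handlers".
        if not module_name.startswith(tuple(ignored_prefixes)):
            return frame
        frame = frame.f_back
    return None


# Simulate a separate logging-setup module (hypothetical name) in this one file.
fake_logger_module = {
    "__name__": "my_libs.util.logger",
    "first_app_frame": first_app_frame,
}
exec(
    "def modline():\n"
    "    f = first_app_frame(['logging', __name__])\n"
    "    return '{}:{}'.format(f.f_globals.get('__name__', '?'), f.f_lineno)\n",
    fake_logger_module,
)
modline = fake_logger_module["modline"]


def business_logic():
    # The frame of modline() is skipped, so this line is what gets reported.
    return modline()


result = business_logic()
print(result)
```

The printed value has the form module:line and points at the call inside business_logic, not at the helper that did the lookup.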

According to the official docs, you can add CallsiteParameterAdder to your processors:
import structlog
from structlog.processors import CallsiteParameter
structlog.configure(
    processors=[
        # ...
        # Add callsite parameters.
        structlog.processors.CallsiteParameterAdder(
            [CallsiteParameter.FILENAME,
             CallsiteParameter.FUNC_NAME,
             CallsiteParameter.LINENO],
        ),
        # ...
    ],
    # ...
)
So there is no need to write a custom processor for this. It was hard to find in the official docs, though.

Have a look at this answer to the more general question of how to get a line number:
https://stackoverflow.com/a/3056270/5909155
The line number cannot be bound to the logger with log.bind(...) because it has to be evaluated each time you log. Instead, add it as a key-value pair on every call, like this:
logger.log(..., lineno=inspect.getframeinfo(inspect.currentframe()).lineno)
You may want to wrap this in a function, though, like this: https://stackoverflow.com/a/20372465/5909155
Don't forget to
import inspect
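A rough sketch of what such a wrapper could look like, using only the standard library. The names with_lineno and fake_info are made up for illustration; fake_info stands in for a bound logger method such as logger.info, so the snippet runs without structlog installed.

```python
import inspect


def with_lineno(log_method, event, **kwargs):
    """Call log_method(event, lineno=<line of the caller>, **kwargs)."""
    # f_back is the frame that called with_lineno, i.e. the real call site.
    caller = inspect.currentframe().f_back
    return log_method(event, lineno=caller.f_lineno, **kwargs)


# Stand-in for a logger method (assumption for the demo only).
captured = []


def fake_info(event, **kw):
    captured.append((event, kw))


with_lineno(fake_info, "Initialized HDFS", host="server_1")
print(captured[0])
```

Because the frame is looked up inside the wrapper on every call, the line number is evaluated at log time, which is exactly what bind() cannot do.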

Related

How do nested variables within the .env file work in CodeIgniter 4

Under the "Nesting Variables" section in Codeigniter4 site:
"To save on typing, you can reuse variables that you’ve already specified in the file by wrapping the variable name within ${...}"
link to CI nesting Variables section
example in the documentation:
BASE_DIR="/var/webroot/project-root"
CACHE_DIR="${BASE_DIR}/cache"
TMP_DIR="${BASE_DIR}/tmp"
I was trying to use the following
app.baseURL = 'http://localhost:8080/'
google.redirect = ${app.baseURL}Google
However, print_r($_ENV) shows it being assigned as a literal:
[google.redirect] => ${app.baseURL}Google
I've tried using non-namespaced keys including BASE_DIR (per the example) and it keeps printing as a literal.
What's strange - When I use the following:
CI_ENVIRONMENT = development
google.redirect = ${CI_ENVIRONMENT}Google
The result when print_r is:
[CI_ENVIRONMENT] => development
[google.redirect] => developmentGoogle
My question is - What am I doing incorrectly and/or how should these be set/used correctly?
According to the documentation, I should be able to use any key within the .env file that was already assigned using
${somekeyinthisfile}
After a bit of looking, there is a more recent version of the file at
https://github.com/codeigniter4/CodeIgniter4/blob/develop/system/Config/DotEnv.php
with all the "other" changes...
This was a bug fix, so get that file and you will be good to go.
I am pretty sure the intention wasn't to allow app.xxx settings to be used as variables, since the documentation never shows them being used that way. (Yes, it's 6am now...)
But it is your code to do with as you please, so if you want to use app.xxx as variables...
The only thing missing is the dot (.) in the regex.
If you look at line 272 of system/Config/DotEnv.php, inside the method resolveNestedVariables(), and add a . (dot) to the regex, all your app.things will work.
$value = preg_replace_callback(
'/\${([a-zA-Z0-9_.]+)}/',
function ($matchedPatterns) use ($loader) {
I have added a dot (.) at the end of the character class [a-zA-Z0-9_].
So
'/\${([a-zA-Z0-9_]+)}/',
becomes
'/\${([a-zA-Z0-9_.]+)}/',
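To see the effect of that one character, here is a small sketch that reproduces the two patterns with Python's re module. It only mimics the substitution idea and is not CodeIgniter's code; resolve and the env dict are made up for the illustration.

```python
import re

# The two character classes from above, written as Python regexes.
WITHOUT_DOT = re.compile(r"\$\{([a-zA-Z0-9_]+)\}")
WITH_DOT = re.compile(r"\$\{([a-zA-Z0-9_.]+)\}")


def resolve(value, env, pattern):
    """Replace each ${name} that the pattern matches with env[name];
    unknown names and unmatched references are left untouched."""
    return pattern.sub(lambda m: env.get(m.group(1), m.group(0)), value)


env = {
    "app.baseURL": "http://localhost:8080/",
    "CI_ENVIRONMENT": "development",
}

literal = resolve("${app.baseURL}Google", env, WITHOUT_DOT)
expanded = resolve("${app.baseURL}Google", env, WITH_DOT)
plain_key = resolve("${CI_ENVIRONMENT}Google", env, WITHOUT_DOT)

print(literal)    # ${app.baseURL}Google
print(expanded)   # http://localhost:8080/Google
print(plain_key)  # developmentGoogle
```

Without the dot, ${app.baseURL} never matches, so it stays a literal, while a key like CI_ENVIRONMENT is replaced. That matches the behaviour described in the question.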

Eper redbug, print line number in message

Here are some messages from using redbug in the shell:
% 02:49:02 <0.116.0>({cowboy_protocol,init,4})
% func1:start(<<"/second">>, [some])
% 02:49:02 <0.116.0>({cowboy_protocol,init,4})
% func1:looper(<<"/home/second">>, #{data => []}])
Is there a way to also print line numbers in redbug messages?
redbug:help() does show this:
print_fun () custom print handler, fun/1 or fun/2;
fun(TrcMsg) -> <ignored>
fun(TrcMsg,AccOld) -> AccNew
But there is no good explanation anywhere of how to use it, so I couldn't check whether I could add line numbers to the message.
It doesn't seem you can do it in any straightforward way.
The easiest way to check this is to print all the data you receive in your print_fun:
1> PrintFun = fun (Msg) -> io:format( ">>> ~p~n" , [Msg]) end.
#Fun<erl_eval.6.90072148>
2> redbug:start("erlang" , [{print_fun, PrintFun}]).
{30,249}
>>> {call,{{erlang,demonitor,[#Ref<0.0.0.40>]},<<>>},
{<0.33.0>,{erlang,apply,2}},
{11,40,31,554200}}
>>> {call,{{erlang,atom_to_list,['PrintFun']},<<>>},
{<0.33.0>,{erlang,apply,2}},
{11,40,31,554210}}
>>> {call,{{erlang,group_leader,[]},<<>>},
{<0.33.0>,{erlang,apply,2}},
{11,40,31,554213}}
>>> {call,{{erlang,monitor,[process,<0.26.0>]},<<>>},
{<0.33.0>,{erlang,apply,2}},
{11,40,31,554215}}
>>> {call,{{erlang,port_control,[#Port<0.491>,101,[]]},<<>>},
{<0.24.0>,user_drv},
{11,40,31,554231}}
>>> {call,{{erlang,module_loaded,[calendar]},<<>>},
{<0.20.0>,code_server},
{11,40,31,554257}}
>>> {call,{{erlang,atom_to_list,[calendar]},<<>>},
{<0.20.0>,code_server},
{11,40,31,554263}}
>>> {call,{{erlang,'++',["calendar",".beam"]},<<>>},
{<0.20.0>,code_server},
{11,40,31,554265}}
>>> {call,{{erlang,'++',["ebin","/calendar.beam"]},<<>>},
{<0.20.0>,code_server},
{11,40,31,554268}}
>>> {call,{{erlang,whereis,[erl_prim_loader]},<<>>},
{<0.20.0>,code_server},
{11,40,31,554270}}
redbug done, msg_count - 10
As you can see, all you get is MFA ({Module, Function, Arguments}), calling process, and time stamp.
To get the actual line of a function call you would have to dig into the debug_info attached to the beam file (if there is any) with the beam_lib module. I think the simplest way to do it would be beam_lib:chunks(Module, [abstract_code]), like this:
{ok,{redbug,[{abstract_code,{raw_abstract_v1,[{attribute,1,
file,
{"src/redbug.erl",1}},
{attribute,9,module,redbug},
{attribute,11,export,[{help,0}]},
{attribute,13,export,[{unix,1}]},
{attribute,15,export,
[{start,1},{start,2},{start,3},{start,4},{start,5}]},
{attribute,16,export,[{stop,0}]},
{attribute,1,file,{"src/log.hrl",1}},
{function,17,'?log',2,
[{clause,17,[{var,17,...},{var,...}],[[{...}]],[{...}]},
{clause,18,[{var,...},{...}],[],[...]}]},
{attribute,19,file,{"src/redbug.erl",19}},
{attribute,22,record,
{cnf,[{record_field,24,{...},...},
{record_field,25,...},
{record_field,...},
{...}|...]}},
{function,57,help,0,[{clause,57,[],...}]},
{function,123,unix,1,
[{clause,123,...},{clause,...},{...}|...]},
{function,146,to_term,1,[{clause,...},{...}]},
{function,154,maybe_halt,1,[{...}]},
{function,160,is_in_shell,0,[...]},
{function,167,stop,0,...},
{function,174,start,...},
{function,176,...},
{function,...},
{...}|...]}}]}}
There you will find a list containing tuples like {function, LineNumber, FunctionName, Arity, FunctionCodeAsList}. By going through this list and finding the function you are looking for, you can extract LineNumber.
Still, you need to take into account a few things that might not work:
You are analyzing the actual file from disk, so it needs to be in your current directory. This has nothing to do with whether the module is loaded in your VM, or which code version is loaded. So you might have to put in a little work to actually find the .beam file.
Files compiled without debug_info might fail to produce this abstract syntax tree. You need to decide whether that might be your case, and how you would like to handle it.
Some beams might be encrypted, and some might have other issues. You should read the beam_lib module documentation just to get a feel for what you are dealing with.
If you do come up with something, please do share. Happy hacking !

perl package statement and perl module locations

I am writing my first real Perl modules using Moo. But I am getting confused about package locations and package statements and "use".
I have a simple program in c:/temp:
use Roadmap::a;
print a->new->aa->[0] . "\n";
I have a simple module at C:/bin/perl/perl/site/lib/Roadmap/a.pm. @INC has:
@INC:
C:/bin/perl/perl/site/lib
C:/bin/perl/perl/vendor/lib
C:/bin/perl/perl/lib
.
If a.pm is this:
package Roadmap::a;
use Moo;
use Types::Standard qw(ArrayRef);
has aa => (is => 'rw', isa => ArrayRef, default => sub{ [3] });
1;
It returns an error:
Can't locate object method "new" via package "a"
(perhaps you forgot to load "a"?) at e.pl line 2.
But if a.pm is this:
package a;
use Moo;
use Types::Standard qw(ArrayRef);
has aa => (is => 'rw', isa => ArrayRef, default => sub{ [3] });
1;
It succeeds and returns the answer '3'. What am I doing wrong? Shouldn't all modules have names of Somedir::Module ?? Thanks!
Update: this module is simple enough that all I did was create the directory site/lib/Roadmap and place the a.pm file within it. Should I have instead tried to do some kind of Perl module install on this simple module? For this simple example, it didn't seem like I needed to.
Update: if I replace "a->new()" with "Roadmap::a->new()" then my original example works. But I was hoping that with a package Somedir::Module, after doing the "use Somedir::Module" in my script, I could then just do Module->new() instead of Somedir::Module->new().
Update: continuing the conversation with myself, if I change the script to this format, it all works the way I want it to. (Of course I would never use a lowercase module name in practice, this is just for testing.)
use aliased 'Roadmap::a';
print a->new->aa->[0] . "\n";
So I guess I have now answered my own question.
"use aliased" solves the problem nicely.

Python: Call a shell script which calls a bin. With arguments

The context: There is a directory somewhere on the system with bin files which I'd like to call. They are not callable directly, though, but only through shell scripts which do all kinds of magic and then call the corresponding bin with: "$ENV_VAR/path/to/the/bin" "$@" (the software is non-free, which is probably why this construction is used).
The problem: Calling this from within Python. I tried to use:
from subprocess import call
call(["nameOfBin", "-input somefile"])
But this gave the error ERROR: nameOfBin - Illegal option: input somefile. This means the '-' sign in front of 'input' has disappeared along the way (putting more '-' signs in front doesn't help).
Possible solutions:
1: In some way preserving the '-' sign so the bin at the end actually takes '-input' as an option instead of 'input'.
2: Fix the magic in a dirty way (I will probably manage), and have a way to call a bin at a location defined by a $ENV_VAR (environment variable).
I searched for both methods, but apparently nobody before me has had such a problem (or I didn't see it; sorry if that's the case).
Each item in the list should be a single argument. Replace "-input somefile" with "-input", "somefile":
from subprocess import call
rc = call(["nameOfBin", "-input", "somefile"])
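To make the difference visible, here is a small sketch that replaces the real bin with a child Python process that simply reports the arguments it received (an assumption for the demo, since the actual bin is not available). It also shows shlex.split, which can turn a command-line string into the list form.

```python
import json
import shlex
import subprocess
import sys

# Child process that prints the arguments it received as a JSON list.
echo_args = [sys.executable, "-c", "import sys, json; print(json.dumps(sys.argv[1:]))"]

# One list item containing a space: the child sees a single argument.
one_item = subprocess.run(
    echo_args + ["-input somefile"], capture_output=True, text=True
)
# Two list items: the child sees the option and its value separately.
two_items = subprocess.run(
    echo_args + ["-input", "somefile"], capture_output=True, text=True
)

received_one = json.loads(one_item.stdout)
received_two = json.loads(two_items.stdout)

# shlex.split applies shell-like word splitting to a string.
parsed = shlex.split("-input somefile")

print(received_one)  # ['-input somefile']
print(received_two)  # ['-input', 'somefile']
print(parsed)        # ['-input', 'somefile']
```

In the first call the program receives the single argument "-input somefile", which its option parser most likely reads as one unknown option named "input somefile". That would explain the error message in the question.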

How to get R script line numbers at error?

If I am running a long R script from the command line (R --slave script.R), then how can I get it to give line numbers at errors?
I don't want to add debug commands to the script if at all possible; I just want R to behave like most other scripting languages.
This won't give you the line number, but it will tell you where the failure happens in the call stack which is very helpful:
traceback()
[Edit:] When running a script from the command line you will have to skip one or two calls, see traceback() for interactive and non-interactive R sessions
I'm not aware of another way to do this without the usual debugging suspects:
debug()
browser()
options(error=recover) [followed by options(error = NULL) to revert it]
You might want to look at this related post.
[Edit:] Sorry...just saw that you're running this from the command line. In that case I would suggest working with the options(error) functionality. Here's a simple example:
options(error = quote({dump.frames(to.file=TRUE); q()}))
You can create as elaborate a script as you want on an error condition, so you should just decide what information you need for debugging.
Otherwise, if there are specific areas you're concerned about (e.g. connecting to a database), then wrap them in a tryCatch() function.
Doing options(error=traceback) provides a little more information about the content of the lines leading up to the error. It causes a traceback to appear if there is an error, and for some errors it has the line number, prefixed by #. But it's hit or miss; many errors won't get line numbers.
Support for this will be forthcoming in R 2.10 and later. Duncan Murdoch just posted to r-devel on Sep 10 2009 about findLineNum and setBreakpoint:
I've just added a couple of functions to R-devel to help with
debugging. findLineNum() finds which line of which function
corresponds to a particular line of source code; setBreakpoint() takes
the output of findLineNum, and calls trace() to set a breakpoint
there.
These rely on having source reference debug information in the code.
This is the default for code read by source(), but not for packages.
To get the source references in package code, set the environment
variable R_KEEP_PKG_SOURCE=yes, or within R, set
options(keep.source.pkgs=TRUE), then install the package from source
code. Read ?findLineNum for details on how to tell it to search
within packages, rather than limiting the search to the global
environment.
For example,
x <- " f <- function(a, b) {
    if (a > b) {
        a
    } else {
        b
    }
}"
eval(parse(text=x)) # Normally you'd use source() to read a file...
findLineNum("<text>#3") # <text> is a dummy filename used by parse(text=)
This will print
f step 2,3,2 in <environment: R_GlobalEnv>
and you can use
setBreakpoint("<text>#3")
to set a breakpoint there.
There are still some limitations (and probably bugs) in the code; I'll
be fixing thos
You do it by setting
options(show.error.locations = TRUE)
I just wonder why this setting is not a default in R? It should be, as it is in every other language.
Specifying the global R option for handling non-catastrophic errors worked for me, along with a customized workflow for retaining info about the error and examining this info after the failure. I am currently running R version 3.4.1.
Below, I've included a description of the workflow that worked for me, as well as some code I used to set the global error handling option in R.
As I have it configured, the error handling also creates an RData file containing all objects in working memory at the time of the error. This dump can be read back into R using load() and then the various environments as they existed at the time of the error can be inspected interactively using debugger(errorDump).
I will note that I was able to get line numbers in the traceback() output from any custom functions within the stack, but only if I used the keep.source=TRUE option when calling source() for any custom functions used in my script. Without this option, setting the global error handling option as below sent the full output of the traceback() to an error log named error.log, but line numbers were not available.
Here are the general steps I took in my workflow and how I was able to access the memory dump and error log after a non-interactive R failure.
1. I put the following at the top of the main script I was calling from the command line. This sets the global error handling option for the R session. My main script was called myMainScript.R. The various lines in the code have comments after them describing what they do. Basically, with this option, when R encounters an error that triggers stop(), it will create an RData (*.rda) dump file of working memory across all active environments in the directory ~/myUsername/directoryForDump and will also write an error log named error.log with some useful information to the same directory. You can modify this snippet to add other handling on error (e.g., add a timestamp to the dump file and error log filenames, etc.).
options(error = quote({
    setwd('~/myUsername/directoryForDump'); # Set working directory where you want the dump to go, since dump.frames() doesn't seem to accept absolute file paths.
    dump.frames("errorDump", to.file=TRUE, include.GlobalEnv=TRUE); # First dump to file; this dump is not accessible by the R session.
    sink(file="error.log"); # Specify sink file to redirect all output.
    dump.frames(); # Dump again to be able to retrieve error message and write to error log; this dump is accessible by the R session since not dumped to file.
    cat(attr(last.dump,"error.message")); # Print error message to file, along with simplified stack trace.
    cat('\nTraceback:');
    cat('\n');
    traceback(2); # Print full traceback of function calls with all parameters. The 2 passed to traceback omits the outermost two function calls.
    sink();
    q()}))
2. Make sure that from the main script and any subsequent function calls, anytime a function is sourced, the option keep.source=TRUE is used. That is, to source a function, you would use source('~/path/to/myFunction.R', keep.source=TRUE). This is required for the traceback() output to contain line numbers. It looks like you may also be able to set this option globally using options(keep.source=TRUE), but I have not tested this to see if it works. If you don't need line numbers, you can omit this option.
3. From the terminal (outside R), call the main script in batch mode using Rscript myMainScript.R. This starts a new non-interactive R session and runs the script myMainScript.R. The code snippet given in step 1 that has been placed at the top of myMainScript.R sets the error handling option for the non-interactive R session.
4. Encounter an error somewhere within the execution of myMainScript.R. This may be in the main script itself, or nested several functions deep. When the error is encountered, handling will be performed as specified in step 1, and the R session will terminate.
5. An RData dump file named errorDump.rda and an error log named error.log are created in the directory specified by '~/myUsername/directoryForDump' in the global error handling option setting.
6. At your leisure, inspect error.log to review information about the error, including the error message itself and the full stack trace leading to the error. Here's an example of the log that's generated on error; note the numbers after the # character are the line numbers of the error at various points in the call stack:
Error in callNonExistFunc() : could not find function "callNonExistFunc"
Calls: test_multi_commodity_flow_cmd -> getExtendedConfigDF -> extendConfigDF
Traceback:
3: extendConfigDF(info_df, data_dir = user_dir, dlevel = dlevel) at test_multi_commodity_flow.R#304
2: getExtendedConfigDF(config_file_path, out_dir, dlevel) at test_multi_commodity_flow.R#352
1: test_multi_commodity_flow_cmd(config_file_path = config_file_path,
    spot_file_path = spot_file_path, forward_file_path = forward_file_path,
    data_dir = "../", user_dir = "Output", sim_type = "spot",
    sim_scheme = "shape", sim_gran = "hourly", sim_adjust = "raw",
    nsim = 5, start_date = "2017-07-01", end_date = "2017-12-31",
    compute_averages = opt$compute_averages, compute_shapes = opt$compute_shapes,
    overwrite = opt$overwrite, nmonths = opt$nmonths, forward_regime = opt$fregime,
    ltfv_ratio = opt$ltfv_ratio, method = opt$method, dlevel = 0)
7. At your leisure, you may load errorDump.rda into an interactive R session using load('~/path/to/errorDump.rda'). Once loaded, call debugger(errorDump) to browse all R objects in memory in any of the active environments. See the R help on debugger() for more info.
This workflow is enormously helpful when running R in some type of production environment where you have non-interactive R sessions being initiated at the command line and you want information retained about unexpected errors. The ability to dump memory to a file you can use to inspect working memory at the time of the error, along with having the line numbers of the error in the call stack, facilitates speedy post-mortem debugging of what caused the error.
First, options(show.error.locations = TRUE) and then traceback(). The error line number will be displayed after #

Resources