In mrskew, can I calculate human-readable timestamps from tim?

I'm using mrskew by Method-R to analyze Oracle SQL trace files.
I want to list all database calls, similar to the output of calls.rc, but instead of the raw value of $tim I'd like to print a human-readable date format.
Raw data (minimal, obfuscated):
*** 2020-11-26 10:06:01.867
*** SESSION ID:(1391.49878) 2020-11-26 10:06:01.867
*** CLIENT ID:() 2020-11-26 10:06:01.867
*** SERVICE NAME:(SYS$USERS) 2020-11-26 10:06:01.867
*** MODULE NAME:(JDBC Thin Client) 2020-11-26 10:06:01.867
*** CLIENT DRIVER:(jdbcthin : 12.2.0.1.0) 2020-11-26 10:06:01.867
*** ACTION NAME:() 2020-11-26 10:06:01.867
...
WAIT #0: nam='SQL*Net message from client' ela= 491 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=12568091328841
=====================
PARSING IN CURSOR #18446744071522016088 len=71 dep=0 uid=88 oct=7 lid=88 tim=12568091329190 hv=2304270232 ad='61e4d11e0' sqlid='5kpbj024phrws'
/*Begin.Work*/
SELECT ...
END OF STMT
PARSE #18446744071522016088:c=147,e=148,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=957996380,tim=12568091329190
...
EXEC #18446744071522016088:c=683,e=11406,p=0,cr=2,cu=11,mis=0,r=1,dep=0,og=1,plh=957996380,tim=12568091341788
CLOSE #18446744071522016088:c=27,e=27,dep=0,type=1,tim=12568091343665
XCTEND rlbk=0, rd_only=0, tim=12568091343769
Current output (compacted for readability):
END-TIM LINE SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
12568091.341788 36 5kpbj024phrws EXEC /*Begin.Work*/ SELECT ...
12568091.343769 42 XCTEND
Expected output (please don't criticise my incorrect subsecond calculation):
END-TIME LINE SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------
2020-11-26 10:06:01.341788 36 5kpbj024phrws EXEC /*Begin.Work*/ SELECT ...
2020-11-26 10:06:01.343769 42 XCTEND
I assume I can use POSIX::strftime to format the timestamp properly, but I need a way to generate an epoch timestamp from the timestamp at the beginning of the trace file
*** 2020-11-26 10:06:01.867
and then an offset for each $tim relative to this start of the trace file.
I hope the Method R toolset can provide this. It would make it easier for me to explain when (in human-readable form) each activity started.

Generating an epoch seconds number from a string date is simple enough. Time::Local has functions to do it.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Time::Local 'timelocal_posix';
my $date = '2020-11-26 10:06:01.867';
# Split the date apart
my ($yr, $mon, $day, $hr, $min, $sec, $micro) = split /[- :.]/, $date;
# Note necessary adjustments to month and year
say timelocal_posix($sec, $min, $hr, $day, $mon - 1, $yr - 1900);
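From there it's just arithmetic. Here is a minimal sketch that continues from the variables above, assuming tim values are microseconds and treating the first tim= in the file as coinciding with the *** header timestamp (the alignment is approximate, as the question itself concedes):
use POSIX 'strftime';

my $base_epoch = timelocal_posix($sec, $min, $hr, $day, $mon - 1, $yr - 1900)
               + $micro / 1000;        # header subseconds: .867
my $base_tim = 12568091328841;         # first tim= value seen in the trace file

my $tim = 12568091341788;              # tim= of the call being reported
my $end = $base_epoch + ($tim - $base_tim) / 1e6;
printf "%s.%06d\n",
    strftime('%Y-%m-%d %H:%M:%S', localtime(int($end))),
    ($end - int($end)) * 1e6;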

With the help of comments by @DaveCross and @CaryMillsap, my processing is now:
from the trace file, create an intermediate file similar to
*** 2020-11-26 10:06:01.867
END-TIM LINE SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
XCTEND tim=12568091341788 e=2 dep=0 36 5kpbj024phrws EXEC /*Begin.Work*/ SELECT ..
XCTEND tim=12568091343769 e=1 dep=0 42 XCTEND
using in calls.rc
sprintf("XCTEND tim=%-20d e=%-5d dep=0 %10d %10d %10d %13s %-40.40s %-.46s", $tim*1000000, $line,($e+$ela)*1000000, $parse_id, $exec_id, $sqlid, "· "x$dep.$name.(scalar(@bind)?"(".join(",",@bind).")":""), "· "x$dep.$sql)
modify the result to have somewhere on top
*** 2020-11-26 10:06:01.867
process this file with mrwhen
get rid of unwanted parts with
sed -E 's/XCTEND t.{37}//'
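Put together, the pipeline sketched above might look like this (a sketch: the file names are placeholders, and the exact mrskew/mrwhen invocations may differ on your installation):
# 1. create the intermediate file using the sprintf above in calls.rc
mrskew --rc=calls.rc trace.trc > intermediate.txt
# 2. prepend the wall-clock header so mrwhen can anchor the tim values
(echo '*** 2020-11-26 10:06:01.867'; cat intermediate.txt) > with-header.txt
# 3. convert tim values to timestamps, then strip the helper prefix
mrwhen with-header.txt | sed -E 's/XCTEND t.{37}//'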
In more detail it's documented here.

JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0) ---While Tuning gpt2.finetune

Hope you are all doing well.
I am working on fine-tuning a GPT-2 model to generate a title based on the content. I have created a simple CSV file containing only the titles to train the model, but while passing this file to GPT-2 for fine-tuning I am getting the following error:
JSONDecodeError Traceback (most recent call last)
in ()
10 steps=1000,
11 save_every=200,
---> 12 sample_every=25) # steps is max number of training steps
13
14 # gpt2.generate(sess)
3 frames
/usr/lib/python3.7/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
336 if s.startswith('\ufeff'):
337 s = s.encode('utf8')[3:].decode('utf8')
--> 338 # raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
339 # s, 0)
340 else:
JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)
Below is my code for the above:
import gpt_2_simple as gpt2
model_name = "120M" # "355M" for larger model (it's 1.4 GB)
gpt2.download_gpt2(model_name=model_name) # model is saved into current directory under /models/117M/
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
'titles.csv',
model_name=model_name,
steps=1000,
save_every=200,
sample_every=25) # steps is max number of training steps
I have tried all the basic mechanisms for handling a UTF-8 BOM but did not have any luck, hence requesting your help. It would be a great help from you all.
Try changing the model name, because I see you input 120M and the GPT-2 model is called 124M.
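That is, model_name = "124M". Separately, the traceback shows the file failing a UTF-8 BOM check, so it may also help to re-save the CSV without the BOM before fine-tuning; a minimal sketch (titles_nobom.csv is a hypothetical name):
import codecs

# utf-8-sig consumes a leading BOM if one is present
with codecs.open('titles.csv', 'r', encoding='utf-8-sig') as f:
    text = f.read()
# write back as plain UTF-8 (no BOM) and fine-tune on this file instead
with open('titles_nobom.csv', 'w', encoding='utf-8') as f:
    f.write(text)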

How to eliminate JIT overhead in a Julia executable (with MWE)

I'm using PackageCompiler hoping to create an executable that eliminates just-in-time compilation overhead.
The documentation explains that I must define a function julia_main to call my program's logic, and write a "snoop file", a script that calls the functions I wish to precompile. My julia_main takes a single argument, the location of a file containing the input data to be analysed. So to keep things simple, my snoop file makes one call to julia_main with a particular input file. I'd therefore hope to see the generated executable run nice and fast (no compilation overhead) when executed against that same input file.
But alas, that's not what I see. In a fresh Julia instance julia_main takes approx 74 seconds for the first execution and about 4.5 seconds for subsequent executions. The executable file takes approx 50 seconds each time it's called.
My use of the build_executable function looks like this:
julia> using PackageCompiler
julia> build_executable("d:/philip/source/script/julia/jsource/SCRiPTMain.jl",
"testexecutable",
builddir = "d:/temp/builddir4",
snoopfile = "d:/philip/source/script/julia/jsource/snoop.jl",
compile = "all",
verbose = true)
Questions:
Are the above arguments correct to achieve my aim of an executable with no JIT overhead?
Any other advice for me?
Here's what happens in response to that call to build_executable. The lines from Start of snoop file execution! to End of snoop file execution! are emitted by my code.
Julia program file:
"d:\philip\source\script\julia\jsource\SCRiPTMain.jl"
C program file:
"C:\Users\Philip\.julia\packages\PackageCompiler\CJQcs\examples\program.c"
Build directory:
"d:\temp\builddir4"
Executing snoopfile: "d:\philip\source\script\julia\jsource\snoop.jl"
Start of snoop file execution!
┌ Warning: The 'control file' contains the key 'InterpolateCovariance' with value 'true' but that is not supported. Pass a value of 'false' or omit the key altogether.
└ @ ValidateInputs d:\Philip\Source\script\Julia\JSource\ValidateInputs.jl:685
Time to build model 20.058000087738037
Saving c:/temp/SCRiPT/SCRiPTModel.jls
Results written to c:/temp/SCRiPT/SCRiPTResultsJulia.json
Time to write file: 3620 milliseconds
Time in method runscript: 76899 milliseconds
End of snoop file execution!
[ Info: used 1313 out of 1320 precompile statements
Build static library "testexecutable.a":
atexit_hook_copy = copy(Base.atexit_hooks) # make backup
# clean state so that any package we use can carelessly call atexit
empty!(Base.atexit_hooks)
Base.__init__()
Sys.__init__() #fix https://github.com/JuliaLang/julia/issues/30479
using REPL
Base.REPL_MODULE_REF[] = REPL
Mod = @eval module $(gensym("anon_module")) end
# Include into anonymous module to not polute namespace
Mod.include("d:\\\\temp\\\\builddir4\\\\julia_main.jl")
Base._atexit() # run all exit hooks we registered during precompile
empty!(Base.atexit_hooks) # don't serialize the exit hooks we run + added
# atexit_hook_copy should be empty, but who knows what base will do in the future
append!(Base.atexit_hooks, atexit_hook_copy)
Build shared library "testexecutable.dll":
`'C:\Users\Philip\.julia\packages\WinRPM\Y9QdZ\deps\usr\x86_64-w64-mingw32\sys-root\mingw\bin\gcc.exe' --sysroot 'C:\Users\Philip\.julia\packages\WinRPM\Y9QdZ\deps\usr\x86_64-w64-mingw32\sys-root' -shared '-DJULIAC_PROGRAM_LIBNAME="testexecutable.dll"' -o testexecutable.dll -Wl,--whole-archive testexecutable.a -Wl,--no-whole-archive -std=gnu99 '-IC:\Users\philip\AppData\Local\Julia-1.2.0\include\julia' -DJULIA_ENABLE_THREADING=1 '-LC:\Users\philip\AppData\Local\Julia-1.2.0\bin' -Wl,--stack,8388608 -ljulia -lopenlibm -m64 -Wl,--export-all-symbols`
Build executable "testexecutable.exe":
`'C:\Users\Philip\.julia\packages\WinRPM\Y9QdZ\deps\usr\x86_64-w64-mingw32\sys-root\mingw\bin\gcc.exe' --sysroot 'C:\Users\Philip\.julia\packages\WinRPM\Y9QdZ\deps\usr\x86_64-w64-mingw32\sys-root' '-DJULIAC_PROGRAM_LIBNAME="testexecutable.dll"' -o testexecutable.exe 'C:\Users\Philip\.julia\packages\PackageCompiler\CJQcs\examples\program.c' testexecutable.dll -std=gnu99 '-IC:\Users\philip\AppData\Local\Julia-1.2.0\include\julia' -DJULIA_ENABLE_THREADING=1 '-LC:\Users\philip\AppData\Local\Julia-1.2.0\bin' -Wl,--stack,8388608 -ljulia -lopenlibm -m64`
Copy Julia libraries to build directory:
7z.dll
BugpointPasses.dll
libamd.2.4.6.dll
libamd.2.dll
libamd.dll
libatomic-1.dll
libbtf.1.2.6.dll
libbtf.1.dll
libbtf.dll
libcamd.2.4.6.dll
libcamd.2.dll
libcamd.dll
libccalltest.dll
libccolamd.2.9.6.dll
libccolamd.2.dll
libccolamd.dll
libcholmod.3.0.13.dll
libcholmod.3.dll
libcholmod.dll
libclang.dll
libcolamd.2.9.6.dll
libcolamd.2.dll
libcolamd.dll
libdSFMT.dll
libexpat-1.dll
libgcc_s_seh-1.dll
libgfortran-4.dll
libgit2.dll
libgmp.dll
libjulia.dll
libklu.1.3.8.dll
libklu.1.dll
libklu.dll
libldl.2.2.6.dll
libldl.2.dll
libldl.dll
libllvmcalltest.dll
libmbedcrypto.dll
libmbedtls.dll
libmbedx509.dll
libmpfr.dll
libopenblas64_.dll
libopenlibm.dll
libpcre2-8-0.dll
libpcre2-8.dll
libpcre2-posix-2.dll
libquadmath-0.dll
librbio.2.2.6.dll
librbio.2.dll
librbio.dll
libspqr.2.0.9.dll
libspqr.2.dll
libspqr.dll
libssh2.dll
libssp-0.dll
libstdc++-6.dll
libsuitesparseconfig.5.4.0.dll
libsuitesparseconfig.5.dll
libsuitesparseconfig.dll
libsuitesparse_wrapper.dll
libumfpack.5.7.8.dll
libumfpack.5.dll
libumfpack.dll
libuv-2.dll
libwinpthread-1.dll
LLVM.dll
LLVMHello.dll
zlib1.dll
All done
julia>
EDIT
I was afraid that creating a minimal working example would be hard, but it was straightforward:
TestBuildExecutable.jl contains:
module TestBuildExecutable
Base.@ccallable function julia_main(ARGS::Vector{String}=[""])::Cint
@show sum(myarray())
return 0
end
# Function which takes approx 8 seconds to compile. Returns a 500 x 20 array of 1s
function myarray()
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1;
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1;
# PLEASE EDIT TO INSERT THE MISSING 496 LINES, EACH IDENTICAL TO THE LINE ABOVE!
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1;
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
end
end #module
SnoopFile.jl contains:
module SnoopFile
currentpath = dirname(@__FILE__)
push!(LOAD_PATH, currentpath)
unique!(LOAD_PATH)
using TestBuildExecutable
println("Start of snoop file execution!")
TestBuildExecutable.julia_main()
println("End of snoop file execution!")
end # module
In a fresh Julia instance, julia_main takes 8.3 seconds for the first execution and half a millisecond for the second execution:
julia> @time TestBuildExecutable.julia_main()
sum(myarray()) = 10000
8.355108 seconds (425.36 k allocations: 25.831 MiB, 0.06% gc time)
0
julia> @time TestBuildExecutable.julia_main()
sum(myarray()) = 10000
0.000537 seconds (25 allocations: 82.906 KiB)
0
So next I call build_executable:
julia> using PackageCompiler
julia> build_executable("d:/philip/source/script/julia/jsource/TestBuildExecutable.jl",
"testexecutable",
builddir = "d:/temp/builddir15",
snoopfile = "d:/philip/source/script/julia/jsource/SnoopFile.jl",
verbose = false)
Julia program file:
"d:\philip\source\script\julia\jsource\TestBuildExecutable.jl"
C program file:
"C:\Users\Philip\.julia\packages\PackageCompiler\CJQcs\examples\program.c"
Build directory:
"d:\temp\builddir15"
Start of snoop file execution!
sum(myarray()) = 10000
End of snoop file execution!
[ Info: used 79 out of 79 precompile statements
All done
Finally, in a Windows Command Prompt:
D:\temp\builddir15>testexecutable
sum(myarray()) = 10000
D:\temp\builddir15>
which took (by my stopwatch) 8 seconds to run, and it takes 8 seconds to run every time it's executed, not just the first time. This is consistent with the executable doing a JIT compile every time it's run, but the snoop file is designed to avoid that!
Version information:
julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 8
JULIA_EDITOR = "C:\Users\Philip\AppData\Local\Programs\Microsoft VS Code\Code.exe"
Looks like you are using Windows.
At some point PackageCompiler.jl will be mature on Windows, at which point you can try it.
The solution was indeed to wait for progress on PackageCompilerX, as suggested by @xiaodai.
On 10 Feb 2020 what was formerly PackageCompilerX became a new (version 1.0 of) PackageCompiler, with a significantly changed API, and more thorough documentation.
In particular, the MWE above (mutated for the new API to PackageCompiler) now works correctly without any JIT overhead.
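Under the new API the build is a single create_app call; a sketch, assuming TestBuildExecutable has been restructured as a package with its own Project.toml and an entry point in the form the new documentation describes:
using PackageCompiler
# precompile_execution_file plays the role of the old snoopfile argument
create_app("TestBuildExecutable", "TestBuildExecutableCompiled";
           precompile_execution_file = "SnoopFile.jl")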

time data doesn't match format specified

I am trying to convert a string column to type 'datetime' in Python. My data appear to match the format, but I still get
'ValueError: time data 11 11 doesn't match format specified'
I am not sure where the "11 11" in the error comes from.
My code is
train_df['date_captured1'] = pd.to_datetime(train_df['date_captured'], format="%Y-%m-%d %H:%M:%S")
The head of the data is:
print (train_df.date_captured.head())
0 2011-05-13 23:43:18
1 2012-03-17 03:48:44
2 2014-05-11 11:56:46
3 2013-10-06 02:00:00
4 2011-07-12 13:11:16
Name: date_captured, dtype: object
I tried the following by just selecting the first string and running the code with the same datetime format. They all work without problems.
dt=train_df['date_captured']
dt1=dt[0]
date = datetime.datetime.strptime(dt1, "%Y-%m-%d %H:%M:%S")
print(date)
2011-05-13 23:43:18
and
dt1=pd.to_datetime(dt1, format='%Y-%m-%d %H:%M:%S')
print (dt1)
2011-05-13 23:43:18
But why, when I use the same format in pd.to_datetime to convert all the data in the column, does it come up with the error above?
Thank you.
I solved it.
train_df['date_time'] = pd.to_datetime(train_df['date_captured'], errors='coerce')
print (train_df[train_df.date_time.isnull()])
I found that in row 100372 the date_captured value is '11 11':
category_id date_captured ... height date_time
100372 10 11 11 ... 747 NaT
So the code with errors='coerce' replaces any invalid parsing with NaT.
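From there the offending rows can be dropped or repaired; a minimal sketch, assuming rows with unparseable dates should simply be discarded:
# keep only the rows whose date parsed successfully
train_df = train_df[train_df.date_time.notnull()]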
Thank you.

How to get all the version of hbase row

I am trying to run the following command in HBase:
scan 'testLastVersion' {VERSIONS=>8}
And it returns only the latest version of the row.
Do you know how I can get all the versions of a row through the command shell and through Java code?
Thanks!
I think you are missing the ',' there. The command should be something like this:
scan 'emp', {VERSIONS=>8}
If you were missing the comma, HBase would throw an error:
SyntaxError: (hbase):16: syntax error, unexpected tLCURLY
I tried to simulate your scenario and got all the versions. Please find them below.
hbase(main):010:0> put 'emp', '1', 'personal_data:name', 'Ajay'
0 row(s) in 0.0220 seconds
hbase(main):012:0> put 'emp', '1', 'personal_data:name', 'Vijay'
0 row(s) in 0.0140 seconds
hbase(main):014:0> put 'emp', '1', 'personal_data:name', 'Ceema'
0 row(s) in 0.0070 seconds
hbase(main):017:0> scan 'emp', {VERSIONS=>3}
ROW COLUMN+CELL
1 column=personal_data:name, timestamp=1472651320449, value=Ceema
1 column=personal_data:name, timestamp=1472651313396, value=Vijay
1 column=personal_data:name, timestamp=1472651300718, value=Ajay
1 row(s) in 0.0220 seconds
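For the Java part of the question, here is a minimal sketch using the HBase client API, assuming the 'emp' table from the shell session above and a column family created with enough versions (setMaxVersions is the classic call; HBase 2.x renames it readVersions):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanAllVersions {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("emp"))) {
            Scan scan = new Scan();
            scan.setMaxVersions(8); // ask for up to 8 versions per cell
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    for (Cell cell : result.rawCells()) {
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell))
                                + " ts=" + cell.getTimestamp()
                                + " value=" + Bytes.toString(CellUtil.cloneValue(cell)));
                    }
                }
            }
        }
    }
}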

How do you track sql queries in Oracle 12c?

We are in the process of upgrading to Oracle 12c and I need to track the queries being executed by the application. In other words, if the application executes a query like select 'foobar' from dual; I would like to see the text "select 'foobar' from dual" in the output file.
If I follow the instructions here: https://docs.oracle.com/database/121/TGSQL/tgsql_trace.htm#TGSQL809 I get files that contain statistics like the following but not the actual sql queries.
WAIT #0: nam='rdbms ipc message' ela= 2999770 timeout=300 p2=0 p3=0 obj#=-1 tim=1103506389
WAIT #0: nam='rdbms ipc message' ela= 9854 timeout=1 p2=0 p3=0 obj#=-1 tim=1103522400
*** 2016-04-07 15:07:20.715
WAIT #0: nam='rdbms ipc message' ela= 2999585 timeout=300 p2=0 p3=0 obj#=-1 tim=1106522506
WAIT #0: nam='rdbms ipc message' ela= 9690 timeout=1 p2=0 p3=0 obj#=-1 tim=1106532500
If I look for the query like this I get 0 results: grep -rnw "foobar" --include=*.trc ./
One option is looking up the AWR repository, which keeps a few days' worth of SQL. There's plenty of additional information in these system views; the query below fetches strictly the text, but feel free to explore.
SELECT DISTINCT u.username, to_char(substr(h.sql_text, 1, 4000)) sqltxt
FROM dba_hist_sqltext h
JOIN dba_hist_active_sess_history a ON a.sql_id = h.sql_id
JOIN dba_users u ON u.user_id = a.user_id
WHERE username = 'SYS';
I filtered the results for SYS just as an example, but you can change it as you wish.
If you would like to see all the activity, the best thing to do is have EM (Enterprise Manager) set up for you.
If you don't, gv$active_session_history would be a good call. It's best viewed with grouping functions; simply selecting from it would be a mess, depending on the number of calls your application is pushing.
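For example, a sketch that counts the last hour of ASH samples by sql_id (the grouping column is chosen here purely for illustration):
SELECT sql_id, COUNT(*) AS samples
FROM gv$active_session_history
WHERE sample_time > SYSDATE - 1/24
GROUP BY sql_id
ORDER BY samples DESC;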
Another way: you could look at it in an averaged manner:
select s.parsing_schema_name,
inst_id,
sql_id,
plan_hash_value,
child_number,
round(nullif(s.ELAPSED_TIME, 0) / nullif(s.EXECUTIONS, 0) / 1000000, 4) elap_per_exec,
round(s.USER_IO_WAIT_TIME / nullif(s.ELAPSED_TIME, 0) * 100, 2) io_wait_pct,
round(s.CLUSTER_WAIT_TIME / nullif(s.ELAPSED_TIME, 0) * 100, 2) cluster_wait_pct,
round(s.application_wait_time / nullif(s.ELAPSED_TIME, 0) * 100, 2) app_wait_pct,
round(s.CPU_TIME / nullif(s.ELAPSED_TIME, 0) * 100, 2) cpu_time_pct,
round(s.PHYSICAL_READ_BYTES / nullif(s.EXECUTIONS, 0) / 1024 / 1024, 2) pio_per_exec_mb,
round(s.PHYSICAL_READ_BYTES / nullif(s.PHYSICAL_READ_REQUESTS, 0), 2) / 1024 read_per_request_kbytes,
round(s.buffer_gets / nullif(s.executions, 0), 4) BufferGets_per_Exec,
s.executions,
to_char(s.last_active_time,'dd/mm/yyyy hh24:mi:ss') last_act_time,
s.first_load_time,
s.sql_fulltext,
s.sql_profile,
s.sql_patch,
s.sql_plan_baseline
FROM gv$sql s
WHERE 1=1
and s.parsing_schema_name in ('LIST OF DATABASE USERS YOU WANT TO MONITOR')
order by s.last_active_time desc;
It would give a good perspective on how well you're doing based on your average thresholds.
