How to determine the execution time of a Lua script in Redis?

I have a small Lua script to run in Redis, and I'm interested in getting the execution time.
Due to the nature of Redis and its Lua implementation, I cannot call TIME at the start and return points of the script and include that information in the return value for processing (see http://redis.io/commands/eval - Scripts as pure functions). Doing so results in an error: (error) ERR Error running script (call to f_a49ed2fea72f1f529843d6024d1515e76e69bcbd): Write commands not allowed after non deterministic commands
I have searched around for a function/call I could make which will return the execution time of the last run script, but have not found anything yet.
I am using PHP and the Predis library. While I can measure the execution time from the PHP side, I want to remove the transmission overhead and find out how long the Lua script actually blocks the database. I have successfully had times returned when the script does not alter any of the data stored in Redis, and these have been about a tenth of the time PHP reports.
How can I determine the execution time of the Lua script in Redis, and not via PHP?

You can activate the Redis slow log feature by setting the slowlog-log-slower-than parameter to 0. It will then record the execution time of ALL commands (including Lua scripts), whatever their execution time.
The slow log is kept in an in-memory queue you have to dump on a regular basis to collect data. Depending on the volume of traffic, you may have to increase slowlog-max-len to be sure to catch the execution times you are interested in.
You can use the SLOWLOG GET command to dump the slow log. It is up to you to filter out the results you don't need; AFAIK, there is no way to filter at data collection time (to keep only Lua statistics).
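For example, from redis-cli (the threshold, queue length, key and value here are just illustrative):

CONFIG SET slowlog-log-slower-than 0
CONFIG SET slowlog-max-len 1024
SLOWLOG RESET
EVAL "return redis.call('SET', KEYS[1], ARGV[1])" 1 somekey somevalue
SLOWLOG GET 10

Each entry returned by SLOWLOG GET includes the execution time in microseconds, and EVAL/EVALSHA calls show up like any other command.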

You can use os.clock() inside the script if you enable the os module when building Redis.
Change #if 0 to #if 1 on line 484 in https://github.com/antirez/redis/blob/unstable/src/scripting.c:
void luaLoadLibraries(lua_State *lua) {
    luaLoadLib(lua, "", luaopen_base);
    luaLoadLib(lua, LUA_TABLIBNAME, luaopen_table);
    luaLoadLib(lua, LUA_STRLIBNAME, luaopen_string);
    luaLoadLib(lua, LUA_MATHLIBNAME, luaopen_math);
    luaLoadLib(lua, LUA_DBLIBNAME, luaopen_debug);
    luaLoadLib(lua, "cjson", luaopen_cjson);
    luaLoadLib(lua, "struct", luaopen_struct);
    luaLoadLib(lua, "cmsgpack", luaopen_cmsgpack);
#if 0 /* Stuff that we don't load currently, for sandboxing concerns. */
    luaLoadLib(lua, LUA_LOADLIBNAME, luaopen_package);
    luaLoadLib(lua, LUA_OSLIBNAME, luaopen_os);
#endif
}
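Once the server has been rebuilt with the os library enabled, a minimal sketch of timing part of a script could look like this (the ZADD call is just a placeholder for the work being measured):

local start = os.clock()
redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])  -- the work you want to time
local elapsed = os.clock() - start
return tostring(elapsed)  -- return as a string; Redis truncates Lua numbers to integers

Note that os.clock() measures CPU time in seconds, not wall-clock time.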

You can use rld (https://github.com/RedisLabs/redis-lua-debugger) and have it print to the log - the log has timing information (although rld will naturally cause your script to run slower).
EDIT: rld is now obsolete.

As of Redis v3.2, you can choose to replicate commands instead of scripts with a call to the new redis.replicate_commands() API. This will allow you, among other things, to call TIME in scripts that perform write operations without triggering the non-determinism error.
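A minimal sketch (key and member names are illustrative):

redis.replicate_commands()
local t = redis.call('TIME')  -- t[1] = seconds, t[2] = microseconds
redis.call('ZADD', KEYS[1], t[1], ARGV[1])  -- a write after TIME is now allowed
return t

Because the effects (the ZADD) are replicated instead of the script itself, the non-deterministic TIME call no longer matters for replication.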

Related

Confusion about Ruby's IO#(read/write)_nonblock calls

I am currently doing the Ruby on the Web project for The Odin Project. The goal is to implement a very basic webserver that parses and responds to GET or POST requests.
My solution uses IO#gets and IO#read(maxlen) together with the Content-Length Header attribute to do the parsing.
Other solutions use IO#read_nonblock. I googled it, but was quite confused by its documentation. It is often mentioned together with Kernel#select, which didn't really help either.
Can someone explain to me what the nonblock calls do differently than the normal ones, how they avoid blocking the thread of execution, and how they play together with the Kernel#select method?
explain to me what the nonblock calls do differently than the normal ones
The crucial difference in behavior arises when there is no data available to read at call time, but the stream is not yet at EOF:
read_nonblock() raises an exception of kind IO::WaitReadable
a normal read(length) waits until length bytes have been read (or EOF is reached)
how they avoid blocking the thread of execution
According to the documentation, #read_nonblock uses the read(2) system call after setting O_NONBLOCK on the underlying file descriptor.
how they play together with the Kernel#select method?
There's also IO.select. We can use it in this case to wait for input data to become available, so that a subsequent read_nonblock() won't raise an error. This is especially useful when there are multiple input streams and it is not known which stream data will arrive on next, i.e. which stream read() would have to be called on.
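A rough sketch of that pattern (the socket variable and the buffer size are placeholders):

readable, _, _ = IO.select([socket], nil, nil, 5)  # wait up to 5 seconds for data
if readable
  begin
    data = socket.read_nonblock(4096)  # returns at once with whatever is available
    puts data
  rescue IO::WaitReadable
    # data vanished between select and read; go back to select
  rescue EOFError
    # the peer closed the connection
  end
end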
With a blocking write you wait until the bytes have been written to the file; a nonblocking write, on the other hand, returns immediately. This means you can continue executing your program while the operating system asynchronously writes the data to the file. Then, when you want to write again, you use select to see whether the file is ready to accept the next write.

Loading Data into the application from GUI using Ruby

Problem:
Hi everyone, I am currently building an automation suite using Ruby, Selenium WebDriver and Cucumber to load data into the application through its GUI. I take input from mainframe .txt files. The scenarios are, for example, to create a customer and then load multiple accounts for them as per the data provided in the input files.
Current Approach
I execute the scenario using a rake task, passing the line number as a parameter, so the script runs for only one set of data.
To read the data for a particular line, I'm using the code below:
File.readlines("#{file_path}")[line_number.to_i - 1]
The purpose of loading line by line is to keep execution running even if one line fails to load.
Shortcomings
Suppose I have to load 10 accounts for a single customer. My current script has to run 10 times, once for each account. I want something that can load all the accounts in a single go.
What I am looking for
To overcome the above shortcoming, I want to capture all the data for a single customer from the file (accounts, etc.) and load it into the application in a single execution.
I also have to keep track of execution time and memory allocation.
Please share your thoughts on this approach; any suggestions or improvements are welcome. (Sorry for the long post.)
The first thing I'd do is break this down into steps -- as you said in your comment, but more formally here:
Get the data that applies to all records. Put up a page with the necessary information (or support command-line specification if not too much work?).
For each line in the file, do the following (automated):
Get the web page for inputting its data;
Fill in the fields;
Submit the form
Given this, I'd say the 'for each line' instruction should definitely be reading a line at a time from the file using File.foreach or similar.
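A minimal sketch of that loop, assuming one record per line and a hypothetical load_record helper that performs the fill-in/submit steps above:

File.foreach(file_path).with_index(1) do |line, line_number|
  begin
    load_record(line.chomp)  # fill in the fields and submit the form
  rescue StandardError => e
    warn "line #{line_number} failed: #{e.message}"  # log the failure and keep going
  end
end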
Is there anything beyond this that needs to be taken into account?

Reading file in parallel from multiple processes

I'm running multiple processes in parallel and each of these processes read the same file in parallel. It looks like some of the processes see a corrupted version of the file if I increase the number of processes to > 15 or so. What is the recommended way of handling such a scenario?
More details:
The file being read in parallel is actually a Perl script. The multiple jobs are Python processes, and each of them launches this Perl script independently with different input parameters. When the number of jobs is increased, some of them report errors that the Perl script has invalid syntax (which is not true). Hence, I suspect that some of these jobs read corrupted versions of the Perl script.
I'm running all of this on a 32-core machine.
If any process is also writing to the file, then you need to enforce some synchronization, for example with a global named mutex.
If there is no asynchronous writing going on, I would not expect to see corruption during the reads. Are you opening the files with "r" access? If you're still encountering trouble, it might be worth experimenting with reducing the read buffer size, or calling out to a native Win32 API for the file access.
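If it does turn out that something writes the file while your jobs read it, a rough sketch of the global named mutex idea on Windows, via ctypes, might look like this (the mutex name and file name are illustrative, and any writer would have to take the same mutex):

import ctypes

kernel32 = ctypes.windll.kernel32
INFINITE = 0xFFFFFFFF

# every process touching the shared file acquires the same global named mutex first
mutex = kernel32.CreateMutexW(None, False, "Global\\shared_script_lock")
kernel32.WaitForSingleObject(mutex, INFINITE)
try:
    with open("script.pl", "r") as f:
        contents = f.read()
finally:
    kernel32.ReleaseMutex(mutex)
    kernel32.CloseHandle(mutex)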
Good luck!

redis: EVAL and the TIME

I like the Lua scripting for Redis, but I have a big problem with TIME.
I store events in a SortedSet.
The score is the time, so that in my application I can view all events in a given time window.
redis.call('zadd', myEventsSet, TIME, EventID);
OK, but this does not work - I cannot access TIME (the server time).
Is there any way to get the time from the server without passing it as an argument to my Lua script? Or is passing the time as an argument the best way to do it?
This is explicitly forbidden (as far as I remember). The reasoning behind this is that your Lua functions must be deterministic and depend only on their arguments. What if this Lua call gets replicated to a slave with a different system time?
Edit (by Linus G Thiel): This is correct. From the Redis EVAL docs:
Scripts as pure functions
A very important part of scripting is writing scripts that are pure functions. Scripts executed in a Redis instance are replicated on slaves by sending the script -- not the resulting commands.
[...]
In order to enforce this behavior in scripts Redis does the following:
Lua does not export commands to access the system time or other external state.
Redis will block the script with an error if a script calls a Redis command able to alter the data set after a Redis random command like RANDOMKEY, SRANDMEMBER, TIME. This means that if a script is read-only and does not modify the data set it is free to call those commands. Note that a random command does not necessarily mean a command that uses random numbers: any non-deterministic command is considered a random command (the best example in this regard is the TIME command).
There is a wealth of information on why this is, how to deal with this in different scenarios, and what Lua libraries are available to scripts. I recommend you read the whole documentation!
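For the original question, passing the time in as an argument is the usual workaround; a minimal sketch, with the client supplying the score and member (names taken from the question):

-- called as: EVAL <script> 1 myEventsSet <timestamp> <EventID>
redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])
return 1

(As of Redis 3.2 there is also redis.replicate_commands(), mentioned earlier in this page, which lifts the restriction on calling TIME before a write.)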

Count number of executions of batch-script

This is my problem: I've got a batch script that I can't modify (let's call it foo), and I would like to count how many times per day this script is executed, to keep track of that data.
Preferably, I would like to write the number of executions with the date and exit code to some kind of log file.
So my question is whether this is possible and, if so, how - by creating a batch script or something that works in the background and writes every execution of foo to a log.
(I know this would be easy if I could modify foo, but I can't. Also, everything is running on WinXP machines.)
You could write a wrapper script that does the logging and calls the existing script, then use the wrapper in place of the original script.
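A minimal sketch of such a wrapper (assuming the real script is foo.bat; the log location is up to you):

@echo off
rem Run the real script, then append date, time and exit code to a log.
call foo.bat %*
set RC=%ERRORLEVEL%
>> "%~dp0foo_executions.log" echo %DATE% %TIME% exit code %RC%
exit /b %RC%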
Consider writing a program that interrogates the Task Manager.
See http://www.netomatix.com/ProcDiagnostics.aspx
You could, for example, write a simple console app which runs on a timer; every 5 seconds it checks whether your foo application process exists. If it finds that it does, it takes that as the start time of the application; if it doesn't find it, it assumes the application has now closed and logs that information. It wouldn't be accurate to the second by any means, but it would give you a rough approximation of when the thing is running and closing.
You might be able to configure Process Monitor (http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx) to capture the information you require.
