Trouble setting value in nested dictionary

I'm writing a TCL script to parse the HTML output of the Firebug profiling console. To begin with I just want to accumulate the number of method calls on a per-file basis. To do this I'm trying to use nested dictionaries. I seem to have gotten the first level right (where the file is the key and the inner dictionary of methods is the value) but not the second, nested level, where the method should be the key and the call count the value.
I have read about the dict update command, so I'm open to refactoring with it. My TCL usage is on-again, off-again, so thanks in advance for any assistance. Below are my code and some sample output.
foreach row $table_rows {
    regexp {<a class="objectLink objectLink-profile a11yFocus ">(.+?)</a>.+?class=" ">(.+?)\(line\s(\d+)} $row -> js_method js_file file_line
    if {![dict exists $method_calls $js_file]} {
        dict set method_calls $js_file [dict create]
    }
    set file_method_calls [dict get $method_calls $js_file]
    if {![dict exists $file_method_calls $js_method]} {
        dict set file_method_calls $js_method 0
        dict set method_calls $js_file $file_method_calls
    }
    set file_method_call_counts [dict get $file_method_calls $js_method]
    dict set $file_method_calls $js_method [expr 1 + $file_method_call_counts]
    dict set method_calls $js_file $file_method_calls
}

dict for {js_file file_method_calls} $method_calls {
    puts "file: $js_file"
    dict for {method_name call_count} $file_method_calls {
        puts "$method_name: $call_count"
    }
    puts ""
}
OUTPUT:
file: localhost:50267
(?): 0
e: 0
file: Defaults.js
toDictionary: 0
(?): 0
Renderer: 0
file: jquery.cookie.js
cookie: 0
decoded: 0
(?): 0

The dict set command, like any setter in Tcl, takes the name of a variable as its first argument. I bet that:
dict set $file_method_calls $js_method [expr 1 + $file_method_call_counts]
should really read:
dict set file_method_calls $js_method [expr {1 + $file_method_call_counts}]
(Also, brace your expressions for speed and safety.)
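Beyond that one-character fix, the inner-dictionary bookkeeping can be collapsed quite a bit. Since you mentioned dict update, here is a rough sketch of what the loop body could shrink to, assuming the same variable names as in your code (untested against the real Firebug HTML):

set method_calls [dict create]
foreach row $table_rows {
    # ... the regexp from your code, setting js_method, js_file and file_line ...
    dict update method_calls $js_file per_file {
        # per_file is unset here when this is the first hit for js_file
        if {![info exists per_file]} { set per_file [dict create] }
        dict incr per_file $js_method
    }
}

dict update unpacks the inner dictionary into the per_file variable, the body bumps the per-method counter with dict incr, and the updated inner dictionary is written back automatically when the body finishes.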

Related

How do I filter in a TCL script API

I need to filter the responses from getWater and getSoda. The problem is that when I query the API I get both result sets. For example, if I enter CLI:getWater, the response includes both water and soda records, and I need to distinguish between the two; if you look at the end of each record, it ends with 1 for water and 0 for soda. I'm trying to write a filter in a TCL file so that getWater only pulls out the records whose last field is 1, and vice versa.
cli% getWater {2 Fiji - {} 1 {} b873-367ef9944d48 **1**} {3 Coke - {} 1 {} 9d39-56ad9be6ee9f **0**} {6 Dasani - {} 1 {} 9d39-56ad9be6ee9f **1**} {9 Fanta - {} 1 {} 9d39-56ad9be6ee9f **0**}
I'm having a hard time coding it because I'm not familiar with TCL, but so far this is what I have to get the query:
proc API::get {args} {
    set argc [llength $args]
    if {$argc == 1} {
        # get all sets based on set type
        set objtype [lindex $args 0]
        catcher getset_int $objtype {} {}
I'm guessing that you have some command (that I'll call getListOfRecords for the sake of argument) and you want to filter the returned list by the value of the 8th element (index 7; TCL uses zero-based indexing) of each record? You can do that with either lmap+lindex or with lsearch (with the right options).
proc getRecordsOfType {typeCode} {
    lmap r [getListOfRecords] {
        if {[lindex $r 7] eq $typeCode} {set r} else continue
    }
}

proc getRecordsOfType {typeCode} {
    lsearch -all -inline -exact -index 7 [getListOfRecords] $typeCode
}
Using lsearch is probably faster, but the other approach is far more flexible. (Measure instead of guessing if it matters to you.)
getWater is just getRecordsOfType 1.
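To make that concrete, the CLI entry points could then be thin wrappers over the sketch above, and the built-in time command gives a rough way of measuring (getListOfRecords is still a hypothetical stand-in for your real query):

# Hypothetical wrappers; rename to match your real API entry points.
proc getWater {} { getRecordsOfType 1 }
proc getSoda  {} { getRecordsOfType 0 }

# Average microseconds per call over 1000 iterations, for each variant.
puts [time { getWater } 1000]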

Retrieving data from a particular column in a CSV File using TCL

I have a CSV file containing two columns of data.
I want to retrieve the two columns into two separate lists.
I have tried the following code:
set fp [open "D:\\RWTH\\Mini thesis\\EclipseTCL\\TCL trial\\excelv1.csv" r]
set file_data [read $fp]
close $fp
set data [split $file_data " "]
puts $data
The output obtained is:
{0,245
0.0025,249
0.005,250
0.0075,252
0.01,253
0.0125,255
0.015,256
.
.
.
}
The data is in two separate columns of the Excel sheet. I wish to take the elements only from the 2nd column, i.e.
{245,
249,
250,
252,
253,
.
.
.
}
I would be glad, if someone can help me with this.
Using the file_data you have already read from the file, you can:
lmap row [split [string trim $file_data] \n] {
    scan $row %*f,%d
}
That is, trim off white space before and after the data, split into rows, then from every row, scan one integer (skipping the real and the comma). All the scanned integers are collected in a list.
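As a quick sanity check on a single row (the literal is just one row of the sample data above):

# With no variable names given, scan returns the converted values directly:
# the float and the comma are consumed, only the trailing integer is kept.
set n [scan 0.0075,252 %*f,%d]    ;# n is now 252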
It is, however, a good idea to always use the right tool for the job.
package require csv
lmap row [split [string trim $file_data] \n] {
    lindex [::csv::split $row] end
}
The ::csv::split command knows exactly how to split csv data. In this case, it isn't really necessary, but it's a good habit to use the csv package for csv data.
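For example, with the csv package loaded as above, a single record splits like this:

# ::csv::split turns one CSV record into a proper Tcl list,
# so lindex ... end picks out the second column.
::csv::split 0.0075,252    ;# returns the list: 0.0075 252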
Documentation: csv (package), lindex, lmap (for Tcl 8.5), lmap, package, scan, split, string
You'd be better off using gets to read each line from the file:
set fp [open "D:\\RWTH\\Mini thesis\\EclipseTCL\\TCL trial\\excelv1.csv" r]
set secondColumnData {}
while {[gets $fp line] >= 0} {
    if {[llength $line] > 0} {
        lappend secondColumnData [lindex [split $line ","] 1]
    }
}
close $fp
puts $secondColumnData
Documentation: gets
You may also try:
set data [read [open "D:\\RWTH\\Mini thesis\\EclipseTCL\\TCL trial\\excelv1.csv"]]
set result [regexp -all -inline -line -- {^.*,(.*)$} $data]
set items {}
foreach {tmp item} $result {
    lappend items $item
}
puts $items
Output:
245 249 250 252 253 255 256

Redis Sorted Set: Bulk ZSCORE

How to get a list of members based on their ID from a sorted set instead of just one member?
I would like to build a subset with a set of IDs from the actual sorted set.
I am using a Ruby client for Redis and do not want to iterate one by one, because there could be more than 3000 members that I want to look up. Here is the issue tracker for the proposed ZMSCORE command for doing a bulk ZSCORE.
There is no variadic form for ZSCORE, yet - see the discussion at: https://github.com/antirez/redis/issues/2344
That said, and for the time being, what you could do is use a Lua script for that. For example:
local scores = {}
while #ARGV > 0 do
    scores[#scores+1] = redis.call('ZSCORE', KEYS[1], table.remove(ARGV, 1))
end
return scores
Running this from the command line would look like:
$ redis-cli ZADD foo 1 a 2 b 3 c 4 d
(integer) 4
$ redis-cli --eval mzscore.lua foo , b d
1) "2"
2) "4"
EDIT: In Ruby, it would probably be something like the following, although you'd be better off using SCRIPT LOAD and EVALSHA and loading the script from an external file (instead of hardcoding it in the app):
require 'redis'
script = <<LUA
local scores = {}
while #ARGV > 0 do
scores[#scores+1] = redis.call('ZSCORE', KEYS[1], table.remove(ARGV, 1))
end
return scores
LUA
redis = ::Redis.new()
reply = redis.eval(script, ["foo"], ["b", "d"])
Lua script to get scores with member IDs:
local scores = {}
while #ARGV > 0 do
    local member_id = table.remove(ARGV, 1)
    local member_score = {}
    member_score[1] = member_id
    member_score[2] = redis.call('ZSCORE', KEYS[1], member_id)
    scores[#scores + 1] = member_score
end
return scores

Parse list of integers (optimization needed for speed test)

I am performing a tiny speed test in order to compare the speed of the Agda programming language with the Tcl scripting language. It's for scientific work and this is just a pre-test, not a real test. I am not in any way trying to perform a realistic speed comparison!
I have come up with a small example in which Agda is 10x faster than Tcl. There are special reasons I use this example. My main concern is that my Tcl code is badly programmed and that this is the sole reason Tcl is slower than Agda in this example.
The goal of the code is to parse a line that represents a list of integers and check if it is indeed a list of integers.
Example "(1,2,3)" would be a valid list.
Example "(1,a,3)" would not be a valid list.
My input is a file and I check every third line of it. If any line is not a list of integers, the program prints "false".
My input file:
(613424,505980,317647,870930,75580,897160,716297,668539,689646,196362,533020)
(727375,472272,22435,869407,320468,80779,302881,240382,196077,635360,568517)
(613424,505980,317647,870930,75580,897160,716297,668539,689646,196362,533020)
(However, my real test file is about 3 megabytes in size.)
My current Tcl code to solve this problem is:
package require Tcl 8.6

proc checkListNat {str} {
    set list [split [string map {"(" "" ")" ""} $str] ","]
    foreach l $list {
        if {[string is integer $l] == 0} {
            return 0
        }
    }
    return 1
}

set i 1
set fp [open "/tmp/test.txt" r]
while { [gets $fp data] >= 0 } {
    incr i
    if { [expr $i % 3] == 0} {
        if { [checkListNat $data] == 0 } {
            puts "error"
        }
    }
}
close $fp
How can I optimize my current Tcl code, so that the speed test between Agda and Tcl is more realistic?
The first thing to do is to put as much code in procedures (or lambda terms) as possible and to ensure that all expressions are braced; those were the two key problems that were killing performance. We'll do a few other things too: you hardly ever need expr inside an if test (and this wasn't one of those cases), string trim is more suitable than string map here, and string is integer really ought to be used with -strict. With those changes, I get this version, which is relatively similar to what you already had yet ought to be substantially more performant.
package require Tcl 8.6

proc checkListNat {str} {
    foreach l [split [string trim $str "()"] ","] {
        if {[string is integer -strict $l] == 0} {
            return 0
        }
    }
    return 1
}

apply {{} {
    set i 1
    set fp [open "/tmp/test.txt" r]
    while { [gets $fp data] >= 0 } {
        if {[incr i] % 3 == 0 && ![checkListNat $data]} {
            puts "error"
        }
    }
    close $fp
}} {*}$argv
You might get better performance by adding fconfigure $fp -encoding iso8859-1; you'll have to test that yourself. But the key changes are the two mentioned at the start (moving the code into a procedure/lambda and bracing the expressions), as each substantially impacts the efficiency of the compilation strategy used. (Also, Tcl 8.5 is a little faster than 8.6, whose radically different execution engine is a bit slower for some things, so you might test the new code with 8.5 too; the code itself appears to be valid with both versions.)
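For reference, the encoding hint mentioned above would slot in right after the open; whether it actually helps is something to benchmark on the real 3 MB file:

set fp [open "/tmp/test.txt" r]
# Treat the input as plain 8-bit text; this can speed up reading ASCII data,
# but measure it before keeping it.
fconfigure $fp -encoding iso8859-1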
Try checking with regexp {^[0-9,]+$} $line instead of the checkListNat function.
Update: here is an example.
echo "87,566, 45,67\n56,5r5,45" >! try
...
while {[gets $fp line] >= 0} {
    # digits, commas and spaces are allowed here, to match the sample data above
    if {[regexp {^[0-9, ]+$} $line] > 0} {
        puts "OK $line"
    } else {
        puts "BAD $line"
    }
}
gives:
>OK 87,566, 45,67
>BAD 56,5r5,45
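Note that this sample data has no parentheses, unlike the question's input. For the real input format, a tighter pattern along these lines would be closer to what checkListNat accepts (an untested sketch):

# Accept only lines of the form (int,int,...,int); anything else is an error.
if {![regexp {^\((\d+,)*\d+\)$} $data]} {
    puts "error"
}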

Chaining commands inline in Ruby, for a TCL programmer

I am a TCL programmer. I do a lot of statement chaining and don't know how that can be done in Ruby.
For example, suppose I want to append the current time to the value of a variable.
In Tcl:
set mylist [list a b c,d,e f]
set myelem_with_time "[lindex [split [lindex $mylist 2] ,] 0][clock seconds]"
>>c{with some time value}
How can this be achieved in Ruby without using separate lines for each command? (Of course, when it's not an object or class method I can't just use the . operator; I mean things like chaining in the current time, or some arithmetic operation, etc.)
Pseudocode:
myval = mylist[2].split(",")[0] + time()+60seconds;
(I want to interpolate the time + 60 without calculating it on a previous line.)
mylist = %w[a b c,d,e f]
myelem_with_time = mylist[2].split(',')[0] + (Time.now + 60).to_i.to_s
# or
myelem_with_time = "%s%d" % [mylist[2].split(',')[0], (Time.now + 60).to_i]
# or
myelem_with_time = "#{mylist[2].split(',')[0]}#{(Time.now + 60).to_i}"
Using your list from above and playing with your command:
mylist[2].split(",")[0] + (Time.now + 60).to_s
I got:
e f2012-02-28 04:46:55 -0700
Is that what you're looking for? (I did not strip the date from the output, but that is possible.)

Resources