How can I sort child nodes by property value in Scala.js? - sorting

The problem is to sort all child divs of a root node according to their top CSS property.
Here is my code:
val elements = global.document.getElementById("root").childNodes.asInstanceOf[dom.NodeList]
val clones = (for (i <- (0 to (elements.length - 1)) if (elements(i).isInstanceOf[dom.raw.HTMLDivElement])) yield {
val clone = elements(i).cloneNode(true)
val style = clone.attributes.getNamedItem("style").value
val parts = style.split("top: ")
val parts2 = parts(1).split("px")
val px = parts2(0).toDouble
Tuple2[dom.Node, Double](clone, px)
}).toList
val sorted = clones.sortWith((a, b) => a._2 > b._2)
global.document.getElementById("root").innerHTML = ""
for (e <- sorted) {
global.document.getElementById("root").appendChild(e._1)
}
I'm new to Scala.js and it took quite an effort to come up with this solution. It compiles and seems to work, however I'm not sure how legitimate it is.
For example I can only get the top property of the node in a very complicated way. Also I suspect that for deleting all child nodes global.document.getElementById("root").innerHTML = "" is a backdoor way. I'm not sure if this sorting can be done in place without creating clones. I welcome any suggestions for improvement and I hope that some beginner out there may find even this code useful.

Various suggestions, some pertaining to Scala and some to the underlying browser environment:
First, jQuery (actual JavaScript library) (Scala.js facade) is your friend. Trying to do anything with the raw DOM is a pain in the ass, and I don't recommend it for anything but the simplest toy applications. (This has nothing to do with Scala.js, mind -- that's just the reality of working in the browser, and is all true of JavaScript as well.)
Using jQuery, getting the elements is just:
val elements = $("root").children
Second, essentially nobody loops using indexes in Scala like that -- it's legal, but extremely rare. Instead, you get each element directly in the for. And you can stick the value assignments right into the for itself, keeping the yield clause clean.
jQuery lets you get at CSS properties directly. (Although I think you still have to parse out the "px".) Again, everything is much harder if you try to use the raw DOM functions.
And it's very rare to spell out Tuple2 -- you just use parens for a tuple. Putting it all together, it would look something like this:
for {
element <- elements
if (element.isInstanceOf[dom.raw.HTMLDivElement])
clone = element.clone()
top = clone.css("top")
px = top.dropRight(2).toDouble
}
yield (clone, px)
Mind, I haven't actually tried out the above code -- there are probably some bugs -- but that's more like what idiomatic Scala.js + jQuery code would look like, and is worth considering as a starting point.

Related

SCAN command with spring redis template

I am trying to execute "scan" command with RedisConnection. I don't understand why the following code is throwing NoSuchElementException
RedisConnection redisConnection = redisTemplate.getConnectionFactory().getConnection();
Cursor c = redisConnection.scan(scanOptions);
while (c.hasNext()) {
c.next();
}
Exception:
java.util.NoSuchElementException at
java.util.Collections$EmptyIterator.next(Collections.java:4189) at
org.springframework.data.redis.core.ScanCursor.moveNext(ScanCursor.java:215)
at
org.springframework.data.redis.core.ScanCursor.next(ScanCursor.java:202)
Yes, I have tried this, in 1.6.6.RELEASE spring-data-redis.version. No issues, the below simple while loop code is enough. And i have set count value to 100 (more the value) to save round trip time.
RedisConnection redisConnection = null;
try {
redisConnection = redisTemplate.getConnectionFactory().getConnection();
ScanOptions options = ScanOptions.scanOptions().match(workQKey).count(100).build();
Cursor c = redisConnection.scan(options);
while (c.hasNext()) {
logger.info(new String((byte[]) c.next()));
}
} finally {
redisConnection.close(); //Ensure closing this connection.
}
I'm using spring-data-redis 1.6.0-RELEASE and Jedis 2.7.2; I do think that the ScanCursor implementation is slightly flawed w/rgds to handling this case on this version - I've not checked previous versions though.
So: rather complicated to explain, but in the ScanOptions object there is a "count" field that needs to be set (default is 10). This field, contains an "intent" or "expected" results for this search. As explained (not really clearly, IMHO) here, you may change the value of count at each invocation, especially if no result has been returned. I understand this as "a work intent" so if you do not get anything back, maybe your "key space" is vast and the SCAN command has not worked "hard enough". Obviously, as long as you're getting results back, you do not need to increase this.
A "simple-but-dangerous" approach would be to have a very large count (e.g 1 million or more). This will make REDIS go away trying to search your vast key space to find "at least or near as much" as your large count. Don't forget - REDIS is single-threaded so you just killed your performance. Try this with a REDIS of 12M keys and you'll see that although SCAN may happily return results with a very high count value, it will absolutely do nothing more during the time of that search.
To the solution to your problem:
ScanOptions options = ScanOptions.scanOptions().match(pattern).count(countValue).build();
boolean done = false;
// the while-loop below makes sure that we'll get a valid cursor -
// by looking harder if we don't get a result initially
while (!done) {
try(Cursor c = redisConnection.scan(scanOptions)) {
while (c.hasNext()) {
c.next();
}
done = true; //we've made it here, lets go away
} catch (NoSuchElementException nse) {
System.out.println("Going for "+countValue+" was not hard enough. Trying harder");
options = ScanOptions.scanOptions().match(pattern).count(countValue*2).build();
}
}
Do note that the ScanCursor implementation of Spring Data REDIS will properly follow the SCAN instructions and loop correctly, as much as needed, to get to the end of the loop as per documentation. I've not found a way to change the scan options within the same cursor - so there may be a risk that if you get half-way through your results and get a NoSuchElementException, you'll start again (and essentially do some of the work twice).
Of course, better solutions are always welcome :)
My old code
ScanOptions.scanOptions().match("*" + query + "*").count(10).build();
Working code
ScanOptions.scanOptions().match("*" + query + "*").count(Integer.MAX_VALUE).build();

Accessing deeply nested table without error?

For a field inside a deeply nested table, for example, text.title.1.font. Even if you use
if text.title.1.font then ... end
it would result in an error like "attempt to index global 'text' (a nil value)" if any level of the table does not actually exists. Of course one may tried to check for the existence of each level of the table, but it seems rather cumbersome. I am wondering is there a safe and prettier way to handle this, such that when referencing such an object, nil would be the value instead of triggering an error?
The way to do this that doesn't invite lots of bugs is to explicitly tell Lua which fields of which tables should be tables by default. You can do this with metatables. The following is an example, but it should really be customized according to how you want your tables to be structured.
-- This metatable is intended to catch bugs by keeping default tables empty.
local default_mt = {
__newindex =
function()
error(
'This is a default table. You have to make nested tables the old-fashioned way.')
end
}
local number_mt = {
__index =
function(self, key)
if type(key) == 'number' then
return setmetatable({}, default_mt)
end
end
}
local default_number_mt = {
__index = number_mt.__index,
__newindex = default_mt.__newindex
}
local title_mt = {__index = {title = setmetatable({}, default_number_mt)}}
local text = setmetatable({}, title_mt)
print(text.title[1].font)
Egor's suggestion debug.setmetatable(nil, {__index = function()end}) is the easiest to apply. Keep in mind that it's not lexically scoped, so, once it's on, it will be "on" until turned off, which may have unintended consequences in other parts of your code. See this thread for the discussion and some alternatives.
Also note that text.title.1.font should probably be text.title[1].font or text.title['1'].font (and these two are not the same).
Another, a bit more verbose, but still usable alternative is:
if (((text or {}).title or {})[1] or {}).font then ... end

Sorting CouchDB Views By Value

I'm testing out CouchDB to see how it could handle logging some search results. What I'd like to do is produce a view where I can produce the top queries from the results. At the moment I have something like this:
Example document portion
{
"query": "+dangerous +dogs",
"hits": "123"
}
Map function
(Not exactly what I need/want but it's good enough for testing)
function(doc) {
if (doc.query) {
var split = doc.query.split(" ");
for (var i in split) {
emit(split[i], 1);
}
}
}
Reduce Function
function (key, values, rereduce) {
return sum(values);
}
Now this will get me results in a format where a query term is the key and the count for that term on the right, which is great. But I'd like it ordered by the value, not the key. From the sounds of it, this is not yet possible with CouchDB.
So does anyone have any ideas of how I can get a view where I have an ordered version of the query terms & their related counts? I'm very new to CouchDB and I just can't think of how I'd write the functions needed.
It is true that there is no dead-simple answer. There are several patterns however.
http://wiki.apache.org/couchdb/View_Snippets#Retrieve_the_top_N_tags. I do not personally like this because they acknowledge that it is a brittle solution, and the code is not relaxing-looking.
Avi's answer, which is to sort in-memory in your application.
couchdb-lucene which it seems everybody finds themselves needing eventually!
What I like is what Chris said in Avi's quote. Relax. In CouchDB, databases are lightweight and excel at giving you a unique perspective of your data. These days, the buzz is all about filtered replication which is all about slicing out subsets of your data to put in a separate DB.
Anyway, the basics are simple. You take your .rows from the view output and you insert it into a separate DB which simply emits keyed on the count. An additional trick is to write a very simple _list function. Lists "render" the raw couch output into different formats. Your _list function should output
{ "docs":
[ {..view row1...},
{..view row2...},
{..etc...}
]
}
What that will do is format the view output exactly the way the _bulk_docs API requires it. Now you can pipe curl directly into another curl:
curl host:5984/db/_design/myapp/_list/bulkdocs_formatter/query_popularity \
| curl -X POST host:5984/popularity_sorter/_design/myapp/_view/by_count
In fact, if your list function can handle all the docs, you may just have it sort them itself and return them to the client sorted.
This came up on the CouchDB-user mailing list, and Chris Anderson, one of the primary developers, wrote:
This is a common request, but not supported directly by CouchDB's
views -- to do this you'll need to copy the group-reduce query to
another database, and build a view to sort by value.
This is a tradeoff we make in favor of dynamic range queries and
incremental indexes.
I needed to do this recently as well, and I ended up doing it in my app tier. This is easy to do in JavaScript:
db.view('mydesigndoc', 'myview', {'group':true}, function(err, data) {
if (err) throw new Error(JSON.stringify(err));
data.rows.sort(function(a, b) {
return a.value - b.value;
});
data.rows.reverse(); // optional, depending on your needs
// do something with the data…
});
This example runs in Node.js and uses node-couchdb, but it could easily be adapted to run in a browser or another JavaScript environment. And of course the concept is portable to any programming language/environment.
HTH!
This is an old question but I feel it still deserves a decent answer (I spent at least 20 minutes on searching for the correct answer...)
I disapprove of the other suggestions in the answers here and feel that they are unsatisfactory. Especially I don't like the suggestion to sort the rows in the applicative layer, as it doesn't scale well and doesn't deal with a case where you need to limit the result set in the DB.
The better approach that I came across is suggested in this thread and it posits that if you need to sort the values in the query you should add them into the key set and then query the key using a range - specifying a desired key and loosening the value range. For example if your key is composed of country, state and city:
emit([doc.address.country,doc.address.state, doc.address.city], doc);
Then you query just the country and get free sorting on the rest of the key components:
startkey=["US"]&endkey=["US",{}]
In case you also need to reverse the order - note that simple defining descending: true will not suffice. You actually need to reverse the start and end key order, i.e.:
startkey=["US",{}]&endkey=["US"]
See more reference at this great source.
I'm unsure about the 1 you have as your returned result, but I'm positive this should do the trick:
emit([doc.hits, split[i]], 1);
The rules of sorting are defined in the docs.
Based on Avi's answer, I came up with this Couchdb list function that worked for my needs, which is simply a report of most-popular events (key=event name, value=attendees).
ddoc.lists.eventPopularity = function(req, res) {
start({ headers : { "Content-type" : "text/plain" } });
var data = []
while(row = getRow()) {
data.push(row);
}
data.sort(function(a, b){
return a.value - b.value;
}).reverse();
for(i in data) {
send(data[i].value + ': ' + data[i].key + "\n");
}
}
For reference, here's the corresponding view function:
ddoc.views.eventPopularity = {
map : function(doc) {
if(doc.type == 'user') {
for(i in doc.events) {
emit(doc.events[i].event_name, 1);
}
}
},
reduce : '_count'
}
And the output of the list function (snipped):
165: Design-Driven Innovation: How Designers Facilitate the Dialog
165: Are Your Customers a Crowd or a Community?
164: Social Media Mythbusters
163: Don't Be Afraid Of Creativity! Anything Can Happen
159: Do Agencies Need to Think Like Software Companies?
158: Customer Experience: Future Trends & Insights
156: The Accidental Writer: Great Web Copy for Everyone
155: Why Everything is Amazing But Nobody is Happy
Every solution above will break couchdb performance I think. I am very new to this database. As I know couchdb views prepare results before it's being queried. It seems we need to prepare results manually. For example each search term will reside in database with hit counts. And when somebody searches, its search terms will be looked up and increments hit count. When we want to see search term popularity, it will emit (hitcount, searchterm) pair.
The Link Retrieve_the_top_N_tags seems to be broken, but I found another solution here.
Quoting the dev who wrote that solution:
rather than returning the results keyed by the tag in the map step, I would emit every occurrence of every tag instead. Then in the reduce step, I would calculate the aggregation values grouped by tag using a hash, transform it into an array, sort it, and choose the top 3.
As stated in the comments, the only problem would be in case of a long tail:
Problem is that you have to be careful with the number of tags you obtain; if the result is bigger than 500 bytes, you'll have couchdb complaining about it, since "reduce has to effectively reduce". 3 or 6 or even 20 tags shouldn't be a problem, though.
It worked perfectly for me, check the link to see the code !

How much information hiding is necessary when doing code refactoring?

How much information hiding is necessary? I have boilerplate code before I delete a record, it looks like this:
public override void OrderProcessing_Delete(Dictionary<string, object> pkColumns)
{
var c = Connect();
using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
{ Parameters = { {"_order_id", pkColumns["order_id"]} } } )
{
var count = (long)cmd.ExecuteScalar();
// deletion's boilerplate code...
if (count == 0) throw new RecordNotFoundException();
else if (count > 1) throw new DatabaseStructureChangedException();
// ...boiler plate code
}
// deleting of table(s) goes here...
}
NOTE: boilerplate code is code-generated, including the "using (var cmd = new NpgsqlCommand( ... )"
But I'm seriously thinking to refactor the boiler plate code, I wanted a more succint code. This is how I envision to refactor the code (made nicer with extension method (not the sole reason ;))
using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
{ Parameters = { {"_order_id", pkColumns["order_id"]} } } )
{
cmd.VerifyDeletion(); // [EDIT: was ExecuteWithVerification before]
}
I wanted the executescalar and the boilerplate code to goes inside the extension method.
For my code above, does it warrants code refactoring / information hiding? Is my refactored operation looks too opaque?
I would say that your refactor is extremely good, if your new single line of code replaces a handful of lines of code in many places in your program. Especially since the functionality is going to be the same in all of those places.
The programmer coming after you and looking at your code will simply look at the definition of the extension method to find out what it does, and now he knows that this code is defined in one place, so there is no possibility of it differing from place to place.
Try it if you must, but my feeling is it's not about succinctness but whether or not you want to enforce the behavior every time or most of the time. And by extension, if the verify-condition changes that it would likely change across the board.
Basically, reducing a small chunk of boiler-plate code doesn't necessarily make things more succinct; it's just one more bit of abstractness the developer has to wade through and understand.
As a developer, I'd have no idea what "ExecuteWithVerify" means. What exactly are we verifying? I'd have to look it up and remember it. But with the boiler-plate code, I can look at the code and understand exactly what's going on.
And by NOT reducing it to a separate method I can also tune the boiler-plate code for cases where exceptions need to be thrown for differing conditions.
It's not information-hiding when you extract or refactor your code. It's only information-hiding when you start restricting access to your extension definition after refactoring.
"new" operator within a Class (except for the Constructor) should be Avoided at all costs. This is what you need to refactor here.

Tree traversal without recursive / stack usage (C#)?

I'm working with WPF and I'm developing a complex usercontrol, which is composed of a tree with rich functionality etc.
For this purpose I used a View-Model design pattern, because some operations couldn't be achieved directly in WPF. So I take the IHierarchyItem (which is a node and pass it to this constructor to create a tree structure)
private IHierarchyItemViewModel(IHierarchyItem hierarchyItem, IHierarchyItemViewModel parent)
{
this.hierarchyItem = hierarchyItem;
this.parent = parent;
List<IHierarchyItemViewModel> l = new List<IHierarchyItemViewModel>();
foreach (IHierarchyItem item in hierarchyItem.Children)
{
l.Add(new IHierarchyItemViewModel(item, this));
}
children = new ReadOnlyCollection<IHierarchyItemViewModel>(l);
}
The problem is that this constructor takes about 3 seconds !! for 200 items on my dual-core.
Am I doing anythig wrong or recursive constructor call is that slow?
Thank you very much!
OK I found a non recursive version by myself, although it uses the Stack.
It traverses the whole tree:
Stack<MyItem> stack = new Stack<MyItem>();
stack.Push(root);
while (stack.Count > 0)
{
MyItem taken = stack.Pop();
foreach (MyItem child in taken.Children)
stack.Push(MyItem);
}
There should be nothing wrong with a recursive implementation of a tree, especially for such a small number of items. A recursive implementation is sometimes less space efficient, and slightly less time efficient, but the code clarity often makes up for it.
It would be useful for you to perform some simple profiling on your constructor. Using one of the suggestions from: http://en.csharp-online.net/Measure_execution_time you could indicate for yourself how long each piece is taking.
There is a possibility that one piece in particular is taking a long time. In any case, that might help you narrow down where you are really spending the time.

Resources