dot (graphviz): What is "type=s"?

I see this in examples all the time (like the ethane molecule one) and it is never explained.
What is [type=s]? What are the different types?

I see this in examples all the time (like the ethane molecule one) ...
I assume you're referring to the ethane molecule example on Wikipedia's DOT language page:
graph ethane {
C_0 -- H_0 [type=s];
C_0 -- H_1 [type=s];
C_0 -- H_2 [type=s];
C_0 -- C_1 [type=s];
C_1 -- H_3 [type=s];
C_1 -- H_4 [type=s];
C_1 -- H_5 [type=s];
}
A few things of interest:
- The use of the type attribute on the Wikipedia page dates back to 2004; it appears in the first version of the page, which is almost identical to the version there today.
- No type attribute is listed in the current Graphviz documentation.
- I installed Graphviz 1.14, and its DOT documentation (circa 2002) does not list type as an attribute.
- I found DOT documentation for Graphviz 1.7, dated 1996. It doesn't list a type attribute either.
- Removing the type attributes from the graph, or changing their values, does not affect the output for either the current version of Graphviz or version 1.14.
Various attributes have changed over time, and it's possible that type is an old name for something like tailPort, which accepts a portPos value indicating which side of a node the edge attaches to ("s" for south, "n" for north, etc.). Maybe it was used in a version that I don't have the documentation for. Or maybe it was never used at all, and people just faithfully copied it from Wikipedia. ;)
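If type really was an early spelling of today's port attributes, the modern equivalent would look something like this sketch using the Python graphviz package (the package and the tailport/headport attributes are real; the mapping from type to them is only my speculation above):

import graphviz

g = graphviz.Graph("ports")
# attach the edge to the south side of C_0 and the north side of H_0
g.edge("C_0", "H_0", tailport="s", headport="n")
print(g.source)  # renders the edge as: C_0 -- H_0 [tailport=s headport=n]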

Why does accessing coefficients following estimation with nl require slightly different syntax than for other estimation commands?

Following most estimation commands in Stata (e.g. reg, logit, probit, etc.) one may access the estimates using the _b[ParameterName] syntax (or the synonymous _coef[ParameterName]). For example:
regress y x
followed by
di _b[x]
will display the estimate of the coefficient of x. di _b[_cons] will display the coefficient of the estimated intercept (assuming the regress command was successful), etc.
But if I use the nonlinear least squares command nl, I (seemingly) have to do something slightly different. Now (leaving aside that for this example model there is absolutely no need to use an NLLS regression):
nl (y = {_cons} + {x}*x)
followed by (notice the forward slash)
di _b[/x]
will display the estimate of the coefficient of x.
Why does accessing parameter estimates following nl require a different syntax? Are there subtleties to be aware of?
"leaving aside that for this example model there is absolutely no need to use a NLLS regression": I think that's what you can't do here....
The question is about why the syntax is as it is. That's a matter of logic and a matter of history. Why a particular syntax was chosen is ultimately a question for the programmers at StataCorp who chose it. Here is one limited take on your question.
The main syntax for regression-type models grows out of a syntax designed for linear regression models in which by default the parameters include an intercept, as you know.
The original syntax for nonlinear regression models (in the sense of being estimated by nonlinear least-squares) matches a need to estimate a bundle of parameters specified by the user, which need not include an intercept at all.
Otherwise put, there is no question of an intercept being a natural default; no parameterisation is a natural default and each model estimated by nl is sui generis.
A helpful feature is that users can choose the names they find natural for the parameters, within the constraints of what counts as a legal name in Stata, say alpha, beta, gamma, a, b, c, etc. If you choose _cons for the intercept in nl that is a legal name but otherwise not special and just your choice; nl won't take it as a signal that it should flip into using regress conventions.
The syntax you cite is part of what was made possible by a major redesign of nl but it is consistent with the original philosophy.
That the syntax is different because it needs to be may not be the answer you seek, but I guess you'll get a fuller answer only from StataCorp; developers do hang out on Statalist, but they don't make themselves visible here.

Adaptive Updates In Vowpal Wabbit Formula

I am looking at the following 2 presentations about the updates done by VW when the --adaptive flag is used. It seems these are different.
http://www.slideshare.net/jakehofman/technical-tricks-of-vowpal-wabbit
https://github.com/JohnLangford/vowpal_wabbit/wiki/v6.1_tutorial.pdf
With these two descriptions (respectively):
#1: [update formula image from the first presentation]
#2: [update formula image from the second presentation]
My questions:
Which of these are correct (or are they the same)?
For number 1 it appears that the gradient from the t+1 example is used in the denominator. How is this done? Does this mean that the new weight (labeled w_i) is the weight for example t+1?
As you noticed, the first presentation contains an error/typo in the AdaGrad formula. The formula should be w_{i,t+1} := w_{i,t} - \eta \, g_{i,t} / \sqrt{\sum_{t'=1}^{t} g_{i,t'}^2}.
In Vowpal Wabbit, --adaptive (corresponding to the AdaGrad idea) is on by default. But --normalized and --invariant are also on by default, which means that on top of plain AdaGrad a few more tricks/improvements are applied. The interaction of all these tricks is complex and there is no single slide which describes all the aspects, so the only reference is the source code (gd.cc).
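For intuition, here is a minimal numpy sketch of the plain --adaptive (AdaGrad) step from the corrected formula above; the variable names and the eps guard are mine, and the --normalized and --invariant adjustments are deliberately left out:

import numpy as np

def adagrad_step(w, g_sq_sum, g, eta=0.5, eps=1e-8):
    # accumulate the squared gradient first: sum_{t'=1}^{t} g_{i,t'}^2
    g_sq_sum += g ** 2
    # then scale each weight's step by 1/sqrt of its own accumulated sum
    w -= eta * g / np.sqrt(g_sq_sum + eps)
    return w, g_sq_sum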
Which of these are correct (or are they the same)?
I think they are not the same, but they are different "layers" of the complex code. I think that slide 33 (which you cite as #2) of the second presentation corresponds to slide 31 (which you don't cite) of the first presentation, but I am not sure.

Is there a diff-like algorithm that handles moving block of lines?

The diff program, in its various incarnations, is reasonably good at computing the difference between two text files and expressing it more compactly than showing both files in their entirety. It shows the difference as a sequence of inserted and deleted chunks of lines (or changed lines in some cases, but that's equivalent to a deletion followed by an insertion). The same or very similar program or algorithm is used by patch and by source control systems to minimize the storage required to represent the differences between two versions of the same file. The algorithm is discussed here and here.
But it falls down when blocks of text are moved within the file.
Suppose you have the following two files, a.txt and b.txt (imagine that they're both hundreds of lines long rather than just 6):
a.txt b.txt
----- -----
1 4
2 5
3 6
4 1
5 2
6 3
diff a.txt b.txt shows this:
$ diff a.txt b.txt
1,3d0
< 1
< 2
< 3
6a4,6
> 1
> 2
> 3
The change from a.txt to b.txt can be expressed as "Take the first three lines and move them to the end", but diff shows the complete contents of the moved chunk of lines twice, missing an opportunity to describe this large change very briefly.
Note that diff -e shows the block of text only once, but that's because it doesn't show the contents of deleted lines.
Is there a variant of the diff algorithm that (a) retains diff's ability to represent insertions and deletions, and (b) efficiently represents moved blocks of text without having to show their entire contents?
Since you asked for an algorithm and not an application, take a look at "The String-to-String Correction Problem with Block Moves" by Walter Tichy. There are others, but that's the original, so you can look for papers that cite it to find more.
The paper cites Paul Heckel's paper "A technique for isolating differences between files" (mentioned in this answer to this question) and mentions this about its algorithm:
Heckel[3] pointed out similar problems with LCS techniques and proposed a
linear-time algorithm to detect block moves. The algorithm performs adequately
if there are few duplicate symbols in the strings. However, the algorithm gives
poor results otherwise. For example, given the two strings aabb and bbaa,
Heckel's algorithm fails to discover any common substring.
The following method is able to detect block moves:
Paul Heckel: A technique for isolating differences between files
Communications of the ACM 21(4):264 (1978)
http://doi.acm.org/10.1145/359460.359467 (access restricted)
Mirror: http://documents.scribd.com/docs/10ro9oowpo1h81pgh1as.pdf (open access)
wikEd diff is a free JavaScript diff library that implements this algorithm and improves on it. It also includes the code to compile a text output with insertions, deletions, moved blocks, and original block positions inserted into the new text version. Please see the project page or the extensively commented code for details. For testing, you can also use the online demo.
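To give a feel for how the unique-line matching works, here is a condensed Python paraphrase of Heckel's passes (my paraphrase, not wikEd diff's code; the paper itself uses six passes over a shared symbol table):

from collections import Counter

def heckel_match(old, new):
    oc, nc = Counter(old), Counter(new)
    old_pos = {line: i for i, line in enumerate(old)}
    na = [None] * len(new)   # na[i] = matching line index in old, or None
    oa = [None] * len(old)   # oa[j] = matching line index in new, or None
    # pass 1: lines occurring exactly once in both files are unambiguous anchors
    for i, line in enumerate(new):
        if nc[line] == 1 and oc[line] == 1:
            j = old_pos[line]
            na[i], oa[j] = j, i
    # pass 2: extend each anchor downward over identical neighbouring lines
    for i in range(len(new) - 1):
        j = na[i]
        if j is not None and j + 1 < len(old) and na[i + 1] is None \
                and oa[j + 1] is None and new[i + 1] == old[j + 1]:
            na[i + 1], oa[j + 1] = j + 1, i + 1
    # pass 3: the same, extending upward
    for i in range(len(new) - 1, 0, -1):
        j = na[i]
        if j is not None and j - 1 >= 0 and na[i - 1] is None \
                and oa[j - 1] is None and new[i - 1] == old[j - 1]:
            na[i - 1], oa[j - 1] = j - 1, i - 1
    return na, oa

na, oa = heckel_match(list("123456"), list("456123"))
# na == [3, 4, 5, 0, 1, 2]: unmatched lines would be insertions/deletions,
# and matched runs whose relative order changed (here "123") are moved blocks.

This also shows why the aabb/bbaa case in the quote above fails: no line is unique, so pass 1 finds no anchors to extend.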
Git 2.16 (Q1 2018) will introduce another possibility: anchoring the diff on specified lines so that they are not reported as deletions or additions.
"git diff" learned a variant of the "--patience" algorithm, to which the user can specify which 'unique' line to be used as anchoring points.
See commit 2477ab2 (27 Nov 2017) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit d7c6c23, 19 Dec 2017)
diff: support anchoring line(s)
Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result.
The end user can use this by specifying "--anchored=<text>" one or more
times when using Git commands like "diff" and "show".
The documentation for git diff now reads:
--anchored=<text>:
Generate a diff using the "anchored diff" algorithm.
This option may be specified more than once.
If a line exists in both the source and destination, exists only once, and starts with this text, this algorithm attempts to prevent it from appearing as a deletion or addition in the output.
It uses the "patience diff" algorithm internally.
See the tests for some examples:
pre  post
---  ----
a    c
b    a
c    b
Normally, c is moved to produce the smallest diff. But with:
git diff --no-index --anchored=c pre post
the diff keeps c in place and reports a and b as the deleted/added lines instead.
With Git 2.33 (Q3 2021), the command line completion (in contrib/) learned that "git diff"(man) takes the --anchored option.
See commit d1e7c2c (30 May 2021) by Thomas Braun (t-b).
(Merged by Junio C Hamano -- gitster -- in commit 3a7d26b, 08 Jul 2021)
completion: add --anchored to diff's options
Signed-off-by: Thomas Braun
This flag was introduced in 2477ab2 ("diff: support anchoring line(s)", 2017-11-27, Git v2.16.0-rc0 -- merge listed in batch #10) but back then, the bash completion script did not learn about the new flag.
Add it.
Here's a sketch of something that may work. Ignore diff insertions/deletions for the moment for the sake of clarity.
This seems to consist of figuring out the best blocking, similar to text compression. We want to find the common substrings of the two files. One option is to build a generalized suffix tree and iteratively take the maximal common substring, remove it, and repeat until there is no common substring of at least some size $s$. This can be done with a suffix tree in O(N^2) time (https://en.wikipedia.org/wiki/Longest_common_substring_problem#Suffix_tree). Greedily taking the maximal substring appears to be optimal (as a function of characters compressed), since taking a character sequence from another substring means adding the same number of characters elsewhere.
Each substring would then be replaced by a symbol for that block and displayed once as a sort of 'dictionary'.
$ diff a.txt b.txt
1,3d0
< $
6a4,6
> $
$ = 1,2,3
Now we have to reintroduce diff-like behavior. The simple (possibly non-optimal) answer is to run the diff algorithm first, omit all the text that wouldn't be output in the original diff, and then run the above algorithm.
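A hypothetical sketch of that greedy blocking, using difflib's longest-match search in place of a suffix tree (slower, but self-contained; note that deleting a block can splice together characters that were never adjacent, which a real implementation would have to guard against):

from difflib import SequenceMatcher

def common_blocks(a, b, min_size=2):
    # repeatedly pull out the longest common substring until all are below min_size
    a, b = list(a), list(b)
    blocks = []
    while True:
        m = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(
            0, len(a), 0, len(b))
        if m.size < min_size:
            return blocks
        blocks.append("".join(a[m.a:m.a + m.size]))
        del a[m.a:m.a + m.size]   # remove the block from both sequences, repeat
        del b[m.b:m.b + m.size]

print(common_blocks("123456", "456123", min_size=3))  # ['123', '456']

Each returned block would then become one dictionary symbol, like the $ above.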
SemanticMerge, the "semantic scm" tool mentioned in this comment to one of the other answers, includes a "semantic diff" that handles moving a block of lines (for supported programming languages). I haven't found any details about the algorithm, but it's possible the diff algorithm itself isn't particularly interesting, as it relies on the output of a separate parser for the programming-language source files themselves. Here's SemanticMerge's documentation on implementing an (external) language parser, which may shed some light on how its diffs work:
External parsers - SemanticMerge
I tested it just now and its diff is fantastic. It's significantly better than the one I produced using the demo of the algorithm mentioned in this answer (and that diff was itself much better than what was produced by Git's default diff algorithm) and I suspect still better than one likely to be produced by the algorithm mentioned in this answer.
Our Smart Differencer tools do exactly this when computing differences between source texts of two programs in the same programming language. Differences are reported in terms of program structures (identifiers, expressions, statements, blocks) precise to line/column number, and in terms of plausible editing operations (delete, insert, move, copy [above and beyond OP's request for mere "move"], rename-identifier-in-block).
The Smart Differencers require a structured artifact (e.g., a programming language), so they can't do this for arbitrary text. (We could define structure to be "just lines of text", but we didn't think that would be particularly valuable compared to standard diff.)
For this situation in my real-life coding, when I actually move a whole block of code to another position in the source, because it makes more sense either logically or for readability, what I do is this (a sample transcript follows the list):
- Clean up all the existing diffs and commit them, so that the file only requires the move we are looking for.
- Remove the entire block of code from the source, save the file, and stage that change.
- Add the code at its new position, save the file, and stage that change.
- Commit the two staged patches as one commit with a reasonable message.
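A minimal transcript of those steps (the file name and messages are made up):

$ git commit -am "unrelated cleanups, committed separately"
$ # cut the block out of foo.c in the editor, save
$ git add foo.c
$ # paste the block at its new position in foo.c, save
$ git add foo.c
$ git commit -m "move helper block above the main loop"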
Check also this online tool simtexter, based on the SIM_TEXT algorithm. It seems by far the best.
You can also have a look at the source code for the JavaScript implementation, or C / Java.

Word Suggestion program

Suggest a program or way to handle a word correction/suggestion system.
- Let's say the input is given as 'Suggset'; it should suggest 'Suggest'.
Thanks in advance. I'm using Python and AJAX. Please don't suggest any jQuery modules, because I need the algorithmic part.
The algorithm that solves your problem is called "edit distance". Given a list of words in some language and a mistyped/incomplete word, you need to build a list of words from the given dictionary that are closest to it. For example, the distance between "suggest" and "suggset" is equal to 2 - you need one deletion and one insertion. As an optimization you can assign different weights to each operation - for example, you can say that substitution is cheaper than deletion, and that substitution between two letters that lie close together on the keyboard (for example 'v' and 'b') is cheaper than between letters that are far apart (for example 'q' and 'l').
The first description of an algorithm for spelling correction appeared in 1964. In 1974 an efficient algorithm based on dynamic programming appeared in the paper "The String-to-String Correction Problem" by Robert A. Wagner and Michael J. Fischer. Any algorithms book has a more or less detailed treatment of it.
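As an illustration, here is a minimal dynamic-programming sketch in Python with uniform costs; the three 1s are exactly where the per-operation (or per-key-distance) weights described above would go:

def edit_distance(a, b):
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution (0 if equal)
        prev = cur
    return prev[-1]

print(edit_distance("suggset", "suggest"))  # 2, as in the example above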
For Python there is a library for this: the Levenshtein distance library.
Also check this earlier discussion on Stack Overflow
It will take a lot of work to make one of those yourself. There is a spell-checker library written in Python called PyEnchant that I've found to be quite nice. Here's an example from their website:
>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("Hello")
True
>>> d.check("Helo")
False
>>> d.suggest("Helo")
['He lo', 'He-lo', 'Hello', 'Helot', 'Help', 'Halo', 'Hell', 'Held', 'Helm', 'Hero', "He'll"]
>>>

Eligibility trace algorithm, the update order

I am reading Silver et al (2012) "Temporal-Difference Search in Computer Go", and trying to understand the update order for the eligibility trace algorithm.
In Algorithms 1 and 2 of the paper, the weights are updated before the eligibility trace (lines 11 and 12 of Algorithm 1, and lines 12 and 13 of Algorithm 2). I wonder if this order is correct.
Thinking about an extreme case with lambda=0, the parameters are never updated using the initial state-action pair (since e is still 0). So I suspect the order should be the opposite.
Can someone clarify the point?
I find the paper very instructive for learning the reinforcement learning area, so would like to understand the paper in detail.
If there is a more suitable platform to ask this question, please kindly let me know as well.
It looks to me like you're correct, e should be updated before theta. That's also what should happen according to the math in the paper. See, for example, Equations (7) and (8), where e_t is first computed using phi(s_t), and only THEN is theta updated using delta V_t (which would be delta Q in the control case).
Note that what you wrote about the extreme case with lambda=0 is not entirely correct. The initial state-action pair will still be involved in an update (not in the first iteration, but they will be incorporated in e during the second iteration). However, it looks to me like the very first reward r will never be used in any updates (because it only appears in the very first iteration, where e is still 0). Since this paper is about Go, I suspect it will not matter though; unless they're doing something unconventional, they probably only use non-zero rewards for the terminal game state.
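For concreteness, here is a sketch of a single SARSA(lambda) step with linear function approximation, in the order the math implies (update e before theta); the names are illustrative and not taken from the paper's algorithms:

import numpy as np

def sarsa_lambda_step(theta, e, phi_sa, phi_next_sa, r,
                      alpha=0.1, gamma=1.0, lam=0.8):
    # TD error computed with the old weights
    delta = r + gamma * theta @ phi_next_sa - theta @ phi_sa
    # decay the trace, then add the current features: with lam=0, e == phi_sa,
    # so even the very first update uses the initial state-action pair
    e = gamma * lam * e + phi_sa
    # only now apply the weight update
    theta = theta + alpha * delta * e
    return theta, e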
