Is there nice yaml documentation documenting most of functionality, written without yaml? - yaml

Official and most of yaml documentation is written in yaml itself. That is nice demonstration of language power. I get it, that's the main point of this documentation. But documenting language by using that yet unknown language is like solving puzzle. At least for me. Searching in this style of documentation really hard: "which operators can I use for string indentation?" In traditional documentating style one would use chapter say "string indentation". But here, while it's a nice demo, you need to read it all, and understand it all, which is extra subpar if you don't work with yaml daily. And yaml language spec is great, if you want to practice context (free?) grammar definition, but greatly unfit for quick search for basic questions.
My question is. Is there yaml documentation, using traditional structure, documenting most of features, not just very few? One html page, sections and paragraphs? I cannot find one, and I'm always struggling/wasting so much time trying to find something in this style of documentation. And every time I read anything, I feel I'm missing so much information, which is not shown, constantly learning X using not yet explained constructs.

Related

Handling comments in code when doing i18n

I'm in the process of translating a Open Source project from Chinese to English, and I've used i18n (in this case babel) to separate the code from both English and Chinese translations.
Everything's done, except for a rather large number of inline comments in the code.
Obviously, babel can't translate comments inline (and it would be rather obnoxious if it did, anyway. Since code would not be unique across languages and therefore less easily verifiable.)
The way I see it, there are a number of options:
Leave comments in -
Pro: Helps original author, etc.
Con: Makes it distracting for ongoing translation and anyone who doesn't speak the language
Strip out all the comments -
Pro: Code is native-language-agnostic, so it makes sense. Who needs comments anyway? Use the source, Luke!
Con: Goes against SE principles. Could lose something important in understanding how the code works - maybe something's been done to avoid a security risk, etc.
Place English comments near Chinese comments
(Possibly moved to lines above and prefixed with "EN" and "ZH", for example).
Pro: Best of both worlds, comments kept close to code
Con: Not conducive to dictionary-style translation. Can get bulky with more languages.
Create a comment dictionary / notes
Pro: Keeps the comments in a separate file for easy translation.
Con: Difficult to keep synced with code. Not intuitive to remember to update comments related to code when changing coe.
Use a different preprocessor for i18n before/after each development cycle.
Pro: Comments et al would be in your language. Could link this to git pull/push so you only ever see the code in your language.
Con: Bulky, non-obvious process. Could result in code-verification or even compilation errors.
None of these seem like really great solutions.
If you do alot of this, and the code is shared publicly between developers who don't share a native tongue, is there a recommended way to handle translating (or not) comments in the code itself?
I am not sure I understand... You say you separated the code from the languages part. So now you should have code (with comments) + English resources + Chinese resources (i used resources for whatever your programming language use to store localizable content)
Translators only see the resources, not the code, nor the comments. The comments stay untranslated, for the developers.
Short Answer
It seems to be a mixture of:
Strip out all the comments, and
Place English comments near Chinese comments.
Inline comments are almost always trivial - Strip them
Functional comments are not as intrusive - Translate them (possibly with a i18n prefix e.g. "[cn]:" or "[en]:").
Explanation
My meagre amount of research tends to suggest that larger projects make strong attempts to reduce comments and let the code explain itself, instead focusing on code quality which reduces the need for comments.
e.g. From the Linux Kernel Coding guidelines:
NEVER try to explain HOW your code works in a comment: it's much
better to write the code so that the working is obvious, and it's
a waste of time to explain badly written code.
...and from the MySQL coding standards:
Comment your code when you do something that someone else may think is
not trivial.
Both of these standards (and others) recommend minimal function descriptions also, so that's not as obtrusive to understanding the code, and, since function descriptions are generally multi-lined and above the code itself, multiple languages can be included as necessary.
Maybe someone, somewhere has built an interface that can isolate comments into the readers language, but I couldn't (yet) find any that do so.
I always think that API comments exported in the project and private comments in open source projects should be internationalized, which is very convenient for developers in other countries.
On Github, there are actually many developers who use their own national language to comment on some well-known open source projects and some of their own annotations. Most of the reason is that if they do not translate, the efficiency of developers reading comments very low.
Similar to .d.ts in TypeScript, I think function annotation translation can also take a similar form, which is more convenient for the community to feedback translation content, because in fact many developers are willing to do so.

Is there a Cypher syntax definition anywhere?

I'm looking for a definition of the syntax for the Cypher query language. I tried the docs but they're very vague.
Ideally, I'd like a BNF (or any variant) definition, or one of those "graph" definitions like this or this. Really, anything resembling a formal definition.
What you are looking for will be available in openCypher. Several items will be released as part of the project, one of the first of which is the BNF grammar.
Update 2016-01-30: A first draft of the grammar is now avialable at \https://github.com/opencypher/openCypher/blob/master/grammar.ebnf.
Update: 2016-10-17: EBNF and Antlr grammars, TCK, railroad diagrams, and a list of community projects are available at http://www.opencypher.org/#resources
Take a look at the recently announced (Oct 2015) openCypher project. It involves releasing the language specification, among other things.
From the announcement:
1. Cypher reference documentation:
A comprehensive user documentation describing use of the Cypher query language with examples and tutorials.
2. Technology compatibility kit (TCK):
The TCK consists of a number of tests that a software supplier would run in order to self-certify support for a given version of Cypher.
3. Reference implementation:
Distributed under the Apache 2.0 license, the reference implementation is a fully functional implementation of key parts of the stack needed to support Cypher inside a data platform or tool. The first planned deliverable is a parser that will take a Cypher statement and parse it into an AST (abstract syntax tree) representation. The reference implementation complements the documentation and tests by providing working implementations of Cypher – which are permissively licensed – and can be used as examples or as a foundation for one’s own implementation.
4. Cypher language specification:
Licensed under a Creative Commons license, the Cypher language specification is a technical expression of the language syntax to enable parsers to auto-generate the query syntax. A full semantic specification is also planned as a part of the openCypher project.
The same announcement also says that the process is open and that it is possible to submit, review and comment on language proposals.
Update!
Neo4j has changed a lot since this answer was written. In 2017 the simple answer is yes, you can download the grammar files from https://www.opencypher.org/
Below is the old answer, which was accurate in 2014
As far as I can tell, the only formal definition is in the code. That's the bad news.
The good news is that the code uses a scala library to do the parsing which makes the code rules look kinda/sorta like BNF. And there's some documentation on how to read it.
Here's a link into a scala object that defines what a query is.
This general package on github looks to me like it contains all of the cypher command implementations, and should have everything you're asking for.
Code in this package is written in scala, and looks like this:
object Query {
def start(startItems: StartItem*) = new QueryBuilder().startItems(startItems:_*)
def matches(patterns:Pattern*) = new QueryBuilder().matches(patterns:_*)
def optionalMatches(patterns:Pattern*) = new QueryBuilder().matches(patterns:_*).makeOptional()
def updates(cmds:UpdateAction*) = new QueryBuilder().updates(cmds:_*)
def unique(cmds:UniqueLink*) = new QueryBuilder().startItems(Seq(CreateUniqueStartItem(CreateUniqueAction(cmds:_*))):_*)
(...)
This matches roughly with the upper right hand quadrant of the Cypher refcard. You can sorta see that there can be a start clause, a match clause, and so on. This includes links to other implementation classes (like UpdateAction which further define clauses considered update actions.
Make sure to also read How Neo4J Uses Scala's Parser Combinator: Cypher's Internals Part 1 for more information on what's going on here, and the mapping between the scala classes and what we'd normally consider EBNF. This blog post is old (2011) and the specific code examples it gives shouldn't be trusted, but I think it has good general information on how the implementation works, and what to look for if you want to understand the EBNF behind cypher.
Disclaimer: I'm not a scala hardcore, YMMV, IANAL, devs please overrule me if I'm wrong.
(Michael Hunger answered in a comment, so I can't accept his answer. Here's his answer:)
Cypher uses parboiled as parser, the parboiled rule DSL are pretty easy to read and understand. https://github.com/neo4j/neo4j/blob/d18583d260a957ab1a14bd27d34eb5625df42bc5/community/cypher/cypher-compiler-2.2/src/main/scala/org/neo4j/cypher/internal/compiler/v2_2/parser/Clauses.scala
None of these seem to work any more.
I don't see anything on the opencypher.org site that looks like a grammar to download.
None of the github links from Michael Hunger work.
I'd really like access to SOME resource where I can learn how to construct queries for functions like avg that allegedly take a list expression as an argument, yet barf at every variant I can figure out.

Parsing XML, how is this actually done? [duplicate]

So, just as a fun project, I decided I'd write my own XML parser. No, not to parse a specific document, and no, not using an XML parser library. I mean writing code to parse out any XML document into a usable data structure. Just because I like the challenge. :-)
With that said, so far it's proved to be... interesting. It's not as easy to parse (especially when you start taking into account special characters, CDATA, empty tags, comments, etc.) as it initially looked.
Are there any well documented XML parsing algorithms or explanations anywhere that anyone knows of? It seems like there are well-documented Queue and Stack and BTree and etc. etc. etc. implementations everywhere, but I'm not sure I've ever seen a simple, well-documented XML parser algorithm...
I repeat: I am not looking for a pre-built parser library! I am looking for information on how to create my own pre-built parser library! Do not tell me "use expat" or "use SAX" or whatever. That's not what I'm asking for.
Antlr offers a tutorial on parsing XML. It breaks the process down into phases: lexing, parsing, tree parsing, etc. Looks pretty interesting.
I don't know if it would be "cheating" in your book, but you could try parsing your XML with a ready-built all-purpose language parser like ANTLR. The result would be a list of tokens (if you just use the lexer) or a parse tree (if you include the parser) and you could then re-build the parse tree almost 1:1 into an XML structure.
Maybe. I haven't thought about the ways in which XML might be different from "normal" ANTLR fodder like programming languages, and whether you would be able to define a suitable grammar.
VTD-XML is probably the simplest parsing technique possible...
http://expat.sourceforge.net/
Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags). An introductory article on using Expat is available on xml.com.

Scheme core language specification

I am learning my way around Scheme, and I am especially interested in how the language is constructed. I'm trying to find a nice description of the core syntax for a Scheme implementation. I don't know enough about the standards, but I assume that they all contain macro systems. If not, I'd like to read about a standard that also includes macros (they can't possibly be implemented in simpler Scheme constructs, can they?).
Does anyone have a good reference for the minimal syntax needed for a Scheme dialect?
Just an update:
I also stumbled upon this: http://matt.might.net/articles/compiling-to-java/#sec1. If you also add define-syntax and delay then it seems like it might be a good start.
In the R5RS specification, the following page appears to be what I was looking for: formal syntax
Although it may be a bit dry, you should read over the R5RS spec or the R6RS spec.
The docs really do not take that long to read through and you can just skim most of the sections until you need more detail. But either document does cover all of the minimal syntax required, including macros.

Standards Document

I am writing a coding standards document for a team of about 15 developers with a project load of between 10 and 15 projects a year. Amongst other sections (which I may post here as I get to them) I am writing a section on code formatting. So to start with, I think it is wise that, for whatever reason, we establish some basic, consistent code formatting/naming standards.
I've looked at roughly 10 projects written over the last 3 years from this team and I'm, obviously, finding a pretty wide range of styles. Contractors come in and out and at times, and sometimes even double the team size.
I am looking for a few suggestions for code formatting and naming standards that have really paid off ... but that can also really be justified. I think consistency and shared-patterns go a long way to making the code more maintainable ... but, are there other things I ought to consider when defining said standards?
How do you lineup parenthesis? Do you follow the same parenthesis guidelines when dealing with classes, methods, try catch blocks, switch statements, if else blocks, etc.
Do you line up fields on a column? Do you notate/prefix private variables with an underscore? Do you follow any naming conventions to make it easier to find particulars in a file? How do you order the members of your class?
What about suggestions for namespaces, packaging or source code folder/organization standards? I tend to start with something like:
<com|org|...>.<company>.<app>.<layer>.<function>.ClassName
I'm curious to see if there are other, more accepted, practices than what I am accustomed to -- before I venture off dictating these standards. Links to standards already published online would be great too -- even though I've done a bit of that already.
First find a automated code-formatter that works with your language. Reason: Whatever the document says, people will inevitably break the rules. It's much easier to run code through a formatter than to nit-pick in a code review.
If you're using a language with an existing standard (e.g. Java, C#), it's easiest to use it, or at least start with it as a first draft. Sun put a lot of thought into their formatting rules; you might as well take advantage of it.
In any case, remember that much research has shown that varying things like brace position and whitespace use has no measurable effect on productivity or understandability or prevalence of bugs. Just having any standard is the key.
Coming from the automotive industry, here's a few style standards used for concrete reasons:
Always used braces in control structures, and place them on separate lines. This eliminates problems with people adding code and including it or not including it mistakenly inside a control structure.
if(...)
{
}
All switches/selects have a default case. The default case logs an error if it's not a valid path.
For the same reason as above, any if...elseif... control structures MUST end with a default else that also logs an error if it's not a valid path. A single if statement does not require this.
In the occasional case where a loop or control structure is intentionally empty, a semicolon is always placed within to indicate that this is intentional.
while(stillwaiting())
{
;
}
Naming standards have very different styles for typedefs, defined constants, module global variables, etc. Variable names include type. You can look at the name and have a good idea of what module it pertains to, its scope, and type. This makes it easy to detect errors related to types, etc.
There are others, but these are the top off my head.
-Adam
I'm going to second Jason's suggestion.
I just completed a standards document for a team of 10-12 that work mostly in perl. The document says to use "perltidy-like indentation for complex data structures." We also provided everyone with example perltidy settings that would clean up their code to meet this standard. It was very clear and very much industry-standard for the language so we had great buyoff on it by the team.
When setting out to write this document, I asked around for some examples of great code in our repository and googled a bit to find other standards documents that smarter architects than I to construct a template. It was tough being concise and pragmatic without crossing into micro-manager territory but very much worth it; having any standard is indeed key.
Hope it works out!
It obviously varies depending on languages and technologies. By the look of your example name space I am going to guess java, in which case http://java.sun.com/docs/codeconv/ is a really good place to start. You might also want to look at something like maven's standard directory structure which will make all your projects look similar.

Resources