Did Ruby deprecate the wrong File exists method? - ruby

The docs of File.exist? says:
Return true if the named file exists.
Note that last word used; "exists". This is correct. "File exist" without the ending s is not correct.
The method File.exists? exists, but they deprecated this method. I am thinking it should have been the other way around. What am I missing?
Also, it's noteworthy that other languages/libraries use exists, for example Java and .NET.
Similarly, "this equals that" - but Ruby uses equal, again dropping the ending s. I am getting a feeling that Ruby is actively walking in another direction than mainstream. But then there has to be a reason?

This is largely a subjective call. Do you read the call as "Does this file exist?" or "File exists"? Both readings have their merits.
Historically Ruby has had a lot of aliased methods like size vs. length, but lately it seems like the core team is trying to focus on singular, consistent conventions that apply more broadly.
You'd have to look closely at the conversations on the internal mailing list surrounding the decisions here. I can't find them easily, only people dealing with the changes as deprecation warnings pop up.
The Ruby core team is a mix of people who speak different languages but the native language is Japanese, so perhaps that's guiding some of these decisions. It could be a preference to avoid odd inflections on verbs.

I agree that if File.exists?('x.txt') reads much more natural than the plural form, which was probably Matz's intention for the alias. As far as I'm concerned, this particular deprecation was misguided.
However the general preference of a plural form may well be routed in handling enumarables/collections, where plural makes a sense when used with idioms like this:
pathnames.select(&:exist?)

exist? matches the convention used elsewhere throughout the stdlib, and goes back to the early days of ruby. For example array.include? (not includes?), string.match? (not matches?), object.respond_to? (not responds_to?). So in this light, File.exists? was always a blemish.
Some recommend that you read the dot as "does". So "if file does exist," "if array does include," "if string does match," etc.

Related

Why is prefixing class, method, or variable names with "My" considered bad practice?

Often in code snippets or examples, the prefix My is used for classes, methods, and variables.
We see this all the time, from Google's gcm example to numerous questions on stackoverflow. It can be found in documentation for PHP, Javascript, C#, and really just about every other language.
So, given it's extreme prevalence, why shouldn't I put MyClass with myMethod using myVar1 into production? Why is this bad practice, and what should I do instead?
Name are important
Everything else comes down to that one simple fact.
Having clear, meaningful variable names is an important factor in writing understandable and maintainable code. - Computational Fairy Tales
When you write a piece of code, you want anyone reading it (which includes you in three months!) to know what it does almost immediately. In order to remember, you would probably need to add a comment about it. However...
The proper use of comments is to compensate for our failure to express our self in code - Express Names in Code: Bad vs Clean
To help with this, your name should be both descriptive and concise. Generally, when the prefix My is added, it replaces a prefix that could be more informational. Instead of MyService, why not GcmListenerService?
What if I make it descriptive, but I just add My? Like MyGcmListenerService?
That breaks the consice part of the rule. What do you gain from adding My? Is the My adding value to the name? Even if you were attempting to take possession of the code (which would be better done and is feasible using vcs), My is meaningless. Who wrote My? Well, I did, obviously. It says it right there, "My".
If I really shouldn't use My, why is it in so many examples?
Really, it's just a placeholder.
It's like the metasyntactic variables "foo" and "bar" - it's usually used as a placeholder for a real name. - Opinions on using My as a class name prefix
Unfortunately, most experienced programmers just assume that people looking up examples will know this, and they will replace My with a good variable name when they actually use the code. However, for those new to programming, if you see it all over the place, instead of knowing it is a placeholder, you may think it is actually a standard, and best practice to use My.
Ok, then what do I use instead of My?
There are a lot of really good guides about naming variables and classes out there. You can start with wikipedia, but if you google around a bit you can find lots of articles about good vs bad names. At the heart of it all, though, is one rules: make names descriptive, and keep them concise.
If I read the name of a class, method, or variable and immediately know its purpose, it is a good name.

What is the meaning of the "#" prefix on some D attributes?

The D Programming Language has at least two attributes prefixed with the "#" symbol:
#disable
#property
What sort of meaning is "#" supposed to convey? I can't seem to locate anything relevant in the documentation.
Also, why is __gshared the only attribute with two leading underscores?
It has no meaning.
Yes, that probably wasn't what you were hoping to hear -- but that's what they've said in the newsgroups.
The # doesn't really mean anything at this point. All of the #x words are function attributes. The # was tacked on pretty much just to save keywords. So, in general, newer attributes have # on them and older ones don't (though there was some shuffling around of that a while back where there was some debate over whether some of the attributes should have # or not). If they were redone from scratch without caring what other languages have done, then you might have gotten # on all of the function attributes, but there was no way that stuff like #public was going to happen, since it would have just made porting code harder for no real benefit. The end result is that what got # and what didn't is fairly arbitrary. You just have to remember which attributes start with # and which don't, but that's not all that much different from having to learn new keywords. It's just that these are prefixed with # so that they aren't actually keywords and don't reduce the number of legal identifiers in the language.
Now, there's definitely a desire among many in the D community to use # for custom attributes in the future, in which case, # would indicate a custom attribute in the cases where the name used wasn't one built into the language, but for all of the ones built into the language, it pretty much just amounts to saving a keyword.
As Mehrdad shows (see the links in the comments), there's no special meaning to "#", they are how they are just for historical reasons.
As for your other question, __gshared isn't the only keyword with two underscores, there's also __thread and __traits. This naming convention is commonly used to denote internal data structures, which need to be exposed for practical reasons but are not "safe" to use in all cases (i.e. more a hack than a well-established feature). I'm not sure whether or not the D language follows this convention, but seeing this quote from the docs I believe that's the case:
__gshared is disallowed in safe mode.
I'm searching for more info about __thread and __traits (which indeed are not attributes), but so far could find very little.

Why is Pathname not considered a string (or when is it okay to add a to_str method to a class)?

There was a bug report Pathname#to_str doesn't appear to work anymore. Having to_str in Pathname would let you use Pathname anywhere you use a string, eg with Dir, system, etc. However the bug report was rejected, and it appears from a commit message that the to_str method was deliberately removed.
I don't understand this - Pathname can be loselessly converted to a String and back, and it would be very convenient when working with APIs that don't use Pathname.
So why is having to_str not appropriate for Pathname, and when is to_str okay to use?
Here's an article that attempts to answer your question but, in my opinion, doesn't really succeed.
Around the same time Pathname#to_str was removed Exception#to_str was removed as well--clearly Matz was trying to draw a line in the sand at this time between "stringlike" and "non-stringlike" classes. The Exception change makes sense--an Exception cannot, to use your words, "be losslessly converted to a String and back," because an Exception object contains lots of other information--the stack trace in particular--that would be lost in that conversion.
I can only guess but I'd wager that Matz felt the same way about Pathname, though it's unclear why. Even to documentation (1.9.3) at one point, says (under "Core methods"), "These methods are effectively manipulating a String, because that’s all a path is." Several sources I've found--in addition to the one #MarkThomas cites--use Pathname as an example of a class for which to_str does make sense, probably taking a cue from Hal Fulton's The Ruby Way.
I guess this isn't a very satisfactory answer. If you really want to know you may have to ask on Ruby-Talk or Ruby-Core. You could try asking Matz [on Twitter](Yukihiro Matsumoto), but he seems to converse exclusively in Japanese. Wycats and Jeremy Kemper may have some more insight, though, and seem pretty accessible. Good luck!
P.S. This article has a section "Technical explanation of to_str and friends" that I found interesting, but it doesn't do any better a job of answering your question.

Are semantics and syntax the same?

What is the difference in meaning between 'semantics' and 'syntax'? What are they?
Also, what's the difference between things like "semantic website vs. normal website", "semantic social networking vs. normal social networking" etc.
Syntax is the grammar. It describes the way to construct a correct sentence. For example, this water is triangular is syntactically correct.
Semantics relates to the meaning. this water is triangular does not mean anything, though the grammar is ok.
Talking about the semantic web has become trendy recently. The idea is to enhance the markup (structural with HTML) with additional data so computer could make sense of the web pages more easily.
Syntax is the grammar of a language - the rules by which to form sentences or expressions.
Semantics is the meaning you are trying to express with your code.
A program that is syntactically correct will compile and run.
A program that is semantically correct will actually do what you as the programmer intended it to do. i.e. it doesn't have any bugs in it.
Two programs written to perform the same task in different languages will use different syntaxes, but they would be the same semantically.
If you are talking about web (rather than programming languages):
The syntax of the language is whatever the browser (or processing program) can legally recognize and handle, and render to you. For example, your browser can render HTML, while your API can parse XML trees.
Semantics involve what is actually being represented. There's a lot of buzz now about semantic webs and all that stuff, but it essentially means that each entity is also associated with some human-readable information or metadata, so that a certain tag would have a supposed meaning and refer you to it.
Social networks are the same story. You put knowledge in the links
"An ant ate an aunt." has a correct syntax, but will not make sense semantically. A syntax is a set of rules that can be combined to produce infinite number of gramatically valid sentences, but few, very few of which has a semantics.
Syntax is the word order of a sentence. In English it would be the subject-verb-object form.
Semantic is the meaning behind words. E.g: she ate a saw. The word saw doesn't match according to the meaning of the sentence. but it is grammatically correct. so its syntax is correct. =)
Specifically, semantic social networking means embedding the actual social relationships within the page markup. The standard format for doing this as defined by microformats is XFN, XHTML Friends Network. In regards to the semantic web in general, microformats should be the go-to guide for defining embedded semantic content.
Semantic web sites use the concept of the semantic web, which aims to bring meaning to web content by using special annotations to identify certain concepts in a page. This makes possible the automatic (by a computer, not a human) reasoning about the content, which improves its aggregation, extraction, indexing and searching.
Explanations above are vague on the semantics side, semantics could mean the different elements at disposition to build arguments of value(these being comprehensible, to end-user man and digestible to the machine).
Of course this puts semantics and the programmer-editor-writer-communicator in the middle: he decides on the semantics that should be ideally defined to his public, comprehended by his public, general convention by his public and digestible to the machine-computer. Semantics should be agreed upon, are conceptual, must be implementable to both sides.
Say footnotes, inline and block-quotes, titles and on and on to end up into a well-defined and finite list. Mediawiki, wikitext as an example fails in that perspective, defining syntax for elements of semantic meaning left undefined, no finite list agreed upon. "meaning by form" as additional of what a title as an example again carries as textual content. Example "This is a title" becomes only semantics integrated by the supposition within agreed upon semantics, and there can be more then one set of say "This is important and will be detailed"
Asciidoc and pandoc markup is quite different in it's semantics, regardless of how each translates this by convention of syntax to output formats.
Programming, output formats as html, pdf, epub can have consequentially meaning by form, by semantics, the syntax having disappeared as a temporary tool of translation, and as one more consequence thus the output can be scanned robotically for meaning, the champ of algorithms of 'grep': Google. Looking for the meaning of "what" in "What is it that is looked for" based upon whether a title or a footnote, or a link is considered.
Semantics, and there can be more then one layer, even the textual message carries (Chomsky) semantics thus could be translated as meaning by form, creating functional differences to anything else in the output chain, including a human being, the reader.
As a conclusion, programmers and academics should be integrated, no academic should be without knowledge of his tools, as any bread and butter carpenter. Programmers should be academics in the sense that the other end of the bridging they accomplish is the end user, the bridge... much so: semantics.
m.

Standards Document

I am writing a coding standards document for a team of about 15 developers with a project load of between 10 and 15 projects a year. Amongst other sections (which I may post here as I get to them) I am writing a section on code formatting. So to start with, I think it is wise that, for whatever reason, we establish some basic, consistent code formatting/naming standards.
I've looked at roughly 10 projects written over the last 3 years from this team and I'm, obviously, finding a pretty wide range of styles. Contractors come in and out and at times, and sometimes even double the team size.
I am looking for a few suggestions for code formatting and naming standards that have really paid off ... but that can also really be justified. I think consistency and shared-patterns go a long way to making the code more maintainable ... but, are there other things I ought to consider when defining said standards?
How do you lineup parenthesis? Do you follow the same parenthesis guidelines when dealing with classes, methods, try catch blocks, switch statements, if else blocks, etc.
Do you line up fields on a column? Do you notate/prefix private variables with an underscore? Do you follow any naming conventions to make it easier to find particulars in a file? How do you order the members of your class?
What about suggestions for namespaces, packaging or source code folder/organization standards? I tend to start with something like:
<com|org|...>.<company>.<app>.<layer>.<function>.ClassName
I'm curious to see if there are other, more accepted, practices than what I am accustomed to -- before I venture off dictating these standards. Links to standards already published online would be great too -- even though I've done a bit of that already.
First find a automated code-formatter that works with your language. Reason: Whatever the document says, people will inevitably break the rules. It's much easier to run code through a formatter than to nit-pick in a code review.
If you're using a language with an existing standard (e.g. Java, C#), it's easiest to use it, or at least start with it as a first draft. Sun put a lot of thought into their formatting rules; you might as well take advantage of it.
In any case, remember that much research has shown that varying things like brace position and whitespace use has no measurable effect on productivity or understandability or prevalence of bugs. Just having any standard is the key.
Coming from the automotive industry, here's a few style standards used for concrete reasons:
Always used braces in control structures, and place them on separate lines. This eliminates problems with people adding code and including it or not including it mistakenly inside a control structure.
if(...)
{
}
All switches/selects have a default case. The default case logs an error if it's not a valid path.
For the same reason as above, any if...elseif... control structures MUST end with a default else that also logs an error if it's not a valid path. A single if statement does not require this.
In the occasional case where a loop or control structure is intentionally empty, a semicolon is always placed within to indicate that this is intentional.
while(stillwaiting())
{
;
}
Naming standards have very different styles for typedefs, defined constants, module global variables, etc. Variable names include type. You can look at the name and have a good idea of what module it pertains to, its scope, and type. This makes it easy to detect errors related to types, etc.
There are others, but these are the top off my head.
-Adam
I'm going to second Jason's suggestion.
I just completed a standards document for a team of 10-12 that work mostly in perl. The document says to use "perltidy-like indentation for complex data structures." We also provided everyone with example perltidy settings that would clean up their code to meet this standard. It was very clear and very much industry-standard for the language so we had great buyoff on it by the team.
When setting out to write this document, I asked around for some examples of great code in our repository and googled a bit to find other standards documents that smarter architects than I to construct a template. It was tough being concise and pragmatic without crossing into micro-manager territory but very much worth it; having any standard is indeed key.
Hope it works out!
It obviously varies depending on languages and technologies. By the look of your example name space I am going to guess java, in which case http://java.sun.com/docs/codeconv/ is a really good place to start. You might also want to look at something like maven's standard directory structure which will make all your projects look similar.

Resources