Best practice for Cocoa category naming conventions

Best practice for Cocoa category naming conventions - cocoa

I am tidying up my ancient Cocoa code to use modern naming conventions. There has been lots of discussion on best practices, but I'm unsure of one thing.
I'm thinking about adding a prefix to category method names, to ensure uniqueness. It seem generally agreed that this is a good idea, though most people probably don't bother.
My question is: what about a NSDictionary category method like -copyDeep that does a deep copy? The method used to be named -deepCopy, but I reversed the words as the analyzer looks for a prefix of "copy". Therefore I presumably couldn't add a prefix. And having the "prefix" in the middle or end of the method name seems messy and inconsistent.
I'd also be interested in thoughts on the style of prefix -- I currently use DS (for Dejal Systems) for class prefixes. But I know that Apple now wants to reserve all two-character prefixes for themselves, so am thinking about using Dejal, e.g. my class DSManagedObject would be renamed as DejalManagedObject. And getting back to categories, their methods would be renamed to add a dejal prefix, e.g. from -substringFromString: to -dejalSubstringFromString:. But -dejalCopyDeep would confuse the analyzer, so maybe I'd have to be inconsistent for such methods, and use -copyDeepDejal or -copyDeep_dejal?
I will be re-releasing my categories and various classes as open source once I've cleaned them up, so following the latest conventions will be beneficial.

I emailed the Apple Application Frameworks Evangelist about this, and got a reply that recommended not prefixing category method names. Which conflicts with the advice in the aforelinked WWDC10 session, but I assume reflects Apple's current thinking.
He recommended just looking at the beta seed API diffs to spot conflicts, which is what I've always been doing.

I agree with Kevin Ballard, you should prefix your category method names, particularly if you are going to distribute them to others. But you do have a valid concern the analyzer will be confused by DScopy. The ARC compiler will similarly be confused if the definition/implementation of DScopy is done without ARC and is used by another class using ARC (or vice versa).
My preferred solution is to use "ownership transfer annotations", such as:
NS_RETURNS_NOT_RETAINED
NS_RETURNS_RETAINED
They would be used to override the compilers default behavior of reading method names and acting on them. You might declare DScopy like so: (This declaration must be in a header file that is imported by all the classes that use this method mentioned due to link)
-(DSManagedObject *)DScopy; NS_RETURNS_RETAINED;
Source for NS_RETURNS... WWDC 2011 Session 322 - Objective-C Advancements in Depth. The meat of this issue begins at about time 9:10.
A note about "But I know that Apple now wants to reserve all two-character prefixes for themselves". As a personal preference I like to use the _ character to separate the prefix from the name, it works well for me. You might try something like:
-(DSManagedObject *)ds_copy; NS_RETURNS_RETAINED;
This would give you three characters, and arguably make the method name more readable.
Edit In response to link posted in comment.
However as Justin's answer to your original question says that can be broken.
With regards to attributes; I did not suggest using __attribute__((objc_method_family(copy))) I suggested using NS_RETURNS_RETAINED, which translates to :__attribute__((ns_returns_retained)). While the first example there won't even compile (as he says) using - (NSString *)string __attribute__((objc_method_family(copy))); it compiles with - (NSString *)string; NS_RETURNS_RETAINED; just fine.
Obviously also if the NS_RETURNS_.. are "hidden" from the compiler in separate the .ms or indirected in some other way and the compiler can't see the directives then it won't work. Because of this I would suggest putting the declaration for any methods that may cause the analyzer/compiler confusion in your main .h file (the one that imports all the others) to limit the chances that there will be an issue.

Related

Why is prefixing class, method, or variable names with "My" considered bad practice?

Often in code snippets or examples, the prefix My is used for classes, methods, and variables.
We see this all the time, from Google's gcm example to numerous questions on stackoverflow. It can be found in documentation for PHP, Javascript, C#, and really just about every other language.
So, given it's extreme prevalence, why shouldn't I put MyClass with myMethod using myVar1 into production? Why is this bad practice, and what should I do instead?

Name are important
Everything else comes down to that one simple fact.
Having clear, meaningful variable names is an important factor in writing understandable and maintainable code. - Computational Fairy Tales
When you write a piece of code, you want anyone reading it (which includes you in three months!) to know what it does almost immediately. In order to remember, you would probably need to add a comment about it. However...
The proper use of comments is to compensate for our failure to express our self in code - Express Names in Code: Bad vs Clean
To help with this, your name should be both descriptive and concise. Generally, when the prefix My is added, it replaces a prefix that could be more informational. Instead of MyService, why not GcmListenerService?
What if I make it descriptive, but I just add My? Like MyGcmListenerService?
That breaks the consice part of the rule. What do you gain from adding My? Is the My adding value to the name? Even if you were attempting to take possession of the code (which would be better done and is feasible using vcs), My is meaningless. Who wrote My? Well, I did, obviously. It says it right there, "My".
If I really shouldn't use My, why is it in so many examples?
Really, it's just a placeholder.
It's like the metasyntactic variables "foo" and "bar" - it's usually used as a placeholder for a real name. - Opinions on using My as a class name prefix
Unfortunately, most experienced programmers just assume that people looking up examples will know this, and they will replace My with a good variable name when they actually use the code. However, for those new to programming, if you see it all over the place, instead of knowing it is a placeholder, you may think it is actually a standard, and best practice to use My.
Ok, then what do I use instead of My?
There are a lot of really good guides about naming variables and classes out there. You can start with wikipedia, but if you google around a bit you can find lots of articles about good vs bad names. At the heart of it all, though, is one rules: make names descriptive, and keep them concise.
If I read the name of a class, method, or variable and immediately know its purpose, it is a good name.

Artifact naming convention

We're doing a big project on OSGi and adding some commons modules. There's some discussion about naming the artifact.
So, one possibility when naming the module is for example:
cmns-definitions (for common definitions), another is cmns-definition, still another is cmns-def. This has some effect also on the package name. Now it's
xx.xxx.xxx.xxx.xxx.commons.definitions, if changing to cmns-def it would be xx.xxx.xxx.xxx.xxx.commons.def.
Inside this package will be classes like enums and other definitions to be used throughout the system.
I personally lean to cmns-definitions since there's not only 1 definition inside the package. Other people point out that java.util doesn't have only 1 utility there for example. Still, java.util is an abbreviation for me. It can mean java utility or java utilities. Same thing happens with commons-lang.
How would you name the package? Why would you choose this name?
cmns-definitions
cmns-definition
cmns-def
Bonus question: How to name something like cmns-exceptions? That's how I name it. Would you name it cmns-xcpt?
ËDIT:
I'm throwing in my own thoughts on this in the hope of being either confirmed or contradicted. If you can, please do.
According to what I think, the background reason why you name something is to make it easier to understand what's inside it. Or, according to Peter Kriens, to make it easy to remember and being able to automate processes via patterns. Both are valid arguments.
My reasoning is as follows in terms of pattern:
1) When a substantivation occurs and it's well known in the industry, follow it on your naming.
Eg:
"features" is a case on this. We have a module called cmns-features. Does this mean we have many features on this module? No. It means "the module that implements the "features" file from Apache karaf".
"commons" is a substantivation of "common" well-accepted on the industry. It doesn't mean "many common". It means "Common code".
If I see extr-commons as a module name, I know that it contains common code for extr (in this case extraction), for example.
2) When a quantity of classes inside the module are cooperating to give a distinct "one and one only" meaning to the whole, use singular form to name it.
The majority of modules are included here. If I name something cmns-persistence-jpa, I mean that whatever classes inside cooperate together to provide the jpa implementation of cmns-persistence-api. I don't expect 2 implementations inside it, but actually a myriad of classes that together make one implementation. Crystal clear to me. No?
3) When a grouping of classes is done with the sole purpose of gathering classes by affinity, but the classes don't cooperate together to no purpose, use plural.
Here is the case for example of cmns-definitions (enums used by the whole system).
Alternatively, using an abbreviation circumvents the problem, e.g. cmns-def which can be also "interpreted expanded" by a human reader to cmns-definitions. Many people use also "xxxx-util" meaning xxxx-utilities.
Still a third option can be used to pack things together, using a name that itself means a pluralization. The word "api" comes to mind, but any word that pluralizes something would do, like "pack".
Support to these cases (3) are well-known modules like commons-collections (using the plural) or commons-dbcp (using abbreviation) or commons-lang (again abbreviation) and anything that uses api to pack classes together by affinity.
From apache:
commons-collections -> many powerful data structures that accelerate development of most significant Java applications
commons-lang -> host of helper utilities for the java.lang API
commons-dbcp -> package of several database connection pools

'it is just a name ...'
I find in my long career that these just names can make a tremendous difference in productivity. I do not think it makes a difference if you use definitions, definition, or def as long as you're consistent and use patterns in the name that are easy to remember and can be used to automate processes. A build based on a consistent naming scheme is infinitely easier to work with than a build with "nice human display" names that are ad-hoc and have no discernible pattern.
If you use patterns, names tend to become shorter. Now people working with these names usually spent a lot of time with them. So their readability is not nearly as important as their mnemonic value. It turns out that abbreviations of 3 or 4 characters are surprisingly powerful. One of the reason is they work well is that there is only one possible abbreviation while if you go longer there are many candidates.
Anyway, most import part is the overall consistency. Good luck.

definitions (or def or definition) is a bad name because it doesn't have any semantic to a reader. You're in an object oriented world (I suppose) - try to follow its conventions and principles. Modules in Maven should be named after the biggest "abstraction" they contain. "Definition" is a form, not a meaning.
Your question is similar to: "Which class name is better FileUtilities or FileUtils". Answer: none.

Basically what you do with the Definitions and Exceptions is to provide kind of an API for your other modules. So I propose to combine definitions, exceptions and add interfaces to it. Then it makes sense to call it all cmns-api. I normally prefer the singular names as they are shorter but you are free to decide as it is just a name.

Best Practice for comments in Java source files?

This doesn't have to be Java, but it's what I'm dealing with. Also, not so much concerned with the methods and details of those, I'm wondering about the overall class file.
What are some of the things I really need to have in my comments for a given class file? At my corporation, the only things I really can come up with:
Copyright/License
A description of what the class does
A last modified date?
Is there anything else which should be provided?
One logical thing I've heard is to keep authors out of the header because it's redundant with the information already being provided via source control.
Update:
JavaDoc can be assumed here, but I'm really more concerned about the details of what's good to include content-wise, whether it's definitive meta-data that can be mined, or the more loose, WHY etc...

One logical thing I've heard is to keep authors out of the header because it's redundant
with the information already being provided via source control.
also last modified date is redundant
I use a small set of documentation patterns:
always documenting about thread-safety
always documenting immutability
javadoc with examples
#Deprecation with WHY and HOW to replace the annotated element
keeping comments at minimum

No to the "last modified date" - that belongs in source control too.
The other two are fine. Basically concentrate on the useful text - what the class does, any caveats around thread safety, expected usage etc.
Implementation comments should usually be about why you're doing something non-obvious - and should therefore be rare. (For instance, it could be because some API behaves in an unusual way, or because there's a useful shortcut you can use but which isn't immediately obvious.)

For the sanity of yourself and future developers, you really ought to be writing Javadocs.

When you feel the need to write comments to explain what some code does, improve the readability of the code, so that comments are not needed. You can do that by renaming methods/fields/classes to have more meaningful names, and by splitting larger methods into smaller methods using the composed method pattern.
If even after all your efforts the code is not self-explanatory, for example the reason why some unobvious code had to be written is not clear from the code, then apologize by writing comments. (Sometimes you can document the reasons by writing a test which will fail, if somebody changes the unobvious-but-correct code to do the obvious-but-wrong thing. But having a comment in addition to that is also useful. I prefix such comments often with "// HACK:" or "// XXX:".)

An overall description of the purpose of the class, a description for each field and a contract for each method. Javadoc format works well.

If you assign ownership of components to particular developers or teams, owners should be recorded in the component source or VCS metadata.

Standards Document

I am writing a coding standards document for a team of about 15 developers with a project load of between 10 and 15 projects a year. Amongst other sections (which I may post here as I get to them) I am writing a section on code formatting. So to start with, I think it is wise that, for whatever reason, we establish some basic, consistent code formatting/naming standards.
I've looked at roughly 10 projects written over the last 3 years from this team and I'm, obviously, finding a pretty wide range of styles. Contractors come in and out and at times, and sometimes even double the team size.
I am looking for a few suggestions for code formatting and naming standards that have really paid off ... but that can also really be justified. I think consistency and shared-patterns go a long way to making the code more maintainable ... but, are there other things I ought to consider when defining said standards?
How do you lineup parenthesis? Do you follow the same parenthesis guidelines when dealing with classes, methods, try catch blocks, switch statements, if else blocks, etc.
Do you line up fields on a column? Do you notate/prefix private variables with an underscore? Do you follow any naming conventions to make it easier to find particulars in a file? How do you order the members of your class?
What about suggestions for namespaces, packaging or source code folder/organization standards? I tend to start with something like:
<com|org|...>.<company>.<app>.<layer>.<function>.ClassName
I'm curious to see if there are other, more accepted, practices than what I am accustomed to -- before I venture off dictating these standards. Links to standards already published online would be great too -- even though I've done a bit of that already.

First find a automated code-formatter that works with your language. Reason: Whatever the document says, people will inevitably break the rules. It's much easier to run code through a formatter than to nit-pick in a code review.
If you're using a language with an existing standard (e.g. Java, C#), it's easiest to use it, or at least start with it as a first draft. Sun put a lot of thought into their formatting rules; you might as well take advantage of it.
In any case, remember that much research has shown that varying things like brace position and whitespace use has no measurable effect on productivity or understandability or prevalence of bugs. Just having any standard is the key.

Coming from the automotive industry, here's a few style standards used for concrete reasons:
Always used braces in control structures, and place them on separate lines. This eliminates problems with people adding code and including it or not including it mistakenly inside a control structure.
if(...)
{
}
All switches/selects have a default case. The default case logs an error if it's not a valid path.
For the same reason as above, any if...elseif... control structures MUST end with a default else that also logs an error if it's not a valid path. A single if statement does not require this.
In the occasional case where a loop or control structure is intentionally empty, a semicolon is always placed within to indicate that this is intentional.
while(stillwaiting())
{
;
}
Naming standards have very different styles for typedefs, defined constants, module global variables, etc. Variable names include type. You can look at the name and have a good idea of what module it pertains to, its scope, and type. This makes it easy to detect errors related to types, etc.
There are others, but these are the top off my head.
-Adam

I'm going to second Jason's suggestion.
I just completed a standards document for a team of 10-12 that work mostly in perl. The document says to use "perltidy-like indentation for complex data structures." We also provided everyone with example perltidy settings that would clean up their code to meet this standard. It was very clear and very much industry-standard for the language so we had great buyoff on it by the team.
When setting out to write this document, I asked around for some examples of great code in our repository and googled a bit to find other standards documents that smarter architects than I to construct a template. It was tough being concise and pragmatic without crossing into micro-manager territory but very much worth it; having any standard is indeed key.
Hope it works out!

It obviously varies depending on languages and technologies. By the look of your example name space I am going to guess java, in which case http://java.sun.com/docs/codeconv/ is a really good place to start. You might also want to look at something like maven's standard directory structure which will make all your projects look similar.

Is there any downside to redundant qualifiers? Any benefit?

For example, referencing something as System.Data.Datagrid as opposed to just Datagrid. Please provide examples and explanation. Thanks.

The benefit is that you don't need to add an import for everything you use, especially if it's the only thing you use from a particular namespace, it also prevents collisions.
The downside, of course, is that the code balloons out in size and gets harder to read the more you use specific qualifiers.
Personally I tend to use imports for most things unless I know for sure I will only be using something from a particular namespace once or twice, so it won't impact the readability of my code.

You're being very explicit about the type you're referencing, and that is a benefit. Although, in the very same process you're giving up code clarity, which clearly is a downside in my case, as I want code to be readable and understandable. I go for the short version unless I have a conflict in different namespaces which can only be solved with the explicit referencing to classes.. Unless I make an alias for it with the keyword using:
using Datagrid = System.Data.Datagrid;

Actually the full path is global::System.Data.DataGrid. The point of using a more qualified path is to avoid having to use additional using statements, especially if the introduction of another using will cause problems with type resolution. More fully qualified identifiers exist so that you can be explicit when you need to be explicit, but if the class's namespace is clear, then the DataGrid version is clearer to many.

I generally use the shortest form available in order to keep the code as clean and readable as possible. That's what using directives are for, after all, and tooltips in the VS editor give you instant detail on the provenance of a type.
I also tend to use a namespace tag for RCWs in a COM interop layer, to call out those variables explicitly in the code (they may need special attention on lifecycle and collection), eg
using _Interop = Some.Interop.Namespace;

In terms of performance there is no upside/downside. Everything is resolved at compile time and the generated MSIL is identical whether you use fully-qualified names or not.
The reason why its use is prevalent in the .NET world is because of auto-generated code, such as designer markup. In that case it would be better to fully-qualify names like class names because of possible conflicts with other classes you may have in your code.
If you have a tool like ReSharper, it will actually tell you what fully-qualified references you have are unnecessary (e.g. by graying them out) so you can lop them off. If you frequently cut-paste code across your various code bases, it would be a must to fully qualify them. (then again, why would you want to do cut-paste all the time; it's a bad form of code reuse!)

I don't think there is really a downside, just readability vs actual time spent coding. In general if you don't have namespaces with ambiguous object I don't think it's really needed. Another thing to consider is level of use. If you have one method that uses reflection and you are alright with typeing System.Reflection 10 times, then it's not a big deal but if you plan on using a namespace alot then I would recommend an include.

Depending on your situation, extra qualifiers will generate a warning (if this is what you mean by redundant). If you then treat warnings as errors, that's a pretty serious downside.
I've run into this with GCC for example.
struct A {
int A::b; // warning!
}

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio