FxCop rule to ensure that each class has a unique identifier

In our API library, we have a number of classes that implement a method ComputeCurrentDefinitionHashCode, which combines the hash codes of each member field with a pseudo-random number that should be unique to that class.
This is based on Paul Hsieh's "SuperFastHash" at http://www.azillionmonkeys.com/qed/hash.html
I'm trying to determine if it's possible to use FxCop to ensure that the randomly generated number we put in each class is not duplicated in any other class.
In other words, can we save information from one class to the next?

Yes, you can construct an FxCop rule that caches information across classes. However, depending on how you include the target number in your classes, this may or may not be a particularly good candidate for an FxCop rule. For example, if it is a literal passed as an argument to a base class constructor, then an FxCop rule might be an OK choice. However, if the source of the number is less "predictable", a unit test approach might be preferable.
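The unit-test alternative mentioned above could look roughly like the following. This is only a sketch, written in Java rather than .NET for illustration; the same shape should carry over to reflection in a .NET test. The field name DEFINITION_SEED, the classes Customer, Order and Invoice, and the seed values are all hypothetical and assume the number is stored as a static constant on each class.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SeedUniquenessCheck {
    // Hypothetical API classes; each carries its own pseudo-random seed.
    static class Customer { static final int DEFINITION_SEED = 0x5bd1e995; }
    static class Order    { static final int DEFINITION_SEED = 0x1b873593; }
    static class Invoice  { static final int DEFINITION_SEED = 0x5bd1e995; } // same as Customer

    // Returns one entry per collision, naming the two classes that share a seed.
    static List<String> findDuplicateSeeds(Class<?>... classes) {
        Map<Integer, Class<?>> seen = new HashMap<>();
        List<String> duplicates = new ArrayList<>();
        for (Class<?> c : classes) {
            int seed;
            try {
                Field f = c.getDeclaredField("DEFINITION_SEED");
                f.setAccessible(true);
                seed = f.getInt(null); // static field, so no instance is needed
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No readable seed on " + c.getName(), e);
            }
            // putIfAbsent returns the class that already owns this seed, if any.
            Class<?> previous = seen.putIfAbsent(seed, c);
            if (previous != null) {
                duplicates.add(previous.getSimpleName() + " and " + c.getSimpleName());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicateSeeds(Customer.class, Order.class, Invoice.class));
    }
}
```

A real test would assert that the returned list is empty and would discover the classes by scanning the assembly or package instead of listing them by hand.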

Related

What is the best way to manage a large quantity of constants

I am currently working on a very complex program that processes rows from an input table and has a huge number of possible outcomes for each record. Because of this I have a very large number of constants defined for the outcome messages. There is one success message for the record, but a multitude of possible warnings and errors.
My first thought was to define all of my constants for these messages at the package body level, but then I decided to move each constant to the procedure where it is used. I'm now second guessing that decision and thinking of moving everything back to package body level. What is the best way to define this many constants? Ease of maintainability is my ultimate goal for this program since it is so complex.
I think this is a matter of taste. In my application I put all error codes into an Error-Package. All main and commonly used constants I put into a separate package (without a package body).
Again, a matter of taste, but I tend to put a list of named constants at the package spec level rather than the package body so that they can be referenced by any portion of the application. If I ever want to change the error code that c_err_for_specific_reason_x uses, it becomes a single place to do so.
If I wanted to hide the codes and put them within the body I would have a get_error_code(p_get_error_name varchar) function that did the translation based on you passing a valid constant name.
I've done both on different projects, but tend towards the list over the function most times. I tend to use the function if it is a table-driven source of the data.
It ... wait for it ... depends!
Since you currently define your constants in the package body, you don't need them to be publicly accessible outside the package. So defining them in a spec really doesn't buy you anything.
Here is the rule I follow: Define constants within the smallest scope needed. So if a constant is used only within one procedure, define it in that procedure. If it is used within more than one procedure, define it in the body. If it is used elsewhere by code in other packages (or non-packaged SPs) but only when using a particular package, define it in the spec of that package. If it is used by other code for general use, put it in a separate spec of such general constants.

Why should I not use identity based operations on Optional in Java8?

The java.util.Optional javadoc states that:
This is a value-based class; use of identity-sensitive operations (including reference equality (==), identity hash code, or synchronization) on instances of Optional may have unpredictable results and should be avoided.
However, this junit snippet is green. Why? It seems to contradict the javadoc.
Optional<String> holder = Optional.ofNullable(null);
assertEquals("==", true, holder == Optional.<String>empty());
assertEquals("equals", true, holder.equals(Optional.<String>empty()));
You shouldn’t draw any conclusions from the observed behavior of one simple test run under a particular implementation. The specification says that you can’t rely on this behavior, because the API designers reserve the option to change it at any time without notice.
The term Value-based Class already provides a hint about the intended options. Future versions or alternative implementations may return the same instance when repeated calls with the same value are made, or the JVM might implement true value types for which identity based operations have no meaning.
This is similar to instances returned by the boxing valueOf methods of the wrapper types. Besides the guarantee made for certain (small) values, it is unspecified whether new instances are created or the same instance is returned for the same value.
“A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class…” could also imply that the result of a reference comparison may change during the lifetime of two value-based class instances. Think of a de-duplication feature. Since the JVM already has such a feature for the internal char[] array of Strings, the idea of expanding this feature to all instances of “value-based classes” isn’t so far-fetched.
Optional.of() makes a new object for every non-null value. So comparing two Optionals with == will very likely compare two distinct references, even if both wrap the same value.
However, your example shows that Optional.empty() re-uses the same singleton instance. That is probably the only time the same Optional instance is ever returned twice.
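To make the distinction concrete, here is a small sketch contrasting the specified operations (equals, isPresent) with the reference comparison the javadoc warns about. The class name OptionalIdentityDemo and the helper sameValue are made up for this illustration.

```java
import java.util.Optional;

class OptionalIdentityDemo {
    // Value comparison: the behavior of equals() is specified by the javadoc.
    static boolean sameValue(Optional<String> a, Optional<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Optional<String> a = Optional.of("x");
        Optional<String> b = Optional.of("x");

        // Specified: both wrap equal values, so equals() is true.
        System.out.println("equals: " + sameValue(a, b));

        // Unspecified: the outcome depends on the implementation and may change,
        // so no particular result should be expected or relied upon.
        System.out.println("==: " + (a == b));

        // To test for emptiness, ask the Optional instead of comparing with
        // the instance returned by Optional.empty().
        Optional<String> holder = Optional.ofNullable(null);
        System.out.println("empty: " + !holder.isPresent());
    }
}
```

The test in the question could therefore keep its equals assertion and replace the == assertion with a check on isPresent().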

What is the best way to compute the transitive import closure of an XML Schema definition?

I have a set of XML schema definition resources (files). These files contain mutual import and include directives. For a specific purpose users will instantiate element definitions in a particular XSD. I would like to provide them with an excerpt that contains only the XSD resources required for the task. This means I need to trace all imports and includes to other resources recursively, until I have the complete set (a Kleene star, or transitive closure).
I assume that this is implicitly done when I validate the schemata from the entry point. So there might be a callback that lists all dependencies resolved during the process that I can tap into.
The other solution I see is to use DOM and manually parse each schema for the import and include elements. This seems clunky, however.
I think the most convenient way to do this would be with an XSLT stylesheet to which you provide a list of starting points (URIs, or if you need to be careful about chameleon inclusion, namespace-name/URI pairs), and which then fetches the documents and computes the transitive closure, emitting either a list of URIs (or, again, namespace / URI pairs) or a sequence of XSD schema documents.
XQuery could also be used.
And as you suggest, DOM could also be used, with the host programming language of your choice. (I'd do it in XSLT or XQuery, myself, but that's because I do most of my programming in those languages.) Some validators may provide an API for getting a list of the schema documents consulted, or you may be able to extract that information from a validator's representation of the PSVI; APIs to XSD validation are not standardized.
Note that in the general case you need to watch out for and handle xsd:redefine and xsd:override, not just xsd:include and xsd:import.
And of course, if this is a one-shot task and the number of modules is likely to be less than fifty, it may be faster to do it by hand than by writing a program to do it automatically.
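For the DOM route, a minimal sketch in Java might look like this. It assumes every reference carries a schemaLocation attribute that resolves relative to the referencing document; an xsd:import without schemaLocation, catalog-based resolution, and chameleon inclusion are not handled. The class name SchemaClosure and the method closure are invented for the example.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SchemaClosure {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";
    // All four constructs that pull in another schema document.
    static final String[] REFERENCING = {"include", "import", "redefine", "override"};

    // Returns the entry point plus every schema document reachable from it.
    static Set<URI> closure(URI entryPoint) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            DocumentBuilder builder = factory.newDocumentBuilder();

            Set<URI> visited = new LinkedHashSet<>();
            Deque<URI> pending = new ArrayDeque<>();
            pending.push(entryPoint.normalize());
            while (!pending.isEmpty()) {
                URI current = pending.pop();
                if (!visited.add(current)) {
                    continue; // already processed; this also breaks import cycles
                }
                Document doc = builder.parse(current.toString());
                for (String name : REFERENCING) {
                    NodeList refs = doc.getElementsByTagNameNS(XSD_NS, name);
                    for (int i = 0; i < refs.getLength(); i++) {
                        String location = ((Element) refs.item(i)).getAttribute("schemaLocation");
                        if (!location.isEmpty()) {
                            pending.push(current.resolve(location).normalize());
                        }
                    }
                }
            }
            return visited;
        } catch (Exception e) {
            throw new IllegalStateException("Could not compute closure from " + entryPoint, e);
        }
    }

    public static void main(String[] args) {
        // Usage: java SchemaClosure.java file:///path/to/entry.xsd
        closure(URI.create(args[0])).forEach(System.out::println);
    }
}
```

The visited set is what turns the recursive walk into a transitive closure: each document is parsed once, and mutual imports terminate instead of looping.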

How can I determine the size of methods and classes in Ruby?

I'm working on a code visualization tool and I'd like to be able to display the size (in lines) of each Class, Method, and Module in a project. It seems like existing parsers (such as Ripper) could make this info easy to get. Is there a preferred way to do this? Is there a method of assessing size for classes that have been re-opened in separate locations? How about for dynamically (Class.new {}, Module.new {}) defined structures?
I think what you're asking for is not possible in general without actually running the whole Ruby program the classes are part of (and then you run into the halting problem). Ruby is extremely dynamic, so lines could be added to a class' definition anywhere, at any time, without necessarily referring to the particular class by name (e.g. using class_eval on a class passed into a method as an argument). Not that the source code of a class' definition is saved anyway... I think the closest you could get to that is the source_locations of the methods of the class.
You could take the difference of the maximum and minimum line numbers of those source_locations for each file. Then you'd have to assume that the class is opened only once per file, and that the size of the last method in a file is negligible (as well as any non-method parts of the class definition that happen before the first method definition or after the last one).
If you want something more accurate maybe you could run the program, get method source_locations, and try to correlate those with a separate parse of the source file(s), looking for enclosing class blocks etc.
But anything you do will most likely involve assumptions about how classes are generally defined, and thus not always be correct.
EDIT: Just saw that you were asking about methods and modules too, not just classes, but I think similar arguments apply for those.
I've created a gem, class_source, that handles this problem in the fashion suggested by wdebaum. It certainly doesn't cover all cases but is a nice 80% solution for folks that need this type of thing. Patches welcome!

Naming guidelines - Naming generic objects

The MSDN guidelines state that class names should be Pascal case with no special prefix, such as "C".
They also state that names of class members, such as properties and fields, should be Pascal case.
So a naming ambiguity may arise when naming a generic object.
For example, consider a class named "Polynom". An object instantiated from this class should also be named "Polynom": Polynom = new Polynom.
Is it?
I think a more common guideline (that I have seen Microsoft themselves follow) is to name variables, including instances, camel-cased (lower first, upper all other words: variableName). So in your case, it would be polynom = new Polynom. Of course, I wouldn't actually name a variable polynom unless it had a very obvious use, and then only in a local scope. Otherwise a variable name should describe what it does, not what type it is.
All that said, the most important part of any naming convention is not what casing goes where but that you are consistent with it. Find something that works for you and stick to it!
[Quick edit: re-reading your question again, I see that you're mainly concerned about properties. In this case, yes, it is very common to Pascal case them, so Polynom would be reasonable. But since this is a property that would be exposed to the user (otherwise why is it a property?), please don't name it Polynom!!! Do something more descriptive; we have IntelliSense if we want to know the type.]
You may often see
PolyNom polyNom = new PolyNom();
Most of the time, though, this is not the most readable code: is it just any old polyNom, or is it for a specific purpose? Steve McConnell cites in Code Complete (pg. 262, second ed.) that the optimal variable name length for debugging (reading code) is 10-16 characters, with 8-20 characters being about the same. This gives you a lot of room to describe more accurately exactly what your variable is.
