External Data File for Unit Tests - cocoa

I'm a newbie to Unit Testing and I'm after some best practice advice. I'm coding in Cocoa using Xcode.
I've got a method that validates a URL that a user enters. I want it to accept only the http:// protocol and only URLs that contain valid characters.
Is it acceptable to have one test for this and use a test data file? The data file provides example valid/invalid URLs and whether or not the URL should validate. I'm also using this to check the description and domain of the error message.
Why I'm doing this
I've read Pragmatic Unit Testing in Java with JUnit and this gives an example with an external data file, which makes me think this is OK. Plus it means I don't need to write lots of unit tests with very similar code just to test different data.
But on the other hand...
If I'm testing for:
invalid characters
and an invalid protocol
and valid URLs
all in the same test data file (and therefore in the same test), will this cause me problems later on? I've read that one test should only fail for one reason.
Is what I'm doing OK?
How do other people use test data in their unit tests, if at all?

In general, use a test data file only when it's necessary. There are a number of disadvantages to using a test data file:
The code for your test is split between the test code and the test data file. This makes the test more difficult to understand and maintain.
You want to keep your unit tests as fast as possible. Having tests that unnecessarily read data files can slow down your tests.
There are a few cases where I do use data files:
The input is large (for example, an XML document). While you could use String concatenation to create a large input, it can make the test code hard to read.
The test is actually testing code that reads a file. Even in this case, you might want to have the test write a sample file in a temporary directory so that all of the code for the test is in one place.
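A rough sketch of that second case, assuming JUnit 4 with the TemporaryFolder rule handling cleanup; UrlFileReaderTest, UrlFileReader and its read method are hypothetical stand-ins for whatever file-reading code is under test:

import static org.junit.Assert.assertEquals;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class UrlFileReaderTest {
    // Creates (and deletes) a temporary directory per test, so the sample
    // file and the assertions all live in the test itself.
    @Rule
    public TemporaryFolder tempFolder = new TemporaryFolder();

    @Test
    public void readsUrlsFromFile() throws IOException {
        File input = tempFolder.newFile("urls.txt");
        Files.write(input.toPath(), Arrays.asList("http://foo.com", "http://bar.com"));

        List<String> urls = new UrlFileReader().read(input); // hypothetical SUT

        assertEquals(2, urls.size());
    }
}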
Instead of encoding the valid and invalid URLs in a file, I suggest writing the tests in code: a test for invalid characters, a test for invalid protocol(s), a test for invalid domain(s), and a test for a valid URL. If you don't think that gives enough coverage, you can create a mini integration test that exercises multiple valid and invalid URLs. Here's an example in Java and JUnit:
public void testManyValidUrls() {
    UrlValidator validator = new UrlValidator();
    assertValidUrl(validator, "http://foo.com");
    assertValidUrl(validator, "http://foo.com/home");
    // more asserts here
}

private static void assertValidUrl(UrlValidator validator, String url) {
    assertTrue(url + " should be considered valid", validator.isValid(url));
}

While I think this is a perfectly reasonable question to ask, I don't think you should be overly concerned about this. Strictly speaking, you are correct that each test should only test for one thing, but that doesn't preclude your use of a data file.
If your System Under Test (SUT) is a simple URL parser/validator, I assume that it takes a single URL as a parameter. As such, there's a limit to how much simultaneously invalid data you can feed into it. Even if you feed in a URL that contains both invalid characters and an invalid protocol, it would only cause a single result (that the URL was invalid).
What you are describing is a Data-Driven Test (also called a Parameterized Test). If you keep the test itself simple, feeding it with different data is not problematic in itself.
What you do need to be concerned about is being able to quickly locate why a test fails when/if that happens some months from now. If your test output points to a specific row in your test data file, you should be able to quickly figure out what went wrong. On the other hand, if the only message you get is that the test failed and any of the rows in the file could be at fault, you will begin to see the contours of a test maintainability nightmare.
Personally, I lean slightly towards having the test data as closely associated with the tests as possible. That's because I view the concept of Tests as Executable Specifications as very important. When the test data is hard-coded within each test, it can very clearly specify the relationship between input and expected output. The more you remove the data from the test itself, the harder it becomes to read this 'specification'.
This means that I tend to define the values of the input data within each test. If I have to write a lot of very similar tests where the only variation is the input and/or expected output, I write a Parameterized Test, but still invoke that Parameterized Test from hard-coded tests (each of which is only a single line of code), roughly as sketched below. I don't think I've ever used an external data file.
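In Java and JUnit that pattern can look something like this sketch, reusing the hypothetical UrlValidator from the earlier answer; each public test is a one-line, hard-coded specification, and the shared helper is the parameterized part:

public void testRejectsUrlWithInvalidCharacters() {
    assertInvalidUrl("http://foo .com");
}

public void testRejectsNonHttpProtocol() {
    assertInvalidUrl("ftp://foo.com");
}

private static void assertInvalidUrl(String url) {
    UrlValidator validator = new UrlValidator();
    assertFalse(url + " should be considered invalid", validator.isValid(url));
}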
But then again, these days, I don't even know what my input is, since I use Constrained Non-Determinism. Instead, I work with Equivalence Classes and Derived Values.

Take a look at: http://xunitpatterns.com/Data-Driven%20Test.html

Related

Jmeter JSONs comparison

Currently I am working on moving some API DDT (data from CSV) tests from RobotFramework to JMeter, and what troubles me is the lack of a proper JSON assertion that can ignore some keys during comparison. I am totally new to JMeter, so I am not sure whether such an option exists.
I am pretty sure we are using the wrong tool for this job, especially because functional testers would take over the job of writing new tests. However, my approach (to make it as easy as possible for the functional testers) is to create a JMeter plugin which takes the response and compares it to a baseline (excluding ignored keys defined in its GUI). What do you think? Is there a built-in I can use instead? Or do you know of an existing plugin?
Thanks in advance
The "proper" assertion is the JSON Assertion, available since JMeter 4.0.
You can use arbitrary JSON Path queries to filter the response according to your expected result.
If that is not enough, you can always go for a JSR223 Assertion; the Groovy language has built-in JSON support, so it will be far more flexible than any existing or future plugin.
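As a sketch of what that comparison could look like, here is plain Java (which a Groovy JSR223 script can call almost verbatim, and which could also form the core of a custom plugin). It assumes a JSON library such as Jackson is on the classpath; the class and method names are only illustrative:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.IOException;
import java.util.Set;

public class JsonComparer {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Returns true if the two JSON documents are equal once the ignored keys
    // have been stripped out at every level.
    public static boolean equalIgnoring(String expectedJson, String actualJson,
                                        Set<String> ignoredKeys) throws IOException {
        JsonNode expected = MAPPER.readTree(expectedJson);
        JsonNode actual = MAPPER.readTree(actualJson);
        stripIgnoredKeys(expected, ignoredKeys);
        stripIgnoredKeys(actual, ignoredKeys);
        return expected.equals(actual); // deep comparison of the remaining structure
    }

    private static void stripIgnoredKeys(JsonNode node, Set<String> ignoredKeys) {
        if (node.isObject()) {
            ObjectNode object = (ObjectNode) node;
            object.remove(ignoredKeys); // drop ignored keys at this level
            object.fields().forEachRemaining(e -> stripIgnoredKeys(e.getValue(), ignoredKeys));
        } else if (node.isArray()) {
            node.forEach(child -> stripIgnoredKeys(child, ignoredKeys));
        }
    }
}

In a JSR223 Assertion, the previous sample's body is available via prev.getResponseDataAsString(), and you would mark AssertionResult as failed when the comparison returns false.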
Please find below the approach that I can think of:
Take a dump of the response/HTML/JSON source for the baseline using "Save Responses to a file".
Take the response dump for the AUT that needs to be compared, or simply a dump from the second run.
Use two FTP Samplers to fetch the locally saved response dumps.
Use a Compare Assertion to compare the two FTP call responses. In the Compare Assertion, you can use a RegEx String and Substitution to mask the timestamps or user IDs with something common to both, so that they are ignored in the comparison.
You need to take care with how you save and fetch the responses.
Hope this helps.

How to show text files results for code coverage in sonarqube?

I have written unit test cases for my adapter code. The results are in a text file containing the module name and whether each unit test succeeded or failed, indicated by the strings SUCCESS and FAILURE. How can I use this text file to show code coverage in SonarQube analysis? Please help me with this.
I want to set covered to true at the folder level rather than per line number. How do I specify that in the generic XML format? – Umap
Your best bet is to try to convert your results into the Generic Test Data format. However, that format is designed to take coverage data about lines, not modules, so you may face difficulties with your data granularity.
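For reference, the generic coverage flavour of that format looks roughly like the sketch below (fed to the scanner via the sonar.coverageReportPaths property). Note that it requires a file path and line numbers, so there is no way to mark a whole folder as covered; the paths here are only illustrative:

<coverage version="1">
  <file path="src/adapter/module_one.c">
    <lineToCover lineNumber="1" covered="true"/>
    <lineToCover lineNumber="2" covered="false"/>
  </file>
</coverage>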

Arbitrary checks in ReSharper plugin testing

As I learned from the DevGuide, testing ReSharper plugins works as follows:
The plugin is loaded and a test input file is passed to it
The plugin performs its actions on the passed file
ReSharper's test environment writes the results of the plugin's actions to a .tmp file in a special format that depends on the type of functionality tested (for example, if we test completion, the .tmp file will contain the list of generated completion items)
ReSharper's test environment compares the .tmp file with a .gold file to decide whether the test failed or succeeded
But I need the following scenario. The first two steps are the same as the above ones, then:
I write code that obtains the results of the plugin's actions and checks whether they are what I expect, so I can make the test fail if needed.
How can I achieve this?
I need it because I have code that uses the AST generated by ReSharper to build some graphs, and I want to test whether the graphs are built correctly.
Yes, you can do this. You need to create your own test base class, instead of using one of the provided ones.
There is a hierarchy of base classes, each adding extra functionality. Usually, you'll derive from something like QuickFixAvailabilityTestBase or QuickFixTestBase, which add the functionality for testing quick fixes. These are the classes that will do something and write the output to a .tmp file that is then compared to the .gold file.
These classes themselves derive from something like BaseTestWithSingleProject, which provides the functionality to set up an in-memory solution and project populated with files you specify in your test, or BaseTestWithTextControl, which also gives you a text control for the file you're testing. If you derive from one of these classes directly (or via your own custom base class), you can perform the action you need for the actual test and either assert something in memory, or write the appropriate text to the .tmp file to compare against the .gold file.
You should override the DoTest method. This will give you an IProject that is already set up, and you can do whatever you need to in order to test your extension's functionality. You can use project.Solution.GetComponent<> to get at any shell or solution component, and use the ExecuteWithGold method to execute something, write to the .tmp file and have ReSharper compare to the .gold file for you.

With CRF++, MIRA works for me but CRF-L1 and CRF-L2 do not

It may not matter, but I am using the Windows distribution of CRF++ 0.58.
So I have successfully used Mallet to train a model with a CRF and then test it. When I try to use the same train and test files with CRF++ (after creating a template file), I get a
The line search routine mcsrch failed: error code:0
error when I use either
-a CRF-L1
or the default
-a CRF-L2
When I use
-a MIRA
though, training works without error and same with test.
The format of the test and training data can be the same for both Mallet and CRF++, so that is not the issue. My template file is as simple as
#Mixed
M00:%x[0,0]
M01:%x[0,1]
M02:%x[0,2]
......
M12:%x[0,12]
The last column in my training data is either 0 or 1, which is the value to classify. There is no whitespace in any of my features; I use underscores where necessary. Am I missing something simple here? What would cause the L1 and L2 regularizations to fail like that?
I knew it was something silly ...
To use features like mine, you need to use the U prefix (as in Unigram), so U00:%x[0,0] is fine. You can't just name your features anything you want.
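For reference, the corrected version of the template above would look like this (the elided middle lines follow the same pattern):

# Unigram feature templates
U00:%x[0,0]
U01:%x[0,1]
U02:%x[0,2]
......
U12:%x[0,12]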
I also discovered that if I stripped my test data down to a single sentence, I would get the same error message. When I restored my test data to its original size of around 2600 sentences, the regularization algorithms ran. Overfitting is a common cause of this error message across various NLP and ML applications, but that was not the true problem in my case.
It can also happen in the extreme case of a dataset with just one CLASS (due to a bug in the training-set generation procedure).

How to prevent injections in ASP/VBScript?

What are the best ways (or at least most common ways) in ASP (VBScript) for input handling? My main concerns are HTML/JavaScript injections & SQL injections. Is there some equivalent to PHP's htmlspecialchars or addslashes, et cetera? Or do I have to do it manually with something like string replace functions?
The bottom line is this:
Always HTML-encode user input before you write it to your page. Server.HTMLEncode() does that for you.
Always use parameterized queries to interface with a database. The ADODB.Command and ADODB.CommandParameter objects are the right choice here.
Always use the URLScan utility and IIS Lockdown on the IIS server that renders the page, unless it is version 6 or later, which no longer requires these tools.
If you stick to points 1 and 2 slavishly, I can't think of much that can go wrong.
Most vulnerabilities come from not properly encoding user input or from building SQL strings from it. If you for some reason reach the point where HTML-encoding user input stands in your way, you have found a design flaw in your application.
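Those ADO objects are the classic-ASP way of doing this. As a language-neutral illustration of the same principle, the SQL contains only placeholders and the user's value is bound separately, so it can never be parsed as SQL. Here is a minimal sketch in Java/JDBC; the table and column names are made up:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLookup {
    // The user-supplied value is passed as a bound parameter, never
    // concatenated into the SQL string.
    public static String findEmail(Connection connection, String userName) throws SQLException {
        String sql = "SELECT email FROM users WHERE user_name = ?";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, userName);
            try (ResultSet results = statement.executeQuery()) {
                return results.next() ? results.getString("email") : null;
            }
        }
    }
}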
I would add one other point to Tomalak's list.
Avoid using concatenation of field values in SQL code. That is, in some cases a stored procedure may build some SQL in a string to subsequently execute. This is fine unless a textual field value is used as part of its construction.
A command parameter can protect SQL code designed to input a value from being hijacked into executing unwanted SQL, but it still allows such unwanted SQL to become data in the database. This is a first-level vulnerability. A second-level injection vulnerability exists if the field's value is later used in SQL string concatenation inside a stored procedure.
Another consideration is that this is just minimal protection. All it's doing is rendering attack attempts harmless. However, in many cases it may be better to add a system which prevents such data entry altogether and/or alerts admins to a potential injection attack.
This is where input validation becomes important. I don't know of any tools that do this for you, but a few simple regular expressions might help. For example, "<\w+" would detect an attempt to include an HTML tag in the field.
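As a small illustration of that kind of check (written in Java for brevity; classic ASP exposes the same kind of pattern matching through the VBScript RegExp object), a validator might look like this sketch:

import java.util.regex.Pattern;

public class InputValidator {
    // Matches the start of anything that looks like an HTML tag, e.g. "<script".
    private static final Pattern HTML_TAG = Pattern.compile("<\\w+");

    public static boolean looksLikeHtmlInjection(String input) {
        return input != null && HTML_TAG.matcher(input).find();
    }
}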
