Cross-validation: use the testing set or the validation set to predict?

I have a question about cross-validation.
In machine learning, we know there are training, validation, and test sets.
The test set is reserved for the final run, to see how the final model/classifier performs.
But in the process of cross-validation, we split the data into a training set and a testing set (most tutorials use this term), so I'm confused. Do we need to split the whole data into three parts: training, validation, and test? Cross-validation only ever seems to talk about two sets: the training set and the other one.
Could someone help clarify?
Thanks

Yes, it's a little confusing, because some material uses "CV set" and "test set" interchangeably and some does not. I'll try to make it easier to understand by explaining why each set is needed.
You need the training set to do exactly that: train. But you also need a way to check that your algorithm isn't just memorizing the training set (that it's not overfitting) and to measure how well it's doing. That is what the test set is for: you give the model data it has never seen and measure its performance.
But ML is all about experimentation. You train, evaluate, tweak some knob (hyperparameters or architecture), train again, evaluate again, over and over, and then you select the best experiment. You deploy your system, and in production it gets data it has never seen and doesn't perform that well. What happened? You used your test data to fit parameters and make decisions, so you overfitted to the test data, and you no longer know how the model does on data it has truly never seen.
A cross-validation set solves this. You have your training data to learn parameters and your test data to evaluate performance on unseen data, but you still need a way to find the best hyperparameters and architecture. So you take a sample of your training data, call it the cross-validation (CV) set, and hide your test data: you will NEVER use it until the end.
Now use your training data to learn parameters, and experiment with hyperparameters and architectures, but evaluate each experiment on the CV data instead of the test data (you can think of the CV data as a way to learn the hyperparameters). After you have experimented a lot and selected your best-performing option (on CV), you use your test data to evaluate how it performs on data it has never seen, before deploying it to production.
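As a minimal sketch of the workflow described above (in Python; the names `three_way_split` and `select_best` and the 60/20/20 proportions are assumptions made for illustration, not something the answer prescribes), you could hold out the test set once and compare experiments on the validation set only:

```python
import random

def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then carve out the test and validation subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]                  # hidden until the very end
    val = rows[n_test:n_test + n_val]     # used to compare experiments
    train = rows[n_test + n_val:]         # used to learn parameters
    return train, val, test

def select_best(candidates, fit, score, train, val):
    """Fit every candidate on train; keep the one that scores best on val."""
    return max(candidates, key=lambda c: score(fit(c, train), val))

train, val, test = three_way_split(range(100))
```

Here `fit` and `score` are placeholders for whatever training and evaluation routines you use. The test rows are only touched once, after `select_best` has made its choice.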

This is generally an either-or choice. The process of cross-validation is, by design, another way to validate the model. You don't need a separate validation set -- the interactions of the various train-test partitions replace the need for a validation set.
Think about the name, cross-validation ... :-)
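To make the "train-test partitions" idea concrete, here is a rough Python sketch of k-fold partitioning. The function name `k_fold_indices` is hypothetical, and the sketch does not shuffle, which you would normally do first:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
```

Each fold takes a turn as the held-out part while the remaining folds are used for training, and the k scores are averaged.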


How to validate from aggregate

I am trying to understand validation from the aggregate entity on the command side of the CQRS pattern when using event sourcing.
Basically I would like to know what the best practice is in handling validation for:
1. Uniqueness of say a code.
2. The correctness/validation of an external aggregate's id.
My initial thoughts:
I've thought about passing a service into the constructor, but this seems wrong, as the "Create" of the entity should only take the values to assign.
I've thought about validating outside the aggregate, but this seems to put logic somewhere else that I assume should be the responsibility of the aggregate itself.
Can anyone give me some guidance here?
Uniqueness of say a code.
Ensuring uniqueness is a specific example of set validation. The problem with set validation is that, in effect, you perform the check by locking the entire set. If the entire set is included within a single "aggregate", that's easily done. But if the set spans aggregates, then it is kind of a mess.
A common solution for uniqueness is to manage it at the database level; RDBMS are really good at set operations, and are effectively serialized. Unfortunately, that locks you into a database solution with good set support -- you can't easily switch to a document database, or an event store.
Another approach that is sometimes appropriate is to have the single aggregate check for uniqueness against a cached copy of the available codes. That gives you more freedom to choose your storage solution, but it also opens up the possibility that a data race will introduce the duplication you are trying to avoid.
In some cases, you can encode the code uniqueness into the identifier for the aggregate. In effect, every identifier becomes a set of one.
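A minimal Python sketch of that last idea, assuming the code can be normalized by trimming and upper-casing (an assumption) and using a made-up namespace UUID and a toy in-memory store:

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
CODE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")

def aggregate_id_for(code: str) -> uuid.UUID:
    """Derive the aggregate id deterministically from the (normalized) code."""
    return uuid.uuid5(CODE_NAMESPACE, code.strip().upper())

class InMemoryEventStore:
    """Toy stand-in for an event store that refuses to create a stream twice."""
    def __init__(self):
        self._streams = {}

    def create_stream(self, stream_id, events):
        if stream_id in self._streams:
            raise ValueError(f"stream {stream_id} already exists")
        self._streams[stream_id] = list(events)

store = InMemoryEventStore()
store.create_stream(aggregate_id_for("AB-1"), ["CodeRegistered"])
```

Because two commands carrying the same code map to the same stream id, the second create fails on the id collision and no set-wide lock is needed.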
Keep in mind Greg Young's question
What is the business impact of having a failure?
Knowing how expensive a failure is tells you a lot about how much you are permitted to spend to solve the problem.
The correctness/validation of an external aggregate's id.
This normally comes in two parts. The easier one is to validate the data against some agreed upon schema. If our agreement is that the identifier is going to be a URI, then I can validate that the data I receive does satisfy that constraint. Similarly, if the identifier is supposed to be a string representation of a UUID, I can test that the data I receive matches the validation rules described in RFC 4122.
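For the UUID case, a schema check could look roughly like this in Python. The helper name `is_rfc4122_uuid` is hypothetical, and insisting on the canonical hyphenated form is an assumption about what the agreed schema says:

```python
import uuid

def is_rfc4122_uuid(text: str) -> bool:
    """True if text is the canonical hyphenated form of an RFC 4122 UUID."""
    try:
        parsed = uuid.UUID(text)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, a urn: prefix and hyphen-less input,
    # so compare against the canonical rendering to reject those forms.
    return str(parsed) == text.lower() and parsed.variant == uuid.RFC_4122
```

This only tells you the identifier is well formed. It says nothing about whether anything with that identifier exists.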
But if you need to check that the identifier is in use somewhere else? Then you are going to have to ask.... The main question in this case is whether you need the answer to that right away, or if you can manage to check that asynchronously (for instance, by modeling "unverified identifiers" and "verified identifiers" separately).
And of course you once again get to reconcile all of the races inherent in distributed computing.
There is no magic.
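One way to model "unverified" and "verified" identifiers separately, as suggested above, is to give them distinct types. This is only a sketch in Python; the class names and the `exists` callback are assumptions, and in a real system the lookup would be a call to whoever owns the identifier:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class UnverifiedId:
    value: str

@dataclass(frozen=True)
class VerifiedId:
    value: str

def verify(candidate: UnverifiedId, exists: Callable[[str], bool]) -> Optional[VerifiedId]:
    """Promote an identifier only once the owning system confirms it exists."""
    return VerifiedId(candidate.value) if exists(candidate.value) else None
```

The aggregate can accept an `UnverifiedId` right away, and a later, asynchronous step swaps in the `VerifiedId` or flags the problem.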

SAPUI5: Does Using Formatters Impact Performance?

I am working on a custom SAPUI5 app using ODataModel, and I have to format some of the fields that I will display in a List control.
I need to know which of the approaches below is better with respect to the app's performance.
1) Is it a good idea to use a Formatter.js file and write a separate method for each field that needs formatting?
Example -
There are 2 fields that should be formatted before being shown in the UI, and hence 2 formatter functions.
2) Before binding the model to the List, do the formatting in a loop over each row.
Example -
Loop at OData.
--do formatting here for both the fields
move data to model.
Endloop.
Bind new model to UI
Is there any other way to improve performance, apart from code minification or using grunt?
Appreciate your help.
Thanks,
Rahul
Replacing formatters with other solutions is definitely NOT the place to start when optimizing performance, not to mention that you will lose a lot of the convenience the ODataModel comes with if you manipulate its data manually.
Formatter Performance
Anyway, using a formatter is of course less performant than pre-formatting your data once after it is loaded. A formatter is executed on every re-rendering of your control, so you might not want to do heavy calculations or excessive looping in a formatter that is executed frequently. But under normal usage, formatters are nothing you should worry about, and they do not noticeably affect the end-user experience. Keep enjoying the convenience of formatters (and take a look at the cool Expression Binding).
General Performance Considerations
To improve performance, it is first of all very important to identify the real bottleneck. In many cases this is simply the backend, where there is usually much more to win with much less effort. Always keep that in mind. UI code optimization is pointless as long as the main backend call runs for, say, 3 s.
Things to improve your UI performance might be:
serve SAPUI5 from a CDN
use a Component-preload, which can be generated with grunt-openui5 or gulp-ui5-preload (I think it does not minify XML yet, so you could do that additionally before creating the Component-preload)
try to reduce the number of SAPUI5 libraries you are using
be aware of which SAPUI5 libraries you are NOT using and consistently remove them (don't forget the dependencies section in the Component metadata or manifest.json)
be aware that sap.ui.layout is a separate independent library (not registering it as such will result in a lot of extra requests)
if you use an ODataModel make sure you set useBatch to true (default in v2.ODataModel)
intelligently design your OData service (if you can influence it)
intelligently use $expands: sometimes it can make sense to preload $expand data on a parent binding that does not actually use it e.g. if you most probably need the data later on
think about bundling your app as a native app to benefit from improved caching (Kapsel)
Check Performance: Speed Up Your App and Performance Issues
squeeze out some more bytes and save some requests by minifying/combining custom css or other resources if you have some
If you are generally interested in web performance, I can recommend Steve Souders' books.
I'm totally open for more ideas on SAPUI5 performance improvements! Anyone?
BR
Chris
The best practice is to use a formatter. A formatter receives an input and returns an output. The formatter function is called at runtime, once for each row displayed in your list. It is called for each row because you cannot guarantee that the input will be the same for all rows in the list.
The concept of binding is to loop over your data model and update the UI accordingly. It is much better to use binding, for many reasons: maintainability, performance, separation of the data layer from the presentation layer, core optimizations, and more.

Test Driven Development initial implementation

A common practice of TDD is that you make tiny steps. But one thing that is bugging me is something I've seen a few people do, whereby they just hardcode values/options and then refactor later to make it work properly. For example…
describe Calculator
  it should multiply
    assert Calculator.multiply(4, 2) == 8
Then you do the least possible to make it pass:
class Calculator
  def self.multiply(a, b)
    return 8
  end
end
And it does!
Why do people do this? Is it to ensure they're actually implementing the method in the right class or something? Because it just seems like a sure-fire way to introduce bugs and give false confidence if you forget something. Is it a good practice?
This practice is known as "Fake it 'til you make it." In other words, put fake implementations in until such time as it becomes simpler to put in a real implementation. You ask why we do this.
I do this for a number of reasons. One is simply to ensure that my test is being run. It's possible to be configured wrong so that when I hit my magic "run tests" key I'm actually not running the tests I think I'm running. If I press the button and it's red, then put in the fake implementation and it's green, I know I'm really running my tests.
Another reason for this practice is to keep a quick red/green/refactor rhythm going. That is the heartbeat that drives TDD, and it's important that it have a quick cycle. Important so you feel the progress, important so you know where you're at. Some problems (not this one, obviously) can't be solved in a quick heartbeat, but we must advance on them in a heartbeat. Fake it 'til you make it is a way to ensure that timely progress. See also flow.
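The rhythm can be sketched in a few lines of Python. All names here are illustrative, and the second test case is my own addition to show how the fake gets forced out:

```python
class FakedCalculator:
    """Step 1: the fake that turns the first test green."""
    @staticmethod
    def multiply(a, b):
        return 8

class Calculator:
    """Step 2: the real implementation, forced by a second example."""
    @staticmethod
    def multiply(a, b):
        return a * b

def first_test(calc):
    return calc.multiply(4, 2) == 8

def second_test(calc):
    # A second, different example that the hard-coded 8 cannot satisfy.
    return calc.multiply(3, 5) == 15
```

The fake passes `first_test` and fails `second_test`, which is the red bar that justifies writing `a * b`. The risk of "forgetting something" is covered by adding the next test, not by trusting the fake.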
There is a school of thought, which can be useful in training programmers to use TDD, that says you should not have any lines of source code that were not originally part of a unit test. By first coding the algorithm that passes the test into the test, you verify that your core logic works. Then, you refactor it out into something your production code can use, and write integration tests to define the interaction and thus the object structure containing this logic.
Also, religious TDD adherence would tell you that there should be no logic coded that a requirement, verified by an assertion in a unit test, does not specifically state. Case in point; at this time, the only test for multiplication in the system is asserting that the answer must be 8. So, at this time, the answer is ALWAYS 8, because the requirements tell you nothing different.
This seems very strict, and in the context of a simple case like this, nonsensical; to verify correct functionality in the general case, you would need an infinite number of unit tests, when you as an intelligent human being "know" how multiplication is supposed to work and could easily set up a test that generated and tested a multiplication table up to some limit that would make you confident it would work in all necessary cases. However, in more complex scenarios with more involved algorithms, this becomes a useful study in the benefits of YAGNI. If the requirement states that you need to be able to save record A to the DB, and the ability to save record B is omitted, then you must conclude "you ain't gonna need" the ability to save record B, until a requirement comes in that states this. If you implement the ability to save record B before you know you need to, then if it turns out you never need to then you have wasted time and effort building that into the system; you have code with no business purpose, that regardless can still "break" your system and thus requires maintenance.
Even in the simpler cases, you may end up coding more than you need if you code beyond requirements that you "know" are too light or specific. Let's say you were implementing some sort of parser for string codes. The requirements state that the string code "AA" = 1, and "AB" = 2, and that's the limit of the requirements. But you know the full library of codes in this system includes 20 others, so you include logic and tests that parse the full library. You go back to the client, expecting your payment for time and materials, and the client says "we didn't ask for that; we only ever use the two codes we specified in the tests, so we're not paying you for the extra work". And they would be exactly right; you've technically tried to bilk them by charging for code they didn't ask for and don't need.

Practices for allowing systems to accommodate human error?

Systems have to sometimes accommodate the possibility of real world bad data. Consider that some data originates with paper forms. And forms inherently have a limited means of validating data.
Example 1: On one form users are expected to enter an integer distance (in miles) into a blank. We capture the information as written as a string since we don't always end up getting integer values.
Example 2: On another form we capture a code. That code should map to one of the codes in our system. However, sometimes the code written on the form is incorrect. We capture the code and allow it to exist with an invalid value until some future time of resolution. That is, we temporarily allow bad data since it's important to record the record even if some of it is invalid.
I'm interested in learning more about how systems accommodate bad data, that is, human error. Databases are supposed to be bastions of data integrity, but the real world is messy and people make mistakes. Systems must allow us to reflect those mistakes.
What are some ways systems you've developed accommodate human error? What practices have you used? What lessons have you learned?
Any further reading on the topic? (I had trouble Googling it.)
I agree with you, whatever we do there's no guarantee that we can get rid of bad or incorrect data. Especially, but not only, if it comes to user input. In my experience the same problems exist in complex integration projects, in which you have to integrate and merge (often inconsistent) data retrieved from different systems.
A good strategy is to decouple the input from the operational system itself. First, place data provided by the user (or an external system) in a separate datastore (e.g. a different schema). In a second step, load this data into your operational datastore, but only if it conforms to strict rules (e.g. use address verification software to verify a given address). This Extract, Transform, Load (ETL) approach is fairly common in Data Warehousing (DWH) solutions, but in my experience it can be applied programmatically in transactional systems as well.
This approach often leads to asynchronous processes in which the input is submitted first and the external entity (user or system) retrieves feedback, possibly at a later time, on whether its data was correct.
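As a rough Python sketch of the two-step idea, using the question's two examples (a whole-number distance and a code that must be known): the row layout, the function name `load_staged` and the rule wording are all assumptions made for illustration.

```python
def load_staged(staged_rows, known_codes):
    """Split staged rows into those fit for the operational store and the rest."""
    accepted, rejected = [], []
    for row in staged_rows:
        problems = []
        raw_miles = str(row.get("miles", "")).strip()
        if not (raw_miles.isascii() and raw_miles.isdigit()):
            problems.append("miles is not a whole number")
        if row.get("code") not in known_codes:
            problems.append("unknown code")
        if problems:
            # Stays in staging, with reasons, until someone resolves it.
            rejected.append((row, problems))
        else:
            accepted.append({**row, "miles": int(raw_miles)})
    return accepted, rejected

staged = [
    {"miles": "12", "code": "A1"},
    {"miles": "about 12", "code": "A1"},
    {"miles": "7", "code": "ZZ"},
]
accepted, rejected = load_staged(staged, known_codes={"A1"})
```

Nothing is thrown away: rejected rows keep the value exactly as written, together with the reasons, so feedback can be given later.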
EDIT: For further reading, I recommend having a look at DWH concepts. Although you may not want to build such a thing, you could partially apply those concepts:
http://en.wikipedia.org/wiki/Extract,_transform,_load
http://en.wikipedia.org/wiki/Data_warehouse
http://en.wikipedia.org/wiki/Data_cleansing
A government department I worked in does a lot of surveys, most of which are (were) still paper based.
All the results were OCR'd into the system.
As part of the OCR process a digital scan of the forms is kept.
Data is then validated; data that is undecipherable or that fails validation is flagged.
When a human operator reviews the digital data they can modify the data if they are confident that they can correctly interpret what the code could not; they (here's the cool bit) can also bring up the scan of the paper based original, and use that to determine what the user was trying to say.
On a different thread: at some point you want to validate the incoming data against any expected ranges that you want it to conform to. By rejecting it at the point of entry you give the user a chance to correct it. The trade-off is that every time you reject it you increase the chance of them abandoning the whole process.
At some point in your system you need to specify the rules which will be used for validation. At the end of the day a system is only going to be as smart as those rules. You can build these into the code yourself (probably in the business logic) or you might use a 3rd party component.
Having flexible control over the validation rules is pretty important, as they are likely to change over time.
To be honest with you, one point of migrating from paper-based systems to IT is to remove these errors and make sure all data is always correct. I doubt any correctly planned and developed IT system (especially business financial systems) would allow such errors. Not in the company I am working for anyway...
There are lots of software tools that address the kinds of problems you mention. There are platforms and tools that let you define rules for scrubbing and transforming data and handling validation errors. Those techniques are widely used for Data Integration and Business Intelligence applications. Google for "Data Quality" or "Data Integration".
The easiest thing to do (though this is not always possible) is to design the interface where users enter the data so that it limits, as much as possible, the amount of text they need to type. In my experience this is where a lot of problems come from. One simple example is to provide a select or auto-complete select field.
One thing that you could do is to do everything possible to determine if the data is correct before it goes into the db. I try to give the user entering the data as much feedback as possible so they can (ideally) fix some of the issues before the data gets persisted. For example, it is a very quick check to determine if the data being entered is of the correct type.
I got started in legal systems before the PC era. Litigation support databases routinely have to accommodate factually incorrect, incomplete, and contradictory information. It takes a different way of thinking.
The short version . . .
Instead of recording a single fact, you record multiple assertions about a fact. It boils down to designing a database to store data from assertions like these.
In an interview at 2011-01-03 08:13, Neil Rimes told Officer Cane
that he was at home from 2011-01-02 20:00 until 2011-01-03 08:13.
In an interview at 2011-01-03 08:25, Liza Nevers told Officer Cane
that Neil Rimes came home at 2011-01-02 23:45.
In a deposition at 2011-05-13 10:22, Cody Maxon told attorney Kurt
Schlagel that he saw Neil Rimes at Kroger at 2011-01-03 03:00
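One possible shape for such records, sketched in Python. The field names and the `assertions_about` helper are my own guesses at a design, not the schema the answer had in mind:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Assertion:
    source: str          # who made the statement
    told_to: str         # who recorded it
    setting: str         # interview, deposition, ...
    stated_at: datetime  # when the statement was made
    subject: str         # who the statement is about
    claim: str           # what was asserted, kept verbatim

ASSERTIONS = [
    Assertion("Neil Rimes", "Officer Cane", "interview", datetime(2011, 1, 3, 8, 13),
              "Neil Rimes", "was at home from 2011-01-02 20:00 until 2011-01-03 08:13"),
    Assertion("Liza Nevers", "Officer Cane", "interview", datetime(2011, 1, 3, 8, 25),
              "Neil Rimes", "came home at 2011-01-02 23:45"),
    Assertion("Cody Maxon", "Kurt Schlagel", "deposition", datetime(2011, 5, 13, 10, 22),
              "Neil Rimes", "was seen at Kroger at 2011-01-03 03:00"),
]

def assertions_about(subject):
    """Return every recorded claim about a subject, oldest statement first."""
    return sorted((a for a in ASSERTIONS if a.subject == subject),
                  key=lambda a: a.stated_at)
```

The three claims contradict each other and all three are stored. Deciding which one to believe is left to whoever reads the data, not to a constraint in the database.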

TDD: Where to start the first test [closed]

So I've done some unit testing and have experience writing tests, but I have not fully embraced TDD as a design tool.
My current project is to rework an existing system that generates serial numbers as part of the company's assembly process. I understand the current process and workflow from looking at the existing system. I also have a list of new requirements and how they are going to modify the workflow.
I feel like I'm ready to start writing the program and I've decided to force myself to finally do TDD from start to end.
But now I have no idea where to start. (I also wonder if I'm cheating the TDD process by already having an idea of the program flow for the user.)
The user flow is really serial and is just a series of steps. As an example, the first step would be:
user submits a manufacturing order number and receives a list of serializable part numbers from that order's bill of materials
The next step is started when the user selects one of the part numbers.
So I was thinking I can use this first step as a starting point. I know I want a piece of code that takes a manufacturing order number and returns a list of part numbers.
// This isn't what I'd want my code to end up looking like
// but it is the simplest statement of what I want
IList<string> partNumbers = GetPartNumbersForMfgOrder(mfgOrder);
In his book of examples, Kent Beck talks about picking small tests. This seems like a pretty big black box. It's going to require a mfg order repository, I have to crawl a product structure tree to find all applicable part numbers for this mfg order, and I haven't even defined my domain model in code at all.
So on one hand that seems like a crappy start - a very general high level function. On the other hand, I feel like if I start at a lower level I'm really just guessing what I might need and that seems anti-TDD.
As a side note... is this how you'd use stories?
As an assembler
I want to get a list of part numbers on a mfg order
So that I can pick which one to serialize
To be truthful, an assembler would never say that. All an assembler wants is to finish the operation on mfg order:
As an assembler
I want to mark parts with a serial number
So that I can finish the operation on the mfg order
Here's how I would start. Let's suppose you have absolutely no code for this application.
Define the user story and the business value that it brings: "As a User I want to submit a manufacturing order number and receive a list of that order's part numbers so that I can send the list to the inventory system"
Start with the UI. Create a very simple page (let's suppose it's a web app) with three fields: label, list and button. That's good enough, isn't it? The user could copy the list and send it to the inventory system.
Use a pattern to base your design on, like MVC.
Define a test for your controller method that gets called from the UI. Here you're testing that the controller works, not that the data is correct: Assert.AreEqual(3, controller.RetrieveParts(mfgOrder).Count)
Write a simple implementation of the controller to make sure that something gets returned: return new List<MfgOrder>{new MfgOrder(), new MfgOrder(), new MfgOrder()}; You'll also need to implement classes for MfgOrder, for example.
Now your UI is working! Working incorrectly, but working. So let's expect the controller to get the data from a service or DAO. Create a mock DAO object in the test case, and add an expectation that the method "partsDao.GetPartsInMfgOrder()" is called.
Create the DAO class with the method. Call the method from the controller. Your controller is now done.
Create a separate test to test the DAO, finally making sure it returns the proper data from the DB.
Keep iterating until you get it all done. After a little while, you'll get used to it.
The main point here is separating the application in very small parts, and testing those small parts individually.
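The mock-DAO step from the list above might look something like this in Python with `unittest.mock`. The class and method names are Python-style renderings of the names in the steps, and the order number and part numbers are made up:

```python
from unittest.mock import Mock

class PartsController:
    """Controller that delegates the lookup to an injected DAO."""
    def __init__(self, parts_dao):
        self._parts_dao = parts_dao

    def retrieve_parts(self, mfg_order):
        return self._parts_dao.get_parts_in_mfg_order(mfg_order)

def test_controller_asks_dao_for_parts():
    parts_dao = Mock()
    parts_dao.get_parts_in_mfg_order.return_value = ["P-1", "P-2", "P-3"]
    controller = PartsController(parts_dao)

    parts = controller.retrieve_parts("MO-1001")

    # The controller works: it returns what the DAO gave it...
    assert len(parts) == 3
    # ...and it asked the DAO exactly once, with the order it was given.
    parts_dao.get_parts_in_mfg_order.assert_called_once_with("MO-1001")

test_controller_asks_dao_for_parts()
```

The real DAO then gets its own, separate test against the database, so each piece stays small enough to test on its own.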
This is perfectly okay as a starting test. With it you define the expected behavior: how it should work. Now, if you feel you've taken a much bigger bite than you'd have liked, you can temporarily ignore this test and write a more granular test that covers part of it, or at least gets you midway. Then write other tests that take you towards the goal of making the first big test pass. Red-Green-Refactor at each step.
Small tests, I think mean that you should not be testing a whole lot of stuff in one test. e.g. Are components D.A, B and C in state1, state2 and state3 after I've called Method1(), Method2() and Method3() with these parameters on D.
Each test should test just one thing. You can search SO for qualities of good tests. But I'd consider your test to be a small test because it is short and focussed on one task - 'Getting PartNumbers From Manufacturing Order'
Update: As a To-Try suggestion (AFAIR from Beck's book), you may wanna sit down and come up with a list of one-line tests for the SUT on a piece of paper. Now you can choose the easiest (tests that you're confident that you'll be able to get done.) in order to build some confidence. OR you could attempt one that you're 80% confident but has some gray areas (my choice too) because it'll help you learn something about the SUT along the way. Keep the ones that you've no idea of how to proceed for the end... hopefully it'll be clearer by the time the easier ones are done. Strike them off one by one as and when they turn green.
I think you have a good start but don't quite see it that way. The test that is supposed to spawn more tests makes total sense to me as if you think about it, do you know what a Manufacturing Order number or a Part Number is yet? You have to build those possibly which leads to other tests but eventually you'll get down to the itty bitty tests I believe.
Here's a story that may require a bit of breaking down:
As a User I want to submit a manufacturing order number and receive a list of serializable part numbers from that order's bill of materials
I think the key is to break things down over and over again into tiny pieces that make it easy to build the whole thing. That "Divide and conquer" technique is handy at times. ;)
Well well, you've hit the exact same wall I did when I tried TDD for the first time :)
Since then, I gave up on it, simply because it makes refactoring too expensive - and I tend to refactor a lot during the initial stage of development.
With those grumpy words out of the way, I find that one of the most overlooked and most important aspects of TDD is that it forces you to define your class interfaces before actually implementing them. That's a very good thing when you need to assemble all your parts into one big product (well, into sub-products ;) ). What you need to do before writing your first tests is to have your domain model, deployment model and preferably a good chunk of your class diagrams ready before coding, simply because you need to identify your invariants, min and max values etc. before you can test for them. You should be able to identify these on a unit-testing level from your design.
Soo, in my experience (not in the experience of some author who enjoys mapping real world analogies to OO :P ), TDD should go like this:
Create your deployment diagram, from the requirement specification (ofc, nothing is set in stone - ever)
Pick a user story to implement
Create or modify your domain model to include this story
Create or modify your class-diagram to include this story (including various design classes)
Identify test-vectors.
Create the tests based on the interface you made in step 4
Test the tests(!). This is a very important step..
Implement the classes
Test the classes
Go have a beer with your co-workers :)
