Replicate logic or use existing classes in tests? - ruby

I have some concerns about writing too much logic in the spec tests.
So let's assume we have a student with statuses and steps
Student statuses progress from
pending -> learning -> graduated -> completed
and steps are:
nil -> learning_step1 -> learning_step2 -> learning_step3 -> monitoring_step1 -> monitoring_step2
With each step forward, a lot of things happen depending on where you are, e.g.
nil -> learning_step1
Student status changes to learning
Writes an action history (which is used by report stats)
Updates a contact schedule
learning_step1 -> learning_step2
...the same...
and so on... until
learning_step3 -> monitoring_step1
Student status changes to graduated
Writes different action histories (which are used by report stats)
Updates a contact schedule
and when
monitoring_step2 -> there is no next step
Student status changes to completed
Writes different action histories (which are used by report stats)
Deletes any contact schedule
So imagine I need a test case for a completed student: I would have to think through everything required to get that student to completed, and I could also forget to write an action history, which would mess up the reports.
Or ....
Using an already implemented class:
# assuming we have 5 steps as in the example above, I do
StepManager.new(student).proceed # status becomes learning, step becomes learning_step1
StepManager.new(student).proceed
StepManager.new(student).proceed
StepManager.new(student).proceed # status becomes graduated, step becomes monitoring_step1
StepManager.new(student).proceed # proceeds the student to the 5th step, which is monitoring_step2
StepManager.new(student).next_step # here the student goes to completed
or to make my job easier with something like:
StepManager.new(student).proceed.proceed.proceed.proceed.proceed.proceed
or
StepManager.new(student).complete_student # which does the same thing in the background
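(For the chained proceed form above to work, proceed would have to return the manager itself; a hypothetical sketch:)
class StepManager
  def proceed
    # ...advance the status/step, write the action history,
    # update the contact schedule...
    self # returning self allows .proceed.proceed.proceed chains
  end
end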
And by doing that I am sure I will never miss anything. But then the tests wouldn't be so clear about what I am doing.
So should I replicate the logic or use my classes?

Use TDD best practices. In a Rails project, for example, write unit tests for your models and controller actions, and do the same for services. Test each unit against its expectations. If you need a more complex data state to check, it's recommended to build it with factories using https://github.com/thoughtbot/factory_bot (or https://github.com/thoughtbot/factory_bot_rails if you're using Rails).
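For instance, a factory trait could produce a completed student by driving the real transition logic, which keeps the spec's intent readable while action histories and contact schedules stay consistent. A minimal sketch; the attribute names and complete_student are taken from the question:
FactoryBot.define do
  factory :student do
    status { :pending }
    step { nil }

    # Drive the real transition logic instead of hand-writing the end
    # state, so no action history is forgotten.
    trait :completed do
      after(:create) do |student|
        StepManager.new(student).complete_student
      end
    end
  end
end

# In a spec:
# student = create(:student, :completed)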

Related

How to performance test workflow execution?

I have 2 APIs:
Create a workflow (HTTP POST request)
Check workflow status (HTTP GET request)
I want to performance test how much time the workflow takes to complete.
Tried two ways:
Option 1: Created a Java test that triggers the workflow create API and then polls the status API to check if the status turns to CREATED. I measure the time taken by this process, which gives me performance results.
Option 2: Used Gatling to do the same
val createWorkflow = http("create").post("")
  .body(ElFileBody("src/main/resources/weather.json")).asJson
  .check(status.is(200))
  .check(jsonPath("$.id").saveAs("id"))
val statusWorkflow = http("status").get("/${id}")
  .asJson
  .check(status.is(200))
  .check(jsonPath("$.status").saveAs("status"))
val scn = scenario("CREATING")
  .exec(createWorkflow)
  .repeat(20) { exec(statusWorkflow) }
The Gatling one didn't really work (or I am doing it in some wrong way). Is there a way in Gatling to chain multiple requests and do something similar to Option 1?
Is there some other tool that can help me performance test such scenarios?
I think something like the below should work, using Gatling's tryMax:
.tryMax(100) {
  pause(1)
    .exec(
      http("status").get("/${id}")
        .asJson
        .check(status.is(200))
        // a failing check makes tryMax retry, so assert the target
        // status instead of merely saving it
        .check(jsonPath("$.status").is("CREATED"))
    )
}
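Wired into the scenario from the question, it might look like this (an untested sketch; createWorkflow and the JSON field names reuse the question's code):
val scn = scenario("CREATE_AND_WAIT")
  .exec(createWorkflow) // saves the workflow id
  .tryMax(100) {
    pause(1)
      .exec(
        http("status").get("/${id}")
          .asJson
          .check(status.is(200))
          .check(jsonPath("$.status").is("CREATED")) // fails (and retries) until done
      )
  }
  .exitHereIfFailed // drop users whose workflow never completed
To measure the end-to-end time, it might also help to wrap the create request and the polling loop in a group(...) block, so the report shows their combined duration.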
Note: I didn't try this out locally. More information about tryMax:
https://medium.com/@vcomposieux/load-testing-gatling-tips-tricks-47e829e5d449 (Polling: waiting for an asynchronous task)
https://gatling.io/docs/current/advanced_tutorial/#step-05-check-and-failure-management

Access the leaderboard in H2O AutoML mid-run and extract metrics of completed models?

In H2O AutoML, can we find which models are completed and see their metrics while training is still going on?
E.g. with max_models=5, can we extract the information of model 1 while the others (models 2, 3) are getting trained?
If you're using Python or R client to run H2OAutoML, you can see the progression — e.g. which model is being built or completed — using the verbosity='info' parameter.
For example:
aml = H2OAutoML(project_name="my_aml",
                ...,
                verbosity='info')
aml.train(...)
If you're using the Flow web UI, this progress is shown automatically when you click on the running job.
You can then obtain the model ids from those console logs, and retrieve the models as usual in a separate client:
m = h2o.get_model('the_model_id')
or you can even access the current state of the leaderboard with the model metrics:
h2o.automl.get_automl('my_aml').leaderboard
The same logic applies to the R client.
An alternative, maybe simpler if the goal is just to monitor progress, is to use Flow (usually available at http://localhost:54321) > 'Admin' > 'Jobs' > search for the AutoML job.
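Put together, a second Python client could poll the current state of the leaderboard while training runs (a minimal sketch; 'my_aml' is the project name from the example above, and the cluster address is an assumption):
import h2o
import h2o.automl

# Attach to the cluster where AutoML is already running
# (assuming the default address).
h2o.connect(url="http://localhost:54321")

# Snapshot of the leaderboard at this moment: models completed so far
# appear with their metrics while the rest are still training.
lb = h2o.automl.get_automl("my_aml").leaderboard
print(lb.head())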

Best system test technique for new front end to legacy system

Apologies in advance for asking a rather vague question...
I need to test a new front end to a database. The problems are 1) the DB schema is huge, with no documentation, and 2) there are many downstream systems, too many to build in a test environment.
I was wondering if this approach may add value: 1) execute the same operations with a) the new and then b) the old front-end system (recording the times when each started/finished), then 2) use LogMiner to query the redo log (using the start and end times) and compare the changes to the DB during a) and b).
Are there better approaches?
Matt
When testing, you need to define a successful test before you start. Meaning, you need to know what the end result should be based on your starting environment, your ending environment and the actions you perform. Example: let's say you have an accounting system and you want to "test" a payment transaction from account X to account Y. When you start, you know the balances of X and Y. You run your test and send a $100 payment from X to Y. After the test, is X = X-100 and Y = Y+100?
In your case, I would:
1) Take a backup of the database, i.e. start with a known, consistent state.
2) Run the old processes.
3) Run reports on the results.
4) Restore the database from #1.
5) Run the new processes.
6) Run the reports again.
7) Compare the reports from steps #3 and #6 (a sketch of this comparison is below).
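For step 7, something as simple as diffing exported report files can work. A hypothetical sketch in Python, assuming both runs export the same report as CSV (the filenames are made up):
import csv
import sys

def load(path):
    # Load a report CSV as a list of rows.
    with open(path, newline='') as f:
        return list(csv.reader(f))

old_rows = load('report_old_frontend.csv')  # step 3 output (assumed filename)
new_rows = load('report_new_frontend.csv')  # step 6 output (assumed filename)

if old_rows == new_rows:
    print('Reports match: the new front end produced the same results.')
else:
    # Show the first differing row so the mismatch can be investigated.
    for i, (old, new) in enumerate(zip(old_rows, new_rows)):
        if old != new:
            print(f'Row {i} differs:\n  old: {old}\n  new: {new}')
            break
    else:
        # No differing row in the overlap, so the row counts must differ.
        print(f'Row counts differ: {len(old_rows)} vs {len(new_rows)}')
    sys.exit(1)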

Hive execution hook

I need to add a custom execution hook in Apache Hive. Please let me know if somebody knows how to do it.
The current environment I am using is given below:
Hadoop: Cloudera version 4.1.2
Operating system: CentOS
Thanks,
Arun
There are several types of hooks, depending on the stage at which you want to inject your custom code:
Driver run hooks (Pre/Post)
Semantic analyzer hooks (Pre/Post)
Execution hooks (Pre/Failure/Post)
Client statistics publisher
If you run a script, the processing flow looks as follows:
1) Driver.run() takes the command
2) HiveDriverRunHook.preDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
3) Driver.compile() starts processing the command: creates the abstract syntax tree
4) AbstractSemanticAnalyzerHook.preAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
5) Semantic analysis
6) AbstractSemanticAnalyzerHook.postAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
7) Create and validate the query plan (physical plan)
8) Driver.execute(): ready to run the jobs
9) ExecuteWithHookContext.run() (HiveConf.ConfVars.PREEXECHOOKS)
10) ExecDriver.execute() runs all the jobs
11) For each job, at every HiveConf.ConfVars.HIVECOUNTERSPULLINTERVAL interval: ClientStatsPublisher.run() is called to publish statistics (HiveConf.ConfVars.CLIENTSTATSPUBLISHERS)
12) If a task fails: ExecuteWithHookContext.run() (HiveConf.ConfVars.ONFAILUREHOOKS)
13) Finish all the tasks
14) ExecuteWithHookContext.run() (HiveConf.ConfVars.POSTEXECHOOKS)
15) Before returning the result: HiveDriverRunHook.postDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
16) Return the result.
For each of the hooks I indicated the interfaces you have to implement. In brackets there's the corresponding configuration property key you have to set in order to register the class at the beginning of the script.
E.g. setting the pre-execution hook (stage 9 of the workflow):
HiveConf.ConfVars.PREEXECHOOKS -> hive.exec.pre.hooks:
set hive.exec.pre.hooks=com.example.MyPreHook;
Unfortunately these features aren't really documented, but you can always look into the Driver class to see the evaluation order of the hooks.
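A minimal pre-execution hook might look like this (a sketch against the Hive 0.11-era API; the class name com.example.MyPreHook matches the registration example above):
package com.example;

import org.apache.hadoop.hive.ql.QueryPlan;
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;

// Registered with: set hive.exec.pre.hooks=com.example.MyPreHook;
public class MyPreHook implements ExecuteWithHookContext {
    @Override
    public void run(HookContext hookContext) throws Exception {
        // Invoked just before the jobs are launched (stage 9 above).
        QueryPlan plan = hookContext.getQueryPlan();
        System.err.println("Pre-exec hook fired for query " + plan.getQueryId());
    }
}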
Remark: I assumed Hive 0.11.0 here; I don't think the Cloudera distribution differs (too much).
A good start: http://dharmeshkakadia.github.io/hive-hook/ (there are examples there).
Note: the Hive CLI shows the hook messages in the console; if you execute from Hue, add a logger and you can see the results in the HiveServer2 role log.

Google Analytics funnel ignore steps

I have the following problem with tracking a Magento purchase on Google Analytics (custom theme, different from the default checkout process).
My goal settings are as follows: http://db.tt/W30D0CnL, where step 3 equals /checkout/onepage/opc-review-placeOrderClicked
As you can see from the funnel visualization (http://db.tt/moluI29d), after step 2 (Checkout Start) there are a lot of exits toward /checkout/onepage/opc-review-placeOrderClicked, which is set as step 3, but step 3 always reports 0.
Is there something that I'm missing here?
I found the problem. Apparently the second step (/checkout/onepage) was being matched even on the third step.
When I changed it to a regex match (/checkout/onepage$), everything started working.
