Metrics for comparing event-based and thread-based programming models

I have been asked to compare the programming models used by two different OSs for wireless sensor networks, TinyOS (which uses an event-based model) and Contiki (which uses events internally, but offers a protothread model for application programmers). I have developed the same application in both systems, and I can present a qualitative analysis of the pros and cons of both models, and give my subjective impression.
However, I have been asked to put forward metrics for comparing them. Apart from the time spent to write the programs (which is roughly equal), I'm not sure what other metrics are applicable. Can you suggest some?

Time to understand these programs? Number of questions asked on the net about deadlocks (normalized by user base)?

I ended up using lines of code and cyclomatic complexity to show how different models impact code organization. I also estimated the difficulty of understanding the two programs by asking another programmer to read them.
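For what it's worth, here is a rough sketch of how one might compute those two metrics over the C/nesC sources. The file names and the decision-keyword list are illustrative assumptions, and a crude keyword count is only an approximation of McCabe's cyclomatic complexity; a dedicated tool such as pmccabe or lizard would be more accurate.

```python
import re

# Count decision points (branches) as a rough proxy for cyclomatic complexity:
# CC ~= decision points + 1. Keyword list is a simplification.
DECISION_RE = re.compile(r'\b(?:if|for|while|case)\b|&&|\|\||\?')

def strip_comments(src: str) -> str:
    src = re.sub(r'/\*.*?\*/', '', src, flags=re.DOTALL)  # block comments
    return re.sub(r'//.*', '', src)                       # line comments

def metrics(path: str) -> tuple[int, int]:
    with open(path) as f:
        src = strip_comments(f.read())
    loc = sum(1 for line in src.splitlines() if line.strip())  # non-blank LOC
    complexity = 1 + len(DECISION_RE.findall(src))
    return loc, complexity

# Hypothetical file names for the TinyOS (nesC) and Contiki (C) versions.
for path in ("app_tinyos.nc", "app_contiki.c"):
    loc, cc = metrics(path)
    print(f"{path}: LOC={loc}, approx. cyclomatic complexity={cc}")
```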

Related

Apache Beam and ETL processes

Given following processes:
Manually transforming huge .csv files via rules (using MS Excel or Excel-like software) and sharing them via FTP.
Scripts (usually written in Perl or Python) that basically transform data, preparing it for other processes.
APIs batch-reading from files or other origin sources and updating their corresponding data models.
Spring Boot deployments used (or abused) in part to regularly collect and aggregate data from files or other sources.
And given these problems / areas for improvement:
Standardization: I'd like (as far as it makes sense) to propose a unified, powerful tool that natively deals with these kinds of (somewhat big) data transformation workflows.
Raising the abstraction level of the processes (related to the point above): Many of the "tasks/jobs" I mentioned above are seen by the teams using them in a very technical, low-level, task-like way. I believe that a higher-level view of these processes/flows, highlighting their business meaning, would make these processes more self-documenting and would also help establish a ubiquitous language that different stakeholders can refer to and reason about unambiguously.
IO bottlenecks and resource utilization (technical): Some of these processes fail more often than would be desirable (or take a very long time to finish) due to memory or network bottlenecks. Though it is clear that hardware has limits, resource utilization doesn't seem to have been a priority in many of these data transformation scripts.
Do the Dataflow model, and specifically the Apache Beam implementation paired with either Flink or Google Cloud Dataflow as a backend runner, offer a proven solution to these "mundane" topics? The material on the internet mainly focuses on the unified streaming/batch model and typically covers more advanced features like streaming, event windowing, watermarks, late events, etc., which do look very elegant and promising indeed, but I have some concerns regarding tool maturity and long-term community support.
It's hard to give a concrete answer to such a broad question, but I would say that, yes, Beam/Dataflow is a tool that can handle this kind of thing. Even though the documentation focuses on "advanced" features like windowing and streaming, lots of people are using it for more "mundane" ETL. For questions about tool maturity and community, you could consider sources like Forrester reports, which often cover Dataflow.
You may also want to consider pairing it with other technologies like Airflow/Composer.
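To give a feel for the "mundane" batch case, here is a minimal sketch in the Beam Python SDK. The bucket paths, column names, and the filter rule are made-up placeholders; the point is that the same pipeline code runs locally on the DirectRunner or on Flink/Dataflow just by changing the runner option.

```python
import csv
import io

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_row(line: str) -> dict:
    # Parse one CSV line into a dict; column layout is a placeholder.
    amount_str, currency = next(csv.reader(io.StringIO(line)))
    return {"amount": float(amount_str), "currency": currency}

def to_output_line(row: dict) -> str:
    return f'{row["currency"]},{row["amount"]:.2f}'

# Runner is chosen via options, e.g. ["--runner=DataflowRunner", ...];
# by default this runs locally on the DirectRunner.
options = PipelineOptions()

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://my-bucket/input/*.csv",
                                         skip_header_lines=1)
        | "Parse" >> beam.Map(parse_row)
        | "KeepEUR" >> beam.Filter(lambda r: r["currency"] == "EUR")
        | "Format" >> beam.Map(to_output_line)
        | "Write" >> beam.io.WriteToText("gs://my-bucket/output/eur")
    )
```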

Artificial Intelligence/Rules to guess user taste in Apparel/Clothing

Are there standard rules engines/algorithms in AI that would predict a user's taste in a particular kind of product, like clothes?
I know it's something every e-commerce website would kill for. But I am looking for theoretical patterns defined out there that would help make that prediction in a better way, if not accurately.
Two books that cover recommender systems:
Programming Collective Intelligence: Python; does a good job explaining the algorithms, but IMO doesn't provide enough help in terms of understanding how to scale.
Algorithms of the Intelligent Web: Java; harder to follow, but also covers using persistence (in this case MySQL) to facilitate scaling, and identifies areas in the example code that will not scale as-is.
There are basically two ways of approaching the problem: user-based or item-based. Netflix appears to use the former, while Amazon uses the latter. Typically, the user-based approach requires more time and/or processing power to generate recommendations because you tend to have more users than items to consider.
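To make the item-based approach concrete, here is a small self-contained sketch: compute item-item cosine similarities from a user x item rating matrix and recommend unrated items similar to what the user already rated highly. The items, ratings, and numbers below are all invented for illustration.

```python
import numpy as np

# Toy user x item rating matrix (rows = users, columns = clothing items);
# 0 means "not rated". All data is invented for illustration.
ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [0, 0, 4, 5],
], dtype=float)
items = ["slim jeans", "denim jacket", "floral dress", "summer skirt"]

# Item-item cosine similarity (the item-based approach): compare columns.
norms = np.linalg.norm(ratings, axis=0)
similarity = (ratings.T @ ratings) / np.outer(norms, norms)

def recommend(user_index: int, top_n: int = 2) -> list[str]:
    user = ratings[user_index]
    # Predicted score = similarity-weighted sum of the user's existing ratings.
    scores = similarity @ user
    scores[user > 0] = -np.inf          # don't re-recommend rated items
    best = np.argsort(scores)[::-1][:top_n]
    return [items[i] for i in best]

print(recommend(0))  # items similar to what user 0 already liked
```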
Not sure how to answer this, as this question is overly broad. What you are describing is a Machine Learning kind of task, and thus would fall under that (very broad) umbrella. There are a number of different algorithms that can be used for something like this, but most texts would tell you that the definition of the problem is the important part.
What parts of fashion are important? What parts are not? How are you going to gather the data? How noisy is the data? All of these are important considerations to the problem space. Pandora does a similar type of thing with music, with their big benefit being that their users tell them initially what they like and don't like.
To categorize their music, they actually have trained musicians listening to the music to identify all sorts of stuff. See the article on Ars Technica here for more information about that. Based on what I know about fashion tastes, I would say that it is a similar problem space, and would probably require experts to "codify" the information before you could attempt to draw parallels.
Sorry for the vague answer - if you want more specifics, I would recommend asking a more specific question, about specific algorithms or data sets, etc.

Suggestions for a business application using logic based system like prolog

I need to develop a business application that includes a logic-based system like Prolog. Basically, I need to develop a business application and show that a logic-based system is feasible for it.
This is an academic exercise.
I could only think of puzzles that can be solved using Prolog, but I need a business application where I can use Prolog.
Can anyone please give some suggestions for simple business applications where I can use a Prolog logic-based system?
How about some kind of resource scheduling like, say, conference rooms, labs, classrooms, etc.? You'd have to keep track of locations, available facilities, events, event priorities, times, what facilities are required for which events, etc. and try to balance these "fairly" in some way. That would be a major challenge for a conventional programming environment and would be immediately useful to boot.
Edited to add:
I found the paper I wanted to reference earlier. You'll have to pay for a copy, but it's worth it if you decide to go this route: School time table scheduling in Prolog
Abstract:
The school Time-Table Scheduling task is a very hard operations research and engineering problem, especially when implemented in a conventional language. That is due to the imperative, deterministic nature of most conventional languages, such as BASIC and PASCAL, and to the long series of constraints and goals inside the problem. The descriptive, logic-based and nondeterministic nature of the Prolog language, and its ability to backtrack, allows one to easily obtain a deductive data base, mixing the facts, rules, and constraints of the Time-table. Two systems, one nonmonotonic, and one monotonic with a non-monotonic reasoning structure, are compared and their performances in a significant test are discussed. The approach may be easily generalized to other analogous engineering, scheduling and operations research problems.
How about a business "contact management" application? I'm thinking it could be prototyped around a couple of features, sending notes to thank customers for recent purchases and perhaps a birthday recognition of some kind.

Measure development platform efficiency

We are developing an application that is a kind of development tool for building Line of Business applications. The current applications we build are Windows desktop clients, but we are also looking into targeting Silverlight / cloud kinds of applications.
What we are looking for is a "standard" way of measuring the time to build an application of medium to large complexity. The "easy" way would be to build two versions of an application, one using "standard" tools, like VS and components, and one using our platform, but I'm looking for a more efficient way to measure smaller parts and still be able to get some useful metrics of how much time could be saved using our product.
Do you guys have any pointers for me to look at, and what to test etc?
Everything that can be counted is a potential measure: LOC, classes, components, dependencies, etc.
You specifically ask about a "time measure" without actually measuring the time. Well, if you are going to document the time, then measuring the time is the only way to go. If you want to estimate or predict the time you think will be spent, then you could use a prediction tool. The COCOMO model is perhaps the best-known one. It uses LOC as the essential input, plus some additional calibration parameters, like complexity, type of system, personnel experience and historical calibration.
The original model has been redesigned (essentially including more parameters) and is referred to as COCOMO II, and the original model was renamed COCOMO 81 (Boehm published the first model in 1981).
You'll find a lot of info if you google COCOMO.
The Wikipedia article is probably one of the first hits: http://en.wikipedia.org/wiki/COCOMO
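For a quick sense of what the model actually computes, here is the Basic COCOMO 81 effort formula (effort = a * KLOC^b person-months, schedule = c * effort^d months) with the usual published coefficients. Treat it as a rough illustration rather than a calibrated estimate; real use would also apply cost-driver multipliers as in COCOMO II.

```python
# Basic COCOMO 81 coefficients (a, b, c, d) for the three project classes.
COEFFICIENTS = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc: float, project_class: str = "organic"):
    a, b, c, d = COEFFICIENTS[project_class]
    effort = a * kloc ** b          # person-months
    schedule = c * effort ** d      # calendar months
    return effort, schedule

# Example: a hypothetical 20 KLOC line-of-business application.
effort, schedule = basic_cocomo(20, "organic")
print(f"effort ~ {effort:.1f} person-months, schedule ~ {schedule:.1f} months")
```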

Have you ever used a genetic algorithm in real-world applications?

I was wondering how common it is to find genetic algorithm approaches in commercial code.
It always seemed to me that some kinds of schedulers could benefit from a GA engine, as a supplement to the main algorithm.
Genetic Algorithms have been widely used commercially. Optimizing train routing was an early application. More recently fighter planes have used GAs to optimize wing designs. I have used GAs extensively at work to generate solutions to problems that have an extremely large search space.
Many problems are unlikely to benefit from GAs. I disagree with Thomas that they are too hard to understand; a GA is actually very simple. We found that there is a huge amount of knowledge to be gained from tuning the GA to a particular problem, which can be difficult, and, as always, managing large amounts of parallel computation continues to be a problem for many programmers.
A problem that would benefit from a GA is going to have the following characteristics (a minimal sketch follows the list):
A good way to encode potential solutions
A way to compute a numerical score to evaluate the quality of a solution
A large multi-dimensional search space where the answer is non-obvious
A good solution is good enough and a perfect solution is not required
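To make those four points concrete, here is a deliberately minimal GA sketch: a bit-string encoding, a numerical fitness score, tournament selection, single-point crossover and mutation. The fitness function ("count the ones") and all parameters are toy stand-ins for a real, expensive objective over a large search space.

```python
import random

GENOME_LEN = 32       # encoding: a fixed-length bit string
POP_SIZE = 60
GENERATIONS = 200
MUTATION_RATE = 0.01

def fitness(genome):
    # Numerical score; a toy objective standing in for a real evaluation.
    return sum(genome)

def tournament(population, k=3):
    # Pick the fittest of k random individuals.
    return max(random.sample(population, k), key=fitness)

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome):
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population = [mutate(crossover(tournament(population),
                                   tournament(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best), best)   # a good-enough solution, not necessarily optimal
```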
There are many problems that could probably benefit from GAs, and in the future they will probably be more widely deployed. I believe that GAs are used in cutting-edge engineering more than people think; however, most companies (like mine) guard those secrets extremely closely. It is only long after the fact that it is revealed that GAs were used.
Most people that deal with "normal" applications probably don't have much use for them though.
If you want to find an example, look at Postgres's Query Planner. It uses many techniques, and one just so happens to be genetic.
http://developer.postgresql.org/pgdocs/postgres/geqo-pg-intro.html
I used GA in my Master's thesis, but after that I haven't found anything in my daily work a GA could solve that I couldn't solve faster with some other Algorithm.
I don't think it is particularly common to find genetic algorithms in everyday-commercial code. They are more commonly found in academic/research code where the need to find the "best algorithm" is less important than the need to just find a good solution to a problem.
Nonetheless, I have consulted on a couple of commercial projects that do use GAs (chiefly as a result of my involvement with GAUL). I think the most interesting example was at a Biotech company. They used the GA to optimise scoring functions that were used for virtual screening, as part of their drug discovery application.
Earlier this year, with my current company, I added a new feature to one of our products that uses another GA. I think we might be marketing this from next month. Basically, the GA is used to explore molecules that have the potential for binding to a protein, and could therefore be further investigated as drugs targeting that protein. A competing product that also uses a GA is EA inventor.
As part of my thesis I wrote a generic Java framework for the multi-objective optimisation algorithm mPOEMS (Multiobjective prototype optimization with evolved improvement steps), which is a GA using evolutionary concepts. It is generic in the sense that all problem-independent parts have been separated from the problem-dependent parts, and an interface is provided so that the framework can be used by adding only the problem-dependent parts. Thus anyone who wants to use the algorithm does not have to start from scratch, which makes the work much easier.
You can find the code here.
The solutions you can find with this algorithm have been compared in a scientific paper with the state-of-the-art algorithms SPEA-2 and NSGA, and it has been shown that the algorithm performs comparably or even better, depending on the metrics you use to measure performance, and especially depending on the optimization problem you are looking at.
You can find it here.
Also, as part of my thesis and as a proof of concept, I applied this framework to the project selection problem found in portfolio management. It is about selecting the projects which add the most value to the company, best support the strategy of the company, or support any other arbitrary goal, e.g. selection of a certain number of projects from a specific category, or maximization of project synergies, ...
My thesis which applies this framework to the project selection problem:
http://www.ub.tuwien.ac.at/dipl/2008/AC05038968.pdf
After that I worked in a portfolio management department at one of the Fortune 500 companies, where they used commercial software that also applied a GA to the project selection problem / portfolio optimization.
Further resources:
The documentation of the framework:
http://thomaskremmel.com/mpoems/mpoems_in_java_documentation.pdf
mPOEMS presentation paper:
http://portal.acm.org/citation.cfm?id=1792634.1792653
Actually, with a bit of enthusiasm, anyone could easily adapt the code of the generic framework to an arbitrary multi-objective optimisation problem.
I haven't, but I've heard from a friend of mine about a company (can't remember their name) that uses mutating genetic algorithms to calculate placements and lengths of antennas (or something). According to my friend, they're supposed to have had huge success with this. I guess GAs are just too complex for the "average Joe developer" to become mainstream. Kind of like MapReduce: spectacularly cool, but way too advanced to hit the "mainstream"...
LibreOffice Calc uses it in its Solver module.
