Pros and cons of Datameer vs Alteryx [closed] - hadoop

I am trying to evaluate Datameer and Alteryx for our big data analytics needs. What are the pros and cons of these two tools?

First off, full disclosure: I am Sr. Director for Technical Product Marketing at Datameer, so treat what I have to say with appropriate skepticism. For what it's worth, I also write about Big Data (but not about Datameer) for ZDNet, and I was Research Director for Big Data and Analytics at Gigaom Research. So I know a thing or two about the BI/Big Data market.
With that out of the way, let me say that Alteryx and Datameer are actually rather different products. Even though our messaging may sound similar at times, we do different things.
Alteryx does a great job of allowing its users to set up workflows, graphically, for data transformation, then run those workflows when the design is done. Alteryx connects to Hadoop via Hive and its ODBC driver, moving data out of Hadoop in order to process it.
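For context, the "connect via Hive and move the data out of Hadoop" pattern looks roughly like the sketch below. The answer mentions Alteryx's ODBC driver; this example uses the Hive JDBC driver instead, simply because it is easier to show in a few lines, and the host, table and column names are placeholders, not anything from Alteryx itself.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveExtractExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC URL; host, port and database are placeholders,
        // and the hive-jdbc driver jar is assumed to be on the classpath.
        String url = "jdbc:hive2://hadoop-edge-node:10000/default";

        try (Connection conn = DriverManager.getConnection(url, "analyst", "");
             Statement stmt = conn.createStatement();
             // The query itself runs inside Hadoop, but the result rows are
             // streamed back to the client for further processing -- the
             // "move data out of Hadoop in order to process it" pattern.
             ResultSet rs = stmt.executeQuery(
                     "SELECT region, SUM(amount) AS total " +
                     "FROM sales GROUP BY region")) {
            while (rs.next()) {
                System.out.println(rs.getString("region") + " -> " + rs.getDouble("total"));
            }
        }
    }
}
```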
Alteryx runs as a Windows desktop application, using a UI that looks much like an Integrated Development Environment (IDE). If you're a developer, or even a certain type of power user (for example, one who likes to write a little code now and then), you'll be right at home.
Datameer, on the other hand, can run on Hadoop natively. Instead of connecting via Hive and moving data from Hadoop into our engine, Hadoop in fact is our engine, where that makes the most sense. Rather than graphical workflows, we use a spreadsheet metaphor, allowing users to enter formulas in sheets in order to effect data transformation/shaping/cleansing. And instead of making you execute your whole workbook to see results, our Smart Sampling feature brings data in at design time, so you can work interactively with a subset of the data before you decide to execute the full workbook from end-to-end.
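To illustrate the design-time sampling idea in general terms (this is not Datameer's code, just the pattern it describes): draw a small random sample of the data first, prototype your transformation against that sample, and only run the full job once the logic looks right. The file name and the toy "formula" below are invented for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Stream;

public class DesignTimeSample {
    /** Reservoir-sample k lines from a stream of unknown length. */
    static List<String> sample(Stream<String> lines, int k, Random rnd) {
        List<String> reservoir = new ArrayList<>(k);
        long[] seen = {0};
        lines.forEach(line -> {
            seen[0]++;
            if (reservoir.size() < k) {
                reservoir.add(line);
            } else {
                long j = (long) (rnd.nextDouble() * seen[0]);
                if (j < k) reservoir.set((int) j, line);
            }
        });
        return reservoir;
    }

    public static void main(String[] args) throws Exception {
        // Work interactively against ~1,000 rows instead of the full file.
        try (Stream<String> lines = Files.lines(Paths.get("transactions.csv"))) {
            for (String row : sample(lines, 1000, new Random())) {
                // Prototype the "formula" here (e.g. trim and upper-case a field),
                // then apply the same logic to the full dataset once it looks right.
                System.out.println(row.trim().toUpperCase());
            }
        }
    }
}
```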
Datameer runs in a Web browser, not as a desktop application, allowing us to run cross-platform between Windows and Mac OS (for example), as well as on tablets running Android, iOS or Windows. Datameer can run on-prem or as a service, in various configurations. With our Personal and Workgroup products, with which you'd likely be processing smaller data volumes, we bypass Hadoop and execute your workbook in-memory.
We have premium modules that do some interesting things. Smart Execution can simplify some Hadoop decisions you'd otherwise have to make on your own, including whether to use MapReduce, Tez or our local in-memory engine. Our Smart Analytics module lets you use machine learning algorithms to understand your data better, and we make pretty short work of doing so.
Alteryx essentially wraps R to deliver machine learning services, and does so for predictive analytics, rather than for data discovery, per se. The ML capabilities in Alteryx are more comprehensive than ours, but they are based on R functions inserted into data flows whereas our ML feature is Wizard-driven. Our ML feature set is smaller and, we believe, simpler. The 80-20 rule applies, from our point of view.
Alteryx does an excellent job of integrating consumer and spatial data to calculate and visualize things like locations within a certain drive-time radius. Datameer does not have a comparable feature. On the other hand, we do have over 60 native connectors to various RDBMS, DW, NoSQL, social and SaaS databases and services, and they come in the box. The datasets that Alteryx can integrate with come at relatively high cost, per seat, at least in terms of list prices on the company's site (at http://www.alteryx.com/products/pricing).
Alteryx is a BI product with a rich heritage dating back a decade, and the company has done a good job of adding Big Data features as those have become relevant to the market. Datameer was designed from scratch around Big Data use cases and technologies. So, really, we are very different. Can you do the same sorts of things with the two products? Sure. You can also do the same sorts of things with Excel macros and by coding in C; that's just how computation works. But our approaches are rather different.

Related

How to plan a software project step by step? [closed]

I have to design and develop a stand-alone desktop application using NetBeans, Java, and MySQL. I need to know how to plan my software step by step before coding: creating an SRS document, drawing use cases, planning the ER diagram, flow charts, BP diagrams, class diagrams, etc.
The goal is a complete, quality product with fewer errors.
As per my understanding, the development model needs to be determined first: waterfall or iterative/prototyping. The waterfall model is not much in use these days, as far as I know. Under waterfall, coding begins only after the requirements specification and software design are fully developed and nailed down, so that there is almost no chance they will change. In the modern world, the agile or iterative model is usually followed instead: we start with basic requirements and a basic software design, then proceed directly to coding and testing, and often release the product as soon as possible. The steps of SRS, design, coding, testing and releasing are then repeated continuously throughout the application's lifetime, so the product gets better with each release; after a few releases it reaches a point where many features are live in production.
The reason the iterative model of software development is more popular is that requirements keep changing, and it is hard to nail down the requirements for all the features of the product beforehand because the stakeholders don't have a full idea of what they want or how they want it. The same is true for design: when requirements change, the software design also needs to change, so it is not beneficial to lock the design down either.
That being said, it is not the case that iterative development has no SRS or design specs at all. I would suggest starting with a basic SRS and a basic software design that capture the very core of the application, and keeping them flexible so that they can accommodate changes easily.
The diagrams and documents that you mentioned are all good starting points. However, keep them minimal, capturing only the core part of the application, so that the coding, testing and releasing can proceed quickly and you can get an initial version (proof of concept) out to demonstrate to the stakeholders.
Let's say it is a shopping application; the core part of it may have these features:
Ability to add items to inventory
Ability to show all the items to the user (search comes later, user authentication comes later)
Ability for a user to view details of the item
Ability to make a purchase (fake purchase, actual payment processing can be done later)
Ability to view the orders and order details.
These features complete the critical path of the application, so that it becomes a working application as soon as possible and can be demonstrated and iterated over. The features that are not critical initially can be stubbed out, such as authentication, search, payment processing, sending emails and so on.
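As a concrete example of stubbing out a non-critical feature in Java (the names and interfaces here are hypothetical, invented purely for illustration): define a small interface now, ship a fake implementation so the critical path works end to end, and swap in the real payment gateway in a later iteration. The same pattern works for the other stubbed features such as authentication or email.

```java
import java.math.BigDecimal;
import java.util.UUID;

/** Payment abstraction for the shopping app's critical path. */
interface PaymentProcessor {
    /** Returns a payment confirmation id, or throws if the payment fails. */
    String charge(String customerId, BigDecimal amount);
}

/** Stub used in early iterations: always "succeeds" without moving any money. */
class FakePaymentProcessor implements PaymentProcessor {
    @Override
    public String charge(String customerId, BigDecimal amount) {
        System.out.printf("FAKE charge of %s for customer %s%n", amount, customerId);
        return "FAKE-" + UUID.randomUUID();
    }
}

/** Checkout code depends only on the interface, so the stub can be replaced later. */
class CheckoutService {
    private final PaymentProcessor payments;

    CheckoutService(PaymentProcessor payments) {
        this.payments = payments;
    }

    String placeOrder(String customerId, BigDecimal total) {
        String confirmation = payments.charge(customerId, total);
        // ... persist the order with the confirmation id ...
        return confirmation;
    }
}

public class StubDemo {
    public static void main(String[] args) {
        CheckoutService checkout = new CheckoutService(new FakePaymentProcessor());
        System.out.println(checkout.placeOrder("cust-42", new BigDecimal("19.99")));
    }
}
```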
I am not sure if this answers your question but hope it provides some pointers in order to start the application development from scratch.
If you have more time, follow the waterfall model. If you need fast delivery of the application, go for an Agile methodology.
Planning of software depends on the following factors:
1) scope of the project
2) deadlines of the project
3) number of resources available
4) cost of the project
5) time for R&D work, etc.
I hope it will help you.

How to create a data-entry form using COBOL for Windows [closed]

I am using VFP 9 on Windows 7 for my DBMS and want to do this in COBOL for Windows.
For a start I want to create, for my retail pharmacy, a medicines list table with Medicine Type, Medicine Name, and Unit Price, and use it as a look-up table for billing.
Then I want to create a data entry form to capture general data like Bill Number, Bill Date, and Customer Name.
After that I want to present to the user a scrollable, alphabetised medicine list, called up by the first letter of the medicine name, with provision to enter the quantity required for each medicine.
After this is over, the COBOL app should calculate the itemwise amount (unit rate in the look-up table multiplied by the quantity entered by the user), then calculate the total bill amount,
print out a sales bill with a suitable header and footer, and also save/append the bill to a file in the program-designated folder on my hard disk.
I am now doing all this with VFP 9 on my newer machines running Windows 7 32- and 64-bit, and also with MS QBasic v1.1 and FoxPro for DOS v2.5a on my older PCs running MS-DOS 6.22.
I don't have a direct answer to your question - it's a bit too open-ended for the SO forum. But let me guess: FoxPro is essentially dead and you need to redevelop your application for a newer platform. You are not the first person to face this problem.
Rather than redeveloping your application in something like COBOL on a PC platform, a task that may be rather daunting, why not explore the possibility of migrating to a different product? For example, I believe there are several tools available to help migrate FoxPro to VB.Net. There is some cost involved here (VB.Net isn't free, nor are the conversion packages). This approach is not without some problems (you might not get exactly the interface you are accustomed to, and the converted code might not be very well structured or maintainable), but it should be fairly quick and uncomplicated to do. The downside is that you are simply moving from one vendor solution to another. Single-vendor solutions, even those from large corporations like Microsoft, sometimes end up being discontinued before you would like.
Another approach might be to redevelop under an open source product. For example, a PHP-MySQL-Apache stack using the WAMP server install could be worth exploring. This may appear to be overkill for a standalone application, but there is plenty of free support for this type of setup. I do not recommend that casual developers tinker with the inner workings of complex open source software such as this - but the popularity and open source aspect ensure it will be operable for several more generations of whatever operating system you are using. An additional point is that once you have rebuilt your system under a client-server model it becomes somewhat more scalable and easier to deploy over a network, should you ever want to go that route.
Why not use COBOL? In actual fact there is nothing fundamentally wrong with this approach either. COBOL is very well suited for developing the type of application you have described. However, COBOL does not provide standard database or UI components (screen management is defined but often poorly supported). Database and UI tend to be vendor-specific add-ons, so you again end up tied to a single vendor solution. COBOL tends to be more popular in large mainframe shops, so getting good support for COBOL and related tools is difficult outside of that environment. Try finding a recent college or university graduate in computer science who has knowledge of COBOL and you will see what I mean. If you want to continue with COBOL, have a look at OpenCOBOL, which is open source and has a reasonably large support community willing to help out small-time developers.
My best advice is to take the time to research and evaluate your conversion/migration strategies before committing to any one of them. There are many factors at play here, some more obvious than others, but the choices you make right now will have long-range consequences on the success of your application.

Purpose of IBM Cognos Business Intelligence and Financial Performance Management?

I've been mandated to find out what IBM Cognos does, and I cannot find useful information on the subject apart from what I can read on the IBM website and Wikipedia.
What I'm after is some concrete examples of what Cognos can do for businesses and organisations that intend to use it.
Financial Performance Management I have no idea about but we use BI 8.4/10.1 quite a bit. The Cognos product line is actually quite large and we only really use the baseline BI stuff with Framework Manager but I'll try and help you out, based on how we use it.
Think of BI itself as an application that lets you view your data in many different ways. Now so far, it's no different to Jasper Reports or BIRT (which, despite its name, appears to provide very little BI stuff).
It does this by modelling the data (models are created with Framework Manager, which is why we use it over and above the standard reporting interface) to translate raw data into business data, and also relational data into dimensional data if your database isn't already dimensional.
It's this business view of the data combined with the dimensionality which allows really neat manipulation within Cognos BI.
You can create reports in a truly multi-dimensional way, aggregating data in various ways across things like dates, products, geographical regions, stores, divisions and so on (depending on your dimensional setup).
All of the reports are really dynamic in that you can collapse or expand individual dimensions at will so, if for example you want to drill down on a poorly-performing state to see which individual stores in that state are causing problems, it's a simple click on an icon.
No re-querying of the data, everything just happens in very quick time. And the charts and data that can be produced are very nice.
And, on top of that, Cognos BI comes with an inbuilt query studio and report studio which allow the creation of ad-hoc reports in the exact same interface the user sees when running standard reports. No more of the Eclipse-Designer/Web-App separation that we had to endure with BIRT.
Sorry if this sounds evangelistic but we're transitioning from BIRT to Cognos BI, and the difference is substantial.
Now you may not find a lot of information outside of the IBM website, although we did find a couple of dedicated sites when we first started examining the transition. Unfortunately, I don't have them available any more since the IBM information is more than adequate.
We also make a lot of use of the IBM developerWorks forums (we use Tivoli Common Reporting, which ships with the Cognos runtimes) and its microsite. Beyond the forums, there's a whole section of developerWorks dedicated to Cognos.
A bit late, but for the benefit of anyone browsing: Cognos BI is essentially web-based reporting/dashboarding/analytics. Historically it connected to relational databases only; from v8.4 onwards (and more so from v10) it also connects to OLAP cube data sources. It's designed for end-user self-service reporting and includes mobile as well as web connectivity.
Cognos FPM provides in-memory OLAP cube modelling on the server (based on the TM1 engine). A key point of difference is that it permits end-user writeback, and it is generally used for budgeting and 'what-if' scenario modelling. Modelling is facilitated by rules, which enable data modification. It also scales very well. As noted above, it may be integrated with Cognos BI (as well as being stand-alone), which means that a single dashboard may include reports from both relational and OLAP sources, and provide planning. So it's very powerful.
Note that Cognos Express provides essentially the same tools for the midmarket.
A little late, but in case anyone else comes here looking for information, I would like to add to @paxdiablo's answer. He was talking only about the modelling and reporting tools, which are the best-known parts of Cognos.
There is also a powerful tool named Metric Studio which can track, in an easy way, how the business is performing. This tool is IMHO the best of the Cognos suite, since it is truly BI for upper management.
Another thing that I love about Cognos (I've been using it since 2004) is the administration. From an IT perspective it is way easier to make things happen in Cognos than in any other tool I've seen (BO included).
Just to name a few: you can link row-level filtering to LDAP information (e.g. roles and customers), or burst reports through Cognos content or email... the possibilities are huge.

Best framework or system architecture for telemetry / SCADA system design & programming [closed]

I asked a question about a good book for this topic.
It's not getting many replies, so maybe there aren't any good books (although I find that strange; I must go search Amazon for myself).
In the absence of a book, maybe there is a good framework, some good URLs, or just general advice.
As I asked in the other question:
I am looking at several projects, all with roughly the same functionality. Some instruments collect some data (or control some functionality). They communicate by Internet (Ethernet/wifi/GPRS/satellite) with a database server which stores the measurements and provides a browser-based means of querying the data, producing reports, etc. (and possibly also allows control of the remote equipment). Can anyone recommend a good book describing an approach to developing such a software architecture, keeping it generic, and which tools, languages, test methods, etc. to use?
For "book", please substitute "framework, some good URLs, or just general advice".
This is a very common system pattern. Who can advise?
I have written a complete SCADA system (excluding the custom instrumentation hardware). The system was designed to be generic to allow the creation of new hardware models, instruments, and data collection - it was not written like so many SCADA systems for an individual company/plant but is used internationally for thousands of companies/plants.
I was the sole developer/designer with one member of management overseeing and guiding the project. It took longer that way, but it was doable. We looked at other SCADA specific systems/frameworks already out there and decided that since our units were custom, it would be easier and more flexible to write the system from scratch leveraging existing development frameworks and 3rd party components. Looking back, this worked out really well for us because we had the time and skill, but this is generally not the best solution depending on your business/contract model.
I am no longer with that company, however, they still use my software exclusively and I left on excellent terms. I would be happy to answer any general questions you have and to help point you in the right direction.
System Architecture
Here is a high-level overview of what the system consisted of:
Custom Cellular devices that had generic inputs to accommodate multiple instruments of varying types (analog, digital, pressure, amperage, floats, etc.)
UDP/TCP packets of custom format were sent by the units across the cell network (GPRS) to our servers (Windows Server 2003 R2). Information was sent regularly for reporting, and on customizable state changes that could be programmed at the device or online (the configuration was sent over the cell network).
A custom multi-threaded .NET application using TCP/UDP listeners that grabbed incoming packets (several hundred thousand a day), deciphered the custom headers, and routed the packets without further interpretation to the correct database (some clients required their own standalone system); a rough sketch of this listener/router idea appears after this list
A Microsoft SQL 2005 database that acted as the brain for the entire system. Packets were interpreted using CLR functions and automatically triggered alarms (as configured), compiled reports, and kept a full history
A custom .NET application to handle alerts by placing phone calls, sending SMS messages, and sending emails. The phone logic was handled by an Intel Dialogic Card over analog lines using a combination of recorded prompts and Text-To-Speech.
3 ASP.NET sites:
Customer facing site that allowed them to manage their accounts/sub users, track alerts, configure units and alerts, chart data, map devices, export reports, etc.
Sales site that allowed the distribution of material to sales people, tracking of individual devices, device health reporting, etc.
Internal management site that allowed the creation of customer accounts, the configuration/build of units, and all other administrative features as required.
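To make the listener/router component a bit more concrete: the original was a multi-threaded .NET application and its packet format is proprietary, so the following is only a hypothetical Java sketch with an invented header layout. It shows the basic shape of the idea - receive a datagram, read just enough of the header to know where the payload belongs, and hand it off uninterpreted.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.ByteBuffer;

public class TelemetryListener {
    public static void main(String[] args) throws Exception {
        try (DatagramSocket socket = new DatagramSocket(9000)) {
            byte[] buf = new byte[1024];
            while (true) {
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet);   // blocks until a unit reports in

                ByteBuffer data = ByteBuffer.wrap(packet.getData(), 0, packet.getLength());
                // Invented header layout: 4-byte unit id, 2-byte message type,
                // remainder is an opaque payload handed to the database layer.
                int unitId = data.getInt();
                short messageType = data.getShort();
                byte[] payload = new byte[data.remaining()];
                data.get(payload);

                // Route without interpreting the payload, as the answer describes.
                route(unitId, messageType, payload);
            }
        }
    }

    static void route(int unitId, short messageType, byte[] payload) {
        // In the real system this would look up the unit's database and enqueue
        // the payload for interpretation there (e.g. by SQL CLR functions).
        System.out.printf("unit=%d type=%d bytes=%d%n", unitId, messageType, payload.length);
    }
}
```

A production version would of course run the routing on worker threads and use TCP where delivery guarantees matter; the single-threaded loop here is just to keep the sketch short.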
There was also a custom internal monitoring system to verify the health of the system and to alert technicians of problems as necessary since the system needed 24/7 uptime.
In addition we created an iOS app, a mobile site, and a custom web service/client (API) to allow the retrieval of customer data directly by customers to allow them to integrate our solution with their existing (usually custom) SCADA systems.
Those are the components we used and they worked. Doing it again I would change a couple of things. I would use Windows Server 2008 R2, SQL 2008 R2, and instead of a Dialogic card I would use Microsoft TellMe using VoIP. I would also use Silverlight instead of ASP.NET. I really like ASP.NET, but Silverlight can give a much better presentation and can be used outside of the browser as necessary - a common request from SCADA operators.
The sites all used 3rd party components so that charting, and tables didn't have to be written from scratch. There are some SCADA specific components (largely Java based) out there. However, we found most of them to be crude, ugly, or way too specific to be used in our generic system (also expensive! It was easier and more flexible to customize a gauge/charting package to "make" our own).
As mentioned, the brain of the system was the database. This was done because Microsoft SQL is a pretty awesome, well supported product designed for extreme uptimes with great backup and performance options. We were also really impressed with the .NET CLR integration that was possible allowing our custom .NET code to run as a part of that process. The units we were supporting came in a variety of models and could be configured to use any combination of instruments and so keeping the database flexible was key. We used a lot of normalization!
One thing that really helped was to use Recursive CTEs to fake the existence of data when the values were still the default. We did this to save space in the database, but this also enabled us to introduce a layer of abstraction in the database that allowed the queries to be flexible as well.
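For anyone unfamiliar with that trick, here is a generic sketch of the idea (the schema, unit id, default value and connection string are invented, not the original system's): use a recursive CTE to generate the keys you expect to exist, left-join the rows that were actually stored, and COALESCE to the default so that default-valued readings never need to be persisted.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DefaultRowsExample {
    // Invented schema: sensor_reading only stores values that differ from the default.
    // The recursive CTE generates slots 1..24 for a unit, and COALESCE fills the gaps.
    private static final String QUERY =
        "WITH slots (slot) AS ( " +
        "    SELECT 1 " +
        "    UNION ALL " +
        "    SELECT slot + 1 FROM slots WHERE slot < 24 " +
        ") " +
        "SELECT s.slot, COALESCE(r.value, 0) AS value " +   // 0 = the configured default
        "FROM slots s " +
        "LEFT JOIN sensor_reading r ON r.unit_id = 42 AND r.slot = s.slot " +
        "ORDER BY s.slot";

    public static void main(String[] args) throws Exception {
        // Placeholder connection string; assumes the SQL Server JDBC driver is available.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:sqlserver://localhost;databaseName=telemetry;integratedSecurity=true");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(QUERY)) {
            while (rs.next()) {
                System.out.println(rs.getInt("slot") + " -> " + rs.getDouble("value"));
            }
        }
    }
}
```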
We had messed with OPC in the past but found it too inflexible, difficult, and irritating for our needs. That was a few years ago though, and I haven't looked at it since.
That is a long and very general answer to your question. I can't give you specific code or go into extreme detail since that information is property of that company, but I can answer some design questions and point you towards frameworks/tools we found helpful. My main advice would be to break everything down into separate components and employ the black box model on each so that individual components can be swapped out/improved as necessary. The scope of the project can seem overwhelming otherwise. Let me know if you have further questions or would like more information, good luck!

Is Pentaho ETL and Data Analyzer a good choice?

I was looking for an ETL tool and on Google found a lot about Pentaho Kettle.
I also need a data analyzer to run on a star schema so that business users can play around and generate any kind of report or matrix. Again, Pentaho Analyzer is looking good.
The other part of the application will be developed in Java, and the application should be database agnostic.
Is Pentaho good enough, or are there other tools I should check?
Pentaho seems to be pretty solid, offering the whole suite of BI tools, with improved integration reportedly on the way. But... the chances are that companies wanting to go the open source route for their BI solution are also most likely to end up using open source database technology... and in that sense "database agnostic" can easily be a double-edged sword. For instance, you can develop a cube in Microsoft's Analysis Services in the comfortable knowledge that whatever MDX/XMLA your cube sends to the database will be interpreted consistently, holding very little in the way of nasty surprises.
Compare that to the Pentaho stack, which will typically end up interacting with PostgreSQL or MySQL. I can't vouch for how PostgreSQL performs in the OLAP realm, but I do know from experience that MySQL - for all its undoubted strengths - has "issues" with the types of SQL that typically crop up all over the place in an OLAP solution (you can't get far in a cube without using GROUP BY or COUNT DISTINCT). So part of what you save in licence costs will almost certainly be used to solve issues arising from the fact that Pentaho doesn't always know which database it is talking to - robbing Peter to (at least partially) pay Paul, so to speak.
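To make "the types of SQL that typically crop up" concrete, here is an invented example of the kind of statement a ROLAP engine such as Pentaho's Mondrian will generate for a simple "distinct customers and revenue by region and quarter" view; the table and column names are made up. A cube fires queries of this shape for every drill-down, which is exactly where GROUP BY and COUNT DISTINCT performance starts to bite.

```java
public class OlapStyleSql {
    // The kind of statement a ROLAP engine generates for a single
    // "distinct customers by region and quarter" view; schema names are invented.
    static final String DISTINCT_CUSTOMERS_BY_REGION_QUARTER =
        "SELECT d.region, t.quarter, " +
        "       COUNT(DISTINCT f.customer_id) AS customers, " +
        "       SUM(f.amount)                 AS revenue " +
        "FROM   sales_fact f " +
        "JOIN   store_dim d ON d.store_id = f.store_id " +
        "JOIN   time_dim  t ON t.date_id  = f.date_id " +
        "GROUP BY d.region, t.quarter";

    public static void main(String[] args) {
        System.out.println(DISTINCT_CUSTOMERS_BY_REGION_QUARTER);
    }
}
```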
Unfortunately, more info is needed. For example:
will you need to exchange data with well-known apps (Oracle Financials, Remedy, etc)? If so, you can save a ton of time & money with an ETL solution that has support for that interface already built-in.
what database products (and versions) and file types do you need to talk to?
do you need to support querying of web-services?
do you need near real-time trickling of data?
do you need rule-level auditing & counts, accounting for every single row?
do you need delta processing?
what kinds of machines do you need this to run on? linux? windows? mainframe?
what kind of version control, testing and build processes will this tool have to comply with?
what kind of performance & scalability do you need?
do you mind if the database ends up driving the transformations?
do you need this to run in userspace?
do you need to run parts of it on various networks disconnected from the rest? (not uncommon for extract processes)
how many interfaces and of what complexity do you need to support?
You can spend a lot of time deploying and learning an ETL tool - only to discover that it really doesn't meet your needs very well. You're best off taking a couple of hours to figure that out first.
I've used Talend before with some success. You create your transformation by chaining operations together in a graphical designer. There were definitely some WTFs, and it was difficult to deal with multi-line records, but it worked well otherwise.
Talend also generates Java and you can access the ETL processes remotely. The tool is also free, although they provide enterprise training and support.
There are lots of choices. Look at BIRT, Talend and Pentaho, if you want free tools. If you want much more robustness, look at Tableau and BIRT Analytics.
