I would like to integrate a machine learning model developed in Python with an OBIEE 12c dashboard.
You should use in-database machine learning, then present the results in OBIEE.
Oracle Machine Learning for Python (OML4Py) is a Python API for performing statistical and machine learning analysis on data in your Oracle Autonomous Database.
With Embedded Python Execution, you can run user-defined Python functions in Python engines spawned by the Autonomous Database environment.
Automated Machine Learning (AutoML) enhances user productivity and machine learning results through automated algorithm and feature selection, as well as model selection and tuning.
For more information visit this link.
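As a rough sketch of what that could look like with OML4Py (the table, target column, and connection details below are all assumptions), you train and score entirely inside the database, then map the scored table into the OBIEE repository like any other table:

    # Hedged sketch: build an in-database model with OML4Py and materialize
    # the scores as a table that OBIEE can query like any other source.
    import oml

    # Placeholder credentials/DSN for an Autonomous Database connection.
    oml.connect(user="oml_user", password="secret", dsn="mydb_high")

    # Proxy object for an existing table -- no data is pulled to the client.
    customers = oml.sync(table="CUSTOMERS")
    train, test = customers.split(ratio=(0.8, 0.2))

    # In-database logistic regression (GLM); CHURN is an assumed target column.
    model = oml.glm("classification")
    model.fit(train.drop("CHURN"), train["CHURN"])

    # Persist the predictions so an OBIEE subject area can report on them.
    scores = model.predict(test.drop("CHURN"),
                           supplemental_cols=test[["CUST_ID"]])
    scores.materialize(table="CHURN_SCORES")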
I want to build a user interface in SAS for logistic regression, in which we provide or upload the data with the independent variables and the dependent variable, and it produces output giving us the regression table and a few other desired metrics.
What would be the most efficient way of doing this? Does SAS have a provision for it?
To add to this: if you know some HTML/JavaScript and have access to the SAS Stored Process Web Application (licensed with the Integration Technologies component that DomPazz linked to), we maintain a GPL-licensed library that lets you very easily build modern, standards-based HTML5 user interfaces for your SAS analytical routines.
Have a look at the H54s library on GitHub.
There is an old SAS package called SAS/AF that allowed you to build UIs. That package is rarely used for a number of reasons.
I have built numerous UIs for SAS in .NET, connecting to SAS via a Workspace Server. There are also libraries in Java to do the same. Take a look at SAS Integration Technologies (http://support.sas.com/documentation/onlinedoc/inttech/index.html).
If you use SAS Enterprise Guide or SAS Studio, both provide a GUI for various procedures. SAS Studio is relatively new, and you can write your own custom tasks for it as well.
Here's the link to the logistic regression task in SAS Studio. Even if you don't use it, it should give you an idea of the options that you may want to include in your own GUI.
http://support.sas.com/documentation/cdl/en/webeditorug/67434/HTML/default/viewer.htm#n1sahl3jgs8b2qn11jcr7n9vr1qd.htm
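If the front end ends up outside SAS entirely, one option is a thin Python layer driving PROC LOGISTIC through the open-source saspy package. A minimal sketch, assuming a working saspy connection profile and made-up columns:

    # Run PROC LOGISTIC from Python via saspy, so a web UI can collect
    # the uploaded data and variable choices, then display the output.
    import pandas as pd
    import saspy

    sas = saspy.SASsession(cfgname="default")  # profile name is an assumption

    # Placeholder data; in a real UI this would come from the user's upload.
    df = pd.DataFrame({"admit": [1, 0, 1, 0],
                       "gre": [660, 520, 700, 480],
                       "gpa": [3.6, 2.9, 3.8, 2.7]})
    data = sas.df2sd(df, table="uploaded")

    stat = sas.sasstat()  # requires a SAS/STAT licence
    result = stat.logistic(data=data, model="admit = gre gpa")

    # The result object exposes the ODS output (parameter estimates, fit
    # statistics, ...); dir() lists the tables that came back.
    print(dir(result))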
I am looking for the ETL tool best suited to the following criteria.
Supports MongoDB
Accepts metadata as input (or accepts a file and builds its metadata on the fly)
Provides configurable mapping (the mapping can be defined outside development, using some file or table; see the sketch below)
Please suggest a tool that caters to the above needs.
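For concreteness, the third criterion could look something like this in code; a minimal sketch only, where the mapping file name, connection string, and field names are all invented:

    # "Mapping defined outside development": a JSON file maps source
    # MongoDB fields to target columns, so the mapping can change
    # without touching code. All names here are illustrative.
    import json
    from pymongo import MongoClient

    with open("mapping.json") as f:   # e.g. {"cust_name": "CUSTOMER_NAME"}
        mapping = json.load(f)

    client = MongoClient("mongodb://localhost:27017")
    source = client["crm"]["customers"]

    rows = [{target: doc.get(src) for src, target in mapping.items()}
            for doc in source.find()]
    # 'rows' can now be handed to whatever loader writes the target tables.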
Hmm, your question is really about finding the most configurable ETL tool. From years of experience with ETL processes, I can tell you that you will never find a tool that meets all of your demands. Especially when you have an enterprise-level data warehouse (needed because of heavy and complex reporting requirements), the only complete solution is to build your own custom, project-based ETL software, which is often a thankless job.
But (big BUT), you can cover at least 80% of your needs with existing tools. Plugins, smart use of scripts, good data-flow design, and (if needed) small custom programs paired with a scheduler can get you to the process you have in mind. ETL work is no different from any other kind: 80% of the work is done in 20% of the time, and the remaining 20% of the work takes 80% of the time.
My suggestion for you:
Pentaho Data Integration - free and open source
PDI is a powerful ETL tool and can surely meet your demands. There are plenty of plugins, a solid community, and a fine API if you're going to develop more plugins.
Pentaho Data Integration + Integration Server - Enterprise Edition - "cheap enough" for almost every medium-sized project
The Enterprise Edition has everything the free edition does, plus more plugins (a JMS producer, for example), a version control system, Instaview, etc.
Besides that, it has its own server, so scheduling is software-based (not OS-based), and you get logging, better management, and the most important thing - support!
Informatica or Microsoft SSIS - expensive and brilliant
I won't waste many words on these tools. Informatica is primarily an ETL-oriented company; using Informatica at a high level requires a deep understanding of DB/DWH design, ETL processes, PL/SQL, dimensional modeling, etc.
SSIS is built primarily for SQL Server, so I don't see much call for it unless at least one of your source or target databases (DWH) runs on SQL Server.
Conclusion
This barely scratches the surface of the many tools the market provides; someone else would probably not even mention these. Have a look at one of the published tool comparisons.
Almost every BI system has its own ETL tool. A good choice might be to use them together; that way you can get the maximum out of both.
Note: a good ETL project manager or ETL developer can stretch a tool's advantages to the level of a better/more expensive one!
The QA department that tests my apps at work uses an Oracle database that they all share. Things get really hairy with their test cases getting changed: a bug report gets filed, I spend time on it, and then find out the test case has been changed. Time wasted.
What I'd like is for dev and QA to each have our own copy of Oracle running on our machines, so we can protect our data and chase our tails... less.
The problem, which I understand, is that we don't have funding for all those licenses. Using an open-source database won't work because we have all kinds of PL/SQL packages and triggers that I'm sure tie us to Oracle.
Does anyone know of a way (or maybe an open-source product) to "fake" an Oracle database? There are no performance requirements at all. I don't mean mocking objects (we do use that for unit testing), but an actual "listening on a port for your request" RDBMS. It's a long shot, but I have to ask.
Use Oracle Express for this purpose.
http://www.oracle.com/technetwork/products/express-edition/overview/index.html
Oracle Database 11g Express Edition
Free to develop, deploy, and distribute

Oracle Database 11g Express Edition (Oracle Database XE) is an entry-level, small-footprint database based on the Oracle Database 11g Release 2 code base. It's free to develop, deploy, and distribute; fast to download; and simple to administer.

Oracle Database XE is a great starter database for:

Developers working on PHP, Java, .NET, XML, and Open Source applications
DBAs who need a free, starter database for training and deployment
Independent Software Vendors (ISVs) and hardware vendors who want a starter database to distribute free of charge
Educational institutions and students who need a free database for their curriculum

With Oracle Database XE, you can now develop and deploy applications with a powerful, proven, industry-leading infrastructure, and then upgrade when necessary without costly and complex migrations.

Oracle Database XE can be installed on any size host machine with any number of CPUs (one database per machine), but XE will store up to 11 GB of user data, use up to 1 GB of memory, and use one CPU on the host machine.

Support is provided through a free Oracle Discussion Forum monitored by Oracle employees as well as community experts.
You can use Oracle XE (Express Edition), which is free.
You can download it here.
Assuming that each developer needs less than 11 GB of data in their personal copy of the database, have you looked at using the free express edition of the Oracle database? You can install that on your local machine or even deploy it in production free of charge. You can't use enterprise edition features but basic PL/SQL should work exactly as it does in whatever edition of the database you're using now.
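If you do go the local-XE route, verifying an install takes a minute. A hedged sketch using the python-oracledb driver; the credentials are placeholders, and XE is the default service name for an 11g XE install:

    # Smoke test against a local Oracle XE instance -- confirms the
    # database really is listening on a port.
    import oracledb

    # 11g predates python-oracledb's default "thin" mode, so thick mode
    # (Oracle Client libraries) may be required here.
    oracledb.init_oracle_client()

    conn = oracledb.connect(user="hr", password="hr", dsn="localhost:1521/XE")
    with conn.cursor() as cur:
        cur.execute("SELECT banner FROM v$version")
        for (banner,) in cur:
            print(banner)
    conn.close()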
It's not obvious to me, however, that this is really the solution to the problem you're having. If test cases are getting changed without that information being communicated to developers, or if test data that one person relies on is being changed by someone else, creating more database instances with more copies of the same data isn't likely to be terribly helpful.

If you have a local copy of the database, you need some way of getting the current version of all the objects (tables, packages, triggers, etc.). You need some way of getting the data that a particular tester is relying on. And you need some way of moving your changes from your machine to the shared databases that doesn't stomp on the changes other developers are making.

None of these hurdles is insurmountable, but they do require a very solid build and deployment process -- otherwise you end up with chaos, where the version of code in your database is subtly different from the version in everyone else's, and the test data in your system has slightly different characteristics than the test data QA is using, leading to lots of bugs that are reproducible on one system but not another. If your current build and deployment process can't even ensure that test cases aren't changing while bugs are being investigated, I would expect that adding more instances will make the problem worse, not better.
I've been mandated to find out what IBM Cognos does, and I cannot find useful information on the subject apart from what I can read on the IBM website and Wikipedia.
What I'm after is some concrete examples of what Cognos can do for businesses and organisations that intend to use it.
Financial Performance Management I have no idea about, but we use Cognos BI 8.4/10.1 quite a bit. The Cognos product line is actually quite large, and we only really use the baseline BI stuff with Framework Manager, but I'll try to help you out based on how we use it.
Think of BI itself as an application that lets you view your data in many different ways. So far, that's no different from JasperReports or BIRT (which, despite its name, appears to provide very little BI functionality).
Cognos goes further by modelling the data (models are created with Framework Manager, which is why we use it over and above the standard reporting interface) to translate raw data into business data, and to map relational data to dimensional data if your database isn't already dimensional.
It's this business view of the data, combined with the dimensionality, that allows really neat manipulation within Cognos BI.
You can create reports in a truly multi-dimensional way, aggregating data in various ways across things like dates, products, geographical regions, stores, divisions and so on (depending on your dimensional setup).
All of the reports are really dynamic in that you can collapse or expand individual dimensions at will so, if for example you want to drill down on a poorly-performing state to see which individual stores in that state are causing problems, it's a simple click on an icon.
No re-querying of the data; everything happens very quickly. And the charts and data that can be produced are very nice.
On top of that, Cognos BI comes with the built-in Query Studio and Report Studio, which allow the creation of ad-hoc reports in the exact same interface the user sees when running standard reports. No more of the Eclipse-designer/web-app separation that we had to endure with BIRT.
Sorry if this sounds evangelistic but we're transitioning from BIRT to Cognos BI, and the difference is substantial.
Now you may not find a lot of information outside of the IBM website, although we did find a couple of dedicated sites when we first started examining the transition. Unfortunately, I don't have them available any more since the IBM information is more than adequate.
We also make a lot of use of the IBM developerWorks forums (we use Tivoli Common Reporting, which ships with the Cognos runtimes) and the Cognos microsite. Beyond the forums, there's a whole section of developerWorks dedicated to Cognos.
A bit late, but for the benefit of anyone browsing... Cognos BI is essentially web-based reporting/dashboarding/analytics. Historically it connected to relational databases only; from v8.4 onwards (and more so from v10) it also connects to OLAP cube data sources. It's designed for end-user self-service reporting and includes mobile as well as web connectivity.
Cognos FPM provides in-memory OLAP cube modelling on the server (based on the TM1 engine). A key point of difference is that it permits end-user writeback, and it is generally used for budgeting and 'what-if' scenario modelling. Modelling is facilitated by rules, which enable data modification. It also scales very well. As noted above, it may be integrated with Cognos BI (as well as being stand-alone), which means that a single dashboard may include reports from both relational and OLAP sources, and provide planning. So it's very powerful.
Note that Cognos Express provides essentially the same tools for the midmarket.
A little late, but in case anyone else comes here looking for information, I would like to expand on paxdiablo's answer. He covered only the modelling and reporting tools, which are the best-known parts of Cognos.
There is also a powerful tool named Metric Studio, which makes it easy to track how the business is performing. This tool is, IMHO, the best of the Cognos suite, since it is truly BI for senior management.
Another thing that I love about Cognos (I've been using it since 2004) is the administration. From an IT perspective, it is far easier to make things happen in Cognos than in any other tool I've seen (BusinessObjects included).
Just to name a few examples: you can link row-level filtering to LDAP information (e.g. roles and customers), or burst reports through Cognos content or email... the possibilities are huge.
I was looking for an ETL tool and on Google found a lot about Pentaho Kettle.
I also need a data analyzer to run on a star schema so that business users can play around and generate any kind of report or matrix. Again, Pentaho Analyzer looks good.
The other part of the application will be developed in Java, and the application should be database agnostic.
Is Pentaho good enough, or are there other tools I should check?
Pentaho seems to be pretty solid, offering the whole suite of BI tools, with improved integration reportedly on the way. But the chances are that companies wanting to go the open-source route for their BI solution are also most likely to end up using open-source database technology, and in that sense "database agnostic" can easily be a double-edged sword. For instance, you can develop a cube in Microsoft's Analysis Services in the comfortable knowledge that whatever MDX/XMLA your cube sends to the database will be interpreted consistently, holding very little in the way of nasty surprises.
Compare that to the Pentaho stack, which will typically end up interacting with PostgreSQL or MySQL. I can't vouch for how PostgreSQL performs in the OLAP realm, but I do know from experience that MySQL - for all its undoubted strengths - has "issues" with the types of SQL that typically crop up all over the place in an OLAP solution (you can't get far in a cube without using GROUP BY or COUNT DISTINCT). So part of what you save in licence costs will almost certainly be spent solving issues arising from the fact that Pentaho doesn't always know which database it is talking to - robbing Peter to (at least partially) pay Paul, so to speak.
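To make "the types of SQL" concrete, here is the flavour of query a ROLAP engine such as Pentaho's Mondrian emits constantly; the star-schema table and column names below are invented for illustration:

    # The kind of aggregate query a ROLAP engine generates for every view
    # the user drills into; the schema here is purely illustrative.
    query = """
    SELECT d.year,
           s.region,
           COUNT(DISTINCT f.customer_id) AS customers,
           SUM(f.amount)                 AS revenue
    FROM   sales_fact f
    JOIN   date_dim   d ON d.date_key  = f.date_key
    JOIN   store_dim  s ON s.store_key = f.store_key
    GROUP  BY d.year, s.region
    """
    print(query)  # GROUP BY + COUNT DISTINCT: exactly where MySQL struggled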
Unfortunately, more info is needed. For example:
will you need to exchange data with well-known apps (Oracle Financials, Remedy, etc)? If so, you can save a ton of time & money with an ETL solution that has support for that interface already built-in.
what database products (and versions) and file types do you need to talk to?
do you need to support querying of web-services?
do you need near real-time trickling of data?
do you need rule-level auditing & counts to account for every single row?
do you need delta processing?
what kinds of machines do you need this to run on? linux? windows? mainframe?
what kind of version control, testing and build processes will this tool have to comply with?
what kind of performance & scalability do you need?
do you mind if the database ends up driving the transformations?
do you need this to run in userspace?
do you need to run parts of it on various networks disconnected from the rest? (not uncommon for extract processes)
how many interfaces and of what complexity do you need to support?
You can spend a lot of time deploying and learning an ETL tool - only to discover that it really doesn't meet your needs very well. You're best off taking a couple of hours to figure that out first.
I've used Talend before with some success. You create your transformation by chaining operations together in a graphical designer. There were definitely some WTFs, and it was difficult to deal with multi-line records, but otherwise it worked well.
Talend also generates Java and you can access the ETL processes remotely. The tool is also free, although they provide enterprise training and support.
There are lots of choices. Look at BIRT, Talend, and Pentaho if you want free tools. If you want much more robustness, look at Tableau and BIRT Analytics.