Data Loading Software [closed] - loading

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
We deal with scientific research data and have large volumes of it spread across different template file formats (Excel, CSV, TXT, XML, etc.). We have been using legacy C programs (developed in-house) to load this data into our databases. (We use Ingres as our DBMS.) Is there any open-source software available for the ETL (extraction, transformation, loading) process? What have your experiences been, if you have used any?

Based on what other Ingres users are saying, the two that are fairly well spoken of are Talend and Pentaho.
Pentaho site: http://www.pentaho.com/
Talend site - as already mentioned by Paul: http://talend.com/index.php

Here is an open source solution for importing multiple file formats into a database system or other system type.
http://talend.com/index.php
At the company I work at we use SQL Server Integration Services, which does similar things; it should come with SQL Server if you're using that.

There is an open-source set of BI and ETL tools: have a look at Pentaho. I believe its ETL tool is called "Kettle". It has a pretty rich set of functionality and GUI tools for the ETL process.

We use DBMS/COPY but it looks like it is no longer in production. It has a GUI interface for setting up scripts or you can hand-write them.
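For readers weighing hand-written scripts against a packaged tool, the three ETL stages named in the question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the sample rows, and the `measurements` table are all hypothetical, and an in-memory SQLite database stands in for Ingres, which would need its own driver.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; in practice this would be a file on disk.
SAMPLE_CSV = """sample_id,temperature_c,site
1,21.5,north
2,,south
3,19.0,north
"""


def extract(text):
    """Extract: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with a missing measurement and convert types."""
    cleaned = []
    for row in rows:
        if not row["temperature_c"]:
            continue  # incomplete record, not loaded
        cleaned.append(
            (int(row["sample_id"]), float(row["temperature_c"]), row["site"].upper())
        )
    return cleaned


def load(conn, records):
    """Load: create the (hypothetical) target table and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements "
        "(sample_id INTEGER PRIMARY KEY, temperature_c REAL, site TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", records)


# SQLite is an assumption used only to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SAMPLE_CSV)))
loaded = conn.execute(
    "SELECT sample_id, temperature_c, site FROM measurements ORDER BY sample_id"
).fetchall()
print(loaded)
```

Tools such as Talend or Kettle add scheduling, connectors, and error handling on top of this basic shape, which is usually where hand-written loaders become hard to maintain.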

Related

Selecting the ETL tool [closed]

Closed 6 years ago.
I will have this situation in my project:
DB: Cassandra
Data Source: A Relational DB
Manager of ETL process: .NET API
ETL TOOL: ???
I will use Cassandra as my database.
It will exchange data with an Oracle DB dynamically, through an ETL tool.
The source data is stored in Oracle and will be used in Cassandra, managed through an API.
The question is which ETL tool is best in this situation, given that my ETL manager, which selects the parameters dynamically, is a .NET API. (Users will use this API behind the scenes, through the project.)
SSIS would be a good tool, since it is compatible with Microsoft .NET, but it is incompatible with Cassandra 3.x. With it, we could not use the benefits of newer Cassandra versions, such as materialized views, SASI secondary indexes (soon), etc.

Testing Hadoop to Teradata flow [closed]

Closed 3 years ago.
I would like to test a flow between a Hadoop data lake and Teradata tables. The thing is that I am new to these technologies.
The data lake is the data source for the data warehouse I have on Teradata.
I read about QuerySurge but I'd like to know if it is possible to create my own scripts to test the flows.
Teradata offers connectors for Cloudera and Hortonworks, which facilitate moving data between the platforms.
QueryGrid is an offering from Teradata that allows you to create "linked servers" on your Teradata platform. Using these "linked servers" you are able to query data on a Hadoop platform from Teradata. Currently, these types of workloads are intended to be low concurrency. That landscape is evolving fairly quickly and concurrency rates may increase as the technologies evolve and mature.
Feel free to use QuerySurge.
I have been working with QuerySurge for the last 5 years to test and validate data from different sources.
It is basically automation of custom SQL scripts.
QuerySurge
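On the question of writing your own test scripts: one common shape for such a script is a reconciliation check that runs the same aggregate query on the source and the target and compares the results. The sketch below is a minimal illustration, not a description of how QuerySurge works. Two in-memory SQLite databases stand in for the Hadoop and Teradata sides, and the `orders` table, its columns, and the helper names are hypothetical; a real setup would need the appropriate drivers for each platform.

```python
import sqlite3


def table_fingerprint(conn, table, numeric_column):
    """Return (row count, sum of one numeric column) for a table.

    Table and column names are interpolated into the SQL, so they must
    come from trusted configuration, never from user input.
    """
    query = f"SELECT COUNT(*), COALESCE(SUM({numeric_column}), 0) FROM {table}"
    return conn.execute(query).fetchone()


def compare(source_conn, target_conn, table, numeric_column):
    """Run the same fingerprint on both sides and report whether they agree."""
    src = table_fingerprint(source_conn, table, numeric_column)
    tgt = table_fingerprint(target_conn, table, numeric_column)
    return {"table": table, "source": src, "target": tgt, "match": src == tgt}


# Stand-ins for the two platforms (assumption: any DB-API connection would do).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")

# The target is deliberately missing one row to show a failing check.
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
target.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20)])

result = compare(source, target, "orders", "amount")
print(result)
```

Counts and sums catch missing or duplicated rows cheaply; column-level differences would need a row-by-row or hash-based comparison on top of this.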

Alternatives to SchemaSpy [closed]

Closed 6 years ago.
I am looking for an open source tool that can be used to generate ER diagrams. Currently, this is done using SchemaSpy. Maven scripts are invoked during the Jenkins build to generate these data model diagrams. I have tried POCs using SchemaCrawler as well. However, the results are not very satisfactory. I would appreciate pointers to alternative tools that can be used with the same setup (Maven and Jenkins).
If you would like to find good alternatives to SchemaSpy, try out and test these tools:
SchemaCrawler
Red-Gate SQL Doc (not FOSS)
Dataedo (not FOSS)
SchemaSpy 6.0
Each of them has different advantages and disadvantages. SchemaCrawler is also open source, Java-based, and free. SchemaSpy 6.0 is a new version of SchemaSpy that has a better look and feel and fixes some major issues.
Dataedo is a very interesting tool that can also generate documentation as PDF or HTML. With Dataedo you can write comments on tables and columns and then apply them to your database. As far as I remember, you can also find a free version on the supplier's page.
The last solution that I want to recommend is Red-Gate SQL Doc. It also generates nice-looking documentation and has many options. But this solution is not free; you need to pay to use it.

Tableau programming [closed]

Closed 7 years ago.
I am new here, and I hope that I can find answers to my questions related to open source reporting systems.
Is it possible to change the programming logic of 'Tableau Desktop'? I am asking because I need to make changes that enable me to log users' interactions with the system (Tableau Desktop).
Is it possible to perform Big Data analysis by combining Tableau Desktop with Hadoop or Spark?
If the answers to the above questions are no, could you please recommend any other open source (free) reporting system that satisfies these requirements?
Thank you in advance and best regards to all of you.
Tableau has drivers to connect to several "big data" NoSQL databases, and has added a Spark SQL driver as of Tableau version 8.3.
The full list of supported drivers can be found on Tableau's website at http://www.tableau.com/support/drivers
Your question about logging user interactions is not at all clear, but you might have better luck instituting logging at the database level instead of at client level.
In response to your question regarding user interactions, I'd recommend you take a look at the views_stats table in the Tableau Server database.
Instructions for connecting to the 'workgroup' database: http://onlinehelp.tableau.com/current/server/en-us/adminview_postgres_connect.htm
Versions 8 and 9 include a Spark connection.
As far as logging users goes, Tableau Desktop is designed as a single license tool for developers and shouldn't need to be logged.
If you're interested in logging users, you may be thinking of Tableau Server, which has built-in functions for things like that as well as a REST API, which has some additional functions.

How to test performance of Software AG webMethods implementation? [closed]

Closed 6 years ago.
I have a requirement to test the performance of an ESB implementation done using Software AG webMethods 9.5.
Please let me know the tools that can be used and the approach to be followed for testing.
Thanks
I used SoapUI to performance test webMethods Integration Server a couple of years ago.
I set up requests, number of clients etc in SoapUI that represented different scenarios of usage in the live system.
After the tests I exported data from SoapUI, wrote some scripts to analyze it and used Excel to present it in a pretty way.
Since you don't specify exactly what kind of performance test you want to run this may or may not work for you as well.
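The analysis scripts mentioned above are not shown, but a minimal sketch of that kind of post-processing might look like the following. Everything here is an assumption for illustration: the `scenario` and `response_ms` column names and the sample rows are hypothetical and do not reflect SoapUI's actual export format.

```python
import csv
import io
import statistics

# Hypothetical export: one row per request, with its scenario and latency.
EXPORT = """scenario,response_ms
login,120
login,140
login,400
search,80
search,95
search,90
"""


def summarize(text):
    """Group response times by scenario and compute simple statistics."""
    by_scenario = {}
    for row in csv.DictReader(io.StringIO(text)):
        by_scenario.setdefault(row["scenario"], []).append(float(row["response_ms"]))

    summary = {}
    for name, values in by_scenario.items():
        summary[name] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "max": max(values),
        }
    return summary


summary = summarize(EXPORT)
for name, stats in summary.items():
    print(name, stats)
```

Reporting the median and maximum next to the mean helps show when a few slow requests, like the 400 ms login above, are skewing the average.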
It's hard to provide any suggestions because little is known about your "ESB implementation" and little is known about the performance requirements. For example, from which point in your architecture do you want to test performance?
As suggested by ellak, using SoapUI is an option if your "ESB implementation" exposes web services and if you want to start load testing at the ESB level.
If you want better advice then you need to provide more information.

Resources