i want data intregration part do with the help of Talend and other reporting, dashboard work will do in microstrategy. How can i connect them ?
is any odbc or any kind of process is possible ?
As i understand you want to integrate an ETL tool (Talend) to adecuate data for presentation, the only thing that does not sound to clear for me it's why you have interest in integrate them.
They are two process apart, so you can take the processed data thru the ETL tool (Talend or whatever other tool) to a db and that be the source of the BI tool you prefer.
Regards,
Alejandro
Related
We're considering Snowflake and want to understand how we could use it, and possibly other tools, to overcome one of our main problems - ETL! We currently use a legacy DWH with an ETL process consisting of SSIS and some views. This has all the common pitfalls of this methodology - most notably that it takes ages!
I was under the assumption that we'd move to an ELT model in Snowflake, I started to research tools to do the 'T' part of it, however, I'm just listening to this podcast: https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/
And it's suggesting that just slapping a SQL View over something and exposing it in say PowerBI or Tableau is enough for the T part of things!...
Just wondering what people's experience was here?
- Do you do transformations just by writing a view in Snowflake?
- Do you use a third party tool specifically to address this need?
Secondary to this, for the Extraction and Loading, do you:
- Do this using Snowflake only
- Use a third party tool
I'm specifically interested if you do this to create some kind of timeseries in Snowflake from a non timeseries source. That's something we'd be keen to do.
This question is hard to answer without sounding opinionated, especially not knowing your use case. For what it's worth here is what I think:
Don't stick views on top of your tables and expose to a reporting tool unless you have a very very simple setup. If you're considering a tool like Snowflake then you will probably want to go for something more sustainable, this approach can become prohibitive in terms of cost and complexity in your views.
Use a third-party tool to manage your ELT process. Your choice of tool will depend on your internal skills and cloud strategy, have a look at the tools out there like Stich, Fivetran etc. If you don't mind having on-premise technologies why not stick with SSIS or use something like Apache Airflow (requires up-skilling)
Snowflake will not help you with the E of ELT, you will need to use a third-party tool to manage the extract of data from your other systems like SSIS. It will help with the L part, for this you can use Snowpipe or COPY commands which are available within the Snowflake ecosystem. Snowflake will also help you share your data with external parties which is really nice.
My organization has created a fairly complicated dimensional model in Snowflake using layers of SQL views, against which we can point our reporting tools. We use a separate replication tool for extraction from source systems and loading into Snowflake. Using views simplifies our approach in that we don't need to use an additional tool. It also makes managing the code easier than something like SSIS. For instance, we can search for code using the Snowflake interface or our version control tool instead of having to open individual SSIS packages.
I don't have any experience on any ETL tool. However I want to know if it is possible to do the followings using any ETL tool or we need to write a java or any other batch job to do this:
Scenario 1:
The source system has different REST APIs. I need to get the data, transform it, then store the data in a MongoDB.
The hardest part is the transformation. There can be situation where I need to call a REST API of source, and based on its data I need to call several other REST APIs using the 1st API data. After that we need to format the entire data in different format and store it in Mongo.
Scenario 2:
The source system has a DB. I need to transform the data using my custom logic and store it in MongoDB.
Here the custom logic can include things like this:
From table1 of source I created collection1. After that I need to consult table2 and previously created collection1, process the data and then create collection2.
Is this possible using any ETL tool? If possible then which tool? If possible please mention in as short as possible, how it can be done using different terminology so that I can search internet, learn things and implement it.
Briefly speaking: yes, that is what ETL tools are exactly for. You can Extract data from REST sources, Transform using sophisticated logic and Load to target, like MongoDB.
Exact implementation depends on the tool. While I guess you will get help if you run across problems implementing the solution in any of the tools, I don't think anyone will prepare complete, detailed solutions for you.
Can I use Bigdata in ecommerce (Magento)? we are having a custom website that extracts data from DB and displays it as report. But due to large data, timeout is happening…What can i do?
Yes, you can use Big data in e-commerce.
But I don't think the problem you're solving is related to big data. Try generating the report on the server side (batch and store in a table) and use pagination to display the report on the website. If you have to generate the report on user action, try using materialized views or stored procedure to speed up the generation.
Yes, you can use big data platform such as Hadoop and NoSQLs whichever best fit for your use case. In my work experience so far people used Hadoop back-end with BI tools like tableau and Qilikview etc.
Hope this will help.
I have a question about mongodb. Indeed, I want to switch from oracle dbs to a mongo db in order to have more flexible structure. The goal of my project consists in carrying out some consistent data analysis after implementing a mongoDB which will able to store my data in json format for instance or some useful logs about requests done by my colleagues on a Web service.
Please, what are your tips ? What are the most efficient java frameworks in order to build a solid database ? Need I to learn some other languages ?
Feel free to suggest or to give me our advice on who to start properly with this tool. Any share of your feedback on your experiences with data analytics for BI will be a real pleasure for me.
Thanks... :)
I'm writing a report and thought you guys could help by providing me with the costs of company support in setting up and training a client on a data integrator for Salesforce. E.g., if someone wants to use Salesforce, but first needs a tool to consolidate and transfer data from back office systems to Salesforce how much would that support service cost?
Salesforce actually comes with a very good integration tool called Data Loader. It can be run as an interactive application under Windows or Macintosh, or it can be run as a command-line tool on Windows, Mac or Linux.
In interactive mode, it can import & export CSV files.
In batch mode it can also read data from, and write data to, a database.
For example, I have a Linux server where a daily cron job activates the Data Loader which runs several jobs. Some of these jobs run SQL against a database and upload the resulting data into Salesforce. Other jobs extract from Salesforce (using their SOQL query language, which is SQL-like) and store the information into a database.
Data Loader has a bit of a learning curve for batch mode (mostly around creating some XML configuration files), but the Interactive mode is very easy to use.
So, to answer your question... If it's a one-time data load, just run the interactive version and it's easy. If you want regularly-updated data, then use the batch mode. Support costs for operating the integration are really all in the setup. Once it's running, there shouldn't be any on-going costs unless the data structures change and you want to change the data being transferred. Better yet, if the system is setup by somebody who has done it before, you'll avoid a big learning curve.
If you want a figure to put into your report, then allow 3 days for the initial integration (allows for learning curve) and then a half-day for each additional one. That's generous, but provides extra time to debug problems.
To some degree, it depends on two factors:
Where is the data's source of truth?
How often do you want to sync the data?
If the answers are "it's a weird place and I only need to sync it once," then you probably want to figure out how to get it in CSV form and then use tools built into Salesforce to import it.
However, if the data lives in a database or data warehouse (postgres, mysql, mongo, redshift, snowflake, big query, etc) and especially if you want to keep Salesforce up to date with that source of truth continuously, then you could look into so-called "Reverse ETL" tools made for this purpose.
Costs depend on the tool chosen and the data volumes and other factors, but here are some options:
Grouparoo is an open source Reverse ETL tool. You can host it yourself for free. Paid plans start at $150/month.
Census is a SaaS Reverse ETL tool. Paid plans start at $300/month.
Hightouch is a SaaS Reverse ETL tool. Paid plans start at $350/month.