I have to develop a web application (for the healthcare sector) in Python using a Big Data platform (a NoSQL database such as Elasticsearch, for example).
I want to know: what is the best Big Data platform for this situation?
Could someone help me?
You need to evaluate the NoSQL databases yourself. Write down the features you need and how much you need each of them, collect examples of how you want to query and fill the database, and then try out several NoSQL databases to see how they perform and how well you can work with them.
There is no "best" NoSQL database.
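One way to make such an evaluation concrete is a small weighted scoring sheet. The sketch below is only an illustration: the feature names, the weights, and the candidate ratings are hypothetical placeholders, not measurements of any real product, and you would replace them with the results of your own trials.

```python
from typing import Dict, List, Tuple

# Hypothetical requirement weights (higher = more important to the project).
WEIGHTS: Dict[str, int] = {
    "full_text_search": 5,
    "python_client": 4,
    "horizontal_scaling": 3,
    "backup_restore": 3,
}

# Placeholder ratings on a 0-5 scale. These are NOT benchmarks;
# fill them in after actually trying each candidate database.
RATINGS: Dict[str, Dict[str, int]] = {
    "candidate_a": {"full_text_search": 5, "python_client": 4,
                    "horizontal_scaling": 3, "backup_restore": 2},
    "candidate_b": {"full_text_search": 2, "python_client": 5,
                    "horizontal_scaling": 4, "backup_restore": 4},
}


def weighted_score(ratings: Dict[str, int], weights: Dict[str, int]) -> float:
    """Weighted average of the ratings; a missing feature counts as 0."""
    total_weight = sum(weights.values())
    return sum(w * ratings.get(f, 0) for f, w in weights.items()) / total_weight


def rank(all_ratings: Dict[str, Dict[str, int]],
         weights: Dict[str, int]) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs, best score first."""
    scored = [(name, round(weighted_score(r, weights), 2))
              for name, r in all_ratings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank(RATINGS, WEIGHTS):
        print(f"{name}: {score}")
```

The score only summarizes what you wrote down; the useful part of the exercise is being forced to state the requirements and to test each candidate against them.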
I want to start a project on Laravel and want to go with NoSQL. This project needs extensive search, and I was considering MongoDB, but I am not sure about its search capabilities.
A few related questions:
Is there enough support for NoSQL in case I get stuck somewhere?
Is NoSQL flexible enough for searching by arbitrary parameters?
If I need to import data from a previous project into NoSQL, will that be a challenge?
What about real time: does NoSQL support real-time use?
Thanks in advance.
We're considering Snowflake and want to understand how we could use it, and possibly other tools, to overcome one of our main problems - ETL! We currently use a legacy DWH with an ETL process consisting of SSIS and some views. This has all the common pitfalls of this methodology - most notably that it takes ages!
I was under the assumption that we'd move to an ELT model in Snowflake, so I started to research tools to do the 'T' part of it. However, I've just been listening to this podcast: https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/
It suggests that just slapping a SQL view over something and exposing it in, say, PowerBI or Tableau is enough for the T part of things!
Just wondering what people's experience was here?
- Do you do transformations just by writing a view in Snowflake?
- Do you use a third party tool specifically to address this need?
Secondary to this, for the Extraction and Loading, do you:
- Do this using Snowflake only
- Use a third party tool
I'm specifically interested if you do this to create some kind of timeseries in Snowflake from a non timeseries source. That's something we'd be keen to do.
This question is hard to answer without sounding opinionated, especially without knowing your use case. For what it's worth, here is what I think:
Don't stick views on top of your tables and expose them to a reporting tool unless you have a very simple setup. If you're considering a tool like Snowflake, you will probably want something more sustainable; the views-only approach can become prohibitive in terms of cost and of the complexity of your views.
Use a third-party tool to manage your ELT process. Your choice of tool will depend on your internal skills and cloud strategy; have a look at tools such as Stitch, Fivetran, etc. If you don't mind having on-premise technologies, why not stick with SSIS or use something like Apache Airflow (which requires up-skilling)?
Snowflake will not help you with the E of ELT; you will need a third-party tool, such as SSIS, to manage the extraction of data from your other systems. It will help with the L part: for this you can use Snowpipe or COPY commands, which are available within the Snowflake ecosystem. Snowflake will also help you share your data with external parties, which is really nice.
My organization has created a fairly complicated dimensional model in Snowflake using layers of SQL views, against which we can point our reporting tools. We use a separate replication tool for extraction from source systems and loading into Snowflake. Using views simplifies our approach in that we don't need to use an additional tool. It also makes managing the code easier than something like SSIS. For instance, we can search for code using the Snowflake interface or our version control tool instead of having to open individual SSIS packages.
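As a rough illustration of the layered-views idea, the sketch below builds a raw table, a cleaning view on top of it, and a reporting view on top of that, which also turns loosely structured rows into a daily series. It uses Python's built-in `sqlite3` purely as a stand-in so the example is runnable anywhere; Snowflake's SQL dialect and functions differ, and the table, view, and column names are hypothetical.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with one raw table and two layered views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Landing table: data is loaded as-is (the 'L' step), all text.
    CREATE TABLE raw_readings (device TEXT, reading TEXT, loaded_at TEXT);
    INSERT INTO raw_readings VALUES
      ('a', '10.5', '2024-01-01 08:00:00'),
      ('a', '11.5', '2024-01-01 17:30:00'),
      ('a', 'n/a',  '2024-01-02 09:00:00'),
      ('a', '12.0', '2024-01-02 10:00:00');

    -- Layer 1: drop rows that do not start with a digit, cast types.
    CREATE VIEW stg_readings AS
    SELECT device,
           CAST(reading AS REAL) AS reading,
           date(loaded_at)       AS day
    FROM raw_readings
    WHERE reading GLOB '[0-9]*';

    -- Layer 2: one row per device and day, for the reporting tool.
    CREATE VIEW daily_readings AS
    SELECT device, day, AVG(reading) AS avg_reading, COUNT(*) AS n
    FROM stg_readings
    GROUP BY device, day;
    """)
    return conn


def daily_series(conn: sqlite3.Connection) -> list:
    """Query the top-level view, as a reporting tool would."""
    return conn.execute(
        "SELECT day, avg_reading, n FROM daily_readings ORDER BY day"
    ).fetchall()


if __name__ == "__main__":
    for row in daily_series(build_demo()):
        print(row)
```

Because every layer is plain SQL text, the transformation logic can be kept in version control and searched, which is the maintenance benefit described above. The trade-off is that the views are recomputed at query time, so cost grows with the depth and complexity of the layers.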
I have a question about MongoDB. I want to switch from Oracle databases to MongoDB in order to have a more flexible structure. The goal of my project is to carry out consistent data analysis after setting up a MongoDB instance that can store my data in JSON format, for instance, or some useful logs about the requests my colleagues make to a web service.
What are your tips? What are the most efficient Java frameworks for building a solid database? Do I need to learn any other languages?
Feel free to make suggestions or give me your advice on how to start properly with this tool. I would be glad to hear any feedback on your experiences with data analytics for BI.
Thanks... :)
I read some posts here, but I didn't find a real answer.
Normally I work, and have worked, with ordinary SQL databases (MS SQL, MySQL) when developing applications (ERP, CRM, PPS, web shops, etc.). I have not had the chance to gain real experience with document-oriented databases in an actual business setting.
I have only tested MongoDB and CouchDB privately (hobby and experimental projects). The experience was good, but not enough to say "Yes, let's use it for business!", because I could not test it in a real environment.
But now there is a chance to build something from scratch, which could be a big start for a business.
So my questions:
Can I use Couchbase for a big business application that thousands of users would use? Is it fast enough, with good enough performance, to handle thousands of queries, requests/reports, etc.?
What do backup and restore look like?
Where are the limits of Couchbase?
Thank you for the answer.
In short, yes.
Your questions are too broad to fully address here. Couchbase has many real-world installations with clients doing production work at large scale. You can see several references, with write-ups of their use cases, on the Couchbase site. (Note that this is not a complete list of customers, only those that have agreed to have their use highlighted.) You will definitely recognize some names.
Is it possible to design a Twitter-like DB using SQL Server, one that will ensure high scalability and fast queries?
I am building a .NET platform that requires a model similar to Twitter's (User, Follower, Tweet), and I am looking into what will fit best in terms of fast queries and scalability.
Will it be possible using a relational DB, or is a graph DB much better?
SQL Server will most certainly be able to handle any load that you have. SQL Azure supports databases up to 150GB (though I hear you can get more if you ask). With Azure SQL Federation, you can scale out multiple databases on hundreds of nodes around the world.
As for choosing between a relational database like SQL Server and "NoSQL" variants like Azure Table Storage, it depends on your needs and on how structured your data is. Given that you'll probably do a lot of joins (querying for followers of users, tweets that someone should see, etc.), your best bet is to go with a relational DB. Even Facebook still uses MySQL, so you're not exactly in bad company using a relational DB.
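To show what those joins might look like, here is a minimal sketch of a User/Follower/Tweet schema and a timeline query. It is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for SQL Server so that it runs without any setup, and the table names, column names, and sample rows are hypothetical. The same relational shape carries over to SQL Server, though the DDL syntax and indexing options differ.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with users, follows and tweets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- One row per (follower, followee) pair.
    CREATE TABLE follows (
        follower_id INTEGER NOT NULL REFERENCES users(id),
        followee_id INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (follower_id, followee_id)
    );
    CREATE TABLE tweets (
        id         INTEGER PRIMARY KEY,
        author_id  INTEGER NOT NULL REFERENCES users(id),
        body       TEXT NOT NULL,
        created_at TEXT NOT NULL
    );
    -- Supports "latest tweets by these authors" lookups.
    CREATE INDEX idx_tweets_author_time ON tweets (author_id, created_at);

    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO follows VALUES (1, 2), (1, 3), (2, 3);
    INSERT INTO tweets VALUES
      (1, 2, 'hello from bob',   '2024-01-01T10:00:00'),
      (2, 3, 'hello from carol', '2024-01-01T11:00:00'),
      (3, 1, 'hello from alice', '2024-01-01T12:00:00');
    """)
    return conn


def timeline(conn: sqlite3.Connection, user_id: int, limit: int = 20) -> list:
    """Newest-first tweets written by the accounts that user_id follows."""
    return conn.execute(
        """
        SELECT u.name, t.body
        FROM follows AS f
        JOIN tweets  AS t ON t.author_id = f.followee_id
        JOIN users   AS u ON u.id = t.author_id
        WHERE f.follower_id = ?
        ORDER BY t.created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()


if __name__ == "__main__":
    print(timeline(build_db(), 1))
```

This computes the timeline at read time with two joins. Whether that is fast enough at your scale is something to measure on your own data rather than assume.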