Is a NoSQL database a good solution for a small application? - jdbc

I am developing an internal web application that needs a back end. The data stored is not really RDBMS-shaped. Currently it lives in XML documents that the application parses (with XQuery) to display HTML tables and other kinds of fields.
It is likely that I will have a few more different types of XML documents and CSV (comma-separated values) files coming up. Given the scenario, I could always back the data with a MySQL database, changing the process that generates the XML or CSV to insert straight into the database instead.
Is a NoSQL database a good choice in this scenario, or is MySQL still better? I do not see any need for clustering, high availability, or distributed processing.

Define "better".
I think the choice should be made based on how relational (MySQL) or document-based (NoSQL) your data is.
A good way to know is to analyze typical use cases. Better yet, write two prototypes and measure.
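A good first test of how relational the data really is: take one of the existing XML document types and try to express it as tables. A minimal sketch, assuming a hypothetical document type with a title, a timestamp, and repeating entry elements (all names here are invented for illustration):

    -- One table per document type, one child table per repeating element.
    CREATE TABLE report (
        id      INT AUTO_INCREMENT PRIMARY KEY,
        title   VARCHAR(255) NOT NULL,
        created DATETIME     NOT NULL
    );

    CREATE TABLE report_entry (
        id        INT AUTO_INCREMENT PRIMARY KEY,
        report_id INT NOT NULL,
        label     VARCHAR(255),
        value     TEXT,
        FOREIGN KEY (report_id) REFERENCES report (id)
    );

    -- The process that currently emits XML would insert rows instead:
    INSERT INTO report (title, created) VALUES ('Weekly status', NOW());
    INSERT INTO report_entry (report_id, label, value)
    VALUES (LAST_INSERT_ID(), 'owner', 'jsmith');

If this mapping falls out naturally, MySQL will serve you fine; if every document type needs a different, awkward set of tables, that is a point in favour of a document store.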

Related

How to retrieve the data from database without using apache jackrabbit datastore?

I have integrated Jackrabbit with an Oracle database and I am storing the data using Jackrabbit. If I don't want to retrieve the data using Jackrabbit, in what way can I get at the data? In the database the data is stored as BLOBs.
The way Jackrabbit stores the data in the DB is an implementation detail, and it does not magically map the content into a "nice" DB schema, if that's what you mean. (The hierarchical nature and all the JCR features make this impossible.) It's a bit like having a Unix file system and then asking how to read the low-level inodes directly out of the file system implementation - you really should not.
Last but not least, note that while Jackrabbit is running, nothing else (except for a Jackrabbit cluster setup) may write to the tables it uses in the DB, as this will easily lead to data corruption.
As @TedTrippin already mentioned above, an ORM framework would make things much easier. But if you really want to do it manually in Oracle, the approach would be:
Study the code of the OCM (http://jackrabbit.apache.org/jcr/object-content-mapping.html), then fetch the content from Oracle following the logic of its associations and relations, probably not in one query but in multiple queries per document; possibly with user-defined functions, which Oracle supports and which might make things easier.
It would be interesting to know the background of your question. You tagged it with "Spring" and "CMS". I don't see any reason why you would want to access the data directly from Oracle; it's tedious. In case you want to provide an API for the content to an external system, or in case you have lost a CMS that once sat in front of the Jackrabbit repo and used it purely as a content store, you could still use such an ORM/OCM framework standalone to make it easier to access the data.

Multidimensional data types

So I was thinking... Imagine you have to write a program that would represent a schedule of a whole college.
That schedule has several dimensions (e.g.):
time
location
individual(s) attending it
lecturer(s)
subject
You would have to be able to display the schedule from several standpoints:
everything held in one location in a certain timeframe
everything attended by an individual in a certain timeframe
everything lectured by a certain lecturer in a certain timeframe
etc.
How would you save such data, and yet keep the ability to view it from different angles?
The only way I could think of was to save it in every form you might need it in:
E.g. you have a folder "students" and in it each student has a file that contains when and why and where he has to be. But you also have a folder "locations" and each location has a file that contains who has to be there, and when and why. The more angles you have, the more the size-per-information ratio increases.
But that seems highly inefficient, space-wise.
Is there any other way?
My knowledge of JavaScript is zero, but I wonder if such things would be possible with it, even in this space-inefficient form.
If not that, I wonder if it would work in any other standard language (C++, C#, Java, etc.), primarily in Java...
EDIT: Could this be done using a MySQL database?
Basically, you are trying to first store data and then present it under different views.
SQL databases were made exactly for that: on one side you build a schema and instantiate it in a database to store your data (the language for this is called the Data Definition Language, DDL); then you make requests on it with the query language (SQL), which gives you what you call "views". There are even "view" objects in SQL databases, so these views can be built inside the database (rather than having to code the request in the user code).
MySQL can do that for sure. Note that it is also possible to compile an SQL engine to JavaScript (SQLite, for example) and use local web storage to store the data.
There is another aspect to your question: optimization of the queries. While SQL can do most of the request work for your views, it is sometimes preferable to create actual copies of the request results in so-called "datamarts" (this is called de-normalizing a request), so that the hard work of selecting or computing aggregate/group functions and so on is done once per period of time (imagine that a specific view changes only on Mondays); requesters then just have to read these results. It is important in this case to separate, at least semantically, what is primary data from what is secondary data (and for performance/user-rights reasons, physical separation is often a good idea).
Note that since you cited MySQL I wrote about SQL, but almost any database technology could do what you are trying to do (hierarchical, object-oriented, XML...), as long as the particular implementation you use is flexible enough for your data and requests.
So in short:
I would use a SQL database to store the data
make appropriate views / requests (see the sketch below)
if I need huge request performance, make appropriate de-normalized data available
the language is not important here; any will do
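As for the EDIT: yes, MySQL can do this. A minimal sketch of the schedule example (all names are invented for illustration, and it assumes a single lecturer per event for brevity):

    -- One normalized table holds every scheduled event exactly once.
    CREATE TABLE schedule_event (
        id        INT PRIMARY KEY,
        subject   VARCHAR(100) NOT NULL,
        location  VARCHAR(100) NOT NULL,
        lecturer  VARCHAR(100) NOT NULL,
        starts_at DATETIME     NOT NULL,
        ends_at   DATETIME     NOT NULL
    );

    -- Attendance is separate because many individuals attend one event.
    CREATE TABLE attendance (
        event_id INT NOT NULL,
        student  VARCHAR(100) NOT NULL,
        PRIMARY KEY (event_id, student),
        FOREIGN KEY (event_id) REFERENCES schedule_event (id)
    );

    -- Angle 1: everything held in a location.
    CREATE VIEW events_by_location AS
    SELECT location, starts_at, ends_at, subject, lecturer
    FROM schedule_event;

    -- Angle 2: everything attended by an individual.
    CREATE VIEW events_by_student AS
    SELECT a.student, e.starts_at, e.ends_at, e.subject, e.location
    FROM attendance a
    JOIN schedule_event e ON e.id = a.event_id;

    -- Everything one student attends in a certain timeframe:
    SELECT * FROM events_by_student
    WHERE student = 'Alice'
      AND starts_at >= '2012-01-09 00:00:00'
      AND ends_at   <  '2012-01-14 00:00:00';

Each fact is stored exactly once; each "angle" is just a different query or view over the same rows, so the size-per-information problem disappears.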

ColdFusion improving performance of queries within loops

I've got a database setup that is a bit on the complicated side, with several many-to-many tables.
I'm trying to generate an XML document from this data. There's a bit of checking involved, e.g. if a name is not defined in one language, try to get the name from another language (instead of showing null).
The problem I have is that there are a lot of queries within loops.
Are there any guidelines for this, like what to stay away from and what to use, to improve the performance?
cfoutput, cfloop, cfquery?
If the looping logic is basically doing data processing, e.g. deciding, based on the values from the first query, what to go back to the database with for the next query, the best thing you can do for performance is to take all that logic out of your CF code and put it into the DB. Use the DB for data processing; use CF for handling the data once it's been processed and for converting it into output.
The only time CF should be doing data manipulation is if you need to process data from differing sources: e.g. the database, some remote service, the file system, a different database, etc. Basically, only if the database can't do the data processing itself should you be involving ColdFusion.
Regarding "if a name is not defined in one language, try to get the name from another language (instead of showing null)":
You should be able to do this in your query. Pretty much every DB out there has a COALESCE function, and they all support CASE constructs as well. You just have to pick the most appropriate method for your situation; a sketch follows.
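A minimal sketch, assuming hypothetical item and item_name tables where item_name is keyed by (item_id, language). One set-based query does the language fallback and replaces a per-row lookup loop:

    -- Join the names table once per language; COALESCE picks the first
    -- non-null name, with a literal as the final fallback.
    SELECT i.id,
           COALESCE(en.name, fr.name, '(unnamed)') AS display_name
    FROM item i
    LEFT JOIN item_name en ON en.item_id = i.id AND en.language = 'en'
    LEFT JOIN item_name fr ON fr.item_id = i.id AND fr.language = 'fr';

Run it once inside a single cfquery and loop over the result with cfoutput; the fallback logic never touches ColdFusion.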

Database or XML: which performs well on WP7?

In my app I have to store some data. I'm thinking of XML instead of a database, but I'm a little confused about which is faster. The data contains some URLs and some strings.
Please let me know whether XML or a database is better.
It depends on what kind of app you are trying to develop.
For something like a weather forecast app, you just need to save info for several provinces/cities; I think XML is better because it is easier to implement and maintain.
For something like a diary app, the data grows very fast, so a DB is better, because a large XML file would hurt performance.
I think these kinds of questions are more discussion-oriented and likely to be voted for closing.
Nevertheless, the performance depends on the size of the stored data.
While an XML file is small, it will generally perform better than the DB (considering the overhead you will need to go through while deploying a database, etc.).
But when you need to store a lot of structured data, the DB will win the race after all.
And since I think the phone is not a place for an RDBMS engine, I would go with XML storage on WP7 for now.
One of the things I've experienced with WP7 and the built-in database is that there's a bit more upfront performance cost to using the database engine than there is with straight Isolated Storage and XML. It was enough of a performance hit during application startup that there was a delay, apparent to the user, in populating their data.
I would say that for small amounts of data where you just need to read and display, XML is probably your best bet, but for data where you might have to do a lot of aggregating and grouping, it will probably wind up being easier to do with SQL, so you'll need to measure the trade-offs between performance and ease-of-coding/maintenance before you make your decision.
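To make the aggregating/grouping point concrete: a query like the one below is a single statement in SQL but fairly painful to compute by hand over a raw XML file. The bookmark table and its columns are invented for illustration:

    -- Count saved links per host, most popular first.
    SELECT host, COUNT(*) AS saved_links
    FROM bookmark
    GROUP BY host
    ORDER BY saved_links DESC;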

Best strategy for retrieving large dynamically-specified tables on an ASP.NET page

Looking for a bit of advice on how to optimise one of our projects. We have an ASP.NET/C# system that retrieves data from a SQL Server 2008 database and presents it in a DevExpress ASPxGridView. The data that's retrieved can come from one of a number of databases - all of which are slightly different and are being added and removed regularly. The user is presented with a list of live "companies", and the data is retrieved from the corresponding database.
At the moment, data is being retrieved using a standard SqlDataSource and a dynamically-created SQL SELECT statement. There are a few JOINs in the statement, as well as optional WHERE constraints, again dynamically-created depending on the database and the user's permission level.
All of this works great (honest!), apart from performance. When it comes to some databases, there are several hundreds of thousands of rows, and retrieving and paging through the data is quite slow (the databases are already properly indexed). I've therefore been looking at ways of speeding the system up, and it seems to boil down to two choices: XPO or LINQ.
LINQ seems to be the popular choice, but I'm not sure how easy it will be to implement with a system that is so dynamic in nature - would I need to create "definitions" for each database that LINQ could access? I'm also a bit unsure about creating the LINQ queries dynamically too, although looking at a few examples that part at least seems doable.
XPO, on the other hand, seems to allow me to create a XPO Data Source on the fly. However, I can't find too much information on how to JOIN to other tables.
Can anyone offer any advice on which method - if any - is the best to try and retro-fit into this project? Or is the dynamic SQL model currently used fundamentally different from LINQ and XPO and best left alone?
Before you go and change the whole way that your app talks to the database, have you had a look at the following:
Run your code through a performance profiler (such as Redgate's performance profiler), the results are often surprising.
If you are constructing the SQL string on the fly, are you following .NET best practices for string building, e.g. using a StringBuilder (or a single String.Concat/String.Format call) rather than repeated "str1" + "str2" concatenation in a loop? Remember, multiple small gains add up to big gains.
Have you thought about having a summary table or database that is periodically updated (say every 15 mins; you might need to run a service to update this data automatically) so that you are only hitting one database? New connections to databases are quite expensive. See the sketch after this list.
Have you looked at the query plans for the SQL that you are running? Today, I moved a dynamically created SQL string into a sproc (only 1 param changed) and shaved 5-10 seconds off the running time (it was being called 100-10000 times depending on some conditions).
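As a rough illustration of the summary-table suggestion above (all object names are invented; on SQL Server this could run as a SQL Agent job every 15 minutes):

    -- Rebuild a pre-aggregated summary so pages read one small table.
    TRUNCATE TABLE dbo.CompanySummary;

    INSERT INTO dbo.CompanySummary (CompanyId, OrderCount, LastOrderDate)
    SELECT c.CompanyId, COUNT(o.OrderId), MAX(o.OrderDate)
    FROM dbo.Company AS c
    LEFT JOIN dbo.[Order] AS o ON o.CompanyId = c.CompanyId
    GROUP BY c.CompanyId;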
Just a warning if you do use LINQ. I have seen developers who decided to use LINQ write more inefficient code because they did not know what they were doing (pulling 36,000 records when they needed to check for the existence of 1, for example). These things are very easily overlooked.
Just something to get you started on and hopefully there is something there that you haven't thought of.
Cheers,
Stu
As far as I understand, you are talking about the so-called server mode, when all data manipulations are done on the DB server instead of passing the records to the web server and processing them there. In this mode the grid works very fast with data sources that contain hundreds of thousands of records. If you want to use this mode, you should create either the corresponding LINQ classes or XPO classes. If you decide to use the LINQ-based server mode, the LINQServerModeDataSource provides the Selecting event, which can be used to set a custom IQueryable and KeyExpression. I would suggest that you use LINQ in your application. I hope this information is helpful to you.
I guess there are two points where performance might be tweaked in this case. I'll assume that you're accessing the database directly rather than through some kind of secondary layer.
First, you don't say how you're displaying the data itself. If you're loading thousands of records into a grid, that will take time no matter how fast everything else is. Obviously the trick here is to show a subset of the data and allow the user to page, etc. If you're not doing this then that might be a good place to start.
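If the grid is pulling whole result sets, server-side paging is the first thing to try. A minimal sketch for SQL Server 2008 using ROW_NUMBER() (OFFSET/FETCH only arrived in 2012), with invented table and column names; in the real system @PageNumber and @PageSize would be query parameters:

    DECLARE @PageNumber INT = 3, @PageSize INT = 50;

    -- Number the rows once, then return only the requested page.
    WITH numbered AS (
        SELECT CompanyName, OrderDate, Total,
               ROW_NUMBER() OVER (ORDER BY OrderDate DESC) AS rn
        FROM dbo.CompanyOrders
    )
    SELECT CompanyName, OrderDate, Total
    FROM numbered
    WHERE rn BETWEEN (@PageNumber - 1) * @PageSize + 1
                 AND @PageNumber * @PageSize;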
Second, you say that the tables are properly indexed. If this is the case, and assuming that you're not loading 1,000 records into the page at once but retrieving only subsets at a time, then you should be OK.
But if you're only doing an ExecuteQuery() against a SQL connection to get a dataset back, I don't see how LINQ or anything else will help you. I'd say that the problem is obviously on the DB side.
So to solve the problem with the database, you need to profile the different SELECT statements you're running against it, examine the query plans, and identify the places where things are slowing down. You might want to start with the SQL Server Profiler, but if you have a good DBA, sometimes just looking at the query plan (which you can get from Management Studio) is enough.
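A cheap way to start measuring before reaching for the Profiler: turn on per-query statistics in Management Studio and run the suspect SELECT; the Messages tab then reports reads and CPU time. The query below is a stand-in with invented names:

    SET STATISTICS IO ON;    -- logical/physical reads per table
    SET STATISTICS TIME ON;  -- parse/compile and execution times

    -- Stand-in for one of the dynamically built queries:
    SELECT c.CompanyName, o.OrderDate, o.Total
    FROM dbo.Company AS c
    JOIN dbo.[Order] AS o ON o.CompanyId = c.CompanyId
    WHERE o.OrderDate >= '2011-01-01';

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;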
