Looking for alternatives to the database project - visual-studio

I have a fairly large database project which contains nine databases and one database with a fairly large schema.
This project takes a very long time to build and I'm about to pull my hair out. We'd like to keep our database under source control, but I'm having a hard time getting the other devs to use the project and build it before checking in, simply because it takes so long to build.
It is seriously crippling our work, so I'm looking for alternatives. Maybe something can be done with Redgate's SQL Compare? I think the only drawback there is that it doesn't validate syntax? Anyone's thoughts/suggestions would be most appreciated.

Please consider trying SQL Source Control, which is a product designed to work alongside SQL Compare as part of a database development lifecycle. It's in Beta at the moment, but it's feature complete and it's very close to its full release.
http://www.red-gate.com/products/SQL_Source_Control/index.htm
We'd be interested to know how this performs on a commit in comparison to the time it takes for Visual Studio to build your current Database Project. Do you actually need to build the project so often in VS that it's a problem? How large is your schema and how long is an average build?

Keeping Dev/live db in sync:
There are probably a whole host of ways of doing this, I'm sure other users will expand further (including software solutions).
In my case I use a twofold approach:
(a) run scripts to get the differences between databases (stored procs, tables, fields, etc.); a sketch of such a script is at the end of this answer
(b) keep a strict log of db changes (NOT data changes)
In my case I've built up a semi-structured log over time, like so:
Client_Details [Alter][Table][New Field]
{
EnforcePasswordChange;
}
Users [Alter][Table][New Field]
{
PasswordLastUpdated;
}
P_User_GetUserPasswordEnforcement [New][Stored Procedure]
P_User_UpdateNewPassword [New][Stored Procedure]
P_User_GetCurrentPassword [New][Stored Procedure]
P_Doc_BulkDeArchive [New][Stored Procedure]
Ignore the tabbing; the markdown has messed it up.
But you get the general gist.
I find that 99% of the time the log is all I need.
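For point (a), a rough sketch of the kind of diff script I mean, assuming SQL Server and hypothetical database names DevDb and LiveDb:
-- Columns that exist in DevDb but not in LiveDb
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM DevDb.INFORMATION_SCHEMA.COLUMNS
EXCEPT
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM LiveDb.INFORMATION_SCHEMA.COLUMNS;
-- Stored procedures present in one database but not the other
SELECT ROUTINE_NAME
FROM DevDb.INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE'
EXCEPT
SELECT ROUTINE_NAME
FROM LiveDb.INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE';
Run it both ways round (swap the database names) to see what each side is missing.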

Related

SSIS Get List of all OLE DB Destinations in Data Flow

I have several SSIS packages that we use to load data from multiple different OLE DB data sources into our DB. Inside each package we have several Data Flow tasks that hold a large number of OLE DB Sources and Destinations. What I'm looking to do is see if there is a way to get a text output that holds all of the Destination flow configurations (Sources would be good too, but they're not top of my list).
I'm trying to make sure that all my OLE DB Destination flows are pointed at the right table, as I've found a few hiccups, without having to double-click on each Data Flow task and check that way; it just becomes tedious and is still prone to missing things.
I'm viewing the packages in Visual Studio 2013. Any help is appreciated!
I am not aware of any programmatic way to discover this data, other than building an application to read the XML within the *.dtsx package. Best advice: pack a lunch and have at it. I know for sure that there is nothing built in with respect to viewing and setting database tables (only server connections).
Though, a solution I may add once you have determined the list: create a variable (or variables) to store the unique connection strings and then set those connection strings inside the source/destination components. This will make it easier to manage going forward. In fact, you can take it one step further by setting the same values as parameters rather than variables, which has the added benefit of their being exposed on the server. This allows either you or the DBA to set the values as you promote through environments or change server nodes.
Also, I recommend rationalizing that solution into smaller solutions if possible. In my opinion, there is nothing worse than one giant solution that tries to do it all. I am not sure if any of this is helpful, but for what it's worth, I do hope it helps.
You can use the SSIS Object Model for your needs. An example can be found here; look in the method IterateAllDestinationComponentnsInPackage for the exact details. To start understanding the code, start in the Start method and follow the path.
Caveats: make sure you use the appropriate Monikers and Class IDs for the Data Flow Tasks and your Destination Components. You can also use this for other Control Flow Tasks and Data Flow Components (for example, Source Components, which seem to be your other need). Just keep in mind the appropriate Monikers and Class IDs.

Rails, how to migrate large amount of data?

I have a Rails 3 app running an older version of Spree (an open source shopping cart). I am in the process of updating it to the latest version, which requires me to run numerous migrations on the database to make it compatible. However, the app's current database is roughly 300 MB, and running the migrations on my local machine (Mac OS X 10.7, 4 GB RAM, 2.4 GHz Core 2 Duo) takes over three days to complete.
I was able to decrease this time to only 16 hours using an Amazon EC2 instance (High-I/O On-Demand Instances, Quadruple Extra Large). But 16 hours is still too long as I will have to take down the site to perform this update.
Does anyone have any other suggestions to lower this time? Or any tips to increase the performance of the migrations?
FYI: using Ruby 1.9.2, and Ubuntu on the Amazon instance.
Dropping indices beforehand and adding them again afterwards is a good idea (see the sketch at the end of this answer).
Also replacing .where(...).each with .find_each and perhaps adding transactions could help, as already mentioned.
Replace .save! with .save(:validate => false): during the migrations you are not handling random input from users, you should be making known-good updates, and validations account for much of the execution time. Alternatively, .update_attribute also skips validations where you're only updating one field.
Where possible, use fewer AR objects in a loop. Instantiating and later garbage collecting them takes CPU time and uses more memory.
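A sketch of the index step mentioned in the first point, assuming PostgreSQL and a hypothetical index name (take the real names from schema.rb):
-- Before the heavy migrations: drop secondary indexes on the big tables
-- (keep primary keys and any unique constraints you rely on for correctness)
DROP INDEX IF EXISTS index_line_items_on_order_id;
-- ...run the migrations...
-- Afterwards: recreate exactly what the new schema expects
CREATE INDEX index_line_items_on_order_id ON line_items (order_id);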
Maybe you have already considered this:
Tell the db not to bother making sure everything is on disk (no WAL, no fsync, etc.); you now have an effectively in-memory db, which should make a very big difference. (Since you have taken the db offline, you can just restore from a backup in the unlikely event of power loss or similar.) Turn fsync/WAL back on when you are done.
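What that looks like in practice, assuming PostgreSQL (the question doesn't say which database; MySQL has equivalents such as innodb_flush_log_at_trx_commit and sync_binlog). Only do this on a copy you can restore from backup:
-- In postgresql.conf (or via ALTER SYSTEM on newer versions):
--   fsync = off
--   synchronous_commit = off
--   full_page_writes = off
-- Then reload the configuration without a restart:
SELECT pg_reload_conf();
-- Run the migrations, restore the settings, and reload again before going live.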
It is likely that you can do some of the migrations before you take the db offline. Test this in a staging env, of course. That big user migration might very well be possible to do live. Make sure you don't do it in a transaction; you might need to modify the migrations a bit.
I'm not familiar with your exact situation, but I'm sure there are even more things you can do if this isn't enough.
This answer is more about approach than a specific technical solution. If your main criteria is minimum downtime (and data-integrity of course) then the best strategy for this is to not use rails!
Instead you can do all the heavy work up-front and leave just the critical "real time" data migration (I'm using "migration" in the non-Rails sense here) as a step during the switchover.
So you have your current app with its db schema and the production data. You also (presumably) have a development version of the app based on the upgraded Spree gems, with the new db schema but no data. All you have to do is figure out a way of transforming the data between the two. This can be done in a number of ways, for example using pure SQL and temporary tables where necessary, or using SQL and Ruby to generate insert statements. These steps can be split up so that data that is fairly "static" (reference tables, products, etc.) can be loaded into the db ahead of time, and the data that changes more frequently (users, sessions, orders, etc.) can be done during the migration step.
You should be able to script this export-transform-import procedure so that it is repeatable and have tests/checks after it's complete to ensure data integrity. If you can arrange access to the new production database during the switchover then it should be easy to run the script against it. If you're restricted to a release process (eg webistrano) then you might have to shoe-horn it into a rails migration but you can run raw SQL using execute.
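To make the idea concrete, a rough sketch with hypothetical table and column names, assuming the old and new schemas live on the same server (MySQL-style old_db/new_db qualifiers):
-- Ahead of time: load the fairly static data into the new schema
INSERT INTO new_db.products (id, name, price, created_at, updated_at)
SELECT id, name, master_price, created_at, updated_at
FROM old_db.products;
-- During the switchover window: move only what changes frequently
INSERT INTO new_db.orders (id, user_id, total, state, created_at, updated_at)
SELECT id, user_id, total, state, created_at, updated_at
FROM old_db.orders
WHERE updated_at > '2012-10-01 00:00:00'; -- placeholder: timestamp of the last pre-load
The same statements can be generated from Ruby and run with execute inside a migration if your release process demands it.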
Take a look at this gem.
https://github.com/zdennis/activerecord-import/
# Build the new records in memory first...
data = []
data << Order.new(:order_info => 'test order')
# ...then let activerecord-import insert them with a single multi-row INSERT
# instead of one INSERT (plus callbacks) per record.
Order.import data
Unfortunately the downvoted solution is the only one. What is really slow in Rails is the ActiveRecord models; they are not suited for tasks like this.
If you want a fast migration you will have to do it in SQL.
There is another approach, but you will always have to rewrite most of the migrations...

Optimizing Build in ClearCase Dynamic View

I'm trying to optimize my workflow as I still spend quite some time waiting for the computer when it should be the other way 'round IMO.
I'm supposed to hand in topical branches implementing a single feature or fixing a single bug, along with a full build log and regression test report. The project is huge, it takes about 30 minutes to compile on a fairly modern machine when compiling in a snapshot view.
My current workflow thus is to do all development work in a single snapshot view, and when a feature is ready for submission, I create a new dynamic view, merge the relevant changes from the snapshot and start the build/testing procedure overnight.
In a dynamic view, a full build takes about six hours, which is a major PITA, so I'm looking for a way to improve these figures. I've toyed with the cache settings, but that doesn't seem to make much difference. I'm currently pondering writing a script that will create a snapshot view with the same spec as the dynamic view, fetch the files into it and build there, but before I do that I wonder if there is a better way of improving my build times.
Can I somehow make MVFS cache all retrieved objects locally (I have lots of both hard disk space and RAM), ideally sharing the cache between multiple dynamic views (as I build feature branches, most files are bound to be identical between two different branches)?
Is there any other setting I could tune to speed up local builds?
Am I doing it wrong (i.e. is there a better workflow for me, considering that snapshot views take about one hour to create)?
Considering that you can have a dynamic view and a snapshot view with the same config spec, I would really recommend:
having a dynamic view ready for merge operation
then, once the merge is done, updating your snapshot view (no need to recreate it from scratch, which takes too much time. Just launch an update)
That way, you get the best of both worlds:
easy and quick merges within the dynamic view
"fast"(er) compilation within the snapshot view dedicated for that step.
Even if the config spec might have to change in your case (if you really have to use one view per branch), you still can change the config spec of an existing snapshot view (and still benefit from an incremental update), rather than recreating a snapshot view for each branch you need to compile on.

Benchmarking Oracle 10G on Windows XP

I am not a DBA. However, I work on a web application that lives entirely in an Oracle database (Yes, it uses PL/SQL procedures to write HTML to clobs and then vomits the clob at your browser. No, it wasn't my idea. Yes, I'll wait while you go cry.).
We're having some performance issues, and I've been assigned to find some bottlenecks and remove them. How do I go about measuring Oracle performance and finding these bottlenecks? Our unhelpful sysadmin says that Grid Control wasn't helpful, and that he had to rely on "his experience" and queries against the data dictionary and "v$" views.
I'd like to run some tests against my local Oracle instance and see if I can replicate the problems he found so I can make sure my changes are actually improving things. Could someone please point me in the direction of learning how to do this?
Not too surprisingly, there are entire books written on this topic.
Really what you need to do is divide and conquer.
The first thing is just to ask yourself some standard common-sense questions. For example: has performance degraded slowly, or was there a big drop in performance recently?
After the obvious, a good starting point is to narrow down where to spend your time; the top queries are a decent start. This will give you the particular queries that run for a long time.
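For example, a quick "top queries by elapsed time" list straight from the shared SQL area (needs access to V$SQL; column names are valid on 10g):
SELECT * FROM (
  SELECT sql_id, executions,
         ROUND(elapsed_time / 1000000, 1) AS elapsed_secs,
         buffer_gets, disk_reads,
         SUBSTR(sql_text, 1, 80) AS sql_snippet
  FROM v$sql
  ORDER BY elapsed_time DESC
) WHERE ROWNUM <= 20;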
If you know specifically which screens in your front end are slow and you know which stored procedures go with them, I'd put in some logging: simple DBMS_OUTPUT.PUT_LINE calls with some wall-clock information at key points. Then I'd run those interactively in SQL Navigator to see which part of the stored procedure is going slow.
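A sketch of that kind of logging; DBMS_UTILITY.GET_TIME returns hundredths of a second, and my_slow_step is a hypothetical stand-in for a section of the real procedure:
-- In SQL*Plus: SET SERVEROUTPUT ON first (GUI tools have their own DBMS_OUTPUT pane)
DECLARE
  t0 PLS_INTEGER;
BEGIN
  t0 := DBMS_UTILITY.GET_TIME;
  my_slow_step;  -- hypothetical: the chunk of work you suspect is slow
  DBMS_OUTPUT.PUT_LINE('my_slow_step took ' || (DBMS_UTILITY.GET_TIME - t0) / 100 || ' seconds');
END;
/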
Once you start narrowing it down, you can look at why a particular query is slow. EXPLAIN PLAN will be your best friend to start with.
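Basic usage, with a hypothetical slow query standing in for one you identified above:
EXPLAIN PLAN FOR
  SELECT u.username, COUNT(*)
  FROM users u JOIN orders o ON o.user_id = u.id
  GROUP BY u.username;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);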
It can be overwhelming to analyze database performance with Grid Control, and I would suggest starting with the simpler AWR report - you can find the scripts to generate it in $ORACLE_HOME/rdbms/admin on the db host. This report will rank the SQL seen in the database by various categories (e.g. CPU time, disk I/O, elapsed time) and give you an idea where the bottlenecks are on the database side.
One advantage of the AWR report is that it is a SQL*Plus script and can be run from any client - it will spool HTML or text files to your client.
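Roughly, from SQL*Plus (AWR needs the Diagnostics Pack licence; on Standard Edition the older STATSPACK scripts play a similar role):
-- Bracket the slow period with snapshots, then run the report script
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
-- ...reproduce the slow workload...
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
@awrrpt.sql  -- copied from $ORACLE_HOME/rdbms/admin; prompts for the two snapshot ids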
edit:
There's a package called DBMS_PROFILER that lets you do what you want, I think. I found out my IDE will profile PL/SQL code, as I would guess many other IDEs do. They probably use this package.
http://www.dba-oracle.com/t_dbms_profiler.htm
http://www.databasejournal.com/features/oracle/article.php/2197231/Oracles-DBMSPROFILER-PLSQL-Performance-Tuning.htm
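A minimal DBMS_PROFILER session looks roughly like this (the profiler tables must be installed first, e.g. via $ORACLE_HOME/rdbms/admin/proftab.sql; my_slow_procedure is a hypothetical name):
EXEC DBMS_PROFILER.START_PROFILER('tuning run 1');
EXEC my_slow_procedure;
EXEC DBMS_PROFILER.STOP_PROFILER;
-- Time per line, slowest first (total_time is in nanoseconds)
SELECT u.unit_name, d.line#, d.total_occur,
       ROUND(d.total_time / 1000000000, 3) AS seconds
FROM plsql_profiler_units u
JOIN plsql_profiler_data d
  ON d.runid = u.runid AND d.unit_number = u.unit_number
ORDER BY d.total_time DESC;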
edit 2:
I just tried the Profiler out in PL/SQL Developer. It creates a report on the total time and occurrences of snippets of code during runtime and gives code location as unit name and line number.
original:
I'm in the same boat as you, as far as the crazy PL/SQL generated pages go.
I work in a small office with no programmer particularly versed in advanced features of Oracle. We don't have any established methods of measuring and improving performance. But the best bet, I'd guess, is to try out different PL/SQL IDEs.
I use PL/SQL Developer by Allround Automations. It's got testing functionality that lets you debug your PL/SQL code, and it may have some benchmarking feature I haven't used yet.
Hope you find a better answer. I'd like to know too. :)
"I work on a web application that
lives entirely in an Oracle database
(Yes, it uses PL/SQL procedures to
write HTML to clobs and then vomits
the clob at your browser"
Is it the Apex product? That's the web application environment now included as a standard part of the Oracle database (although technically it doesn't spit out CLOBs).
If so, there is a whole bunch of instrumentation already built into the product/environment (e.g. it keeps a rolling two-week history of activity).

Why do people use linq to sql?

Given the premise:
There are competent sql programmers
(corollary - writing sql queries is not an issue)
There are competent application developers
(corollary - there is a simple/strong/flexible architecture for handling connections and simple queries from code)
Why do people use linq to sql?
There is overhead added to each transaction
There is strong likelihood of performance loss for moderate-complex calculations (DBs are made for processing sets and calculations and had teams of engineers working out optimization - why mess with this?)
There is loss of flexibility (if you want to add another ui (non .NET app) or access method, you either have to put the queries back in the db or make a separate data access layer)
There is loss of security by not having a centralized control of write/update/read on db (for example, a record has changed - if you allow applications to use linq to sql to update, then you cannot prove which application changed it or what instance of an application changed it)
I keep seeing questions about linq to sql and am wondering if I'm missing something.
I keep seeing questions about linq to sql and am wondering if I'm missing something.
It's not that you're missing something. It's that you have something most shops don't have:
There are competent sql programmers
Additionally, in your shop those competent sql programmers prefer to write sql.
Here's a point by point response:
There is overhead added to each transaction
Generally true. This can be mitigated for many (but not all!) scenarios by compiling the query ahead of time, before it is needed, using CompiledQuery.
There is strong likelihood of performance loss for moderate-complex calculations (DBs are made for processing sets and calculations and had teams of engineers working out optimization - why mess with this?)
Either you're writing linq, which is translated to sql, and then a plan is generated from the optimizer - or you're writing sql from which a plan is generated by the optimizer. In both cases you are telling the machine what you want and it is supposed to figure out how to do it. Are you suggesting that subverting the optimizer by using query hints is a good practice? Many competent sql programmers will disagree with that suggestion.
There is loss of flexibility (if you want to add another ui (non .NET app) or access method, you either have to put the queries back in the db or make a separate data access layer)
A lot of people using linq are already SOA. The linq lives in a service. The non .NET app calls the service. Bada-bing bada-boom.
There is loss of security by not having a centralized control of write/update/read on db (for example, a record has changed - if you allow applications to use linq to sql to update, then you cannot prove which application changed it or what instance of an application changed it)
This is simply not true. You prove which application is connected and issuing sql commands the same way you prove which application is connected and calling a sproc.
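For example, on SQL Server the client names itself through the Application Name connection-string keyword, and that value is visible the same way whether the statements come from LINQ to SQL or from sproc calls:
-- Who is connected, and what client program they claim to be
SELECT session_id, login_name, host_name, program_name
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;
-- Inside a trigger or audit procedure the same value is available as
SELECT APP_NAME();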
Let me list a few points:
There are small or mid-sized software companies who develop their software in-house and would rather focus on hiring many application developers than on bringing in a freelance DB developer or permanently hiring one.
In most cases the overhead is a non-issue, either due to the amount of data to be processed or due to the low traffic. Besides, when used properly, LINQ to SQL can perform as fast as most SQL queries plus the associated .NET code.
Many companies just stick with the Microsoft stack, and they get to enjoy the integration. If some other company develops using SOA, there's just no problem. Nobody is forced to choose LINQ to SQL, and if they make that choice, how to integrate it is their problem. Nobody ever said LINQ to SQL is a silver bullet :)
I believe security is gained with LINQ to SQL, because I've bumped into lots of SQL queries taking in unescaped data via string concatenation, and explaining the whole parametrized-query idea has never been easy. Besides, since all queries are eventually translated into SQL, unless the tracking issue you describe has to happen via a stored procedure, there are again no problems at all.
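To illustrate that point: what LINQ to SQL sends to SQL Server is a parameterised batch, roughly of this shape (the exact text is generated, so treat this as an approximation), rather than a concatenated string:
-- Hand-concatenated (injectable):
--   'SELECT * FROM Users WHERE Name = ''' + @name + ''''
-- What the provider emits instead:
EXEC sp_executesql
  N'SELECT [t0].[Id], [t0].[Name] FROM [dbo].[Users] AS [t0] WHERE [t0].[Name] = @p0',
  N'@p0 NVARCHAR(50)',
  @p0 = N'Alice';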
I also believe your question can be posed more generally to address all ORMs and not just LINQ-to-SQL, and still most of what I said would hold true.
The problem is that it is very rare for somewhere to have a competent SQL developer who likes writing SQL and wouldn't rather be doing something else. I would consider myself competent in SQL, I used to do all my data access layers with stored procs or parametrized queries. Trouble is that it takes ages and is dull. I'd rather be writing great applications than messing around with data access layers that essentially have a select, insert, update and delete SQL statement(or proc) repeated dozens of times for each data object.
Linq-to-SQL takes away some of the repetitive nature. It has a tool to auto-generate your business objects from your database schema, and it gives you a nice integrated query language that is compile-time type-verified and lives in your code (stored procs are a pain to source control neatly).
I can write a DAL in Linq-to-sql several times faster than I can using plain SQL, stored procs or parametrized queries.
If you want to keep using stored procs, both Linq-to-SQL and the EF support stored procs for all their data access; you just have to set up the appropriate mappings. So you can still use your stored procs to log details and implement security if you want. We tend to opt for Windows auth, use that to restrict access to each table for the various users, and then have a bunch of triggers on the tables that track details for audit purposes.
Two things I will quickly note: firstly, the Entity Framework seems to be getting more support from MS at the moment, and I suspect it will be considered the default standard going forward, in preference to Linq-to-SQL. Secondly, in .NET 3.5 the EF and Linq-to-SQL do not have very good support for n-tier disconnected apps: in both you kind of have to muck around with either serializing data contexts across your disconnected tiers or manually detaching and re-attaching your data objects. This is much improved in .NET 4.0, though. Just something to consider depending on which version is available to you.
Existing question/answers in the same vein/spirit:
Doesn't Linq to SQL miss the point? Aren't ORM-mappers (SubSonic, etc.) sub-optimal solutions?
LINQ-to-SQL vs stored procedures?
What's wrong with Linq to SQL?
Why do I need Stored Procedures when I have LINQ to SQL
If using LINQ to SQL is there any good reason to learn SQL queries/syntax anymore?
https://stackoverflow.com/questions/216569/are-the-days-of-the-stored-procedure-numbered
I personally believe there's no right or wrong answer. It depends on what you're developing and how you're developing it. If you need razor-sharp performance, have an overly-complex data model, etc... skip the abstraction. If you feel the abstraction speeds up your development time, like the idea of capturing all application logic in a single codebase, etc... use it.
For me, it takes a lot less time to write linq to sql code than it does to write a bunch of stored procedures. That's especially true when the design isn't finished; in that case I don't yet know how much of the work I want to do on C# objects and how much I want to do in SQL.
So I can skip building datasets, and I don't have to click-click-click to add queries; basically, linq to sql means I can change my code in less time.
Also, as a big fan of Haskell, I can write lots of functional-style code with linq to sql and it just works.
I'm not saying this is an ideal solution or even a great example (it was the result of a high level constraint on the architecture, not something we necessarily would have chosen from scratch), but...
I worked on an app where the code was completely isolated from the database except through a set of exposed stored procs. The code could not "know" anything about the database schema except what was returned from the stored procs.
While this isn't that unusual, and it isn't too hard to write a DAL using ADO or whatever, I decided to try out Linq to Sql, even though I wouldn't be using it for its real intended purpose and wouldn't use most of the features. It turned out to be a great decision.
I created the Linq to Sql class, dragged the stored procs from server explorer onto the right side of the designer, then... Wait, there is no then. I was pretty much done.
Linq created strongly typed methods for each stored proc. For the procs that returned rows of data, Linq automatically created a class for the items in each row and returned a List<generatedClass> for them. I wrapped the calls themselves in a lightweight public DAL class that did some verification and some automatic parameter setting and I was done. I wrote a business object class and mapped the dynamically generated Linq class objects to the business object (did this by hand, but it isn't hard to do or maintain).
The program is now immune to any schema change that doesn't affect the stored procedure signatures. If the signatures do change, we just drag the old proc off the designer and drag it back on to regenerate the code. A few passes through the unit tests to make changes (which usually don't go higher than the public DAL interface) and it's done. Things upstream of the DAL use Linq to Objects techniques to select, filter, and sort data that isn't in the right format straight from the stored proc calls.
We have some excellent DBAs writing the stored procedures and an entirely different group writing the other code, so maybe it is a good example of why (and how) you can use LINQ in the scenario you describe.
One handy feature is the debugger picking up syntax errors in your query, compared to writing SQL statements as strings, where mistakes won't get picked up until runtime.
Plus I find LINQ statements easier to read than SQL.
It may be a case of convenience triumphing over performance. If you're programming at Facebook levels of uber-performance then you might think about every clock cycle, but the simple truth is that the majority of applications don't need this attention and benefit from efficiencies in code maintenance and dev time (a 100k contractor vs. another $erver).
That said, there's a case for offloading as much of the query processing as possible from the DB box in very-high-scale systems; otherwise the DB is the bottleneck and you need to shard or re-architect down the line. Costly.
I think it's fair to say that LINQ will scale better/easier, both in terms of servers and in terms of many-core, in that your LINQ codebase will get m-core for 'free' as soon as MS release C# 4.0.
I do see your point in asking, and as a non-ASP.NET dev just beginning a www project for the first time, I can't see the point of 80% of ASP.NET (themes, controls, etc.) - it seems I need to learn more and code more than the HTML itself! -- again, I'm sure there's a good reason for it all.
--- I haven't got the 50 pts to comment on the post I want to so I'm doing it here ---
David B suggests that writing some SQL is all there is to getting the most out of SQL Server and that using query hints is the steering mechanism. But the same task can be achieved in many different ways in SQL, some of them thousands of times faster. I suggest reading Inside T-SQL Querying (Itzik Ben-Gan). Over and over, Itzik shows how to rethink the query and use new commands to shrink the logical reads, sometimes from thousands to fewer than ten.
For very simple queries, the overhead of an extra layer adds to the roundtrip cost. For somewhat more complex queries in normal 'business app' scenarios, the optimizations done by the Linq-to-SQL expression->sql translation magic can often save a lot.
As an example, I recently did a 1:1 translation of a customer-supplied 1400+ (!) line stored proc to L2S. Not only did it go from 1400 lines of SQL to 500 lines of much more readable, strongly typed, and commented code. It also started hitting the database with an average of ~1500 reads instead of ~30k reads. This is before I even started looking at db-side optimizations - that saving is something I can 100% attribute to L2S's ability to eliminate predicates that can be evaluated client-side.
Simple answer: there are two approaches. Create exquisite Rube Goldberg contraptions, or just get the job done in a simple way. Many devs lean towards the former.
Developers get bored easily, and would often personally enjoy doing things a harder way that seems to provide a certain intellectual beauty. Are you developing an app or writing a PhD? As my msft director used to yell in the hallways, "I don't want another research project!"
please repeat after me (min 3x)
there is no silver bullet
I will not use a technology just because it's the latest thing from msft
I will not use something just to get it on my resume
Not only are there competent SQL coders; any decent app programmer, especially one writing LOB apps, should be able to write intermediate SQL. If you don't know any SQL and are writing LINQ to SQL, how are you going to debug your data calls? How are you going to profile them to fix bottlenecks?
We're trying out LINQ to SQL and I think there are major issues with it, such as:
There is no simple way to return the query results to another object. This in itself seems insane. Microsoft created the var keyword and anonymous types and recommends using them, but there is no way to move that data out of your local method, hence the OO paradigm breaks if you have to use the data in the same function that retrieved it.
Tables are NOT objects. Study up on 3rd normal form etc. Relational databases are for storing data, not using it. I don't want to be restricted or encouraged to use my tables as objects. The data I retrieve from the database will very often be joins of multiple tables, and may include SQL casts, functions, operators, etc.
There is no performance gain, and a slight loss
Now I have way more code to worry about. There are the dbml files, and there is still a DAL to actually write the LINQ in. Yes, a lot of it is machine-generated; that doesn't mean it's not there, it's something else that can go wrong (i.e. your dbml files, etc.).
Now that I've given the background, I will attempt to answer your actual question: why do people use LINQ to SQL?
It's the latest thing from Microsoft and I want it on my resume.
Msft has convinced managers/execs that it will decrease coding time.
Developers hate SQL. (No good dev environment or debugging except doing it manually; it would be nice to have better IntelliSense in a SQL tool.)
I encourage people not to jump on the bandwagon just because everyone else is, learn enough to put it on your resume, be willing to use it if forced to, but try and really understand the pros and cons first.
