Version control for Tableau - etl

What's the best practice to control versions of Tableau projects?
If a change to a Tableau project requires changes in the database (in my case, Redshift) and in the ETL (in my case, a Python script), how can I version control all of them together so that I can roll back to a previous version if there is a problem?
Thanks!

EDIT - Tableau has added version control features to Tableau Server since the time that this answer was originally provided.
At present Tableau Server does not provide version control functionality. There are a few ideas on the Tableau Community forum requesting integration with version control software such as Git, or for version control to be baked into Tableau Server. Since Tableau workbooks are just XML files, one could use some form of source control software for workbooks stored on a shared drive, and restrict publishing permissions to a site/project admin.
In theory a script could tie all of these components together. If a particular version of a Tableau workbook were associated with a specific database and ETL change (although I'm not sure what part the Python script plays here), then the previous version of the workbook could be retrieved from source control and republished as part of a rollback.

Another way to get the ability to roll back to a previous version is to run the native Tableau backup command just before applying any project changes. This will provide a snapshot of the server state at the time of the change.
The format is tabadmin backup backupfilename
In Tableau 8.0 and earlier, the server must be stopped first, via tabadmin stop
So your existing DB and ETL change deployment mechanism could be extended to call the backup command, and use a backupfilename that has the build or release number appended in the filename.
Running a server backup like this may not be as heavyweight an operation as you think: if your workbooks use all live connections and no cached or uploaded data, the backup command is quick and should complete in several seconds.
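As a rough illustration of how a deployment script could call the backup with the release number in the file name, here is a minimal Python sketch. Only `tabadmin backup backupfilename` and `tabadmin stop` come from the answer above; the function names, the `tableau_backup` prefix and the file name sanitising are assumptions made for the example.

```python
import subprocess


def backup_command(release: str, prefix: str = "tableau_backup") -> list[str]:
    """Build the `tabadmin backup` call with the release number in the file name.

    The prefix and the character whitelist are illustrative choices, not
    anything Tableau requires.
    """
    safe_release = "".join(c if c.isalnum() or c in "._-" else "_" for c in release)
    return ["tabadmin", "backup", f"{prefix}_{safe_release}"]


def run_backup(release: str, stop_server_first: bool = False) -> None:
    """Take the snapshot before the DB/ETL/workbook changes are applied.

    Set stop_server_first=True on Tableau 8.0 and earlier, where the server
    has to be stopped before the backup. Restarting it afterwards is left to
    the calling deployment script.
    """
    if stop_server_first:
        subprocess.run(["tabadmin", "stop"], check=True)
    subprocess.run(backup_command(release), check=True)
```

The deployment script would call `run_backup("1.4.2")` as its first step, so that each release has a matching server snapshot to restore from.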

Related

Update database on different environments in Joomla3.9 project

We are working on a Joomla 3.9 project, have different environments, and use Git as our VCS, so every developer works on his own branch. It would be nice to have a database compare function like in TYPO3 or Contao (see the database differences after updating the project and apply the database changes with one click), or something like the Laravel migration system.
Any developer should be able to easily update his own local database after database changes were made, either due to an extension update via the backend or by another developer. And of course the staging or live system must be easy to update too. We don't want to execute SQL scripts with the changes in phpMyAdmin.
We have tried https://dbv.vizuina.com/ . This is not a complete solution; for example, there is no CLI support for starting the migration process from an update script on the server.
Does anyone have a solution or knows an extension that can solve this problem? Or can this be handled with core Joomla functions (maybe with a little adjustment)?
So far, I've seen three ways to execute modifications to one or more extension tables:
1: Use the extension revision control in the schema table. Add a new SQL file with a version number higher than the one in the schema table for this extension. Also increase the version in manifest.xml and zip the extension again.
Reinstall the extension via extension->manage->install, so that the new SQL file with the increased version number is executed.
2: Like the point above, but install the extension via the Joomla update mechanism (update server).
3: Create a new SQL file in the sql/ folder of the extension. No version name is needed for the new file, just update.sql or another filename. Execute this script in the update() method in script.php after the extension is installed again (in this case it's an update).
The third possibility might be interesting. It should be possible to trigger the update() method with a CLI command or function, so that the method can be triggered by a script on the server.
But how can I get the info, which update-scripts have already been executed? Let's say I have 3 update files in sql-folder. update-1.sql, update-2.sql and update-3.sql.
update-1.sql has already been executed. So I don't want to execute this sql-file again - only the other two.
The schema table is only used with the first two options. Is this information stored somewhere, or must I keep track myself of which update scripts have been executed?
How to version the database for extensions depends on whether these extensions are tightly coupled to the application or need to be reusable in other applications as well.
The latter case normally means that each extension accesses its own custom tables, in which case you should version the app database separately from the extensions.
App version history can be kept in a db_version table. Then an insert statement is added at the end of each update script (adding an incremental version number). e.g.
insert into db_version(version,author,description) values(003,'Verna.Collins', 'removing obsolete column');
If you also need to apply data migrations to extensions, you need to maintain a db_version_extensions table which keeps the version history for each of the extensions separately. e.g.
'001' 'extension1','Mandy.Aguilar','initial version'
'002' 'extension1','Mandy.Aguilar','adding extra column'
'001' 'extension2','Edna.Potter','initial version'
'002' 'extension2','Elvira.Townsend','dropping unused table'
..etc
Each extension zip should keep initial creation script and all sql-update files(which should normally not interfere with the rest of the app tables).
After pull it will be relatively easy to execute all the scripts with filename version greater than the last version number written in the database. This should be done for the app and for each extension separately.
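The "run everything newer than the last recorded version" step could look roughly like the Python sketch below. It is not Joomla functionality: it uses an in-process SQLite connection as a stand-in for the real database, assumes the `update-<n>.sql` naming from the question, and uses a simplified `db_version` table with hypothetical columns. Unlike the approach above, the runner records the version itself instead of relying on an insert statement at the end of each script.

```python
import re
import sqlite3
from pathlib import Path

# Assumed naming scheme, taken from the question: update-1.sql, update-2.sql, ...
VERSION_RE = re.compile(r"update-(\d+)\.sql$")


def pending_scripts(sql_dir: Path, current_version: int) -> list[tuple[int, Path]]:
    """Return (version, path) pairs newer than current_version, lowest first."""
    found = []
    for path in sql_dir.iterdir():
        match = VERSION_RE.search(path.name)
        if match and int(match.group(1)) > current_version:
            found.append((int(match.group(1)), path))
    return sorted(found)


def migrate(conn: sqlite3.Connection, sql_dir: Path) -> list[int]:
    """Apply every pending script once and record it in db_version."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS db_version "
        "(version INTEGER PRIMARY KEY, description TEXT)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM db_version"
    ).fetchone()[0]
    applied = []
    for version, path in pending_scripts(sql_dir, current):
        conn.executescript(path.read_text())
        conn.execute(
            "INSERT INTO db_version (version, description) VALUES (?, ?)",
            (version, path.name),
        )
        conn.commit()
        applied.append(version)
    return applied
```

Running `migrate` a second time applies nothing, which is the property the question asks for. For per-extension history, the same idea would need an extra extension column in the version table and one script folder per extension.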
Now if the extensions are tightly coupled to the app, it means that they might be using/updating tables of the app. For extensions of this type, you can add the updates as part of the application updates. These extensions could even be developed at the same repo, and be kept as directories instead of zip files.
Not sure if Joomla supports any tools for automating the process of performing incremental db updates, but a nice tool is Flyway, with ports for command-line, Maven and Gradle. See: how does flyway work

How can I perform a data compare on a VS 2013 SSDT project programmatically?

Visual Studio 2013 has a feature that allows for performing a data compare between your SSDT project and a target database.
According to another post here on SO, there are certain requirements with regards to performing such a compare.
Those requirements taken into consideration, I want to do something like this as a part of our build and deployment process:
Publish any DB schema changes to the target database(s) to make sure that source and target have exactly the same tables, columns, SP's, etc. to comply with the requirements mentioned in the link above
Run a data compare and generate an update script, or publish any changes in the source DB directly to the target DB
Currently, I have a script which takes care of bullet no. 1 by doing a schema compare, using a DACPAC, via sqlpackage.exe. It does not look like it is possible to perform a data compare using sqlpackage, though, and I have not found any other alternatives yet. In VS 2010 it was possible to run a data compare via the command window, but I have not seen any documentation regarding this in VS 2013...
Thus, my question is if there exists an API and/or other tools that allows for a data compare to be run programmatically through e.g. a Powershell script.
It appears you are correct, for schema diff there is command line support as long as SSDT is installed on disk (more details here), but there is no programmatic interface yet for data compare and update.

How to control changeset priority in tfs for automatic patches?

In our company we use TFS for source control of our SQL database versions. When developers change the database, they generate an equivalent script, put it in the SQL TFS project, and check it in with the related work item. After the build, we generate a patch for clients from these scripts, but before patching someone has to decide the priority of the checked-in scripts. I want this decision to become automatic, so my question is: how can the priority be specified at check-in time?
Sorry for my bad English. If you need more information to answer, let me know. Thanks.
Version handling of databases seems to be a never-ending problem. At a previous client, we gave the databases version properties, and then stored patch scripts in folders for each version, e.g. "Patches/2.0.10", "Patches/2.1.0". The patch scripts could then be executed in the same order as they were checked in (creation date).
Upon release, we ended up generating a complete patch script consisting of all those separate patches merged together (since the patches often affected the same data, they could be optimized) along with a new version number, allowing us to record what version any given database instance had.
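A minimal Python sketch of the folder-per-version layout, assuming folders named like "Patches/2.0.10" that contain `.sql` files. Check-in order is not visible on disk, so this sketch falls back to file name order inside each folder (for example a numeric prefix); that substitution, and all function names, are assumptions of the example. The merge is a plain concatenation and does none of the optimization mentioned above.

```python
from pathlib import Path


def version_key(name: str) -> tuple[int, ...]:
    """Turn '2.0.10' into (2, 0, 10) so that 2.0.10 sorts after 2.0.9."""
    return tuple(int(part) for part in name.split("."))


def patches_after(patch_root: Path, installed: str) -> list[Path]:
    """List patch files for every version folder newer than `installed`.

    Assumes every sub-folder name is a dotted numeric version.
    """
    folders = [
        p for p in patch_root.iterdir()
        if p.is_dir() and version_key(p.name) > version_key(installed)
    ]
    ordered = []
    for folder in sorted(folders, key=lambda p: version_key(p.name)):
        # File name order stands in for check-in order here.
        ordered.extend(sorted(folder.glob("*.sql")))
    return ordered


def merged_script(paths: list[Path]) -> str:
    """Concatenate the patches into one release script with source markers."""
    return "\n".join(
        f"-- {p.parent.name}/{p.name}\n{p.read_text().rstrip()}\n" for p in paths
    )
```

With this layout, the priority question largely reduces to a naming convention that developers follow at check-in time.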

Script to execute on CVS check-in, without access to the server?

Is it possible to write a script that executes certain instructions, and is triggered by any check-in to a CVS repository?
The script would scan the list of files in the change-set and do a copy operation on certain files in a certain sub-directory.
I would hopefully be able to execute various console applications, including ones written in .NET.
Problem is, I need this done quickly and I don't have access to the CVS server, due to corporate IT red-tape, etc.
Is there a way to set this up on one of the client workstations instead?
Can it be done without interfering with my working folder?
Can you get commit notifications by email as this blog shows? If so, you may be able to use maildrop (or good old procmail, etc.) to run arbitrary commands and scripts on your workstation when the commit notification mails arrive.
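For the mail-triggered approach, the script that the mail filter pipes each notification into could start out like the Python sketch below. The layout of CVS commit mails depends on how the server is configured, so the assumption here is only that changed file paths appear somewhere in the plain-text body; the `project/reports/` prefix and the function name are hypothetical.

```python
import email
import re
import sys


def changed_paths(raw_message: str, prefix: str) -> list[str]:
    """Extract paths under `prefix` from a commit notification mail.

    Assumption: file paths appear verbatim in the text/plain body.
    Duplicates are dropped while keeping first-seen order.
    """
    message = email.message_from_string(raw_message)
    text = ""
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            text += payload.decode(charset, errors="replace")
    matches = re.findall(rf"{re.escape(prefix)}[\w./-]+", text)
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    # The mail filter is expected to pipe one notification mail to stdin.
    for path in changed_paths(sys.stdin.read(), "project/reports/"):
        print(path)  # replace with the copy step or a console application call
```

The copy itself would work on a separate checkout or export rather than your working folder, so the two do not interfere.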
I found a .NET library that seems up to the task - SharpCVSLib.
http://csharpopensource.com/sharpcvslib.aspx
(Hopefully it will work on a developer workstation and not need to be hosted on the CVS server.)

How do you work on Oracle packages in a collaborative, version-controlled environment?

I'm working in a multi-developer environment in Oracle with a large package. We have a DEV => TST => PRD promotion pattern. Currently, all package edits are made directly in TOAD and then compiled into the DEV package.
We run into two problems:
Concurrent changes need to be promoted on different schedules. For instance, developer A makes a change that needs to be promoted tomorrow while developer B is concurrently working on a change that won't be promoted for another two weeks. When it comes promotion time, we find ourselves manually commenting out stuff that isn't being promoted yet and then uncommenting it afterwards...yuck!!!
If two developers are making changes at the same exact time and one of them compiles, it wipes out the other developer's changes. There isn't a nice merge; instead the latest compile wins.
What strategies would you recommend to get around this? We are using TFS for our source-control but haven't yet utilized this with our Oracle packages.
P.S. I've seen this posting, but it doesn't fully answer my question.
The key is to adopt a practice of only deploying code from the source control system. I'm not familiar with TFS, but it must implement the concepts of branches, tags, etc. The question of what to deploy then falls out of the build and release tagging in the source control system.
Additional tips (for Oracle):
it works best if you split the package spec and body into different files that use a consistent file pattern for each (e.g. ".pks" for package spec, and ".pkb" for package body). If you use an automated build process that can process file patterns then you can build all of the specs and then the bodies. This also minimizes object invalidations if you are only deploying a package body.
put the time in to configure an automated build process that is driven from a release or build state of your source control system. If you have even a moderate number of db code objects it will pay to be able to build the code into a reference system and compare it to your qa or production system.
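The "all specs first, then all bodies" ordering from the tips above can be sketched in a few lines of Python. The `.pks`/`.pkb` extensions are the ones suggested above; the function names and the idea of emitting a SQL*Plus driver script (one `@file` line per source file) are assumptions of this example, and paths containing spaces are not handled.

```python
from pathlib import Path

# Package specs (.pks) are compiled before package bodies (.pkb).
BUILD_ORDER = {".pks": 0, ".pkb": 1}


def build_order(source_dir: Path) -> list[Path]:
    """Return package source files with every spec ahead of every body."""
    files = [p for p in source_dir.rglob("*") if p.suffix.lower() in BUILD_ORDER]
    return sorted(files, key=lambda p: (BUILD_ORDER[p.suffix.lower()], p.name.lower()))


def driver_script(paths: list[Path], source_dir: Path) -> str:
    """Emit a SQL*Plus script that runs each file via the @ command."""
    return "\n".join(f"@{p.relative_to(source_dir).as_posix()}" for p in paths)
```

Pointing `build_order` at a checkout of a release tag, rather than at a developer's working copy, keeps the deployment driven by source control.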
See my answer about Tools to work with stored procedures in Oracle, in a team (which I have just retagged).
Bottom line : don't modify procedures directly with TOAD. Store the source as files, that you will store in source control, modify then execute.
Plus, I would highly recommend that each developer work on their own copy of the database (use Oracle Express, which is free). You can do that if you store all the scripts to create the database in source control. More insight can be found here.
To avoid 2 developers working on the same package at the same time:
1) Use your version control system as the source of the package code. To work on a package, the developer must first check out the package from version control; nobody else can check the package out until this developer checks it back in.
2) Don't work directly on the package code in Toad or any other IDE. You have no clue whether the code you are working on there is correct or has been modified by one or more other developers. Work on the code in the script you have checked out from version control, and run that into the database to compile the package. My preference is to use a nice text editor (TextPad) and SQL Plus, but you can do this in Toad too.
3) When you have finished, check the script back into version control. Do not copy and paste code out of the database into your script (see point 2 again).
The downside (if it is one) of this controlled approach is that only one developer at a time can work on a package. This shouldn't be a major problem as long as:
You keep packages down to a reasonable size (in terms of WHAT they do, not how many lines of code or number of procedures in them). Don't have one big package that holds all the code.
Developers are encouraged to check out code only when ready to work on it, and to check it back in as soon as they have finished making and testing their changes.
We use Oracle Developer Tools for Visual Studio.NET...plugs right into TFS
we do it with a Dev database for every stream, and labels for the different streams.
Our Oracle licensing gives us unlimited dev/test instances, but we are an ISV, you may have a different licensing option
You can use the Oracle developer tools for VS or you can use sql developer. SQL developer integrates with Subversion and CVS and you can download it for free. See here: http://www.oracle.com/technology/products/database/sql_developer/files/what_is_sqldev.html
We use Toad for Oracle with the TFS MSSCCI provider against TFS 2008. We use a Custom Tool that pulls database checkins from source control and packages them for release.
To my knowledge Oracle Developer Tools for Visual Studio.Net doesn't have any real source control integration with TFS or otherwise.
You might consider Toad Extensions for Visual Studio though it's not cheap, maybe $4k I think.
Another option is the Oracle Change Management Pack, but I believe it requires the Enterprise edition of Oracle, which is much more pricey.
You may be interested in Gitora www.gitora.com. It helps managing Oracle database objects with Git.
This article about collaborative development with the Oracle database can also be helpful: http://blog.gitora.com/plsql-how-to-develop-two-features-simultaneously-but-deploy-only-one/
Full disclosure: I am the developer and author of the article.

Resources