Solr configuration and setup under Git source control - Windows

I've got Solr running as a service on Windows. I used NSSM (http://nssm.cc/) to set up the service to start automatically. The web server is Jetty.
I'd like to have my Solr directory under source control in Git because the configuration changes (and sometimes plugin changes) need to be picked up by all team members. At the very least, I'd like to have the configuration files (solrconfig.xml, schema.xml, stopwords.txt, etc.) under Git control, but ideally, I'd like to put the whole solr directory (including jar and war files) under Git control. Will this pose any problems? I can foresee us pulling commits and switching branches, all while the Solr service is running.
How have other teams configured Solr under source control?

The rule I go by is to check in only the configuration files (solrconfig.xml, stopwords.txt, dataconfig.xml, etc.).
There are reasons, IMHO, not to check the entire Solr directory into source control:
The Solr directory contains the index data as well as the configuration. Checking in the index is a bad idea because:
the size of the repo will grow;
your index isn't a data source. In most cases it relies on an external source, such as an RDBMS, to refresh itself, and you run a huge data-integrity risk when your database goes out of sync with your Solr index.
Only on development boxes do we have Solr and the consuming app deployed on the same machine; otherwise, setting up Solr is independent of the application deploy. Checking the Solr directory into source control would mean unnecessarily big repositories to deploy.
Rather than checking in the whole directory, we ended up with the config files checked in plus basic scripts to set up Solr, create the index, start an instance, etc. So every team member can check out the code base, run a couple of build tasks and get ready to party :)
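As a rough illustration of the "config files plus basic scripts" approach, here is a minimal Python sketch of one such script: it copies the versioned config files from a checkout into a Solr conf directory. The function name, the file list and the directory layout are assumptions made for the example, not part of the setup described above.

```python
import shutil
from pathlib import Path

# Hypothetical list of the files kept under version control.
CONFIG_FILES = ("solrconfig.xml", "schema.xml", "stopwords.txt")


def sync_solr_config(repo_conf_dir, solr_conf_dir, names=CONFIG_FILES):
    """Copy versioned config files into a Solr conf directory.

    Returns the names that were copied. Files missing from the
    repository directory are skipped rather than treated as errors.
    """
    source = Path(repo_conf_dir)
    target = Path(solr_conf_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        candidate = source / name
        if candidate.is_file():
            shutil.copy2(candidate, target / name)
            copied.append(name)
    return copied
```

Because only the config files are copied, the index data never touches the repository. Solr generally picks up changed configuration only after a core reload or a service restart, so a script like this would normally be followed by one of those.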

Related

Apache Nifi-registry deployment using git repo as flow repo

We would like to use NiFi Registry with Git as the storage engine. To do that, I modified providers.xml and was able to save the flows there.
Challenges:
There is no two-way sync. We can only save flows modified by a NiFi user; if we modify a flow directly in the Git location, the change is not reflected in NiFi Registry.
There is no review or approval process for NiFi Registry. A user has to log in to the NiFi Registry server, create a branch and issue a pull request.
As a workaround, we can delete the database file (H2) and restart NiFi Registry.
Lastly, everything should be automated in CI/CD, like what we do for a regular Maven project.
Any suggestions?
The purpose of the Git storage is mostly to let users visualize the differences through tools like GitHub, or any other tool that supports diffs; by pushing to a remote you also get a remote backup of the flow content. It is not meant to be modified outside of the application, just as you wouldn't bypass an application, go straight into its database and start changing data.

How to handle Infinispan cache creation and deployment

We have an Infinispan cluster serving as a cache server for our applications. Every time we need a new cache, we have to edit the config files and redeploy the cluster, which is problematic. For obvious reasons, we don't want to redeploy the cache cluster.
We can add the new cache definition through the web interface or the CLI, but that has the downside of not recording the configuration in a repo. Ideally, I want to add cache definitions in a way that is persisted in my code repo, so that in case of a disaster I can simply redeploy the cache cluster.
We looked into creating the cache definition from source code at application startup, but that doesn't seem to be possible.
Does anyone have an idea about the best practices for this issue?
After some R&D, this is what we found:
Programmatic creation of the caches is possible through the JCache implementation in Infinispan, but we could not find a way to configure it properly. The end result is just an empty cache definition with no properties.
What we ended up doing is creating the caches with the JBoss CLI: use a script to create the cache definitions, and commit that script to the version control system. This way you can recreate your cache server at any time by rerunning that script. The downside of this approach is that you need to install jboss-cli on the deploying machine (probably the CI server), which is very inconvenient. We decided to do this step manually for the time being.

How do I tell Sonar not to store the source code in the database?

I am considering two options:
The first is not to store the source code at all.
If that is not possible, how do I delete a project from the Sonar database?
I tried "sonar.importSources=false", but this does not work in Sonar 6.1 (the property was deprecated after version 4.5).
If I delete the project, will the source code remain in the database?
Storage of source code in the database can't be disabled, because it's used to display data in the web app.
Source code is indeed dropped from the database when a project is deleted.
This is late, but might be helpful for someone:
Sonar usually caches the project for performance via its squid mechanism and then, through a queue mechanism, stores the project data in its internal H2 database. That database can be swapped for one of the few supported databases, which gives you more advanced options for manipulating the data (fail-over, for example). I don't know of any way to avoid storing project data in the database.
Unless you have configured specific users, the default admin login for the Sonar dashboard is admin with the password admin. Log in to the console, navigate to Administration -> Projects -> Management, and delete any unnecessary projects. Once you do this, the Sonar dashboard will not show a deleted project again until you re-analyze it. To make sure this worked, after re-analyzing the project, click on it on the dashboard and check the version under Activity.
Additional info: if you modify the Maven project code, build the project first and then run sonar:sonar so that the latest modifications are reflected.
I agree with the other answer; I'm just elaborating in a few lines.

Heroku: Can I commit remotely

We have a CMS on Heroku, and some files were generated by the CMS. How can I pull those changes down? Can I commit the changes remotely and pull them down? Is there an FTP option of some kind?
See: https://devcenter.heroku.com/articles/dynos#ephemeral-filesystem
It's not designed for persistent file generation and usage.
In practice, it works like this: a user puts some code into a repository. That code is dynamically pulled into temporary Amazon EC2 instances and executed. The code can be moved from virtual machine to virtual machine, node to node, without disruption, across data centers. There is no real "place" to get the products of your code from the environment, because anything generated by the checked-out code can (and will) be destroyed as your deployed code skips around between the temporary machines.
That being said, there are some workarounds:
If your app includes something like a file browser within your deployed code, you can grab the (entirely) temporary files using that file browser and commit them back to your persistent code trunk.
Another option is to use something like S3 for persistent storage, with your application reading from and writing to a data storage service. Heroku will rewrite and destroy your local data on a frequent basis, but the external service will keep the files.
Similarly, you can change your application to use Heroku's Postgres for persistent data storage, or Amazon's RDS, etc.
Alternately, you can edit your application in such a way as to ensure that any files generated by it will be regenerated every time the code is refreshed, redeployed, and moved around.
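One way to make the external-storage workaround manageable is to route every generated file through a small storage interface, so the application never depends on the dyno's local disk. The sketch below is only an illustration: the `FileStore` interface, `InMemoryStore` and `publish_page` are hypothetical names, and the in-memory backend stands in for a real one built on S3 or a database.

```python
from typing import Dict, Protocol


class FileStore(Protocol):
    """Hypothetical interface the CMS would write generated files through."""

    def save(self, key: str, data: bytes) -> None: ...

    def load(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend for local runs and tests.

    A production backend would implement the same two methods on top of
    an external service such as S3 (not shown here).
    """

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def load(self, key: str) -> bytes:
        return self._blobs[key]


def publish_page(store: FileStore, slug: str, html: str) -> str:
    """Store a generated page under a stable key and return that key."""
    key = f"pages/{slug}.html"
    store.save(key, html.encode("utf-8"))
    return key
```

With this shape, swapping the backend does not touch the code that generates files, and nothing of value is lost when the ephemeral filesystem is wiped.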

Dev->Stage->Prod with Git deployment for Azure Websites

How best should I accomplish the following deployment objectives with Git deployment for Azure?
Easily switch when working locally to either use fake in-memory data or (eventually) non-production snapshot of real data
Deploy to staging environment on Azure such that at first I could use fake in-memory data and eventually move to non-production snapshot of real data.
Deploy to production with real data
I currently deploy using GitHub and a staging branch to a staging Azure website. Since I deploy from a public repo, the web.config file is ignored by Git. (EDIT: I just learned that ignoring web.config actually causes a deployment error on Azure.)
Any help/suggestion is appreciated.
It's actually supposed to be simpler than that. Please see this page. Basically, the idea is that you set some AppSettings in the Azure portal to override the default values that are committed to your repo.
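The override idea can be sketched in a few lines. This is Python rather than .NET, and the setting names are made up for the example; it assumes the hosting platform exposes the portal settings to the process as environment variables, which Azure App Service does.

```python
import os

# Defaults that are safe to commit; the names are hypothetical.
DEFAULTS = {
    "DATA_SOURCE": "in-memory",
    "CONNECTION_STRING": "",
}


def load_settings(environ=None):
    """Return the committed defaults, overridden by matching environment variables."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

Locally, `load_settings()` returns the committed defaults; on staging or production, any setting defined in the portal under the same name wins, so secrets never need to be committed.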
Well... Here's what I did that works for me right now.
To quickly switch to fake in-memory data locally, I use a compilation symbol LOCAL and a preprocessor directive #if LOCAL.
The same compilation symbol works when you deploy to Azure, so I can work on fake data until I'm ready to switch to the real db. I can also use the app settings if I want to make the switch easier.
The challenge was to keep a web.config with "secrets" (like the connection string) locally and not expose it on GitHub. I added it to .gitignore, but then my deployments started failing on Azure because it could not find the web.config. Just copying it to wwwroot via FTP did not help: Azure was looking for web.config in the repository.
So, to make this work I "slightly" altered the deployment process by first copying the Web.config from wwwroot to the repository before running the default deploy.cmd. This was simple - this is what you do:
Create a .deployment file in the root of your repository with the following:
[config]
command = deploy.my.cmd
Create deploy.my.cmd with the following script:
xcopy %DEPLOYMENT_TARGET%\Web.config %DEPLOYMENT_SOURCE%\ /Y
deploy.cmd
Now, I have web.config with secrets locally. Git ignores this file. I uploaded the correct web.config to Azure via FTP, and it gets used whenever I deploy.
