Heroku and Elasticsearch - which add-on to use?

I plan to use Elasticsearch on Heroku and have been looking for the best Elasticsearch add-on.
Found was my first choice, for the following reasons:
It is now part of Elastic.
Elasticsearch on Heroku is exposed to the public internet, and Found introduced a secure wrapper for the transport client: https://github.com/foundit/elasticsearch-transport-module/
However, that repository does not appear to be actively maintained, and Elasticsearch 1.5 is the latest version it supports.
What is the recommended add-on, then?
If I want to use the latest version of Elasticsearch, am I stuck with an insecure connection?
Or should I use the official Java client instead?

Nick with Bonsai here. Based on your question, and my own obvious bias, I'll suggest Bonsai for the following reasons:
All of our clusters have SSL with basic auth to secure the connection. We feel pretty strongly that security comes as a standard feature.
We were the first hosted Elasticsearch provider, ever. (And one of the first add-on providers on Heroku, ever, with our first search add-on, Websolr.) So we've got plenty of experience hosting search, and thousands of other happy Heroku customers.
One definite tradeoff with using Bonsai is that we're generally always going to lag a bit behind the latest version of ES. As of this posting we're still running ES 1.7, but updates to ES 2.2 are just around the corner.
This is probably going to be true in the future as well. Part of the reason is that we're a small, bootstrapped company, and we have to be pragmatic about where we focus our engineering efforts. Plus, as an operations company serving thousands of businesses, we like to let major new releases spend a few months in the wild before we commit to supporting them.
We also work hard on providing managed upgrades, at least for versions that are sufficiently backwards compatible. Everyone has their tools for helping to manage upgrades, but I don't think any of the other providers do actual in-place upgrades.
Unless you have a hard requirement for a specific feature in 2.x (and if you do, please let me know), you may do fine on 1.7 until our 2.x support is fully baked. Drop us a line at info@bonsai.io to get whitelisted for the first release of that in the coming weeks.
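For readers wondering what "SSL with basic auth" means on the client side, here is a minimal Python sketch using only the standard library. It assumes the add-on hands you a single HTTPS URL with the credentials embedded (https://user:password@host), which is a common add-on convention but is an assumption here; the function name build_request and the environment variable in the usage note are hypothetical, not part of any provider's API.

```python
import base64
import urllib.request
from urllib.parse import unquote, urlsplit


def build_request(cluster_url: str, path: str = "/") -> urllib.request.Request:
    """Build an HTTPS request that carries the URL's credentials as Basic auth."""
    parts = urlsplit(cluster_url)
    if parts.scheme != "https":
        # Basic auth is only base64-encoded, so it must not travel in cleartext.
        raise ValueError("refusing to send credentials over a non-TLS connection")
    if parts.username is None or parts.password is None:
        raise ValueError("cluster URL must include user:password credentials")

    # Rebuild the URL without credentials; they go into the header instead.
    host = parts.hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    url = f"https://{host}{path}"

    # Credentials in a URL may be percent-encoded, so decode them first.
    user = unquote(parts.username)
    password = unquote(parts.password)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
```

Passing the result to urllib.request.urlopen, for example with build_request(os.environ["SEARCH_URL"], "/_cluster/health") where SEARCH_URL is a placeholder name, would send the query over TLS with the credentials in the Authorization header rather than over an open transport port.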

Related

Schema deployment management for Athena

In order to apply devops principles to data (ugh, dataops!), things like continuous deployment need to be considered.
That is why tools like dbDeploy exist. However, dbDeploy seems to have been orphaned and is no longer maintained. I have used it again and again in the past, but I don't see much support for it now, and I'm not sure why.
So I'm wondering what people use to manage and version their schemas. In particular, I'm looking for something that will work with Athena (which has a JDBC driver, so in theory any JDBC-compliant tool should do).
I know one answer may be to switch mindset and use the AWS Glue crawlers instead. But do people actually do that, or are the crawlers more for proof-of-concept and quick-start situations? I'm pretty sure you'll always want to override some of the decisions the crawler makes, so how can that be handled?

Tracking API/event changes between different microservice versions before deployment

I work in devops for a fairly large company that is in the process of transitioning to microservices. This is a new area for most people involved, and some of the governing requests seem like bad practice to me, but I don't have the expertise to argue otherwise.
The request is to generate a report before deploying that lists any new APIs or events (Kafka is our messaging service) in a microservice.
The recommended path is for developers to follow a style guide, and then to scrape the source code during the CI/CD pipeline to generate a report that can be compared with previous reports to identify any new APIs.
This seems backwards and unsustainable, but I've been unable to find another solution that would satisfy the request. I've recommended deploying to dev first and then using a tracing tool to identify any API changes or event subscriptions, but they insist on having the report before deploying.
I'm hoping for any advice on best practice to accomplish this.
Tracing and detecting version changes is definitely over-engineering. What's simpler, as @zenwraight has mentioned, is to version your APIs. Tracing through services to explore the different versions and schemas could work, but it requires much more up-front investment, and if that's not the company's bread and butter, I would rather use a vendor product that supports something like this.
If discovery is what's needed, I would recommend publishing internal API docs with a tool like Swagger, so that teams can search for an API they can consume.
Finally, to support moving between versions, I would recommend an API onboarding process for the services, so that teams can notify the consumers of a specific version when it is approaching the end of its lifecycle and they need to migrate to a newer one.
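If each service does publish a Swagger/OpenAPI document, the pre-deployment report from the question could in principle be a diff of two spec files rather than a scrape of the source code. The sketch below is one hypothetical way to do that in Python; it assumes the specs are already loaded as dictionaries with the standard top-level "paths" object, and the function names are made up for illustration. It covers HTTP operations only; Kafka events would need their own published description.

```python
from typing import List, Set, Tuple

# Keys of an OpenAPI path item that denote operations (others, such as
# "parameters" or "summary", are metadata and must be ignored).
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

Operation = Tuple[str, str]  # (HTTP method, path template)


def operations(spec: dict) -> Set[Operation]:
    """Collect every (method, path) pair declared under the spec's "paths"."""
    found: Set[Operation] = set()
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:
                found.add((key.upper(), path))
    return found


def new_operations(previous: dict, candidate: dict) -> List[Operation]:
    """Return operations present in the candidate spec but not the previous one."""
    return sorted(operations(candidate) - operations(previous))
```

Run in CI against the last released spec and the one in the branch being deployed, the returned list is the "new APIs" report, produced before anything is deployed.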

PostgreSQL from NodeJS application

I am exploring how best to access a PostgreSQL/PostGIS DB from NodeJS. All I need is simple SQL SELECT queries. Nothing more complex than:
SELECT *
FROM portal.catalog AS cat
WHERE ST_Intersects(st_geogfromtext('SRID=4326;POLYGON((20 50 ,19 50,19 49,20 50 ))'), cat.gpoly)
LIMIT 5000;
This will run on a Windows 7 or Windows Server 2008 machine with PostgreSQL 9.2/PostGIS 2.0. Traffic will be light (only a few requests per minute).
Some preliminary research I have done has come up with the following potential directions. But I was interested in hearing from others what is working for them (as an easy implementation).
https://github.com/brianc/node-postgres (but I am having trouble building it due to firewall issues). Supposedly the "pure" solution is better, but I am having issues there as well: https://github.com/brianc/node-postgres-pure
http://www.infoq.com/articles/the_edge_of_net_and_node (and then I guess I would write my own ADO.NET adapter to PostgreSQL)
I have also seen references to ODBC for NodeJS (unclear whether this is the way to go).
Is there something like the SQL Server adapter for NodeJS? http://blogs.msdn.com/b/sqlphp/archive/2012/06/08/introducing-the-microsoft-driver-for-node-js-for-sql-server.aspx
There was also a full-blown ORM by EntitySpaces (which went bankrupt). It is now a defunct open-source project: https://github.com/EntitySpaces/entityspaces.js
I've used node-postgres in the past, but recently opted for any-db, which has support for PostgreSQL.
Both have worked well, although I prefer any-db, particularly with respect to pooling and transactions. I believe any-db deserves more recognition.
Any-db is layered on top of BrianC's node-postgres.
But I just got https://github.com/brianc/node-postgres-pure working, and it is a pleasure.
EntitySpaces is the way to go, not defunct at all.
http://download.cnet.com/EntitySpaces-Studio/3000-10250_4-10590953.html?tag=mncol;1
I got BrianC's PostgresPure system working (a dependent module must have been malfunctioning, since I did not do anything special).
Works just great.
See: https://github.com/brianc/node-postgres-pure

Best way to rotate apache log files on Windows

Being new to the Windows server platform, I need some input on the best way to rotate the Apache log files. The server version is Apache/2.0.47 (Win32).
Apache ships with rotatelogs.exe. I found this (rather) old post, http://www.sitebuddy.com/Apache/Cat/Logging, which says:
Conclusion: It is unusable and dangerous (it will eat up all your
memory/file handlers ...etc...).
You can not even use rotatelogs.exe on 4+ sites, Apache will
lockup when starting (tested on Apache 2.2.0).
The same author has created a DLL, http://www.sitebuddy.com/mod_log_rotate, which I'm not too sure our hosting company will be happy to install on the production servers.
Since we are running this rather old version of Apache (which I'm stuck with, since it is really the IBM HTTP Server shipped with WebSphere), I'm wary of rotatelogs.exe. Is anyone aware of the best option to implement?
Our hosting partner decided not to allow rotatelogs.exe, since it has caused issues for other customers. The solution was therefore a PowerShell script that stops the service, rotates the logs, and starts the service again.
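The PowerShell script itself is not shown, so the following is only a sketch of the same stop, rotate, start sequence, written in Python for illustration. The service name, the log directory, and the date-suffix naming scheme are all assumptions; net stop and net start are the standard Windows service commands.

```python
import datetime as dt
import subprocess
from pathlib import Path
from typing import List


def rotate_logs(log_dir: Path, stamp: str) -> List[Path]:
    """Rename every non-empty *.log file to <name>.log.<stamp>."""
    rotated = []
    for log in sorted(log_dir.glob("*.log")):
        if log.stat().st_size == 0:
            continue  # nothing worth archiving
        # Appending the stamp after ".log" keeps archives out of the next glob.
        target = log.with_name(f"{log.name}.{stamp}")
        log.rename(target)  # on Windows this raises if the target already exists
        rotated.append(target)
    return rotated


def rotate_with_restart(service: str, log_dir: Path) -> List[Path]:
    """Stop the service, rotate its logs, and always try to start it again."""
    stamp = dt.date.today().isoformat()
    subprocess.run(["net", "stop", service], check=True)
    try:
        return rotate_logs(log_dir, stamp)
    finally:
        # Restart even if the rename failed, so the site does not stay down.
        subprocess.run(["net", "start", service], check=True)
```

The service has to be stopped first because Apache keeps its log files open, and Windows will not rename a file that another process holds open. The cost is a brief outage on every rotation, which may be acceptable if the job is scheduled at a quiet hour.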

Hosted full-text search options - IndexTank vs Solr vs Lucene

I am building an app using Ruby on Rails on Heroku and am confused about which full-text search option I should proceed with. A few things I care about:
Real-Time search: I am building a dynamic user-generated website.
Understands Rails Models: I would like to restrict search results based on who the user is (so, I don't really want "just" a site-wide search)
Additionally, something that is easy to configure on Heroku with Rails would be a bonus.
Heroku currently provides three options for full-text search: FlyingSphinx, Searchify IndexTank and WebSolr. Can anyone outline the pros and cons of each?
Based on my research, it seems that a lot of people have been happy with IndexTank. In particular, this blog post by Gautam Rege briefly outlines his experience with the three options and how he prefers IndexTank.
However, after LinkedIn's acquisition of IndexTank, some key components of IndexTank were open-sourced and the IndexTank service was discontinued. Searchify seems to be one of the first replacements for IndexTank (if not, currently, the only one). Does anyone have any experience using it? How does Searchify compare to IndexTank and the other two options, WebSolr and FlyingSphinx?
I'll address your question with regards to Searchify/IndexTank:
Searchify has true real-time indexing. The millisecond you add a document, it becomes searchable. No need to commit or reindex.
There is a Ruby client library for Searchify; the docs and download links are here: http://www.searchify.com/documentation/ruby-client
There is also a nice 3rd party client by kidpollo called Tanker that some Ruby folks prefer: https://github.com/kidpollo/tanker

Resources