ElasticSearch replication home/server


I am running a local ElasticSearch server from my own home, but would like access to the content from outside. Since I am on a dynamic IP and, besides that, do not feel comfortable opening up ports to the outside, I would like to rent a VPS somewhere, set up ElasticSearch on it and let this server be a read-only copy of the one I have at home.
As I understand it, this should be possible. However, I have not managed to create any usable setup that lets another server act as a read-only version of my home ES server.
Can anyone point me to a piece of information or write a guide that would help me set this up? I am fairly familiar with using ES, but my setup skills are still limited.

"As I understand it, this should be possible"
It might be possible with some workarounds, but it's definitely not built for that:
A cluster needs to live in one physical region, mainly because of latency and the stability of the network connection between nodes.
There is no read-only version of a node. You could restrict a node to read access (via a reverse proxy or the security plugin), but that's only a workaround.
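
As a rough illustration of the reverse-proxy workaround mentioned above, the sketch below shows the kind of check such a proxy could apply before forwarding a request. It is only a sketch under assumptions: the function name `is_read_only` and the set of POST endpoints treated as reads are choices made for this example, not an official list, and a real proxy would still need authentication and TLS on top.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: the only POST endpoints we treat as reads.
READ_ONLY_POST_ENDPOINTS = {"_search", "_count", "_msearch", "_mget"}


def is_read_only(method: str, url_path: str) -> bool:
    """Return True if the request looks like a read and may be forwarded."""
    method = method.upper()
    if method in ("GET", "HEAD"):
        return True
    if method == "POST":
        # Compare the last path segment, e.g. /my-index/_search -> _search.
        # urlsplit drops any query string before we look at the path.
        last_segment = urlsplit(url_path).path.rstrip("/").rsplit("/", 1)[-1]
        return last_segment in READ_ONLY_POST_ENDPOINTS
    # PUT, DELETE, PATCH and anything else can modify data, so block it.
    return False
```

A proxy would call `is_read_only(request_method, request_path)` and answer with an HTTP 403 whenever it returns False.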

Related

How to remotely connect to a local elasticsearch server - in a secure way ofc

I have been playing around with creating a web app that uses Elasticsearch to perform queries. Currently, everything is in development, thus on localhost; let's say Elasticsearch runs at 123.123.123.123:9200. All fun and games, but once the web application (React) is finished, it should be able to send its queries to the current local Elasticsearch database.
I have been reading around on how to get this done in a proper and, most of all, secure way. My summary so far:
"First off, exposing an Elasticsearch node directly to the internet without protections in front of it is usually bad, bad news." (see here: Accessing elasticsearch from a public domain name or IP).
Another interesting blog I found: https://code972.com/blog/2017/01/dont-be-ransacked-securing-your-elasticsearch-cluster-properly-107.
The problem with the above-mentioned sources is that they are a bit older, and thus I am not sure whether they are up to date.
Therefore the following questions:
Is nginx sufficient to act as a secure middleman, passing the queries from the end-users to elastic?
What is the difference at that point with writing a backend into the react application (e.g. using node and express)?
What is the added value taking into account the built-in security from elasticsearch (usernames, password, apikey, certificates, https,...)?
I am reading a lot about using a VPN or tunneling. I have the impression that these solutions are more geared towards a corporate-collaborative approach. Let's say I am running my front-end on a live server, I can use tunneling to show my work to colleagues, my employer. VPN would be more realistic for allowing employees -wish I had them, just a cs student here- to access e.g. the database within my private network (let's say an employee needs to access kibana to adapt something, let's say an API-key - just making something up here), he/she could use a VPN connection for that.
Thank you so much for helping me clarify the above-mentioned points!

TLS, authorisation and access control are free for the Elastic Stack, and have been for a while. I'd start by looking at the docs, as this is an easy way to secure your cluster natively.
nginx can be useful for rate limiting or for blocking specific queries, for example. However, it's another thing to configure and maintain.
From a client point of view, it would only really matter if you are using the official Elasticsearch clients and your nginx setup changes the way the API responds to the client (e.g. path rewrites, rate limiting).
The built-in security is free, it's native, and it's easy to manage via Kibana.
I'd follow the docs to secure Elasticsearch first and then see whether you need a VPN or tunnel at some point in the future. That would be handled outside Elasticsearch anyway, and you'd still want to secure Elasticsearch.
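
To make the built-in API key option a little more concrete, here is a minimal sketch of how a backend could attach an Elasticsearch API key to a request using only the Python standard library. The host name, index name and key values are placeholders invented for this example, and the request is only prepared, not sent. The header format (`ApiKey` followed by the base64 of `id:api_key`) is the one Elasticsearch documents for API key authentication.

```python
import base64
import urllib.request


def api_key_header(key_id: str, api_key: str) -> dict:
    """Build the Authorization header for Elasticsearch API key auth."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


def build_search_request(base_url: str, index: str, headers: dict) -> urllib.request.Request:
    """Prepare (but do not send) a GET request to the index's _search endpoint."""
    url = f"{base_url}/{index}/_search"
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder values, purely illustrative.
request = build_search_request(
    "https://es.example.com:9200",
    "my-index",
    api_key_header("my-key-id", "my-key-secret"),
)
```

Keeping the key in a backend like this, rather than in the React bundle, means the browser never sees the credential.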

Exposing Elasticsearch nodes directly to the internet makes them more vulnerable in principle. You should follow the rule of keeping the "surface" of your system on the internet as small as possible.
A good practice is to hide from the internet whatever doesn't need to be there, even if it is well protected. It takes about 20 minutes for an exposed service to start receiving attacks (see a showcase).
So I suggest you set up a private network, such as a traditional VPN or an SDP product such as Shieldoo Mesh.

Inter-Gear Communication for Openshift?

I'm trying to create an app such that gear 2, according to this model, can be accessed by gears 3, 4, ... n when using the --scaling option.
The idea is for this structure to be the head of a chain of relays. I'm trying to find where the relevant information is so that all the following gears have the same behavior. It would look like this:
I've found no documentation that describes how to reach gear 2 (the primary DNAS) with a URL (internal/external ip:port) or otherwise, so I'm a little lost as to how to let the app scale properly.
I should mention that so far I've only used bash scripting. I'm not worried about starting the program in other languages, so long as it follows that structure in OpenShift.
The end result is hopefully to create a scalable instance of SHOUTcast on OpenShift.
To Be Clear:
I'm developing a cartridge, not using the DIY one. All I understand of OpenShift is in this guide, but of course I'm limited because I'm new.
I'm stuck trying to figure out how to have the cartridge make additional gears use the first gear as a relay. I am not confused about how OpenShift routes requests externally to the gears and load balances them, nor about how to use port forwarding to connect to my app. The goal is to design the cartridge so that port forwarding isn't a requirement at all and only external routes are used.
The problem as described above is that additional gears need some extra configuration, they need an available source (what better than the first gear?). In fact the solution to my issue might be to somehow set up this cartridge to bypass haproxy with an external route that only goes to the first gear.
Github for those interested, pass it around, it'll remain public. Currently this works only as a standalone, scaling it (what I'd like to fix) causes issues. I've been working on this too long by myself, so have at it :)

There's a great KB article that explains how the routing works on OpenShift gears here https://help.openshift.com/hc/en-us/articles/203263674-What-external-ports-are-available-on-OpenShift-.
On a scalable application, haproxy handles all the traffic routing to your gears. The only way to access your gears is through the ports mentioned in the article above. rhc does, however, provide a port-forwarding option that would allow you to access things like mysql directly from your local machine.
Please note: We don't allow arbitrary binding of ports on the externally accessible IP address.
It is possible to bind to the internal IP with port range: 15000 - 35530. All other ports are reserved for specific processes to avoid conflicts. Since we're binding to the internal IP, you will need to use port forwarding to access it: https://openshift.redhat.com/community/blogs/getting-started-with-port-forwarding-on-openshift
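
As a small illustration of the port rule above, the following sketch refuses to listen on anything outside the 15000 - 35530 range before binding to the internal IP. The helper names and the way the internal IP is passed in are assumptions for this example; how a cartridge actually obtains its gear's internal IP is not shown here.

```python
import socket

# Port range quoted in the answer above for binding on the internal IP.
PORT_MIN, PORT_MAX = 15000, 35530


def is_allowed_port(port: int) -> bool:
    """Return True if the port lies inside the permitted internal range."""
    return PORT_MIN <= port <= PORT_MAX


def listen_on_internal_ip(internal_ip: str, port: int) -> socket.socket:
    """Open a listening TCP socket on the internal IP, or raise ValueError."""
    if not is_allowed_port(port):
        raise ValueError(f"port {port} is outside {PORT_MIN}-{PORT_MAX}")
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((internal_ip, port))
    sock.listen()
    return sock
```

A relay process on the first gear could call `listen_on_internal_ip(ip, 15001)`, where both the IP and the port number are placeholders.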

Setting up Mongo DB and hosting

Recently I stumbled across mongoDB, couchDB etc.
I am hoping to have a play with this type of database and was wondering how much access to the hosting server one needs to get it running.
If anyone has any knowledge of this, I would love to know whether it can be set up to work when your app is hosted via a 'normal' hosting company.

I use Mongo, and so I'm really only speaking for Mongo, but your typical web hosting environment wouldn't allow you to set up your own database. You'd want root-level (admin) access to the server to set up Mongo. To get that, you'd want something like a VPS or a dedicated server.
However, to just play around with Mongo, I'd recommend downloading the binary for your OS and giving it a run. Their JavaScript shell interface is very easy to use.
Hope that helps!
Tim

Various ways:
1) There are many free MongoDB hosting services available. Try DotCloud.com. Many others are listed here: http://www.cloudhostingguru.com/mongoDB-server-hosting.php
2) If you are asking specifically about shared hosting, the answer is mostly no. But if you can run MongoDB somewhere else (like one of the services from the link above) and want to connect to it from your website, that is probably possible if your host allows you to install your own extensions (for PHP).
3) VPS

How about virtual private server hosting? The host gives you what looks like an entire machine... hard drive, CPU, memory. You get to install whatever you want, since it's your (virtual) machine.

In terms of MongoDB, like others have said, you need the ability to install the MongoDB software and run it (normally as a daemon). However, hosted services are just beginning to appear, such as MongoHQ. Perhaps something like this might be appropriate once it's out of beta (or if you request an invite).
It appears hosted CouchDB services are also popping up, such as couch.io or Cloudant. I personally have no experience with Couch so I can be less certain than with Mongo, but I'd imagine that again to run it yourself, you'd need to install the software (and thus require root access).
If you don't currently have a VPS or dedicated server (or the cloud-based versions of the aforementioned), perhaps moving your data out to a dedicated hosted service would be an ideal way to go to avoid the pain and expense of changing your hosting setup.

You can host your application and your database on different hosting servers.
For MongoDB you can use MongoHQ or MongoLab, which offer 0.5 GB of space for free.

1 A-record for every subdomain (10000+); any potential issues? Any other solution?

Most solutions I've read here for supporting subdomain-per-user at the DNS level are to point everything to one IP using *.domain.com.
It is an easy and simple solution, but what if I want to point the first 1000 registered users to serverA and the next 1000 registered users to serverB? This is the preferred solution for us, to keep down our software and hardware costs for clustering.
(Diagram quoted from the MS IIS site: http://learn.iis.net/file.axd?i=1101)
The most logical solution seems to be one A record per subdomain in the zone data files. BIND doesn't seem to have any size limit on zone data files; they are restricted only by the memory available.
However, my team is worried about the latency of getting a new subdomain up and ready, since creating a new subdomain consists of inserting a new A record and restarting the DNS server.
Is performance of restarting DNS server something we should worry about?
Thank you in advance.
UPDATE:
It seems like most of you suggest that I use a reverse proxy setup instead:
(Diagram: http://learn.iis.net/file.axd?i=1102; ARR is IIS7's reverse proxy solution)
However, here are the CONS I can see:
single point of failure
cannot strategically set up servers in different locations based on IP geolocation.

Use the wildcard DNS entry, then use load balancing to distribute the load between servers, regardless of what client they are.
While you're at it, skip the URL rewriting step and have your application determine which account it is based on the URL as entered (you can just as easily determine what X is in X.domain.com as in domain.com?user=X).
EDIT:
Based on your additional info, you may want to develop a "broker" that stores which clients are to access which servers. Make that public facing then draw from the resources associated with the client stored with the broker. Your front-end can be load balanced, then you can grab from the file/db servers based on who they are.
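
A minimal sketch of the two ideas above: reading the account name straight from the Host header, and a "broker" lookup that says which server holds that account. The base domain, the table contents and the function names are placeholders invented for this example. In practice the broker would be a database or service rather than an in-memory dict.

```python
from typing import Optional

BASE_DOMAIN = "domain.com"  # placeholder domain from the question

# Hypothetical broker table: which back-end server holds which account.
BROKER = {"alice": "serverA", "bob": "serverB"}


def account_from_host(host_header: str, base_domain: str = BASE_DOMAIN) -> Optional[str]:
    """Return X for a Host header of the form X.<base_domain>, else None."""
    # Drop an optional :port, normalise case, and strip a trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + base_domain
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    # Reject empty labels and nested ones such as "a.b.domain.com".
    if not label or "." in label:
        return None
    return label


def backend_for(host_header: str, broker: dict = BROKER) -> Optional[str]:
    """Look up the back-end server for the account named in the Host header."""
    account = account_from_host(host_header)
    return broker.get(account) if account else None
```

With the wildcard DNS entry in place, every request reaches the load-balanced front end, and `backend_for("alice.domain.com")` tells it where that user's data lives.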

The front-end proxy with a wild-card DNS entry really is the way to go with this. It's how big sites like LiveJournal work.
Note that this is not just a TCP-layer load balancer: there are plenty of solutions that will examine the host part of the URL to figure out which back-end server to forward the query to. You can easily do it with Apache running on a low-spec server with suitable configuration.
The proxy ensures that each user's session always goes to the right back-end server and most any session handling methods will just keep on working.
Also the proxy needn't be a single point of failure. It's perfectly possible and pretty easy to run two or more front-end proxies in a redundant configuration (to avoid failure) or even to have them share the load (to avoid stress).
I'd also second John Sheehan's suggestion that the application just look at the left-hand part of the URL to determine which user's content to display.
If using Apache for the back-end, see this post too for info about how to configure it.

If you use tinydns, you don't need to restart the nameserver if you modify its database and it should not be a bottleneck because it is generally very fast. I don't know whether it performs well with 10000+ entries though (it would surprise me if not).
http://cr.yp.to/djbdns.html
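
If you go the one-A-record-per-subdomain route with tinydns, the records can be generated from whatever table maps users to servers. The sketch below assumes the `+fqdn:ip:ttl` line format of the tinydns data file (an A record without a matching PTR). The function names, the TTL default and the example addresses are made up for illustration, and the generated lines would still need to be written to the data file and compiled with tinydns-data.

```python
import ipaddress
from typing import Dict, List


def tinydns_a_record(fqdn: str, ip: str, ttl: int = 300) -> str:
    """Format one A record as a tinydns data line: +fqdn:ip:ttl."""
    ipaddress.IPv4Address(ip)  # raises ValueError if ip is not valid IPv4
    return f"+{fqdn}:{ip}:{ttl}"


def records_for_users(assignments: Dict[str, str], base_domain: str) -> List[str]:
    """Build one A record per user subdomain, sorted by user name."""
    return [
        tinydns_a_record(f"{user}.{base_domain}", ip)
        for user, ip in sorted(assignments.items())
    ]


# Illustrative data only: two users on two different servers.
lines = records_for_users(
    {"bob": "192.0.2.20", "alice": "192.0.2.10"},
    "domain.com",
)
```

Adding a user then means appending one line and recompiling the data file, with no nameserver restart.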

Technical issues when switching to an unmanaged Virtual Private Server (VPS) hosting provider?

I'm considering moving a number of small client sites to an unmanaged VPS hosting provider. I haven't decided which one yet, but my understanding is that they'll give me a base OS install (I'd prefer Debian or Ubuntu), an IP address, a root account, SSH, and that's about it.
Ideally, I would like to create a complete VM image of my configured setup and just ship those bits to the provider. Has anyone had any experience with this? I've seen Jeff talk about something like this in Coding Horror. But I'm not sure if his experience is typical. I suppose it also depends on the type of VM server used by the host.
Also, do such hosts provide reverse-DNS? That's kinda useful for sites that send out e-mails. I know GMail tends to bounce anything originating from a server without it.
Finally, I'd probably need multiple IP addresses as at least a couple of the sites have SSL protection which doesn't work with name-based virtual hosts. Has anyone run into trouble with multiple IPs through VPS? I wouldn't think so, but I've heard whisperings to the contrary.

Slicehost (referral link, if you so choose) offers reverse DNS, multiple IPs ($2/month/IP), and Ubuntu/Debian (along with others). The only criterion it doesn't meet is the ship-a-VM one, but it does let you clone VMs you've set up in their system via snapshots. You could thus set it up once, then copy that VM as many times as you like.
If that's a sacrifice you're willing to make, I highly recommend them - they've had great customer service the few times I've needed to contact them, decent rates, and a great admin backend.

I like XenPlanet, their prices seem to be comparable, but they also allow you to purchase extras like added disk space. Not sure if they let you buy additional bandwidth.
I have used them for a number of different machines and found their service to be very good.
