Can we monitor the elastic stack 6.0 and above(like elastic search..) without using the X-Pack?As we know many of the Features like security, machine learning, graph APIs don't be supported under BASIC(free Licence).
So I want to know if there are any APIs, without Licence limitation, can be used to implement those functionalities mentioned above?
All the information should be in the cluster APIs, you'll just lack the visualizations.
Monitoring (of the local cluster) is actually included in X-Pack Basic unlike the other features. Any reason you don't want to use it?
Alternatives include Kopf, Cerebro,... though you'll need to run them as a separate process and watch out for version compatibilities.
We've had success with ElasticHQ for Monitoring (requires python)
https://github.com/ElasticHQ/elasticsearch-HQ
And sentinl for setting up alerts/watchers (it is a plugin for kibana)
https://github.com/sirensolutions/sentinl/wiki
We have set up a reverse proxy to enable ssl/tls and use ubuntu user management to create logins, however, we do not limit access within Kibana itself.
We have little need for graph/machine learning so I am unaware of free alternatives.
The company I work for is heavily Open Source, so these projects suit us.
Related
I have been playing around with creating a webapp that uses elasticsearch to perform queries. Currently, everything is in production, thus on the localhost, let's say elasticsearch runs at 123.123.123.123:9200. All fun and games, but once the webapplication (react) is finished, the webapp should be able to send the queries to the current local elastic search db.
I have been reading around on how to get this done in a proper and most of all secure way. Summary of this all is currently:
"First off, exposing an Elasticsearch node directly to the internet without protections in front of it is usually bad, bad news." (see here: Accessing elasticsearch from a public domain name or IP).
Another interesting blog I found: https://code972.com/blog/2017/01/dont-be-ransacked-securing-your-elasticsearch-cluster-properly-107.
The problem with the above-mentioned sources is that they are a bit older, and thus I am not sure whether they are up to date.
Therefore the following questions:
Is nginx sufficient to act as a secure middleman, passing the queries from the end-users to elastic?
What is the difference at that point with writing a backend into the react application (e.g. using node and express)?
What is the added value taking into account the built-in security from elasticsearch (usernames, password, apikey, certificates, https,...)?
I am reading a lot about using a VPN or tunneling. I have the impression that these solutions are more geared towards a corporate-collaborative approach. Let's say I am running my front-end on a live server, I can use tunneling to show my work to colleagues, my employer. VPN would be more realistic for allowing employees -wish I had them, just a cs student here- to access e.g. the database within my private network (let's say an employee needs to access kibana to adapt something, let's say an API-key - just making something up here), he/she could use a VPN connection for that.
Thank you so much for helping me clarify the above-mentioned points!
TLS, authorisation and access control are free for the Elastic Stack, and have been for a while. I'd start by looking at the docs, as it's an easy way to natively secure your cluster
for nginx, it can be useful for rate limiting, or blocking specific queries for eg. however it's another thing to configure and maintain
from a client POV it would really only matter if you are using the official Elasticsearch clients, and you use nginx and make changes to the way the API would respond to the client (eg path rewrites, rate limiting)
it's free, it's native, it's easy to manage via Kibana
I'd follow the docs to secure Elasticsearch and then see if you need this at some point in the future. this would be handled outside Elasticsearch anyway, and you'd still want to secure Elasticsearch
The point in exposing Elasticsearch nodes directly to the internet is a higher vulnerability in principle. You should follow the rule of the least "surface" of your system on the internet.
A good practice is to hide from the internet whatever doesn't need to be there, although well protected. It takes ~20mins to get cyber attacks on any exposed service (see a showcase).
So I suggest you install a private network, such as a traditional VPN or an SDP product such as Shieldoo Mesh.
I am trying to build a prototype of Elasticsearch as a Service. I have thought of 2 different approaches and I'd like to get opinions towards one or the other implementation
One single installation of Elasticsearch, and a proxy layer on top to add user validation (http basic authentication + user account to validate the usage).
This approach would be relatively straight forward and the main challenge would be configure the cluster properly to handle the load, as well as the permissions so there are no data leaks of the users don't have access to the cluster management APIs.
Use Docker as a container and have one instance of elasticsearch for each user. In this case I would be providing the isolation by using the Linux container (Docker). I'd still need to manage authentication.
It probably would be good to implement both, play around and see how things behave. Any opinions about pros and cons of each approach?
Thanks!
Disclaimer: I am the founder of the Elasticsearch service provider Facetflow, which currently offers shared clusters.
I think that both approaches have merit, but maybe suited for different types of customers.
Looking at other SaaS providers, like MongoDB provider MongoLab, they essentially ended up offering both setups (although not using Docker).
So, pros and cons as I see them:
Shared Cluster
Most Elasticsearch as a Service providers operate this way.
Pros:
Far more affordable for the majority of users just looking for good search and analytics.
Simpler maintenance, less clusters for you to monitor
Potentially less versions of Elasticsearch to integrate with. If you need to communicate with other systems (which you do), write your own plugins (we did, for authentication, silos, entitlements, stats etc.) less versions will be far easier to maintain.
Cons:
Noisy neighbours have to be monitored and you have to scale and relocate indices to handle this.
Users have to choose from a limited list of versions of Elasticsearch, usually a single version.
Users don't get full cluster admin control.
Private Clusters using Docker
One provider that works this way is Found.
Pros:
Users could potentially be able to deploy a variety of versions of Elasticsearch
Users can have complete cluster admin access
Noisy neighbours don't affect their cluster, less manual intervention from you
Cons:
Complex monitoring and support. If people can do whatever they want (shut down the cluster over the api), you have to be clear where your responsibility as a provider ends, and what wakes you up at night.
Complex integration with multiple versions, see shared cluster pros.
More expensive since you have to allocate resources that might not always be used.
I am looking for a way to collect data remotely from various cloud instances (EC2, Rackpsace). The Rackspace API provides no way for collecting server performance metrics (ie load average, cpu usage, memory) via it's API, otherwise this would have never been asked.
I started looking at solutions like Capistrano or Mcollective (I have also considered collectd), but I am unsure of which one would best suit my application. I am trying to avoid using ssh keys for trending purposes (I don't want to have to keep logging in to collect these metrics) The script I am writing is a Ruby script which reboots a cloud server if it's load average is over a certain number. Because these providers don't expose these metrics via their API, I am looking at a way to gather them myself, and I am new to the Ruby community so after briefing over the documentation for all of these tools, I still haven't been able to get a sense of which framework would work best, or if there are other alternatives.
It sounds like Capistrano is more suited to be a deployment tool, although it can perform remote tasks, so after I read the documentation for that it was pretty much out for the purposes of my script.
MCollective looks really attractive for what I am trying to do but it seems I would have to write my own RPC style plugin for this purpose.
I've also considered plugging into some greater monitoring system such as Nagios, Munin, Zenoss, Hyperic, etc, but I'd rather not install some large bulk monitoring system when all I want to collect is but a few simple metrics.
If your intention is to trigger certain actions based on the system performance (like restarting when cpu usage is too high), you should check out god.
I'm not sure if this is also useful when you want to generate some performance statistics over a longer time period. Personally, I'm using Munin for this, but if you don't like it maybe you can find something on Ruby Toolbox | Server Monitoring.
I'm developing an app for filtering network connections from clients to my server (deny or allow to connect to my server).
I'm researching and found some resources like Windows Firewall API.
But I don't know if it's necessary for me or not.
What's the best API or solution to resolve it?
Thank so much.
regards,
Why don't you use an already-developed and proven app in the first place? If you really want to develop a filtering layer then what you need is a Filter driver and more specifically NDIS filter . A sample solution can be found here. But unless you are absolutely sure what you are doing and what you want to achieve I'd strongly suggest that you stick to an off-the-shelf solution - any firewall will be decent, or even a linux machine in front of your server with appropriate iptables rules.
Since you are working in a windows operating system. You would have to make use of Windows Filtering Platform as seen in the documentation on https://msdn.microsoft.com/en-us/library/aa366510.aspx
Drivers like TDS,LSP, and NDIS are all deprecated.
The programming language is C++. In my experience, it was a desktop application with the GUI in WxWidget and writing the filtering network connections hooks into the user mode.
There are two Filtering Layer Identifiers (Run-time Filtering Layer Identifiers and Management Filtering Layer Identifiers ), i made used of the earlier being that its more effective.
Should you need more assistance let me know.
Hy,
I'm implementing an IMAP client as a Mac OSX application using MacRuby.
For the sake of offline availability, I wanted to allow fulltext indexing and attribute based indexing of all messages. Attributes include common E-Mail stuff like from:, to:, etc...
This would allow for advanced results sprinkled with faceting, analytic calculations and such.
Now I'm unsure about the choices and good practices when it comes to integrating such a search feature. I have a strong web development background, therefore my intuitive action would be to setup a Solr server and start feeding it with data. This might just work theoretically, as I could write an Agent that manages the solr instance for my application in the background. But to me, this approach seems like an infrastructure hassle.
On the other side, I've read about people using the FTS3 functionality from SQLite. This approach is easily accessible by CoreData. I haven't used SQLite's FTS3 but I don't think it is as powerful as Solr can be.
What is your weapon of choice for a use case like mine?
I'm mainly interested in solutions that are actually in use by Objective-C/Cocoa/MacRuby developers.
In you're going to develop the app with Ruby give a try to picky. It is very simple to use.
There is an Objective-C Lucene port
http://svn.gna.org/viewcvs/etoile/trunk/Etoile/Frameworks/LuceneKit/
I have not used it, but in your situation, I'd at least check it out. In my experience, SQL based full text search can't compete with Lucene, but haven't tried SQLite for this.
EDIT: just noticed the ruby tag -- this started out as port of Lucene
https://github.com/dbalmain/ferret