I have a web app which gets its data from a Solr instance (running in Tomcat).
Additional queries are done client-side with AJAX; the data is pulled directly from Solr. This gives users the option to perform any query they like, which is of course a huge security hole. It's not a particularly big issue for this particular app, but I'm curious how to fix it. How do you secure Solr when client-side AJAX calls are required? (Preferably I would solve this with PHP.)
Instead of querying Solr directly, you could create a simple PHP wrapper that limits the types of queries that are possible. The client then queries this PHP script, which in turn queries Solr. Once you've done that, you can restrict access to the Solr server to localhost, either through the firewall or in your Java application server.
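For illustration, here is roughly what such a wrapper could look like. I've sketched it in Node/Express, but the same pattern (whitelist the fields, build the query server-side, forward to Solr) translates one-to-one to a PHP script using curl. The port, core name, endpoint path and allowed fields are made-up examples:

```javascript
// search.js - minimal sketch of a query-restricting wrapper in front of Solr.
const express = require('express');
const app = express();

const SOLR_SELECT_URL = 'http://localhost:8983/solr/mycore/select'; // example core
const ALLOWED_FIELDS = ['title', 'author'];                          // field whitelist

app.get('/search', async (req, res) => {
  const { field, term } = req.query;

  // Only allow simple "field:term" queries on whitelisted fields.
  if (!ALLOWED_FIELDS.includes(field) || !term) {
    return res.status(400).json({ error: 'invalid query' });
  }

  // Build the Solr query server-side, so the browser never talks to Solr directly.
  // A real wrapper should also escape/quote Solr special characters in the term.
  const params = new URLSearchParams({
    q: `${field}:${term}`,
    wt: 'json',
    rows: '20',
  });

  const solrRes = await fetch(`${SOLR_SELECT_URL}?${params}`); // Node 18+ global fetch
  const data = await solrRes.json();
  res.json(data.response); // pass only the result documents back to the client
});

app.listen(3000);
```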
I have been playing around with creating a web app that uses Elasticsearch to perform queries. Currently everything is still in development and thus runs locally; let's say Elasticsearch runs at 123.123.123.123:9200. All fun and games, but once the web application (React) is finished, it should be able to send its queries to that same Elasticsearch database.
I have been reading up on how to do this properly and, above all, securely. A summary of what I have found so far:
"First off, exposing an Elasticsearch node directly to the internet without protections in front of it is usually bad, bad news." (see here: Accessing elasticsearch from a public domain name or IP).
Another interesting blog I found: https://code972.com/blog/2017/01/dont-be-ransacked-securing-your-elasticsearch-cluster-properly-107.
The problem with the above-mentioned sources is that they are a few years old, so I am not sure whether the advice is still current.
Therefore the following questions:
Is nginx sufficient to act as a secure middleman, passing the queries from the end users to Elasticsearch?
What is the difference, at that point, compared with writing a backend for the React application (e.g. using Node and Express)?
What added value does either approach have, given Elasticsearch's built-in security (usernames, passwords, API keys, certificates, HTTPS, ...)?
I am reading a lot about using a VPN or tunneling. My impression is that these solutions are geared more towards a corporate, collaborative setup. Say I am running my front end on a live server: I could use tunneling to show my work to colleagues or my employer. A VPN seems more realistic for allowing employees (wish I had them, I'm just a CS student) to access, for example, the database within my private network; say an employee needs to access Kibana to change something like an API key (just making something up here), he or she could use a VPN connection for that.
Thank you so much for helping me clarify the above-mentioned points!
TLS, authorisation and access control are free for the Elastic Stack, and have been for a while. I'd start by looking at the docs, as that's the easiest way to secure your cluster natively (a minimal config sketch follows these points).
As for nginx, it can be useful for rate limiting or for blocking specific queries, for example, but it's another thing to configure and maintain.
From a client point of view it only really matters if you are using the official Elasticsearch clients and nginx changes the way the API responds (e.g. path rewrites, rate limiting).
The built-in security is free, it's native, and it's easy to manage via Kibana.
Regarding VPNs and tunnels: I'd follow the docs to secure Elasticsearch first and see if you need them at some point in the future. That would be handled outside Elasticsearch anyway, and you'd still want to secure Elasticsearch itself.
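As a rough idea of what the native route involves, these are the kinds of elasticsearch.yml settings you end up with (setting names as of recent 7.x/8.x releases; check the docs for your exact version, and the certificates still have to be generated, e.g. with elasticsearch-certutil):

```yaml
# elasticsearch.yml - sketch of enabling the free, built-in security features
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: http.p12   # generated with elasticsearch-certutil
xpack.security.transport.ssl.enabled: true
```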
Exposing Elasticsearch nodes directly to the internet makes you more vulnerable in principle. You should follow the rule of exposing the smallest possible "surface" of your system to the internet.
A good practice is to hide from the internet whatever doesn't need to be there, even when it is well protected. It takes roughly 20 minutes for any exposed service to start receiving cyber attacks (see a showcase).
So I suggest you set up a private network, such as a traditional VPN or an SDP product such as Shieldoo Mesh.
I'm trying to build a small site that gets its data from a database (currently I use Firebase's Cloud Firestore).
I've built it using Next.js and planned to host it on Vercel. It looks very nice and was working well.
However, the site needs to handle ~1000 small documents: serve, search, and occasionally update them. To reduce calls to the database on every request, which is costly both in time and in database pricing, I thought it would be better if the server fetched the full list of items when it starts (or on the first request), held them in memory, and served data requests from that in-memory copy.
It worked well on the local dev server, but when I deployed it to Vercel, it didn't work. It seems Vercel forces me into serverless mode, where each request may run in a separate instance, so I can't rely on a shared in-memory cache.
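Conceptually, what I had in mind looks something like this in an API route (simplified sketch; fetchAllItemsFromFirestore stands in for the real Firestore query, and the title field is just an example):

```javascript
// pages/api/items.js - module-scope cache; it survives between requests only
// while the same server process stays alive.
let cache = null;

async function fetchAllItemsFromFirestore() {
  // Placeholder for the real Firestore query that loads the ~1000 documents.
  return [];
}

export default async function handler(req, res) {
  if (!cache) {
    // Hit the database once, then keep the documents in memory.
    cache = await fetchAllItemsFromFirestore();
  }
  const q = String(req.query.q || '').toLowerCase();
  // Serve and search from memory instead of querying the database each time.
  const results = q
    ? cache.filter((item) => (item.title || '').toLowerCase().includes(q))
    : cache;
  res.status(200).json(results);
}
```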
Am I missing something, or is there a way to achieve something like that with Next.js on Vercel?
If not, can you recommend other free cloud services that can provide what I'm looking for?
One option is to use FaunaDB and Netlify, as described in this post, but I ended up creating a free Wix site and using Wix Data to store the data. I built an http-functions module to provide access to the data via REST, which also caches frequently used data in memory. So far it seems to work like a charm!
I was trying to understand how Elasticsearch compares with GraphQL when they seem to serve a similar purpose, or does GraphQL use Elasticsearch as a data source? If anyone has researched this further, please share your understanding here. Thanks in advance.
GraphQL is, as the name suggests, a query language (mostly for web APIs). Elasticsearch is a data store that exposes a "RESTful" interface, and that interface also comes with its own query language. In that sense they solve different problems:
GraphQL is for exposing data to web clients or apps. It is built to solve challenges faced in client-server communication and app development. GraphQL tries to reduce the number of requests and the size of the data sent between client and server. Furthermore, it gives you the ability to extend your API without versioning, so old clients (e.g. old versions of your mobile app) keep working.
Elasticsearch is built to query large amounts of data effectively. Some of its prime use cases are advertised on the Elastic website. Usually you would not want to expose the Elasticsearch API directly to your clients. GraphQL could act as a layer in between that restricts the operations allowed for clients and uses - as you said - Elasticsearch as a data source. Or maybe Elasticsearch will at some point like GraphQL so much that it offers an API for writing queries in GraphQL that replaces the REST API.
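To make that concrete, here is a rough sketch of what such a restricted layer could look like. The schema, index name and fields are made up, the typeDefs/resolvers pair can be plugged into whichever GraphQL server library you use, and Elasticsearch is assumed to be reachable at localhost:9200:

```javascript
// A restricted GraphQL layer in front of Elasticsearch (sketch).
const typeDefs = /* GraphQL */ `
  type Product {
    id: ID!
    title: String
    price: Float
  }
  type Query {
    searchProducts(term: String!, limit: Int = 10): [Product!]!
  }
`;

const resolvers = {
  Query: {
    // Clients can only run this one constrained search - never raw ES queries.
    searchProducts: async (_parent, { term, limit }) => {
      const resp = await fetch('http://localhost:9200/products/_search', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          size: limit,
          query: { match: { title: term } },
        }),
      });
      const data = await resp.json();
      // Map Elasticsearch hits to the GraphQL type.
      return data.hits.hits.map((hit) => ({ id: hit._id, ...hit._source }));
    },
  },
};

module.exports = { typeDefs, resolvers };
```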
So now that we know that they solve different problems and can be used together, comparing them doesn't really make much sense.
I have a simple database application in mind and I am thinking of making it browser-accessible instead of creating a standalone one.
I have almost finished creating the DB schema on a PostgreSQL server and I will now start developing. My first idea was to use PHP or Ruby on Rails to handle the backend logic and interface with the DB, but since this application is fairly simple, I think I can easily implement all business and data-manipulation logic with JavaScript or with DB triggers.
So I am now wondering: is there a way to directly send the queries to a PostgreSQL Server, without server-side scripting?
More generally: can a PostgreSQL (9.3) server receive queries in HTTP requests and provide the results in HTTP responses?
I know this might sound stupid, and I am not looking for answers like "Use JS for presentation, PHP for logic and DB for data storage". I believe this is a lightweight solution for a very simple application, so I want to try it if possible!
Yes, that is possible.
What you can do is expose it via a REST API (POST and GET requests).
Here are some references for you:
https://github.com/begriffs/postgrest
https://github.com/pgrest/pgrest
Please take a look at this for more on the HTTP API.
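To give an idea of how PostgREST looks from the browser, filters and column selection are just query parameters (the films table and its columns are made-up examples; PostgREST listens on port 3000 by default):

```javascript
// PostgREST turns tables into REST endpoints, so the browser can query directly:
fetch('http://localhost:3000/films?year=gte.1990&select=title,year')
  .then((res) => res.json())
  .then((films) => console.log(films)); // rows of "films", filtered and projected
```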
[update!]
This idea is currently not possible (contrary to what I thought when I first answered you).
I thought it was possible after checking the node-postgres library, written in JavaScript, but it uses Node.js-specific functions that are not present in the web browser, as stated by the library's creator himself and by this answer on Stack Overflow.
There is a package called browserify that bundles a Node.js JavaScript file into a browser-ready JavaScript file. The problem with node-postgres + browserify is that it throws errors during the bundling process, specifically when it tries to access libpq (a C library for accessing PostgreSQL).
I'm sorry, I misled you.
Yet I still have a suggestion for you. You can try CouchDB if you really want to build a backendless/serverless application. It is natively RESTful, handles authentication and authorization to some extent, and is open source, but unfortunately it is NoSQL. It processes queries based on the Map/Reduce paradigm and the Mango query language, so it is an entirely different world to discover if you are used to SQL.
[old answer, I'm leaving it here for learning purposes]
Have you considered using a PostgreSQL driver for JavaScript? It is not RESTful, but it can connect to PostgreSQL and query it!
The library is called node-postgres and you can install it via npm:
https://www.npmjs.com/package/pg
Just don't forget to enable SSL connections on the PostgreSQL server and in the client to avoid man-in-the-middle attacks.
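For completeness, querying with node-postgres looks roughly like this. Note that it runs under Node.js on a server, not in the browser, which is exactly the limitation described in the update above; the connection details and table name are placeholders:

```javascript
// Node.js only - node-postgres ("pg") cannot run in the browser.
const { Client } = require('pg');

const client = new Client({
  connectionString: 'postgres://user:password@localhost:5432/mydb', // placeholder
  ssl: true, // enable SSL to avoid man-in-the-middle attacks
});

async function main() {
  await client.connect();
  const result = await client.query('SELECT * FROM items WHERE id = $1', [1]);
  console.log(result.rows);
  await client.end();
}

main();
```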
And here's a tip: if you need an ACL for allowing or denying selects or inserts for specific users, you can manage that through PostgreSQL's user management and privileges. PostgreSQL also has row-level security, allowing you to define which rows in a table can be selected, updated, and deleted by a given set of users or groups.
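For example, a row-level security policy (available from PostgreSQL 9.5 onwards) looks something like this; the table and role names are made up:

```sql
-- Row-level security sketch; "items" and "app_user" are example names.
ALTER TABLE items ENABLE ROW LEVEL SECURITY;

-- Each user may only SELECT rows they own.
CREATE POLICY items_select_own ON items
  FOR SELECT
  USING (owner = current_user);

-- Grant the table-level privilege as usual; the policy filters the rows.
GRANT SELECT ON items TO app_user;
```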
I have Elasticsearch running on my server. By default it runs on port 9200 and the endpoint is public, which means anyone can insert, update, or delete anything from anywhere. How do I secure it, like phpMyAdmin, so that it can only be accessed through my code and not directly from a browser or Postman?
Elasticsearch itself does not perform authentication or authorization, leaving that as an exercise for the developer. Two popular approaches I have seen are:
Set up your own proxy (nginx/HAProxy) in front of Elasticsearch - this way you exercise full control (see the nginx sketch at the end of this answer). You can also use the elasticsearch-jetty plugin to get Jetty-level auth.
Shield - if budget permits, use Shield, which is a paid offering from Elastic - https://www.elastic.co/products/shield
Even with these in place, and depending on who you are exposing this to, you may want to disable certain things like dynamic scripting and add throttles to guard against DoS, etc.
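As a sketch of the proxy option, an nginx configuration that exposes only a single, authenticated search endpoint might look like this (hostname, certificate paths and index name are examples, and Elasticsearch is assumed to listen only on 127.0.0.1:9200):

```nginx
# /etc/nginx/conf.d/es-proxy.conf - expose only a read-only search endpoint
server {
    listen 443 ssl;
    server_name search.example.com;                      # example hostname
    ssl_certificate     /etc/nginx/certs/fullchain.pem;  # placeholder paths
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    location /search {
        limit_except GET POST { deny all; }              # no PUT/DELETE from outside
        auth_basic           "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass http://127.0.0.1:9200/myindex/_search;  # only this ES endpoint
    }
}
```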
You can use the Elasticsearch basic authentication plugin - https://github.com/Asquera/elasticsearch-http-basic
The README there gives a good idea on how to set it up.
If you are using Kibana3 as a frontend to elasticsearch, you can secure it using https://github.com/fangli/kibana-authentication-proxy
I have set up a relatively simple nginx proxy that sits between my Elasticsearch and Kibana to provide authorized access to my dashboards and charts.
Look at my post here: https://udaysagars.wordpress.com/2016/04/04/how-i-configured-authorized-access-to-kibana-dashboards/
Also, you can view my application that uses this method here: http://udaysagar2177.github.io/ec2/twitter-analytics.html