I am not clear about how Row Level Security works in Apache Superset. As we know, it filters the data and returns the output. My question is:
Is the filter applied to the SQL query itself when the data is fetched from the database?
Or does the dataset load all the data based on the inline query, with the row-level filter then applied to that result set?
Any reference would be greatly appreciated.
I have Java code that connects to Elasticsearch using Spring Data Elasticsearch and fetches all the index data by calling the repository's findAll() method. The data received from Elasticsearch is processed by a separate application. When new data is inserted into Elasticsearch, I have the following questions:
1. How can I fetch only the newly inserted data programmatically?
2. Apart from using DSL queries, is there a way to asynchronously receive new records as they are inserted into Elasticsearch?
I don't want to execute the findAll() method again, because it returns the entire data set (including the previously processed records).
Any help on this is much appreciated.
You will need to add a field (I call it createdAt here) to your entities that contains the timestamp of when your application inserts the document into Elasticsearch. One possibility is to use the auditing support of Spring Data Elasticsearch to have the value set automatically; alternatively, you can set the value yourself in your application. If the data is inserted by some other application, you need to make sure that it contains a timestamp in a format that matches the field type definition of this field in your application.
Then you'd need to define a method in your repository like
SearchHits<T> findByCreatedAtAfter(Timestamp referenceValue);
As for getting a notification in some form when new data is inserted: I'm not aware that Elasticsearch offers something like that. You will probably need to regularly call the method that retrieves the data.
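To illustrate the polling idea, here is a minimal sketch in plain Java. It is not Spring Data code: the `Doc` class, the `IncrementalPoller` class and the `Function` standing in for the repository method are all hypothetical names made up for this example. In a real application the function would delegate to something like `findByCreatedAtAfter` and unwrap the hits.

```java
import java.time.Instant;
import java.util.List;
import java.util.function.Function;

/** Remembers the newest createdAt seen so far and only asks for newer documents. */
final class IncrementalPoller {

    /** Hypothetical entity; only the createdAt field matters for this sketch. */
    static final class Doc {
        final String id;
        final Instant createdAt;

        Doc(String id, Instant createdAt) {
            this.id = id;
            this.createdAt = createdAt;
        }
    }

    // Stand-in for a repository call such as findByCreatedAtAfter (assumption).
    private final Function<Instant, List<Doc>> finder;
    private Instant lastSeen;

    IncrementalPoller(Function<Instant, List<Doc>> finder, Instant start) {
        this.finder = finder;
        this.lastSeen = start;
    }

    /** Returns the documents created after the watermark, then advances the watermark. */
    List<Doc> poll() {
        List<Doc> fresh = finder.apply(lastSeen);
        for (Doc doc : fresh) {
            if (doc.createdAt.isAfter(lastSeen)) {
                lastSeen = doc.createdAt;
            }
        }
        return fresh;
    }
}
```

You would call `poll()` on a schedule and hand the returned documents to the processing application. One thing to keep in mind with a strict "after" comparison: a document that arrives later with exactly the same timestamp as the current watermark would be skipped, so the timestamp resolution matters.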
I am running a SQL query in starburst-presto. It's connected to elasticsearch using the relevant connector.
The SQL has an "order by" clause. This clause is not pushed down to Elasticsearch. Basically, I want to sort the data in Elasticsearch based on a specific field and return the result. The query with "order by" takes a lot of time in Presto. Is it possible to manage this somehow to get optimal performance?
SQL: select e.employee_id from elasticsearch.es."employee:id:""2390571"" && (doj_timestamp:(>=15965454 && <=15972366)) sort=employee_id:desc" e offset 0 limit 5;
The above query is returning random results.
Can anyone please help here?
Your query has both ORDER BY and LIMIT, so in Presto it is called a Top N query.
Presto currently does not provide Top N pushdown, but this feature is in the works.
Umbrella issue for connector pushdown: https://github.com/prestosql/presto/issues/18
A draft PR for Top N pushdown (engine & SPI support): https://github.com/prestosql/presto/pull/4784
Please file an issue for Elasticsearch connector Top N pushdown. We will implement it anyway, but direct user feedback helps us understand issue priorities.
You can learn more on the #pushdown channel on Presto community slack.
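For readers unfamiliar with the term, here is a small sketch of what a Top N computation does when it runs in the engine rather than in the data source. This is not Presto's implementation, just a plain Java illustration with a hypothetical `TopN` class: every row still has to be read, but only the N best rows are kept in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrates ORDER BY value DESC LIMIT n without sorting the whole input. */
final class TopN {

    static List<Long> topNDescending(Iterable<Long> rows, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the smallest of the values kept so far sits on top.
        PriorityQueue<Long> heap = new PriorityQueue<>(n);
        for (Long row : rows) {
            if (heap.size() < n) {
                heap.add(row);
            } else if (row > heap.peek()) {
                // The new row beats the weakest kept row, so replace it.
                heap.poll();
                heap.add(row);
            }
        }
        List<Long> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder());
        return result;
    }
}
```

Without pushdown, all matching rows have to travel from Elasticsearch to Presto before this step can run, which is where the time goes. With pushdown, the data source would do the sorting and limiting and return only N rows.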
My existing system has some search SQL procedures that return data based on some filters. To improve searches, we have decided to use Elasticsearch for all of them. We are currently building a prototype.
Below is what I have done so far:
De-normalize all the data from my RDBMS and store it in Elasticsearch using Logstash.
Query data from Elasticsearch based on the parameters using the Elasticsearch SQL API.
The main problem is pagination. Elasticsearch SQL supports a fetch_size parameter and, in the result, returns a cursor for the next set of records.
A cursor is fine if you want to get the next page of results, but if a user wants to go from page 10 to page 100, how can we achieve that?
I also searched for offset and skip support in Elasticsearch SQL but could not find any references.
Has anyone faced such an issue? I would appreciate any help or suggestions.
I tried to follow the link https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-pagination.html
{
"query" : "Select client_clientid, clientpolicy_policyname from client_paged_list group by client_clientid, clientpolicy_policyname",
"fetch_size": 5
}
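One workaround that stays within the cursor mechanism described above is to walk the cursor forward until the requested page is reached. The sketch below shows that idea in plain Java. The `SqlClient` interface, the `Page` class and `CursorWalker` are hypothetical abstractions invented for this example, not part of any Elasticsearch client library: `first` stands for the initial request with `query` and `fetch_size`, and `next` for a follow-up request that sends only the cursor.

```java
import java.util.ArrayList;
import java.util.List;

/** Reaches page N by following the cursor N-1 times (assumed forward-only). */
final class CursorWalker {

    /** One response: the rows plus the cursor for the next page (null when exhausted). */
    static final class Page {
        final List<String> rows;
        final String cursor;

        Page(List<String> rows, String cursor) {
            this.rows = rows;
            this.cursor = cursor;
        }
    }

    /** Hypothetical client abstraction over the SQL endpoint. */
    interface SqlClient {
        Page first(String query, int fetchSize);

        Page next(String cursor);
    }

    /** Returns the rows of the 1-based page, or an empty list if it does not exist. */
    static List<String> fetchPage(SqlClient client, String query, int fetchSize, int pageNumber) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("pageNumber must be >= 1");
        }
        Page page = client.first(query, fetchSize);
        for (int i = 1; i < pageNumber; i++) {
            if (page.cursor == null) {
                return new ArrayList<>(); // ran out of results before the target page
            }
            page = client.next(page.cursor);
        }
        return page.rows;
    }
}
```

The cost grows with the page number, since every intermediate page is fetched and discarded, so this is only reasonable for moderately deep pages.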
I am quite new to Spring Data Elasticsearch.
I wanted to know whether there is any way to group the records of an index by one field's value and perform aggregations at the level of each group.
Any suggestions would be helpful.
Thanks & Regards
Sumanth K P
I am new to Apache Lucene. Could someone please explain how Apache Lucene works?
For every request, will it query the data source (documents, database, etc.) behind the Lucene index?
Or will it look at the index alone?
Once documents are indexed, Lucene will only look at the index and nowhere else.
You also need to understand the difference between indexing data and storing data in the index. The former allows a document to be found, while the latter allows the data to be read back once the relevant document is found.
Why is this necessary? Sometimes you can index all fields but only store the ID and retrieve the actual data from external source (e.g. database) using that ID. Or you can store data in the index and load it from there instead of going to another data source.
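The difference between indexing and storing can be shown with a toy inverted index. To be clear, this is not Lucene's API, only a plain Java sketch with a hypothetical `TinyIndex` class: the body text is indexed (searchable) but not stored, while the title is stored (readable from the index) but not indexed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy index: body is indexed only, title is stored only. */
final class TinyIndex {

    // term -> ids of the documents whose body contains the term (the "indexed" part)
    private final Map<String, Set<String>> postings = new HashMap<>();
    // document id -> title kept inside the index (the "stored" part)
    private final Map<String, String> storedTitles = new HashMap<>();

    void add(String id, String title, String body) {
        storedTitles.put(id, title);
        for (String term : body.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, key -> new TreeSet<>()).add(id);
            }
        }
    }

    /** Finds document ids by a body term; the body text itself cannot be read back. */
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    /** Reads the stored title of a document that was found. */
    String storedTitle(String id) {
        return storedTitles.get(id);
    }
}
```

After a search returns an id, the title can be read straight from the index, whereas the full body would have to be fetched from the external source (for example a database) using that id.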