Filtering all field values per row - rethinkdb

I have a table called 'sample'. Based on which algorithm is used, each sample may have different field (property) names.
I need to be able to retrieve all samples which have field values that contain/match a user filter value.
So for instance, if a sample has the following properties:
example 1: "name", "gender", "state"
and another had properties:
example 2: "name", "gender", "rate"
and there would be thousands of such samples with more variation.
If a user were looking at a table of samples from the second example above ("name", "gender", "rate") and applied a filter of "foo", I would need to query the "sample" table for all rows where any property's value contains/matches "foo" (so a value of "foobar" should match).
If they were looking at a set of samples with the properties from example 1 ("name", "gender", "state"), I would need to do the same; however, I cannot hard-code the properties of either.
In SQL I would fetch the field names and dynamically build a query string, but with ReQL's object dot notation I am struggling with how to do this.
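A rough sketch of one way to express this in ReQL (assuming RethinkDB 2.3+ for .values(), and that the field values can be coerced to strings; nested objects would need extra handling):

r.table('sample').filter(function (doc) {
  // doc.values() yields the row's field values without naming the fields,
  // so this works no matter which properties a given sample has.
  return doc.values().contains(function (val) {
    // match() returns null when the regex does not match, so test against null
    return val.coerceTo('string').match('foo').ne(null);
  });
});

Here 'foo' stands in for the user's filter value.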

Related

ElasticSearch - backward pagination with search_after when sorting value is null

I have an application with a dashboard, basically a table with hundreds of thousands of records.
This table has up to 50 different columns, with different types in the mapping: keyword, text, boolean, integer.
As records in the table might have the same values, I sort by an array of two attributes:
1. The first attribute is whatever the client wants to sort by. It can be a simple sort object or a sort query with a nested filter.
2. The second attribute is a default sort by id, needed to break ties between documents that have identical values in the column the customer sorts by.
I have checked multiple topics/issues on GitHub and on the Elastic forum to understand how to implement the search_after mechanism for paging backwards, but it does not work for all the cases I need.
Please have a look at the image: imagine limit = 3, and the customer is currently on the 3rd page of the table, with all the data sorted by name asc, _id asc.
The names in the image are A, B, C, D, E; the ids are the numeric parts of the Doc labels.
When the customer wants to go back to the previous page (page #2 in my picture), I pass the following to Elasticsearch:
sort: [
  { name: 'desc' },
  { _id: 'desc' }
],
search_after: [null, Doc7._id]
As a result, I get only one document, Doc6 (the one with name: null in my image). That seems logical: I asked Elasticsearch to search descending after [null, 7], and only one document corresponds to that - Doc6 - but it's not what I need.
I can't work out a solution that returns the data I need.
Could anyone help, please?
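Not a definitive answer, but one common workaround is to keep null out of the sort keys altogether, so search_after never has to carry a null: either index a sentinel value for missing names, or give missing values a fixed position with the sort option missing, e.g.:

sort: [
  {
    name: {
      order: 'desc',
      // pin documents with no "name" to a deterministic end of the ordering
      missing: '_last'
    }
  },
  { _id: 'desc' }
]

With every document getting a comparable sort key, the backward page can then be requested with concrete values instead of [null, Doc7._id].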

Oracle SODA - How to sort on created using REST?

I can't work out how to use $orderby with SODA on an id field (such as created or lastModified). I'm using SODA for REST directly, not the other projects.
Sort syntax is:
{
  $orderby: {
    path: 'created',
    datatype: 'date',
    order: 'desc'
  }
}
And I've also tried:
{
  "$orderby": {
    "$fields": [{
      "path": "created",
      "datatype": "date",
      "order": "desc"
    }],
    "$scalarRequired": true
  }
}
I have also tried replacing the path with $id: 'created' (as you can use that in a filter specification to access non-document metadata), but nothing orders properly.
Short of putting a created field into my objects when I create them (which defeats the purpose of having those metadata fields), how can I use $orderby on a metadata field?
Max here from the SODA dev team. I am not 100% sure what you mean by an "id field". Looks like you mean the "created on" and "last modified" document components automatically maintained by SODA, right? If so, we don't support orderbys on these (though it could be added as an enhancement).
As of now, as you mentioned in your post, the best option is to create a field in your JSON documents' content and set it to an ISO 8601 format timestamp value (e.g. 2020-10-13T07:01:01). You can then do an orderby on such a field (with datatype "datetime"). Please let me know if more details on this are needed.
In SODA REST, when you're listing collection contents, you could specify since=timestamp and until=timestamp query parameters. That'll give you all documents with last modified timestamp greater than the "since" one, and less than or equal to the "until" one.
Example:
http://host:port/ords/scott/soda/latest/myColl?since=2020-01-01T00:00:00&until=2021-01-01T00:00:00
As part of this operation, SODA automatically adds an orderby on "last modified". Not sure if that's useful to you, though, since it only applies when listing all documents in the collection (i.e. you can't combine it with a QBE, for example). So if this doesn't meet your needs, the best option right now is to explicitly add something like a "modified" field to the document content and do an orderby on that.
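For reference, a minimal QBE along those lines, assuming each document's content carries its own modified field holding an ISO 8601 timestamp (the field name here is illustrative):

{
  "$query": {},
  "$orderby": {
    "$fields": [{
      "path": "modified",
      "datatype": "datetime",
      "order": "desc"
    }]
  }
}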

How to use Kibana and Elasticsearch [7.5.0] to track the number of documents containing a particular value

I have an index which contains information about some objects. I want to display some of the information on my Kibana dashboard. Let's assume an object looks as follows:
{
  "_index": "obj",
  "_type": "_doc",
  "_id": "KwDPAHABfo5V345r4IYV",
  "_version": 1,
  "_score": 0,
  "_source": {
    "value_1": "some value",
    "value_2": "some_other value",
    "owner": "jason",
    "modified_date": "2020-02-01T12:53:08.210317+00:00",
    "created_date": "2020-02-01T12:53:08.243980+00:00"
  }
}
I need to show (live) the number of objects that have owner: 'UNKNOWN'. The thing is, this value changes over time, and each change is a new document - documents are not updated in place. I need to track how many UNKNOWN owners I currently see. Updates (new documents) are sent to ELK at fixed intervals.
When I set up a metric, it sometimes shows 0 during the window between one update and the next, when no documents are flowing into ELK. How can I make Kibana display only the last documents with owner: 'UNKNOWN'?
How can I make Kibana display only the last documents with owner: 'UNKNOWN'?
You could set up a data table visualization for that as an alternative to the one-dimensional metric visualization.
This is how I personally would configure the data table:
1. Set a filter with 'owner(.keyword) is UNKNOWN'.
2. Use the metric 'Top Hit' on the field created_date (or @timestamp, that's up to you) instead of the count metric.
3. Set the order to descending, based on the timestamp field.
4. Split the rows (terms aggregation) for every field you want to display in the rows. This will create 'columns' in your table.
5. Go to the options tab and enable count on the sum of all rows.
6. Set an appropriate time interval, e.g. last 1 hour.
This will display all the relevant data of your documents that have the field owner equal to UNKNOWN. Also, you see the ingestion/creation timestamp of these documents in descending order. Furthermore, you see the number of documents that match (configured via the options tab as described above).
I hope I could help you.
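If you prefer to check the same thing from the Dev console, here is a sketch of a roughly equivalent raw query. It assumes the obj index from the example, that value_1.keyword identifies an object, and that created_date is mapped as a date - all assumptions on my part. size: 0 suppresses the raw hits; the terms aggregation groups documents per object, and top_hits keeps only the newest UNKNOWN document in each group:

GET obj/_search
{
  "size": 0,
  "query": {
    "term": { "owner.keyword": "UNKNOWN" }
  },
  "aggs": {
    "per_object": {
      "terms": { "field": "value_1.keyword" },
      "aggs": {
        "latest": {
          "top_hits": {
            "size": 1,
            "sort": [{ "created_date": { "order": "desc" } }]
          }
        }
      }
    }
  }
}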

How to turn off ag-grid quick filter for specific column

AG Grid has a "quick filter" feature, essentially a free-text search filter that searches through all columns.
The problem is that some of my columns contain date-time values, and I don't want to search through the data in those columns.
Using a "column filter" on each column separately is not an option, because that gives "AND" behavior: a row is visible only when every searched column contains the data I search for. I need "OR" behavior: a row that contains the search term in any column (except the date-time columns) should be visible.
So, how can I remove some columns from the set of columns the "quick filter" searches through?
I just needed to set an "empty" function as the getQuickFilterText function for the columns I don't want to search through:
colDef = {
  getQuickFilterText: () => ''
}
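In context, a full column definition set might look like this sketch (the column names are made up; the quick filter will then only match against the non-date columns):

const columnDefs = [
  { field: 'name' },
  { field: 'status' },
  {
    field: 'createdAt',
    // contribute an empty string to the quick filter's searchable text,
    // effectively excluding this column from quick-filter matching
    getQuickFilterText: () => ''
  }
];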

Creating custom fields based on analysis output in Elasticsearch

I have a document where the value is a raw string:
{ "content": "field1=1 , field2=foo" }
My intention is to query by the field1 and field2 values.
The closest thing I can think of is to use a custom analyzer which creates tokens based on the comma separator; then I can search for exact matches like "field1=1" or "field2=foo". Ideally, however, I would like to search by range on field1, by pattern matching on field2, and so on.
Is there any way to achieve this? I could not find a way to store the result of the analysis in a form I can query like this.
How are you ingesting the documents? If you are ingesting via Logstash, then you can apply the transformation there using a filter.
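For example (a sketch of my own, not from the original answer): Logstash's kv filter can split the raw content string into real event fields, which then index as separately queryable fields:

filter {
  kv {
    source      => "content"   # the raw "field1=1 , field2=foo" string
    field_split => ","         # pairs are separated by commas
    value_split => "="         # keys and values are separated by "="
    trim_key    => " "         # strip the stray spaces around keys
    trim_value  => " "         # ...and around values
  }
  mutate {
    # make field1 numeric so Elasticsearch range queries work on it
    convert => { "field1" => "integer" }
  }
}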
I'm having a little difficulty understanding your question. However, I think you are asking whether there's a way to make the type of field1 numeric and the type of field2 searchable.
Hopefully you are running Kibana, so you can use the Dev console to test this out. If you just let Elastic import the data, it will create aggregatable and searchable fields for both field1 and field2, because both are set to string values:
PUT /content_default/type/1
{ "field1": "1", "field2": "foo" }
If you instead omit the quotes around the 1, Elastic will create the field as a long (assuming you haven't already imported a document with a string in the same field) - this allows you to search by range. Here I'm creating a new field3 and setting the value to 1; if you query, you should see it's a long:
PUT /content_default/type/2
{ "field1": "1", "field2": "foo", "field3": 1 }
You can pre-load a template to allow you to define types up-front before loading any data - that way Elastic doesn't have to guess what types your fields should be. With strings you can also define whether you want them to be just keywords, searchable or both.
Something like this should do the trick for you:
PUT _template/with_template
{
   "template":"content_with_template",
   "mappings":{
      "content_with_template":{
         "properties":{
            "field2":{
               "analyzer":"simple",
               "type":"text"
            },
            "field1":{
               "type":"keyword"
            },
            "field3":{
               "type":"long"
            }
         }
      }
   }
}
Then put a document in the new 'content_with_template' index like this. At this point it doesn't matter whether field3 is in quotes or not - as long as it parses to a number, it will save:
PUT /content_with_template/type/1
{ "field1":"a1d" , "field2":"foo", "field3":1}
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
