How to filter string prefixes with Vega-lite - filter

Is it possible to filter records with Vega-lite by strings?
Example:
record: "ABCD"
record: "AMFK"
record: "AMRK"
I would like to process only records where the string starts with "AM".
I studied the documentation and found solutions only for comparing the entire string. Is it possible to truncate the string? Or use something like "LEFT()" in Excel? Or something completely different?
Edit:
Possibly of importance, I'm using the Vega-lite app in Airtable.

You can do this using a filter transform along with an appropriate Vega expression. For example:
{
"data": {
"values": [
{"key": "ABCD", "value": 1},
{"key": "AMFK", "value": 2},
{"key": "AMRK", "value": 3}
]
},
"transform": [{"filter": "slice(datum.key, 0, 2) == 'AM'"}],
"mark": "bar",
"encoding": {
"x": {"type": "quantitative", "field": "value"},
"y": {"type": "nominal", "field": "key"}
}
}
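To see which records the filter expression keeps, the same prefix test can be sketched in plain JavaScript. This only illustrates the logic and is not how Vega-Lite evaluates the expression internally; `startsWithPrefix` is a hypothetical helper name.

```javascript
// Same sample records as in the spec above.
const records = [
  { key: "ABCD", value: 1 },
  { key: "AMFK", value: 2 },
  { key: "AMRK", value: 3 },
];

// Hypothetical helper: mirrors slice(datum.key, 0, 2) == 'AM'.
function startsWithPrefix(datum, prefix) {
  return datum.key.slice(0, prefix.length) === prefix;
}

// Keep only the records whose key starts with "AM".
const kept = records.filter((d) => startsWithPrefix(d, "AM"));
console.log(kept.map((d) => d.key)); // [ 'AMFK', 'AMRK' ]
```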


JMESPath current array index

In JMESPath with this query:
people[].{"index":#.index,"name":name, "state":state.name}
On this example data:
{
"people": [
{
"name": "a",
"state": {"name": "up"}
},
{
"name": "b",
"state": {"name": "down"}
},
{
"name": "c",
"state": {"name": "up"}
}
]
}
I get:
[
{
"index": null,
"name": "a",
"state": "up"
},
{
"index": null,
"name": "b",
"state": "down"
},
{
"index": null,
"name": "c",
"state": "up"
}
]
How do I get the index property to actually have the index of the array? I realize that #.index is not the correct syntax but have not been able to find a function that would return the index. Is there a way to include the current array index?
Use-case
Use JMESPath query syntax to extract the numeric index of the current array element from a series of array elements.
Pitfalls
As of this writing (2019-03-22), this feature is not part of the standard JMESPath specification.
Workaround
This is possible when running JMESPath from within any of various programming languages, but the indexing must be done outside of JMESPath.
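A minimal sketch of that workaround in JavaScript, assuming the JMESPath projection has already been evaluated by whatever library you use and its result is available as a plain array. `withIndex` is a hypothetical helper name, not part of any JMESPath library.

```javascript
// Assumed result of people[].{"name": name, "state": state.name},
// already evaluated by a JMESPath implementation.
const projected = [
  { name: "a", state: "up" },
  { name: "b", state: "down" },
  { name: "c", state: "up" },
];

// Hypothetical helper: attach each element's array position as "index".
function withIndex(items) {
  return items.map((item, index) => ({ index, ...item }));
}

const indexed = withIndex(projected);
console.log(indexed);
```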
This is not exactly the form you requested but I have a possible answer for you:
people[].{"name":name, "state":state.name} | merge({count: length(#)}, #[*])
This query gives this result:
{
"0": {
"name": "a",
"state": "up"
},
"1": {
"name": "b",
"state": "down"
},
"2": {
"name": "c",
"state": "up"
},
"count": 3
}
So each attribute of this object has an index as its key, except the last one, count, which holds the number of indexed attributes. If you want to iterate over the attributes with a loop, for example, you can do so because count tells you how many attributes to visit.

Elastic query group by

I've started the process of learning ElasticSearch and I was wondering if somebody could help me shortcut the process by providing some examples of how I would build a couple of queries.
Here's my example schema...
PUT /sales/_mapping
{
"sale": {
"properties": {
"productCode": {"type": "string"},
"productTitle": {"type": "string"},
"quantity": {"type": "integer"},
"unitPrice": {"type": "double"}
}
}
}
POST /sales/1
{"productCode": "A", "productTitle": "Widget", "quantity": 5, "unitPrice": 5.50}
POST /sales/2
{"productCode": "B", "productTitle": "Gizmo", "quantity": 10, "unitPrice": 1.10}
POST /sales/3
{"productCode": "C", "productTitle": "Spanner", "quantity": 5, "unitPrice": 9.00}
POST /sales/4
{"productCode": "A", "productTitle": "Widget", "quantity": 15, "unitPrice": 5.40}
POST /sales/5
{"productCode": "B", "productTitle": "Gizmo", "quantity": 20, "unitPrice": 1.00}
POST /sales/6
{"productCode": "B", "productTitle": "Gizmo", "quantity": 30, "unitPrice": 0.90}
POST /sales/7
{"productCode": "B", "productTitle": "Gizmo", "quantity": 40, "unitPrice": 0.80}
POST /sales/8
{"productCode": "C", "productTitle": "Spanner", "quantity": 100, "unitPrice": 7.50}
POST /sales/9
{"productCode": "C", "productTitle": "Spanner", "quantity": 200, "unitPrice": 5.50}
What query would I need to generate the following results?
a) Show the number of documents grouped by product code, i.e.
Product code    Title      Count
A               Widget     2
B               Gizmo      4
C               Spanner    3
b) Show the total units sold by product code, i.e.
Product code    Title      Total units sold
A               Widget     20
B               Gizmo      100
C               Spanner    305
TIA
You can accomplish that using aggregations, in particular Terms Aggregations, and it can be done in a single request by including them in your query structure. To instruct ES to generate analytic data based on aggregations, include the aggregations object (or aggs) and specify within it the type of aggregations you would like ES to run on your data.
{
"query": {
"match_all": {}
},
"aggs": {
"group_by_product": {
"terms": {
"field": "productCode"
},
"aggs": {
"units_sold": {
"sum": {
"field": "quantity"
}
}
}
}
}
}
When you run that query, in addition to the resulting hits from your search (in this case we are doing a match all), an additional object is included in the response object, holding the resulting aggregations. For example:
{
...
"hits": {
"total": 6,
"max_score": 1,
"hits": [ ... ]
},
"aggregations": {
"group_by_product": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "b",
"doc_count": 3,
"units_sold": {
"value": 60
}
},
{
"key": "a",
"doc_count": 2,
"units_sold": {
"value": 20
}
},
{
"key": "c",
"doc_count": 1,
"units_sold": {
"value": 5
}
}
]
}
}
}
I omitted some details from the response object for brevity, and to highlight the important part, which is within the aggregations object. The aggregated data consists of buckets, each representing one of the distinct product codes (identified by the "key" property) found within your documents. doc_count holds the number of occurrences per product code, and the units_sold object holds the total sum of units sold for each product code.
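As a rough illustration of what the terms and sum aggregations compute, the same grouping can be sketched in plain JavaScript over the nine sample documents from the question. This is not Elasticsearch code; `groupByProduct` is a hypothetical helper, and the sketch stores units_sold as a plain number rather than the {"value": ...} object that the response uses.

```javascript
// The nine sample documents from the question (titles and prices omitted).
const sales = [
  { productCode: "A", quantity: 5 },
  { productCode: "B", quantity: 10 },
  { productCode: "C", quantity: 5 },
  { productCode: "A", quantity: 15 },
  { productCode: "B", quantity: 20 },
  { productCode: "B", quantity: 30 },
  { productCode: "B", quantity: 40 },
  { productCode: "C", quantity: 100 },
  { productCode: "C", quantity: 200 },
];

// Hypothetical helper: one bucket per productCode, holding a document
// count and a running sum of quantity.
function groupByProduct(docs) {
  const buckets = new Map();
  for (const doc of docs) {
    const bucket = buckets.get(doc.productCode) ||
      { key: doc.productCode, doc_count: 0, units_sold: 0 };
    bucket.doc_count += 1;
    bucket.units_sold += doc.quantity;
    buckets.set(doc.productCode, bucket);
  }
  return [...buckets.values()];
}

const buckets = groupByProduct(sales);
console.log(buckets);
// A: 2 documents, 20 units; B: 4 documents, 100 units; C: 3 documents, 305 units
```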
One important thing to keep in mind is that in order to perform aggregations on string or text fields, you need to enable the fielddata setting within your field mapping, as that setting is disabled by default on all text-based fields. To update the mapping, for example for the product code field, you just need to send a PUT request to the corresponding mapping type within the index, for example:
PUT http://localhost:9200/sales/sale/_mapping
{
"properties": {
"productCode": {
"type": "string",
"fielddata": true
}
}
}
(more info about the fielddata setting)

Microsoft LUIS builtin.number

I used builtin.number in my LUIS app trying to collect a 4 digit pin number. The following is what's returned from LUIS when my input is "one two three four".
"entities": [
{
"entity": "one",
"type": "builtin.number",
"startIndex": 0,
"endIndex": 2,
"resolution": {
"value": "1"
}
},
{
"entity": "two",
"type": "builtin.number",
"startIndex": 4,
"endIndex": 6,
"resolution": {
"value": "2"
}
},
{
"entity": "three",
"type": "builtin.number",
"startIndex": 8,
"endIndex": 12,
"resolution": {
"value": "3"
}
},
{
"entity": "four",
"type": "builtin.number",
"startIndex": 14,
"endIndex": 17,
"resolution": {
"value": "4"
}
},
As you can see, it's returning individual digits in both text and digit format. It seems to me that it's more important to return the whole number than the individual digits. Is there a way to get '1234' as the result for builtin.number?
Thanks!
It's not possible to do what you're asking for by only using LUIS. The way LUIS does its tokenization is that it recognizes each word/number individually due to the whitespace. It goes without saying that 'onetwothreefour' will also not return 1234.
Additionally, users are unable to modify the recognition of the prebuilt entities on an individual model level. The recognizers for certain languages are open-source, and contributions from the community are welcome.
All of that said, a way you could achieve what you're asking for is by concatenating the numbers. A JavaScript example might be something like the following:
var pin = '';
entities.forEach(entity => {
  // Append the resolved value of every builtin.number entity.
  if (entity.type == 'builtin.number') {
    pin += entity.resolution.value;
  }
});
console.log(pin); // '1234'
After that you would need to perform your own handling/regexp, but I'll leave that to you. (After all, what if someone provides "seven eight nine ten"? Or "twenty seventeen"?)
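One possible shape for that follow-up check, as a sketch only: join the resolved values and accept the result only if it is exactly four digits. `extractPin` and `toEntities` are hypothetical helpers, and the sample values (including "10" for "ten") are assumptions about what LUIS would resolve, not captured output.

```javascript
// Hypothetical helper: join the resolved values of all builtin.number
// entities and return the result only if it is exactly four digits.
function extractPin(entities) {
  const digits = entities
    .filter((entity) => entity.type === "builtin.number")
    .map((entity) => entity.resolution.value)
    .join("");
  return /^\d{4}$/.test(digits) ? digits : null;
}

// Hypothetical helper: build minimal entity objects from resolved values.
const toEntities = (values) =>
  values.map((value) => ({ type: "builtin.number", resolution: { value } }));

console.log(extractPin(toEntities(["1", "2", "3", "4"])));  // '1234'
console.log(extractPin(toEntities(["7", "8", "9", "10"]))); // null (five digits)
```

A length check like this rejects "seven eight nine ten", but it cannot tell whether "twenty seventeen" was meant as a PIN, so some cases still need a confirmation step.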

Elastic Search. Search by sub-collection value

I need help with a specific ES query.
I have objects in an Elasticsearch index. Here is an example of one of them (Participant):
{
"_id": null,
"ObjectID": 6008,
"EventID": null,
"IndexName": "crmws",
"version_id": 66244,
"ObjectData": {
"PARTICIPANTTYPE": "2",
"STATE": "ACTIVE",
"EXTERNALID": "01010111",
"CREATORID": 1006,
"partAttributeList":
[
{
"SYSNAME": "A",
"VALUE": "V1"
},
{
"SYSNAME": "B",
"VALUE": "V2"
},
{
"SYSNAME": "C",
"VALUE": "V2"
}
],
....
I need to find only the entities whose partAttributeList contains a matching entry. For example, the whole Participant entity that has SYSNAME=A and VALUE=V1 in the same element of partAttributeList.
If I use the usual matches:
{"match": {"ObjectData.partAttributeList.SYSNAME": "A"}},
{"match": {"ObjectData.partAttributeList.VALUE": "V1"}}
Of course, I will find more objects than I really need. Here is an example of a redundant object that can be found:
...
{
"SYSNAME": "A",
"VALUE": "X"
},
{
"SYSNAME": "B",
"VALUE": "V1"
}..
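The difference between the two kinds of matching can be sketched in plain JavaScript. This is only an illustration of the logic, not Elasticsearch code; both function names are hypothetical, and the objects are trimmed to the partAttributeList values shown above.

```javascript
// Trimmed versions of the wanted object and the redundant object.
const wanted = {
  partAttributeList: [
    { SYSNAME: "A", VALUE: "V1" },
    { SYSNAME: "B", VALUE: "V2" },
  ],
};
const redundant = {
  partAttributeList: [
    { SYSNAME: "A", VALUE: "X" },
    { SYSNAME: "B", VALUE: "V1" },
  ],
};

// Two independent conditions: each may be satisfied by a different element.
function matchesAcrossElements(obj, sysname, value) {
  const list = obj.partAttributeList;
  return list.some((a) => a.SYSNAME === sysname) &&
    list.some((a) => a.VALUE === value);
}

// Desired behaviour: one element must satisfy both conditions.
function matchesSameElement(obj, sysname, value) {
  return obj.partAttributeList.some(
    (a) => a.SYSNAME === sysname && a.VALUE === value
  );
}

console.log(matchesAcrossElements(redundant, "A", "V1")); // true (unwanted hit)
console.log(matchesSameElement(redundant, "A", "V1"));    // false
console.log(matchesSameElement(wanted, "A", "V1"));       // true
```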
As I understand it, you are trying to search multiple fields of the same object for exact matches of a piece of text, so please try this:
https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-query-strings.html

How do I change color of a bar in Vega-lite Bar Chart?

I want to change the default blue color of a bar in a Vega-Lite bar chart. How can I do it? I am posting the json specification below:
{
"data": {
"values": [
{"a":"A", "b":28}, {"a":"B", "b":55}, {"a":"C", "b":43},
{"a":"D", "b":91}, {"a":"E", "b":81}, {"a":"F", "b":53},
{"a":"G", "b":19}, {"a":"H", "b":87}, {"a":"I", "b":52}
]
},
"mark": "bar",
"encoding": {
"x": {"bin": false, "type": "ordinal", "field": "a"},
"y": {"type": "quantitative","field": "b"}
}
}
Thanks in advance.
I found the answer to my own question. :) I was supposed to add a color key inside encoding block. Please see the updated code below:
{
"data": {
"values": [
{"a":"A", "b":28}, {"a":"B", "b":55}, {"a":"C", "b":43},
{"a":"D", "b":91}, {"a":"E", "b":81}, {"a":"F", "b":53},
{"a":"G", "b":19}, {"a":"H", "b":87}, {"a":"I", "b":52}
]
},
"mark": "bar",
"encoding": {
"x": {"type": "ordinal","field": "a"},
"y": {"type": "quantitative","field": "b"},
"color": {"value": "#ff9900"}
}
}
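If the spec is built in JavaScript before being handed to a renderer, the same change can be sketched as a small function that returns a copy of the spec with a constant color encoding. `withBarColor` is a hypothetical helper, and the spec here is shortened to three data points.

```javascript
// Shortened version of the bar chart spec from the question.
const spec = {
  data: { values: [{ a: "A", b: 28 }, { a: "B", b: 55 }, { a: "C", b: 43 }] },
  mark: "bar",
  encoding: {
    x: { type: "ordinal", field: "a" },
    y: { type: "quantitative", field: "b" },
  },
};

// Hypothetical helper: return a copy of the spec with a fixed bar color,
// leaving the original spec object untouched.
function withBarColor(original, color) {
  return {
    ...original,
    encoding: { ...original.encoding, color: { value: color } },
  };
}

const orangeSpec = withBarColor(spec, "#ff9900");
console.log(JSON.stringify(orangeSpec.encoding.color)); // {"value":"#ff9900"}
```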
