SHOW TABLES and DESCRIBE <table> Equivalents in Apache Pinot

What are the equivalents of SHOW TABLES and DESCRIBE <table> in Apache Pinot?
Basically, I just want to get a list of tables in the cluster and I want to be able to get the column names and their data types for a chosen table.

As of Apache Pinot 0.11.x, Pinot only supports fetching table schemas through its Controller APIs. See the docs here: https://docs.pinot.apache.org/basics/components/exploring-pinot#rest-api
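For example, something like this against the Controller should cover both cases (a minimal sketch; localhost:9000 is the default Controller port and myTable is a placeholder table name):
curl -s http://localhost:9000/tables                    # list all tables (SHOW TABLES equivalent)
curl -s http://localhost:9000/tables/myTable/schema     # column names and data types (DESCRIBE equivalent)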

I think you can use this to see all available commands:
Calcite dialect reference
describe:
DESCRIBE DATABASE databaseName
| DESCRIBE CATALOG [ databaseName . ] catalogName
| DESCRIBE SCHEMA [ [ databaseName . ] catalogName ] . schemaName
| DESCRIBE [ TABLE ] [ [ [ databaseName . ] catalogName . ] schemaName . ] tableName [ columnName ]
| DESCRIBE [ STATEMENT ] { query | insert | update | merge | delete }

Keep files one daily, one 3 days and one week and delete others

I have a folder containing database backup files for the last 15 days.
I need to delete all other files and keep only one from today, one from 3 days ago, and one from a week ago.
Can someone help with this?
I tried some scripts but none met my requirements:
mkdir -p monthly
mkdir -p weekly
# $NOW is assumed to be set by the backup job that created backup_$NOW.tgz
ln "backup_$NOW.tgz" weekly/
# find current month
month=$(date +%Y-%m-)
# find the first (oldest) file of the current month in the weekly folder
first_monthly=$(ls --sort=time -1 weekly/*"$month"* 2>/dev/null | tail -1)
# and put it in the monthly folder
ln -f "$first_monthly" monthly/
# we need only 5 weekly backups
ls --sort=time -1 weekly/* 2>/dev/null | tail -n+6 >> /tmp/deletebackups.txt
# we need only 5 monthly backups
ls --sort=time -1 monthly/* 2>/dev/null | tail -n+6 >> /tmp/deletebackups.txt
# delete the extra files
#rm $(cat /tmp/deletebackups.txt) 2>/dev/null
xargs --arg-file /tmp/deletebackups.txt rm
Here's one way to determine what to keep or delete. It uses an array of "ages in days to keep" and then checks whether each file's age is in that collection.
$DaysOldToKeep = @(
    0
    3
    7
)
$SourceDir = $env:TEMP
$Today = (Get-Date).Date

$FileList = Get-ChildItem -LiteralPath $SourceDir -File |
    Sort-Object -Property CreationTime

foreach ($FL_Item in $FileList)
{
    $DaysOld = ($Today - $FL_Item.CreationTime).Days
    if ($DaysOld -notin $DaysOldToKeep)
    {
        Write-Warning ('[ {0} ] is [ {1} ] days old & should be removed.' -f $FL_Item.Name, $DaysOld)
    }
    else
    {
        Write-Host (' [ {0} ] is [ {1} ] days old & should be KEPT.' -f $FL_Item.Name, $DaysOld)
    }
}
truncated output ...
WARNING: [ testing-making-dir-and-file.txt ] is [ 9 ] days old & should be removed.
WARNING: [ hd4B753.tmp ] is [ 8 ] days old & should be removed.
[ Itunes_AlbumAutoRating_Disable.ps1_2019-04-08.log ] is [ 7 ] days old & should be KEPT.
[ vscode-inno-updater-1554768842.log ] is [ 7 ] days old & should be KEPT.
[ hd464FA.tmp ] is [ 7 ] days old & should be KEPT.
[ hd4E2F0.tmp ] is [ 7 ] days old & should be KEPT.
WARNING: [ Genre-List_2019-04-09.log ] is [ 6 ] days old & should be removed.
WARNING: [ Grouping-Strings-List_2019-04-10.log ] is [ 5 ] days old & should be removed.
WARNING: [ Itunes_R-PC-SC_Save.ps1_2019-04-11.log ] is [ 4 ] days old & should be removed.
[ Magenoob_-_Merged_User_Info.csv ] is [ 3 ] days old & should be KEPT.
[ Itunes_Default-Rating_Set.ps1_2019-04-12.log ] is [ 3 ] days old & should be KEPT.
[ MagicTheGathering_-_Scryfall-Default-Cards.json ] is [ 3 ] days old & should be KEPT.
WARNING: [ hd4C490.tmp ] is [ 2 ] days old & should be removed.
WARNING: [ hd45A92.tmp ] is [ 1 ] days old & should be removed.
[ exthost-825471.cpuprofile ] is [ 0 ] days old & should be KEPT.
[ vscode-inno-updater-1555314279.log ] is [ 0 ] days old & should be KEPT.
[ npp.7.6.6.Installer.x64.exe ] is [ 0 ] days old & should be KEPT.
[ hd43E2A.tmp ] is [ 0 ] days old & should be KEPT.
[ hd44D37.tmp ] is [ 0 ] days old & should be KEPT.
[ hd4488C.tmp ] is [ 0 ] days old & should be KEPT.
[ hd45A09.tmp ] is [ 0 ] days old & should be KEPT.
[ Itunes_AlbumAutoRating_Disable.ps1_2019-04-15.log ] is [ 0 ] days old & should be KEPT.
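If the report looks right, the Write-Warning branch above can be extended to actually delete the file; a minimal hedged addition (keep -WhatIf until the output has been verified):
Remove-Item -LiteralPath $FL_Item.FullName -WhatIf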
A way to retrieve the newest file would be:
ls -t weekly/ | head -1
And to get the rest (the older ones), to delete:
ls -t weekly/ | tail -n +2
If you want to keep it bash-only, this may be the simplest way.
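Building on that, here is a minimal bash-only sketch of the whole requirement (it assumes GNU find/sort, filenames without newlines, and that the backups sit in a single directory whose name is a placeholder here): keep the newest file from today, from ~3 days ago and from ~7 days ago, and only echo everything else until the output has been checked.
cd backups || exit 1
pick() {  # newest file that is $1 days old (per find's -mtime rounding)
  find . -maxdepth 1 -type f -mtime "$1" -printf '%T@ %p\n' | sort -rn | head -1 | cut -d' ' -f2-
}
keep0=$(pick 0); keep3=$(pick 3); keep7=$(pick 7)
find . -maxdepth 1 -type f | while IFS= read -r f; do
  case "$f" in
    "$keep0"|"$keep3"|"$keep7") echo "keep   $f" ;;
    *)                          echo "delete $f" ;;   # swap echo for: rm -- "$f"
  esac
done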
Try something like this (remove -WhatIf if you really want to remove the files):
$File1Founded = $false
$File3Founded = $false
$File7Founded = $false

$CurrentDate1 = (Get-Date).Date
$CurrentDate3 = (Get-Date).AddDays(-3).Date
$CurrentDate7 = (Get-Date).AddDays(-7).Date

Get-ChildItem "C:\temp\test1" -File | Sort-Object CreationTime -Descending | ForEach-Object {
    $DateFile = $_.CreationTime.Date

    if ($DateFile -eq $CurrentDate1)
    {
        if ($File1Founded) { $_ } else { $File1Founded = $true }
    }
    elseif ($DateFile -eq $CurrentDate3)
    {
        if ($File3Founded) { $_ } else { $File3Founded = $true }
    }
    elseif ($DateFile -eq $CurrentDate7)
    {
        if ($File7Founded) { $_ } else { $File7Founded = $true }
    }
    else
    {
        $_
    }
} | Remove-Item -WhatIf

Grok pattern for data separated by pipe with whitespaces and optional values in it

I have a textfile/logfile in which the values are separated by a pipe symbol "|" with multiple whitespaces.
Also, I just wanted to try it without gsub.
An example is below.
Does anyone know how to write a grok pattern to extract it for Logstash? I am very new to it. Thanks in advance.
5000| | |applicationLog |ClientLog |SystemLog |Green | |2014-01-07 11:58:48.76948 |12345 (0x1224)|1) Error 2)Sample Log | Configuration Manager
Since the number of | characters is inconsistent between fields, you can match them with .*? and extract the rest of the data with predefined grok patterns:
%{NUMBER:num}.*?%{WORD:2nd}.*?%{WORD:3rd}.*?%{WORD:4th}.*?%{WORD:5th}.*?%{TIMESTAMP_ISO8601}
which will give you,
{
  "num": [ [ "5000" ] ],
  "BASE10NUM": [ [ "5000" ] ],
  "2nd": [ [ "applicationLog" ] ],
  "3rd": [ [ "ClientLog" ] ],
  "4th": [ [ "SystemLog" ] ],
  "5th": [ [ "Green" ] ],
  "TIMESTAMP_ISO8601": [ [ "2014-01-07 11:58:48.76948" ] ],
  "YEAR": [ [ "2014" ] ],
  "MONTHNUM": [ [ "01" ] ],
  "MONTHDAY": [ [ "07" ] ],
  "HOUR": [ [ "11", null ] ],
  "MINUTE": [ [ "58", null ] ],
  "SECOND": [ [ "48.76948" ] ],
  "ISO8601_TIMEZONE": [ [ null ] ]
}
You can test it at the online grok debugger.
Since you are new to grok, you might want to read the grok filter plugin basics.
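For reference, a minimal sketch of how the pattern above would be wired into a Logstash pipeline (it assumes the raw line arrives in the default message field, and a field name is added to TIMESTAMP_ISO8601 so the timestamp is captured too):
filter {
  grok {
    match => { "message" => "%{NUMBER:num}.*?%{WORD:2nd}.*?%{WORD:3rd}.*?%{WORD:4th}.*?%{WORD:5th}.*?%{TIMESTAMP_ISO8601:timestamp}" }
  }
}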
If you can, I'd suggest you also have a look at the dissect filter, which is faster and more efficient than grok:
The Dissect filter is a kind of split operation. Unlike a regular
split operation where one delimiter is applied to the whole string,
this operation applies a set of delimiters to a string value. Dissect
does not use regular expressions and is very fast. However, if the
structure of your text varies from line to line then Grok is more
suitable. There is a hybrid case where Dissect can be used to
de-structure the section of the line that is reliably repeated and
then Grok can be used on the remaining field values with more regex
predictability and less overall work to do.
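As a hedged sketch of that idea for the sample line above (the field names are made up, it assumes every line has the same eleven pipes, and mutate's strip option is used to trim the padding spaces around each value):
filter {
  dissect {
    mapping => {
      "message" => "%{num}|%{f2}|%{f3}|%{app}|%{client}|%{system}|%{status}|%{f8}|%{timestamp}|%{thread}|%{log}|%{config}"
    }
  }
  mutate {
    strip => [ "app", "client", "system", "status", "timestamp", "log", "config" ]
  }
}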

Find ec2 instances with improper or missing tags

I am trying to simply output a list of all instance IDs that do not follow a particular tagging convention.
1. Tag is missing (Tag Keys: Environment or Finance)
2. Environment Tag value is not one of (prod, stg, test, dev)
3. Finance Tag value is not one of (GroupA, GroupB)
For (1) I can use the following:
aws ec2 describe-instances --output json --query 'Reservations[*].Instances[?!not_null(Tags[?Key==`Environment`].Value)] | [].InstanceId'
[
"i-12345678901234567",
"i-76543210987654321"
]
But I still need (2) and (3). What if the tag exists but is empty, or has a typo in the value?
"ec2 --query" functionality is limited and I've yet to find a way for it to get me (2) or (3), especially when it comes to inverting results.
For (2) and (3), I've gone back and forth between trying to modify the output from the CLI to make it easier to parse in jq, versus trying to wrangle the output in jq itself. Here's a pair of outputs from the CLI that I've tried sending to jq to parse, with sample output for 2 instances:
CLI Sample Output [A]: Tag.Value and Tag.Key need to be paired when searching, and then a set of searches has to be negated/inverted...
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, Tag: Tags[]}' | jq '.[]'
{
  "Tag": [
    { "Value": "GroupA", "Key": "Finance" },
    { "Value": "stg", "Key": "Environment" },
    { "Value": "true", "Key": "Backup" },
    { "Value": "Another Server", "Key": "Name" }
  ],
  "ID": "i-87654321"
}
{
  "Tag": [
    { "Value": "GroupB", "Key": "Finance" },
    { "Value": "Server 1", "Key": "Name" },
    { "Value": "true", "Key": "Backup" },
    { "Value": "stg", "Key": "Environment" }
  ],
  "ID": "i-12345678"
}
CLI Sample Output [B]: the tag value being inside an array has been enough to trigger syntax errors when attempting things like jq map or jq select.
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, EnvTag: Tags[?Key==`Environment`].Value, FinTag: Tags[?Key==`Finance`].Value}' | jq '.[]'
{
  "EnvTag": [ "stg" ],
  "ID": "i-87654321",
  "FinTag": [ "GroupA" ]
}
{
  "EnvTag": [ "stg" ],
  "ID": "i-12345678",
  "FinTag": [ "GroupB" ]
}
I find most of the time, when I try to expand some solution from a simpler use case, I only ever end up with cryptic syntax errors due to some oddity in the structure of my incoming dataset.
Example Issue 1
Below is an example of how the inverting / negating fails. This is using CLI output B:
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, EnvTag: Tags[?Key==`Environment`].Value, FinTag: Tags[?Key==`Finance`].Value}' | jq '.[]' | jq 'select(.EnvTag[] | contains ("prod", "dev") | not)'
I would expect the above to return everything except prod and dev. But it looks like the logic is inverted on each item as opposed to the set of contains:
"!A + !B" instead of "!(A or B)"
The resulting dataset returned is a list of everything, including dev and prod.
Example Issue 1.5
I can work around the logic issue by chaining the contains excludes, but then I discover that "contains" won't work for me, as it will pick up typos that still happen to contain the string in question:
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, EnvTag: Tags[?Key==`Environment`].Value, FinTag: Tags[?Key==`Finance`].Value}' | jq '.[]' | jq 'select(.EnvTag[] | contains ("dev") | not) | select(.EnvTag[] | contains ("stg") | not) | select(.EnvTag[] | contains ("test") | not) | select(.EnvTag[] | contains ("prod") | not) | select (.EnvTag[] | contains ("foo") | not)'
prod != production
"prod" contains("prod") = true
"production" contains ("prod") = true <-- bad :(
The pipe-chaining solution below can be greatly simplified. First, in this case, there is no need to invoke jq twice: jq '.[]' | jq ... is equivalent to jq '.[] | ...'
Second, the long pipeline of 'select' filters can be condensed, for example to:
select(.EnvTag[]
| (. != "dev" and . != "stg" and . != "prod" and . != "test" and . != "ops"))
or, if your jq has all/2, even more concisely to:
select( . as $in | all( ("dev", "stg", "prod", "test", "ops"); . != $in.EnvTag[]) )
I believe I've found a solution. It may not be optimal, but I've found a way to pipe-chain excludes of exact strings:
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, EnvTag: Tags[?Key==`Environment`].Value, FinTag: Tags[?Key==`Finance`].Value}' | jq '.[]' | jq 'select(.EnvTag[] != "dev") | select (.EnvTag[] != "stg") | select (.EnvTag[] != "prod") | select (.EnvTag[] != "test") | select (.EnvTag[] != "ops") | .ID'
I verified this by changing an environment tag from "ops" to "oops".
Upon running this query, it returned the single instance with the oops tag.
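Putting the pieces together for (1)-(3) in one go, here is a hedged sketch that prints an instance ID when its Environment tag is missing/empty or not in the allowed set, or its Finance tag is missing/empty or not GroupA/GroupB (EnvTag and FinTag come from the --query above, IN requires jq 1.5+, and the allowed value lists are the ones from the question):
aws ec2 describe-instances --output json --query 'Reservations[].Instances[].{ID:InstanceId, EnvTag: Tags[?Key==`Environment`].Value, FinTag: Tags[?Key==`Finance`].Value}' \
| jq -r '.[]
    | select(
        ([(.EnvTag // [])[] | IN("prod","stg","test","dev")] | any | not)
        or
        ([(.FinTag // [])[] | IN("GroupA","GroupB")] | any | not)
      )
    | .ID'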

Filter logstash does not work with Grok Debugger

I use the Grok Debugger to verify my pattern, which is: %{DATA:evolution} %{DATA:value}
The problem is that it does not capture the content of the field "value".
If I want to match two successive strings, how can I do that, please?
Use WORD instead of DATA: DATA (.*?) is non-greedy, so a trailing %{DATA:value} can match an empty string, while WORD requires actual word characters:
%{WORD:evolution} %{WORD:value}
Results:
{
  "evolution": [ [ "Send" ] ],
  "value": [ [ "functionHandle" ] ]
}

use smo to clone azure SQL database?

I'm writing a program to test update scripts for Azure SQL.
The idea is to
- first clone a database (or fill a clone with the source schema and content)
- then run the update script on the clone
Locally I have this working, but for Azure I have the problem that I don't see any file names. If I restore one database to another on the same Azure "server", don't I have to rename the data files during restore too?
For local restore I do this:
restore.Devices.AddDevice(settings.BackupFileName, DeviceType.File);
restore.RelocateFiles.Add(new RelocateFile("<db>", Path.Combine(settings.DataFileDirectory, settings.TestDatabaseName + ".mdf")));
restore.RelocateFiles.Add(new RelocateFile("<db>_log", Path.Combine(settings.DataFileDirectory, settings.TestDatabaseName + "_1.ldf")));
restore.SqlRestore(srv);
Is something similar required for cloning a database on azure?
Lots of Greetings!
Volker
You can create a database as a copy of [source]:
CREATE DATABASE database_name [ COLLATE collation_name ]
| AS COPY OF [source_server_name].source_database_name
{
(<edition_options> [, ...n])
}
<edition_options> ::=
{
MAXSIZE = { 100 MB | 500 MB | 1 | 5 | 10 | 20 | 30 … 150…500 } GB
| EDITION = { 'web' | 'business' | 'basic' | 'standard' | 'premium' }
| SERVICE_OBJECTIVE =
{ 'basic' | 'S0' | 'S1' | 'S2' | 'S3'
| 'P1' | 'P2' | 'P3' | 'P4'| 'P6' | 'P11'
| { ELASTIC_POOL(name = <elastic_pool_name>) } }
}
[;]
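A minimal hedged example (server and database names are placeholders); run it while connected to the master database of the target server, and note that the copy completes asynchronously:
-- create the copy (placeholder names)
CREATE DATABASE MyDb_TestCopy AS COPY OF MyServer.MyDb;
-- check progress from master; state_desc switches from COPYING to ONLINE when done
SELECT name, state_desc FROM sys.databases WHERE name = 'MyDb_TestCopy';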
