Automate Dbeaver SQL Export - syntax

DBeaver recently pushed an update that lets you script an export. The documentation lists the export command as
#export {"type": "csv", "producer": {...}, "consumer": {...}, "processor": {...}}
I'm not sure how to format this to get the code to run. The JSON text in the documentation looks like the following:
{
  "type": <ID of the processor>,
  "producer": {
    <producer settings>
  },
  "consumer": {
    <consumer settings>
  },
  "processor": {
    <processor-specific settings>
  },
}
I'm aiming to get the command to output data to a given folder and file name. The relevant IDs are outputFolder and outputFilePattern, which belong in the consumer settings. I've tried various permutations to get this to work, but receive errors like unterminated object at outputFolder, expecting ':' at outputFolder, invalid syntax, etc. The most obvious permutation is:
#export {"type": "csv", "producer": {}, "consumer": {"outputFolder": "C:\downloads", "outputFilePattern": "Data"}, "processor": {...}}
This returns the error 'invalid escape sequence at column 52 path $..outputFolder'. If you don't put quotes around outputFolder it returns the same error.
https://dbeaver.com/docs/wiki/Export-Command/#Producer-Settings
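For what it's worth, the error points at the backslash: in JSON, \d is not a valid escape sequence, so a Windows path needs doubled backslashes (or forward slashes), and the literal {...} placeholder is itself invalid JSON and must be replaced with real settings or an empty object. A sketch of the command with both fixed (untested here):
#export {"type": "csv", "producer": {}, "consumer": {"outputFolder": "C:\\downloads", "outputFilePattern": "Data"}, "processor": {}}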

Related

How do I use FreeFormTextRecordSetWriter

In my NiFi controller I want to configure the FreeFormTextRecordSetWriter, but I have no idea what I should put in the "Text" field. I'm getting the text from my source (in my case GetSolr), and just want to write it out as-is, period.
The documentation and mailing list do not seem to tell me how this is done; any help appreciated.
EDIT: Here is the sample input + output I want to achieve (as you can see: no transformation needed, plain text, no JSON input).
EDIT: I now realize that I can't tell GetSolr to return just CSV data; I have to use JSON.
So referencing by attribute seems to be fine. What the documentation omits is that the ${flowFile} attribute should contain the complete FlowFile that is returned.
Sample input:
{
  "responseHeader": {
    "zkConnected": true,
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "*:*",
      "_": "1553686715465"
    }
  },
  "response": {
    "numFound": 3194,
    "start": 0,
    "docs": [
      {
        "id": "{402EBE69-0000-CD1D-8FFF-D07756271B4E}",
        "MimeType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "FileName": "Test.docx",
        "DateLastModified": "2019-03-27T08:05:00.103Z",
        "_version_": 1629145864291221504,
        "LAST_UPDATE": "2019-03-27T08:16:08.451Z"
      }
    ]
  }
}
Wanted output
{402EBE69-0000-CD1D-8FFF-D07756271B4E}
BTW: The documentation says this:
The text to use when writing the results. This property will evaluate the Expression Language using any of the fields available in a Record.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
I want to use my source's text, so I'm confused.
You need to use Expression Language as if the record's fields were the FlowFile's attributes.
Example:
Input:
{
  "t1": "test",
  "t2": "ttt",
  "hello": true,
  "testN": 1
}
Text property in FreeFormTextRecordSetWriter:
${t1} k!${t2} ${hello}:boolean
${testN}Num
Output (using ConvertRecord):
test k!ttt true:boolean
1Num
EDIT:
Seems like what you need is to read from Solr and write a single-column CSV; for that you should use CSVRecordSetWriter. As for the schema:
you should consider upgrading to 1.9.1, since starting from 1.9.0 the schema can be inferred for you;
otherwise, you can set Schema Access Strategy to Use 'Schema Text' Property
and then use the following schema in Schema Text:
{
  "name": "MyClass",
  "type": "record",
  "namespace": "com.acme.avro",
  "fields": [
    {
      "name": "id",
      "type": "string"
    }
  ]
}
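If the reader presents each Solr doc as a record, a CSVRecordSetWriter with that schema should then emit a single-column CSV, roughly like this (assuming the writer is configured to include the header line):
id
{402EBE69-0000-CD1D-8FFF-D07756271B4E}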
This should work.
I'll edit it into my answer. If it works for you, please choose my answer :)

Generating types from schema for TypeScript yields empty file

I take these steps; that is, I first create the JSON:
apollo-codegen introspect-schema schema.graphqls --output schema.json
which yields:
{
  "data": {
    "__schema": {
      "queryType": {
        "name": "Query"
      },
      "mutationType": null,
      "subscriptionType": null,
      "types": [
        {
          "kind": "OBJECT",
          "name": "Query",
          "description": "",
          "fields": [
            {
              ...
But afterwards, when I run:
apollo-codegen generate **/*.graphqls --schema schema.json --target typescript --output schema.ts
I get an empty schema.ts types file:
/* tslint:disable */
// This file was automatically generated and should not be edited.
/* tslint:enable */
Ideas?
In your apollo-codegen generate line you're targeting .graphqls files. Should that be .graphql?
Which file(s) are you expecting the generator to match? It needs files that contain queries or mutations that would be run against your schema; it won't generate code for the schema itself:
The purpose of this command is to generate types for query and mutation operations made against the schema (it will not generate types for the schema itself).
Source: https://www.apollographql.com/docs/angular/features/developer-tooling.html#introspect
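For illustration, codegen needs at least one matched file containing an operation; a minimal, hypothetical query document (the operation and field names here are invented for this sketch) might look like:
query GetSomething {
  someField
}
With a file like that picked up by the glob, generate should emit non-empty types.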

Duplicated mapping key in serverless.yml

I'm using Serverless to deploy a couple of functions written in C# to AWS.
While deploying, the message duplicated mapping key in "...\serverless.yml" is thrown.
Each function deploys fine on its own, but when they are put together the error appears.
What am I missing?
{
  "service": "serverlessquick",
  "provider": {
    "name": "aws",
    "runtime": "nodejs4.3"
  },
  "functions": {
    "hello": {
      "handler": "handler.hello",
      "events": [
        {
          "http": {
            "path": "hello",
            "method": "get"
          }
        }
      ]
    }
  }
}
Convert your YAML code into JSON; that way any mistakes are obvious and more likely to be picked up by the parser. If anyone comes across this, use the referenced config above.
I got a similar error, but in my case it was an indentation problem. Go through your YML file and check that the changes you have made have the correct indentation. A JSON parser, as mentioned by others, can help narrow down where in the YML file the problem exists.
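For illustration, a duplicated mapping key usually means the same key appears twice at the same level of the YAML; a hypothetical serverless.yml fragment that would trigger it:
functions:
  hello:
    handler: handler.hello
  hello: # same key again at the same level -> "duplicated mapping key"
    handler: handler.world
Renaming one of the duplicates (or merging the two definitions under distinct names) resolves the error.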

Loading JSON file with serde in Cloudera

I am trying to work with a JSON file with this bag structure:
{
  "user_id": "kim95",
  "type": "Book",
  "title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.",
  "year": "1995",
  "publisher": "ACM Press and Addison-Wesley",
  "authors": [
    {
      "name": "null"
    }
  ],
  "source": "DBLP"
}
{
  "user_id": "marshallo79",
  "type": "Book",
  "title": "Inequalities: Theory of Majorization and Its Application.",
  "year": "1979",
  "publisher": "Academic Press",
  "authors": [
    {
      "name": "Albert W. Marshall"
    },
    {
      "name": "Ingram Olkin"
    }
  ],
  "source": "DBLP"
}
I tried to use a SerDe to load the JSON data into Hive. I followed both approaches that I saw here: http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
With this code :
CREATE EXTERNAL TABLE IF NOT EXISTS serd (
user_id:string,
type:string,
title:string,
year:string,
publisher:string,
authors:array<struct<name:string>>,
source:string)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/hdfs/data/book-seded_workings-reduced.json';
I got this error:
error while compiling statement: failed: parseexception line 2:17 cannot recognize input near ':' 'string' ',' in column type
I also tried this version: https://github.com/rcongiu/Hive-JSON-Serde
which gave a different error :
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.openx.data.jsonserde.JsonSerde
Any idea?
I also want to know what the alternatives are for working with JSON like this and making queries on the 'name' field inside 'authors'. Would Pig or Hive be better?
I have already converted it into a "tsv" file. But since my authors column is a tuple, I don't know how to query 'name' with Hive if I build a table from this file. Should I change my script for the "tsv" conversion or keep it? Or are there alternatives with Hive or Pig?
Hive does not have built-in support for JSON, so to use JSON with Hive we need to use third-party JARs like:
https://github.com/rcongiu/Hive-JSON-Serde
You have a couple of issues with the create table statement. It should look like this:
CREATE EXTERNAL TABLE IF NOT EXISTS serd (
user_id string, type string, title string, year string, publisher string, authors array<struct<name:string>>, source string)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION...
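With authors mapped as an array of structs, querying the 'name' field the asker mentions should then be possible with a LATERAL VIEW; a sketch against the table above:
SELECT author.name
FROM serd
LATERAL VIEW explode(authors) a AS author;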
Note that the JSON records you are using must keep each record on a single line, like this:
{"user_id": "kim95", "type": "Book", "title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.", "year": "1995", "publisher": "ACM Press and Addison-Wesley", "authors": [{"name":"null"}], "source": "DBLP"}
{"user_id": "marshallo79", "type": "Book", "title": "Inequalities: Theory of Majorization and Its Application.", "year": "1979", "publisher": "Academic Press","authors": [{"name":"Albert W. Marshall"},{"name":"Ingram Olkin"}], "source": "DBLP"}
After downloading the project from Git, you need to compile it, which will create a JAR. You then need to add this JAR to the Hive session before running the CREATE TABLE statement.
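For example (the path here is illustrative; point it at wherever your build placed the JAR):
ADD JAR /path/to/json-serde-1.3.6-jar-with-dependencies.jar;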
Hope it helps...!!!
ADD JAR only adds the JAR to the current session, so it won't be available afterwards and you eventually get the error again.
Instead, get the JAR loaded on all the nodes into the Hive and MapReduce library paths, like the locations below, so that both the Hive and MapReduce components pick it up whenever they are called:
/hadoop/CDH_5.2.0_Linux_parcel/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/json-serde-1.3.6-jar-with-dependencies.jar
/hadoop/CDH_5.2.0_Linux_parcel/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-mapreduce/lib/json-serde-1.3.6-jar-with-dependencies.jar
Note: this path varies by cluster.

Parse response from a "folder items" request to find a file

Using v2 of the Box API, I use the folder items request to get information on files in a folder: http://developers.box.com/docs/#folders-retrieve-a-folders-items
I'm looking at trying to parse the response data. Any ideas how I can do this in bash to easily find a file in the user's account? I would like to match on the file's name and get the file's ID as well.
The response looks something like this:
{
  "total_count": 25,
  "entries": [
    {
      "type": "file",
      "id": "531117507",
      "sequence_id": "0",
      "etag": "53a93ebcbbe5686415835a1e4f4fff5efea039dc",
      "name": "agile-web-development-with-rails_b10_0.pdf"
    },
    {
      "type": "file",
      "id": "1625774972",
      "sequence_id": "0",
      "etag": "32dd8433249b1a59019c465f61aa017f35ec9654",
      "name": "Continuous Delivery.pdf"
    },
    { ...
For bash, you can use sed or awk. Look at Parsing JSON with Unix tools.
Also, if you can use a programming language, then Python may be your fastest option. It has a nice json module: http://docs.python.org/library/json.html. It has a simple decode API which will give you a dict as the output.
Then
import json
response_dict = json.loads(your_response)
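From there, finding a file's ID by name is a short walk over the entries; a sketch assuming the response shape shown above (your_response being the raw JSON string from the API, and the file name taken from the sample):
import json

# your_response holds the raw JSON string returned by the folder-items request
response_dict = json.loads(your_response)

# Walk the folder items and match on the file name to recover its ID
for entry in response_dict["entries"]:
    if entry["type"] == "file" and entry["name"] == "Continuous Delivery.pdf":
        print(entry["id"])  # prints 1625774972 for the sample response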
I recommend using jq for parsing/munging json in bash. It is WAY better than trying to use sed or awk to parse it.
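A minimal sketch, assuming the response has been saved to response.json and using a file name from the sample above:
jq -r '.entries[] | select(.name == "Continuous Delivery.pdf") | .id' response.json
This prints the matching file's id (1625774972 in the sample).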
