NiFi: convert SQL to structured JSON - apache-nifi

I am trying to figure out a way to get data out of SQL and format it into a specific JSON structure, and I am having a hard time doing that in NiFi.
The data in the table looks like this:
{
"location_id": "123456",
"name": "My Organization",
"address_1": "Address 1",
"address_2": "Suite 123",
"city": "My City",
"state": "FL",
"zip_code": "33333",
"description": "",
"longitude": "-2222.132131321332113",
"latitude": "111.21321321321321321",
"type": "data type"
}
And I want to convert it into a format like this:
{
"type": "FeatureCollection",
"features": [
{
"geometry": {
"type": "Point",
"coordinates": [
$.longitude,
$.latitude
]
},
"type": "Feature",
"properties": {
"name": $.name,
"phone": $.phone_number,
"address1": $.address_1,
"address2": $.address_2,
"city": $.city,
"state": $.state,
"zip": $.zip_code,
"type": $.type
}
}
]
}
This is what I have so far, and by all means, if I am doing this in a weird way let me know.
I was thinking I could split all of these into single-record JSONs and format each one like this:
{
"geometry": {
"type": "Point",
"coordinates": [
$.longitude,
$.latitude
]
},
"type": "Feature",
"properties": {
"name": $.name,
"phone": $.phone_number,
"address1": $.address_1,
"address2": $.address_2,
"city": $.city,
"state": $.state,
"zip": $.zip_code,
"type": $.type
}
}
And then merge all of the records together and wrap the result in this:
{
"type": "FeatureCollection",
"features": [
]
}
I definitely feel like I am doing this in a weird way, just not sure how to get it done, haha.

Try ExecuteSQLRecord with a JsonRecordSetWriter instead of ExecuteSQL; this lets you output the rows as JSON objects without converting to/from Avro. If you don't have too many rows (enough to cause an out-of-memory error), you can use JoltTransformJSON to do the whole transformation (without splitting the rows) with the following Chain spec:
[
{
"operation": "shift",
"spec": {
"#FeatureCollection": "type",
"*": {
"#Feature": "features[&1].type",
"name": "features[&1].properties.name",
"address_1": "features[&1].properties.address_1",
"address_2": "features[&1].properties.address_2",
"city": "features[&1].properties.city",
"state": "features[&1].properties.state",
"zip_code": "features[&1].properties.zip",
"type": "features[&1].properties.type",
"longitude": "features[&1].geometry.coordinates.longitude",
"latitude": "features[&1].geometry.coordinates.latitude"
}
}
}
]
If you do have too many rows, you can use SplitJson to split them into smaller chunks, then JoltTransformJSON (with the above spec), then MergeRecord to merge them back into one large array. To get them nested into the features field, you could use ReplaceText to "wrap" the array in the outer JSON object, but that too may cause an out-of-memory error.
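For comparison, here is the same row-to-FeatureCollection mapping as a plain Python sketch (useful for checking the target shape outside NiFi; the field names are the ones from the question, and the `float()` casts assume the coordinates should be numeric, which GeoJSON expects):

```python
# Sketch: build the target GeoJSON FeatureCollection from a list of
# row dicts, e.g. as returned by a database cursor.
def to_feature_collection(rows):
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON expects [longitude, latitude], as numbers
                "coordinates": [float(row["longitude"]), float(row["latitude"])],
            },
            "properties": {
                "name": row["name"],
                "address1": row["address_1"],
                "address2": row["address_2"],
                "city": row["city"],
                "state": row["state"],
                "zip": row["zip_code"],
                "type": row["type"],
            },
        })
    return {"type": "FeatureCollection", "features": features}
```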

Related

Problems With Array when Generating Dynamic Schema for Power Automate

Power Automate errors when trying to create a Power App from the following schema, generated automatically via Swashbuckle decorations:
{
"dynSearchAndReplaceText": {
"type": "object",
"required": [
"fileName",
"fileContent",
"phrases"
],
"properties": {
"phrases": [
],
"fileName": {
"type": "string",
"x-ms-visibility": "important",
"x-ms-summary": "Filename",
"description": "The filename of the source file, the file extension is mandatory: 'file.pdf' and not 'file'"
},
"fileContent": {
"format": "byte",
"type": "string",
"x-ms-visibility": "important",
"x-ms-summary": "File Content",
"description": "The file content of the source file"
}
}
}
}
I thought the problem might be related to the array of phrases (I want users to be able to provide a number of strings to search for, along with their individual replacements).
The 'phrases' array is as below:
"phrases": [
{
"replacementText": {
"type": "string",
"x-ms-visibility": "important",
"x-ms-summary": "ReplacementText",
"description": "The text to be inserted"
},
"searchText": {
"type": "string",
"x-ms-visibility": "important",
"x-ms-summary": "SearchText",
"description": "The text value to locate"
},
"type": "object",
"x-ms-visibility": "important",
"description": "A text phrase to locate and replace."
}
],
Does Power Automate support arrays at this depth in the schema?
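Separate from any depth limit, the `phrases` declaration above is not a valid schema on its own: in Swagger/JSON Schema, an array property is declared as an object with `"type": "array"` and an `items` sub-schema, not as a bare JSON array. A sketch of the corrected shape, using the property names from the question:

```python
# Corrected shape for the "phrases" property: the schema of each element
# lives under "items", and the array itself is declared with type "array".
phrases_schema = {
    "type": "array",
    "x-ms-visibility": "important",
    "description": "Text phrases to locate and replace.",
    "items": {
        "type": "object",
        "description": "A text phrase to locate and replace.",
        "required": ["searchText", "replacementText"],
        "properties": {
            "searchText": {
                "type": "string",
                "x-ms-summary": "SearchText",
                "description": "The text value to locate",
            },
            "replacementText": {
                "type": "string",
                "x-ms-summary": "ReplacementText",
                "description": "The text to be inserted",
            },
        },
    },
}
```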

VSC validation with $schema and usage of additionalProperties = false

I have a dataobject.json and a corresponding example.json. I'd like to compare the two to check that everything in the example uses the same notation as in the dataobject.
In the example I added the dataobject file as a schema to validate against. This works, but only for the required fields, not for the optional properties: there the validation doesn't find a problem, even if there are some.
To validate those as well, I added the "additionalProperties": false line. This works in general, so I find all the deviations, but I also get an error that the property $schema is not allowed.
How can I solve this?
the dataobject
{
"$schema": "http://json-schema.org/draft-07/schema",
"type": "object",
"title": "GroupDO",
"required": [
"id",
"name"
],
"additionalProperties": false,
"properties": {
"id": {
"type": "string",
"format": "uuid",
"example": "5dd6c80a-3376-4bce-bc47-8t41b3565325",
"description": "Unique id ."
},
"name": {
"type": "string",
"example": "ABD",
"description": "The name."
},
"GroupSort": {
"type": "integer",
"format": "int32",
"example": 1,
"description": "Defines in which order the groups should appear."
},
"GroupTextList": {
"type": "array",
"description": "A descriptoin in multiple languages.",
"items": {
"$ref": "../../common/dataobjects/Description_1000_DO.json"
}
},
"parentGroupId": {
"type": "string",
"format": "uuid",
"example": "8e590f93-1ab6-40e4-a5f4-aa1eeb2b6a80",
"description": "Unique id for the parent group."
}
},
"description": "DO representing a group object. "}
the example
{ "$schema": "../dataobjects/GroupDO.json",
"id": "18694b46-0833-4790-b780-c7897ap08500",
"version": 1,
"lastChange": "2020-05-12T13:57:39.935305",
"sort": 3,
"name": "STR",
"parentGroupId": "b504273e-61fb-48d1-aef8-c289jk779709",
"GroupTexts": [
{
"id": "7598b668-d9b7-4d27-a489-19e45h2bdad0",
"version": 0,
"lastChange": "2020-03-09T14:14:25.491787",
"languageIsoCode": "de_DE",
"description": "Tasche"
},
{
"id": "376e82f8-837d-4bb2-a21f-a9e0ebd59e23",
"version": 0,
"lastChange": "2020-03-09T14:14:25.491787",
"languageIsoCode": "en_GB",
"description": "Bag"
}
]
}
the problem messages:
property $schema is not allowed
Thanks in advance for your help.
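The `$schema` error follows directly from `"additionalProperties": false`: every key in the instance, including `$schema`, must then be declared under `properties`. A minimal Python sketch of the mechanism and the usual workaround (declaring `$schema` as an allowed string property), without a schema library:

```python
# Minimal illustration of the additionalProperties=false behaviour:
# any instance key not declared under "properties" is rejected.
schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "name": {"type": "string"},
        # Workaround: declare $schema as an allowed (string) property
        "$schema": {"type": "string"},
    },
}

def extra_keys(instance, schema):
    """Return instance keys that additionalProperties=false would reject."""
    if schema.get("additionalProperties") is not False:
        return set()
    return set(instance) - set(schema.get("properties", {}))

# With "$schema" declared above, the example instance is accepted:
assert extra_keys({"$schema": "../dataobjects/GroupDO.json", "name": "STR"}, schema) == set()
```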

Add images via Shopware 6 API

I have a Shopware 6.3 shop and need to migrate images to it using the integration API.
How should I construct the body for a media upload? Do I need to put the file somewhere, or just pass in a link?
I have managed to push new products into Shopware via the guide here: https://docs.shopware.com/en/shopware-platform-dev-en/admin-api-guide/writing-entities?category=shopware-platform-dev-en/admin-api-guide#creating-entities but I am not sure how to handle media. That guide only explains how to link already uploaded media files to products (https://docs.shopware.com/en/shopware-platform-dev-en/admin-api-guide/writing-entities?category=shopware-platform-dev-en/admin-api-guide#media-handling), with no examples of how to actually push the media files themselves.
I have URLs for each image I need (in the database, along with product ids and image positions).
The entity schema describes media as:
"media": {
"name": "media",
"translatable": [
"alt",
"title",
"customFields"
],
"properties": {
"id": {
"type": "string",
"format": "uuid"
},
"userId": {
"type": "string",
"format": "uuid"
},
"mediaFolderId": {
"type": "string",
"format": "uuid"
},
"mimeType": {
"type": "string",
"readOnly": true
},
"fileExtension": {
"type": "string",
"readOnly": true
},
"uploadedAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"fileName": {
"type": "string",
"readOnly": true
},
"fileSize": {
"type": "integer",
"format": "int64",
"readOnly": true
},
"metaData": {
"type": "object",
"readOnly": true
},
"mediaType": {
"type": "object",
"readOnly": true
},
"alt": {
"type": "string"
},
"title": {
"type": "string"
},
"url": {
"type": "string"
},
"hasFile": {
"type": "boolean"
},
"private": {
"type": "boolean"
},
"customFields": {
"type": "object"
},
"createdAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"updatedAt": {
"type": "string",
"format": "date-time",
"readOnly": true
},
"translated": {
"type": "object"
},
"tags": {
"type": "array",
"entity": "tag"
},
"thumbnails": {
"type": "array",
"entity": "media_thumbnail"
},
"user": {
"type": "object",
"entity": "user"
},
"categories": {
"type": "array",
"entity": "category"
},
"productManufacturers": {
"type": "array",
"entity": "product_manufacturer"
},
"productMedia": {
"type": "array",
"entity": "product_media"
},
"avatarUser": {
"type": "object",
"entity": "user"
},
"mediaFolder": {
"type": "object",
"entity": "media_folder"
},
"propertyGroupOptions": {
"type": "array",
"entity": "property_group_option"
},
"mailTemplateMedia": {
"type": "array",
"entity": "mail_template_media"
},
"documentBaseConfigs": {
"type": "array",
"entity": "document_base_config"
},
"shippingMethods": {
"type": "array",
"entity": "shipping_method"
},
"paymentMethods": {
"type": "array",
"entity": "payment_method"
},
"productConfiguratorSettings": {
"type": "array",
"entity": "product_configurator_setting"
},
"orderLineItems": {
"type": "array",
"entity": "order_line_item"
},
"cmsBlocks": {
"type": "array",
"entity": "cms_block"
},
"cmsSections": {
"type": "array",
"entity": "cms_section"
},
"cmsPages": {
"type": "array",
"entity": "cms_page"
},
"documents": {
"type": "array",
"entity": "document"
}
}
},
but it is not clear which fields are crucial. Do I need to create a product-media folder first and then use its id when making a POST request to the media endpoint? Can I just specify the URL, and will Shopware download the image itself to a folder, or keep pointing at the URL I have used? I need to house the images inside Shopware.
There is no problem for me to download the images from the URLs and push them to Shopware, but I am not sure how to use the API for it (there are a lot of images and they need to be done in bulk).
One possible solution:
FIRST: create a new media object: POST /api/{apiVersion}/media?_response=true
SECOND: upload the image: POST /api/{apiVersion}/_action/media/{mediaId}/upload?extension={extension}&fileName={imgName}&_response=true
More information can be found here: https://forum.shopware.com/discussion/comment/278603/#Comment_278603
In case the images are for products, use the endpoint POST /api/{apiVersion}/product-media and set the coverId.
A complete listing of all routes is available via the OpenAPI schema: [your-domain/localhost]/api/v3/_info/openapi3.json
It's also possible to set all the media and the cover/coverId during product creation in one request. To do so, set the product cover and product media:
{
"coverId":"3d5ebde8c31243aea9ecebb1cbf7ef7b",
"productNumber":"SW10002","active":true,"name":"Test",
"description":"fasdf",
"media":[{
"productId":"94786d894e864783b546fbf7c60a3640",
"mediaId":"084f6aa36b074130912f476da1770504",
"position":0,
"id":"3d5ebde8c31243aea9ecebb1cbf7ef7b"
},
{
"productId":"94786d894e864783b546fbf7c60a3640",
"mediaId":"4923a2e38a544dc5a7ff3e26a37ab2ae",
"position":1,
"id":"600999c4df8b40a5bead55b75efe688c"
}],
"id":"94786d894e864783b546fbf7c60a3640"
}
Keep in mind to check that the bearer token is still valid, refreshing it when it is within five minutes of expiry, for example like this:
if (JwtToken.ValidTo >= DateTime.UtcNow + TimeSpan.FromMinutes(5))
{
return Client.Get(request);
}
else
{
// refresh the token by new authentication
IntegrationAuthenticator(this.key, this.secret);
}
return Client.Get(request);
This will work for Shopware 6.4
As general advice: it depends. The APIs changed a little bit since 6.4, and there is also official documentation available at https://shopware.stoplight.io/docs/admin-api/docs/guides/media-handling.md.
However, I think it is always a little easier to have a real-life example. What I do in our production environment is basically these steps:
(Optional) Check if the media object already exists.
Create a media object (Step 1 below, via POST api/_action/sync).
Upload the image using the new media id as a reference (Step 2 below).
Let us assume the filename is yourfilename.jpg. You will also need a media-folder-id, which references the image folder within Shopware. This can be obtained in the Shopware admin via Content > Media > Product Media.
Step 0
Before uploading an image to Shopware, you want to ensure that the image does not already exist, so that you can skip it.
This step is optional, as it is not mandatory for creating an image. However, you will want some sort of validation mechanism in a production environment.
Request-Body
POST api/search/media
This will run a search request against the Shopware API and return the matching media ids.
{
"filter":[
{
"type":"equals",
"field":"fileName",
"value":"yourfilename"
},
{
"type":"equals",
"field":"fileExtension",
"value":"jpg"
},
{
"type":"equals",
"field":"mediaFolderId",
"value":"d798f70b69f047c68810c45744b43d6f"
}
],
"includes":{
"media":[
"id"
]
}
}
Step 1
Create a new media-file
Request-Body
POST api/_action/sync
This request will create a new media-object in Shopware.
The value for media_id can be any UUID. I will use this value: 94f83a75669647288d4258f670a53e69
The customFields property is optional; I just use it to keep a reference hash value, which I can use to check for changed values.
The value for the media folder id is the one you will get from your Shopware backend.
{
"create-media": {
"entity": "media",
"action": "upsert",
"payload": [
{
"id": "{{media_id}}",
"customFields": {"hash": "{{file.hash}}"},
"mediaFolderId": "{{mediaFolderId}}"
}
]
}
}
Response
The response will tell you whether everything worked as expected.
{
"success":true,
"data":{
"create-media":{
"result":[
{
"entities":{
"media":[
"94f83a75669647288d4258f670a53e69"
],
"media_translation":[
{
"mediaId":"94f83a75669647288d4258f670a53e69",
"languageId":"2fbb5fe2e29a4d70aa5854ce7ce3e20b"
}
]
},
"errors":[
]
}
],
"extensions":[
]
}
},
"extensions":[
]
}
Step 2
This is the step where we upload the image to Shopware. We will use the variant with the content type image/jpeg; however, a payload with a URL attribute would also work. See the details in the official documentation.
Request-Body
POST api/_action/media/94f83a75669647288d4258f670a53e69/upload?extension=jpg&fileName=yourfilename
Note that the media id is part of the URL, and so is the filename, but without the .jpg file extension!
The body is pretty straightforward; in our case there is no payload, as we upload with Content-Type: image/jpeg.
This would be the payload if you want to use a URL as the resource:
{
"url": "<url-to-your-image>"
}

Nifi JoltTransformRecord UUID in default transform not working as expected

I have a NiFi workflow which uses JoltTransformRecord to do some record-based data manipulation. I have to add a default uuid value to each message in the flow file.
My JoltTransformRecord configuration is as below.
Jolt specification :
[{
"operation": "shift",
"spec": {
"payload": "data.payload"
}
}, {
"operation": "default",
"spec": {
"header": {
"source": "${source}",
"client_id": "${client_id}",
"uuid": "${UUID()}",
"payload_type":"${payload_type}"
}
}
}]
The shift operation and all the other default operations work fine as expected, but the UUID comes out the same for all messages. I need a different UUID for each message, and I don't want to add another processor just for this.
My workflow is below:
Reader & Writer configurations for the JoltTransformRecord processor are:
IngestionSchemaJsonTreeReader ( From JsonTreeReader Processor ):
IngestionSchemaAvroRecordSetWriter ( From AvroWriter Processor ) :
Configured schema registry has below schemas defined in it.
com.xyz.ingestion.pre_json
{
"type": "record",
"name": "event",
"namespace": "com.xyz.ingestion.raw",
"doc": "Event ingested to kafka",
"fields": [
{
"name": "payload",
"type": [
"null",
"string"
],
"default": "null"
}
]
}
com.xyz.ingestion.raw
{
"type": "record",
"name": "event",
"namespace": "com.xyz.ingestion.raw",
"doc": "Event ingested to kafka",
"fields": [
{
"type": {
"name": "header",
"type": "record",
"namespace": "com.xyz.ingestion.raw.header",
"doc": "Header data for event ingested",
"fields": [
{
"name": "payload_type",
"type": "string"
},
{
"name": "uuid",
"type": "string",
"size": "36"
},
{
"name": "client_id",
"type": "string"
},
{
"name": "source",
"type": "string"
}
]
},
"name": "header"
},
{
"type": {
"name": "data",
"type": "record",
"namespace": "com.xyz.ingestion.raw.data",
"doc": "Payload for event ingested",
"fields": [
{
"name": "payload",
"type": [
"null",
"string"
],
"default": "null"
}
]
},
"name": "data"
}
]
}
The expression language is evaluated per record, and UUID() is executed for each evaluation, so the uuid should be unique for each record. From the information you provided I cannot see why you are getting duplicate uuids.
I tried to reproduce your problem with the following flow:
GenerateFlowFile:
SplitJson: configure $ as the JsonPathExpression to split the JSON array into records.
JoltTransformRecord:
As you can see, the way I am adding the UUID is no different from how you do it, but I am getting different UUIDs as expected:
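Outside NiFi, the per-record behaviour described here amounts to generating a fresh UUID for each record rather than one shared value; a quick Python sketch of what that should look like:

```python
import uuid

# What NiFi's per-record evaluation of ${UUID()} amounts to: a fresh UUID
# generated for each record, not one value shared across the record set.
records = [{"payload": p} for p in ("a", "b", "c")]
for record in records:
    record["header"] = {"uuid": str(uuid.uuid4())}

uuids = [r["header"]["uuid"] for r in records]
assert len(set(uuids)) == len(records)  # every record gets its own UUID
```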

How do I parse a date field and generate a date in string format in NiFi

Each of my flow files contains 2000 records. I would like to parse 01/01/2000 into a column year = 2000, a column month = Jan and a column day = 01,
i.e. split the input column 01/01/2000 into 3 values separated by commas: 01,Jan,2000.
Let's say you have a schema like this for a person with a birthday, and you want to split out the birthday:
{
"name": "person",
"namespace": "nifi",
"type": "record",
"fields": [
{ "name": "first_name", "type": "string" },
{ "name": "last_name", "type": "string" },
{ "name": "birthday", "type": "string" }
]
}
You would need to modify the schema so it has the fields you want to add:
{
"name": "person",
"namespace": "nifi",
"type": "record",
"fields": [
{ "name": "first_name", "type": "string" },
{ "name": "last_name", "type": "string" },
{ "name": "birthday", "type": "string" },
{ "name": "birthday_year", "type": ["null", "string"] },
{ "name": "birthday_month", "type": ["null", "string"] },
{ "name": "birthday_day", "type": ["null", "string"] }
]
}
Let's say the input record has the following text:
bryan,bende,1980-01-01
We can use UpdateRecord with a CsvReader and CsvWriter, and UpdateRecord can populate the three fields we want by parsing the original birthday field.
If we send the output to LogAttribute we should see the following now:
first_name,last_name,birthday,birthday_year,birthday_month,birthday_day
bryan,bende,1980-01-01,1980,01,01
Here is the link to the record path guide for details on the toDate and format functions:
https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
You can use UpdateRecord for this. Assuming your input record has the date column called "myDate", you'd set Replacement Value Strategy to Record Path Value, and your user-defined properties might look something like:
/day format(toDate(/myDate, "dd/MM/yyyy"), "dd")
/month format(toDate(/myDate, "dd/MM/yyyy"), "MMM")
/year format(toDate(/myDate, "dd/MM/yyyy"), "yyyy")
Your output schema would look like this:
{
"namespace": "nifi",
"name": "myRecord",
"type": "record",
"fields": [
{"name": "day","type": "int"},
{"name": "month","type": "string"},
{"name": "year","type": "int"}
]
}
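As a sanity check outside NiFi, the same split can be expressed with plain date formatting (Python's `%d`, `%b`, `%Y` codes correspond to the `dd`, `MMM`, `yyyy` patterns above):

```python
from datetime import datetime

# Sketch of the transformation the question asks for, outside NiFi:
# "01/01/2000" (dd/MM/yyyy) is split into day, short month name, and year.
def split_date(value):
    d = datetime.strptime(value, "%d/%m/%Y")
    return d.strftime("%d"), d.strftime("%b"), d.strftime("%Y")

day, month, year = split_date("01/01/2000")
result = ",".join((day, month, year))  # the comma-separated form from the question
```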
