I'm currently evaluating BIRT and trying to understand how everything works.
I'd like to render multiple reports (PDFs) from a single XML file.
This XML file contains the data required for all of the reports/PDFs.
Each report should be populated with the data from a specific node in the file's XML DOM tree.
How can I achieve this?
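What I have in mind is roughly the loop below over the BIRT report engine API, rendering one PDF per node. This is only a sketch of my intent, not a working solution: the xpathFilter parameter and the XPath values are placeholder names I made up for however the per-report node would actually be selected in the report design.

    import org.eclipse.birt.core.framework.Platform;
    import org.eclipse.birt.report.engine.api.EngineConfig;
    import org.eclipse.birt.report.engine.api.IReportEngine;
    import org.eclipse.birt.report.engine.api.IReportEngineFactory;
    import org.eclipse.birt.report.engine.api.IReportRunnable;
    import org.eclipse.birt.report.engine.api.IRunAndRenderTask;
    import org.eclipse.birt.report.engine.api.PDFRenderOption;

    public class MultiPdfRender {
        public static void main(String[] args) throws Exception {
            EngineConfig config = new EngineConfig();
            Platform.startup(config);
            IReportEngineFactory factory = (IReportEngineFactory) Platform
                    .createFactoryObject(IReportEngineFactory.EXTENSION_REPORT_ENGINE_FACTORY);
            IReportEngine engine = factory.createReportEngine(config);
            IReportRunnable design = engine.openReportDesign("report.rptdesign");

            // Hypothetical: one XPath per report, each pointing at a different node.
            String[] nodes = { "/data/customer[1]", "/data/customer[2]" };
            for (int i = 0; i < nodes.length; i++) {
                IRunAndRenderTask task = engine.createRunAndRenderTask(design);
                task.setParameterValue("xpathFilter", nodes[i]); // assumed report parameter
                PDFRenderOption options = new PDFRenderOption();
                options.setOutputFormat("pdf");
                options.setOutputFileName("report_" + i + ".pdf");
                task.setRenderOption(options);
                task.run();
                task.close();
            }
            engine.destroy();
            Platform.shutdown();
        }
    }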
I have a large line-delimited (not comma-separated) CSV file (1.2 million lines, 140 MB) that contains both data and metadata from a test. The first 50 or so lines are metadata, which I need to extract to populate a SQL table.
I have built a Logic App that uses the Azure Blob Storage connector as a trigger. The CSV file is copied into the blob container, which triggers the app to do its work. For small files under 50 MB this works fine; however, I get this error for larger files.
InvalidTemplate. Unable to process template language expressions in action 'GetMetaArray' inputs at line '0' and column '0': 'The template language function 'body' cannot be used when the referenced action outputs body has large aggregated partial content. Actions with large aggregated partial content can only be referenced by actions that support chunked transfer mode.'.
The output query is take(split(body('GetBlobContent'), decodeUriComponent('%0D%0A')),100)
The query puts the line-delimited metadata into an array so I can run some queries against it, extract values that I convert into variables, and use those to check the file for consistency (e.g. the metadata must meet certain criteria).
I understand that "Get Blob Content V2" supports chunking natively; however, from the error it seems I cannot use the body function to return my array. Can anyone offer any suggestions on how to get around this issue? I only need a tiny proportion of this file.
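For what it's worth, all I actually need is the equivalent of this plain-Java sketch (the file name is made up): read just the first 100 lines of metadata rather than loading the whole file.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;

    public class MetadataPrefix {
        public static void main(String[] args) throws Exception {
            // Keep only the first 100 lines (the metadata block), not the whole 140 MB file.
            List<String> meta = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(new FileReader("test_data.csv"))) {
                String line;
                while (meta.size() < 100 && (line = reader.readLine()) != null) {
                    meta.add(line);
                }
            }
            System.out.println(meta.size() + " metadata lines read");
        }
    }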
Thanks Jonny
I have a 20 GB XML file on my local system. I want to split the data into multiple chunks and also remove specific attributes from that file. How can I achieve this using NiFi?
Use the SplitRecord processor and define XML Reader/Writer controller services to read the XML data and write only the required attributes into your result XML.
Also set the Records Per Split property to control how many records you want in each split.
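As a rough example (the record and field names here are placeholders, and this assumes your record writer is configured with an explicit schema), giving the writer a schema that lists only the fields you want to keep means everything else is dropped when SplitRecord writes each chunk:

    {
      "type": "record",
      "name": "trimmed_record",
      "fields": [
        { "name": "id",     "type": "string" },
        { "name": "amount", "type": ["null", "string"], "default": null }
      ]
    }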
I have made a custom processor which converts an Excel workbook to JSON and outputs it, but I want both the workbook and the JSON in the output. Is this possible?
The content of a flow file is just bytes; you could put whatever you want in it, assuming something downstream knows how to understand the combination of an Excel workbook and a JSON file.
A more common approach is for a processor to have multiple relationships. One would be "original", where you would transfer the original input to the processor (in this case the Excel workbook); another would be "success", where you would transfer the successfully created JSON; and then maybe a "failure" relationship, where you would transfer the Excel workbook if the JSON could not be created for some reason.
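A minimal sketch of that pattern in a custom processor is shown below. The class name and the convertWorkbookToJson helper are placeholders for your existing conversion logic; the point is just how the three relationships are declared and how the original and the generated JSON are routed separately.

    import java.util.HashSet;
    import java.util.Set;

    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;

    public class ExcelToJsonProcessor extends AbstractProcessor {

        static final Relationship REL_ORIGINAL = new Relationship.Builder()
                .name("original").description("The incoming Excel workbook, unchanged").build();
        static final Relationship REL_SUCCESS = new Relationship.Builder()
                .name("success").description("The JSON produced from the workbook").build();
        static final Relationship REL_FAILURE = new Relationship.Builder()
                .name("failure").description("The workbook, if conversion failed").build();

        @Override
        public Set<Relationship> getRelationships() {
            Set<Relationship> rels = new HashSet<>();
            rels.add(REL_ORIGINAL);
            rels.add(REL_SUCCESS);
            rels.add(REL_FAILURE);
            return rels;
        }

        @Override
        public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
            FlowFile original = session.get();
            if (original == null) {
                return;
            }
            // clone() copies the workbook content; the clone becomes the JSON result,
            // while the original workbook is routed to "original" untouched.
            FlowFile json = session.clone(original);
            try {
                json = session.write(json, (in, out) -> out.write(convertWorkbookToJson(in)));
                session.transfer(json, REL_SUCCESS);
                session.transfer(original, REL_ORIGINAL);
            } catch (ProcessException e) {
                session.remove(json);
                session.transfer(original, REL_FAILURE);
            }
        }

        private byte[] convertWorkbookToJson(java.io.InputStream workbook) {
            // placeholder: the real implementation would parse the workbook (e.g. with Apache POI)
            return "{}".getBytes();
        }
    }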
I am using a custom processor for CSV-to-JSON conversion, which converts the CSV file data into a JSON array containing JSON objects of the data.
My requirement is to get the file attributes such as filename, uuid, path, etc. and construct a JSON object from these.
Question:
How can I get the related attributes of the file, construct a JSON object from them, and append it to the same JSON being constructed before?
I have only been working with Apache NiFi for a few days, so for now I am just going with the exact requirements using the custom processor.
I can't speak to which attributes are being written for your custom processor, but there is a set of core attributes that most/all flow files have, such as filename and uuid. If you are using GetFile or ListFile/FetchFile to read in your CSV file, you will have those and a number of other attributes available (see the doc for more info).
When you have a flow file that has the appropriate attributes set, you can use the AttributesToJSON processor to create a JSON object containing a flat list of the specified attributes, and that object can replace the flow file content or become its own attribute (named 'JSONAttributes') depending on the setting of the "Destination" property of AttributesToJSON.
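As a rough example (the attribute values are made up), with the Attributes List property set to filename, uuid and path and Destination set to flowfile-content, the flow file content would be replaced with a flat JSON object along these lines:

    {
      "filename": "input.csv",
      "uuid": "0f8e2a1c-3b9d-4e7a-a6c2-1d2e3f4a5b6c",
      "path": "./"
    }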
I have multiple Excel files with two types of metadata. Now I have to push the data into two different tables, based on the metadata of the Excel files, using SSIS.
There are many, many different ways to do this. You'd need to share a lot more information on how your data is structured to really give a great answer, but here's the general strategy I'd suggest.
In the control flow tab, have a separate data flow for each Excel file. The data flows will all work the same, with the exception of having a different Excel source in each data flow, so it will be enough to get the first version working and then copy and paste for the other files.
In the data flow, use a conditional split transformation to read the metadata coming from Excel and send the row to the correct table.
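For instance (the column name and values here are only illustrative, since you haven't shared how the metadata is structured), the Conditional Split outputs might use expressions like:

    [MetadataType] == "TypeA"    -- rows routed to the first table's destination
    [MetadataType] == "TypeB"    -- rows routed to the second table's destination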
If you really want to be fancy, however, you could create a child package that includes all your data flow logic. Using the Execute Package Task you can pass the Excel file name to the child package for each Excel file you need to import. This way you consolidate your logic in one package and can still import from multiple Excel files in parallel.