I want to convert an input CSV file to an XML file using ESQL in IIB v10. Can you please help me with the ESQL code to achieve this? I've provided an input CSV file sample and an output XML file sample below:
Input CSV file
Output XML file
Your approach is fundamentally wrong. Using only ESQL to do this on Integration Bus is like using a knife to cut down a tree when you have a chainsaw available. If you want to convert a CSV file to XML, the proper solution is the following:
1) Define a new DFDL schema to parse the CSV file
2) Define your XSD for the output XML
3) Use the DFDL parser when you read the CSV, and use the structure you created (on the FileInput node, for example; I don't know your exact case)
4) Use a mapping node to map from your DFDL structure to your XML structure (defined in the xsd)
Note: the last step can also be done with alternative solutions, such as compute nodes (ESQL, Java, C#, PHP).
If you have any additional questions, feel free to contact me.
I'm using NiFi 1.11.4 to read CSV files from an SFTP, do a few transformations and then drop them off on GCS. Some of the files contain no content, only a header line. During my transformations I convert the files to the Avro format, but when converting back to CSV, no output file is produced for the files whose content is empty.
I have the following settings for the Processor:
And for the Controller:
I did find the following topic: How to use ConvertRecord and CSVRecordSetWriter to output header (with no data) in Apache NiFi?, but the comments explicitly mention that ConvertRecord should cover this since 1.8. Sadly, either I understood that incorrectly, it does not seem to work, or my setup is wrong.
While I could make it work by explicitly writing the schema as a line to empty files, I wanted to know whether there is also a more elegant way.
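For reference, this is roughly how that inelegant workaround could be wired up before ConvertRecord: an ExecuteScript processor with a Jython body that flags header-only flow files, so a RouteOnAttribute processor can send them around the Avro conversion with their header intact. This is only a sketch; the attribute name header.only and the line-counting logic are just illustrative.

# Rough Jython body for an ExecuteScript processor placed before ConvertRecord.
# It flags flow files that contain nothing but a header line; RouteOnAttribute
# can then route those around the Avro conversion so the header survives.
# The attribute name "header.only" is only an example.
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import InputStreamCallback

class CountLines(InputStreamCallback):
    def __init__(self):
        self.lines = 0

    def process(self, inputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        self.lines = len([line for line in text.splitlines() if line.strip()])

flowFile = session.get()
if flowFile is not None:
    counter = CountLines()
    session.read(flowFile, counter)
    headerOnly = "true" if counter.lines <= 1 else "false"
    flowFile = session.putAttribute(flowFile, "header.only", headerOnly)
    session.transfer(flowFile, REL_SUCCESS)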
This is my first time using Ruby. I'm writing an application that parses data and performs some calculations based on it, the source of which is a JSON file. I'm aware I can use JSON.parse() here but I'm trying to write my program so that it will work with other sources of data. Is there a clear cut way of doing this? Thank you.
When your source file is JSON, use JSON.parse; do not implement a JSON parser on your own. If the source file is a CSV, use the CSV class.
When your application should be able to read multiple different formats, just add one reader class for each data type, like JSONReader, CSVReader, etc., and then decide which reader to use based on the file extension.
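A rough sketch of that idea, with class and function names (JSONReader, CSVReader, read_data) that are purely illustrative; the same structure carries over directly to Ruby's JSON and CSV standard libraries:

# Illustrative sketch: one reader class per data format, chosen by file extension.
import csv
import json
import os

class JSONReader:
    def read(self, path):
        with open(path) as f:
            return json.load(f)

class CSVReader:
    def read(self, path):
        with open(path, newline="") as f:
            return list(csv.DictReader(f))

READERS = {".json": JSONReader(), ".csv": CSVReader()}

def read_data(path):
    # Dispatch on the file extension; fail loudly for unknown formats.
    ext = os.path.splitext(path)[1].lower()
    if ext not in READERS:
        raise ValueError("Unsupported data format: %s" % ext)
    return READERS[ext].read(path)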
I am trying to use the Python Client Library to add multiple files to a dataset I have created for AutoML Translate. I was unable to find a good example of the CSV file that is to be used. Here is the link to their Python Client Library example code for adding files to a dataset.
I have created a CSV in a bucket of the following form:
UNASSIGNED,gs://<bucket name>/x,gs://<bucket name>/y
Here I am trying to add the two files called x and y. When I run the import, I get the following error:
google.api_core.exceptions.GoogleAPICallError: None No files to import. Please specify your input files.
The problem was how I formatted the CSV file: each file I want to add needs to be on its own line.
The correct csv file would look like this:
UNASSIGNED,gs://<bucket name>/x
UNASSIGNED,gs://<bucket name>/y
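For completeness, a small Python sketch of how the corrected import CSV could be generated and uploaded; the bucket name, the object names x and y, and the import.csv file name are all placeholders, and google-cloud-storage is assumed to be installed:

# Sketch: build the import CSV with one "UNASSIGNED,gs://..." line per file
# and upload it to the bucket using the google-cloud-storage client.
from google.cloud import storage

bucket_name = "<bucket name>"      # placeholder
files_to_add = ["x", "y"]          # objects already present in the bucket

csv_lines = ["UNASSIGNED,gs://%s/%s" % (bucket_name, name) for name in files_to_add]
csv_content = "\n".join(csv_lines) + "\n"

client = storage.Client()
bucket = client.bucket(bucket_name)
# "import.csv" is just an example object name for the import file.
bucket.blob("import.csv").upload_from_string(csv_content, content_type="text/csv")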
I am trying to read a CSV from the local file system, convert the content into JSON format using Apache NiFi, and put the resulting JSON file back on the local file system. I have succeeded in converting the first row of the CSV file but not the other rows. What am I missing?
Input:
1,aaa,loc1
2,bbb,loc2
3,ccc,loc3
My NiFi workflow is here:
http://www.filedropper.com/mycsvtojson
My output is as below, which is the desired format, but I want that to happen for all the rows.
{ "id" : "1", "name" : "aaa",
"location" : "loc1" }
There are a few different ways this could be done...
A custom Java processor that reads in a CSV and converts to JSON
Using the ExecuteScript processor to do something similar in a Groovy/Jython script (see the sketch after this list)
Use SplitText to split your original CSV into single lines, then use your current approach with ExtractText and ReplaceText, and then MergeContent to merge everything back together
Use ConvertCSVToAvro and then ConvertAvroToJSON
Although the last option makes an extra conversion to Avro, it might be the easiest solution requiring almost no work.
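For the ExecuteScript option above, a rough Jython sketch is shown below. The column names id, name and location come from the sample input; everything else is illustrative rather than a drop-in solution.

# Rough Jython body for an ExecuteScript processor: read the whole CSV flow file,
# turn each line into a JSON object and write the result back as a JSON array.
import json
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class CsvToJson(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        records = []
        for line in text.splitlines():
            if not line.strip():
                continue
            id_, name, location = line.split(",")
            records.append({"id": id_, "name": name, "location": location})
        outputStream.write(bytearray(json.dumps(records).encode("utf-8")))

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, CsvToJson())
    flowFile = session.putAttribute(flowFile, "mime.type", "application/json")
    session.transfer(flowFile, REL_SUCCESS)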
This question is a bit older, but there is now a ConvertRecord processor in NiFi 1.3 and newer, which should be able to handle this conversion directly for you, and it avoids having to split up the data, since it can create a single JSON array with all of the values, if that is desirable.
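With the sample input above and a matching record schema, that record-based output would look something like:
[ { "id" : "1", "name" : "aaa", "location" : "loc1" },
  { "id" : "2", "name" : "bbb", "location" : "loc2" },
  { "id" : "3", "name" : "ccc", "location" : "loc3" } ]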
If the input files are in XML format, I shouldn't be using TextInputFormat, because TextInputFormat assumes each record is on its own line of the input file, and the Mapper class is called for each line to get a key-value pair for that record/line.
So I think we need a custom input format to scan the XML datasets.
As I am new to Hadoop MapReduce, is there any article/link/video that shows the steps to build a custom input format?
Problem
Working on a single XML file in parallel in MapReduce is tricky because XML does not contain a synchronization marker in its data format. Therefore, how do we work with a file format that’s not inherently splittable like XML?
Solution
MapReduce doesn’t contain built-in support for XML, so we have to turn to another Apache project, Mahout, a machine learning system, which provides an XML InputFormat.
So what I mean is that there is no need to write a custom input format yourself, since the Mahout library already provides one.
I am not sure whether you are going to read or write, but both were described in the link above.
Please have a look at the XmlInputFormat implementation details here.
Furthermore, XmlInputFormat extends TextInputFormat.