Pentaho Spoon: reading a text file

We have a text file with multiple fields, and it is used in different transformations. When I do 'Get Fields' in Text File Input, I get the fields as follows:
I don't need all of these fields for the next step, so I kept only the required ones (the 1st, 3rd, 18th, and 19th) as follows, and removed the other fields in Text File Input, since there is one '?' per parameter in the next step.
But it is picking up the values of the first fields only.
I even tried using 'Position' as per the file, but no luck. Can anyone please tell me what I am missing here?

Text File Input reads the columns sequentially, even if you specify particular column names in the Fields tab.
Select all the fields in the Fields tab of Text File Input, then add a Select Values step and keep only the required fields there.
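As a rough illustration of the positional behaviour (a Python sketch, not Pentaho code; the field names are made up):

# Text File Input maps field definitions onto the file's columns in order,
# so keeping only some definitions reads the FIRST columns, not the named ones.
row = "v1,v2,v3,v4,v5".split(",")
print(dict(zip(["field1", "field3"], row)))  # {'field1': 'v1', 'field3': 'v2'} - wrong values
# Correct approach: define every field, then subset (what Select Values does).
record = dict(zip(["field1", "field2", "field3", "field4", "field5"], row))
print({k: record[k] for k in ("field1", "field3")})  # {'field1': 'v1', 'field3': 'v3'}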

Related

Deleting entire row if text is found at any column of the sequential file

Using SORT, is it possible to delete a record if supplied text appears anywhere in the row? For instance, in the following records, any record that contains the text "record" would not be copied.
Suppose:
123456abcdrecord123
111recordaaaaaaaaaa
recordjjjjjj1111111
11111111111abcccccc
So my output should be:
11111111111abcccccc
Can anyone suggest the right control cards for SORT?
Try
OMIT COND=(1,19,SS,EQ,C'record')
See the DFSORT documentation on substring search for INCLUDE and OMIT.
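For reference, a small Python sketch of what that OMIT does (SS performs a substring search, here over positions 1-19 of each record):

# Drop any record whose first 19 bytes contain the substring 'record'.
records = [
    "123456abcdrecord123",
    "111recordaaaaaaaaaa",
    "recordjjjjjj1111111",
    "11111111111abcccccc",
]
kept = [r for r in records if "record" not in r[:19]]
print(kept)  # ['11111111111abcccccc']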

Data Factory: special characters in column headers

I have a file that I am reading into a blob via Data Factory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if you want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping"
For the column matching condition, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
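A name expression along the lines of regexReplace($$, '[^a-zA-Z]', '') should achieve this; in the Data Flow expression language, $$ stands for the original column name. Treat the exact expression as a starting point and verify it in the expression builder.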
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case, as Data Flow is clearly aware of the column names. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow, using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters.
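Outside of Data Flow, the same header cleanup can be sketched in Python (the file names are placeholders; adjust the character rules to your data):

import csv
import re

with open("input.csv", newline="") as src, open("clean.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)  # get a handle on just the first row
    # Replace anything that is not a letter or digit with an underscore.
    writer.writerow(re.sub(r"[^A-Za-z0-9]+", "_", h).strip("_") for h in header)
    writer.writerows(reader)  # pass the remaining rows through unchanged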

How to read an excel sheet and put the cell value within different text fields through UiPath?

I have an Excel sheet as follows:
I have read the Excel contents, and to iterate over them later I have stored them in an Output Data Table as follows:
Read Range - Output: DataTable: CVdatatable
Output Data Table - DataTable: CVdatatable, Text: opCVdatatable
Finally, I want to read the text opCVdatatable in an iteration and write it into text fields. So in the desired input fields I entered opCVdatatable or opCVdatatable + "[k(enter)]" as required.
But UiPath seems to start from the beginning of the Output Data Table whenever I call opCVdatatable.
In short, each desired input field is getting filled with all of the data stored in the Output Data Table.
Can someone help me out please?
My first recommendation is to use the Workbook: Read Range activity to read data from Excel, because it is quicker, works in the background, and does not require Excel to be installed on the system.
Start your sequence like this (note that the Add Headers property is not checked):
You do not need to use Output Data Table, because that activity outputs a single string containing all row items. What you want to do instead is access the items in the data table and output each one as a string in your Type Into, e.g., CVDatatable.Rows(0).Item(0).ToString, like so:
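For example (standard .NET DataTable indexing; the indices below are illustrative):

CVDatatable.Rows(0).Item(0).ToString  ' first row, first column
CVDatatable.Rows(0).Item(1).ToString  ' first row, second column
CVDatatable.Rows(1).Item(0).ToString  ' second row, first column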
You mention you want to read the text opCVdatatable in an iteration and write the values into text fields. This is a little more complex, but I'll give you an example. You can use a For Each Row activity and loop through each row in CVDatatable, setting the Index property if required. See below:
The challenge is to get the selector right here and make it dynamic, so that it targets a different text field per iteration. The selector for the Type Into activity will depend on the system you are targeting, but here is an example:
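This one is purely hypothetical - the tags and attributes depend entirely on the application you are automating - but if the For Each Row activity's Index output is stored in a variable called index, a dynamic selector could look like:

"<wnd cls='#32770' title='General' /><ctrl idx='" + (index + 1).ToString + "' />"

The idx value changes per iteration, so each Type Into lands on a different text field.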
Also, here is a working XAML file for you to test.
Hope this helps.
Chris
Here's a different, more general approach. Instead of including the target in the process itself, the Excel sheet is modified to include parts of a selector:
Note that column B now contains an identifier, and this ID depends on the application you will be working with. For example, here's what my sample app looks like. As you can see, the first text box has an ID of 585, the second one is 586, and so on (note that you can work with any kind of identifier, including the control's name, if it is exposed to UiPath):
Now, instead of adding multiple Type Into activities to your workflow, you would add just a single one, loop over each of the data table's rows, and create a dynamic selector:
In my case the selector for the Type Into activity looks as follows:
"<wnd cls='#32770' title='General' /><wnd ctrlid='" + row(1).ToString() + "' />"
This will allow you to maintain the process from the Excel sheet alone - if there's a new field that needs to be mapped, just add it to your sheet. No changes to the Workflow are required.

Talend tFileInputDelimited component java.lang.NumberFormatException for CSV file

As a beginner with Talend Open Studio for Big Data, I am trying to read two CSV files. I have inferred the metadata schema from the CSV file, set the first row to be the header, and set the delimiter to a comma (,).
In my job:
The tMap reads the CSV file, does a lookup on another CSV file, and generates two output files: passed and rejected records.
But while running the job, I am getting the error below.
Couldn't parse value for column 'Product_ID' in 'row1', value is '4569,Laptop,10'. Details: java.lang.NumberFormatException: For input string: "4569,Laptop,10"
I believe it is treating the entire row as one string and using it as the value of the "Product_ID" column.
I don't know why that is happening when I have set the delimiter and row separator correctly.
Schema
I can see that no rows are coming out of the first tFileInputDelimited due to the above error.
Job Run
Input component
Any idea what else I can check?
Thanks in advance.
In your last screenshot, you can see that the Field separator of your tFileInputDelimited_1 is ; and not ,.
I believe that you haven't set up your component to use the metadata you created for your CSV file.
So you need to configure the component to use the metadata you've created by selecting Repository under Property Type, and selecting the delimited file metadata.
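To see why the error message looks the way it does, here is a quick illustration in Python (not Talend code) of what the wrong field separator does to a row:

import csv
line = "4569,Laptop,10"
print(next(csv.reader([line], delimiter=";")))  # ['4569,Laptop,10'] - one big field
print(next(csv.reader([line], delimiter=",")))  # ['4569', 'Laptop', '10'] - three columns

With ';' the whole line remains a single field, which Talend then tries to parse as the integer Product_ID, hence the NumberFormatException.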

Convert ASCII files into a normal human-readable file

I have ASCII files and want to convert them into Excel or a tab/CSV-delimited text file. The file is a table with field names and field attributes. It also includes an index name, a table name, and the field(s) to index if required, depending on the software; I don't think that part matters here. Field names and field attributes are enough, I hope. I just want the information hidden inside. Can any of you experts help me get this done?
The lines are something like this:
10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
10000003$11-word word word$10033315$0413004$$$$$$N$$
10000004$11-word word word$10033315$$$$$$$Y$017701$
The general answer, without knowing your ASCII file in detail, your operating system, and so on, would be:
1 - cut the top n lines that contain the information you don't want. Leave the field names, if you want to.
2 - check whether the fields are separated by a common character, for example a comma (,); in your sample it appears to be a dollar sign ($).
3 - import the file into a spreadsheet program, like Excel or OpenOffice Calc. In OOCalc, choose to import the file, then select the correct separating character.
That's all.
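If you prefer to script the conversion instead of using a spreadsheet, here is a minimal Python sketch (assuming the '$' character is the separator, as in your sample lines; the file names are placeholders):

import csv

with open("data.txt") as src, open("data.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for line in src:
        writer.writerow(line.rstrip("\n").split("$"))

Each '$'-separated field becomes one CSV column, which Excel or OOCalc can open directly.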
