data factory special character in column headers - etl

I have a file I am reading into a blob via datafactory.
Its formatted in excel. Some of the column headers have special characters and spaces which isn't good if want to take it to csv or parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks

Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping"
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
SPECIAL CASE
There may be an issue with this is the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case as Data Flow is clearly aware of the column name. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on the just first row, you could remove the special characters.

Related

Issue with choice action when running transform map

I'm trying to insert records to a table by using transform maps. I have this field in the target table, which is a choice type, and I have set the choice action in the source table's field to reject if there's no matching value found. But, when I tried inserting the record using the transform map with the correct value, which exists in the choice list of the target field, it still got rejected and hence not inserting the records.
I have tried searching for possible reasons as to why it still got rejected even with correct value in the source field. Here's the sample link that I have found: https://hi.service-now.com/kb_view.do?sysparm_article=KB0677334
It says that if there are more than 40 characters for the choice list value it will be truncated and might not match those choice. But the choices in the target field has only 20 characters or less.
I have first tried running the transform map in the lower environments before proceeding to production. In the lower environment it works fine and the records got inserted. But, when I tried it in production it got rejected.
There is a difference between choice and choice list. Within the choice list the values are comma separated sys_ids. I could imagine that you have multiple values for import and then the max character are reached or the values do not match, etc.
You could use this approach:
Instead of a direct assignment, source to target field, use the script to target. Then you gain the full script power ;)
Maybe here you could add some logic like switch case or whatever, I guess you get the point.

Insert image into a csv column ruby

I'm currently doing a crawler for a website, and my goal is to have a CSV, with a name in the first column and an image the second one, which is inserted with a Ruby script using the CSV#open method.
I have already used this method but I don't know, and I don't find information about the problematic that is to insert an image into a column.
Is it really possible? If not, which functionality would you use to have a list with string + image after crawling?
A CSV (Comma Separated Values) file is a TEXT file which as the name implies has various values separated by commas, expressed using plain ASCII, or sometimes unicode. It is intended as a light weight way to transfer tabular data between different computer systems or programs. You can use it to spit out a table in a database, or the VALUES in something like a spreadsheet. The normal convention is for the first row(line) of the file to contain names or labels that represent what that column contains, and then data in the subsequent rows.
As such, there really is no practical way to embed an image within a CSV file. This is not a limitation of Ruby or Watir, but a limitation of textfiles which spans pretty much all languages and operating systems.
To do what you want you would be better off to save the images into a specific directory using unique filenames and insert those filenames into the CSV file.

How to add filter to excel table in UI Path?

I have an excel file with a table named 'Table1' in it. I have to perform 'Filter Table' activity in UiPath with the condition "column1 begins with '*my column'". But when I specify the value like this, the column is filtered for 'ends with' operation.
Here is the screenshot for my table-
Below is the screenshot for the steps I followed-
This has been answered many times on UiPath Forum
For example https://forum.uipath.com/t/filter-table-in-excel-data-tables/559/3
If you use *my value as the search / filter pattern, then it'd mean, anything in the beginning and must have my value in the end. So, it is being interpreted correctly as Ends With. If you want to have a Begins With filter, you should have your filter text followed by the wildcard, like - my value*.
Further, if you want to include wildcard as a literal in the search pattern, you'd need to escape that by enclosing it in brackets like [*]my value* - this'd search for text beginning with *my value.
MS Excel / VBA also supports Tilde ~ as an escape character in some cases.
In excel filters, '' represents any series of characters.
The issue in the above case is that the filter value in the condition already contains a ''. Because of this, system always reads it as '*My column' => '[any characters]My column'. i.e., value ends with 'My column'.
To resolve this issue, I have specified contains filter instead of Begins with as 'My column'.
I have also tried to escape '*'. But it threw excel exception.
In addition, you can not specify condition as "Column1 Like '*My column%'". This works file when you are adding filter to 'DataTable'(after performing 'ReadRange' activity). But in this case, you will retrieve all the records and then you will be filtering the columns. This will lead to performance issues if the the excel table is huge.
You can follow the syntax below to perform filter activities in an excel:
DataTableName.Select("[ColumnName]='Datawithwhichweneedtofilter’").CopytoDataTable()

Split a Value in a Column with Right Function in SSIS

I need an urgent help from you guys, the thing i have a column which represent the full name of a user , now i want to split it into first and last name.
The format of the Full name is "World, hello", now the first name here is hello and last name is world.
I am using Derived Column(SSIS) and using Right Function for First Name and substring function for last name, but the result of these seems to be blank, this where even i am blank. :)
It's working for me. In general, you should provide more detail in your questions on places such as this to help others recreate and troubleshoot your issue. You did not specify whether we needed to address NULLs in this field nor do I know how you'd want to interpret it so there is room for improvement on this answer.
I started with a simple OLE DB Source and hard coded a query of "SELECT 'World, Hello' AS Name".
I created 2 Derived Column Tasks. The first one adds a column to Data Flow called FirstCommaPosition. The formula I used is FINDSTRING(Name,",", 1) If NAME is NULLable, then we will need to test for nullability prior to calling the FINDSTRING function. You'll then need to determine how you will want to store the split data in the case of NULLs. I would assume both first and last are should be NULLed but I don't know that.
There are two reasons for doing this in separate steps. The first is performance. As counter-intuitive as it sounds, doing less in a derived column results in better performance because the SSIS engine can better parallelize the operations. The other is more simple - I will need to use this value to make the first and last name split so it will be easier and less maintenance to reference a column than to copy paste a formula.
The second Derived Column is going to actually perform the split.
My FirstNameUnicode column uses this formula (FirstCommaPosition > 0) ? RTRIM(LTRIM(RIGHT(Name,FirstCommaPosition))) : "" That says "If we found a comma in the preceding step, then slice out everything from the comma's position to the end of the string and apply trim operations. If we didn't find a comma, then just return a blank string. The default string type for expressions will be the Unicode (DT_WSTR) so if that is not your need, you will need to cast the resultant into the correct string codepage (DT_STR)
My LastNameUnicode column uses this formula (FirstCommaPosition > 0) ? SUBSTRING(Name,1,FirstCommaPosition -1) : "" Similar logic as above except now I use the SUBSTRING operation instead of RIGHT. Users of the 2012 release of SSIS and beyond, rejoice fo you can use the LEFT function instead of SUBSTRING. Also note that you will need to back off 1 position to remove the comma.

Convert ascii files into normal human-readable file

I have got ASCII files and want to convert them into maybe excel or tab/csv delimited text file. The file is a table with field name and field attributes. It also includes index name, table name and field(s) to index if required depending on the software. I don't think it is necessary to think of this. Well, field name and field attributes are enough, I hope so. I just want the information hidden inside. Can you all experts help me to get this done.
The lines are something like this:
10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
10000003$11-word word word$10033315$0413004$$$$$$N$$
10000004$11-word word word$10033315$$$$$$$Y$017701$
The general answer, before knowing your ascii file in details, operating system, and so on, would be:
1 - cut the top n-lines, that containg the information you don't want. Leave the filds names, if you want to.
2 - check if the fields are separated by a common character, for example, one comma ,
3 - import the file inside a spreadsheet program, like Excel or OpenOffice Calc. In OOCalc, choose to import the file, then select the correct separating character
that's all.

Resources