extract and replace parameters in SQL query using M-language - powerquery

This question is related to my previous question. However, in that question I made some wrong assumptions...
I have a string that contains a SQL query, with or without one or more parameters, where each parameter has a "&" (ampersand sign) as prefix.
Now I want to extract all parameters and load them into a table in Excel where the user can enter the values for each parameter.
Then I need to use these values as a replacement for the variables in the SQL query so I can run the query...
The problem I am facing is that extracting (and therefore also replacing) the parameter names is not that straightforward, because the parameters are not always surrounded by spaces (as I assumed in my previous question).
See the following examples:
Select * from TableA where ID=&id;
Select * from TableA where (ID<&ID1 and ID>=&ID2);
Select * from TableA where ID = &id ;
So, two parts of my question:
How can I extract all parameters
How can I replace all parameters using another table where the replacements are defined (see also my previous question)

A full solution for this would require getting into details of how your data is structured and would potentially be covering a lot of topics. Since you already covered one way to do a mass find/replace (which there are a variety of ways to accomplish in Power Query), I'll just show you my ugly solution to extracting the parameters.
List.Transform
(
    List.Select
    (
        Text.Split([YOUR TEXT HERE], " "), each Text.Contains(_, "&")
    ),
    each List.Accumulate
    (
        {";", ")"},  // list of characters to clean
        "&" & Text.AfterDelimiter(_, "&"),
        (String, Remove) => Text.Replace(String, Remove, "")
    )
)
This is sort of convoluted, but here's the best I can explain what is going on.
The first key part is combining List.Select with Text.Split to extract all of the parameters from the string into a list. It's using a " " to separate the words in the list, and then filtering to words containing a "&", which in your second example means the list will contain "(ID<&ID1" and "ID>=&ID2);" at this point.
The second part is using Text.AfterDelimiter to extract the text that occurs after the "&" in our list of parameters, and List.Accumulate to "clean" any unwanted characters that would potentially be hanging on to the parameter. The list of characters you would want to clean has to be manually defined (I just put in ";" and ")" based on the sample data). We also manually re-append a "&" to the parameter, because Text.AfterDelimiter would have removed it.
The result of this is a List object of extracted parameters from any of the sample strings you provided. You can set up a query that takes a table of your SQL strings, applies this code in a custom column where [YOUR TEXT HERE] is the field containing your strings, then expand the resulting lists and remove duplicates to get a unique list of all the parameters in your SQL strings.
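As a language-neutral illustration of the same two steps (extract, then replace), here is a minimal sketch in Python rather than M. It assumes parameter names consist only of letters, digits and underscores, which is not stated in the question; the function names and the replacement dictionary are hypothetical.

```python
import re

# Assumption: a parameter is "&" followed by letters, digits or underscores.
PARAM_PATTERN = re.compile(r"&(\w+)")

def extract_parameters(sql):
    """Return the distinct parameters (with '&' prefix) in order of first appearance."""
    seen = []
    for match in PARAM_PATTERN.finditer(sql):
        name = "&" + match.group(1)
        if name not in seen:
            seen.append(name)
    return seen

def replace_parameters(sql, values):
    """Replace each &name with values['&name']; unknown parameters are left untouched."""
    return PARAM_PATTERN.sub(lambda m: values.get(m.group(0), m.group(0)), sql)

query = "Select * from TableA where (ID<&ID1 and ID>=&ID2);"
params = extract_parameters(query)
filled = replace_parameters(query, {"&ID1": "100", "&ID2": "10"})
```

Matching the whole parameter name in one pass means no list of trailing characters to clean has to be maintained, and a parameter such as &ID1 cannot accidentally be replaced inside a longer name such as &ID10.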

Related

data factory special character in column headers

I have a file I am reading into a blob via datafactory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if I want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping"
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
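To make the effect of such a rule concrete, here is a minimal sketch in Python (not Data Flow expression syntax) of "remove any character that is not a letter", applied to the two headers from the question. The function name is hypothetical, and "letter" is assumed to mean ASCII letters only.

```python
import re

def clean_header(name):
    """Drop every character that is not an ASCII letter (assumed rule)."""
    return re.sub(r"[^a-zA-Z]", "", name)

headers = [
    "Activations in last 15 seconds high+Low",
    "first entry speed (serial T/a)",
]
cleaned = [clean_header(h) for h in headers]
```

Note that digits are removed as well, so two headers that differ only in a number would collapse to the same name.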
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case as Data Flow is clearly aware of the column name. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters.

searching in CLOB for words in a list/table

I have a large table with a clob column (+100,000 rows) from which I need to search for specific words within a certain timeframe.
select id, clob_field, dbms_lob.instr(clob_field, '.doc',1,1) as doc, --ideally want .doc
dbms_lob.instr(clob_field, '.docx',1,1) as docx, --ideally want .docx
dbms_lob.instr(clob_field, '.DOC',1,1) as DOC, --ideally want .DOC
dbms_lob.instr(clob_field, '.DOCX',1,1) as DOCX --ideally want .DOCX
from clob_table, search_words s
where (to_char(date_entered, 'DD-MON-YYYY')
between to_date('01-SEP-2018') and to_date('30-SEP-2018'))
AND (contains(clob_field, s.words )>0) ;
The set of words is '.doc', '.DOC', '.docx', and '.DOCX'. When I use CONTAINS() it seems to ignore the dot and so returns lots of rows, but not the ones with the document extensions in them. It finds emails with .doc as part of the address, so the doc will have a period on either side of it.
i.e. mail.doc.george#here.com
I don't want those occurrences. I have tried it with a space at the end of the word and it ignores the spaces. I have put these in a search table I created, as shown above, and it still ignores the spaces. Any suggestions?
Thanks!!
Here are two suggestions.
The simple, inefficient way is to use something besides CONTAINS. Context indexes are notoriously tricky to get right. So instead of the last line, you could do:
AND regexp_instr(clob_field, '\.docx', 1,1,0,'i') > 0
I think that should work, but it might be very slow. Which is when you'd use an index. But Oracle Text indexes are more complicated than normal indexes. This old doc explains that punctuation characters (as defined in the index parameters) are not indexed, because the point of Oracle Text is to index words. If you want special characters to be indexed as part of the word, you need to add it to the set of printjoin characters. This doc explains how, but I'll paste it here. You need to drop your existing CONTEXT index and re-create it with this preference:
begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '._-'); -- periods, underscores, dashes can be parts of words
end;
/
CREATE INDEX myindex on clob_table(clob_field) INDEXTYPE IS CTXSYS.CONTEXT
parameters ('LEXER mylex');
Keep in mind that CONTEXT indexes are case-insensitive by default; I think that's what you want, but FYI you can change it by setting the 'mixed_case' attribute to 'Y' on the lexer, right below where you set the printjoins attribute above.
Also it seems like you're trying to search for words which end in .docx, but CONTAINS isn't INSTR - by default it matches entire words, not strings of characters. You'd probably want to modify your query to do AND contains(clob_field, '%.docx')>0
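The matching rule the asker describes (accept .doc or .docx in any case, but reject .doc in the middle of something like mail.doc.george) can be sketched as a regular expression. The example below uses Python's re module purely as an illustration; the lookahead syntax is an assumption of this sketch and is not claimed to work in Oracle's regexp_instr, and the function name is hypothetical.

```python
import re

# ".doc" or ".docx", not followed by another word character,
# and not followed by a dot that starts another word (as in mail.doc.george).
EXTENSION = re.compile(r"\.docx?(?!\w|\.\w)", re.IGNORECASE)

def mentions_word_document(text):
    """True if the text contains something that looks like a .doc/.docx file name."""
    return EXTENSION.search(text) is not None

samples = [
    "Please see report.docx attached",
    "summary.DOC",
    "mail.doc.george#here.com",
    "my.docs folder",
    "Sent final.docx.",
]
results = [mentions_word_document(s) for s in samples]
```

A sentence-ending period after the extension is still accepted, because the second lookahead alternative only rejects a dot that is followed by a word character.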

How to add filter to excel table in UI Path?

I have an excel file with a table named 'Table1' in it. I have to perform 'Filter Table' activity in UiPath with the condition "column1 begins with '*my column'". But when I specify the value like this, the column is filtered for 'ends with' operation.
Here is the screenshot for my table-
Below is the screenshot for the steps I followed-
This has been answered many times on UiPath Forum
For example https://forum.uipath.com/t/filter-table-in-excel-data-tables/559/3
If you use *my value as the search / filter pattern, then it'd mean, anything in the beginning and must have my value in the end. So, it is being interpreted correctly as Ends With. If you want to have a Begins With filter, you should have your filter text followed by the wildcard, like - my value*.
Further, if you want to include wildcard as a literal in the search pattern, you'd need to escape that by enclosing it in brackets like [*]my value* - this'd search for text beginning with *my value.
MS Excel / VBA also supports Tilde ~ as an escape character in some cases.
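The three wildcard cases above can be checked quickly with Python's fnmatch module, used here only as a stand-in for the wildcard semantics described in this answer. It is not the UiPath or Excel implementation, it is case-sensitive (unlike Excel filters), and the sample cell values are made up.

```python
from fnmatch import fnmatchcase

def filter_values(values, pattern):
    """Keep the values that match a shell-style wildcard pattern."""
    return [v for v in values if fnmatchcase(v, pattern)]

# Hypothetical cell values for illustration
cells = ["my value report", "report my value", "*my value starred"]

ends_with = filter_values(cells, "*my value")        # wildcard first: "ends with"
begins_with = filter_values(cells, "my value*")      # wildcard last: "begins with"
literal_star = filter_values(cells, "[*]my value*")  # bracketed * is a literal asterisk
```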
In Excel filters, '*' represents any series of characters.
The issue in the above case is that the filter value in the condition already contains a '*'. Because of this, the system always reads it as '*My column' => '[any characters]My column', i.e., the value ends with 'My column'.
To resolve this issue, I have specified contains filter instead of Begins with as 'My column'.
I have also tried to escape '*', but it threw an Excel exception.
In addition, you cannot specify the condition as "Column1 Like '*My column%'". This works fine when you are adding a filter to a 'DataTable' (after performing the 'ReadRange' activity). But in this case, you will retrieve all the records and then filter the columns. This will lead to performance issues if the Excel table is huge.
You can follow the syntax below to filter a DataTable read from Excel:
DataTableName.Select("[ColumnName]='Datawithwhichweneedtofilter'").CopyToDataTable()

VB 6 Advanced Find Method in View Code

Let's say I have 3 tables: A, B, C.
For every table I have some insert queries.
I want to use Find (Ctrl+F) to find every insert query with a certain format.
Example: I want to find code that contains "insert [table_name] value" no matter what the table name is (A or B or C), so I want to search for some code but skip the word in the middle of it.
I have googled with various keywords, but I didn't get any solution that comes even close to what I want.
Is it possible to do something like this?
You need to use what are known as "wildcard" characters.
In the find window, you'll notice there is a check box called "Use Pattern Matching".
If you check this, then you can use some special characters to expand your search.
? is a wildcard that indicates any character can take this place.
* is a wildcard that indicates a string of any length could take this place
eg. ca? would match cat, car, cam etc
ca* would match cat, car, catastrophe, called ... etc
So something along the lines of insert * value should find what you are interested in.

Split a Value in a Column with Right Function in SSIS

I need urgent help from you guys. I have a column which represents the full name of a user, and I want to split it into first and last name.
The format of the full name is "World, hello", where the first name is hello and the last name is world.
I am using a Derived Column (SSIS) with the RIGHT function for the first name and the SUBSTRING function for the last name, but the results of these seem to be blank, and this is where even I am blank. :)
It's working for me. In general, you should provide more detail in your questions on places such as this to help others recreate and troubleshoot your issue. You did not specify whether we needed to address NULLs in this field nor do I know how you'd want to interpret it so there is room for improvement on this answer.
I started with a simple OLE DB Source and hard coded a query of "SELECT 'World, Hello' AS Name".
I created 2 Derived Column Tasks. The first one adds a column to the Data Flow called FirstCommaPosition. The formula I used is FINDSTRING(Name,",", 1). If Name is NULLable, then we will need to test for nullability prior to calling the FINDSTRING function. You'll then need to determine how you want to store the split data in the case of NULLs. I would assume both first and last should be NULLed, but I don't know that.
There are two reasons for doing this in separate steps. The first is performance. As counter-intuitive as it sounds, doing less in a derived column results in better performance because the SSIS engine can better parallelize the operations. The other is more simple - I will need to use this value to make the first and last name split so it will be easier and less maintenance to reference a column than to copy paste a formula.
The second Derived Column is going to actually perform the split.
My FirstNameUnicode column uses this formula: (FirstCommaPosition > 0) ? RTRIM(LTRIM(RIGHT(Name,LEN(Name) - FirstCommaPosition))) : "" That says: "If we found a comma in the preceding step, then slice out everything after the comma's position to the end of the string and apply trim operations. If we didn't find a comma, then just return a blank string." RIGHT takes the number of characters to keep, counted from the end of the string, hence LEN(Name) - FirstCommaPosition. The default string type for expressions is Unicode (DT_WSTR), so if that is not what you need, you will need to cast the result to the correct string code page (DT_STR).
My LastNameUnicode column uses this formula: (FirstCommaPosition > 0) ? SUBSTRING(Name,1,FirstCommaPosition -1) : "" Similar logic as above, except now I use the SUBSTRING operation instead of RIGHT. Users of the 2012 release of SSIS and beyond, rejoice, for you can use the LEFT function instead of SUBSTRING. Also note that you will need to back off 1 position to remove the comma.
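The two-step logic (find the comma first, then slice on either side of it) can be sketched outside SSIS as follows. This is a Python illustration, not SSIS expression syntax; the function names are hypothetical, and returning NULL (None) for both parts when the input is NULL is only the assumption mentioned above.

```python
def find_string(text, search):
    """Mimic FINDSTRING(text, search, 1): 1-based position, 0 when not found."""
    return text.find(search) + 1

def split_name(full_name):
    """Split 'Last, First' into (first, last)."""
    if full_name is None:
        return None, None  # assumed handling of NULL input
    comma = find_string(full_name, ",")
    if comma == 0:
        return "", ""  # no comma found: blank strings, as in the expressions
    first = full_name[comma:].strip()  # everything after the comma, trimmed
    last = full_name[:comma - 1]       # everything before the comma
    return first, last

first_name, last_name = split_name("World, Hello")
```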

Resources