LINQ2SQL - How to detect multiple instances of one string within another - linq

I'm trying to find all words in a database table called 'Word' that contain certain letters. This is easy enough for single instances of letters; for example, to find all words that contain 'L' and 'I' I would use:
Words.Where(w => w.Word_value.IndexOf("I") >= 0 && w.Word_value.IndexOf("L") >= 0)
However, if I needed to find all words containing the letter 'I' and three instances of the letter 'L' (e.g. 'LILLY'), I am at a loss. Is there a way I can do a count of instances of a string within another?
Any help would be greatly appreciated.

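Try this, where word is the string being tested:
int count = word.Length - word.Replace("L", "").Length;
Applied to the query, the filter would be w.Word_value.Length - w.Word_value.Replace("L", "").Length == 3.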
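The length-difference trick can be checked outside the database. The following is a minimal Python sketch, not LINQ to SQL; the function name count_occurrences and the sample word list are made up for illustration.

```python
def count_occurrences(text, needle):
    """Count non-overlapping occurrences of needle using the length difference."""
    if not needle:
        raise ValueError("needle must be non-empty")
    removed_chars = len(text) - len(text.replace(needle, ""))
    # Dividing by the needle length also handles needles longer than one character.
    return removed_chars // len(needle)


# Hypothetical sample data standing in for the Word table.
words = ["LILLY", "LILY", "HILL", "LOLLIPOP"]

# Words containing an 'I' and exactly three 'L's.
matches = [w for w in words if "I" in w and count_occurrences(w, "L") == 3]
print(matches)
```

Dividing by the needle length is only needed when the search string is longer than one character; for single letters the plain difference is already the count.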

I would recommend using SQL Server's full text search capabilities.
Create a stored procedure that implements the functionality you require using full text search and then expose that using Linq to Sql or Linq to Entities.
Also see the CONTAINS predicate.


extract and replace parameters in SQL query using M-language

This question is related to my previous question. However, in that question I made some wrong assumptions...
I have a string that contains a SQL query, with or without one or more parameters, where each parameter has a "&" (ampersand sign) as prefix.
Now I want to extract all parameters and load them into a table in Excel where the user can enter the values for each parameter.
Then I need to use these values as a replacement for the variables in the SQL query so I can run the query...
The problem I am facing is that extracting (and therefore also replacing) the parameter names is not that straightforward, because the parameters are not always surrounded by spaces (as I assumed in my previous question).
See the following examples:
Select * from TableA where ID=&id;
Select * from TableA where (ID<&ID1 and ID>=&ID2);
Select * from TableA where ID = &id ;
So, two parts of my question:
How can I extract all parameters?
How can I replace all parameters using another table where the replacements are defined (see also my previous question)?
A full solution for this would require getting into details of how your data is structured and would potentially be covering a lot of topics. Since you already covered one way to do a mass find/replace (which there are a variety of ways to accomplish in Power Query), I'll just show you my ugly solution to extracting the parameters.
List.Transform(
    List.Select(
        Text.Split([YOUR TEXT HERE], " "),
        each Text.Contains(_, "&")
    ),
    each List.Accumulate(
        {";", ")"},  // list of characters to clean
        "&" & Text.AfterDelimiter(_, "&"),
        (String, Remove) => Text.Replace(String, Remove, "")
    )
)
This is sort of convoluted, but here's the best I can explain what is going on.
The first key part is combining List.Select with Text.Split to extract all of the parameters from the string into a list. It's using a " " to separate the words in the list, and then filtering to words containing a "&", which in your second example means the list will contain "(ID<&ID1" and "ID>=&ID2);" at this point.
The second part is using Text.AfterDelimiter to extract the text that occurs after the "&" in our list of parameters, and List.Accumulate to "clean" any unwanted characters that would potentially be hanging on to the parameter. The list of characters you would want to clean has to be manually defined (I just put in ";" and ")" based on the sample data). We also manually re-append a "&" to the parameter, because Text.AfterDelimiter would have removed it.
The result of this is a List object of extracted parameters from any of the sample strings you provided. You can set up a query that takes a table of your SQL strings, applies this code in a custom column where [YOUR TEXT HERE] is the field containing your strings, then expand the lists that result and remove duplicates on them to get a unique list of all the parameters in your SQL strings.
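For readers who want to trace the logic outside Power Query, here is a rough Python sketch of the same split, filter and clean steps, plus a simple replacement pass for the second part of the question. The function names, the strip_chars default and the values dictionary are assumptions made for illustration, not part of the M answer.

```python
def extract_parameters(sql, strip_chars=(";", ")")):
    """Split on spaces, keep tokens containing '&', then clean trailing characters."""
    params = []
    for token in sql.split(" "):
        if "&" not in token:
            continue
        # Keep only the text after the first '&' and put the '&' back in front.
        name = "&" + token.split("&", 1)[1]
        for ch in strip_chars:
            name = name.replace(ch, "")
        params.append(name)
    return params


def substitute(sql, values):
    """Replace each parameter with its value, longest names first so that
    a name like '&ID1' is not replaced inside '&ID10'."""
    for name in sorted(values, key=len, reverse=True):
        sql = sql.replace(name, values[name])
    return sql


query = "Select * from TableA where (ID<&ID1 and ID>=&ID2);"
print(extract_parameters(query))
print(substitute(query, {"&ID1": "10", "&ID2": "2"}))
```

As in the M version, the characters to strip have to be listed by hand; a query that uses other punctuation after a parameter would need that punctuation added to strip_chars.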

Is it indexing or tagging?

I have two classes, claim and index. My claim class has a string field called topic. I'm trying to index the topic column in code rather than with the database's built-in index features, using the following method.
Suppose I have claim 1 with the topic field "i love muffins muffins". I would process it as follows:
#1. Create an empty dictionary mapping "word" => occurrences.
#2. Create a list of stopwords, for example stopwords = ("For","This".....etc )
#3. Create a list of delimiters, for example delimiter_chars = ",.;:!?"
#4. Split the text (topic field) into words delimited by whitespace.
#5. Remove unwanted delimiter characters adjoining words.
#6. Remove stopwords.
#7. Remove duplicates.
#8. Create multiple index objects: (word="love", occurrences=1, looked=0, reference to claim 1), (word="muffins", occurrences=2, looked=0, reference to claim 1).
Now whenever I look up a word, for example muffins, looked will increase by one and I will move the record up in my database. So my questions are: is this method good? Is it better than database index features? Are there ways to improve it?
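What I think you are looking for is something called a B-Tree. In your case, since you are keying on letters, the structure you would build is closer to a trie: each node has 26 branches (or 52 if you need case sensitivity), one per letter. This makes finding objects very fast: a lookup takes time proportional to the length of the word, regardless of how many words are stored, whereas a B-Tree lookup is O(log n). In the node, you would have a pointer to the actual data in an array, list, file, or something else.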
However, unless you are willing to put the time in to code something specific for your application, you might be better off using a database such as Oracle, Microsoft SQL Server, or MySQL because these are professionally developed and profiled to get the maximum performance possible.
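The eight steps listed in the question can be sketched in a few lines of Python. This is only an illustration under assumptions: the stopword set is a made-up placeholder (it includes "i" so that the example matches the question), and the index objects are plain dictionaries rather than a real index class.

```python
from collections import Counter

# Hypothetical stopword list and delimiter set, following steps #2 and #3.
STOPWORDS = {"for", "this", "i"}
DELIMITER_CHARS = ",.;:!?"


def build_index(claim_id, topic):
    """Return one index record per distinct non-stopword in the topic."""
    # Steps #4 and #5: split on whitespace, strip adjoining delimiters.
    words = (w.strip(DELIMITER_CHARS).lower() for w in topic.split())
    # Steps #1, #6 and #7: count occurrences, skipping stopwords and empty tokens.
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    # Step #8: one record per word, with the 'looked' counter starting at zero.
    return [
        {"word": word, "occurrences": n, "looked": 0, "claim": claim_id}
        for word, n in counts.items()
    ]


for record in build_index(1, "i love muffins muffins"):
    print(record)
```

Whether this beats a database index depends on data volume and query patterns; the sketch only shows that the bookkeeping itself is small.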

Split a Value in a Column with Right Function in SSIS

I urgently need help. I have a column that holds the full name of a user, and I want to split it into first and last name.
The format of the full name is "World, hello", where the first name is hello and the last name is World.
I am using a Derived Column (SSIS) with the RIGHT function for the first name and the SUBSTRING function for the last name, but the results come out blank, and this is where I draw a blank too. :)
It's working for me. In general, you should provide more detail in your questions on places such as this to help others recreate and troubleshoot your issue. You did not specify whether we needed to address NULLs in this field nor do I know how you'd want to interpret it so there is room for improvement on this answer.
I started with a simple OLE DB Source and hard coded a query of "SELECT 'World, Hello' AS Name".
I created 2 Derived Column Tasks. The first one adds a column to the Data Flow called FirstCommaPosition. The formula I used is FINDSTRING(Name,",", 1). If Name is NULLable, then we will need to test for nullability prior to calling the FINDSTRING function. You'll then need to determine how you want to store the split data in the case of NULLs. I would assume both first and last should be NULLed, but I don't know that.
There are two reasons for doing this in separate steps. The first is performance. As counter-intuitive as it sounds, doing less in a derived column results in better performance because the SSIS engine can better parallelize the operations. The other is simpler: I will need to use this value to make the first and last name split, so it will be easier and less maintenance to reference a column than to copy and paste a formula.
The second Derived Column is going to actually perform the split.
My FirstNameUnicode column uses this formula: (FirstCommaPosition > 0) ? RTRIM(LTRIM(RIGHT(Name, LEN(Name) - FirstCommaPosition))) : "" That says "If we found a comma in the preceding step, then slice out everything after the comma's position to the end of the string and apply trim operations. If we didn't find a comma, then just return a blank string." Note that RIGHT takes a character count rather than a starting position, so the count has to be the length of the string minus the comma's position. The default string type for expressions will be Unicode (DT_WSTR), so if that is not what you need, you will have to cast the result to the correct string code page (DT_STR).
My LastNameUnicode column uses this formula: (FirstCommaPosition > 0) ? SUBSTRING(Name,1,FirstCommaPosition -1) : "" Similar logic as above, except now I use the SUBSTRING operation instead of RIGHT. Users of the 2012 release of SSIS and beyond, rejoice, for you can use the LEFT function instead of SUBSTRING. Also note that you will need to back off 1 position to remove the comma.
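The two-step split can be traced in plain Python to check the position arithmetic. This is a sketch, not SSIS expression syntax: split_name is a hypothetical helper, returning None for both parts on a NULL input is an assumption the answer leaves open, and Python's strip() removes all whitespace where LTRIM and RTRIM remove only spaces.

```python
def split_name(name):
    """Split 'Last, First' into (first, last), mirroring the two derived columns."""
    if name is None:
        # Assumption: a NULL full name yields NULL for both parts.
        return None, None
    # Step 1: 1-based comma position, 0 when no comma is found (like FINDSTRING).
    first_comma_position = name.find(",") + 1
    if first_comma_position == 0:
        return "", ""
    # Step 2: everything after the comma, trimmed, is the first name.
    first = name[first_comma_position:].strip()
    # Everything before the comma is the last name (back off one position).
    last = name[:first_comma_position - 1]
    return first, last


print(split_name("World, Hello"))
print(split_name("NoCommaHere"))
```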

Oracle Text Search doesn't work on some words

I am using Oracle Text search for my project. I created a ctxsys.context index on my column and inserted one entry, "Would you like some wine???". I executed the query
select guid, text, score(10) from triplet where contains (text, 'Would', 10) > 0
and it gave me no results. Querying 'you' and 'some' also returns zero results. Only 'like' and 'wine' match the record. Does Oracle consider 'you', 'would', and 'some' to be stop words? How can I make Oracle match these words? Thank you.
I found that the query's output is exactly what the default stop word list in Oracle would produce.
The stoplists live in the CTXSYS schema, and you can query for the stoplists and the stop words using
SELECT * FROM CTX_STOPLISTS;
SELECT * FROM ctx_stopwords;
And yes, Oracle considers 'you', 'would', and 'some' in your query to be stop words.
The following list shows the default stop words.
a did in only then where
all do into onto there whether
almost does is or therefore which
also either it our these while
although for its ours they who
an from just s this whose
and had ll shall those why
any has me she though will
are have might should through with
as having Mr since thus would
at he Mrs so to yet
be her Ms some too you
because here my still until your
been hers no such ve yours
both him non t very
but his nor than was
by how not that we
can however of the were
could i on their what
d if one them when
If you need to remove specific words from the stoplist (or add stop words), you have to execute a procedure from the ctx_ddl package. This requires that EXECUTE ON CTXSYS.CTX_DDL has been granted to your user.
Example:
begin
  ctx_ddl.remove_stopword('mystop_list','some');
  ctx_ddl.remove_stopword('mystop_list','you');
end;
Refer to the documentation for the various functions in the ctx_ddl package.
You can get a full description of the created ctx index by querying
select ctx_report.describe_index('yourindex_name') from dual;
Look at the docs
In paragraph "4.1.5 Querying Stopwords" you can get some useful info :)
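To see why only 'like' and 'wine' matched, it can help to mimic the effect of a stoplist in a few lines of Python. This is a toy model under assumptions: the tokenizer is a simple regular expression and the stopword set is a small subset of the default list above, so it does not reproduce how Oracle's lexer actually works.

```python
import re

# Subset of the default stop words listed above, enough for the example sentence.
DEFAULT_STOPWORDS = {"would", "you", "some"}


def indexed_tokens(text, stopwords):
    """Lower-case the text, split it into word tokens and drop the stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stopwords]


sentence = "Would you like some wine???"

# With the default stoplist only two tokens are searchable.
print(indexed_tokens(sentence, DEFAULT_STOPWORDS))

# After removing 'some' and 'you' from the stoplist, they become searchable too.
print(indexed_tokens(sentence, DEFAULT_STOPWORDS - {"some", "you"}))
```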

How do I query for when a field doesn't begin with a letter?

I'm tasked with adding an option to our search, which will return results where a given field doesn't begin with a letter of the alphabet. (The .StartsWith(letter) part wasn't so hard).
But I'm rather unsure about how to get the results that don't fall within the A-Z set, and equally hoping it generates some moderately efficient SQL underneath.
Any help appreciated - thanks.
In C#, use the following construct, assuming db is a data context:
var query = from row in db.SomeTable
            where !System.Data.Linq.SqlClient.SqlMethods.Like(row.SomeField, "[A-Z]%")
            select row;
This is only supported in LINQ to SQL queries. All rules of the T-SQL LIKE operator apply.
You could also use a less efficient solution:
var query = from row in db.SomeTable
            where row.SomeField[0] < 'A' || row.SomeField[0] > 'Z'
            select row;
This gets translated into SUBSTRING, CAST, and UNICODE constructs.
Finally, you could use VB, where there appears to be native support for the Like method.
Though SQL provides the ability to check a range of characters in a LIKE statement using bracket notation ([a-f]% for example), I haven't seen a linq to sql construct that supports this directly.
A couple thoughts:
First, if the result set is relatively small, you could do a .ToList() and filter in memory after the fact.
Alternatively, if you have the ability to change the data model, you could set up additional fields or tables to help index the data and improve the search.
--EDIT--
Made changes per Ruslan's comment below.
Well, I have no idea if this will work, because I have never tried it and don't have a compiler nearby, but the first thing I would try is
var query = from x in db.SomeTable
            where x.SomeField != null &&
                  x.SomeField.Length >= 1 &&
                  x.SomeField.Substring(0, 1).All(c => !Char.IsLetter(c))
            select x;
It is possible that LINQ to SQL fails to convert this to SQL.
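The intended filter is easy to state in memory, which is useful for checking expected results before worrying about the generated SQL. The Python sketch below follows the last query's null and length guards and treats only ASCII A-Z and a-z as letters; the function name and sample rows are hypothetical, and a real LIKE '[A-Z]%' comparison depends on the column's collation.

```python
def starts_with_non_letter(value):
    """True when value is non-empty and its first character is not an ASCII letter."""
    if not value:
        # None and empty strings are excluded, like the null and Length checks above.
        return False
    first = value[0]
    return not (first.isascii() and first.isalpha())


# Hypothetical values standing in for SomeField.
rows = ["Apple", "3M", "_tmp", "zebra", "", None]
print([r for r in rows if starts_with_non_letter(r)])
```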
