MariaDB fulltext search with special chars and "word starts with"

I can run a MariaDB fulltext query that searches for words beginning with a given string, like this:
select * from mytable
where match(mycol) against ('+test*' in boolean mode)>0.0;
This finds words like "test", "tester", "testing".
If my search string contains special characters, I can put the search string in quotes:
select * from mytable
where match(mycol) against ('+"test-server"' in boolean mode)>0.0;
This will find all rows which contain the string test-server.
But it seems I cannot combine both:
select * from mytable
where match(mycol) against ('+"test-serv"*' in boolean mode)>0.0;
This results in an error:
Error: (conn:7) syntax error, unexpected $end, expecting FTS_TERM or FTS_NUMB or '*'
SQLState: 42000
ErrorCode: 1064
Placing the * inside the quoted string returns no results (as expected):
select * from mytable
where match(mycol) against ('+"test-serv*"' in boolean mode)>0.0;
Does anybody know whether this is a limitation of MariaDB? Or a bug?
My MariaDB version is 10.0.31

WHERE MATCH(mycol) AGAINST('+test +serv*' IN BOOLEAN MODE)
AND mycol LIKE '%test_serv%'
The MATCH will find the desired rows plus some that are not desired. Then the LIKE will filter out the duds. Since the LIKE is being applied to only some rows, its slowness is masked.
(Granted, this does not work in all cases. And it requires some manual manipulation.)
d'Artagnan - Use
WHERE MATCH(mycol) AGAINST("+Arta*" IN BOOLEAN MODE)
AND mycol LIKE '%d\'Artagnan%'
Note that I used the suitable escaping for getting the apostrophe into the LIKE string.
So, the algorithm for your code goes something like:
Break the string into "words" the same way FULLTEXT would.
Toss any strings that are too short.
If no words are left, then you cannot use FULLTEXT and are stuck with a slow LIKE.
Stick * after the last word (or each word?).
Build the AGAINST with those word(s).
Add on AND LIKE '%...%' with the original phrase, suitably escaped.
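
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not MariaDB's actual tokenizer: it assumes FULLTEXT splits on every character that is not a letter, digit or underscore, and that the minimum indexed word length is 3 (the `min_token_len` default here; check `innodb_ft_min_token_size` or `ft_min_word_len` on your server). The function name `build_prefix_search`, the column `mycol` and the `%s` placeholder style are illustrative choices, and stopwords are not handled.

```python
import re


def build_prefix_search(phrase, min_token_len=3):
    """Return (where_clause, params) for a prefix search on column mycol."""
    # Assumption: approximate the FULLTEXT tokenizer by splitting on
    # anything that is not a letter, digit or underscore.
    words = [w for w in re.split(r"\W+", phrase) if len(w) >= min_token_len]

    # Escape the LIKE wildcards (and the escape character itself) so the
    # original phrase is matched literally.
    escaped = (
        phrase.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    like_param = "%" + escaped + "%"

    if not words:
        # No usable words: FULLTEXT cannot help, fall back to a slow LIKE.
        return "mycol LIKE %s", [like_param]

    # Require every word, and put * after the last one only.
    terms = ["+" + w for w in words]
    terms[-1] += "*"
    where = "MATCH(mycol) AGAINST (%s IN BOOLEAN MODE) AND mycol LIKE %s"
    return where, [" ".join(terms), like_param]
```

For the phrase test-serv this produces the parameters '+test +serv*' and '%test-serv%'; for d'Artagnan it produces '+Artagnan*' and '%d''Artagnan%' once the driver has quoted it. Passing both values as bound parameters leaves the apostrophe escaping to the database driver.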


How can I use 'update where' select in FoxPro?

I am totally new to FoxPro (and quite fluent with MySQL).
I am trying to execute this query in FoxPro:
update expertcorr_memoinv.dbf set 'Memo' = (select 'Memo' from expertcorr_memoinv.dbf WHERE Keymemo='10045223') WHERE Keydoc like "UBOA"
I got the error:
function name is missing )
How can I fix it?
In FoxPro SQL statements you would not 'single-quote' column names. In Visual FoxPro version 9 the following sequence would run without errors:
CREATE TABLE expertcorr_memoinv (keydoc Char(20), keymemo M, Memo M)
Update expertcorr_memoinv.dbf set Memo = (select Memo from expertcorr_memoinv.dbf WHERE Keymemo='10045223') WHERE Keydoc like "UBOA"
If you provide some sample data and the expected result, we can check whether the line you posted does what you want once the single-quoted 'Memo' names are corrected.
NB 1: "Memo" is a reserved word in FoxPro.
NB 2: As you know, the semicolon (;) is the line-continuation character in Visual FoxPro, so a longer SQL statement can be split across several lines.
So that the Update one-liner could be written as:
Update expertcorr_memoinv ;
Set Memo = (Select Memo From expertcorr_memoinv ;
WHERE Keymemo='10045223') ;
WHERE Keydoc Like "UBOA"
NB 3: Alternatively, you can use SQL Update .... From... in Visual FoxPro, similar to the Microsoft SQL Server feature. See How do I UPDATE from a SELECT in SQL Server?
I would do that just as Stefan showed.
In VFP, you also have a chance to use non-SQL statements which make it easier to express yourself. From your code it feels like KeyMemo is a unique field:
* Get the Memo value into an array
* where KeyMemo = '10045223'
* or use that as a variable also
local lcKey
lcKey = '10045223'
Select Memo From expertcorr_memoinv ;
WHERE Keymemo=m.lcKey ;
into array laMemo
* Update with that value
Update expertcorr_memoinv ;
Set Memo = laMemo[1] ;
WHERE Keydoc Like "UBOA"
This is only a divide-and-conquer approach that some may find easier to follow. Otherwise, writing it as a single SQL statement is just fine.
PS: In VFP you don't use backticks at all.
Single quotes, double quotes and opening/closing square brackets are not used to delimit identifiers; all three delimit string literals:
'This is a string literal'
"This is a string literal"
[This is a string literal]
"My name is John O'hara"
'We need 3.5" disk'
[Put 3.5" disk into John's computer]
There are subtle differences between them, which I think is an advanced topic and that you may never need to know.
Also [] is used for array indexer.
Any of the three can also be used for things like a table name, alias name or file name (a name expression). They are still string literals; the parentheses make it a name expression, i.e.:
select * from ('MyTable') ...
copy to ("c:\my folder\my file.txt") type delimited

Space characters inside Oracle's Contains() function

I needed to use Oracle 11g's Contains() function to search a field for exact text typed by the user. I was asked not to use the 'like' operator.
According to the Oracle documentation, for everything to work you need to:
Double } characters
Put the whole input between {}
This works in most cases, with a few exceptions. Below is a test case:
create table theme
(name varchar2(300 char) not null);
insert into theme (name)
values ('a');
insert into theme (name)
values ('b');
insert into theme (name)
values ('a or b');
insert into theme (name)
values ('Pdz344_1_b');
create index name_index on theme(name) indextype is ctxsys.context;
If the 'or' operator were interpreted, I would get all four rows, which fortunately is not the case. Now if I run the following, I would expect it to find only 'a or b'.
select * from theme
where contains(name, '{a or b}')>0;
However, I also get 'Pdz344_1_b'. But it contains no 'a', 'o' or 'r', so I find it very surprising that this text is matched. Is there something I don't get about the syntax of contains()?
CONTAINS is not like the LIKE operator at all: it uses the Oracle Text search engine (something like a Google search), not plain string matching.
{} is an escape marker: everything you put inside it is treated as text to escape.
Therefore your query looks for the literal text "a or b", not for a OR b.
So your query matches Pdz344_1_b because it contains the character b.
The row containing only a is not matched because a is in the default stop list.
Why is the row with just b not matched? Because your match sequence actually looks like a\ or\ b.
So we have 3 tokens: a, _or and _b (underscores represent spaces). a is in the stop list, and the b row contains no _b string because it holds only a single character. The Pdz344_1_b row does contain this combination, because non-alphabetic characters are treated as whitespace. If you remove {} or query for {b or a}, you will get matches against b as well.
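
The escaping rule quoted in the question (double every } and wrap the whole input in {}) can be sketched as below. This is a minimal Python sketch; the function name `escape_for_contains` is hypothetical. As the answer explains, the escaped text is still tokenized by Oracle Text, so this does not turn CONTAINS into an exact string match.

```python
def escape_for_contains(user_input):
    """Escape free text for use as an Oracle Text CONTAINS query string."""
    # Rule from the question: double each closing brace, then wrap the
    # whole input in braces so operators such as "or" are not interpreted.
    return "{" + user_input.replace("}", "}}") + "}"
```

The result is meant to be passed to contains() as a bind variable rather than concatenated into the SQL text.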

SPHINX field search operator issue

I am using sphinx 2.0.4-release with SPH_MATCH_EXTENDED2 query syntax. When I have an "empty value" in my query i.e.:
blah & ''
Sphinx ignores it and searches for just "blah". It still works the same way when I use the field search operator and the empty value comes last:
@field1 blah @field2 ''
But this query:
@field1 '' @field2 blah
causes an error: syntax error, unexpected TOK_FIELDLIMIT near ' '' @field2 blah'. Of course I can trim empty values, but this behaviour seems illogical to me. Am I doing something wrong, or is it actually a bug?
Sphinx uses an inverted index. It breaks the text up into words and stores (hashes of) them.
As such, it doesn't index 'nothing' (it's not a word), so you can't search for an empty string.
All of those queries are strictly syntax errors, and nonsense. But in some cases Sphinx silently discards the invalid syntax (it falls back to treating the characters as word characters, which are not in charset_table and so are dropped) and in doing so comes up with a 'valid' query, just not the one you intended.
The solution is simply to turn an empty field into a 'word' at indexing time; then you can search for the empty string!
e.g.
sql_query = SELECT id, title, IF(field1 = '','EMPTY_STRING',field1) AS field1, ....
Then you can just query as
@field1 EMPTY_STRING @field2 blah
What you use as 'EMPTY_STRING' is completely arbitrary.
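
On the query side, the same placeholder has to be substituted whenever a field value is empty. A minimal Python sketch, assuming the placeholder word is EMPTY_STRING as above; the function name `build_field_query` and the dict-based input are illustrative, not part of any Sphinx API, and no escaping of special query characters is attempted:

```python
EMPTY_PLACEHOLDER = "EMPTY_STRING"  # must match the word used in sql_query


def build_field_query(field_values):
    """Build an extended-syntax query such as '@field1 foo @field2 bar'."""
    parts = []
    for field, value in field_values.items():
        # Replace empty or whitespace-only values with the placeholder word
        # so the field operator is never left without a term.
        term = value.strip() or EMPTY_PLACEHOLDER
        parts.append("@" + field + " " + term)
    return " ".join(parts)
```

Calling it with an empty field1 and field2 set to blah gives the query shown above, so the application never has to special-case empty values.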

What's the XPath equivalent to the SQL IN query?

I would like to know what the XPath equivalent of the SQL IN query is. Basically, in SQL I can do this:
select * from tbl1 where Id in (1,2,3,4)
So I want something similar in XPath/XSL, i.e.:
//*[@id= IN('51417','1121','111')]
Please advise.
(In XPath 2,) the = operator always works like in.
I.e. you can use
//*[@id = ('51417','1121','111')]
A solution is to write out the options as separate conditions:
//*[(@id = '51417') or (@id = '1121') or (@id = '111')]
Another, slightly less verbose solution that looks a bit like a hack, though, would be to use the contains function:
//*[contains('-51417-1121-111-', concat('-', @id, '-'))]
Literally, this means you're checking whether the value of the id attribute (preceded and followed by a delimiter character) is a substring of -51417-1121-111-. Note that I am using a hyphen (-) as the delimiter of the allowable values; you can replace it with any character that will not appear in the id attribute.
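
To see why the delimiters matter, the same containment test can be mimicked with plain Python string operations. This is only an illustration of the logic, not an XPath evaluation; the function names are made up for this sketch:

```python
ALLOWED = "-51417-1121-111-"


def matches_with_delimiters(id_value):
    # Mirrors contains('-51417-1121-111-', concat('-', @id, '-')).
    return ("-" + id_value + "-") in ALLOWED


def matches_without_delimiters(id_value):
    # Naive variant with no delimiters, shown only for contrast.
    return id_value in "514171121111"
```

With delimiters, '1121' matches and '112' does not; without them, '112' is wrongly accepted because it is a substring of the concatenated values. An id that itself contains the delimiter, such as '51417-1121', would still slip through, which is why the delimiter must be a character that cannot occur in the id attribute.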

SQL Sorting and hyphens

Is there a way to easily sort in SQL Server 2005 while ignoring hyphens in a string field? Currently I have to use REPLACE(fieldname,'-','') or a function to remove the hyphen in the sort clause. I was hoping there was a flag I could set at the top of the stored procedure or something.
Access and the GridView default sorting seem to ignore the hyphen in strings.
I learned something new here as well.
I believe the difference is between a "string sort" and a "word sort" (which ignores the hyphen).
Sample difference between word sort and string sort:
http://andrusdevelopment.blogspot.com/2007/10/string-sort-vs-word-sort-in-net.html
From Microsoft
http://support.microsoft.com/kb/322112
For example, if you are using the SQL collation "SQL_Latin1_General_CP1_CI_AS", the non-Unicode string 'a-c' is less than the string 'ab' because the hyphen ("-") is sorted as a separate character that comes before "b". However, if you convert these strings to Unicode and you perform the same comparison, the Unicode string N'a-c' is considered to be greater than N'ab' because the Unicode sorting rules use a "word sort" that ignores the hyphen.
I wrote some sample code. You can also experiment with COLLATE to find a collation that gives the sort order you want.
DECLARE @test TABLE
(string VARCHAR(50))
INSERT INTO @test SELECT 'co-op'
INSERT INTO @test SELECT 'co op'
INSERT INTO @test SELECT 'co_op'
SELECT * FROM @test ORDER BY string --COLLATE SQL_Latin1_General_Cp1_CI_AS
--co op
--co-op
--co_op
SELECT * FROM @test ORDER BY CAST(string AS NVARCHAR(50)) --COLLATE SQL_Latin1_General_Cp1_CI_AS
--co op
--co_op
--co-op
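
Outside the database, the REPLACE workaround from the question corresponds to sorting on a key with the hyphens removed. A minimal Python sketch for comparison only: Python compares code points here, which is not the same as any SQL Server collation, so treat it as an illustration of the idea rather than a reproduction of the results above.

```python
values = ["co-op", "co op", "co_op"]

# Plain sort: the hyphen takes part in the comparison.
plain = sorted(values)

# Sort on a key with hyphens stripped, like ORDER BY REPLACE(col, '-', '').
ignoring_hyphens = sorted(values, key=lambda s: s.replace("-", ""))
```

Stripping the hyphen in the sort key moves 'co-op' after 'co_op', because it is then compared as 'coop'.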
