I'm using the jQuery quicksearch plugin.
It's a great plugin, but I have an issue on one of my websites.
In a table, I have strings that contain accented characters, like ò, ü, ä, etc.
The problem is that users will not necessarily type the accents when searching.
What I would like is that when someone types "ola", for example, the search results also include entries containing "ölä".
Any idea how to do this?
You need to store your values in the table in two columns: 1) raw and 2) clean.
1) Raw: the words as they are, with characters like ò, ü, ä, etc.
2) Clean: the same words without accents, like o, u, a.
To produce the "clean" value you need a conversion function that runs whenever you update or insert into the table; it is not hard to write.
Then, while users type their search terms, you can look through both columns, "clean" and "raw".
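As a rough illustration of the two-column idea (in Python rather than jQuery or SQL, and with made-up names such as strip_accents, build_rows and search), the "clean" value can be derived by decomposing each character and dropping the combining accent marks. This is only a sketch; letters that have no decomposition, such as ø, would need an explicit mapping.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (ö -> o, ä -> a)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the accents split off by NFD.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def build_rows(values):
    """Pair every raw value with its accent-free "clean" counterpart."""
    return [(raw, strip_accents(raw)) for raw in values]


def search(rows, term):
    """Return the raw values whose raw OR clean column contains the term."""
    needle = term.lower()
    return [
        raw
        for raw, clean in rows
        if needle in raw.lower() or needle in clean.lower()
    ]


rows = build_rows(["ölä", "xyz"])
print(search(rows, "ola"))  # matched through the clean column
print(search(rows, "ölä"))  # matched through the raw column
```

Both searches return the original accented value, because only the raw column is ever shown to the user.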
I have a file I am reading into a blob via Data Factory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if I want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping".
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
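To show what such a rule does to the example headers from the question, here is a small Python sketch (this is not Data Flow expression syntax, and clean_header is a name I made up) that removes every character that is not a letter. Note that digits are dropped as well, so you may want to widen the character class.

```python
import re


def clean_header(name: str) -> str:
    """Remove every character that is not an ASCII letter."""
    return re.sub(r"[^A-Za-z]", "", name)


headers = [
    "Activations in last 15 seconds high+Low",
    "first entry speed (serial T/a)",
]

for header in headers:
    print(header, "->", clean_header(header))
```

With this pattern the two headers become ActivationsinlastsecondshighLow and firstentryspeedserialTa.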
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I came across this by accident in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case as Data Flow is clearly aware of the column name. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters there.
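Outside of Data Flow, the same "only touch the first row" idea could look roughly like the Python sketch below. It assumes a comma-delimited text file with no quoted headers, and clean_first_row is a hypothetical helper, not anything Data Factory provides.

```python
import re


def clean_first_row(text: str, delimiter: str = ",") -> str:
    """Sanitize the header row of delimited text; data rows stay untouched."""
    header, newline, rest = text.partition("\n")
    cleaned = [
        # Collapse each run of non-alphanumeric characters into one underscore.
        re.sub(r"[^A-Za-z0-9]+", "_", column).strip("_")
        for column in header.split(delimiter)
    ]
    return delimiter.join(cleaned) + newline + rest


sample = "first entry speed (serial T/a),high+Low\n1/2,x y\n"
print(clean_first_row(sample))
```

Only the header line changes; values such as 1/2 in the data rows are left as they are.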
I have a large table (100,000+ rows) with a CLOB column in which I need to search for specific words within a certain timeframe.
select id, clob_field,
       dbms_lob.instr(clob_field, '.doc', 1, 1) as doc,   -- ideally want .doc
       dbms_lob.instr(clob_field, '.docx', 1, 1) as docx, -- ideally want .docx
       dbms_lob.instr(clob_field, '.DOC', 1, 1) as DOC,   -- ideally want .DOC
       dbms_lob.instr(clob_field, '.DOCX', 1, 1) as DOCX  -- ideally want .DOCX
from clob_table, search_words s
where (to_char(date_entered, 'DD-MON-YYYY')
       between to_date('01-SEP-2018') and to_date('30-SEP-2018'))
AND (contains(clob_field, s.words) > 0);
The set of words is '.doc', '.DOC', '.docx', and '.DOCX'. When I use CONTAINS(), it seems to ignore the dot, so it returns lots of rows, but not the ones with the document extensions in them. It finds emails with .doc as part of the address, where the doc has a period on either side of it.
i.e. mail.doc.george#here.com
I don't want those occurrences. I have tried it with a space at the end of the word and it ignores the spaces. I have put these in a search table I created, as shown above, and it still ignores the spaces. Any suggestions?
Thanks!!
Here are two suggestions.
The simple, inefficient way is to use something besides CONTAINS. Context indexes are notoriously tricky to get right. So instead of the last line, you could do:
AND regexp_instr(clob_field, '\.docx', 1,1,0,'i') > 0
I think that should work, but it might be very slow, which is when you'd use an index. But Oracle Text indexes are more complicated than normal indexes. This old doc explains that punctuation characters (as defined in the index parameters) are not indexed, because the point of Oracle Text is to index words. If you want special characters to be indexed as part of the word, you need to add them to the set of printjoin characters. This doc explains how, but I'll paste it here. You need to drop your existing CONTEXT index and re-create it with this preference:
begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '._-'); -- periods, underscores, dashes can be parts of words
end;
/
CREATE INDEX myindex on clob_table(clob_field) INDEXTYPE IS CTXSYS.CONTEXT
parameters ('LEXER mylex');
Keep in mind that CONTEXT indexes are case-insensitive by default; I think that's what you want, but FYI you can change it by setting the 'mixed_case' attribute to 'Y' on the lexer, right below where you set the printjoins attribute above.
Also, it seems like you're trying to search for words that end in .docx, but CONTAINS isn't INSTR: by default it matches entire words, not strings of characters. You'd probably want to modify your query to use: AND contains(clob_field, '%.docx') > 0
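To illustrate what the regular-expression route from the first suggestion has to distinguish, here is a sketch in Python (not Oracle SQL). It treats .doc or .docx as an extension only when it is not followed by more word characters or by another dotted part, which excludes the email-style case from the question. It relies on a lookahead, which as far as I know Oracle's regexp functions do not support, so take it as a description of the matching rule rather than a drop-in condition. The names DOC_EXTENSION and mentions_word_document are made up.

```python
import re

# ".doc" or ".docx", case-insensitive, not followed by a word character
# and not followed by "." plus a word character (as in mail.doc.george).
DOC_EXTENSION = re.compile(r"\.docx?(?!\w|\.\w)", re.IGNORECASE)


def mentions_word_document(text: str) -> bool:
    """Return True if text contains .doc/.docx used as a file extension."""
    return DOC_EXTENSION.search(text) is not None


for sample in [
    "see report.doc attached",
    "Final.DOCX was sent.",
    "mail.doc.george#here.com",
    "the my.docs folder",
]:
    print(sample, "->", mentions_word_document(sample))
```

The first two samples match and the last two do not.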
Let's say I have three tables: A, B, and C.
For every table I have some insert queries.
I want to use Find (Ctrl+F) to find every insert query that follows a certain format.
Example: I want to find code that contains "insert [table_name] value", no matter what the table name is (A, B, or C), so I want to search for some code but skip the word in the middle of it.
I have googled with every keyword I could think of, but I haven't found any solution that is even close to what I want.
Is it possible to do something like this?
You need to use what are known as "wildcard" characters.
In the find window, you'll notice there is a check box called "Use Pattern Matching".
If you check this, then you can use some special characters to expand your search.
? is a wildcard that matches any single character in that position.
* is a wildcard that matches a string of any length in that position.
e.g. ca? would match cat, car, cam, etc.
ca* would match cat, car, catastrophe, called, etc.
So something along the lines of insert * value should find what you are interested in.
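If you want to try the wildcard behaviour outside the editor, Python's fnmatch module uses the same ? and * conventions. This is only a sketch to show how the pattern behaves, not how your editor implements it; find_matching_lines is a made-up helper. One difference: fnmatch compares against the whole string, so the pattern is wrapped in * on both sides to find it anywhere in a line.

```python
from fnmatch import fnmatchcase


def find_matching_lines(lines, pattern):
    """Return the lines that contain the wildcard pattern anywhere."""
    wrapped = f"*{pattern}*"
    return [line for line in lines if fnmatchcase(line, wrapped)]


print(fnmatchcase("cat", "ca?"))          # ? stands for exactly one character
print(fnmatchcase("catastrophe", "ca?"))  # too long for a single ?
print(fnmatchcase("catastrophe", "ca*"))  # * stands for any run of characters

code = [
    "insert A value (1)",
    "select * from B",
    "insert C value (3)",
]
print(find_matching_lines(code, "insert * value"))
```

The last call returns the two insert lines and skips the select.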
I need to search a DB table using some kind of fuzzy search like the one from Oracle, and using indexes, since I do not want a table scan (there is a lot of data).
I want to ignore case, language-specific characters (ñ, ß, ...), and special characters like _, (), -, etc.
A search for "maria (cool)" should get "maria- COOL" and "María_Cool" as matches.
Is that possible in Oracle in some way?
As for case, I think it can be solved by creating the index directly in lower case and always searching in lower case. But I do not know how to handle the special characters.
I thought about storing the data without special characters in a separate column and searching on that while returning the real value, but I am not 100% sure whether that is the best solution.
Any ideas?
Maybe UTL_MATCH can help.
But you can also create a function-based index on, let's say, something like this:
regexp_replace(your_column, '[^0-9a-zA-Z]+', ' ')
And try to match like this:
...
WHERE regexp_replace(your_column, '[^0-9a-zA-Z]+', ' ') =
regexp_replace('maria (cool)' , '[^0-9a-zA-Z]+', ' ')
Here is a sqlfiddle demo. It's not complete, but it can be a start.
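To make the goal concrete, here is a hedged Python sketch (not Oracle code; search_key is a hypothetical name) of a fuller normalization: fold case, drop accents, turn every run of other characters into a single space, and trim. In Oracle you would have to build the equivalent expression yourself and use it both in the function-based index and in the WHERE clause.

```python
import re
import unicodedata


def search_key(text: str) -> str:
    """Normalize text so case, accents, and punctuation do not matter."""
    folded = text.casefold()  # lower-cases and maps ß to "ss"
    decomposed = unicodedata.normalize("NFKD", folded)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace each run of non-alphanumeric characters with one space.
    return re.sub(r"[^0-9a-z]+", " ", no_marks).strip()


for value in ["maria (cool)", "maria- COOL", "María_Cool"]:
    print(repr(value), "->", repr(search_key(value)))
```

All three example values end up with the same key, maria cool, so an equality comparison on the key treats them as matches.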
I have some ASCII files and want to convert them into maybe an Excel file or a tab- or comma-delimited text file. Each file is a table with field names and field attributes. It also includes the index name, table name, and field(s) to index, if required, depending on the software. I don't think it is necessary to worry about those; I hope the field names and field attributes are enough. I just want the information hidden inside. Can any of you experts help me get this done?
The lines are something like this:
10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
10000003$11-word word word$10033315$0413004$$$$$$N$$
10000004$11-word word word$10033315$$$$$$$Y$017701$
The general answer, before knowing the details of your ASCII file, your operating system, and so on, would be:
1 - Cut the top n lines that contain the information you don't want. Leave the field names, if you want to.
2 - Check whether the fields are separated by a common character, for example a comma (,).
3 - Import the file into a spreadsheet program, like Excel or OpenOffice Calc. In OOCalc, choose to import the file, then select the correct separator character.
That's all.
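If you would rather script it, the sample lines in the question look like they are separated by a dollar sign, so a small Python sketch along these lines could write them out as CSV. This assumes $ never appears inside a field, and dollar_to_csv is a name I made up; skipping any unwanted header lines is left out.

```python
import csv
import io


def dollar_to_csv(text: str) -> str:
    """Convert $-separated lines into CSV text, skipping blank lines."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if line.strip():
            # Plain split keeps the literal quotes in fields like "WORD" WORD;
            # the csv writer then escapes them properly.
            writer.writerow(line.split("$"))
    return out.getvalue()


sample = (
    '10000001$"WORD" WORD$10001890$$$$495.7$$$N$$\n'
    "10000002$11-word-word word$10000002$$$$$$$Y$$\n"
)
print(dollar_to_csv(sample))
```

The resulting text can be saved with a .csv extension and opened directly in Excel or OpenOffice Calc.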