Apache Derby - case (in)sensitivity - derby

I googled around a bit about case-insensitive search in Apache Derby. All the Google results are very old (2007 at the latest). I found that it is impossible to search case-insensitively without losing the index (LOWER doesn't use the index).
Is this still true? Or is there a way to get case-insensitive search on indexed varchar/text columns?
Thanks in advance.

Have a look at collation:
You could use TERRITORY_BASED:SECONDARY when creating the database (the collation attribute is set in the connection URL at creation time); this was the only way I was able to achieve this:
TERRITORY_BASED:SECONDARY: Territory based with collation strength SECONDARY.
SECONDARY typically means that differences in base letters or accents are
considered significant, whereas differences in case are not considered
significant.
Example:
jdbc:derby:MexicanDB;create=true;collation=TERRITORY_BASED:SECONDARY
Apparently it is not possible in Derby to create an index over a function:
https://issues.apache.org/jira/browse/DERBY-455
Another possibility is to store the same value in a lower case column and search in that.
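The lower-case column idea can be sketched as follows. This is only a minimal illustration: it uses Python's standard-library sqlite3 module as a stand-in for Derby so that it runs without a Derby installation, and the table, column, and function names (company, name, name_lower, insert_company, find_company) are hypothetical. In Derby you would issue the equivalent DDL and keep the lower-case column in sync yourself.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a lower-cased shadow column and index it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE company ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"        # value as entered by the user
        " name_lower TEXT NOT NULL)"  # lower-cased copy used only for searching
    )
    # The index is on the shadow column, so no function call is needed at query time.
    conn.execute("CREATE INDEX idx_company_name_lower ON company (name_lower)")
    return conn


def insert_company(conn, name):
    """Store the original value and its lower-cased copy together."""
    conn.execute(
        "INSERT INTO company (name, name_lower) VALUES (?, ?)",
        (name, name.lower()),
    )


def find_company(conn, term):
    """Case-insensitive lookup: lower-case the search term, compare to the shadow column."""
    rows = conn.execute(
        "SELECT name FROM company WHERE name_lower = ? ORDER BY id",
        (term.lower(),),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
for company_name in ("Apache Derby", "APACHE DERBY", "MySQL"):
    insert_company(conn, company_name)

print(find_company(conn, "apache derby"))
```

The price of this approach is the extra storage and the need to update both columns on every write.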

Related

Elasticsearch Index wildcard Performance

log-1
log-2
log-3
Suppose I have the indices above and I search using "log-*".
But suppose that the data I want is only in log-1.
Is there a difference in actual operation and performance between searching with log-* and searching with log-1?
Search commands will surely be executed on log-1 and log-2 indexes.
On the indices that hold no matching data the search finds nothing, but what does it actually do there?
Documents are indexed when they are inserted, before any search runs.
I hope you already have a proper analyzer, such as the standard analyzer, to make the searching process simpler.
In that case, the full-text search should be fast.
Sometimes it depends on the type of data you store, so you should run tests on your own data and environment for the two cases you mentioned in order to come to a conclusion.

(Oracle DSEE) LDAP browsing index with parameters in the vlvFilter

I'm running into some problems creating a browsing index for VLV searches.
The Oracle docs (https://docs.oracle.com/cd/E19693-01/819-0995/bcatq/index.html) state that
The vlvFilter is the same LDAP filter that is used in the client search operations.
The filter I am using for the VLV searches however is parameterised, e.g.
(&(objectclass=MySpecialObjectClass)(modificationTimestamp>=$someDynamicValue))
So any ideas what should be put in the vlvFilter attribute for this browsing index?
Thanks!
You cannot use a VLV index for filters that are constantly changing. VLV indexes are meant to provide consistent scrolling lists of results.
A VLV search must also come with a sort order.
It looks to me like you may just need to index modifyTimeStamp (for ordering) and use the Paged Results Control to avoid getting all entries at once.

How to search for multiple strings in very large database

I want to search for multiple strings in a very large database. These strings are part of different columns of a database table. I have tried string search using LIKE in an SQL query, but it takes a lot of time to get results. I am using an Oracle database.
Should I index the database? I found that Lucene can be used for this.
I also got some suggestions to use big data concepts. Which approach should I use?
The easiest way is:
1.) adding an index to the columns you would like to search through
2.) using Oracle Text, as #lalitKumarB wrote
The most powerful way is:
3.) using a separate search engine (Solr, Elasticsearch).
But you will probably have to change your application so that it explicitly uses the search index for searching through the data.
I was in the same situation some years ago, trying to search text in a big database. After a while I found out that database-based search will never reach the performance of a dedicated search engine. And you get many more search features out of the box if you use Solr (for example), like spelling correction, "More like this", ...
One option is to hold the data in Oracle, search in Solr, and return the ID of the document, so that you only load from Oracle the one row that is referenced by the ID.
The second option is to keep Oracle as the base data pool for your search engine and search in Solr (or Elasticsearch), returning the whole document/row from Solr, not only the ID. Then you don't need to load the data from the database any more.
The best option depends on your needs.
You have the choice between Elasticsearch, Solr, or Lucene.

apache cassandra query/full text search

I've been playing around with Apache's Cassandra project. I've done a fair bit of reading and have built some fairly complex examples, including inserting single and batch sets of data, and retrieving single and multiple data sets based on keys.
Some of the articles i've looked at include
http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example
http://github.com/digg/lazyboy
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
http://www.sodeso.nl/?p=80
I've got a fairly good grasp of the concepts explained and have even implemented a simple app.
None of the articles describe how one would go about performing a query where, e.g., the query is a search term a user has typed in.
Does anyone know how or can suggest how i'd go about performing such a query?
Or perhaps a way to create a searchable index, full text search or anything even remotely close?
You will probably want to split the text into words and then use these words as keys into your "index". Each word maps to a timestamp-ordered column family holding the list of IDs of your articles, messages, etc. This means you can only perform simple searches over keys (words).
When searching for more than one word, intersect the ID lists from these column families.
This is a very simple approach; if you need more complex queries, look at Lucandra - http://github.com/tjake/Lucandra - Lucandra is a full-text search engine with Cassandra as backend storage.
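The word-as-key approach described above can be sketched in plain Python. This is only an in-memory illustration of the idea, not Cassandra client code: a dict stands in for the column family, and the names (InvertedIndex, tokenize, add, search) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-cased words (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each word to the set of document IDs that contain it."""

    def __init__(self):
        # word -> set of document IDs; stands in for one row per word in Cassandra
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document under every word it contains."""
        for word in tokenize(text):
            self.postings[word].add(doc_id)

    def search(self, query):
        """Return the IDs of documents that contain every word of the query."""
        words = tokenize(query)
        if not words:
            return set()
        # Start with the IDs for the first word, then intersect with the rest.
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


index = InvertedIndex()
index.add(1, "Cassandra data model")
index.add(2, "Full text search with Lucene")
index.add(3, "Cassandra full text search")

print(index.search("cassandra search"))
```

With a real Cassandra backend, each lookup in postings would become a read of the row keyed by that word, and the intersection would still happen in the application.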

Searching and and ampersand

I have a PHP/MySQL directory. If someone searches for a company name like "Johnson & Johnson" and it's in the DB as "Johnson and Johnson", it doesn't match.
I'm doing a NAME LIKE '% var %' kind of search currently. Is there an easy way to get this to work? I'm not sure if it's just a matter of setting up the table as InnoDB with full text on the column or if there's more involved.
Thanks,
Don
Yeah, you need a more sophisticated search capable of tokenising the search terms and searching through a tokenised index. You could probably get some of the way there with a full text search in the InnoDB table engine, but you could also look at other options. Some that you could consider:
Sphinx
Lucene
Solr
Nutch
All of these are more sophisticated full-text indexers and searchers than what you get built into a database engine, but they also require more work to set up than a MySQL full-text search, so it depends on the features you need.
Replacement of & by and is not really a trivial task in my book.
You may fare better by doing that kind of replacement beforehand, using a set of pre-defined rules (e.g. "&" and "+" become "and").
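A rule-based replacement of this kind could look like the sketch below. It is written in Python for brevity rather than PHP, the function name normalize and the REPLACEMENTS table are hypothetical, and the rule set contains only the two rules mentioned above. The same function would have to be applied both to the names stored in the table (for example in an extra normalized column) and to the user's search term.

```python
import re

# Pre-defined rules: symbol -> word it should be treated as.
REPLACEMENTS = {"&": "and", "+": "and"}

# One pattern that matches any of the symbols above.
_SYMBOLS = re.compile("|".join(re.escape(symbol) for symbol in REPLACEMENTS))


def normalize(name):
    """Return a canonical form of a company name for matching."""
    # Replace each symbol by its word, padded with spaces so "A&B" becomes "A and B".
    spaced = _SYMBOLS.sub(lambda m: " " + REPLACEMENTS[m.group(0)] + " ", name)
    # Lower-case and collapse runs of whitespace into single spaces.
    return " ".join(spaced.lower().split())


stored = normalize("Johnson and Johnson")
searched = normalize("Johnson & Johnson")
print(stored == searched)
```

Comparing the normalized forms sidesteps the mismatch between "&" and "and", at the cost of maintaining the rule table by hand.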
