Full Text Search in Oracle [duplicate] - oracle

Is there an Oracle equivalent to MS SQL's full text search service?
If so, has anyone implemented it and had good / bad experiences?

Oracle Text is the equivalent functionality.
I've had good experiences with it. If you maintain the text index asynchronously, that tends to be the first source of problems, since there can be a delay between a change being made and the index being updated, but the delay is normally quite reasonable during normal operation.
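To illustrate the lag described above, here is a minimal Python sketch of an index that is synchronized separately from writes. It is a toy model, not the Oracle Text API: the class `ToyDeferredIndex` and its methods are hypothetical names, and it only handles newly inserted rows.

```python
class ToyDeferredIndex:
    """Toy model of a text index that is synchronized separately from writes."""

    def __init__(self):
        self.rows = {}      # committed table data: row id -> text
        self.index = {}     # word -> set of row ids, as of the last sync
        self.pending = []   # row ids inserted since the last sync

    def commit(self, row_id, text):
        """Store the row immediately, but only queue it for indexing."""
        self.rows[row_id] = text
        self.pending.append(row_id)

    def sync(self):
        """Index all queued rows, as a periodic background job might."""
        for row_id in self.pending:
            for word in self.rows[row_id].lower().split():
                self.index.setdefault(word, set()).add(row_id)
        self.pending.clear()

    def search(self, word):
        """Look up a word in the index only; unsynced rows are invisible."""
        return sorted(self.index.get(word.lower(), set()))


idx = ToyDeferredIndex()
idx.commit(1, "full text search")
before_sync = idx.search("search")  # row 1 is committed but not yet indexed
idx.sync()
after_sync = idx.search("search")   # row 1 is now findable
print(before_sync, after_sync)
```

The window between `commit` and `sync` is where a query can miss rows that already exist in the table.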

In addition to what Justin said, you can find more information about Oracle Text here.

And further to what Justin said, it is possible to create the index so it updates on commit, although this is not recommended for large amounts of text.
It offers much more power than a simple LIKE comparison against '%string%'.
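As a rough illustration of why a text index differs from a LIKE comparison, the Python sketch below contrasts a substring scan over every row with a lookup in a word-to-rows index. This is a toy model under stated assumptions, not how Oracle Text is implemented; the sample rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows (id -> text); not taken from any real schema.
DOCS = {
    1: "Oracle Text supports full text search",
    2: "A LIKE comparison scans every row",
    3: "Text indexes map each word to matching rows",
}


def like_scan(docs, needle):
    """Emulate LIKE '%needle%': check every row for a substring match."""
    needle = needle.lower()
    return sorted(doc_id for doc_id, text in docs.items() if needle in text.lower())


def build_inverted_index(docs):
    """Map each lowercased word to the set of row ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def index_lookup(index, *words):
    """Return ids of rows containing all the given words (AND query)."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


INDEX = build_inverted_index(DOCS)
print(like_scan(DOCS, "text"))
print(index_lookup(INDEX, "text"))
print(index_lookup(INDEX, "text", "search"))
```

The scan touches every row on every query, while the index answers word queries, including combinations of words, from precomputed sets.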

Related

Microsoft Access equivalent of explain in MySQL

I'm working on a very large query in an inherited application. It is a large insert query that draws on 4 tables with well over a million records. I know, I would also rather have this in SQL Server, but this customer has no infrastructure for that :-)
This query has worked for over a year. However, the source tables keep growing, and last week it threw the dreaded 'out of system resources' error. Bummer...!
I think it is possible to optimize this query. Working in MySQL, I would use the EXPLAIN command to see where optimization might be possible. Is there an equivalent of this in Access? I cannot seem to find it...
kind regards,
Paul
Probably Jet ShowPlan is closest to what you want. You will have to set a registry key. Then query plan information gets dumped to a text file named SHOWPLAN.OUT. You can read about the details in this article on TechRepublic: Use Microsoft Jet's ShowPlan to write more efficient queries
Also try the Performance Analyzer wizard. You can ask it to examine your query alone, or also ask it to examine the tables or other queries used by that query.
If you haven't compacted the database recently, see whether that improves performance. Compacting also updates index statistics, which allows the engine to make better decisions for the query plan.

Sample Database for Full Text Searching

I am looking to do some benchmarking of full-text search indexes in PostgreSQL, SQL Server and Lucene.
Any ideas on where to find a good big sample database to perform queries against?
Thanks a lot in advance.
I think a great source would be Wikipedia's database dumps, since they contain a really large amount of text. They are available here: http://dumps.wikimedia.org/
You could also try a Usenet archive, but there it's harder to pick a target language, and the quality of the language used is also lower.

What's your solution for free-text search and sort?

AFAIK, MySQL performs really badly at this.
What's your solution?
BTW, what solution does SO use?
EDIT
Please note that free-text search itself is pretty fast in MySQL,
but that is not the case when the result also needs to be sorted on an attribute!
Apache SOLR (Lucene) is pretty capable.
I think Stack Overflow uses SQL Server in the background, with the built-in full-text search capabilities offered by the database. Oracle offers Oracle Text (known as interMedia Text before Oracle 9i), which is very well integrated and efficient. PostgreSQL offers a standard built-in module called tsearch2. I'm not sure about MySQL, but judging by the other three databases I've mentioned, full-text search is certainly a complex feature that takes time to mature.
I recommend Sphinx Search: it needs to be configured and requires some modifications to your code, but it's really worth it.
On a forum with 1+ million messages, a full-text search takes just a few milliseconds.
SO uses the full-text search capabilities of Microsoft SQL Server; it's been mentioned several times in the podcast and on the blog (e.g. https://blog.stackoverflow.com/2008/11/sql-2008-full-text-search-problems/). In this blog entry, Jeff mentions possibly moving to Lucene.net in the future.
I'm currently evaluating Haystack and Solr for searching in a couple of projects.

Hbase / Hadoop Query Help

I'm working on a project with a friend that will use HBase to store its data. Are there any good query examples? I seem to be writing a ton of Java code to iterate through lists of RowResult objects when, in SQL land, I could write a simple query. Am I missing something? Or is HBase missing something?
I think you, like many of us, are making the mistake of treating Bigtable and HBase like just another RDBMS, when HBase is actually a column-oriented storage model meant for efficiently storing and retrieving large sets of sparse data. This means storing, ideally, many-to-one relationships within a single row, for example. Your queries should return very few rows, but each row may contain (potentially) many datapoints.
Perhaps if you told us more about what you were trying to store, we could help you design your schema to match the Bigtable/HBase way of doing things.
For a good rundown of what HBase does differently than a "traditional" RDBMS, check out this awesome article: Matching Impedance: When to use HBase by Bryan Duxbury.
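To make the row design described above more concrete (many-to-one relationships stored within a single row), here is a small Python sketch that models a table as a dict of sparse rows. It does not use any HBase client API; the data, the `info:` and `orders:` column prefixes, and the function name are all hypothetical.

```python
# Relational style: one row per order, joined to users at query time.
# All names and values here are hypothetical.
users = [{"user_id": "u1", "name": "Ann"}]
orders = [
    {"order_id": "o1", "user_id": "u1", "total": 20},
    {"order_id": "o2", "user_id": "u1", "total": 35},
]


def to_wide_rows(users, orders):
    """Fold each user's orders into that user's single row.

    Columns are named "family:qualifier", loosely mimicking a
    column-family layout. Rows are sparse, so a user may have any
    number of "orders:*" columns.
    """
    table = {}
    for user in users:
        table[user["user_id"]] = {"info:name": user["name"]}
    for order in orders:
        row = table[order["user_id"]]
        row["orders:" + order["order_id"]] = order["total"]
    return table


table = to_wide_rows(users, orders)
# One lookup by row key returns the user and all related datapoints.
row = table["u1"]
order_totals = {k: v for k, v in row.items() if k.startswith("orders:")}
print(row)
print(sum(order_totals.values()))
```

The query becomes a single fetch by row key that returns many datapoints, instead of a join that returns many rows.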
If you want to access HBase using a query language and a JDBC driver it is possible. Paul Ambrose has released a library called HBQL at hbql.com that will help you do this. I've used it for a couple of projects and it works well. You obviously won't have access to full SQL, but it does make it a little easier to use.
I looked at Hadoop and Hbase and as Sean said, I soon realised it didn't give me what I actually wanted, which was a clustered JDBC compliant database.
I think you could be better off using something like C-JDBC or HA-JDBC, which seem more like what I was after. (Personally, I haven't got further with either of these than reading the documentation, so I can't tell which of them is any good, if any.)
I'd recommend taking a look at the Apache Hive project, which is similar to HBase (in the sense that it's a distributed database) and implements a SQL-esque language.
Thanks for the reply Sean, and sorry for my late response. I often make the mistake of treating HBase like an RDBMS. So often, in fact, that I've had to rewrite code because of it! It's such a hard thing to unlearn.
Right now we have only 4 tables. Which, in this case, is very few considering my background. I was just hoping to use some RDBMS functionality while mostly sticking to the column-oriented storage model.
Glad to hear you guys are using HBase! I'm not an expert by any stretch of the imagination, but here are a couple of things that might help.
HBase is based on / inspired by BigTable, which happens to be exposed by AppEngine as their db api, so browsing their docs should help a great deal if you're working on a webapp.
If you're not working on a webapp, the kind of iterating you're describing is usually handled via map/reduce (don't emit the values you don't want). Skipping over values using iterators virtually guarantees your application will have bottlenecks with HBase-sized data sets. If you find you're still thinking in SQL, check out Cloudera's Pig tutorial and Hive tutorial.
Basically the whole HBase/SQL mental difference (for non-webapps) boils down to "Send the computation to the data, don't send the data to the computation" -- if you keep that in mind while you're coding you'll do fine :-)
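As a sketch of that idea, the following pure-Python example imitates a map/reduce job in which the mapper filters rows where the data lives and emits only the wanted values. It is a single-process illustration, not Hadoop or HBase code; the partitions, column names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical data split across two "nodes"; each node holds its own rows.
PARTITIONS = [
    [("row1", {"status": "error", "bytes": 10}), ("row2", {"status": "ok", "bytes": 99})],
    [("row3", {"status": "error", "bytes": 5})],
]


def mapper(row_key, columns):
    """Runs next to the data; emits only the values we want."""
    if columns["status"] == "error":
        yield columns["status"], columns["bytes"]


def reducer(key, values):
    """Combines the emitted values for one key."""
    return key, sum(values)


def run_map_reduce(partitions):
    grouped = defaultdict(list)
    for partition in partitions:            # conceptually, one task per node
        for row_key, columns in partition:
            for key, value in mapper(row_key, columns):
                grouped[key].append(value)  # only filtered output is shuffled
    return dict(reducer(k, v) for k, v in grouped.items())


result = run_map_reduce(PARTITIONS)
print(result)
```

Only the filtered pairs leave the mapper, which is the opposite of fetching every row to the client and skipping the unwanted ones there.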
Regards,
David
