Case insensitive search on Sybase - performance

I have been sick and tired Googling the solution for doing case-insensitive search on Sybase ASE (Sybase data/column names are case sensitive). The Sybase documentation proudly says that there is only one way to do such search which is using the Upper and Lower functions, but the adage goes, it has performance problems. And believe me they are right, if your table has huge data the performance is so awkward you are never gonna use Upper and Lower again. My question to fellow developers is: how do you guys tackle this?
P.S. Don't advise to change the sort-order or move to any other Database please, in real world developers don't control the databases.

Try creating a functional index, like
Create Index INDX_MY_SEARCH on TABLE_NAME(LOWER(#MySearch)

Add additional upper or lower case column in your select statement. Example:
select col1, upper(col1) upp_col1 from table1 order by upp_col1

If you cannot change the sort-order on the database(best option), then the indexes on unknown case fields will not help. There is a way to do this and keep performance if the number of fields is manageable. You make an extra column MyFieldLower. You use a trigger to keep the field filled with a lower case of MyField.
Then the query is:
WHERE MyFieldLower = LOWER(#MySearch)
This will use indexing.

Related

Indexing for the columns in ORACLE

I have below query, because of the huge data in the MATTER Table, it is taking huge time for LIKE statement to execute, so I was thinking of using the CONTEXT Index and using CONTAIN.
Shall I do indexing only on Matter_title or some other column as well,. Based on the below select query
Inputs highly appreciated
SELECT DISTINCT dm.MATTER_SEQ
FROM MATTER dm
,MATTER_TYPE dmt
,MATTER_SUBTYPE dms
,STATUS ds
,FILING df
WHERE dm.MATTER_TYPE_SEQ=dmt.MATTER_TYPE_SEQ
AND dm.MATTER_SUBTYPE_SEQ=dms.MATTER_SUBTYPE_SEQ
AND dm.STATUS_CODE NOT IN ('abc','jkl','xyz')
AND dm.STATUS_CODE = DS.STATUS_CODE
AND dm.IS_EXTERNAL='1'
AND dm.IS_DELETED='0'
AND dm.MATTER_SEQ = df.MATTER_SEQ
AND trunc(dm.CREATED_DATE) between '01-NOV-95' AND '02-OCT-18'
AND upper(dm.MATTER_TITLE) like(upper (q'{%jdasuidhajsndjahs%}'))
It sounds like you're already aware that LIKE with a leading wildcard ('%ABC') is notoriously inefficient since it typically can't use indexes and does a full table scan.
If the other optimizing suggestions don't help much, you probably would see better performance with a Context index. Be sure to set the SUBSTRING_INDEX preference so it'll specifically prepare the index for infix searches like yours. See this Ask Tom for more details. (If you will also have wildcards in the middle of strings ('ABC%DEF'), you might also want to set the PREFIX options.)
begin
ctx_ddl.create_preference('SUBSTRING_PREF','BASIC_WORDLIST');
ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
end;
create index matter_title_idx on MATTER(MATTER_TITLE)
indextype is ctxsys.context
parameters ('wordlist SUBSTRING_PREF');
Also note that Context indexes are case-insensitive by default, so you don't need to do UPPER(). I haven't tried using q'' literals with contains, so I'm not sure how this'll work.
AND CONTAINS(dm.MATTER_TITLE, q'{%jdasuidhajsndjahs%}') > 0
Try creating function Indexes upper(dm.MATTER_TITLE) and second trunc(dm.CREATED_DATE).
Also I am considering that the columns in the Join conditions already have indexes. If not have them indexed.

will index be used when UPPER() the variable first?

I may have encountered a full table scan in Oracle database. I can't excute the explain command in the database, simply put, I don't have the permission.
And I'm trying to figure out the following question.
If I have an index on NAME in table
With this query:
select OID
from table
where NAME=UPPER(v1)
and TYPE=v2
and PID=v3
and OID<>v4
and PID =v5`
(v1 is a variable)
Will the oracle use index on name to select OID?
I have read some material, and it says with a function in where condition the NAME index won't be used. But the upper() is a special function, so I'm not quiet sure about the material I saw before.
And here is the second question after the answer of #mathguy:
If I create an index using create index INDEX_NAME on table(upper(NAME));
will the query:
select OID,PID
from table
where PID=v1
and NAME=UPPER(v2)
use the index INDEX_NAME?
OR the index will be used in the above question, and the query is just not efficient so they take much time to execute?
If you have an index on name, then the optimizer MAY use the index in the example you gave. It may choose not to use it (for example if it estimates that a relatively large fraction of rows will be returned anyway); but if say only 0.1% of rows would be returned, by all means the index will be used. (If that still doesn't happen, make sure statistics are up-to-date.)
What will prevent the use of an index is if you wrapped name within upper(). What happens on the right-hand side - whether you have v1 or upper(v1) or even a much more complicated expression - is irrelevant as long as name doesn't also appear in that complicated expression on the right-hand side.
Perhaps this will help...
In Oracle, you can create an index on a function (a function index), so if you created your index on the function UPPER(NAME) instead of just NAME, Oracle may be more likely to use the index (although it still might choose not to depending on other factors.)
Here's a link that describes function indexes

Dynamics AX 2012 Subquery in a View

AX allows you to enter basic SQL into View ranges. For example, in an AOT view's range, for the match value, you could enter (StatRepInterval.Name == 'Weekly'). This works nicely.
However, I need to do a more advanced lookup on a View, using a subquery. Can anyone suggest a way to do this?
This is what I would like to use, but I receive an error: "Query extended range failure: Syntax error near 34."
(StatRepInterval.Name == (SELECT FIRSTONLY StatRepInterval.Name FROM StatRepInterval WHERE StatRepInterval.PrintDirection == 1 ORDER BY StatRepInterval.Name DESC))
I've tried a lot of different variants of the subquery, from straight T-SQL to X++ SQL, but nothing seems to work.
Thanks for the help.
Sub-queries are not supported in query expressions.
This may be solved by using additional datasources with inner or outer joins as you observed.
See the spec and Axaptapedida on query expressions.
I found a way to do this. It isn't pretty, and I'm going to leave the question unanswered for a bit, should someone else have a more graceful solution.
Create a source View that contains all fields I wish to return, plus calculated fields that contain my subquery results.
Create a second View that uses the first as a data source, and applies all the necessary ranges.
Works pretty nicely.
Probably inefficient if there were large tables of data, but this is in a relatively small section of AX.

Improve SQL Server 2005 Query Performance

I have a course search engine and when I try to do a search, it takes too long to show search results. You can try to do a search here
http://76.12.87.164/cpd/testperformance.cfm
At that page you can also see the database tables and indexes, if any.
I'm not using Stored Procedures - the queries are inline using Coldfusion.
I think I need to create some indexes but I'm not sure what kind (clustered, non-clustered) and on what columns.
Thanks
You need to create indexes on columns that appear in your WHERE clauses. There are a few exceptions to that rule:
If the column only has one or two unique values (the canonical example of this is "gender" - with only "Male" and "Female" the possible values, there is no point to an index here). Generally, you want an index that will be able to restrict the rows that need to be processed by a significant number (for example, an index that only reduces the search space by 50% is not worth it, but one that reduces it by 99% is).
If you are search for x LIKE '%something' then there is no point for an index. If you think of an index as specifying a particular order for rows, then sorting by x if you're searching for "%something" is useless: you're going to have to scan all rows anyway.
So let's take a look at the case where you're searching for "keyword 'accounting'". According to your result page, the SQL that this generates is:
SELECT
*
FROM (
SELECT TOP 10
ROW_NUMBER() OVER (ORDER BY sq.name) AS Row,
sq.*
FROM (
SELECT
c.*,
p.providername,
p.school,
p.website,
p.type
FROM
cpd_COURSES c, cpd_PROVIDERS p
WHERE
c.providerid = p.providerid AND
c.activatedYN = 'Y' AND
(
c.name like '%accounting%' OR
c.title like '%accounting%' OR
c.keywords like '%accounting%'
)
) sq
) AS temp
WHERE
Row >= 1 AND Row <= 10
In this case, I will assume that cpd_COURSES.providerid is a foreign key to cpd_PROVIDERS.providerid in which case you don't need an index, because it'll already have one.
Additionally, the activatedYN column is a T/F column and (according to my rule above about restricting the possible values by only 50%) a T/F column should not be indexed, either.
Finally, because searching with a x LIKE '%accounting%' query, you don't need an index on name, title or keywords either - because it would never be used.
So the main thing you need to do in this case is make sure that cpd_COURSES.providerid actually is a foreign key for cpd_PROVIDERS.providerid.
SQL Server Specific
Because you're using SQL Server, the Management Studio has a number of tools to help you decide where you need to put indexes. If you use the "Index Tuning Wizard" it is actually usually pretty good at tell you what will give you the good performance improvements. You just cut'n'paste your query into it, and it'll come back with recommendations for indexes to add.
You still need to be a little bit careful with the indexes that you add, because the more indexes you have, the slower INSERTs and UPDATEs will be. So sometimes you'll need to consolidate indexes, or just ignore them altogether if they don't give enough of a performance benefit. Some judgement is required.
Is this the real live database data? 52,000 records is a very small table, relatively speaking, for what SQL 2005 can deal with.
I wonder how much RAM is allocated to the SQL server, or what sort of disk the database is on. An IDE or even SATA hard disk can't give the same performance as a 15K RPM SAS disk, and it would be nice if there was sufficient RAM to cache the bulk of the frequently accessed data.
Having said all that, I feel the " (c.name like '%accounting%' OR c.title like '%accounting%' OR c.keywords like '%accounting%') " clause is problematic.
Could you create a separate Course_Keywords table, with two columns "courseid" and "keyword" (varchar(24) should be sufficient for the longest keyword?), with a composite clustered index on courseid+keyword
Then, to make the UI even more friendly, use AJAX to apply keyword validation & auto-completion when people type words into the keywords input field. This gives you the behind-the-scenes benefit of having an exact keyword to search for, removing the need for pattern-matching with the LIKE operator...
Using CF9? Try using Solr full text search instead of %xxx%?
You'll want to create indexes on the fields you search by. An index is a secondary list of your records presorted by the indexed fields.
Think of an old-fashioned printed yellow pages - if you want to look up a person by their last name, the phonebook is already sorted in that way - Last Name is the clustered index field. If you wanted to find phone numbers for people named Jennifer or the person with the phone number 867-5309, you'd have to search through every entry and it would take a long time. If there were an index in the back with all the phone numbers or first names listed in order along with the page in the phonebook that the person is listed, it would be a lot faster. These would be the unclustered indexes.
I would try changing your IN statements to an EXISTS query to see if you get better performance on the Zip code lookup. My experience is that IN statements work great for small lists but the larger they get, you get better performance out of EXISTS as the query engine will stop searching for a specific value the first instance it runs into.
<CFIF zipcodes is not "">
EXISTS (
SELECT zipcode
FROM cpd_CODES_ZIPCODES
WHERE zipcode = p.zipcode
AND 3963 * (ACOS((SIN(#getzipcodeinfo.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#getzipcodeinfo.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #getzipcodeinfo.longitude#/57.2958)))) <= #radius#
)
</CFIF>

How to Sort Data Table like FogBugz Cases Table

Anyone ever see how fogbugz sorts their tables? When you click to sort the column, they actually break the table up into many small tables that have each category of info.
Wondering if anyone knows how they do this?
Looking to implement this feature.
If you take a look through the cases page, and sort you can see what I mean.
Any help would be AWESOME!
Still Haven't figured this one out.
EDIT: #Peter, I don't want to postback and recreate a table every time the header title is clicked for a sort. I also want to know if their is a generic solution for this. If I click on the header to sort, by the way of javascript, it seperates the "one" table into many and I want to know if their is any generic solution for this because its just a MUCH better way of viewing a sorted Table.
EDIT: I do need a javascript sorter, but if you look right down at the implementation of fogbugz, it produces a different result...
Yup, Rich got it (I coded this feature into FogBugz a long while back).
If you have to do this on the client you have no choice but to sort the data, iterate through it generating table row after table row, and every time you hit a new sort value you create a new thead w/ the appropriate information.
To be honest it would be a pretty cool modification to this jQuery plugin: http://tablesorter.com/docs/ and you'd be able to leverage a lot of their work. If you're going to put in the time and create a general solution, might as well make it accessible to the community.
Without knowing specifically how Fog Creek accomplishes this, the way that I would do it is to output a table header, then iterate through the list, outputting a footer and a new header each time the group value changed.
Not sure what answer do you expect. SQL query for this would simply use ordering on selected column, and UI would start new table each time this value changes.
Here is screenshot of FogBugz with this sorting, after clicking on Priority column.
http://img297.imageshack.us/img297/6974/76755363ee3.png
Of course, starting new table doesn't make sense for every column (title, case #).
Edit: If I understand correctly, you're looking for a way how to do this in a browser without loading new page. If this is the case, I would suggest at least some server-side support, which would return your data in correct order, and properly structured for subtables (in xml/json/whatever you use). Your javascript will use this data to recreate tables. I am sure others with more web-ui experience will provide you with better answers.
I've used the Sortable Tables script from Kryogenix with some good results.
I don't know if it is relevant, but we store the results of a query in a temporary table in SQL, and then reference current-row-less-one to see if a Category has changed, and indicate this in the resulset.
In some instances we "indicate" this with a column containing
<tr><td colspan=999>Category Heading</td></tr>
so that the web page can just "inject" that into the table it is building.
SELECT Col1, Col2, ...,
[CATEGORY] = CASE WHEN T1.CategoryCol <> COALESCE(T2.CategoryCol, '')
THEN '<tr><td colspan=999>' + T1.CategoryCol + '</td></tr>'
ELSE ''
END
FROM #MyTempTable AS T1
LEFT OUTER JOIN #MyTempTable AS T2
ON T2.ID = T1.ID - 1

Resources