How to ignore collation when comparing Á to a - utf-8

I'm migrating a SQL query from MySQL to Redshift. One of the columns holds utf8 characters (not mb4). For simplicity, the SQL looks like:
SELECT *
FROM a
JOIN b
ON a.name = b.name;
The table's data looks like:
a:
Name
a
b:
Name
Á
With the data and SQL above, MySQL returns one row, whereas Redshift returns none. I tried collate(a.name,'case_insensitive') = collate(b.name,'case_insensitive'), but it made no difference.
I would really appreciate any advice on how to get the same behavior as MySQL.

I did not find a built-in solution, so I wrote one:
CREATE OR REPLACE FUNCTION unaccent_string(text)
RETURNS text
IMMUTABLE
LANGUAGE SQL
AS $$
SELECT translate(
$1,
'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ',
'AAAAAAACEEEEIIIIENOOOOOOUUUUYPsaaaaaaaceeeeiiiienoooooouuuuypyAaAaAaCcCcCcCcDdDdEeEeEeEeEeGgGgGgGgHhHhIiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnnNnOoOoOoOoRrRrRrSsSsSsSsTtTtTtUuUuUuUuUuUuWwYyYZzZzZzs'
);
$$;
Then I could do:
SELECT *
FROM a
JOIN b
ON upper(unaccent_string(a.name)) = upper(unaccent_string(b.name));
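When validating the migrated results on the client side, a similar fold can be sketched in Java. This is only an illustration, not part of the Redshift solution: the class and method names are made up, and it strips combining marks after Unicode decomposition rather than using the translate table above, so letters without a decomposition (Æ, Ø, ß, Ł and the like) are left unchanged.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper mirroring upper(unaccent_string(name)) on the client side.
final class NameKey {
    // Decompose accented letters (Á becomes A plus a combining acute accent),
    // drop the combining marks, then upper-case for a case-insensitive key.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toUpperCase(Locale.ROOT);
    }

    // True when both names share the same accent- and case-insensitive key.
    static boolean sameName(String left, String right) {
        return fold(left).equals(fold(right));
    }
}
```

With the sample data from the question, NameKey.sameName("a", "Á") is true, which matches the single row MySQL returns.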


ScriptUtils.readScript() with DEFAULT_STATEMENT_SEPARATOR is not working

I am trying to execute a sql script from file using ScriptUtils.readScript method:
sql = ScriptUtils.readScript(fileReader,
ScriptUtils.DEFAULT_COMMENT_PREFIX,
ScriptUtils.DEFAULT_STATEMENT_SEPARATOR);
getJdbcTemplate().update(sql);
But I get the error org.springframework.jdbc.BadSqlGrammarException: StatementCallback; bad SQL grammar, and from the logs I see that the semicolon (;) in the SQL statement is not ignored even though I am using ScriptUtils.DEFAULT_STATEMENT_SEPARATOR. Why isn't it working? What's wrong here?
Edit: I know that I can solve this by using:
getJdbcTemplate().update(sql.replace(";", ""));
but maybe there is another solution?
Edit2: Here is example of sql that I need to execute:
INSERT
INTO MYTABLE
(
ID,
MYNUMBER,
MYVALUE
)
SELECT
ID,
0,
B.MYVALUE
FROM ATABLE A,
BTABLE B
WHERE A.ID = B.ID
AND NOT EXISTS
(SELECT 1 FROM MYTABLE M WHERE M.ID = A.ID
);
I don't think you're using ScriptUtils.readScript the right way. The javadocs themselves state:
Mainly for internal use within the framework.
Looking at the source code, it seems that all this function does is load all the lines from a file into a single string, with some logic around comments. The separator plays only a minor role in this method and appears to be relevant only if there is whitespace at the end of it.
If you want to ignore the separator, you'll need to remove it the way that you suggested (with a replace).
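A slightly safer variant of the replace is to drop only a trailing separator, so that semicolons inside string literals survive. The helper below is a hypothetical sketch in plain Java and assumes the file holds a single statement. If you can pass a connection instead, Spring's ScriptUtils.executeSqlScript(Connection, Resource) is the method intended for running whole script files.

```java
// Hypothetical helper: removes one trailing statement separator, nothing else.
final class SqlStatements {
    static String stripTrailingSeparator(String sql, String separator) {
        // trim() drops the trailing newline and spaces left by reading the file
        String trimmed = sql.trim();
        if (!separator.isEmpty() && trimmed.endsWith(separator)) {
            return trimmed.substring(0, trimmed.length() - separator.length()).trim();
        }
        return trimmed;
    }
}
```

Usage would look like getJdbcTemplate().update(SqlStatements.stripTrailingSeparator(sql, ";")).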

Is there a Hive equivalent of SQL “LIKE ANY ( SUBQUERY )”

Hive doesn't support multi-value LIKE queries, which are supported in SQL, e.g.:
SELECT * FROM user_table WHERE first_name LIKE ANY ( 'root~%' , 'user~%' );
We can convert it into an equivalent Hive query:
SELECT * FROM user_table WHERE first_name LIKE 'root~%' OR first_name LIKE 'user~%'
Does anyone know an equivalent that Hive supports when a subquery is used with LIKE? Have a look at the example below:
SELECT * FROM user_table WHERE first_name LIKE ANY ( SELECT expr FROM exprTable);
As the patterns are not available as literal values in the expression, I can't use the same approach of generating multiple LIKE expressions separated by OR / AND operators. Initially I thought of writing a Hive UDF for it. Can you please help me support such an expression and find a Hive equivalent?
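You can use Hive's RLIKE relational operator as shown below. RLIKE matches anywhere in the string, so the pattern is anchored with ^ to match prefixes only, as LIKE 'root~%' does:
SELECT * FROM user_table WHERE first_name RLIKE '^(root~|user~|admin~)';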
Hope this helps!
This is a case involving theta joins in Hive. There is a wiki page for this and a jira request. Please go through the details here on this page: https://cwiki.apache.org/confluence/display/Hive/Theta+Join
Your case is similar to the Side-Table Similarity case given on the page.
You need to convert the expr values into a map and then use a regular expression to find the match. Alternatively, you can use UNION ALL with each LIKE expression in a separate query; the query might become tedious, so you can generate it programmatically.
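Generating the query programmatically could look roughly like the Java sketch below, shown here for the OR form from the question rather than UNION ALL. The class name is hypothetical, and the patterns are assumed to have been read from exprTable beforehand. To sidestep Hive's escaping rules, it simply rejects patterns containing a quote or backslash.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical generator for "column LIKE p1 OR column LIKE p2 ..." queries.
final class LikeAnyQuery {
    static String build(String table, String column, List<String> patterns) {
        if (patterns.isEmpty()) {
            throw new IllegalArgumentException("at least one pattern is required");
        }
        String predicate = patterns.stream()
                .map(p -> column + " LIKE '" + checked(p) + "'")
                .collect(Collectors.joining(" OR "));
        return "SELECT * FROM " + table + " WHERE " + predicate;
    }

    // Assumption: quotes and backslashes are refused instead of escaped.
    private static String checked(String pattern) {
        if (pattern.contains("'") || pattern.contains("\\")) {
            throw new IllegalArgumentException("unsupported character in pattern: " + pattern);
        }
        return pattern;
    }
}
```

For the patterns 'root~%' and 'user~%' this produces the same OR query shown in the question.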
What about this using EXISTS
SELECT * FROM user_table WHERE EXISTS ( SELECT * FROM exprTable WHERE first_name LIKE expr );

query xmltable oracle 10g

I have a query that uses XML input to generate an XML table, and I give that table the alias "XMLalias". How can I query this table in another select statement that is part of the same batch?
I want to do something like " select * from XMLalias ".
I am new to Oracle, so please excuse me if this is something really simple.
thanks.
I'm not sure exactly what you need, but I figure you want one of these two:
select * from
(select * from XMLalias ) insider
where insider.col1 /*.....*/
Or you wanted something like this:
select *
from XMLalias a,
XMLalias b
where a.key_col=b.other_key_col
and a.col1 = /*...... */
and b.col2 = /*...... */

What is the LINQ query for SQL LIKE and SOUNDEX in SQL Server 2008?

In SQL Server, we can issue SQL to get data like
select * from table where column like '%myword%'
select * from person where Soundex(LastName) = Soundex('Ann')
What is the LINQ query to match the SQL above?
from t in table
where t.column.Contains("myword")
select t
In .NET 4.0 you can use the SoundCode function, probably like this:
from p in person
where SqlFunctions.SoundCode(p.LastName) == SqlFunctions.SoundCode("Ann")
select p
You may want to use the Difference function:
http://msdn.microsoft.com/en-us/library/system.data.objects.sqlclient.sqlfunctions.difference%28VS.100%29.aspx
You could also create your own:
https://web.archive.org/web/1/http://blogs.techrepublic%2ecom%2ecom/programming-and-development/?p=656

Is it possible to refer to column names via bind variables in Oracle?

I am trying to refer to a column name to order a query in an application communicating with an Oracle database. I want to use a bind variable so that I can dynamically change what to order the query by.
The problem that I am having is that the database seems to be ignoring the order by column.
Does anyone know if there is a particular way to refer to a database column via a bind variable or if it is even possible?
E.g. my query is
SELECT * FROM PERSON ORDER BY :1
(where :1 will be bound to PERSON.NAME)
The query is not returning results in alphabetical order. I am worried that the database is interpreting this as:
SELECT * FROM PERSON ORDER BY 'PERSON.NAME'
which will obviously not work.
Any suggestions are much appreciated.
No. You cannot use bind variables for table or column names.
This information is needed to create the execution plan. Without knowing what you want to order by, it would be impossible to figure out what index to use, for example.
Instead of bind variables, you have to directly interpolate the column name into the SQL statement when your program creates it. Assuming that you take precautions against SQL injection, there is no downside to that.
Update: If you really wanted to jump through hoops, you could probably do something like
order by decode(?, 'colA', colA, 'colB', colB)
but that is just silly. And slow. Don't.
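One common precaution against SQL injection when interpolating a column name is to accept only names from a fixed whitelist. The Java sketch below illustrates the idea. The class name, the sort keys, and the PERSON.ID column are hypothetical; only PERSON.NAME comes from the question.

```java
import java.util.Map;

// Hypothetical whitelist mapping request values to real column names.
final class PersonOrderBy {
    private static final Map<String, String> SORTABLE = Map.of(
            "name", "PERSON.NAME",
            "id", "PERSON.ID"); // assumed column, for illustration only

    // Builds the statement text; anything not in the whitelist is rejected,
    // so user input never reaches the SQL string directly.
    static String query(String sortKey) {
        String column = (sortKey == null) ? null : SORTABLE.get(sortKey);
        if (column == null) {
            throw new IllegalArgumentException("Unsupported sort key: " + sortKey);
        }
        return "SELECT * FROM PERSON ORDER BY " + column;
    }
}
```

Any remaining filter values can still be passed as ordinary bind variables in the generated statement.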
As you are using JDBC, you can rewrite your code to build the statement without bind variables. This way you can also change the ORDER BY dynamically, e.g.:
String query = "SELECT * FROM PERS ";
if (condition1){
query = query+ " order by name ";
// insert more if/else or case statements
} else {
query = query+ " order by other_column ";
}
Statement select = conn.createStatement();
ResultSet result = select.executeQuery(query);
Or even:
String columnName = getColumnName(input);
Statement select = conn.createStatement();
ResultSet result = select.executeQuery("SELECT * FROM PERS ORDER BY "+columnName);
ResultSet result = select.executeQuery(
"SELECT * FROM PERS ORDER BY " + columnName
);
will always be a new statement to the database.
That means it is, as Thilo already explained, impossible to "reorder" an already bound, calculated, prepared, parsed statement. If you use this result set over and over in your application and the only thing that changes over time is the order of presentation, try to order the set in your client code.
Otherwise, dynamic SQL is fine, but comes with a huge footprint.
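Ordering in client code could be sketched as follows, assuming the rows have already been copied out of the ResultSet into maps keyed by column name. That representation and the class name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical client-side reordering of rows that were fetched once.
final class ClientSideSort {
    // Returns a sorted copy; the fetched list itself is left untouched.
    // Rows missing the column (null value) would raise a NullPointerException.
    static List<Map<String, String>> sortedBy(List<Map<String, String>> rows, String column) {
        List<Map<String, String>> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparing((Map<String, String> row) -> row.get(column)));
        return copy;
    }
}
```

The query then runs once, and each change of sort order only re-sorts the list in memory.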
