Is there any way to remove a character from a given position?
Let's say my word is:
PANCAKES
And I want to remove the 2nd letter (in this case, 'A'), so I want PNCAKES as my return.
Translate doesn't work for this.
Replace doesn't work for this.
Regex is damn complicated...
Ideas?
Example:
SUBSTR('PANCAKES', 0, INSTR('PANCAKES', 'A', 1, 1)-1) || SUBSTR('PANCAKES', INSTR('PANCAKES', 'A', 1, 1)+1)
I don't have an Oracle instance to test with, might have to tweak the -1/+1 to get the position correct.
References:
INSTR
SUBSTR
Concatenating using pipes "||"
You should strongly consider using regexp_replace. It is shorter and not so complicated as it seems at a first glance:
SELECT REGEXP_REPLACE( S, '^(.{1}).', '\1' )
FROM (
SELECT 'PANCAKES' AS S
FROM DUAL
)
The pattern ^(.{1}). searches from the start of the string (denoted by ^) for exactly one character (.{1}), printable or unprintable, followed by one more character (.). The "exact" part is enclosed in parentheses so that it can be referenced as a match group by its number in the function's third argument (\1). So the whole substring matched by the regexp is 'PA', but we reference only 'P'. The rest of the string remains untouched, so the result is 'PNCAKES'.
If you want to remove the N-th character from the string, just replace the number 1 in the pattern (used here to remove the second character) with the value of N-1.
It's good for a programmer or any kind of IT specialist to get familiar with regular expressions, as they give him or her a lot of power to work with text.
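As a minimal sketch of the N-1 rule above, the query below removes the character at a 1-based position n in two ways: with plain SUBSTR concatenation and with REGEXP_REPLACE. The column aliases s and n and the inline view are illustrative assumptions, not part of the original answer.

```sql
-- Remove the character at position n (1-based); here n = 2, so 'A' is dropped.
SELECT SUBSTR(s, 1, n - 1) || SUBSTR(s, n + 1)              AS by_substr,
       REGEXP_REPLACE(s, '^(.{' || (n - 1) || '}).', '\1')  AS by_regexp
FROM (
  SELECT 'PANCAKES' AS s, 2 AS n
  FROM DUAL
);
-- Both columns return PNCAKES.
```

The SUBSTR form needs no regular expression support, so it may be the safer choice on older releases.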
Or use a custom-made SplitAtPos function built on SUBSTR. The advantage is that it still works on Oracle 9.
set serveroutput on
declare
s1 varchar2(1000);
s2 varchar2(1000);
function SplitAtPos(s in out varchar2, idx pls_integer)
return varchar2
is
s2 varchar2(1000);
begin
s2:=substr(s,1,idx-1);
s:=substr(s,idx,length(s)-idx+1);
return s2;
end;
begin
s1:='Test123';
s2:=SplitAtPos(s1,1);
dbms_output.put_line('s1='||s1||' s2='||s2);
s1:='Test123';
s2:=SplitAtPos(s1,5);
dbms_output.put_line('s1='||s1||' s2='||s2);
s1:='Test123';
s2:=SplitAtPos(s1,7);
dbms_output.put_line('s1='||s1||' s2='||s2);
s1:='Test123';
s2:=SplitAtPos(s1,8);
dbms_output.put_line('s1='||s1||' s2='||s2);
s1:='Test123';
s2:=SplitAtPos(s1,0);
dbms_output.put_line('s1='||s1||' s2='||s2);
end;
Yes, REPLACE and SUBSTR in the proper order will do the trick.
The end result should be a concatenation of the SUBSTR before the removed character and the SUBSTR after it.
If the entire column is only one word, then you can just do an update; if the word is inside another string, then you could use REPLACE as a wrapper.
You can use something like this in PL/SQL
DECLARE
v_open NUMBER;
v_string1 VARCHAR2(10);
v_string2 VARCHAR2(10);
v_word VARCHAR2(10);
BEGIN
v_open := INSTR('PANCAKES' ,'A',1);
v_string1 := SUBSTR('PANCAKES' ,1, v_open-1); -- everything before the character found by INSTR
v_string2 := SUBSTR('PANCAKES' ,v_open+1);
v_word := v_string1||v_string2;
END;
How can I write a simple function that returns its input parameter changed so that it no longer contains certain symbols?
(č=>c, ć=>c, š=>s, đ=>d, ž=>z..)
e.g. *đurđević* => *djurdjevic*
e.g. *kuća* => *kuca*
e.g. *čaćkati* => *cackati*
I have no code so far. I am very new at this and am trying to learn something.
For Croatian characters you can use a combination of TRANSLATE and REPLACE, since TRANSLATE doesn't support one to many translation (e.g. đ => dj).
SELECT REPLACE(TRANSLATE('đak žvakaća čičak šuma','žćčš', 'zccs'), 'đ', 'dj') out FROM dual;
And the output:
out
--------------------------
djak zvakaca cicak suma
Edit
So here are two wrapper functions that implement this feature. The first one uses the built-ins, while the other has its own custom implementation.
-- first
CREATE OR REPLACE FUNCTION f_translate(p_string IN VARCHAR)
RETURN VARCHAR
AS
BEGIN
RETURN REPLACE(TRANSLATE(p_string,'žćčš', 'zccs'), 'đ', 'dj');
END;
-- second
CREATE OR REPLACE FUNCTION f_translate_custom(p_string IN VARCHAR)
RETURN VARCHAR
AS
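v_current VARCHAR2(1 CHAR); -- CHAR semantics so one multibyte character (e.g. 'č') fits
v_retval VARCHAR2(32767); -- room for long inputs, since 'đ' expands to two characters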
BEGIN
FOR i IN 1..LENGTH(p_string) LOOP
v_current := SUBSTR(p_string, i, 1);
v_retval := v_retval || CASE v_current
WHEN 'č' THEN 'c'
WHEN 'ć' THEN 'c'
WHEN 'ž' THEN 'z'
WHEN 'š' THEN 's'
WHEN 'đ' THEN 'dj'
ELSE v_current
END;
END LOOP;
RETURN v_retval;
END;
And some test code.
SET SERVEROUTPUT ON;
BEGIN
DBMS_OUTPUT.PUT_LINE('Built-in: ' || f_translate('đak žvakaća čičak šuma'));
DBMS_OUTPUT.PUT_LINE('Custom: ' || f_translate_custom('đak žvakaća čičak šuma'));
END;
Check out translate
https://www.techonthenet.com/oracle/functions/translate.php
Example:
TRANSLATE('1tech23', '123', '456')
Result: '4tech56'
You can try the convert function.
SELECT CONVERT('Ä Ê Í Õ Ø A B C D E ', 'US7ASCII', 'WE8ISO8859P1')
FROM DUAL;
CONVERT('ÄÊÍÕØABCDE'
---------------------
A E I ? ? A B C D E ?
This is the closest you can get without creating any structure:
SELECT utl_raw.cast_to_varchar2((nlssort('čaćkati', 'nls_sort=binary_ai')))
FROM dual;
If you have a lot of combinations, I suggest creating a table of the possible combinations and using the TRANSLATE function.
I have a string with the values 'a','b','c','d','e'. I need to convert each of the values to the text null: null,null,null,null,null. If there are 10 values in quotes separated by commas, then 10 nulls should appear. I tried using REGEXP_REPLACE but failed to get the result.
declare
a varchar2(32767) := q'#'a','b','c'#';
c varchar2(32767);
begin
c := replace(REGEXP_REPLACE (a, <don't know what pattern should be here>, 'null'),'''');
dbms_output.put_line(c);
end;
/
There are many ways to do this. For example, you could use the regular expression '[^,]+' - which means any number (one or more) of consecutive non-comma characters. Then every occurrence of that pattern will be replaced with the replacement string (while the commas will stay in place).
declare
a varchar2(32767) := q'#'a','b','c'#';
c varchar2(32767);
begin
c := REGEXP_REPLACE (a, '[^,]+', 'null'); -- notice the regular expression
dbms_output.put_line(c);
end;
/
null,null,null
PL/SQL procedure successfully completed.
As you know, Oracle's POSIX implementation of regexes does not support word boundaries. One workaround is suggested here:
Oracle REGEXP_LIKE and word boundaries
However, it does not work if I want to, for instance, select all 4-character strings. Consider this, for example:
myvar:=regexp_substr('test test','(^|\s|\W)[\S]{4}($|\s|\W)')
This obviously selects only the first occurrence. I do not know how to do this in the Oracle world, although normally it is simply (\b)[\S]{4}(\b). The problem is that most workarounds rely on some nonexistent feature, like lookaround etc.
select xmlcast(xmlquery('for $token in ora:tokenize(concat(" ",$in)," ")
where string-length($token) = $size
return $token' passing 'test test' as "in", 4 as "size" returning content) as varchar2(2000)) word from dual;
This uses XQuery and a FLWOR expression.
concat(" ",$in) - workaround for when the input string is null or has only 1 matching word.
ora:tokenize - tokenizes the string by space.
string-length($token) = $size - checks whether the token has the required length.
xmlcast - converts xmltype to varchar2.
Easy? Any questions? :)
DECLARE
str VARCHAR2(200) := 'test test';
pattern VARCHAR2(200) := '(\w+)($|\s+|\W+)';
match VARCHAR2(200);
BEGIN
FOR i IN 1 .. REGEXP_COUNT( str, pattern ) LOOP
match := REGEXP_SUBSTR( str, pattern, 1, i, NULL, 1 );
IF LENGTH( match ) = 4 THEN
DBMS_OUTPUT.PUT_LINE( match );
END IF;
END LOOP;
END;
/
or (without using REGEXP_COUNT or the 6th parameter of REGEXP_SUBSTR that was introduced in 11G):
DECLARE
str VARCHAR2(200) := 'test test';
pattern CONSTANT VARCHAR2(3) := '\w+';
match VARCHAR2(200);
i NUMBER(4,0) := 1;
BEGIN
match := REGEXP_SUBSTR( str, pattern, 1, i );
WHILE match IS NOT NULL LOOP
IF LENGTH( match ) = 4 THEN
DBMS_OUTPUT.PUT_LINE( match );
END IF;
i := i + 1;
match := REGEXP_SUBSTR( str, pattern, 1, i );
END LOOP;
END;
/
Output:
test
test
If you want to use this in SQL then you can easily translate it into a pipelined function or a function that returns a collection.
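A pipelined version might look like the sketch below. The function name words_of_length and its parameters are hypothetical, and it reuses the \w+ loop from the second block above. The return type SYS.ODCIVARCHAR2LIST is a collection type that ships with Oracle, chosen here only to avoid creating a custom type.

```sql
-- Hypothetical pipelined function: returns every word of exactly p_len characters.
CREATE OR REPLACE FUNCTION words_of_length(
  p_str IN VARCHAR2,
  p_len IN PLS_INTEGER
) RETURN SYS.ODCIVARCHAR2LIST PIPELINED
AS
  v_match VARCHAR2(4000);
  v_i     PLS_INTEGER := 1;
BEGIN
  v_match := REGEXP_SUBSTR( p_str, '\w+', 1, v_i );
  WHILE v_match IS NOT NULL LOOP
    IF LENGTH( v_match ) = p_len THEN
      PIPE ROW ( v_match );  -- emit one row per matching word
    END IF;
    v_i := v_i + 1;
    v_match := REGEXP_SUBSTR( p_str, '\w+', 1, v_i );
  END LOOP;
  RETURN;
END;
/

-- Usage from SQL: one row per 4-character word.
SELECT column_value AS word
FROM TABLE( words_of_length( 'test test', 4 ) );
```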
I have a function that takes in 1 parameter, abc(parameter1 IN varchar2)
In the parameter I will be taking in a string that is comma delimited:
E.g Abc('1,2,a')
Type vartype is varray(10) of varchar2(50);
X1 vartype:= vartype (parameter1);
For X in 1..X1.count loop
Dbms_output.put_line(x1(X));
End loop;
The DBMS Output gives me
1,2,a
Instead of
1
2
a
Is there any way I can solve this?
As I understand it, your function parameter will be a single value.
If you want to use a varray, you should supply the values in a format like ('1','2','a','b').
For example:
declare
Type vartype is varray(10) of varchar2(50);
X1 vartype:=vartype ('1','2','a','b');
begin
For X in 1..X1.count loop
Dbms_output.put_line(x1(x));
End loop;
end;
/
The above block should help you understand the concept of a varray.
You are passing a varchar2 variable to the varray constructor, and it is treated as the first parameter, so your array contains only one element (the content of parameter1). You must split the string into substrings before passing them to the varray.
Here is an extract from Oracle documentation
DECLARE TYPE ProjectList IS VARRAY(50) OF VARCHAR2(16);
accounting_projects ProjectList;
BEGIN
accounting_projects := ProjectList('Expense Report', 'Outsourcing', 'Auditing');
END;
For splitting a string into substrings you can check some solutions here
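One possible way to do the split, as a sketch only: walk the string with REGEXP_SUBSTR and the pattern '[^,]+', extending the varray one element at a time. The variable names mirror the question; the loop variables v_item and v_i are assumptions added for illustration. Note that this pattern skips empty elements (for example the gap in '1,,a').

```sql
SET SERVEROUTPUT ON
DECLARE
  TYPE vartype IS VARRAY(10) OF VARCHAR2(50);
  x1         vartype := vartype();        -- start with an empty varray
  parameter1 VARCHAR2(200) := '1,2,a';    -- stands in for the function parameter
  v_item     VARCHAR2(50);
  v_i        PLS_INTEGER := 1;
BEGIN
  v_item := REGEXP_SUBSTR(parameter1, '[^,]+', 1, v_i);
  -- Stop at the end of the string or when the varray limit (10) is reached.
  WHILE v_item IS NOT NULL AND x1.COUNT < x1.LIMIT LOOP
    x1.EXTEND;
    x1(x1.COUNT) := v_item;
    v_i := v_i + 1;
    v_item := REGEXP_SUBSTR(parameter1, '[^,]+', 1, v_i);
  END LOOP;

  FOR x IN 1 .. x1.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(x1(x));          -- prints 1, 2 and a on separate lines
  END LOOP;
END;
/
```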
I have a field in a table which holds XML entities for special characters, since the table is in latin-1.
E.g. "Hallöle sloven&#269;ina" (the "ö" is in latin-1, but the "č" in "slovenčina" had to be converted to an entity by some application that stores the values into the database)
Now I need to export the table into a utf-8 encoded file by converting the XML entities to their original characters.
Is there a function in Oracle that might handle this for me, or do I really need to create a huge key/value map for that?
Any help is greatly appreciated.
EDIT: I found the function DBMS_XMLGEN.convert, but it only works on <,> and &. Not on &#NNN; :-(
I believe the problem with dbms_xmlgen is that there are technically only five XML entities. Your example has a numeric HTML entity, which corresponds with Unicode:
http://theorem.ca/~mvcorks/cgi-bin/unicode.pl.cgi?start=0100&end=017F
Oracle has a function UNISTR which is helpful here:
select unistr('sloven\010dina') from dual;
I've converted 269 to its hex equivalent 010d in the example above (in Unicode it is U+010D). However, you could pass a decimal number and do a conversion like this:
select unistr('sloven\' || replace(to_char(269, 'xxx'), ' ', '0') || 'ina') from dual;
EDIT: The PL/SQL solution:
Here's an example I've rigged up for you. This should loop over and replace any occurrences for each row you select out of your table(s).
create table html_entities (
id NUMBER(3),
text_row VARCHAR2(100)
);
INSERT INTO html_entities
VALUES (1, 'Hallöle sloven&#269;ina &#266; &#250;');
INSERT INTO html_entities
VALUES (2, 'I like the letter &#266;');
INSERT INTO html_entities
VALUES (3, 'Nothing to change here.');
DECLARE
v_replace_str NVARCHAR2(1000);
v_fh UTL_FILE.FILE_TYPE;
BEGIN
--v_fh := utl_file.fopen_nchar(LOCATION IN VARCHAR2, FILENAME IN VARCHAR2, OPEN_MODE IN VARCHAR2, MAX_LINESIZE IN BINARY_INTEGER);
FOR v_rec IN (select id, text_row from html_entities) LOOP
v_replace_str := v_rec.text_row;
WHILE (REGEXP_INSTR(v_replace_str, '&#[0-9]+;') <> 0) LOOP
v_replace_str := REGEXP_REPLACE(
v_replace_str,
'&#([0-9]+);',
unistr('\' || replace(to_char(to_number(regexp_replace(v_replace_str, '.*?&#([0-9]+);.*$', '\1')), 'xxx'), ' ', '0')),
1,
1
);
END LOOP;
-- utl_file.put_line_nchar(v_fh, v_replace_str);
dbms_output.put_line(v_replace_str);
END LOOP;
--utl_file.fclose(v_fh);
END;
/
Notice that I've stubbed in calls to the UTL_FILE package to write NVARCHAR lines (Oracle's extended character set) to a file on the database server. DBMS_OUTPUT, while great for debugging, doesn't seem to support extended characters, but this shouldn't be a problem if you use UTL_FILE to write to a file. Here's the DBMS_OUTPUT:
Hallöle slovencina C ú
I like the letter C
Nothing to change here.
You can also just use the internationalization package :
UTL_I18N.unescape_reference ('text')
Works great in changing those html entities to normal characters (such as cleanup after moving a database from iso 8859P1 to UTF-8)
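A minimal call might look like this. The sample string is taken from the question; what the decoded character looks like depends on the database character set, so no output is shown here.

```sql
-- Decode numeric character references such as &#269; in a single call.
SELECT UTL_I18N.UNESCAPE_REFERENCE('sloven&#269;ina') AS decoded
FROM DUAL;
```

In a database whose character set cannot represent the target character (for example a Latin-1 database and "č"), the result will presumably not round-trip, so this fits best after or during the move to a Unicode character set.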
This should probably be done in PL/SQL, which I do not know, but I wanted to see how far I could get with pure SQL. This only replaces the first occurrence of the code, so you would have to somehow run it multiple times.
select regexp_replace(s, '&#([0-9]+);', u) from
(select s, unistr('\0' || REPLACE(TO_CHAR(TO_NUMBER(c), 'xxxx'), ' ', '')) u from
(select s, regexp_replace(s, '.*&#([0-9]+);.*', '\1') c from
(select 'Hallöle sloven&#269;ina' s from dual)))
Or less readable but more usable:
SELECT
REGEXP_REPLACE(s, '&#([0-9]+);', unistr('\0' || REPLACE(TO_CHAR(TO_NUMBER(regexp_replace(s, '.*?&#([0-9]+);.*$', '\1', 1, 1)), 'xxxx'), ' ', '')), 1, 1)
FROM
(SELECT 'Hallöle sloven&#269;ina &#269; &#278;' s FROM DUAL)
This (updated) version correctly replaces the first occurrence. You need to apply it until all of them are replaced.