Initcap skip words smaller than 4 characters - oracle
I've got a question about initcap.
Is it possible to create an initcap statement that skips words shorter than 4 characters?
At the moment I have to change the words with fewer than 4 characters back to lower case after I've finished the initcap.
So I thought maybe there is a way to create a function/procedure/trigger that just skips those words. The words are used in a location name like "Son En Breugel"; the "En" in the middle must become lower case.
The first letter of the string doesn't need to change, only the short words after a space (like in the middle of the string).
I've started to create a procedure, but it needs a bit of fine-tuning:
*All strings that don't need to be changed with initcap are changed back
*Initcap with xDutch format
--Still need to find a way to change 'S into 's; I think I've deleted the record with this script?
Can somebody assist me with this?
create or replace PROCEDURE Location_Name_Routine IS
BEGIN
  DELETE
  FROM Location
  WHERE Name LIKE '%[^0-9a-zA-Z]%';

  UPDATE Location
  SET Name = nls_initcap(Name, 'NLS_SORT=xDutch');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' En',' en');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' Van',' van');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' De',' de');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' Den',' den');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' Over','over');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' Aan',' aan');

  UPDATE Location
  SET Name = REGEXP_REPLACE(Name,' Bij',' bij');
END;
There may not be a simple answer to the underlying question. I assume you are trying to properly capitalize addresses in Dutch and this question is related to this other question from yesterday.
Combining the questions, there are at least three special cases so far:
'S GRAVENHAGE => 's Gravenhage
IJSLAND => IJsland
SON EN BREUGEL => Son en Breugel
INITCAP and even NLS_INITCAP('...', 'NLS_SORT=xDutch') fail to properly handle them. Before you start coding you should collect all the requirements. Are these the only rules for Dutch capitalization, or are there many more?
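The three cases above can be checked side by side with a query along these lines. This is only a sketch; the column aliases are my own naming, and the input strings are the upper-case examples listed above.

```sql
--Compare plain INITCAP and NLS_INITCAP on the special cases listed above.
select column_value                                 as original,
       initcap(column_value)                        as plain_initcap,
       nls_initcap(column_value, 'NLS_SORT=xDutch') as dutch_initcap
from table(sys.odcivarchar2list('''S GRAVENHAGE', 'IJSLAND', 'SON EN BREUGEL'));
```

Comparing the two result columns with the expected spelling shows which rules each function misses.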
The answers posted so far may help to solve one specific exception. But chances are you cannot simply combine regular expressions and solve them all. You may want to take a more top-down approach here.
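To illustrate what a single-rule solution looks like, and where it stops, the short-word rule from the question could be sketched as a PL/SQL function. The name initcap_skip_short and the fixed threshold of 4 characters are assumptions made for this example. It only looks at words separated by spaces, so it does nothing for cases like 's-Gravenhage, het Bildt or Bergen (NH.).

```sql
--Hypothetical helper: init-cap every word, except that words shorter than
--4 characters are lower-cased unless they are the first word.
create or replace function initcap_skip_short(p_name in varchar2)
  return varchar2
is
  l_word   varchar2(4000);
  l_result varchar2(4000);
  l_index  pls_integer := 1;
begin
  loop
    --Next run of non-space characters, or NULL when none are left.
    --Note: runs of several spaces are collapsed into a single space.
    l_word := regexp_substr(p_name, '[^ ]+', 1, l_index);
    exit when l_word is null;

    if l_index > 1 and length(l_word) < 4 then
      l_word := lower(l_word);
    else
      l_word := nls_initcap(l_word, 'NLS_SORT=xDutch');
    end if;

    --Put a single space before every word except the first.
    l_result := l_result || case when l_index > 1 then ' ' end || l_word;
    l_index := l_index + 1;
  end loop;

  return l_result;
end initcap_skip_short;
/

--Example call with one of the names from the question.
select initcap_skip_short('SON EN BREUGEL') as name from dual;
```

A length threshold is a blunt rule: any short word in the middle of a name is lower-cased, whether or not that is correct for the name in question.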
UPDATE
Based on wolφi's idea, it is possible to brute-force the problem by using all existing names. NLS_INITCAP alone works about 95% of the time. Using the 431 names from the spreadsheet at this link, it is possible to build a list of all 25 exceptional cases.
Run this statement once to build a DECODE expression to handle all non-trivial cases:
--Build decode for UPDATE.
select
--Start the decode
'decode(upper(name),'||
--List all the exceptions. Single quotes are a mess, no way around it.
listagg(
--Upper case version to match
''''||upper(replace(column_value, '''', ''''''))||
--Pre-defined init-capped version
''','''||replace(column_value, '''', '''''')||''''
, ','||chr(10)
)
within group (order by column_value)
||
--Default to NLS_INITCAP
',nls_initcap(name, ''NLS_SORT=xDutch''))'
from table(sys.odcivarchar2list(
'Bellingwedde','Menterwolde','Oldambt','Pekela','Stadskanaal','Veendam','Vlagtwedde','Appingedam','Delfzijl','Loppersum',
'Bedum','Ten Boer','Eemsmond','Groningen','Grootegast','Haren','Hoogezand-Sappemeer','Leek','De Marne','Marum',
'Slochteren','Winsum','Zuidhorn','Achtkarspelen','Ameland','het Bildt','Boarnsterhim','Dantumadiel','Dongeradeel','Ferwerderadiel',
'Franekeradeel','Harlingen','Kollumerland en Nieuwkruisland','Leeuwarden','Leeuwarderadeel','Littenseradiel','Menaldumadeel','Schiermonnikoog','Terschelling','Tytsjerksteradiel',
'Vlieland','Bolsward','Gaasterlân-Sleat','Lemsterland','Nijefurd','Sneek','Wûnseradiel','Wymbritseradiel','Heerenveen','Ooststellingwerf',
'Opsterland','Skarsterlân','Smallingerland','Weststellingwerf','Aa en Hunze','Assen','Midden-Drenthe','Noordenveld','Tynaarlo','Borger-Odoorn',
'Coevorden','Emmen','Hoogeveen','Meppel','Westerveld','De Wolden','Dalfsen','Hardenberg','Kampen','Ommen',
'Staphorst','Steenwijkerland','Zwartewaterland','Zwolle','Deventer','Olst-Wijhe','Raalte','Almelo','Borne','Dinkelland',
'Enschede','Haaksbergen','Hellendoorn','Hengelo','Hof van Twente','Losser','Oldenzaal','Rijssen-Holten','Tubbergen','Twenterand',
'Wierden','Apeldoorn','Barneveld','Ede','Elburg','Epe','Ermelo','Harderwijk','Hattem','Heerde',
'Nijkerk','Nunspeet','Oldebroek','Putten','Scherpenzeel','Voorst','Wageningen','Buren','Culemborg','Geldermalsen',
'Lingewaal','Maasdriel','Neder-Betuwe','Neerijnen','Tiel','West Maas en Waal','Zaltbommel','Aalten','Berkelland','Bronckhorst',
'Brummen','Doetinchem','Lochem','Montferland','Oost Gelre','Oude IJsselstreek','Winterswijk','Zutphen','Arnhem','Beuningen',
'Doesburg','Druten','Duiven','Groesbeek','Heumen','Lingewaard','Millingen aan de Rijn','Nijmegen','Overbetuwe','Renkum',
'Rheden','Rijnwaarden','Rozendaal','Ubbergen','Westervoort','Wijchen','Zevenaar','Almere','Dronten','Lelystad',
'Noordoostpolder','Urk','Zeewolde','Abcoude','Amersfoort','Baarn','De Bilt','Breukelen','Bunnik','Bunschoten',
'Eemnes','Houten','IJsselstein','Leusden','Loenen','Lopik','Maarssen','Montfoort','Nieuwegein','Oudewater',
'Renswoude','Rhenen','De Ronde Venen','Soest','Utrecht','Utrechtse Heuvelrug','Veenendaal','Vianen','Wijk bij Duurstede','Woerden',
'Woudenberg','Zeist','Andijk','Anna Paulowna','Drechterland','Enkhuizen','Harenkarspel','Den Helder','Hoorn','Koggenland',
'Medemblik','Niedorp','Opmeer','Schagen','Stede Broec','Texel','Wervershoof','Wieringen','Wieringermeer','Zijpe',
'Alkmaar','Bergen (NH.)','Heerhugowaard','Heiloo','Langedijk','Schermer','Beverwijk','Castricum','Heemskerk','Uitgeest',
'Velsen','Bloemendaal','Haarlem','Haarlemmerliede en Spaarnwoude','Heemstede','Zandvoort','Wormerland','Zaanstad','Aalsmeer','Amstelveen',
'Amsterdam','Beemster','Diemen','Edam-Volendam','Graft-De Rijp','Haarlemmermeer','Landsmeer','Oostzaan','Ouder-Amstel','Purmerend',
'Uithoorn','Waterland','Zeevang','Blaricum','Bussum','Hilversum','Huizen','Laren','Muiden','Naarden',
'Weesp','Wijdemeren','Hillegom','Kaag en Braassem','Katwijk','Leiden','Leiderdorp','Lisse','Noordwijk','Noordwijkerhout',
'Oegstgeest','Teylingen','Voorschoten','Zoeterwoude','''s-Gravenhage','Leidschendam-Voorburg','Pijnacker-Nootdorp','Rijswijk','Wassenaar','Zoetermeer',
'Delft','Midden-Delfland','Westland','Alphen aan den Rijn','Bergambacht','Bodegraven','Boskoop','Gouda','Nieuwkoop','Reeuwijk',
'Rijnwoude','Schoonhoven','Vlist','Waddinxveen','Albrandswaard','Barendrecht','Bernisse','Binnenmaas','Brielle','Capelle aan den IJssel',
'Cromstrijen','Dirksland','Goedereede','Hellevoetsluis','Korendijk','Krimpen aan den IJssel','Lansingerland','Maassluis','Middelharnis','Nederlek',
'Oostflakkee','Oud-Beijerland','Ouderkerk','Ridderkerk','Rotterdam','Rozenburg','Schiedam','Spijkenisse','Strijen','Vlaardingen',
'Westvoorne','Zuidplas','Alblasserdam','Dordrecht','Giessenlanden','Gorinchem','Graafstroom','Hardinxveld-Giessendam','Hendrik-Ido-Ambacht','Leerdam',
'Liesveld','Nieuw-Lekkerland','Papendrecht','Sliedrecht','Zederik','Zwijndrecht','Hulst','Sluis','Terneuzen','Borsele',
'Goes','Kapelle','Middelburg','Noord-Beveland','Reimerswaal','Schouwen-Duiveland','Tholen','Veere','Vlissingen','Bergen op Zoom',
'Breda','Drimmelen','Etten-Leur','Geertruidenberg','Halderberge','Moerdijk','Oosterhout','Roosendaal','Rucphen','Steenbergen',
'Woensdrecht','Zundert','Aalburg','Alphen-Chaam','Baarle-Nassau','Dongen','Gilze en Rijen','Goirle','Hilvarenbeek','Loon op Zand',
'Oisterwijk','Tilburg','Waalwijk','Werkendam','Woudrichem','Bernheze','Boekel','Boxmeer','Boxtel','Cuijk',
'Grave','Haaren','''s-Hertogenbosch','Heusden','Landerd','Lith','Maasdonk','Mill en Sint Hubert','Oss','Schijndel',
'Sint Anthonis','Sint-Michielsgestel','Sint-Oedenrode','Uden','Veghel','Vught','Asten','Bergeijk','Best','Bladel',
'Cranendonck','Deurne','Eersel','Eindhoven','Geldrop-Mierlo','Gemert-Bakel','Heeze-Leende','Helmond','Laarbeek','Nuenen, Gerwen en Nederwetten',
'Oirschot','Reusel-De Mierden','Someren','Son en Breugel','Valkenswaard','Veldhoven','Waalre','Beesel','Bergen (L.)','Gennep',
'Horst aan de Maas','Mook en Middelaar','Peel en Maas','Venlo','Venray','Echt-Susteren','Leudal','Maasgouw','Nederweert','Roerdalen',
'Roermond','Weert','Beek','Brunssum','Eijsden','Gulpen-Wittem','Heerlen','Kerkrade','Landgraaf','Maastricht',
'Margraten','Meerssen','Nuth','Onderbanken','Schinnen','Simpelveld','Sittard-Geleen','Stein','Vaals','Valkenburg aan de Geul',
'Voerendaal'))
where column_value <> nls_initcap(column_value, 'NLS_SORT=xDutch');
Use the result from that statement to build an UPDATE like this:
--Update names to properly init-capped name, as defined by:
--http://epp.eurostat.ec.europa.eu/portal/page/portal/nuts_nomenclature/local_administrative_units
update location
set name =
decode(upper(name),'''S-GRAVENHAGE','''s-Gravenhage',
'''S-HERTOGENBOSCH','''s-Hertogenbosch',
'AA EN HUNZE','Aa en Hunze',
'ALPHEN AAN DEN RIJN','Alphen aan den Rijn',
'BERGEN (NH.)','Bergen (NH.)',
'BERGEN OP ZOOM','Bergen op Zoom',
'CAPELLE AAN DEN IJSSEL','Capelle aan den IJssel',
'GILZE EN RIJEN','Gilze en Rijen',
'HAARLEMMERLIEDE EN SPAARNWOUDE','Haarlemmerliede en Spaarnwoude',
'HOF VAN TWENTE','Hof van Twente',
'HORST AAN DE MAAS','Horst aan de Maas',
'KAAG EN BRAASSEM','Kaag en Braassem',
'KOLLUMERLAND EN NIEUWKRUISLAND','Kollumerland en Nieuwkruisland',
'KRIMPEN AAN DEN IJSSEL','Krimpen aan den IJssel',
'LOON OP ZAND','Loon op Zand',
'MILL EN SINT HUBERT','Mill en Sint Hubert',
'MILLINGEN AAN DE RIJN','Millingen aan de Rijn',
'MOOK EN MIDDELAAR','Mook en Middelaar',
'NUENEN, GERWEN EN NEDERWETTEN','Nuenen, Gerwen en Nederwetten',
'PEEL EN MAAS','Peel en Maas',
'SON EN BREUGEL','Son en Breugel',
'VALKENBURG AAN DE GEUL','Valkenburg aan de Geul',
'WEST MAAS EN WAAL','West Maas en Waal',
'WIJK BIJ DUURSTEDE','Wijk bij Duurstede',
'HET BILDT','het Bildt',
nls_initcap(name, 'NLS_SORT=xDutch'));
If the question is about Dutch place names, would it be an option to have a lookup table? According to eurostat, there are 418 "Gemeenten" on NUTS5/LAU2 level. A list is available at http://epp.eurostat.ec.europa.eu/portal/page/portal/nuts_nomenclature/local_administrative_units. If this is not acceptable, you could at least verify your procedure with the official list...
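A minimal sketch of the lookup-table idea follows. The table name official_location_name and its index name are hypothetical, only three sample rows are shown, and in practice the table would be loaded from the official list. The location table and its name column are taken from the question.

```sql
--Hypothetical lookup table holding the official spelling of each name.
create table official_location_name (
  name varchar2(100) not null
);

--Guarantee at most one official spelling per case-insensitive name.
create unique index official_location_name_ux
  on official_location_name (upper(name));

--Sample rows only; load the full official list here.
insert into official_location_name (name) values ('Son en Breugel');
insert into official_location_name (name) values ('''s-Gravenhage');
insert into official_location_name (name) values ('IJsselstein');

--Replace each name by its official spelling when a case-insensitive match
--exists; rows without a match are left untouched.
update location l
set l.name = (select o.name
              from official_location_name o
              where upper(o.name) = upper(l.name))
where exists (select 1
              from official_location_name o
              where upper(o.name) = upper(l.name));

--Verification: names that still do not match any official spelling exactly.
select l.name
from location l
where not exists (select 1
                  from official_location_name o
                  where o.name = l.name);
```

The last query also works on its own as the verification step mentioned above, for example after running a procedure-based conversion.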