PyEnchant: Replace internet-friendly words with an English word (filter)
I want to identify words like "sooooooooooooooo" and replace them with "so" during spell checking. How can I achieve this? What do I need to write (a filter, etc.), and where do I tweak the code to do it?
Thanks !
You could use store_replacement; however, my understanding is that store_replacement needs to be implemented by the underlying provider. If you use the Aspell provider, which implements it, you can see it working like so. (Note: you will need to install Aspell and its dictionaries to see this working.)
import enchant
# Get the broker.
b = enchant.Broker()
# Set the ordering on the broker so aspell gets used first.
b.set_ordering("en_US","aspell,myspell")
# Print description of broker just to see what's available.
print (b.describe())
# Get a US English dictionary.
d=b.request_dict("en_US")
# Print the provider of the US English dictionary.
print (d.provider)
# A test string.
s = 'sooooooooooooooo'
# Check that the word is not in the dictionary (not needed if we already know it isn't).
print (d.check(s))
# Print suggestions for the string before we change anything.
print (d.suggest(s))
# Store a replacement for our string as "so".
d.store_replacement(s, 'so')
# Print our suggestions again and see "so" appears at the front of the list.
print (d.suggest(s))
[<Enchant: Aspell Provider>, <Enchant: Ispell Provider>, <Enchant: Myspell Provider>, <Enchant: Hspell Provider>]
<Enchant: Aspell Provider>
False
['SO', 'so', 'spoor', 'sou', 'sow', 'soy', 'zoo', 'Soho', 'Soto', 'solo', 'soon', 'soot', 'shoo', 'soar', 'sour', 'shoos', 'sooth', 'sooty', 'Si', 'sootier', 'sough', 'SOP', 'sop', 'S', 'poo', 's', 'sooner', 'soothe', 'sorrow', 'Sir', 'Sui', 'sci', 'sir', 'poos', 'silo', 'soap', 'soil', 'soup', 'SA', 'SE', 'SS', 'SW', 'Se', 'soother', 'SOB', 'SOS', 'SOs', 'SRO', 'Soc', 'Sol', 'Son', 'sob', 'soc', 'sod', 'sol', 'son', 'sot', 'boo', 'coo', 'foo', 'goo', 'loo', 'moo', 'ooh', 'too', 'woo', 'CEO', "S's", 'SSA', 'SSE', 'SSS', 'SSW', 'Sue', 'Zoe', 'saw', 'say', 'sea', 'see', 'sew', 'sue', 'xor', 'Snow', 'Sony', 'Sosa', 'boos', 'bozo', 'coos', 'loos', 'moos', 'oohs', 'ooze', 'oozy', 'orzo', 'ouzo', 'sago', 'scow', 'sloe', 'slow', 'snow', 'soak']
['so', 'SO', 'spoor', 'sou', 'sow', 'soy', 'zoo', 'Soho', 'Soto', 'solo', 'soon', 'soot', 'shoo', 'soar', 'sour', 'shoos', 'sooth', 'sooty', 'Si', 'sootier', 'sough', 'SOP', 'sop', 'S', 'poo', 's', 'sooner', 'soothe', 'sorrow', 'Sir', 'Sui', 'sci', 'sir', 'poos', 'silo', 'soap', 'soil', 'soup', 'SA', 'SE', 'SS', 'SW', 'Se', 'soother', 'SOB', 'SOS', 'SOs', 'SRO', 'Soc', 'Sol', 'Son', 'sob', 'soc', 'sod', 'sol', 'son', 'sot', 'boo', 'coo', 'foo', 'goo', 'loo', 'moo', 'ooh', 'too', 'woo', 'CEO', "S's", 'SSA', 'SSE', 'SSS', 'SSW', 'Sue', 'Zoe', 'saw', 'say', 'sea', 'see', 'sew', 'sue', 'xor', 'Snow', 'Sony', 'Sosa']
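As a complement to store_replacement, elongated words can also be normalized before they reach the spell checker. The following is only a minimal sketch under stated assumptions: the helper names candidates and normalize are hypothetical (they are not part of PyEnchant), and is_known is any callable that reports whether a word is valid. With the dictionary d from the example above, d.check could be passed in that role; a plain set is used here so the snippet runs without Enchant installed.

```python
import re

# Any character repeated three or more times in a row.
_REPEATS = re.compile(r"(.)\1{2,}")


def candidates(word):
    """Yield the word with each long run collapsed to one, then two, characters."""
    yield _REPEATS.sub(r"\1", word)
    yield _REPEATS.sub(r"\1\1", word)


def normalize(word, is_known):
    """Return the first collapsed form accepted by is_known, else the word unchanged."""
    for cand in candidates(word):
        if is_known(cand):
            return cand
    return word


# Stand-in for a real dictionary check (assumption: d.check would go here).
known = {"so", "good", "cool"}

print(normalize("sooooooooooooooo", known.__contains__))  # so
print(normalize("goooood", known.__contains__))           # good
```

Trying the single-character collapse first and the double-character collapse second is a design choice for this sketch; words that legitimately contain a doubled letter (such as "good") are only recovered by the second candidate.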
Related
How to replace accented characters with non-accented characters for each word in an array in ClickHouse?
I have an array of words, ['camión', 'elástico', 'Árbol'], and I want to replace the accented characters with non-accented ones for each word in the array (['camion', 'elastico', 'Arbol']). I'm looking for something like this:
SELECT arrayMap(x -> replaceRegexpAll(x, ['á', 'é', 'í', 'ó', 'ú'], ['a', 'e', 'i', 'o', 'u']), ['camión', 'elástico', 'Árbol']) AS word
And I want this result: ['camion', 'elastico', 'arbol']
That is, each accented character should be replaced by its unaccented equivalent, but this doesn't work. Any idea how to solve this? Thanks
SELECT arrayMap(x -> arrayStringConcat(
    arrayMap(y -> if((indexOf(['á', 'é', 'í', 'ó', 'ú'], y) as i) = 0, y, ['a', 'e', 'i', 'o', 'u'][i]),
    extractAll(x, '.'))),
  ['camión', 'elástico', 'Árbol']) r

┌─r─────────────────────────────┐
│ ['camion','elastico','Árbol'] │
└───────────────────────────────┘
A newer feature adds the functions translate(string, from_string, to_string) and translateUTF8(string, from_string, to_string). These functions replace characters in the original string according to the mapping of each character in from_string to to_string.
SELECT arrayMap(y -> translateUTF8(y, 'áéíóúÁÉÍÓÚ', 'aeiouAEIOU'), ['camión', 'elástico', 'Árbol']) r

r                            |
-----------------------------+
['camion','elastico','Arbol']|
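For readers who want to see the per-character mapping idea outside ClickHouse, here is a small Python sketch of the same technique. It is an illustration only, not ClickHouse code: str.maketrans builds a table mapping each character of the first string to the character at the same position in the second, which is the behaviour described for translateUTF8 above.

```python
# Build a character-to-character table: position i of the first string
# maps to position i of the second string.
table = str.maketrans("áéíóúÁÉÍÓÚ", "aeiouAEIOU")

words = ["camión", "elástico", "Árbol"]
result = [word.translate(table) for word in words]

print(result)  # ['camion', 'elastico', 'Arbol']
```

Because the uppercase accented vowels are included in the table, 'Árbol' becomes 'Arbol' here, unlike in the indexOf-based query, whose lookup list only contains lowercase vowels.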
Ruby array of countries Stripe Connect accepts?
Is there somewhere I can get this list of countries as a Ruby array? I'll need it for a form field, e.g.
<%= f.label :country %><br>
<%= f.select :country, ['Australia', 'Austria', 'etc', 'etc'], required: true %>
I could type it up manually (which I'll probably do), but just wanted to check that I'm not reinventing the wheel (it may already exist somewhere).
Okay, so there it is. Note that Mexico is commented out:
stripe_connect_countries = [
  'Australia', 'Austria', 'Belgium', 'Bulgaria', 'Canada', 'Cyprus',
  'Czech Republic', 'Denmark', 'Estonia', 'Finland', 'France', 'Germany',
  'Greece', 'Hong Kong SAR China', 'Hungary', 'Ireland', 'Italy', 'Japan',
  'Latvia', 'Lithuania', 'Luxembourg', 'Malta',
  # 'Mexico',
  'Netherlands', 'New Zealand', 'Norway', 'Poland', 'Portugal', 'Romania',
  'Singapore', 'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland',
  'United Kingdom', 'United States'
]
Also note that the country argument to Stripe functions typically wants a two-character alphanumeric country code (such as 'US', 'EG', or 'GB'). Here is a full list of country codes.
Since the country names are user-friendly but Stripe functions require the two-character codes, here's a useful function providing a hash mapping countries to codes (again, Mexico commented out):
def stripe_connect_countries
  {
    'Australia': 'AU', 'Austria': 'AT', 'Belgium': 'BE', 'Bulgaria': 'BG',
    'Canada': 'CA', 'Cyprus': 'CY', 'Czech Republic': 'CZ', 'Denmark': 'DK',
    'Estonia': 'EE', 'Finland': 'FI', 'France': 'FR', 'Germany': 'DE',
    'Greece': 'GR', 'Hong Kong SAR China': 'HK', 'Hungary': 'HU',
    'Ireland': 'IE', 'Italy': 'IT', 'Japan': 'JP', 'Latvia': 'LV',
    'Lithuania': 'LT', 'Luxembourg': 'LU', 'Malta': 'MT',
    # 'Mexico': 'MX',
    'Netherlands': 'NL', 'New Zealand': 'NZ', 'Norway': 'NO', 'Poland': 'PL',
    'Portugal': 'PT', 'Romania': 'RO', 'Singapore': 'SG', 'Slovakia': 'SK',
    'Slovenia': 'SI', 'Spain': 'ES', 'Sweden': 'SE', 'Switzerland': 'CH',
    'United Kingdom': 'GB', 'United States': 'US'
  }
end
Sort a dictionary into two different lists based on group
import requests
from operator import itemgetter

foods = [{'name': 'Daisy', 'group': 'A', 'eating': 'yes', 'feasting': 'yes', 'fasting': 'no', 'sleeping': 'no'},
         {'name': 'Donny', 'group': 'B', 'eating': 'maybe', 'feasting': 'maybe', 'fasting': 'maybe', 'sleeping': 'maybe'},
         {'name': 'Dwane', 'group': 'A', 'eating': 'no', 'feasting': 'yes', 'fasting': 'no', 'sleeping': 'yes'},
         {'name': 'Diana', 'group': 'B', 'eating': 'never', 'feasting': 'never', 'fasting': 'never', 'sleeping': 'never'}]

def main():
    group = sorted(foods, key=itemgetter('group'))
    group_a = []
    group_b = []
    print(group)

main()

Hi there, I need help with the next step of this code. I would like to place the two dictionaries with group "A" in the empty list group_a. I would also like to place the two dictionaries with group "B" in the empty list group_b. I am not sure how to go about this. Previously I tried:

for row in foods:
    if 'A' in row:
        group_a.append(row)
    else:
        group_b.append(row)

However, that did not work. Does anyone have an idea of how to populate these two empty lists according to group?
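One possible approach, offered as a sketch rather than a definitive answer: the attempt above tests 'A' in row, and for a dictionary the in operator checks the keys ('name', 'group', ...), not the values, so the condition is never true and every row lands in group_b. Comparing row['group'] directly avoids that. The data below is the list from the question.

```python
foods = [
    {'name': 'Daisy', 'group': 'A', 'eating': 'yes', 'feasting': 'yes', 'fasting': 'no', 'sleeping': 'no'},
    {'name': 'Donny', 'group': 'B', 'eating': 'maybe', 'feasting': 'maybe', 'fasting': 'maybe', 'sleeping': 'maybe'},
    {'name': 'Dwane', 'group': 'A', 'eating': 'no', 'feasting': 'yes', 'fasting': 'no', 'sleeping': 'yes'},
    {'name': 'Diana', 'group': 'B', 'eating': 'never', 'feasting': 'never', 'fasting': 'never', 'sleeping': 'never'},
]

# Compare the value stored under the 'group' key, not the dictionary's keys.
group_a = [row for row in foods if row['group'] == 'A']
group_b = [row for row in foods if row['group'] == 'B']

print([row['name'] for row in group_a])  # ['Daisy', 'Dwane']
print([row['name'] for row in group_b])  # ['Donny', 'Diana']
```

The same comparison works inside the original for loop (if row['group'] == 'A': ...); the list comprehensions are just a more compact way to write it.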
Getting the user's Country ISO or Country name [duplicate]
I am trying to access the Control Panel: Region and Language: Location: Current location setting using Ruby. I am only interested in the country code. The closest I have got is the country code from the System Locale but that is not quite what I was after. `systeminfo | findstr /B /C:"System Locale"`.to_s.upcase.strip[30..31] I hope that someone out there might know. Thanks.
Using the Win32 API:
require 'Win32API'

# Set up some Win32 constants
GEOCLASS_NATION = 16
GEO_ISO2 = 4
GEO_FRIENDLYNAME = 8

# Set up some API calls
GetUserGeoID = Win32API.new('kernel32', 'GetUserGeoID', ['L'], 'L')
GetGeoInfo = Win32API.new('kernel32', 'GetGeoInfoA', ['L', 'L', 'P', 'L', 'L'], 'L')

# Get user's GEOID
geoid = GetUserGeoID.call(GEOCLASS_NATION)
=> 77

# Get ISO name
buffer = " " * 100
GetGeoInfo.call(geoid, GEO_ISO2, buffer, buffer.length, 0)
geo_iso = buffer.strip
=> "FI"

# Get friendly name
buffer = " " * 100
GetGeoInfo.call(geoid, GEO_FRIENDLYNAME, buffer, buffer.length, 0)
geo_name = buffer.strip
=> "Finland"

Documentation for GetUserGeoID: http://msdn.microsoft.com/en-us/library/dd318138.aspx
Documentation for GetGeoInfo: https://learn.microsoft.com/en-us/windows/desktop/api/winnls/nf-winnls-getgeoinfoa
To convert a GEOID to a location name you can also use this table: http://msdn.microsoft.com/en-us/library/dd374073.aspx
Plone Translations - i18ndude Preferred Language
I am hoping this is something simple I am just overlooking. We have 3 Plone sites that are supposed to be exactly the same in their core setup, differing only in certain installed products and the actual content. I noticed our translations are working on one site and not on the other two. So far I can't find any differences.
We are using i18ndude (version 3.3.3) with Plone 4.3.2. We do have custom products/types with our own domain, but it is more than just those not working; it is everything in the site.
For testing, I tried just grabbing and printing the browser's language. I did it with both context.REQUEST['LANGUAGE'] and context.portal_languages.getPreferredLanguage(). I set my browser language in each attempt to 'es', 'en', and 'pt', as those are the languages we currently support. The Site Language in each site is set to English. Here are my test results:
Browser Language set to 'es':
    Site A: returned 'es'
    Site B: returned 'en'
    Site C: returned 'en'
Browser Language set to 'en':
    Site A: returned 'en'
    Site B: returned 'en'
    Site C: returned 'en'
Browser Language set to 'pt':
    Site A: returned 'en'
    Site B: returned 'en'
    Site C: returned 'en'
Sites A and B are both on the same server, so I don't believe it's a missing server package. The buildouts are almost identical for those two; the only differences are a couple of eggs that are seemingly unrelated to this issue.
I just don't understand why it isn't detecting the updated browser language at all; it seems to just default back to the site's preferred language, except for one scenario in one site. What is strange is that these all used to work to the best of my knowledge, and I am not sure when they stopped.
I did check context.portal_languages.getAvailableLanguages() just to make sure the ones I am using are in there, and they are. I also checked the ownership and permissions of the locales & i18n directories; those all match across sites and are set correctly.
EDIT This is a script I quickly wrote to see what all values Plone is getting: pl = context.portal_languages langs = [str(language) for language in pl.getAvailableLanguages().keys()] print langs print "Preferred: ", pl.getPreferredLanguage() ts = context.translation_service print "Request Language: ", context.REQUEST['LANGUAGE'] print "Accept Language: ", context.REQUEST['HTTP_ACCEPT_LANGUAGE'] return printed This is my browser language setup when running this, listed by highest priority first: pt-br pt es en en-us And this is my result (site A, which seems to recognize Spanish, but not Portuguese): ['gv', 'gu', 'gd', 'ga', 'gn', 'gl', 'lg', 'lb', 'ty', 'ln', 'tw', 'tt', 'tr', 'ts', 'li', 'tn', 'to', 'tl', 'lu', 'tk', 'th', 'ti', 'tg', 'as', 'te', 'ta', 'yi', 'yo', 'de', 'ko', 'da', 'dz', 'dv', 'qu', 'kn', 'lv', 'el', 'eo', 'en', 'zh', 'ee', 'za', 'uk', 'eu', 'zu', 'es', 'ru', 'rw', 'kl', 'rm', 'rn', 'ro', 'bn', 'be', 'bg', 'ba', 'wa', 'wo', 'bm', 'jv', 'bo', 'bh', 'bi', 'br', 'bs', 'ja', 'om', 'oj', 'la', 'oc', 'kj', 'lo', 'os', 'or', 'xh', 'ch', 'co', 'ca', 'ce', 'cy', 'cs', 'cr', 'cv', 'cu', 'ps', 'pt', 'lt', 'pa', 'pi', 'ak', 'pl', 'hz', 'hy', 'an', 'hr', 'am', 'ht', 'hu', 'hi', 'ho', 'ha', 'he', 'mg', 'uz', 'ml', 'mo', 'mn', 'mi', 'mh', 'mk', 'ur', 'mt', 'ms', 'mr', 'ug', 'my', 'ki', 'aa', 'ab', 'ae', 've', 'af', 'vi', 'is', 'vk', 'iu', 'it', 'vo', 'ii', 'ay', 'ik', 'ar', 'km', 'io', 'et', 'ia', 'az', 'ie', 'id', 'ig', 'ks', 'nl', 'nn', 'no', 'na', 'nb', 'nd', 'ne', 'ng', 'ny', 'kw', 'nr', 'nv', 'kv', 'fr', 'ku', 'fy', 'fa', 'kk', 'ff', 'fi', 'fj', 'ky', 'fo', 'ka', 'kg', 'ss', 'sr', 'sq', 'sw', 'sv', 'su', 'st', 'sk', 'kr', 'si', 'sh', 'so', 'sn', 'sm', 'sl', 'sc', 'sa', 'sg', 'se', 'sd'] Preferred: es Request Language: es Accept Language: pt-br,pt;q=0.8,es;q=0.6,en;q=0.4,en-us;q=0.2 And results for Site B and C: ['en-mp', 'gv', 'gu', 'fr-dj', 'fr-gb', 'en-na', 'en-ng', 'en-nf', 'zh-hk', 'gd', 'pt-br', 'ga', 'gn', 'gl', 'en-nu', 'en-fm', 'en-ag', 'ms-my', 
'ty', 'tw', 'tt', 'tr', 'ts', 'ko-kp', 'tn', 'to', 'tl', 'tk', 'th', 'ti', 'tg', 'te', 'zh-sg', 'ta', 'fr-mq', 'de', 'da', 'ar-ae', 'es-ni', 'dz', 'en-kn', 'fr-ml', 'dv', 'en-ms', 'fr-mg', 'fr-sc', 'fr-vu', 'qu', 'ar-qa', 'es-bo', 'en-nz', 'fr-bj', 'en-ws', 'fr-bi', 'zh', 'en-lr', 'fr-ch', 'fr-bf', 'za', 'fr-be', 'en-lc', 'fr-rw', 'zu', 'ch-mp', 'ar-ly', 'en-gb', 'en-nr', 'es-pr', 'tr-bg', 'en-gh', 'en-gi', 'fr-km', 'es-py', 'en-gm', 'es-pe', 'es-pa', 'en-gu', 'en-gy', 'sw-tz', 'ms-sg', 'wa', 'pt-st', 'wo', 'pt-ao', 'jv', 'fr-cd', 'ja', 'en-vu', 'es-ar', 'fr-td', 'fr-tg', 'da-dk', 'ch', 'co', 'en-vg', 'en-bz', 'ca', 'en-us', 'ce', 'en-ai', 'en-bm', 'en-vi', 'cy', 'en-bn', 'cs', 'cr', 'fr-ci', 'cv', 'cu', 'en-bb', 'ps', 'ln-cg', 'pt', 'en-au', 'zh-tw', 'es-mx', 'de-de', 'pa', 'es-ve', 'en-as', 'en-er', 'pi', 'de-dk', 'pl', 'en-sb', 'ch-gu', 'es-hn', 'en-sc', 'fr-nc', 'it-hr', 'ar-eg', 'mg', 'pt-pt', 'ml', 'mo', 'mn', 'mi', 'mh', 'mk', 'mt', 'ms', 'mr', 'fr-fr', 'hu-si', 'my', 'sv-fi', 'fr-re', 'en-pk', 've', 'vi', 'is', 'vk', 'iu', 'it', 'vo', 'ii', 'ik', 'en-io', 'fr-cm', 'io', 'ia', 'ie', 'id', 'ig', 'es-cu', 'hu-hu', 'es-cr', 'es-cl', 'es-co', 'fr-wf', 'pt-mz', 'en-il', 'it-it', 'de-be', 'fr', 'en-ke', 'fr-ga', 'fr-pf', 'es-do', 'ar-ps', 'fy', 'fr-gn', 'fr-pm', 'en-ki', 'en-ug', 'fa', 'fr-gp', 'ff', 'fi', 'fj', 'fo', 'ar-kw', 'bn-sg', 'ss', 'sr', 'sq', 'sw', 'sv', 'su', 'st', 'sk', 'si', 'sh', 'so', 'sn', 'sm', 'sl', 'sc', 'sa', 'sg', 'se', 'sd', 'bn-in', 'fr-mc', 'sv-se', 'ar-bh', 'lg', 'lb', 'la', 'ln', 'lo', 'ss-za', 'li', 'lv', 'lt', 'lu', 'sw-ke', 'en-bw', 'yi', 'en-ph', 'en-pn', 'yo', 'en-ie', 'en-pg', 'pt-cv', 'hr-ba', 'bn-bd', 'en-pr', 'en-pw', 'ss-sz', 'ar-iq', 'de-ch', 'ar-il', 'es-sv', 'el', 'eo', 'en', 'ar-dz', 'ee', 'tn-bw', 'es-gq', 'fr-gf', 'es-gt', 'eu', 'et', 'de-lu', 'es', 'ru', 'rw', 'zh-cn', 'ar-td', 'nl-nl', 'it-sm', 'it-si', 'rm', 'rn', 'ro', 'ar-sa', 'be', 'bg', 'ur-pk', 'ba', 'fr-ca', 'bm', 'bn', 'bo', 'bh', 'bi', 'fr-cg', 'fr-cf', 
'es-us', 'el-cy', 'en-vc', 'sd-pk', 'ta-sg', 'br', 'bs', 'nl-an', 'sd-in', 'cs-cz', 'om', 'oj', 'fr-lb', 'en-fk', 'en-fj', 'oc', 'ln-cd', 'fr-lu', 'ar-om', 'de-at', 'os', 'or', 'tr-cy', 'xh', 'el-gr', 'de-li', 'ar-sy', 'en-jm', 'es-ec', 'ar-so', 'it-ch', 'en-ls', 'ar-sd', 'es-es', 'en-rw', 'tn-za', 'ar-jo', 'en-ky', 'en-bs', 'hz', 'ar-ma', 'da-gl', 'hy', 'en-mt', 'en-mu', 'nl-aw', 'en-mw', 'hr', 'en-tt', 'en-zw', 'ht', 'hu', 'en-to', 'ar-mr', 'hi', 'en-tk', 'ho', 'hr-hr', 'ha', 'en-tc', 'pt-gw', 'he', 'en-dm', 'fr-it', 'uz', 'en-et', 'ur-in', 'ur', 'tr-tr', 'uk', 'ms-bn', 'ug', 'aa', 'en-so', 'en-sl', 'ab', 'ae', 'en-sh', 'af', 'en-sg', 'ak', 'am', 'ko-kr', 'an', 'as', 'ar', 'en-sz', 'nl-be', 'ay', 'az', 'ar-lb', 'nl', 'nn', 'no', 'na', 'nb', 'nd', 'ne', 'ng', 'ny', 'ta-in', 'fr-yt', 'en-za', 'nr', 'nv', 'ar-ye', 'ar-tn', 'en-cm', 'en-ck', 'sr-ba', 'en-ca', 'ka', 'kg', 'en-gd', 'es-uy', 'kk', 'kj', 'ki', 'ko', 'kn', 'km', 'kl', 'ks', 'kr', 'fr-ad', 'kw', 'kv', 'ku', 'en-zm', 'ky', 'fr-ht', 'nl-sr'] Preferred: en Request Language: en Accept Language: pt-br,pt;q=0.8,es;q=0.6,en;q=0.4,en-us;q=0.2 I just noticed that the list of available languages from portal_languages is different between those sites. Adding to the strange, but maybe a hint to the culprit? Sorry for the long post, just trying to give as much info as I can!
My suspicion was right that it was something simple I was overlooking. Posting my findings here. In the ZMI, go to portal_languages and check these settings:
Default Language
Allowed Languages: ALL supported languages should be selected.
Negotiation Scheme: make sure "Use browser language request negotiation" is checked.
My issue was that only the default language was selected in the Allowed Languages selection list. I am not sure why or how it got reset like this. When using the Language Settings control panel I did not see the Allowed Languages option; I had to go to the ZMI for it. Apparently the changes mentioned by hvelarde did not update this setting either.
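To illustrate why a restricted Allowed Languages list makes every request fall back to the default, here is a simplified sketch of browser language negotiation. This is explicitly not Plone's implementation: the function names parse_accept_language and negotiate are hypothetical, and real negotiation in portal_languages involves more settings than are modeled here. It only shows the general idea of walking the Accept-Language header in priority order and accepting the first language that is allowed.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, highest q first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        quality = 1.0  # a tag without an explicit q value has quality 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if tag:
            # Sort by descending quality, then by original position.
            entries.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def negotiate(header, allowed, default):
    """Pick the first requested language (or its base language) that is allowed."""
    for tag in parse_accept_language(header):
        for candidate in (tag, tag.split("-")[0]):
            if candidate in allowed:
                return candidate
    return default


# The header from the question.
header = "pt-br,pt;q=0.8,es;q=0.6,en;q=0.4,en-us;q=0.2"

print(negotiate(header, {"en"}, "en"))              # en
print(negotiate(header, {"en", "es", "pt"}, "en"))  # pt
```

With only 'en' allowed, no Portuguese or Spanish entry can match, so the result is always English regardless of browser settings, which matches the symptom described for Sites B and C.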
Search the instance part of your buildout for the environment variable zope_i18n_allowed_languages; it is used to restrict the languages for which po files are loaded, to speed up Zope startup time and use less memory. In your case, you should set it as follows:
[instance]
...
environment-vars =
    PTS_LANGUAGES en es pt
    zope_i18n_allowed_languages en es pt
    zope_i18n_compile_mo_files true
For more information check Maurits van Rees' Internationalization in Plone 3.3 and 4.0.