In South Africa we have 11 official languages:
af = Afrikaans
en = English
nr = Ndebele
nso = Pedi / Northern Sotho
ss = Swati
st = South Sotho
tn = Tswana
ts = Tsonga
ve = Venda
xh = Xhosa
zu = Zulu
To what extent does Plone support the languages mentioned?
You may find Plone's existing translations at: https://github.com/collective/plone.app.locales/tree/master/plone/app/locales/locales
Of those on your list, a quick review shows only English and Afrikaans. There are several active South African Plone developers (ZA was the site of a recent sprint), and you might want to see if there are possibilities for collaborative work on the other translations. The Plone SA group: https://groups.google.com/forum/?fromgroups#!forum/plonesa
I have a page1 URL and a page2 URL. I want to fetch all the content from the first URL and only the main text from the second URL, and append it to the main text of the first URL. That is just one article; the function parse_indianexpress_archive_links() returns a list of news article URLs. I am getting all the results from page1, but the next_maintext column for page2 outputs <GET http://archive.indianexpress.com/news/congress-approves-2010-budget-plan/442712/2> instead of the text.
class spider_indianexpress(scrapy.Spider):
    name = 'indianexpress'
    start_urls = parse_indianexpress_archive_links()

    def parse(self, response):
        items = ScrapycrawlerItem()
        separator = ''
        #article_url = response.xpath("//link[@rel = 'canonical']/@href").extract_first()
        article_url = response.request.url
        date_updated = max(response.xpath("//div[@class = 'story-date']/text()").extract(), key=len)[-27:]  #Call max(list, key=len) to return the longest string in list by comparing the lengths of all strings in a list
        if len(date_updated) <= 10:
            date_updated = max(response.xpath("//div[@class = 'story-date']/p/text()").extract(), key=len)[-27:]
        headline = response.xpath("(//div[@id = 'ie2013-content']/h1//text())").extract()
        headline = separator.join(headline)
        image_url = response.css("div.storybigpic.ssss img").xpath("@src").extract_first()
        maintext = response.xpath("//div[@class = 'ie2013-contentstory']//p//text()").extract()
        maintext = ' '.join(map(str, maintext))
        maintext = maintext.replace('\r', '')
        contd = response.xpath("//div[@class = 'ie2013-contentstory']/p[@align = 'right']/text()").extract_first()
        items['date_updated'] = date_updated
        items['headline'] = headline
        items['maintext'] = maintext
        items['image_url'] = image_url
        items['article_url'] = article_url
        next_page_url = response.xpath("//a[@rel='canonical']/@href").extract_first()
        if next_page_url:
            items['next_maintext'] = scrapy.Request(next_page_url, callback=self.parse_page2)
        yield items

    def parse_page2(self, response):
        next_maintext = response.xpath("//div[@class = 'ie2013-contentstory']//p//text()").extract()
        next_maintext = ' '.join(map(str, next_maintext))
        next_maintext = next_maintext.replace('\r', '')
        yield {next_maintext}
Output:
article_url,date_publish,date_updated,description,headline,image_url,maintext,next_maintext
http://archive.indianexpress.com/news/congress-approves-2010-budget-plan/442712/,,"Fri Apr 03 2009, 14:49 hrs ",,Congress approves 2010 budget plan,http://static.indianexpress.com/m-images/M_Id_69893_Obama.jpg,"The Democratic-controlled US Congress on Thursday approved budget blueprints embracing President Barack Obama's agenda but leaving many hard choices until later and a government deeply in the red. With no Republican support, the House of Representatives and Senate approved slightly different, less expensive versions of Obama's $3.55 trillion budget plan for fiscal 2010, which begins on October 1. The differences will be worked out over the next few weeks. Obama, who took office in January after eight years of the Republican Bush presidency, has said the Democrats' budget is critical to turning around the recession-hit US economy and paving the way for sweeping healthcare, climate change and education reforms he hopes to push through Congress this year. Obama, traveling in Europe, issued a statement praising the votes as ""an important step toward rebuilding our struggling economy."" Vice President Joe Biden, who serves as president of the Senate, presided over that chamber's vote. Democrats in both chambers voted down Republican alternatives that focused on slashing massive deficits with large cuts to domestic social spending but also offered hefty tax breaks for corporations and individuals. ""Democrats know that those policies are the wrong way to go,"" House Majority Leader Steny Hoyer told reporters. ""Our budget lays the groundwork for a sustained, shared and job-creating recovery."" But Republicans have argued the Democrats' budget would be a dangerous expansion of the federal government and could lead to unnecessary taxes that would only worsen the country's long-term fiscal situation. ""The Democrat plan to increase spending, to increase taxes, and increase the debt makes no difficult choices,"" said House Minority Leader John Boehner. ""It's a roadmap to disaster."" The budget measure is nonbinding but it sets guidelines for spending and tax bills Congress will consider later this year. BIPARTISANSHIP ABSENT AGAIN Obama has said he hoped to restore bipartisanship when he arrived in Washington but it was visibly absent on Thursday. ... contd.",<GET http://archive.indianexpress.com/news/congress-approves-2010-budget-plan/442712/2>
This is not how Scrapy works (I mean the next_page request); see "How to fetch the Response object of a Request synchronously on Scrapy?".
But in fact you don't need synchronous requests. All you need is to check for a next page and pass the current state (the item) to the callback that will process that next page. I'm using cb_kwargs (it's the recommended way now); you may need to use request.meta if you have an old Scrapy version.
import scrapy

class spider_indianexpress(scrapy.Spider):
    name = 'indianexpress'
    start_urls = ['http://archive.indianexpress.com/news/congress-approves-2010-budget-plan/442712/']

    def parse(self, response):
        item = {}
        separator = ''
        #article_url = response.xpath("//link[@rel = 'canonical']/@href").extract_first()
        article_url = response.request.url
        date_updated = max(response.xpath("//div[@class = 'story-date']/text()").extract(), key=len)[-27:]  #Call max(list, key=len) to return the longest string in list by comparing the lengths of all strings in a list
        if len(date_updated) <= 10:
            date_updated = max(response.xpath("//div[@class = 'story-date']/p/text()").extract(), key=len)[-27:]
        headline = response.xpath("(//div[@id = 'ie2013-content']/h1//text())").extract()
        headline = separator.join(headline)
        image_url = response.css("div.storybigpic.ssss img").xpath("@src").extract_first()
        maintext = response.xpath("//div[@class = 'ie2013-contentstory']//p//text()").extract()
        maintext = ' '.join(map(str, maintext))
        maintext = maintext.replace('\r', '')
        contd = response.xpath("//div[@class = 'ie2013-contentstory']/p[@align = 'right']/text()").extract_first()
        item['date_updated'] = date_updated
        item['headline'] = headline
        item['maintext'] = maintext
        item['image_url'] = image_url
        item['article_url'] = article_url
        next_page_url = response.xpath('//a[@rel="canonical"][@id="active"]/following-sibling::a[1]/@href').extract_first()
        if next_page_url:
            yield scrapy.Request(
                url=next_page_url,
                callback=self.parse_next_page,
                cb_kwargs={
                    'item': item,
                },
            )
        else:
            yield item

    def parse_next_page(self, response, item):
        next_maintext = response.xpath("//div[@class = 'ie2013-contentstory']//p//text()").extract()
        next_maintext = ' '.join(map(str, next_maintext))
        next_maintext = next_maintext.replace('\r', '')
        item["maintext"] += next_maintext
        next_page_url = response.xpath('//a[@rel="canonical"][@id="active"]/following-sibling::a[1]/@href').extract_first()
        if next_page_url:
            yield scrapy.Request(
                url=next_page_url,
                callback=self.parse_next_page,
                cb_kwargs={
                    'item': item,
                },
            )
        else:
            yield item
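If you're on an older Scrapy version without cb_kwargs, the same hand-off can be done through request.meta; a minimal sketch of just the parts that change (the rest of the spider stays as above):

# In parse(): attach the partially built item to the request instead of using cb_kwargs.
yield scrapy.Request(
    url=next_page_url,
    callback=self.parse_next_page,
    meta={'item': item},
)

def parse_next_page(self, response):
    # Retrieve the item that parse() attached to the request.
    item = response.meta['item']
    item["maintext"] += ' '.join(response.xpath("//div[@class = 'ie2013-contentstory']//p//text()").extract())
    yield item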
import nltk

doc = '''Andrew Yan-Tak Ng is a Chinese American computer scientist. He is the former chief scientist at Baidu, where he led the company's
Artificial Intelligence Group. He is an adjunct professor (formerly associate professor) at Stanford University. Ng is also the co-founder
and chairman at Coursera, an online education platform. Andrew was born in the UK on 27th Sep 2.30pm 1976. His parents were both from Hong Kong.'''

# tokenize doc
tokenized_doc = nltk.word_tokenize(doc)

# tag sentences and use nltk's Named Entity Chunker
tagged_sentences = nltk.pos_tag(tokenized_doc)
ne_chunked_sents = nltk.ne_chunk(tagged_sentences)
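I then pull the (entity, label) pairs out of the chunk tree roughly like this (a simplified sketch of my extraction step):

named_entities = []
for chunk in ne_chunked_sents:
    # Subtrees carry an entity label (PERSON, GPE, ORGANIZATION, ...); plain (word, tag) tuples do not.
    if hasattr(chunk, 'label'):
        entity = ' '.join(token for token, pos in chunk.leaves())
        named_entities.append((entity, chunk.label()))
print(named_entities)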
When I process and extract the chunks this way, I only get:
[('Andrew', 'PERSON'), ('Chinese', 'GPE'), ('American', 'GPE'), ('Baidu', 'ORGANIZATION'), ("company's Artificial Intelligence Group", 'ORGANIZATION'), ('Stanford University', 'ORGANIZATION'), ('Coursera', 'ORGANIZATION'), ('Andrew', 'PERSON'), ('UK', 'ORGANIZATION'), ('Hong Kong', 'GPE')]
I need to get the time and date too. How can I do that?
Thank you.
You need a more sophisticated tagger, such as Stanford's Named Entity Recognizer (NER). Once you have it installed and configured, you can run it:
from nltk.tag import StanfordNERTagger
from nltk.tokenize import word_tokenize
stanfordClassifier = '/path/to/classifier/classifiers/english.muc.7class.distsim.crf.ser.gz'
stanfordNerPath = '/path/to/jar/stanford-ner/stanford-ner.jar'
st = StanfordNERTagger(stanfordClassifier, stanfordNerPath, encoding='utf8')
doc = '''Andrew Yan-Tak Ng is a Chinese American computer scientist.He is the former chief scientist at Baidu, where he led the company's Artificial Intelligence Group. He is an adjunct professor (formerly associate professor) at Stanford University. Ng is also the co-founder and chairman at Coursera, an online education platform. Andrew was born in the UK on 27th Sep 2.30pm 1976. His parents were both from Hong Kong.'''
result = st.tag(word_tokenize(doc))
date_word_tags = [wt for wt in result if wt[1] == 'DATE' or wt[1] == 'ORGANIZATION']
print(date_word_tags)
Where the output would be:
[(u'Artificial', u'ORGANIZATION'), (u'Intelligence', u'ORGANIZATION'), (u'Group', u'ORGANIZATION'), (u'Stanford', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'Coursera', u'ORGANIZATION'), (u'27th', u'DATE'), (u'Sep', u'DATE'), (u'2.30pm', u'DATE'), (u'1976', u'DATE')]
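The tags come back per token; if you want whole phrases (for example a single DATE entry covering the full date), you can merge consecutive tokens that share a tag, roughly like this (a sketch building on the result list above):

from itertools import groupby

# Merge runs of tokens with the same tag; the Stanford tagger marks non-entities with 'O'.
phrases = [(' '.join(word for word, tag in group), label)
           for label, group in groupby(result, key=lambda wt: wt[1])
           if label != 'O']
# e.g. ('27th Sep 2.30pm 1976', 'DATE') and ('Artificial Intelligence Group', 'ORGANIZATION')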
You will probably run into some issues when trying to install and set up everything, but I think it's worth the hassle.
Let me know if it helps.
I am using the twitter gem to connect to the Twitter streaming API.
When I run the code in the console in Sublime Text 2, everything works as it should and I get results from the API. However, when I try to run the script from the terminal, I get this error:
/Users/username/.rbenv/versions/2.1.4/lib/ruby/gems/2.1.0/gems/twitter-5.15.0/lib/twitter/streaming/connection.rb:16:in `initialize': Can't assign requested address - connect(2) for "199.16.156.217" port (Errno::EADDRNOTAVAIL)
I am only using the example code from the GitHub page of the twitter gem:
https://github.com/sferik/twitter
client = Twitter::Streaming::Client.new do |config|
  config.consumer_key        = "YOUR_CONSUMER_KEY"
  config.consumer_secret     = "YOUR_CONSUMER_SECRET"
  config.access_token        = "YOUR_ACCESS_TOKEN"
  config.access_token_secret = "YOUR_ACCESS_SECRET"
end

client.sample do |object|
  puts object.text if object.is_a?(Twitter::Tweet)
end
Does anyone know why I get this error and how I can fix it?
require 'twitter'

while true
  config = {
    :consumer_key        => CONSUMER_KEY,
    :consumer_secret     => CONSUMER_SECRET,
    :access_token        => ACCESS_TOKEN,
    :access_token_secret => ACCESS_TOKEN_SECRET,
  }
  sClient = Twitter::Streaming::Client.new(config)

  topics = ['edelweiss', 'rose']
  sClient.filter(:track => topics.join(',')) do |tweet|
    if tweet.is_a?(Twitter::Tweet)
      puts "#{tweet.user.screen_name}: #{tweet.text}"
    end
  end
end
Running the code:
$ ruby lasswi.rb
Suphatra_Rfc: RT #GGiftfyy: Rose Gold ในมือนั้นอิจเเรงงงง เครื่องเก่าโยนมาทางนี้ก็ได้นะเพ่~ น้องพร้อมเสมอ😂😂 #งานซูมต้องมา #อยากได้อ่ะอยากได้ 😆 http://t.…
CBullsfans: Jimmy Butler Reportedly Doesn't Respect Derrick Rose's Work Ethic http://t.co/3Ikjvvjuth #Bulls #NBA
sobinasalvez: RT #iPhoneTeam: Rose gold everything http://t.co/1DLhXokknu
magicearth_: RT #magazine_wmw: Rose-ringed parakeets in flight on their way to roost in an urban cemetery in London, England.
Photograph: Sam Hobson htt…
demoo2012: Rose, use this pic 👍
#Razana96 http://t.co/uAwS9JdHyl
EndearingImages: New artwork for sale! - "Grace" - http://t.co/ugpIaxABqg #fineartamerica http://t.co/jg0e3eDNll
LoveKnitting: Great rose workshop with #NickyKnits at #TheKnittingandStitchingShow such a lovely lady! #twistedthread http://t.co/rV0Bsjg63t
camarillonican4: Brand New Sealed - Apple iPhone 6S Plus - 64GB - Rose Gold - UNLOCKED http://t.co/ygFPoN7pLO http://t.co/mqXKJmmrNz
Dekho00: RT #PAPIGFUNK: Giveaway ENDING on Sunday! Enter Now- iPhone 6S Plus - Rose Gold - Unboxing + Giveaway! https://t.co/02ONZ6D8IS #iPhone6SPlu…
souravmishra1: RT #RHIndia: "Only in art will the lion lie down with the lamb, and the rose grow without thorn." - Martin Amis #RandomAmis http://t.co/MT…
exol_lzw0112: RT #DOThFanclub: [Preview] 151009 ONE K Concert (cr.Like a star, Lovely Rose, Chibimori)
อันนยอง~~ http://t.co/w8vvE40dEK
BruhninhaD: Livro: Hugo & Rose da Editora Agir
Será o correto deixar a realidade para viver um sonho?http://t.co/OfxkKrzBog #books #book #livros #blog
This was a known issue with the twitter gem; using an updated version from GitHub solved the problem.
https://github.com/sferik/twitter/issues/709
I am trying to build an Access process to add contacts to an Outlook folder. I have linked the folder and can add, update, and delete records, but not all of the fields show up correctly in Outlook, namely the address field.
I added a test contact with an address in Outlook, then went back into Access and mimicked the data exactly, but no address shows up in Outlook.
Is there something that needs to be done for addresses to show up in Outlook?
Here is my data:
First Last Title Company Department Office Post Office Box Address City State Zip/Postal Code Country/Region Phone
John Test superduper 500 west T Test City MI 99999 United States of America 1 800 555 5555
Bill Test Awesomedawesome 600 East G Test City MI 99999 United States of America 1 800 666 6666
The first record was added from Outlook; the lower one was added from Access.
Here is the view I get in Outlook:
I ended up going the code route:
Dim olCI As Outlook.ContactItem

' Create a new contact item in the target folder and map the recordset fields onto it.
Set olCI = mf.Items.Add(olContactItem)
With olCI
    .FullName = Trim(rs!Name)
    .Title = Trim(rs!Salutation)
    .JobTitle = Trim(rs!Title)
    .Email1Address = Trim(rs!Email)
    .CompanyName = Trim(rs!AccountName)
    .BusinessAddressStreet = Trim(rs!MailingStreet)
    .BusinessAddressCity = Trim(rs!MailingCity)
    .BusinessAddressPostalCode = Trim(rs!MailingZipCode)
    .BusinessAddressCountry = Trim(rs!MailingCountry)
    .BusinessFaxNumber = Trim(rs!Fax)
    .BusinessTelephoneNumber = Trim(rs!Phone)
    .OtherTelephoneNumber = Trim(rs!OtherPhone)
    .BusinessHomePage = ""
    .MobileTelephoneNumber = Trim(rs!MobilePhone)
    .Birthday = IIf(IsNull(rs!Birthdate), 0, rs!Birthdate)
    .Department = rs!Department
    .Save
End With
I have a non-database-backed class in Ruby:
class User
  attr_accessor :countries
end
I want countries to simply be an array of ISO country codes (US, GB, CA, AU, etc) and I don't want to build a separate model to hold each. Is there a magic way to make Ruby understand that :countries is an array and treat it accordingly, or do I need to write the countries and countries= methods?
I tried just setting the countries array with user.countries = ['US'], and I'm getting a NoMethodError.
The type of a variable doesn't matter in Ruby.
attr_accessor just creates getter and setter methods that set and return instance variables; @countries in this case. You can set the instance variable to your array, or use the setter:
class User
  attr_accessor :countries

  def initialize
    @countries = %w[Foo Bar Baz]
    # Or...
    self.countries = %w[Foo Bar Baz]
  end
end
> User.new.countries
=> ["Foo", "Bar", "Baz"]
Personally I prefer using the instance variable instead of self.xxx; it's too easy to forget the self. bit and you end up setting a local variable, leaving the instance variable nil. I also think it's ugly.
If the countries won't be changing between instances, why not a constant?
Edit/Clarification
Tadman's point is well-taken, e.g., this diatribe on state. The circumstances under which I don't care about that are limited to small, self-controlled, stand-alone classes. There are inherent risks in making those assumptions; the level of those risks is project-dependent.
Looks like countries should be a constant:
class User
COUNTRIES = %w(
AF AX AL DZ AS AD AO AI AQ AG AR AM AW AU AT AZ BS BH BD BB BY BE BZ BJ BM
BT BO BQ BA BW BV BR IO BN BG BF BI KH CM CA CV KY CF TD CL CN CX CC CO KM
CG CD CK CR CI HR CU CW CY CZ DK DJ DM DO EC EG SV GQ ER EE ET FK FO FJ FI
FR GF PF TF GA GM GE DE GH GI GR GL GD GP GU GT GG GN GW GY HT HM VA HN HK
HU IS IN ID IR IQ IE IM IL IT JM JP JE JO KZ KE KI KP KR KW KG LA LV LB LS
LR LY LI LT LU MO MK MG MW MY MV ML MT MH MQ MR MU YT MX FM MD MC MN ME MS
MA MZ MM NA NR NP NL NC NZ NI NE NG NU NF MP NO OM PK PW PS PA PG PY PE PH
PN PL PT PR QA RE RO RU RW BL SH KN LC MF PM VC WS SM ST SA SN RS SC SL SG
SX SK SI SB SO ZA GS SS ES LK SD SR SJ SZ SE CH SY TW TJ TZ TH TL TG TK TO
TT TN TR TM TC TV UG UA AE GB US UM UY UZ VU VE VN VG VI WF EH YE ZM ZW
).freeze
end
User::COUNTRIES.include? "US" #=> true
freeze prevents modifications:
User::COUNTRIES.delete "US" #=> RuntimeError: can't modify frozen Array
Update
The problem here is that your countries array has to be persisted somehow. You mention has_many, so Rails seems to be involved. You can use ActiveRecord's serialize method:
class User < ActiveRecord::Base
  serialize :countries
end
This will save the countries attribute to the database as an object and retrieve it as such:
u = User.new
u.countries = ["US", "CA"]
u.save
u = User.last
u.countries
#=> ["US", "CA"]
It's converted to and from YAML internally, so the users table looks like:
mysql> SELECT * FROM users;
+----+-------------------+---------------------+---------------------+
| id | countries | created_at | updated_at |
+----+-------------------+---------------------+---------------------+
| 1 | ---\n- US\n- CA\n | 2013-09-24 18:24:03 | 2013-09-24 18:24:03 |
+----+-------------------+---------------------+---------------------+
1 row in set (0,00 sec)