Julia doing something strange with assignments - debugging

I am trying to learn Julia by repeating some of the easy ProjectEuler problems in Julia. Everything has been really smooth so far, up until I encountered this frustrating problem. I spent some time debugging my code, and here's what I found:
(Hopefully I'm not missing something really stupid here)
function is_abundant(n::Int) #just a function
    return prod(map(x->int((x[1]^(x[2]+1)-1)/(x[1]-1)),factor(n))) > 2 * n
end
abundants=[12] #there should be a better way to initialize an Array
for i=13:28120
    if is_abundant(i)
        push!(abundants,i)
    end
end
le=abundants; #The following lines are the problems
ri=abundants;
d=length(abundants)
println(d)
pop!(le)
shift!(ri)
println(le==ri, " ", endof(ri), " ", endof(abundants))
The output I get is:
6964
true 6962 6962
which means that Julia has changed all three of le, ri and abundants with each pop! and shift! call. I was able to work around this bug/problem by using a dumb extra identity mapping:
le=map(x->x,abundants)
ri=map(x->x,abundants)
Now the output would change to what I initially expected:
6964
false 6963 6964
My question is: if this is not a bug, why does Julia keep le, ri and abundants tied to each other in the first place? Also, can anyone reproduce this behaviour? I am using Julia "Version 0.3.0-rc3+14 (2014-08-13 16:01 UTC)" on Ubuntu 14.04.

le and ri both point to the same list that abundants points to, so this is expected behavior - they are all operating on the same memory. This part of the manual might help you understand. Or possibly the MATLAB differences section, as it is different in MATLAB (but most other languages are like Julia).
For
abundants=[12] #there should be a better way to initialize an Array
how about
abundants = {} # Vector of anything
or
abundants = Int[] # Vector of ints
and instead of your map(x->x,...), you can just use copy.


gensim/models/ldaseqmodel.py:217: RuntimeWarning: divide by zero encountered in double_scalars

/Users/Barry/anaconda/lib/python2.7/site-packages/gensim/models/ldaseqmodel.py:217: RuntimeWarning: divide by zero encountered in double_scalars
convergence = np.fabs((bound - old_bound) / old_bound)
#dynamic topic model
def run_dtm(num_topics=18):
    docs, years, titles = preprocessing(datasetType=2)
    #resort document by years
    Z = zip(years, docs)
    Z = sorted(Z, reverse=False)
    years_new, docs_new = zip(*Z)
    #generate time slice
    time_slice = Counter(years_new).values()
    for year in Counter(years_new):
        print year,' --- ',Counter(years_new)[year]
    print '********* data set loaded ********'
    dictionary = corpora.Dictionary(docs_new)
    corpus = [dictionary.doc2bow(text) for text in docs_new]
    print '********* train lda seq model ********'
    ldaseq = ldaseqmodel.LdaSeqModel(corpus=corpus, id2word=dictionary, time_slice=time_slice, num_topics=num_topics)
    print '********* lda seq model done ********'
    ldaseq.print_topics(time=1)
Hey guys, I'm using the dynamic topic models in the gensim package for topic analysis, following this tutorial, https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/ldaseqmodel.ipynb, but I always get the same unexpected error. Can anyone give me some guidance? I'm really puzzled, even though I have tried several different datasets for generating the corpus and dictionary.
The error is like this:
/Users/Barry/anaconda/lib/python2.7/site-packages/gensim/models/ldaseqmodel.py:217: RuntimeWarning: divide by zero encountered in double_scalars
convergence = np.fabs((bound - old_bound) / old_bound)
The warning comes from NumPy: the division inside the np.fabs(...) expression hit a zero denominator. What NumPy and gensim versions are you using?
NumPy no longer supports Python 2.7, and Ldaseq was added to Gensim in 2016, so you might just not have a compatible version available. If you are recoding a Python 3+ tutorial to a 2.7 variant, you obviously understand a little bit about the version differences - try running it in a, say, 3.6.8 environment (you will have to upgrade sometime anyway, 2020 is the end of 2.7 support from Python itself). That might already help, I've gone through the tutorial and did not encounter this with my own data.
That being said, I have encountered the same error before when running LdaMulticore, and it was caused by an empty corpus.
Instead of running your code entirely inside a function, can you try to go through it line by line (or look at your DEBUG level log) and check whether your output has the expected properties: for example, that your corpus is not empty (and does not contain empty documents)?
If that happens, fix the preprocessing steps and try again - that at least helped me and helped with the same ldamodel error in the mailing list.
PS: not commenting because I lack the reputation, feel free to edit this.
This is an issue with the source code of ldaseqmodel.py itself.
With the latest gensim package (version 3.8.3) I am getting the same warning at line 293:
ldaseqmodel.py:293: RuntimeWarning: divide by zero encountered in double_scalars
convergence = np.fabs((bound - old_bound) / old_bound)
Now, if you go through the code, you will see that it divides the difference between bound and old_bound by old_bound (this is also visible from the warning):
convergence = np.fabs((bound - old_bound) / old_bound)
If you analyze further, you will see that at line 263 old_bound is initialized to zero, and this is the main reason you are getting the divide-by-zero warning.
For further information, I put a print statement at line 294:
print('bound = {}, old_bound = {}'.format(bound, old_bound))
The output I received is: [screenshot of the printed values]
In short, you are getting this warning because of the source code of ldaseqmodel.py itself, not because of any empty document. However, if you do not remove the empty documents from your corpus, you will receive another warning. So I suggest removing any empty documents from your corpus and simply ignoring the above division-by-zero warning.

Native Vim Random number script

I know that there are various ways to get random numbers, eg, from the shell. However, I'm running vim on an android phone with very little compiled in. Also, it does not have to be rigorously random. The point is, what's an interesting, or concise, or fast (that is, with vim native functions), or short way to get a sequence of reasonably good random numbers in Vim?
Try something like
function Rand()
    return str2nr(matchstr(reltimestr(reltime()), '\v\.@<=\d+')[1:])
endfunction
I know of no better option than using one of the time functions (there are two of them: reltime() and localtime(), but the latter is only updated once per second). I would prefer to either avoid random numbers or use pyeval('random.randint(1, 10)') (preceded by python import random), because the shell is slow and I don't trust time-based solutions.
Note: the documentation says that the format of the item returned by reltime() depends on the system, so I am using reltimestr() rather than doing something with reltime()[1], which looks as if it contains nanoseconds.
I've recently played around with random numbers in Vim script myself. Here are some resources that I found in the process.
No Vim script
By all means, use an external random number generator if you can. As a rule, they are better and faster than anything that could be done in Vim script.
For example, try
:python import random; print random.randrange(1, 7)
:echo system('echo $RANDOM')
another scripting language, for example Ruby
Libraries
Vim script libraries. These hopefully strive to provide decent quality RNG implementations.
vital.vim is an excellent and comprehensive library created by the vim-jp user group. Their random number generator sports an impressive array of functionality and is the best pure Vim script RNG I know of. vital.vim uses an Xorshift algorithm. Check it out!
Rolling a die with vital.vim:
let Random = vital#of('vital').import('Random')
echo Random.range(1, 7)
vim-rng is a small random number generator plugin. It exports a couple of global functions that rely on a multiply-with-carry algorithm. This project seems to be a work in progress.
Rolling a die with rng:
echo RandomNumber(1, 6)
magnum.vim is my own little big integer library. I've recently added a random number generator that generates integers of any size. It uses the XORSHIFT-ADD algorithm.
Rolling a die with magnum.vim:
let six = magnum#Int(6)
echo magnum#random#NextInt(six).Add(magnum#ONE).Number()
Rndm has been around for much longer than the other libraries. Its functionality is exposed as a couple of global functions. Rolling a die with Rndm:
echo Urndm(1, 6)
Discussion and snippets
Finally, a few links to insightful discussion and Vim script snippets.
ZyX's reltime snippet on this page.
loreb's vimprng project on GitHub has an impressive number of RNG implementations in Vim script. Very useful.
This old mailing list discussion has a couple of Vim script snippets. The first one given by Bee-9 is limited to 16 bit but I found it quite effective. Here it is:
let g:rnd = localtime() % 0x10000
function! Random(n) abort
let g:rnd = (g:rnd * 31421 + 6927) % 0x10000
return g:rnd * a:n / 0x10000
endfunction
Another script, found in a person named Bart's personal config files.
Episode 57 on Vimcasts.org discusses Vim's 'expression register' and refers to random number examples throughout. Refers to this Stackoverflow question and ZyX's snippet. Recommended.
The Vim wiki on wikia has an article 'Jump to a random line' that has a few resources not mentioned yet.
Based on others' answers and other resources from the internet, I have written two functions to generate a random integer in the given range [Low, High]. Both functions take two arguments, Low and High, and return a random number in this range.
Combine Python and Vim script
The first function combines Python and Vim script.
" generate a random integer from range [Low, High] using Python
function! RandInt(Low, High) abort
" if you use Python 3, the python block should start with `python3` instead of
" `python`, see https://github.com/neovim/neovim/issues/9927
python3 << EOF
import vim
import random
# using vim.eval to import variable outside Python script to python
idx = random.randint(int(vim.eval('a:Low')), int(vim.eval('a:High')))
# using vim.command to export variable inside Python script to vim script so
# we can return its value in vim script
vim.command("let index = {}".format(idx))
EOF
return index
endfunction
Pure Vim script
The second function I propose uses pure vim script:
function! RandInt(Low, High) abort
let l:milisec = str2nr(matchstr(reltimestr(reltime()), '\v\.\zs\d+'))
return l:milisec % (a:High - a:Low + 1) + a:Low
endfunction
Use luaeval() (Neovim only)
The third way to generate a random number is to use Lua via luaeval().
" math.randomseed() is needed to make random() generate different numbers
" on each use. Otherwise, the first number it generates seems to be the same every time.
call luaeval('math.randomseed(os.time())')
let num = luaeval('math.random(1, 10)')
If you want to generate random numbers for non-serious purposes, you may use
these methods as a starting point.

Applying a diff-patch to a string/file

For an offline-capable smartphone app, I'm creating a one-way text sync for Xml files. I'd like my server to send the delta/difference (e.g. a GNU diff-patch) to the target device.
This is the plan:
Time = 0
    Server: has version_1 of Xml file (~800 kiB)
    Client: has version_1 of Xml file (~800 kiB)
Time = 1
    Server: has version_1 and version_2 of Xml file (each ~800 kiB)
            computes delta of these versions (=patch) (~10 kiB)
            sends patch to Client (~10 kiB transferred)
    Client: computes version_2 from version_1 and patch <= this is the problem =>
Is there a Ruby library that can do this last step to apply a text patch to files/strings? The patch can be formatted as required by the library.
Thanks for your help!
(I'm using the Rhodes Cross-Platform Framework, which uses Ruby as programming language.)
Your first task is to choose a patch format. The hardest format for humans to read (IMHO) turns out to be the easiest format for software to apply: the ed(1) script. You can start off with a simple /usr/bin/diff -e old.xml new.xml to generate the patches; diff(1) will produce line-oriented patches but that should be fine to start with. The ed format looks like this:
36a
<tr><td class="eg" style="background: #182349;"> </td><td><tt>#182349</tt></td></tr>
.
34c
<tr><td class="eg" style="background: #66ccff;"> </td><td><tt>#xxxxxx</tt></td></tr>
.
20,23d
The numbers are line numbers, line number ranges are separated with commas. Then there are three single letter commands:
a: add the next block of text at this position.
c: change the text at this position to the following block. This is equivalent to a d followed by an a command.
d: delete these lines.
You'll also notice that the line numbers in the patch go from the bottom up, so you don't have to worry about changes messing up the line numbers in subsequent chunks of the patch. The actual chunks of text to be added or changed follow the commands as a sequence of lines terminated by a line with a single period (i.e. /^\.$/ or patch_line == '.' depending on your preference). In summary, the format looks like this:
[line-number-range][command]
[optional-argument-lines...]
[dot-terminator-if-there-are-arguments]
So, to apply an ed patch, all you need to do is load the target file into an array (one element per line), parse the patch using a simple state machine, call Array#insert to add new lines and Array#delete_at to remove them. Shouldn't take more than a couple dozen lines of Ruby to write the patcher and no library is needed.
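The state machine described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the method name apply_ed_patch is made up for this example, only the a, c and d commands are handled, and the escaping that diff -e uses for a text line consisting of a single period is ignored.

```ruby
# Hypothetical helper (not from any library): apply an ed-style patch,
# as produced by `diff -e`, to an array of lines. Returns a new array.
def apply_ed_patch(lines, patch)
  result = lines.dup
  patch_lines = patch.lines.map(&:chomp)
  i = 0
  while i < patch_lines.length
    m = /\A(\d+)(?:,(\d+))?([acd])\z/.match(patch_lines[i])
    raise ArgumentError, "unsupported ed command: #{patch_lines[i]}" unless m
    first = m[1].to_i
    last = (m[2] || m[1]).to_i
    command = m[3]
    i += 1

    # Collect the text block that follows an a or c command, up to the lone dot.
    block = []
    if command != 'd'
      while i < patch_lines.length && patch_lines[i] != '.'
        block << patch_lines[i]
        i += 1
      end
      i += 1 # skip the terminating "."
    end

    case command
    when 'a' then result.insert(first, *block)               # add after line `first`
    when 'd' then result.slice!(first - 1, last - first + 1) # delete the range
    when 'c' then result[(first - 1)..(last - 1)] = block    # delete, then add
    end
  end
  result
end

old_lines = %w[one two three four five]
patch = "4,5d\n2c\nTWO\n.\n0a\nzero\n.\n"
p apply_ed_patch(old_lines, patch) # ["zero", "one", "TWO", "three"]
```

Because the commands arrive bottom-up, each one can be applied directly to the array without adjusting the line numbers of the commands that follow.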
If you can arrange your XML to come out like this:
<tag>
  blah blah
</tag>
<other-tag x="y">
  mumble mumble
</other-tag>
rather than:
<tag>blah blah</tag><other-tag x="y">mumble mumble</other-tag>
then the above simple line-oriented approach will work fine; the extra EOLs aren't going to cost much space so go for easy implementation to start.
There are Ruby libraries for producing diffs between two arrays (google "ruby algorithm::diff" to start). Combining a diff library with an XML parser will let you produce patches that are tag-based rather than line-based and this might suit you better. The important thing is the choice of patch formats, once you choose the ed format (and realize the wisdom of the patch working from the bottom to the top) then everything else pretty much falls into place with little effort.
I know this question is almost five years old, but I'm going to post an answer anyway. When searching for how to make and apply patches for strings in Ruby, even now, I was unable to find any resources that answer this question satisfactorily. For that reason, I'll show how I solved this problem in my application.
Making Patches
I'm assuming you're using Linux, or else have access to the program diff through Cygwin. In that case, you can use the excellent Diffy gem to create ed script patches:
patch_text = Diffy::Diff.new(old_text, new_text, :diff => "-e").to_s
Applying Patches
Applying patches is not quite as straightforward. I opted to write my own algorithm, ask for improvements in Code Review, and finally settle on using the code below. This code is identical to 200_success's answer except for one change to improve its correctness.
require 'stringio'
def self.apply_patch(old_text, patch)
  text = old_text.split("\n")
  patch = StringIO.new(patch)
  current_line = 1
  while patch_line = patch.gets
    # Grab the command
    m = %r{\A(?:(\d+))?(?:,(\d+))?([acd]|s/\.//)\Z}.match(patch_line)
    raise ArgumentError.new("Invalid ed command: #{patch_line.chomp}") if m.nil?
    first_line = (m[1] || current_line).to_i
    last_line = (m[2] || first_line).to_i
    command = m[3]
    case command
    when "s/.//"
      (first_line..last_line).each { |i| text[i - 1].sub!(/./, '') }
    else
      if ['d', 'c'].include?(command)
        text[first_line - 1 .. last_line - 1] = []
      end
      if ['a', 'c'].include?(command)
        current_line = first_line - (command=='a' ? 0 : 1) # Adds are 0-indexed, but Changes and Deletes are 1-indexed
        while (patch_line = patch.gets) && (patch_line.chomp! != '.') && (patch_line != '.')
          text.insert(current_line, patch_line)
          current_line += 1
        end
      end
    end
  end
  text.join("\n")
end

Fastest way to skip lines while parsing files in Ruby?

I tried searching for this, but couldn't find much. It seems like something that's probably been asked before (many times?), so I apologize if that's the case.
I was wondering what the fastest way to parse certain parts of a file in Ruby would be. For example, suppose I know the information I want for a particular function is between lines 500 and 600 of, say, a 1000-line file. (Obviously this kind of question is geared toward much larger files; I'm just using those smaller numbers for the sake of example.) Since I know it won't be in the first half, is there a quick way of disregarding that information?
Currently I'm using something along the lines of:
while buffer = file_in.gets and file_in.lineno <600
  next unless file_in.lineno > 500
  if buffer.chomp!.include? some_string
    do_func_whatever
  end
end
It works, but I just can't help but think it could work better.
I'm very new to Ruby and am interested in learning new ways of doing things in it.
file.lines.drop(500).take(100) # will get you lines 501-600
Generally, you can't avoid reading the file from the start up to the line you are interested in, because each line can have a different length. The one thing you can avoid, though, is loading the whole file into a big array. Just read line by line, counting, and discard lines until you reach what you are looking for. Pretty much like your own example. You can just make it more Rubyish.
PS. the Tin Man's comment made me do some experimenting. While I didn't find any reason why drop would load the whole file, there is indeed a problem: drop returns the rest of the file in an array. Here's a way this could be avoided:
file.lines.select.with_index{|l,i| (500..599) === i} # with_index counts from 0, so this is lines 501-600
PS2: Doh, the above code, while not making a huge array, iterates through the whole file, even the lines after 600. :( Here's a third version:
enum = file.lines
500.times{enum.next} # skip 500
enum.take(100) # take the next 100
or, if you prefer FP:
file.lines.tap{|enum| 500.times{enum.next}}.take(100)
Anyway, the good point of this monologue is that you can learn multiple ways to iterate a file. ;)
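For readers on a current Ruby, here is a hedged sketch of the same idea using File.foreach and a lazy enumerator; as far as I know, IO#lines has been deprecated since Ruby 2.0 and is no longer available in Ruby 3. The method name line_range and the temporary sample file are made up for this example.

```ruby
require 'tempfile'

# Hypothetical helper: return lines first_line..last_line (1-based, inclusive)
# without building an array of the whole file and without reading past last_line.
def line_range(path, first_line, last_line)
  File.foreach(path).lazy.drop(first_line - 1).first(last_line - first_line + 1)
end

# Build a 1000-line sample file and pull out lines 501-600.
Tempfile.create('sample') do |f|
  1000.times { |n| f.puts "line #{n + 1}" }
  f.flush
  wanted = line_range(f.path, 501, 600)
  puts wanted.first # line 501
  puts wanted.size  # 100
end
```

The lazy chain still reads the first 500 lines, since there is no way around that for variable-length lines, but it discards them one at a time and stops as soon as the hundredth wanted line has been read.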
I don't know if there is an equivalent way of doing this for lines, but you can use seek or the offset argument on an IO object to "skip" bytes.
See IO#seek, or see IO#open for information on the offset argument.
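Seeking only helps if you already know the byte offset of the line you want. One way to get there, sketched below under the assumption that the same file is read many times, is to record the starting offset of every line once and then jump with IO#seek on later reads. The helper names line_offsets and read_lines_at are made up for this example.

```ruby
require 'tempfile'

# Hypothetical helper: one full pass that records the byte offset
# at which each line starts.
def line_offsets(path)
  offsets = []
  File.open(path, 'rb') do |f|
    until f.eof?
      offsets << f.pos
      f.gets
    end
  end
  offsets
end

# Hypothetical helper: jump straight to first_line (1-based) using the
# prebuilt index and read up to count lines from there.
def read_lines_at(path, offsets, first_line, count)
  File.open(path, 'rb') do |f|
    f.seek(offsets[first_line - 1])
    Array.new(count) { f.gets }.compact
  end
end

Tempfile.create('sample') do |f|
  1000.times { |n| f.puts "line #{n + 1}" }
  f.flush
  index = line_offsets(f.path)
  chunk = read_lines_at(f.path, index, 501, 100)
  puts chunk.first # line 501
  puts chunk.last  # line 600
end
```

Building the index costs one full read, so this only pays off when the file is stable and queried repeatedly.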
Sounds like rio might be of help here. It provides you with a lines() method.
You can use IO#readlines, that returns an array with all the lines
IO.readlines(file_in)[500..600].each do |line|
  #line is each line in the file (including the last \n)
  #stuff
end
or
f = File.new(file_in)
f.readlines[500..600].each do |line|
  #line is each line in the file (including the last \n)
  #stuff
end

Ruby delete method (string manipulation)

I'm new to Ruby, and have been working my way through Mr Neighborly's Humble Little Ruby Guide. There have been a few typos in the code examples along the way, but I've always managed to work out what's wrong and subsequently fix it - until now!
This is really basic, but I can't get the following example to work on Mac OS X (Snow Leopard):
gone = "Got gone fool!"
puts "Original: " + gone
gone.delete!("o", "r-v")
puts "deleted: " + gone
Output I'm expecting is:
Original: Got gone fool!
deleted: G gne fl!
Output I actually get is:
Original: Got gone fool!
deleted: Got gone fool!
The delete! method doesn't seem to have had any effect.
Can anyone shed any light on what's going wrong here? :-\
The String#delete method (documented here) treats each of its arguments as a set of characters and deletes only the characters in the intersection of those sets.
The intersection of two sets is all characters that are common to both. So your original call gone.delete!("o", "r-v") would effectively become
gone.delete ['o'] & ['r','s','t','u','v']
There are no characters present in both arrays so the deletion would get an empty array, hence no characters are deleted.
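The intersection behaviour is easy to check in irb. A small sketch using the string from the question; the third call is an extra example, not from the guide, showing a case where two arguments are actually useful.

```ruby
gone = "Got gone fool!"

# Two arguments: only characters in BOTH sets are deleted. "o" is not in r-v,
# so the intersection is empty and nothing is removed.
p gone.delete("o", "r-v")    # "Got gone fool!"

# One argument: "o" plus the range r-v form a single set.
p gone.delete("or-v")        # "G gne fl!"

# Intersection doing real work: lowercase letters that are NOT in n-z, i.e. a-m.
p gone.delete("a-z", "^n-z") # "Got on oo!"

# The bang variant returns nil when it changed nothing.
p gone.dup.delete!("o", "r-v") # nil
```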
I changed
gone.delete!("o", "r-v")
to
gone.delete!("or-v")
and it works fine.
You can get the same output another way, for example with gsub:
puts "deleted: " + gone.gsub(/[or-v]/, '')
Output:
deleted: G gne fl!
