Why does slicing from a higher negative index to a smaller negative index give an empty string? - slice

Can someone please explain why this gives an empty string as its output and not 'oht'?
word='python'
word[-2:-5]

You will get your expected result by running the code below:
>>> word[-2:1:-1]
'oht'
Slicing normally runs from left to right, because the default step is 1. For example,
word='python'
word[-2:]
results in
'on'
and
word[:-5]
results in
'p'
Combining the two,
word[-2:-5]
returns an empty string. With the default step, the start index must refer to a position to the left of the stop index. In your question -2 (the 'o') lies to the right of -5 (the 'y'), so nothing is selected.
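To see how the negative indices are resolved, here is a small sketch using the built-in slice.indices method, which converts a slice into concrete (start, stop, step) values for a given length. Passing a step of -1 with the original bounds is one way to get 'oht'; it is equivalent to the word[-2:1:-1] form shown above.

```python
word = "python"

# slice.indices(len) resolves negative indices to concrete (start, stop, step).
print(slice(-2, -5).indices(len(word)))      # start 4, stop 1, step 1
print(repr(word[-2:-5]))                     # '' because position 4 is already past position 1

# A negative step walks right to left, so the same bounds now select characters.
print(slice(-2, -5, -1).indices(len(word)))  # start 4, stop 1, step -1
print(word[-2:-5:-1])                        # oht
```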

Related

ElasticSearch threshold for search results

I have an elasticsearch query that returns me the correct results in sorted order (the highest relevancy is at the top and is accurate). However, the query also returns me a lot of results and beyond the top 4 or 5, the results seem less relevant.
My question is: how do I set a threshold such that only the most relevant results are returned by the query?
You can use the size parameter in your Elasticsearch query to limit the number of results returned. In your example, if you consider only the top 5 results relevant, set size to 5.
Note that Elasticsearch already sorts results by score by default, so size 5 returns the 5 most relevant documents.
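As a rough sketch, the request body could be built like this in Python. The field name "title" and the search text are placeholders, not taken from the question. The second body also sets min_score, a search parameter that drops hits scoring below a cutoff; the value 1.5 is arbitrary, and since scores are not normalized a sensible cutoff would have to be found by experiment.

```python
import json

# Hypothetical query: the field name "title" and the search text are placeholders.
search_body = {
    "size": 5,  # return at most the 5 highest-scoring hits
    "query": {"match": {"title": "example search text"}},
}

# Alternative: keep "size" and also drop hits scoring below a cutoff.
# The value 1.5 is arbitrary and only illustrates the shape of the request.
thresholded_body = {**search_body, "min_score": 1.5}

print(json.dumps(thresholded_body, indent=2))
```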

How does one calculate the integral image from original?

I tried separating the individual channels of the image and then calculating the integral of each one with the recurrence below. At the end, I joined the three channels:
function [ii] = computeIntegralImage(image)
%function to compute integral from original image
iip=zeros(size(image,1)+1,size(image,2)+1);
jjp=zeros(size(image,1)+1,size(image,2)+1);
kkp=zeros(size(image,1)+1,size(image,2)+1);
for i=2:size(iip,1)
    for j=2:size(iip,2)
        iip(i,j)=image(i-1,j-1,1)+iip(i,j-1)+iip(i-1,j)-iip(i-1,j-1);
    end
end
for i=2:size(jjp,1)
    for j=2:size(jjp,2)
        jjp(i,j)=image(i-1,j-1,2)+jjp(i,j-1)+jjp(i-1,j)-jjp(i-1,j-1);
    end
end
for i=2:size(kkp,1)
    for j=2:size(kkp,2)
        kkp(i,j)=image(i-1,j-1,3)+kkp(i,j-1)+kkp(i-1,j)-kkp(i-1,j-1);
    end
end
ii= cat(3,iip,jjp,kkp);
The output of MATLAB's integralImage function is completely white:
My output is a colorful image:
The integral image can be easily computed by first integrating over one axis, then integrating the result over the other axis. This 1D integral is computed with cumsum:
out = cumsum(image,1);
out = cumsum(out,2);
Note that if image is of an integer type, this is likely going to lead to overflow. You should convert such an array to double first.
Finally, to display the result you need to use
imshow(out,[])
otherwise you don’t see the full range of the data, and anything above 1 becomes white, as you saw with MATLAB’s result.
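The same two-pass idea can be sketched outside MATLAB. The Python version below uses itertools.accumulate as a stand-in for cumsum and works on a single channel given as a list of rows; the function name integral_image and the sample data are made up for illustration. Python integers do not overflow, so no conversion step is needed here.

```python
from itertools import accumulate

def integral_image(channel):
    """Integral image of one channel given as a list of rows of numbers."""
    # Cumulative sum along each row (the second axis).
    row_sums = [list(accumulate(row)) for row in channel]
    # Cumulative sum down each column (the first axis):
    # transpose, accumulate, then transpose back.
    col_sums = [list(accumulate(col)) for col in zip(*row_sums)]
    return [list(row) for row in zip(*col_sums)]

pixels = [
    [1, 2, 3],
    [4, 5, 6],
]
print(integral_image(pixels))  # [[1, 3, 6], [5, 12, 21]]
```

The order of the two passes does not matter: summing rows first and columns second gives the same result as the reverse.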
Regarding your code:
The problem is overflow. Convert each value taken from the input image to double before adding it. MATLAB integer arithmetic saturates, so uint8(150)+150 == uint8(255). This leads to the alternating rows and columns you see: in one step you subtract a large value from the partial sums, giving a small value; in the next step you subtract a small value, giving a large value; and so on.
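The effect can be imitated with a small Python sketch. It assumes every intermediate result is clamped to 0..255, which is how saturating uint8 arithmetic behaves when one operand is uint8; the helper names sat_u8 and integral_with_saturation are invented for this illustration. On a flat 3x3 image of 150s the correct integral values would be 150, 300, 450 and so on, but the clamped recurrence produces an irregular pattern instead.

```python
def sat_u8(x):
    """Clamp to 0..255, imitating saturating uint8 arithmetic."""
    return max(0, min(255, x))

def integral_with_saturation(channel):
    rows, cols = len(channel), len(channel[0])
    # Zero-padded by one row and one column, like iip in the question.
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Each intermediate result is clamped, left to right.
            acc = sat_u8(channel[i - 1][j - 1] + ii[i][j - 1])
            acc = sat_u8(acc + ii[i - 1][j])
            ii[i][j] = sat_u8(acc - ii[i - 1][j - 1])
    return ii

flat = [[150] * 3 for _ in range(3)]
result = integral_with_saturation(flat)
for row in result:
    print(row)
```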
At first I was confused by your first row and column in the output, which remain at 0. But then I noticed that the output is one larger than the input, and you use this first column to avoid special cases.
Consider cropping the first row and column from your output.
Regarding loop order: It is faster when the inner loop is over the first dimension, as then the data is accessed in storage order and therefore uses the cache better. This should not affect the result, just the timing.

Does a reverse key index help if I use an incremental sequence to insert subsequent values?

I understand the basic rationale for a reverse key index: it reduces index contention. If I have 3 numbers in the index, 12345, 27999 and 30632, I can see that once these numbers are reversed, the next number in the sequence won't always hit the same leaf block.
But if the numbers were 12345, 12346, 12347, then the next numbers 12348, 12349 (incremented by 1) would hit the same leaf block even if the index is reversed:
54321,64321,74321,84321,94321.
So how is the reverse index helping me? It was supposed to help particularly when using sequences.
If we're talking about a sequence-generated value, you can't look at 5 values and draw too many conclusions. You need to think about the data that has already been inserted and the data that will be inserted in the future.
Assuming that your sequence started at 12345, the first 5 values would be inserted sequentially. But then the sixth value will be 12350. Reverse that and you get 05321 which would go to the far left of the index. Then you'd generate 12351. Reverse that to get 15321 and that's again toward the left-hand side of the index between the first value you generated (54321) and the most recent value (05321). As the sequence generates new values, they'll go further to the right until everything resets every 10 numbers and you're inserting into the far left-hand side of the index again.
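The pattern described above can be reproduced with a short Python sketch. It uses the same simplified model as the answer, reversing the decimal digits of a fixed-width number; a real database may reverse the stored bytes of the key instead, so this only illustrates the spreading effect. The helper name reverse_key and the width of 5 are assumptions for the example.

```python
def reverse_key(n, width=5):
    """Reverse the decimal digits of n, zero-padded to a fixed width."""
    return str(n).zfill(width)[::-1]

values = range(12345, 12355)  # ten consecutive sequence values
keys = [reverse_key(v) for v in values]
for v, k in zip(values, keys):
    print(v, "->", k)

# Consecutive values differ in their leading reversed digit, so they land
# in ten different regions of the sorted index rather than one hot block.
leading_digits = sorted(set(k[0] for k in keys))
print(leading_digits)
```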

Find Next and Previous Timestamps Within Ruby Array of Timestamps for a Given Timestamp

So if I have an array of timestamps like so (many more than this in reality):
2013-07-27 18:02:59.865572
2013-07-27 18:29:00.132601
2013-07-27 19:00:00.081585
2013-07-27 19:29:00.273857
2013-07-27 20:00:00.011761
And I wanted to find which two timestamps 2013-07-27 19:13:00.081585 falls between, what would be the most elegant way with Ruby?
I can envision an ugly bunch of loops and if statements to do this, but being a novice Ruby programmer I suspect there is a much more elegant way to do this (that I absolutely cannot find!).
Thanks!
It depends on a few things.
Whether the timestamp you're looking for is known to be in the array.
What between means.
Whether elements in the array are unique.
Let's assume the array is sorted, or that you'll sort it yourself beforehand.
If your_timestamp is known to be in the array, you can find its index with timestamp_array.index(your_timestamp). Logically, the elements your_timestamp is between will have indexes immediately above and below. There are two things to watch for.
Falling off either end of the array.
Duplicate timestamps.
If your_timestamp is either the first or last element in the array, you won't have an element with an index immediately below the first or immediately above the last.
If your array contains duplicate timestamps, you're liable to return your_timestamp as one of the values. It seems like you don't want to do that, but there isn't strictly a right or wrong answer here. It's application-dependent.
If you don't know whether your_timestamp is in the array, or if you don't want your_timestamp as one of the values (unless it's the first or last element of the sorted array, that is), then this might be a better approach.
answer = []
timestamp_array.sort.each_cons(2) { |ts|
  # If your desired timestamp is in the timestamp array, you'll
  # get at least two pairs of timestamps.
  answer.concat ts if your_desired_timestamp.between?(ts[0], ts[1])
}
# If you have more than 2 elements, return only the first and last element.
if answer.length > 2
  answer = answer.first, answer.last
end
p answer
["2013-07-27 18:29:00.132601", "2013-07-27 19:29:00.273857"]
This works correctly for duplicate timestamps, and there's no danger of falling off either end of the array.
Some optimizations are available. For example, you can switch to a binary search (bsearch method), which might be worthwhile if you have very large arrays; you can eliminate the conditional if answer.length > 2; etc.
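The binary search idea can be sketched as follows. This version is in Python with the standard bisect module rather than Ruby's bsearch, and it assumes the timestamps are strings in the fixed-width format shown in the question, so that plain string comparison matches chronological order. The function name neighbours is made up for the example.

```python
from bisect import bisect_left, bisect_right

def neighbours(sorted_stamps, target):
    """Return (previous, next) around target; None where no neighbour exists."""
    lo = bisect_left(sorted_stamps, target)    # first index with value >= target
    hi = bisect_right(sorted_stamps, target)   # first index with value > target
    previous = sorted_stamps[lo - 1] if lo > 0 else None
    following = sorted_stamps[hi] if hi < len(sorted_stamps) else None
    return previous, following

stamps = sorted([
    "2013-07-27 18:02:59.865572",
    "2013-07-27 18:29:00.132601",
    "2013-07-27 19:00:00.081585",
    "2013-07-27 19:29:00.273857",
    "2013-07-27 20:00:00.011761",
])
print(neighbours(stamps, "2013-07-27 19:13:00.081585"))
```

Using both bisect_left and bisect_right skips over the target itself and any duplicates of it, and the None values cover the cases where the target falls off either end of the array.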
Someone else left an answer and then retracted it for some reason (I think because it contained an error), but it led me in the right direction, as did #squiguy.
timestamp_array.sort.each_cons(2) { |a, b|
  puts a
  if a < your_desired_timestamp and b > your_desired_timestamp
    puts 'this is the valid range for ' + your_desired_timestamp.to_s
  end
  puts b
}
Thanks Guys and Gals!

Hash Table and Substring Matching

I have hundreds of keys for example like:
redapple
maninred
foraman
blueapple
I have data related to these keys. Each data item is a string that ends with its key.
redapple: the-tree-has-redapple
maninred: she-saw-the-maninred
foraman: they-bought-the-present-foraman
blueapple: it-was-surprising-but-it-was-a-blueapple
I am expected to use a hash table and a hash function to store the data by key, and to be able to retrieve data from the table.
I know how to use a hash function and a hash table; there is no problem here.
But:
I am also expected to give the program a string that occurs as a substring of the keys and retrieve the data for the matching keys.
For example:
I must give "red" and must be able to get
redapple: the-tree-has-redapple
maninred: she-saw-the-maninred
as output.
or
I must give "apple" and must be able to get
redapple: the-tree-has-redapple
blueapple: it-was-surprising-but-it-was-a-blueapple
as output.
The only approach I can think of is to search all keys for a matching substring. Is there some other solution? If I search all the key strings for every query, hashing is unnecessary and meaningless, isn't it?
But searching all keys for a substring is O(N), and I am expected to solve the problem in O(1).
With hashing I can hash a key, e.g. "redapple" to 943 and "maninred" to 332.
If the query gives the string "red", how can I find out from 943 and 332 that the keys contain the substring "red"? It is beyond my CS skills.
Thanks for any advice or ideas.
You could use an inverted index over n-grams; the same approach is used for spelling correction. For the word redapple you get the following set of 3-grams: red, eda, dap, app, ppl, ple. For each n-gram you keep a list of the strings that contain it. For example, for red it will be
red -> maninred, redapple
The words in each list must be kept in order. To find all strings that contain a given substring, divide the substring into n-grams and intersect the lists of words for those n-grams.
This algorithm is not O(n), but in practice it is fast enough.
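A minimal sketch of the n-gram approach, assuming 3-grams and using Python sets in place of ordered lists (set intersection then replaces the merge of sorted lists). The function names are invented for the example. Two details go beyond the description above: a final check with `in`, because sharing every n-gram does not by itself guarantee that the query occurs contiguously, and a full scan for queries shorter than the gram length.

```python
from collections import defaultdict

N = 3  # gram length, as in the 3-gram example

def ngrams(text, n=N):
    """Set of all substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def build_index(keys):
    """Map each n-gram to the set of keys that contain it."""
    index = defaultdict(set)
    for key in keys:
        for gram in ngrams(key):
            index[gram].add(key)
    return index

def search(index, keys, query):
    grams = ngrams(query)
    if not grams:
        # Query shorter than N: no gram to look up, so fall back to a scan.
        return sorted(k for k in keys if query in k)
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Verify each candidate, since shared grams need not be contiguous.
    return sorted(k for k in candidates if query in k)

keys = ["redapple", "maninred", "foraman", "blueapple"]
index = build_index(keys)
print(search(index, keys, "red"))    # ['maninred', 'redapple']
print(search(index, keys, "apple"))  # ['blueapple', 'redapple']
```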
It cannot be done nicely with a hash table. Given a substring, you cannot predict the hash of the entire string (1).
A reasonable alternative is a suffix tree. Each terminal in the suffix tree holds a list of references to the complete strings that its suffix belongs to.
Given a substring t, if it is indeed a substring of some s in your collection, then there is a suffix x of s such that t is a prefix of x. Traverse the suffix tree while reading t, then find all the terminals reachable from the node you end up at. These terminals contain all the needed strings.
(1) Assuming a reasonable hash function; if hashCode() == 0 for every element, you can obviously predict the hash value.
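The suffix idea can be illustrated with a naive suffix trie, which is simpler than a compressed suffix tree but needs far more memory (roughly quadratic in key length). As a shortcut that differs from the description above, this sketch stores the matching keys on every node instead of collecting them from the reachable terminals, so a lookup takes time proportional to the length of the query regardless of how many keys are stored. All names here are made up for the example.

```python
def new_node():
    return {"children": {}, "keys": set()}

def build_suffix_trie(keys):
    root = new_node()
    for key in keys:
        root["keys"].add(key)                  # the empty string matches every key
        for start in range(len(key)):          # insert every suffix of the key
            node = root
            for ch in key[start:]:
                node = node["children"].setdefault(ch, new_node())
                node["keys"].add(key)          # key contains the path spelled so far
    return root

def find(root, substring):
    """Return the set of keys that contain substring."""
    node = root
    for ch in substring:
        node = node["children"].get(ch)
        if node is None:
            return set()
    return node["keys"]

trie = build_suffix_trie(["redapple", "maninred", "foraman", "blueapple"])
print(sorted(find(trie, "red")))    # ['maninred', 'redapple']
print(sorted(find(trie, "apple")))  # ['blueapple', 'redapple']
```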
I researched this problem recently and I'm sure it cannot be done. Like you, I hoped a hash table would speed up the search, but I was disappointed.
