I'm building a Sinatra-based app for deployment on Heroku. You can imagine it like a standard URL shortener, but where old shortcodes expire and become available for new URLs (I realise this is a silly concept, but it's easier to explain this way). I'm representing the shortcode in my database as an integer and redefining its reader to give a nice short, unique string from the integer.
As some rows will be deleted, I've written code that goes through all the shortcode integers and picks the first free one to use, in a before_save callback. Unfortunately, I can make my code create two rows with identical shortcode integers if I run two instances very quickly one after another, which is obviously no good! How should I implement a locking system so that I can quickly save my record with a unique shortcode integer?
Here's what I have so far:
Chars = ('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a
CharLength = Chars.length

class Shorts < ActiveRecord::Base
  before_save :gen_shortcode
  after_save :done_shortcode

  def shortcode
    i = read_attribute(:shortcode).to_i
    return '0' if i == 0
    s = ''
    while i > 0
      s << Chars[i.modulo(CharLength)]
      i /= CharLength
    end
    s
  end

  private

  def gen_shortcode
    shortcode = 0
    self.class.find(:all, :order => "shortcode ASC").each do |s|
      if s.read_attribute(:shortcode).to_i != shortcode
        # Begin locking?
        break
      end
      shortcode += 1
    end
    write_attribute(:shortcode, shortcode)
  end

  def done_shortcode
    # End Locking?
  end
end
This line:
self.class.find(:all,:order=>"shortcode ASC").each
will do a sequential search over your entire record collection. You'd have to lock the entire table so that, when one of your processes is scanning for the next integer, the others will wait for the first one to finish. This will be a performance killer. My suggestion, if possible, is for the process to be as follows:
Add a column that indicates when a record has expired (do you expire them by time of creation? last use?). Index this column.
When you need to find the next lowest usable number, do something like
Shorts.find(:first, :conditions => {:expired => true}, :order => 'shortcode')
This will have the database doing the hard work of finding the lowest expired shortcode. Recall that find(:first, ...) returns only the first matching record.
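For step 1, a migration along these lines would do it (a sketch; the column and index names are my assumptions, not from your app):

# Sketch: adds the expired flag described in step 1 and indexes it together
# with shortcode, so the lookup below can be served from the index.
class AddExpiredToShorts < ActiveRecord::Migration
  def self.up
    add_column :shorts, :expired, :boolean, :default => false, :null => false
    add_index :shorts, [:expired, :shortcode]
  end

  def self.down
    remove_index :shorts, :column => [:expired, :shortcode]
    remove_column :shorts, :expired
  end
end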
Now, in order to prevent race conditions between processes, you can wrap this in a transaction and lock while doing the search:
Shorts.transaction do
  Shorts.find(:first, :conditions => {:expired => true}, :order => 'shortcode', :lock => true)
  # Do your thing here. Be quick about it, the row is locked while you work.
end # on ending the transaction the lock is released
Now when a second process starts looking for a free shortcode, it will block on the locked row until the first transaction finishes (and will then see that row's updated state). This is because the :lock => true parameter takes an exclusive (read/write) lock on the row.
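Putting it together, a minimal sketch (assuming the expired column above; NoFreeShortcodesError and the url attribute are placeholders, not from your schema):

Shorts.transaction do
  short = Shorts.find(:first, :conditions => {:expired => true},
                      :order => 'shortcode', :lock => true)
  raise NoFreeShortcodesError if short.nil? # placeholder error class
  # reuse the row while we still hold the exclusive lock
  short.update_attributes!(:expired => false, :url => new_url)
end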
Check this guide for more on locking with ActiveRecord.
I have a Ruby Discord (discordrb) bot written to manage D&D characters. I notice that when multiple players submit the same command at the same time, the results they each receive are not independent. The request of one player (assigning a weapon to their character) ends up being applied to the characters of other players who submitted the same request at the same time. I expected each request to be executed separately, in sequence. How do I prevent requests from crossing?
bot.message(contains: "$Wset") do |event|
  inputStr = event.content # this should contain "$Wset#" where # is a single digit
  check_user_or_nick(event) # fetch the value of @user
  pIndex = nil
  (0..(@player.length - 1)).each do |y| # find the @player index using 5 chars of @user
    if @player[y][0].index(@user.slice(0, 5)) == 0 then pIndex = y end # player index (integer or nil)
  end
  weaponInt = Integer(inputStr.slice(5, 1)) rescue false # detects integer or non-integer input
  if (pIndex != nil) && (weaponInt != false)
    if weaponInt < 6
      @player[pIndex][1] = weaponInt
      say = @player[pIndex][0].to_s + " weapon damage has been set to " + @weapon[(@player[pIndex][1])].to_s
    else
      say = "Sorry, $Wset requires this format: $Wset? where ? is a single number ( 0 to 5 )"
    end
  else
    say = "Sorry, $Wset requires this format: $Wset? where ? is a single number ( 0 to 5 )"
  end
  event.respond say
end
To avoid race conditions in multithreaded code like this, the main thing you want to look for is side effects.
Think about the bot.message(contains: "$Wset") do |event| block as a mini program or a thread. Everything in here should be self-contained; there should be no way for it to affect any other threads.
Looking through your code initially, what I'm searching for are any shared variables. These produce a race condition if they are read/written by multiple threads at the same time.
In this case, there are 2 obvious offenders: @player and @user. These should be refactored to local variables rather than instance variables. Define them within the block so they don't affect any other scope, for example:
# Note, for this to work, you will have to change
# the method definition to return [player, user]
player, user = check_user_or_nick(event)
Sometimes, side effects in threads are unavoidable (say you wanted a counter for how many times the handler has run). To prevent race conditions in those scenarios, a Mutex is generally the solution, or sometimes a distributed lock if the code is being run on multiple machines. However, from the code you've shown, it doesn't look like you need either of these things here.
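For illustration, a minimal sketch of that counter case (the counter itself is hypothetical, not something your bot needs):

run_count = 0
run_count_lock = Mutex.new

bot.message(contains: "$Wset") do |event|
  # synchronize makes the read-modify-write atomic across event threads
  run_count_lock.synchronize { run_count += 1 }
  # ... handler body using only local variables ...
end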
So, what I'm trying to do is make calls to a Reporting API to filter by all possible breakdowns (break the reports down by site, advertiser, ad type, campaign, etc.). But one issue is that the breakdowns can be unique to each login.
Example:
user1: alice123's reporting breakdowns are ["site","advertiser","ad_type","campaign","line_items"]
user2: bob789's reporting breakdowns are ["campaign","position","line_items"]
When I first built the code for this reporting API, I only had one login to test with, so I hard-coded the loops for the dimensions (["site","advertiser","ad_type","campaign","line_items"]). What I did was ping the API for a report by sites; then for each site, ping for advertisers; and for each advertiser, ping for the next dimension, and so on, leaving me with a nested loop of ~6 layers.
basically what I'm doing:
sites = mechanize.get "#{base_url}/report?dim=sites"
sites = Yajl::Parser.parse(sites.body) # JSON parser
sites.each do |site|
  advertisers = mechanize.get "#{base_url}/report?site=#{site.fetch("id")}&dim=advertiser"
  advertisers = Yajl::Parser.parse(advertisers.body)
  advertisers.each do |advertiser|
    ad_types = mechanize.get "#{base_url}/report?site=#{site.fetch("id")}&advertiser=#{advertiser.fetch("id")}&dim=ad_type"
    ad_types = Yajl::Parser.parse(ad_types.body)
    ad_types.each do |ad_type|
      # ...and so on...
    end
  end
end
GET <api_url>/?dim=<dimension to breakdown>&site=<filter by site id>&advertiser=<filter by advertiser id>...etc...
At the end of the nested loop, I'm left with a report that's broken down with as much granularity as possible.
This worked until now because I thought there was only one path of breaking down, but apparently each account can have different dimension breakdowns.
So what I'm asking is: given an array of breakdowns, how can I set up the nested loops dynamically so they traverse down to full granularity?
Thanks.
I'm not sure what your JSON/GET returns exactly but for a problem like this you would need recursion.
Something like this perhaps? It's not very elegant and can definitely be optimised further but should hopefully give you an idea.
some_hash = {:id=>"site-id", :body=>{:id=>"advertiser-id", :body=>{:id=>"ad_type-id", :body=>{:id=>"something-id"}}}}
@breakdowns = ["site", "advertiser", "ad_type", "something"]

def recursive(some_hash, str = nil, i = 0)
  if @breakdowns[i + 1].nil?
    str += "#{@breakdowns[i]}=#{some_hash[:id]}"
  else
    str += "#{@breakdowns[i]}=#{some_hash[:id]}&dim=#{@breakdowns[i + 1]}"
  end
  p str
  some_hash[:body].is_a?(Hash) ? recursive(some_hash[:body], str.gsub(/dim.*/, ''), i + 1) : return
end

recursive(some_hash, 'base-url/report?')
=> "base-url/report?site=site-id&dim=advertiser"
=> "base-url/report?site=site-id&advertiser=advertiser-id&dim=ad_type"
=> "base-url/report?site=site-id&advertiser=advertiser-id&ad_type=ad_type-id&dim=something"
=> "base-url/report?site=site-id&advertiser=advertiser-id&ad_type=ad_type-id&something=something-id"
If you are just looking to map your data, you can recursively map it to a hash as another user pointed out. If you actually want to do something with this data while inside the loop, and want to dynamically recreate the loop structure you listed in your question (though I would advise coming up with a different solution), you can use metaprogramming as follows:
require 'active_support/inflector'

# Assume we are given an input of breakdowns.
# I put 'testarr' in place of the operations you perform on each local variable,
# for brevity and so you can see that the code works.
# You will have to modify it to suit your needs.
result = []
testarr = [1, 2, 3]
b = binding

breakdowns.each do |breakdown|
  snippet = <<-END
    eval("#{breakdown.pluralize} = testarr", b)
    eval("#{breakdown.pluralize}", b).each do |#{breakdown}|
  END
  result << snippet
end

result << "end\n" * breakdowns.length
eval(result.join)
Note: This method is probably frowned upon, and as I've said, I'm sure there are other methods of accomplishing what you are trying to do.
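As one such alternative, here is a sketch that drills down over an arbitrary breakdown list without eval; fetch_ids is a hypothetical stand-in for your mechanize GET plus Yajl parse:

# Hypothetical: fetch_ids(filters, dim) performs the GET for `dim` with the
# accumulated `filters` as query parameters and returns the parsed ids.
def drill_down(breakdowns, filters = {})
  dim = breakdowns.first
  return [filters] if dim.nil? # reached full granularity

  fetch_ids(filters, dim).flat_map do |id|
    drill_down(breakdowns.drop(1), filters.merge(dim => id))
  end
end

# drill_down(["site", "advertiser", "ad_type"]) returns one filter hash per
# fully-broken-down report row, whatever breakdowns the login supports.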
I am receiving some data which is parsed in a Ruby script; a sample of the parsed data looks like this:
{"address":"00","data":"FF"}
{"address":"01","data":"00"}
That data relates to the status (on/off) of plant items (fans, coolers, heaters, etc.). The address is a hex number that tells you which set of bits the data refers to; both values are received as hex, as in this example. So for the example above, the lookup table would be:
            Bit1  Bit2  Bit3  Bit4  Bit5   Bit6   Bit7   Bit8
Address 00: Fan1  Fan2  Fan3  Fan4  Cool1  Cool2  Cool3  Heat1
Address 01: Hum1  Hum2  Fan5  Fan6  Heat2  Heat3  Cool4  Cool5
There are 16 addresses per block (this example covers 00-0F).
A data value of FF tells me that all items in address 00 are set on (high/1). I then need to output the result of the lookup for each individual bit, e.g.
{"element":"FAN1","data":{"type":"STAT","state":"1"}}
{"element":"FAN2","data":{"type":"STAT","state":"1"}}
{"element":"FAN3","data":{"type":"STAT","state":"1"}}
{"element":"FAN4","data":{"type":"STAT","state":"1"}}
{"element":"COOL1","data":{"type":"STAT","state":"1"}}
{"element":"COOL2","data":{"type":"STAT","state":"1"}}
{"element":"COOL3","data":{"type":"STAT","state":"1"}}
{"element":"HEAT1","data":{"type":"STAT","state":"1"}}
A lookup table could be anything up to 2048 bits (though I don't have anything that size in use at the moment; this is the maximum I'd need to scale to).
The data field is the status of all 8 bits per address; some may be on, some may be off, and this updates every time my source pushes new data at me.
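For reference, one data byte expands to its bit string in Ruby like this:

"FF".to_i(16).to_s(2).rjust(8, '0') # => "11111111", one character per bit
"03".to_i(16).to_s(2).rjust(8, '0') # => "00000011"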
I'm looking for a way to do this in code ideally for the lay-person as I'm still very new to doing much with Ruby. There was a code example here, but it was not used in the end and has been removed from the question so as not to confuse.
Based on the example below, I've used the following code to make some progress. (Note this integrates with an existing script, all of which is not shown here; nor is the lookup table, as it's quite big now.)
data = [feeder]
data.each do |str|
  hash = JSON.parse(str)
  address = hash["address"]
  number = hash["data"].to_i(16)
  binary_str = sprintf("%0.8b", number)
  binary_str.reverse.each_char.with_index do |char, i|
    break if i + 1 > max_binary_digits
    mouse = {"element" => table[address][i], "data" => {"type" => 'STAT', "state" => char}}
    mousetrap = JSON.generate(mouse)
    puts mousetrap
  end
end
This gives me an output of {"element":"COOL1","data":{"type":"STAT","state":"0"}} etc... which in turn gives the correct output via my node.js script.
I have a new problem/query, having got this to work and captured a whole bunch of data last night and this morning. It appears that now I've built my lookup table, I need some of the results to be modified based on the result of the lookup. I have other sensors which need to generate a different output to feed my SVG, for example:
FAN objects need to output {"element":"FAN1","data":{"type":"STAT","state":"1"}}
DOOR objects need to output {"element":"DOOR1","data":{"type":"LAT","state":"1"}}
SWIPE objects need to output {"element":"SWIPE6","data":{"type":"ROUTE","state":"1"}}
ALARM objects need to output {"element":"PIR1","data":{"type":"PIR","state":"0"}}
This is due to the way the SVG deals with updating - I'm not in a position to modify the DOM stuff so would need to fix this in my Ruby script.
So to address this, what I ended up doing was making an exact copy of my existing lookup table, and rather than listing the devices I listed the type of output, like so:
Address 00: STAT STAT STAT ROUTE ROUTE LAT LAT PIR
Address 01: PIR PIR STAT ROUTE ROUTE LAT LAT PIR
This might be very dirty (and it also means I have to duplicate my lookup table), but it actually might be better for my specific needs, as devices within the dataset could have any name (I have no control over the received data). Having built a new lookup table, I modified the code I had been provided with below and already used for the original lookup, but I had to remove these 2 lines. Without removing them, the result of the lookup was output 8 times!
binary_str.reverse.each_char.with_index do |char, i|
break if i+1 > max_binary_digits
The final array was built using the following:
mouse = {"element"=>+table[address][i], "data"=>{"type"=>typetable[address][i], "state"=>char}}
mousetrap = JSON.generate(mouse)
puts mousetrap
This gave me exactly the output I require, and I was able to integrate it with the existing script, the node.js websocket, and the MongoDB 'state' database (which is read on initial load).
There is one last thing I'd like to try and do with this code: when certain element states are set to 1, I'd like to be able to look something else up (and then use that result). I'm thinking this may be best done with a find query to my MongoDB, then just using the result. Doing that would hit the db for every query, but there would only ever be a handful of results, so most things would return null, which is fine. Am I thinking along the right lines?
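For instance, the lookups could be memoized so repeated elements hit MongoDB at most once; a sketch, assuming a Mongo collection handle named extras (hypothetical):

# Hypothetical: extras is a Mongo collection; results (including misses,
# cached as nil) are stored so each element queries the database only once.
extra_cache = Hash.new do |cache, element|
  cache[element] = extras.find("element" => element).first
end

# after building `mouse`, when a state flips to "1":
if mouse["data"]["state"] == "1" && (extra = extra_cache[mouse["element"]])
  puts JSON.generate(extra)
end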
require 'json'

table = {
  "00" => ["Fan1", "Fan2", "Fan3"],
  "01" => ["Hum1", "Hum2", "Fan5"],
}

max_binary_digits = table.first[1].size

data = [
  %Q[{"address": "00","data":"FF"}],
  %Q[{"address": "01","data":"00"}],
  %Q[{"address": "01","data":"03"}],
]

data.each do |str|
  hash = JSON.parse(str)
  address = hash["address"]
  number = hash["data"].to_i(16)
  binary_str = sprintf("%0.8b", number)
  p binary_str

  binary_str.reverse.each_char.with_index do |char, i|
    break if i + 1 > max_binary_digits
    puts %Q[{"element":"#{table[address][i]}","data":{"type":"STAT","state":"#{char}"}}]
  end

  puts "-" * 20
end
--output:--
"11111111"
{"element":"Fan1","data":{"type":"STAT","state":"1"}}
{"element":"Fan2","data":{"type":"STAT","state":"1"}}
{"element":"Fan3","data":{"type":"STAT","state":"1"}}
--------------------
"00000000"
{"element":"Hum1","data":{"type":"STAT","state":"0"}}
{"element":"Hum2","data":{"type":"STAT","state":"0"}}
{"element":"Fan5","data":{"type":"STAT","state":"0"}}
--------------------
"00000011"
{"element":"Hum1","data":{"type":"STAT","state":"1"}}
{"element":"Hum2","data":{"type":"STAT","state":"1"}}
{"element":"Fan5","data":{"type":"STAT","state":"0"}}
--------------------
My answer assumes Bit1 in your table is the least significant bit; if that is not the case, remove .reverse in the code.
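To see the bit-order assumption at work:

sprintf("%0.8b", "03".to_i(16))         # => "00000011"
sprintf("%0.8b", "03".to_i(16)).reverse # => "11000000", index 0 is now Bit1 (the LSB)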
You can ask me anything you want about the code.
I'm currently trying to generate a random number in a specific range and ensure that it is unique against other stored records. I'm using MySQL. It could be like an id, incremented, but it can't be the id itself. I'm currently testing against other existing records in an 'expensive' manner, but I'm pretty sure there is a clean one- or two-line way to do this. Currently I'm using:
test = 0
Order.all.each do |ord|
  test = (0..899999).to_a.sample.to_s.rjust(6, '0')
  if Order.find_by_number(test).nil? then
    break
  end
end
return test
Thanks for any help
Here is my one-line solution. It is also the quickest, since it calls .pluck to retrieve the numbers from the Order table. .select instantiates an Order object for every record (which is very costly and unnecessary), while .pluck does not. It also avoids iterating over each object again with .map to get the number field. We can avoid the second .map as well if we convert to a numeric value in the database, using CAST in this case.
(Array(0...899999) - Order.pluck("CAST(`number` AS UNSIGNED)")).sample.to_s.rjust(6, '0')
I would do something like this:
# gets all existing IDs
existing_ids = Order.all.select(:number).map(&:number).map(&:to_i)
# removes them from the acceptable range
available_numbers = (0..899999).to_a - existing_ids
# choose one (which is not in the DB)
available_numbers.sample.to_s.rjust(6, '0')
I think you can do something like below:
def uniq_num_add(arr)
  loop do
    rndm = "%02d" % rand(1..15) # I took this range as an example
    # the formatted number is added to the array only when it is
    # not already present
    break arr << rndm unless arr.include?(rndm)
  end
end

array = []
3.times do
  uniq_num_add(array)
end
array # => ["02", "15", "04"]
I need to find the next available ID (or key) from a fixed list of possible IDs. In this case valid IDs are from 1 to 9999, inclusive. When finding the next available ID we start looking just after the last assigned ID, wrap around at the end - only once, of course - and need to check if each ID is taken before we return it as an available ID.
I have some code that does this but I think it is neither elegant nor efficient and am interested in a simpler way to accomplish the same thing. I'm using Ruby but my question is not specific to the language, so if you'd like to write an answer using any other language I will be just as appreciative of your input!
I have elided some details about checking if an ID is available and such, so just take it as a given that the functions incr_last_id, id_taken?(id), and set_last_id(id) exist. (incr_last_id will add 1 to the last assigned ID in a data store (Redis) and return the result. id_taken?(id) returns a boolean indicating if the ID is available or not. set_last_id(id) updates the data store with the new last ID.)
MaxId = 9999

def next_id
  id = incr_last_id
  # if this ID is taken or out of range, find the next available id
  if id > MaxId || id_taken?(id)
    id += 1 while id < MaxId && id_taken?(id)
    # wrap around if we've exhausted the ID space
    if id > MaxId
      id = 1
      id += 1 while id < MaxId && id_taken?(id)
    end
    raise NoAvailableIdsError if id > MaxId || id_taken?(id)
    set_last_id(id)
  end
  id
end
I'm not really interested in solutions that require me to build up a list of all possible IDs and then get the set or list difference between the assigned IDs and the available IDs. That doesn't scale. I realize that this is a linear operation no matter how you slice it and that part is fine, I just think the code can be simplified or improved. I don't like the repetition caused by having to wrap around but perhaps there's no way around that.
Is there a better way? Please show me!
Since you've already searched from incr_last_id to MaxId in the first iteration, there isn't really a need to repeat it again.
Searching from 1 to incr_last_id on the second round at least reduces the search to exactly O(n) instead of a worst case of O(2n).
If you want to do it in a single loop, use modulo:
MaxId = 9999

def next_id
  id = incr_last_id
  id = 1 if id > MaxId # wrap straight away if the counter ran past the end
  start = id # remember where we began so we only wrap once
  while id_taken?(id)
    id = id % MaxId + 1 # the modulo wraps 9999 back around to 1
    raise NoAvailableIdsError if id == start # full circle: no free ids
  end
  set_last_id(id)
  id
end
Using a database table (MySQL in this example):
BEGIN;
SELECT id FROM sequences WHERE sequence_name = ? FOR UPDATE;
UPDATE sequences SET id = id + 1 WHERE sequence_name = ?;
COMMIT;
The FOR UPDATE gains an exclusive lock on the selected row for the duration of the transaction, ensuring yours is the only process that can run this sequence of operations at a time.
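A sketch of driving that from Ruby (table and column names as above; both statements must run in the same transaction, since the lock is only released on commit):

# Sketch: reserve the next value from the sequences table atomically.
def next_sequence_id(name)
  conn = ActiveRecord::Base.connection
  ActiveRecord::Base.transaction do
    quoted = conn.quote(name)
    row = conn.select_one("SELECT id FROM sequences WHERE sequence_name = #{quoted} FOR UPDATE")
    conn.update("UPDATE sequences SET id = id + 1 WHERE sequence_name = #{quoted}")
    row["id"].to_i + 1 # the value we just reserved
  end
end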
Using an in-memory fixed list:
# somewhere global, done once
@lock = Mutex.new
@ids = (0..9999).to_a

def next_id
  @lock.synchronize { @ids.shift }
end
Using redis:
LPOP list_of_ids
Or just:
INCR some_id
Redis takes care of the concurrency concerns for you.
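With the redis-rb gem, seeding and popping the free list might look like this (the key name free_ids is an assumption):

require 'redis'

redis = Redis.new
# one-time seed of the pool of available ids
redis.rpush('free_ids', (1..9999).to_a) if redis.llen('free_ids') == 0

id = redis.lpop('free_ids') # atomic: two clients can never pop the same id
raise NoAvailableIdsError if id.nil?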
The usual answer to improve this sort of algorithm is to keep a list of "free objects" handy; you could use just a single object in the list if you don't really want the extra effort of maintaining a list. (This would reduce the effectiveness of the free-object cache, but the overhead of managing a large list of free objects might grow to be a burden. It depends.)
Because you're wrapping your search around when you've hit MaxId, I presume there is a function give_up_id that will return an id to the free pool. Instead of simply putting a freed id back into the big pool, you keep track of it with a new variable @most_recently_free or append it to a list @free_ids.
When you need a new id, take one off the list, if the list has one. If the list doesn't have one, begin your search as you currently do.
Here's a sketch in pseudo-code:
def give_up_id(id)
  @free_ids.push(id)
end

def next_id
  if @free_ids.empty?
    old_next_id()
  else
    @free_ids.pop
  end
end
If you allow multiple threads of execution to interact with your id allocation / free routines, you'll need to protect these routines too, of course.
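For example, a minimal sketch guarding the free list with a Mutex:

@free_ids = []
@free_ids_lock = Mutex.new

def give_up_id(id)
  @free_ids_lock.synchronize { @free_ids.push(id) }
end

def next_id
  id = @free_ids_lock.synchronize { @free_ids.pop }
  id || old_next_id() # fall back to the slow search when the list is empty
end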