I'm using meskyanichi's backup gem. By and large it does what I need it to, but I need to have multiple backups (e.g., hourly, daily, weekly). The configurations are mostly the same but have a few differences, so I need to have multiple configuration files. I'm having trouble finding a sane way to manage the common bits of configurations (i.e., not repeat the common parts).
The configuration files use a lot of block structures, and from what I can tell, each backup needs to have a separate config file (e.g. config/backup/hourly.rb, config/backup/daily.rb, etc). A typical config file looks like this:
Backup::Model.new(:my_backup, 'My Backup') do
database MySQL do |db|
db.name = "my_database"
db.username = "foo"
db.password = "bar"
# etc
end
# similar for other config options
end
Then the backup is executed a la bundle exec backup perform -t my_backup -c path/to/config.rb.
My first swag at enabling a common config was to define methods that I could call from the blocks:
def my_db_config db
db.name = "my_database"
# etc
end
Backup::Model.new(:my_backup, 'My Backup') do
database MySQL do |db|
my_db_config db
end
#etc
end
But this fails with an undefined method 'my_db_config' for #<Backup::Database::MySQL:0x10155adf0>.
My intention was to get this to work and then split the common config functions into another file that I could require in each of my config files. I also tried creating a file with the config code and requiring it into the model definition block:
# common.rb
database MySQL do |db|
db.name = "my_database"
#etc
end
# config.rb
Backup::Model.new(:my_backup, 'My Backup') do
require "common.rb" # with the right path, etc
end
This also doesn't work, and from subsequent research I've discovered that that's just not the way that require works. Something more in line with the way that C/C++'s #include works (i.e., blindly pasting the contents into whatever scope it is called from) might work.
Any ideas?
The gem seems to modify the execution scope of the config blocks. To work around this, you could wrap your functions in a class:
class MyConfig
def self.prepare_db(db)
db.name = "my_database"
# etc
db
end
end
Backup::Model.new(:my_backup, 'My Backup') do
database MySQL do |db|
db = MyConfig.prepare_db(db)
end
#etc
end
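Since MyConfig is a constant, Ruby resolves it lexically rather than through the block's receiver, so the call keeps working even though the gem changes the execution scope of the blocks. That also means you can move the class into its own file and require it from each config. A minimal sketch; the file paths and trigger name are assumptions:
# config/backup/common.rb
class MyConfig
  def self.prepare_db(db)
    db.name = "my_database"
    db.username = "foo"
    db.password = "bar"
    db
  end
end
# config/backup/hourly.rb
require File.expand_path("../common", __FILE__) # adjust if the gem loads configs differently
Backup::Model.new(:hourly_backup, 'Hourly Backup') do
  database MySQL do |db|
    MyConfig.prepare_db(db)
  end
end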
You could get a bit more fancy and abstract your default config merge:
class BaseConfig
@@default_sets = {
:db => {
:name => "my_database"
},
:s3 => {
:access_key => "my_s3_key"
}
}
def self.merge_defaults(initial_set, set_name)
@@default_sets[set_name].each do |k, v|
initial_set.send("#{k}=".to_sym, v)
end
initial_set
end
end
Backup::Model.new(:my_backup, 'My Backup') do
database MySQL do |db|
db = BaseConfig.merge_defaults(db, :db)
end
store_with S3 do |s3|
s3 = BaseConfig.merge_defaults(s3, :s3)
end
end
In newer versions of the backup gem you can simply use a main config file, like this:
Generate the main config file:
root@youhost:~# backup generate:config
Modify /root/Backup/config.rb like this:
Backup::Storage::S3.defaults do |s3|
s3.access_key_id = "youkey"
s3.secret_access_key = "yousecret"
s3.region = "us-east-1"
s3.bucket = "youbacket"
s3.path = "youpath"
end
Backup::Database::PostgreSQL.defaults do |db|
db.name = "youname"
db.username = "youusername"
db.password = "youpassword"
db.host = "localhost"
db.port = 5432
db.additional_options = ["-xc", "-E=utf8"]
end
Dir[File.join(File.dirname(Config.config_file), "models", "*.rb")].each do |model|
instance_eval(File.read(model))
end
Create a model file:
root@youhost:~# backup generate:model --trigger daily_backup \
--databases="postgresql" --storages="s3"
Then modify /root/Backup/models/daily_backup.rb like this:
# encoding: utf-8
Backup::Model.new(:daily_backup, 'Description for daily_backup') do
split_into_chunks_of 250
database PostgreSQL do |db|
db.keep = 20
end
store_with S3 do |s3|
s3.keep = 20
end
end
With this you can simply create daily, monthly or yearly archives.
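For the hourly/daily/weekly case from the question, each extra schedule is then just one more small model file that inherits those defaults. A hypothetical models/hourly_backup.rb:
# encoding: utf-8
Backup::Model.new(:hourly_backup, 'Description for hourly_backup') do
  split_into_chunks_of 250
  database PostgreSQL do |db|
    db.keep = 48 # hourly runs keep more, shorter-lived copies
  end
  store_with S3 do |s3|
    s3.keep = 48
  end
end
Each trigger can then be scheduled separately, e.g. backup perform --trigger hourly_backup from cron.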
Related
I'm writing a script that uses Creek to read an .xlsx file and update the prices and weights of products in a database. The .xlsx file is located on an AWS server, so Creek copies the file down and stores it in a Tempfile while it is in use.
The issue is, at some point the Tempfile seems to be prematurely deleted, and since Creek continues to call on it whenever it iterates through a sheet, the script fails. Interestingly, my coworker's environment runs the script fine, though I haven't found a difference between what we're running.
Here is the script I've written:
require 'creek'
class PricingUpdateWorker
include Sidekiq::Worker
def perform(filename)
# This points to the file in the root bucket
file = bucket.files.get(filename)
# Make public temporarily to open in Creek
file.public = true
file.save
creek_sheets = Creek::Book.new(file.public_url, remote: true).sheets
# Close file to public
file.public = false
file.save
creek_sheets.each_with_index do |sheet, sheet_index|
p "---------- #{sheet.name} ----------"
sheet.simple_rows.each_with_index do |row, index|
next if index == 0
product = Product.find_by_id(row['A'].to_i)
if product
if row['D']&.match(/N\/A/) || row['E']&.match(/N\/A/)
product.delete
p '*** deleted ***'
else
product.price = row['D']&.to_f&.round(2)
product.weight = row['E']&.to_f
product.request_for_quote = false
product.save
p 'product updated'
end
else
p "#{row['A']} | product not found ***"
end
end
end
end
private
def connection
@connection ||= Fog::Storage.new(
provider: 'AWS',
aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
)
end
def bucket
# Grab the file from the bucket
@bucket ||= connection.directories.get 'my-aws-bucket'
end
end
And the logs:
"---------- Sheet 1 ----------"
"product updated"
"product updated"
... I've cut out a bunch more of these...
"product updated"
"product updated"
"---------- Sheet 2 ----------"
rails aborted!
Errno::ENOENT: No such file or directory @ rb_sysopen - /var/folders/9m/mfcnhxmn1bqbm6h91rx_rd8m0000gn/T/file20190920-19247-c6x4zw
"/var/folders/9m/mfcnhxmn1bqbm6h91rx_rd8m0000gn/T/file20190920-19247-c6x4zw" is the temporary file, and as you can see, it's been collected already, even though I'm still using it, and I believe it is still in scope. Any ideas what could be causing this? It's especially odd that my coworker can run this just fine.
In case it's helpful, here is a little code from Creek:
def initialize path, options = {}
check_file_extension = options.fetch(:check_file_extension, true)
if check_file_extension
extension = File.extname(options[:original_filename] || path).downcase
raise 'Not a valid file format.' unless (['.xlsx', '.xlsm'].include? extension)
end
if options[:remote]
zipfile = Tempfile.new("file")
zipfile.binmode
zipfile.write(HTTP.get(path).to_s)
# I added the line below this one, and it fixes the problem by preventing the file from being marked for garbage collection, though I shouldn't need to take steps like that.
# ObjectSpace.undefine_finalizer(zipfile)
zipfile.close
path = zipfile.path
end
@files = Zip::File.open(path)
@shared_strings = SharedStrings.new(self)
end
EDIT: Someone wanted to know exactly how I was running my code, so here it is.
I run the following rake task by executing bundle exec rails client:pricing_update[client_updated_prices.xlsx] in the command line.
namespace :client do
desc 'Imports the initial database structure & base data from uploaded .xlsx file'
task :pricing_update, [:filename] => :environment do |t, args|
PricingUpdateWorker.new.perform(args[:filename])
end
end
I should also mention that I'm running Rails, so the Gemfile.lock keeps the gem versions consistent between me and my coworker. My fog version is 2.0.0 and my rubyzip version is 1.2.2.
Finally, it seems that the bug is not in the Creek gem at all but rather in the rubyzip gem, which has trouble with certain xlsx files, as noted in this issue. It seems to depend on how the file was generated: I created a simple two-page spreadsheet in Google Sheets and it works fine, but a random xlsx file may not.
require 'creek'
def test_creek(url)
Creek::Book.new(url, remote: true).sheets.each_with_index do |sheet, index|
p "----------Name: #{sheet.name} Index: #{index} ----------"
sheet.simple_rows.each_with_index do |row, i|
puts "#{row} index: #{i}"
end
end
end
test_creek 'https://tc-sandbox.s3.amazonaws.com/creek-test.xlsx'
# works fine; should output:
"----------Name: Sheet1 Index: 0 ----------"
{"A"=>"foo ", "B"=>"sheet", "C"=>"one"} index: 0
"----------Name: Sheet2 Index: 1 ----------"
{"A"=>"bar", "B"=>"sheet", "C"=>"2.0"} index: 0
test_creek 'http://dev-builds.libreoffice.org/tmp/test.xlsx'
# raises error
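If you want to confirm that on your own file, a quick way is to bypass Creek and open the workbook with rubyzip directly; if this raises on the same file, the problem is below Creek, as the linked issue suggests. A minimal sketch, with a placeholder path:
require 'zip'
# Enumerate the xlsx archive's entries straight through rubyzip.
Zip::File.open('/path/to/test.xlsx') do |zip|
  zip.each { |entry| puts entry.name }
end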
I'm experimenting with a Ruby script that will add data to a Neo4j database using the REST API. (Here's the tutorial with all the code if interested.)
The script works if I include the hash data structure in the initialize method but I would like to move the data into a different file so I can make changes to it separately using a different script.
I'm relatively new to Ruby. If I copy the following data structure into a separate file, is there a simple way to read it from my existing script when I call @data? I've heard one could do something with YAML or JSON (I'm not familiar with how either works). What's the easiest way to read a file, and how could I go about coding that?
# I want to copy this data into a different file and read it with my script when I call @data.
{
nodes:[
{:label=>"Person", :title=>"title_here", :name=>"name_here"}
]
}
And here is part of my code, it should be enough for the purposes of this question.
class RGraph
def initialize
@url = 'http://localhost:7474/db/data/cypher'
# If I put this hash structure into a different file, how do I make @data read that file?
@data = {
nodes:[
{:label=>"Person", :title=>"title_here", :name=>"name_here"}
]
}
end
#more code here... not relevant to question
def create_nodes
# Scan file, find each node and create it in Neo4j
@data.each do |key, value|
if key == :nodes
@data[key].each do |node| # Cycle through each node
next unless node.has_key?(:label) # Make sure this node has a label
# We have sufficient data to create a node
label = node[:label]
attr = Hash.new
node.each do |k,v| # Hunt for additional attributes
next if k == :label # Don't create an attribute for "label"
attr[k] = v
end
create_node(label,attr)
end
end
end
end
rGraph = RGraph.new
rGraph.create_nodes
Given that OP said in comments "I'm not against using either of those", let's do it in YAML (which preserves the Ruby object structure best). Save it:
@data = {
nodes:[
{:label=>"Person", :title=>"title_here", :name=>"name_here"}
]
}
require 'yaml'
File.write('config.yaml', YAML.dump(@data))
This will create config.yaml:
---
:nodes:
- :label: Person
:title: title_here
:name: name_here
If you read it in, you get exactly what you saved:
require 'yaml'
@data = YAML.load(File.read('config.yaml'))
puts @data.inspect
# => {:nodes=>[{:label=>"Person", :title=>"title_here", :name=>"name_here"}]}
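If you'd rather use JSON, the round trip is nearly identical. One caveat: pass symbolize_names: true when parsing, otherwise the keys come back as strings rather than the symbols the create_nodes code expects. A minimal sketch:
require 'json'
File.write('config.json', JSON.pretty_generate(@data))
@data = JSON.parse(File.read('config.json'), symbolize_names: true)
puts @data.inspect
# => {:nodes=>[{:label=>"Person", :title=>"title_here", :name=>"name_here"}]}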
I have a feature that initially shows the results in HTML (a report), which can then be exported to CSV and XLS.
The idea is to reuse the results of the query used to render the HTML to export the same records without re-running the query.
The closest implementation I have is storing the result in the global variable $last_consult.
I have the following index method in a Rails controller:
def index
begin
respond_to do |format|
format.html {
@filters = {}
@filters['email_enterprise'] = session[:enterprise_email]
# Add the selected filters
if (params[:f_passenger].to_s != '')
@filters['id_passenger'] = params[:f_passenger]
end
if (session[:role] == 2)
@filters['cost_center'] = session[:cc_name]
end
# Apply the filters and assign the result to $last_consult, which is used by the export
$last_consult = InfoVoucher.where(@filters)
@info_vouchers = $last_consult.paginate(:page => params[:page], :per_page => 10)
estimate_destinations(@info_vouchers)
@cost_centers = fill_with_cost_centers(@info_vouchers)
}
format.csv {
send_data InfoVoucher.export
}
format.xls {
send_data InfoVoucher.export(col_sep: "\t")
}
end
end
end
The .export method is defined like this
class InfoVoucher < ActiveRecord::Base
include ActiveModel::Serializers::JSON
default_scope { order('voucher_created_at DESC') }
def attributes
instance_values
end
# Export to CSV or XLS
def self.export(options = {})
column_names = ["...","...","...","...","...",...]
exported_col_names = ["Solicitud", "Inicio", "Final", "Duracion", "Pasajero", "Unidades", "Recargo", "Propina", "Costo", "Centro de costo", "Origen", "Destino", "Proyecto", "Conductor", "Placas"]
CSV.generate(options) do |csv|
csv << exported_col_names
$last_consult.each do |row_export|
csv << row_export.attributes['attributes'].values_at(*column_names)
end
end
end
end
But this approach only works as long as there are no concurrent users between viewing the report and exporting it, which in this case is unacceptable.
I tried using a session variable to store the query result, but since the result of the query can be quite large, it fails with this error:
ActionDispatch::Cookies::CookieOverflow: ActionDispatch::Cookies::CookieOverflow
I have read about flash but don't consider it a good choice for this.
Can you please point me in the right direction on how to persist the query results (currently stored in $last_consult) and make them available for the CSV and XLS export, without using a global or session variable?
Rails 4 has a bunch of cache solutions:
SQL query caching: caches the query result set for the duration of the request.
Memory caching: limited to 32 MB. An example use is small sets, such as a list of object ids that were time-consuming to generate, e.g. the result of a complex select.
File caching: Great for huge results. Probably not what you want for your particular DB query, unless your results are huge and also you're using a RAM disk or SSD.
Memcache and dalli: an excellent fast distributed cache that's independent of your app. For your question, memcache can be a very good solution for apps that return the same results or reports to multiple users.
Terracotta Ehcache: this is enterprise and JRuby. I haven't personally used it. Looks like it could be good if you're building a serious workhorse app.
When you use any of these, you don't store the information in a global variable, nor a controller variable. Instead, you store the information by creating a unique cache key.
If your information is specific to a particular user, such as the user's most recent query, then a decent choice for the unique cache key is "#{current_user.id}-last-consult".
If your information is generic across users, such as a report that depends on your filters and not on a particular user, then a decent choice for the unique cache key is @filters.hash.
If your information is specific to a particular user and also to the specific filters, then a decent choice for the unique cache key is "#{current_user.id}-#{@filters.hash}". This is a powerful generic way to cache user-specific information.
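As a minimal sketch of that last pattern (the method name and expiry are assumptions; any configured cache store works):
def cached_last_consult
  cache_key = "#{current_user.id}-#{@filters.hash}"
  Rails.cache.fetch(cache_key, expires_in: 1.hour) do
    # Runs only on a cache miss; the block's result is stored under cache_key.
    InfoVoucher.where(@filters).to_a
  end
end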
I have had excellent success with the Redis cache gem, which can work separately from Rails 4 caching. https://github.com/redis-store/redis-rails
After reading joelparkerhenderson's answer I read this great article about most of the caching strategies he mentioned:
http://hawkins.io/2012/07/advanced_caching_part_1-caching_strategies/
I decided to use the Dalli gem, which depends on memcached 1.4+.
And in order to configure and use Dalli I read
https://www.digitalocean.com/community/tutorials/how-to-use-memcached-with-ruby-on-rails-on-ubuntu-12-04-lts and
https://github.com/mperham/dalli/wiki/Caching-with-Rails
And this is how it ended up being implemented:
Installation/Configuration
sudo apt-get install memcached
The installation can be verified by running:
memcached -h
Then install the Dalli gem and configure it
gem install dalli
Add these lines to the Gemfile:
# To cache query results or any other long-running-task results
gem 'dalli'
Set these lines in your config/environments/production.rb file:
# Configure the cache to use the dalli gem and expire the contents after 1 hour and enable compression
config.perform_caching = true
config.cache_store = :dalli_store, 'localhost:11211', {:expires_in => 1.hour, :compress => true }
Code
In the controller I created a new method called query_info_vouchers, which runs the query and stores the result in the cache by calling the Rails.cache.write method.
In the index method I call fetch to see if any cached data is available; this is only done for the CSV and XLS export formats.
def index
begin
add_breadcrumb "Historial de carreras", info_vouchers_path
respond_to do |format|
format.html {
query_info_vouchers
}
format.csv {
@last_consult = Rails.cache.fetch("#{session[:passenger_key]}_last_consult") do
query_info_vouchers
end
send_data InfoVoucher.export(@last_consult)
}
format.xls {
@last_consult = Rails.cache.fetch("#{session[:passenger_key]}_last_consult") do
query_info_vouchers
end
send_data InfoVoucher.export(@last_consult, col_sep: "\t")
}
end
rescue Exception => e
Airbrake.notify(e)
redirect_to manager_login_company_path, flash: {notice: GlobalConstants::ERROR_MESSAGES[:no_internet_conection]}
end
end
def query_info_vouchers
# By default, filter to the rides belonging to the company
@filters = {}
@filters['email_enterprise'] = session[:enterprise_email]
# Add the selected filters
if (params[:f_passenger].to_s != '')
@filters['id_passenger'] = params[:f_passenger]
end
if (session[:role] == 2)
@filters['cost_center'] = session[:cc_name]
end
# Apply the filters and store the result in the cache to make it available when exporting
@last_consult = InfoVoucher.where(@filters)
Rails.cache.write "#{session[:passenger_key]}_last_consult", @last_consult
@info_vouchers = @last_consult.paginate(:page => params[:page], :per_page => 10)
estimate_destinations(@info_vouchers)
@cost_centers = fill_with_cost_centers(@last_consult)
end
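One caveat with the write above: InfoVoucher.where(@filters) returns a lazy ActiveRecord::Relation, and depending on the Rails version the cache may marshal the query object rather than the loaded rows. Forcing the load before caching avoids the ambiguity, e.g.:
# Cache the materialized rows, not the lazy relation.
Rails.cache.write "#{session[:passenger_key]}_last_consult", @last_consult.to_a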
And in the model, the .export method:
def self.export(data, options = {})
column_names = ["..","..","..","..","..","..",]
exported_col_names = ["Solicitud", "Inicio", "Final", "Duracion", "Pasajero", "Unidades", "Recargo", "Propina", "Costo", "Centro de costo", "Origen", "Destino", "Proyecto", "Conductor", "Placas"]
CSV.generate(options) do |csv|
csv << exported_col_names
data.each do |row_export|
csv << row_export.attributes['attributes'].values_at(*column_names)
end
end
end
At the moment, to encrypt a data bag, I have to do:
system "knife data bag from file TemporaryEncrypting \"#{enc_file_path}\" --secret-file #{Secret_Key_Path}"
and that doesn't work because knife can't find a config file, and I can't seem to get it to read the one in C:\chef.
How do I do this from within ruby?
I worked out how to encrypt inside of Ruby; just use this code:
require 'chef/knife'
#require 'chef/encrypted_data_bag_item' # you need to do this in Chef version 12; they've moved it out of knife and into its own section
require 'json'
secret = Chef::EncryptedDataBagItem.load_secret Secret_Key_Path
to_encrypt = JSON.parse(json_to_encrypt)
encrypted_data = Chef::EncryptedDataBagItem.encrypt_data_bag_item to_encrypt, secret
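The encrypted result is a plain Hash, so if you want the same on-disk artifact the knife command produced, you can serialize it back to JSON yourself (reusing the enc_file_path variable from the question; the exact output path is up to you):
# Write the encrypted data bag item back out as pretty-printed JSON.
File.write(enc_file_path, JSON.pretty_generate(encrypted_data))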
Answer achieved with information from this answer; here is the code in question:
namespace 'databag' do
desc 'Edit encrypted databag item.'
task :edit, [:databag, :item, :secret_file] do |t, args|
args.with_defaults :secret_file => "#{ENV['HOME']}/.chef/encrypted_data_bag_secret"
secret = Chef::EncryptedDataBagItem.load_secret args.secret_file
item_file = "data_bags/#{args.databag}/#{args.item}.json"
tmp_item_file = "/tmp/#{args.databag}_#{args.item}.json"
begin
#decrypt data bag into tmp file
raw_hash = Chef::JSONCompat.from_json IO.read item_file
databag_item = Chef::EncryptedDataBagItem.new raw_hash, secret
IO.write tmp_item_file, Chef::JSONCompat.to_json_pretty( databag_item.to_hash )
#edit tmp file
sh "#{ENV['EDITOR']} #{tmp_item_file}"
#encrypt tmp file data bag into original file
raw_hash = Chef::JSONCompat.from_json IO.read tmp_item_file
databag_item = Chef::EncryptedDataBagItem.encrypt_data_bag_item raw_hash, secret
IO.write item_file, Chef::JSONCompat.to_json_pretty( databag_item )
ensure
::File.delete tmp_item_file #ensure tmp file deleted.
end
end
end
I'm trying to write a script that will take IP addresses from a host file and username info from a config file. I'm obviously not holding the file name as a proper hash value.
What should my File.new(options[:config_file], 'r').each { |params| puts params } be calling? I've tried what it is currently set to, and
File.new(config_file, 'r').each { |params| puts params }, as well as File.new(:config_file, 'r').each { |params| puts params }, with no luck.
Should I be doing something different altogether? Like load(filename = nil)?
options = {}
opt_parser = OptionParser.new do |opt|
opt.banner = 'Usage: opt_parser COMMAND [OPTIONS]'
opt.on('--host_file','I need hosts, put them here') do |host_file|
options[:host_file] = host_file
end
opt.on('--config_file', 'I need config info, put it here') do |config_file|
options[:config_file] = config_file
end
opt.on('-h', '--help', 'What your looking at') do |help|
options[:help] = help
puts opt
end
end
opt_parser.parse!
if options[:config_file]
File.new(options[:config_file], 'r').each { |params| puts params }
end
if options[:host_file]
File.new(options[:host_file], 'r').each { |host| puts host }
end
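Before anything else, note that as written opt.on('--host_file', ...) declares a switch that takes no argument, so OptionParser passes true to the block rather than a filename. Declaring a value placeholder makes it capture the argument; a minimal sketch:
require 'optparse'
options = {}
OptionParser.new do |opt|
  # The FILE placeholder tells OptionParser this switch takes an argument,
  # so the block receives the path instead of true.
  opt.on('--host_file FILE', 'I need hosts, put them here') do |host_file|
    options[:host_file] = host_file
  end
  opt.on('--config_file FILE', 'I need config info, put it here') do |config_file|
    options[:config_file] = config_file
  end
end.parse!
# e.g. ruby script.rb --host_file hosts.txt --config_file config.yml
puts options.inspect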
Parsing the hosts file
You can write your own parser or use a gem already implementing one.
Example using the "hosts" gem: (you need to install it)
require 'hosts'
hosts = Hosts::File.read('/etc/hosts')
entries = hosts.elements.select{ |element| element.is_a? Hosts::Entry }
addresses = Hash[entries.map{ |entry| [entry.name, entry.address] }]
# You should get a hash of entry names and addresses
# {"localhost"=>"127.0.0.1", "ip6-localhost"=>"::1"}
Parsing the config file
A common way to store configuration is to use YAML files.
Considering the following YAML file (in '/tmp/config.yml'):
username: foo
password: bar
You can parse this config file using the YAML module:
require 'yaml'
config = YAML.load_file('/tmp/config.yml')
# You should get a hash of config values
# {"username"=>"foo", "password"=>"bar"}
If you don't want your password stored in plain text in a config file, you can:
ask for the password at runtime if your context allows that
use an environment variable to store the password and retrieve it at runtime (a minimal sketch follows this list)
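For example (the variable name APP_PASSWORD is an assumption):
# Retrieve the password from the environment at runtime;
# ENV.fetch raises a KeyError if the variable is missing.
password = ENV.fetch('APP_PASSWORD')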
Edit:
If you only need to extract your hostnames from a text file, assuming one hostname per line, you can use something like hostnames = IO.readlines(options[:host_file]).map{ |line| line.chomp } to get an array of hostnames. You can then iterate through this array to do your operations.
www.ruby-doc.org/core-2.1.0/IO.html#method-i-readline