Extract path and url from config file via regex - ruby

I have a gitmodules file like this:
[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git
For each submodule, I want to get the path and url as a hash or some other structure that keeps data like:
submodule: cucumber (path -> 'path', url -> 'url')
How can I do it with regex? Or maybe there is a more efficient way of parsing this kind of file?

This file format is something of a standard and so I imagine there is a gem or other code floating around that will parse it. On the other hand, it's easy to parse and encapsulated little text problems like this are "the fun part" of development, so why not reinvent the wheel? It's kind of like playing a game...
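require 'pp'

# Parse an INI-style git config file into a hash of sections.
def scangc
  result = {}
  h = result # keys seen before any [section] header land at the top level
  File.open('../.gitconfig', 'r') do |f|
    while (s = f.gets)
      s.strip!
      next if s.empty? # skip blank lines
      if s[0] == '['
        # "[section]" starts a new hash keyed by the text inside the brackets
        result[s[1..-2].to_sym] = h = {}
        next
      end
      raise 'expected =' unless s['=']
      a = s.split(/\s*=\s*/, 2) # limit 2 keeps any '=' inside the value
      h[a[0].to_sym] = a[1]
    end
  end
  pp result
end

scangc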
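For the .gitmodules sample in the question specifically, a minimal sketch along the same lines might look like this. It assumes every section header has the form [submodule "..."] and keys each entry by the last component of the submodule name; parse_submodules and GITMODULES are names made up for this illustration, not part of any library.

```ruby
# Sample input copied from the question.
GITMODULES = <<~TEXT
  [submodule "dotfiles/vim/bundle/cucumber"]
  path = dotfiles/vim/bundle/cucumber
  url = git://github.com/tpope/vim-cucumber.git
  [submodule "dotfiles/vim/bundle/Command-T"]
  path = dotfiles/vim/bundle/Command-T
  url = git://github.com/vim-scripts/Command-T.git
TEXT

# Returns { "cucumber" => { path: "...", url: "..." }, ... }.
def parse_submodules(text)
  submodules = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#', ';') # blank lines and comments
    if (m = line.match(/\A\[submodule "(.+)"\]\z/))
      # New section: key it by the last path component of the submodule name.
      current = submodules[File.basename(m[1])] = {}
    elsif current && (m = line.match(/\A(\w+)\s*=\s*(.*)\z/))
      # key = value pair belonging to the most recent section.
      current[m[1].to_sym] = m[2]
    end
  end
  submodules
end

puts parse_submodules(GITMODULES)['cucumber'][:url]
# prints git://github.com/tpope/vim-cucumber.git
```

Reading the real file would then be parse_submodules(File.read('.gitmodules')), assuming the script runs from the repository root.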

I would do it like this in python:
import re
x = """[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git"""
submodules = re.findall(r'\[submodule.*/(.*)"\]', x)
paths = re.findall(r"path\s*=\s*(.*)", x)
urls = re.findall(r"url\s*=\s*(.*)", x)
group = zip(submodules,zip(paths,urls))
submodule_dict = dict([(z[0],{'path':z[1][0],'url':z[1][1]}) for z in group])
Which creates submodule_dict as
{'Command-T': {'path': 'dotfiles/vim/bundle/Command-T',
'url': 'git://github.com/vim-scripts/Command-T.git'},
'cucumber': {'path': 'dotfiles/vim/bundle/cucumber',
'url': 'git://github.com/tpope/vim-cucumber.git'}}

Related

Sanitizing URL strings

Say we have a string
url = "http://example.com/foo/baz/../../."
Obviously, we know from the Unix shell that ../../. essentially means to go up two directories. Hence, this URL is really going to http://example.com/. My question is, given these ../ characters in a string, how can we sanitize the URL string to point at the actual resource?
For example:
url = "http://example.com/foo/baz/../../hello.html"
url = process(url)
url = "http://example.com/hello.html"
Another:
url = "http://example.com/foo/baz/../."
url = process(url)
url = "http://example.com/foo/"
Keep in mind that the function still has to accept normal URLs (e.g. http://example.com) and return them as is if there is nothing to sanitize.
The addressable gem can do this.
require 'addressable'
Addressable::URI.parse("http://example.com/foo/baz/../../hello.html").normalize.to_s
#=> "http://example.com/hello.html"
#!/usr/bin/env ruby
# ======
## defs:
def process(url)
url_components = url.split('/')
url_components2 = url_components.dup
current_index = 0
url_components.each do |component|
if component == '..'
url_components2.delete_at(current_index)
url_components2.delete_at(current_index-1)
current_index -= 1
elsif component == '.'
url_components2.delete_at(current_index)
else
current_index += 1
end
end
url_resolved = url_components2.join('/')
return url_resolved
end
# =======
## tests:
urls = [
"http://example.com/foo/baz/../../.",
"http://example.com/foo/baz/../../hello.html",
"http://example.com/foo/baz/../."
]
urls.each do |url|
print url, ' => '
puts process(url)
end
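If you would rather stay in the standard library and avoid mutating an array while iterating over it, one possible sketch is to let URI split the URL and then resolve the dot segments of the path with a stack. It assumes an absolute URL with an absolute path; remove_dot_segments is a helper name invented for this example, and process keeps the name used in the question.

```ruby
require 'uri'

# Resolve "." and ".." segments of an absolute path using a stack.
def remove_dot_segments(path)
  segments = path.split('/', -1) # -1 keeps a trailing empty segment
  out = []
  segments.each do |seg|
    case seg
    when '.'
      next                       # "." refers to the current directory
    when '..'
      out.pop if out.size > 1    # go up one level, but never above the root
    else
      out << seg
    end
  end
  # A path ending in "." or ".." names a directory, so keep a trailing slash.
  out << '' if %w[. ..].include?(segments.last)
  out.join('/')
end

def process(url)
  uri = URI.parse(url)
  return url if uri.path.nil? || uri.path.empty? # nothing to sanitize
  uri.path = remove_dot_segments(uri.path)
  uri.to_s
end

puts process('http://example.com/foo/baz/../../hello.html')
# prints http://example.com/hello.html
puts process('http://example.com/foo/baz/../.')
# prints http://example.com/foo/
```

Unlike the split-and-delete version, this keeps the trailing slash when the URL ends in a dot segment, which matches the outputs asked for in the question.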

I need to rename a file just created, but before the type extension

I've got the following variable:
filepath = "test.tmx"
I need to add _out at the end of the name of the file generated, but before the extension. So far, I've written this, but it's incorrect:
File.open(filepath+"-"+language_code+"_out", "w")
Any ideas?
First, extract the file extension and basename into a couple of variables:
# Get the file extension
ext = File.extname(filepath)
# Get the file's basename (without extension)
basename = File.basename(filepath, '.*')
Then you can construct the new filename using them:
File.open(basename + language_code + '_out' + ext, 'w')
Parse your filename to separate the base from the extension using the standard library's File, and then reassemble.
This should work:
languages_list = ["es-AR", "es-CL", "es-CO", "es-MX", "es-PE"]
filepath = "adidas_174_Generic.tmx"
text = File.read(filepath)
languages_list.each do |language_code|
  puts language_code
  replace = text.gsub('<tuv xml:lang="es-PA">', "<tuv xml:lang=\"#{language_code}\">")
  file_base = File.basename(filepath, ".*")
  file_extension = File.extname(filepath)
  new_file_path = file_base + language_code + "_out" + file_extension
  File.open(new_file_path, "w") { |file| file.puts replace }
end
The answer to your edited question is simple.
filepath = "test.tmx"
filepath_before_extension = File.basename(filepath, '.tmx')
# => "test"
puts "#{filepath_before_extension}_out.tmx"
# => test_out.tmx
Or:
puts filepath.sub(/\.tmx$/, '_out.tmx')
# => test_out.tmx
The answer to your original question is only slightly more complex.
If I understand correctly you're trying to combine filepath and language_code to form a filename like this:
adidas_174_Generic-es-AR_out.tmx
...but instead you're getting this:
adidas_174_Generic.tmx-es-AR_out
In that case, it's the same, except you add language_code to the output:
language_code = "es-AR"
filepath = "adidas_174_Generic.tmx"
filepath_before_extension = File.basename(filepath, '.tmx')
# => "adidas_174_Generic"
puts "#{filepath_before_extension}-#{language_code}_out.tmx"
# => adidas_174_Generic-es-AR_out.tmx
Or:
puts filepath.sub(/\.tmx$/, "-#{language_code}_out.tmx")
# => adidas_174_Generic-es-AR_out.tmx

file extension dependent actions

I want to check whether a directory contains a ".ogg" or ".m4a" file. The directory is always empty before a download session starts, so it can contain only one "ogg" or one "m4a" file.
I tried out this code to fetch the filename:
def self.get_filename
  if File.exists?('*.ogg')
    file = Dir.glob('*.ogg')
    @testfile = file[0]
    @filename = File.basename(@testfile, File.extname(@testfile))
  end
  if File.exists?('*.m4a')
    file = Dir.glob('*.m4a')
    @testfile = file[0]
    @filename = File.basename(@testfile, File.extname(@testfile))
  end
end
Sadly, the filename is actually empty. Does anyone know why?
I think that you need Dir.glob instead.
Dir.glob('/path/to/dir/*.ogg') do |ogg_file|
  @testfile = ogg_file
  @filename = File.basename(@testfile, File.extname(@testfile))
end
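File.exists? does not expand glob patterns such as *.ogg; it only checks whether a file with that literal name exists. (It was also deprecated in favor of File.exist? and removed in Ruby 3.2.)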
You can do this instead:
if Dir["*.rb"].any?
  #....
end
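Since the directory holds at most one audio file, both extensions can be covered by a single glob with a brace pattern. A rough sketch, where audio_basename is a hypothetical helper name and the directory is passed in rather than taken from the current working directory:

```ruby
# Return the name (without extension) of the first .ogg or .m4a file
# found directly inside +dir+, or nil when there is none.
def audio_basename(dir)
  file = Dir.glob(File.join(dir, '*.{ogg,m4a}')).first
  file && File.basename(file, File.extname(file))
end
```

Calling audio_basename('/path/to/dir') then gives either the bare file name or nil, so the caller can check for an empty directory without a separate File.exist? test.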

Ruby, How to add a param to an URL that you don't know if it has any other param already

I have to add a new param to an arbitrary URL, let's say param=value.
If the URL already has params, like this:
http://url.com?p1=v1&p2=v2
I should transform it into this:
http://url.com?p1=v1&p2=v2&param=value
But if the URL does not have any params yet, like this:
http://url.com
I should transform it into this:
http://url.com?param=value
I'm wary of solving this with a regex because I'm not sure that looking for the presence of & is enough. I'm thinking that maybe I should parse the URL into a URI object, add the param, and convert it back to a String.
Looking for any suggestions from someone who has already been in this situation.
Update
To make it easier to participate, I'm sharing a basic test suite:
require "minitest"
require "minitest/autorun"
def add_param(url, param_name, param_value)
# the code here
"not implemented"
end
class AddParamTest < Minitest::Test
def test_add_param
assert_equal("http://url.com?param=value", add_param("http://url.com", "param", "value"))
assert_equal("http://url.com?p1=v1&p2=v2&param=value", add_param("http://url.com?p1=v1&p2=v2", "param", "value"))
assert_equal("http://url.com?param=value#&tro&lo&lo", add_param("http://url.com#&tro&lo&lo", "param", "value"))
assert_equal("http://url.com?p1=v1&p2=v2&param=value#&tro&lo&lo", add_param("http://url.com?p1=v1&p2=v2#&tro&lo&lo", "param", "value"))
end
end
require 'uri'
uri = URI("http://url.com?p1=v1&p2=2")
ar = URI.decode_www_form(uri.query) << ["param","value"]
uri.query = URI.encode_www_form(ar)
p uri #=> #<URI::HTTP:0xa0c44c8 URL:http://url.com?p1=v1&p2=2&param=value>
uri = URI("http://url.com")
uri.query = "param=value" if uri.query.nil?
p uri #=> #<URI::HTTP:0xa0eaee8 URL:http://url.com?param=value>
EDIT:(by fguillen, to merge all the good propositions and also to make it compatible with his question test suite.)
require 'uri'
def add_param(url, param_name, param_value)
uri = URI(url)
params = URI.decode_www_form(uri.query || "") << [param_name, param_value]
uri.query = URI.encode_www_form(params)
uri.to_s
end
More elegant solution:
url = 'http://example.com?exiting=0'
params = {new_param: 1}
uri = URI.parse url
uri.query = URI.encode_www_form URI.decode_www_form(uri.query || '').concat(params.to_a)
uri.to_s #=> http://example.com?exiting=0&new_param=1
Well, you may also not know if this parameter already exists in url. If you want to replace it with new value in this case, you can do this:
url = 'http://example.com?exists=0&other=3'
params = {'exists' => 1, "not_exists" => 2}
uri = URI.parse url
uri.query = URI.encode_www_form(URI.decode_www_form(uri.query || '').to_h.merge(params))
uri.to_s
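Wrapped up as a method, the replace-or-append variant could look roughly like this. set_params is a name invented for this sketch; note that going through a Hash collapses repeated keys (such as a=1&a=2), so it assumes each param appears at most once.

```ruby
require 'uri'

# Add the given params to +url+, overwriting any that already exist.
# Any fragment (#...) is preserved because URI parses it separately.
def set_params(url, new_params)
  uri = URI.parse(url)
  existing = URI.decode_www_form(uri.query || '').to_h
  # decode_www_form yields string keys, so normalize symbol keys before merging
  merged = existing.merge(new_params.transform_keys(&:to_s))
  uri.query = URI.encode_www_form(merged)
  uri.to_s
end

puts set_params('http://example.com?exists=0&other=3', 'exists' => 1, 'not_exists' => 2)
# prints http://example.com?exists=1&other=3&not_exists=2
```

Existing params keep their position and new ones are appended at the end, because Hash#merge preserves the insertion order of the receiver.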
You can try to use my gem iri:
require 'iri'
Iri.new('http://url.com?p1=v1&p2=v2').add(param: 'value').to_s

Appending filenames in Ruby?

I'm very new to Ruby and branching out past first scripts asking what my favorite color is and repeating it back to me. I'm doing what I thought was a relatively simple task, moving files and changing the names.
I have a bunch of files in subdirectories that I need to move to a single directory and then append a suffix to each file name. Specifically, I need to keep the original name and add onto the end, i.e. AAB701.jpg -> AAB701_01.jpg.
I have managed to find the files and move them (probably inefficiently) but I'm having no luck appending to the file name. Google search, stackoverflow, etc, no luck.
This is the code that I have now.
require 'find'
require "fileutils"
file_paths = []
Find.find('../../../Downloads') do |path|
file_paths << path if path =~ /.*\.jpg$/
end
file_paths.each do |filename|
name = File.basename('filename')
dest_folder = "../../../Desktop/Testing/"
FileUtils.cp(filename, dest_folder)
end
file_paths.each do |fullname|
append_txt = '_01'
filename = "*.jpg"
fullname = File.join(filename, append_txt)
end
The actual paths are pretty inconsequential, but I'm not familiar enough with File.join or gsub to figure out what is wrong/best.
First I'd extract some work into a small method:
def new_name(fn, dest = '../../../Desktop/Testing/', append = '_01')
ext = File.extname(fn)
File.join( dest, File.basename(fn, ext) + append + ext )
end
Then I'd apply a more functional style to your directory traversal and processing:
Dir[ '../../../Downloads/**/*.jpg' ].
select { |fn| File.file? fn }.
each { |fn| FileUtils.cp fn, new_name(fn) }
Also, I don't see what the Find module buys you over Dir.[], and the dir glob lets you filter to jpgs for free.
A simpler approach for a single file:
require 'pathname'
new_name = Pathname(orig_fn).sub_ext("_01#{Pathname(orig_fn).extname}").to_s
I would modify your call to FileUtils.cp.
append_txt = '_01'
dest_folder = "../../../Desktop/Testing/"
file_paths.each do |filename|
  ext = File.extname(filename)
  name = File.basename(filename, ext) # the variable, not the string 'filename'
  newname = name + append_txt + ext   # e.g. AAB701_01.jpg
  FileUtils.cp(filename, File.join(dest_folder, newname))
end
Note that this code is not safe against malicious filenames; you should search the file handling docs for another way to do this.
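Putting the pieces from this thread together, the whole task fits in one method. This is a sketch under a few assumptions: copy_with_suffix is a made-up name, files are copied rather than moved, and two source files with the same base name would overwrite each other in the destination.

```ruby
require 'fileutils'

# Copy every .jpg found under +src_dir+ (recursively) into +dest_dir+,
# inserting +suffix+ before the extension. Returns the new paths.
def copy_with_suffix(src_dir, dest_dir, suffix = '_01')
  FileUtils.mkdir_p(dest_dir) # make sure the destination exists
  sources = Dir.glob(File.join(src_dir, '**', '*.jpg')).select { |f| File.file?(f) }
  sources.map do |src|
    ext    = File.extname(src)
    target = File.join(dest_dir, File.basename(src, ext) + suffix + ext)
    FileUtils.cp(src, target)
    target
  end
end
```

With the paths from the question, the call would be copy_with_suffix('../../../Downloads', '../../../Desktop/Testing'). Swapping FileUtils.cp for FileUtils.mv would move the files instead of copying them.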

Resources