My end goal is to get the operating system of an Amazon image. When I do:
connection = Fog::Compute.new(provider: 'AWS',
                              aws_access_key_id: 'blah',
                              aws_secret_access_key: 'thing')
images = connection.describe_images('Owner' => 'self').body['imagesSet']
The data that comes back does not include platform, even though this documentation suggests it should. However, I do get values like:
architecture: "x86_64",
imageType: "machine",
kernelId: "aki-825ea7eb",
And if I Google that kernel ID I find this page saying it's Linux. Is there a way I can pass the kernelId to Amazon via Fog and get back data about that kernel, such as "linux"?
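Ideally I'm after something like the sketch below (unverified: I don't know whether describe_images accepts an 'ImageId' key the way it accepts 'Owner', or whether kernel images carry a platform field at all):

kernel = connection.describe_images('ImageId' => 'aki-825ea7eb').body['imagesSet'].first
kernel['platform'] # hoping for something like "linux", if the field exists at all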
On a separate note, sometimes my images don't have a kernelId, so are there any other fields in a <DescribeImagesResponse xmlns="http://ec2.amazonaws.com/doc/2012-12-01/"> that are definite indicators of the operating system?
Here's a solution if you have the Kernel ID, using http://thecloudmarket.com.
Assign the Kernel ID to a variable in Ruby:
ker_id = image['kernelId'] # e.g. "aki-825ea7eb", taken from one of your imagesSet entries
url_0 = "http://thecloudmarket.com/image/"
url_1 = ker_id
url_2 = "#/definition"
new_url = url_0 + url_1 + url_2
There are many ways to build this URL; I split it up to make it easy to read.
Then use Nokogiri to parse the webpage and pull the image name back into your script, as sketched below.
I didn't see any other OS indicators in the documentation.
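A minimal sketch (untested; the CSS selector is a guess, so inspect thecloudmarket.com's actual markup first):

require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(URI.open(new_url))
# Hypothetical selector: assumes the image name is the page's <h1> heading
image_name = doc.at_css('h1')&.text&.strip
puts image_name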
According to the README on GitHub, Ruby Whois can be used "as a standalone library to parse WHOIS records fetched previously and/or from different WHOIS clients."
I know how to use the library to perform a WHOIS query directly and parse the returned result. But I cannot find anywhere (Stack Overflow included) how to use this library to parse WHOIS data that was fetched previously.
It's probably not important, but this is how I get my data anyway: it is fetched with the Linux whois command and stored in separate files, each file containing the result of one WHOIS query.
The manual pages on https://whoisrb.org/ are 404. Even the code on the homepage is outdated and thus wrong, and the doc pages provide little information.
I tried to scan the source code on GitHub (https://github.com/weppos/whois-parser and https://github.com/weppos/whois), and I tried to find the answer on RubyDoc (https://www.rubydoc.info/gems/whois-parser/Whois/Parser, https://www.rubydoc.info/gems/whois/Whois/Record and some related pages). Both failed, partly because this is my first time using Ruby.
So could anyone help me? I'm really desperate and I'll definitely appreciate any help.
Try it like this:
require 'whois-parser'

domain = 'google.com'
data = 'WHOIS DATA THAT YOU ALREADY HAVE'

# Guess the right WHOIS server for the domain, then wrap your
# already-fetched response body in a Record::Part so it can be
# parsed without performing a live query.
whois_server = Whois::Server.guess domain
whois_data = [Whois::Record::Part.new(body: data, host: whois_server.host)]

record = Whois::Record.new(whois_server, whois_data)
parser = record.parser

parser.available?  #=> false
parser.registered? #=> true
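Since your results are already stored one per file, you can read each file in as the body. A minimal sketch, assuming a hypothetical layout of one <domain>.txt file per query:

require 'whois-parser'

Dir.glob('whois_results/*.txt') do |path|
  domain = File.basename(path, '.txt')
  data = File.read(path)
  server = Whois::Server.guess(domain)
  part = Whois::Record::Part.new(body: data, host: server.host)
  record = Whois::Record.new(server, [part])
  puts "#{domain}: registered=#{record.parser.registered?}"
end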
I am trying to parse a website using the requests module and extract some text out of it.
URL: https://www.icsi.in/student/Members/MemberSearch.aspx
After opening the URL, enter 16803 into the CP Number text field and hit Search. At the bottom you can see some data; I want that data, let's say a name.
I am able to get the data using Selenium, but I am not able to get it using the requests module.
I have tried the requests module with parameters, sessions, cookies, etc., but nothing worked.
url = "https://www.icsi.in/student/Members/MemberSearch.aspx"
ss = {'dnn$ctr410$MemberSearch$txtCpNumber':'16803',
'__EVENTTARGET':'dnn$ctr410$MemberSearch$btnSearch',
'__VIEWSTATEGENERATOR':'6A295697',
'dnn$ctlHeader$dnnSearch$Search':'SiteRadioButton'}
session = requests.Session()
cookies = session.cookies.get_dict()
for cookie in cookies:
session.cookies.set(cookie['name'], cookie['value'])
response = requests.post(url, data=ss)
print(response)
HTMLTree = html.fromstring(response.content)
name = HTMLTree.xpath('//div[#class="name_head"]//text()')
print(name)
I expect the output to be the name of the person.
Anyone out there, please help me.
If you don't mind using C# code, I would be more than happy to help you; otherwise it's a very lengthy process. If you decide that Python is the only road you're willing to take, then you should try grabbing the encrypted value within C:\Users\[USERNAME]\AppData\Local\Google\Chrome\User Data\Default\Cookies (change the file path according to your OS). You can use SQLite to read and modify the encrypted values.
// SQLDatabase1 and Decrypt are helpers of mine: SQLDatabase1 reads the
// SQLite Cookies file, Decrypt unprotects the encrypted cookie value.
string cookie = Decrypt(Encoding.Default.GetBytes(SQLDatabase1.GetValue(i, "encrypted_value")));
if (cookie.Contains(".ASPXANONYMOUS"))
{
    string step1 = cookie + "END";
    string step2 = step1 + ".ASPXANONYMOUS";
}
The code above may help you on your journey.
How can I get ALL records from Route53?
I'm referring to the code snippet here, which seemed to work for someone but isn't clear to me: https://github.com/aws/aws-sdk-ruby/issues/620
I'm trying to get all of them (I have about 7,000 records) via resource record sets, but I can't seem to get the pagination to work with list_resource_record_sets. Here's what I have:
route53 = Aws::Route53::Client.new
response = route53.list_resource_record_sets({
  start_record_name: fqdn(name),
  start_record_type: type,
  max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
})
response.last_page?
response = response.next_page until response.last_page?
I verified I'm hooked into the right region, and I see the record I'm trying to get (so I can delete it later) in the AWS console, but I can't seem to get it through the API. I used this as a starting point: https://github.com/aws/aws-sdk-ruby/issues/620
Any ideas on what I'm doing wrong? Or is there an easier way, perhaps another method in the API I'm not finding, to get just the record I need given the hosted_zone_id, type, and name?
The issue you linked is for the Ruby AWS SDK v2, but the latest is v3. It also looks like things may have changed around a bit since 2014, as I'm not seeing the #next_page or #last_page? methods in the v2 API or the v3 API.
Consider using the #next_record_name and #next_record_type from the response when #is_truncated is true. That's more consistent with how other paginations work in the Ruby AWS SDK, such as with DynamoDB scans for example.
Something like the following should work (though I don't have an AWS account with records to test it out):
route53 = Aws::Route53::Client.new
hosted_zone = ? # Required field according to the API docs
next_name = fqdn(name)
next_type = type
loop do
  response = route53.list_resource_record_sets(
    hosted_zone_id: hosted_zone,
    start_record_name: next_name,
    start_record_type: next_type,
    max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
  )
  records = response.resource_record_sets

  # Break here if you find the record you want
  # Also break if we've run out of pages
  break unless response.is_truncated

  next_name = response.next_record_name
  next_type = response.next_record_type
end
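Since you mention wanting to delete the record once you've found it, here's a hedged sketch of that follow-up call (assuming record holds the matched entry from the loop above; note that a DELETE change must match the existing record exactly):

route53.change_resource_record_sets(
  hosted_zone_id: hosted_zone,
  change_batch: {
    changes: [{
      action: 'DELETE',
      resource_record_set: record.to_h # name, type, TTL, and values must all match
    }]
  }
)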
This question is related to this one: Tracking Upload Progress of File to S3 Using Ruby aws-sdk. However, since there is no clear solution there, I was wondering if there's a better/easier way (if one exists) of getting file upload progress with S3 using Ruby in 2018?
In my current setup I'm basically creating a new Resource, fetching my bucket, and calling upload_file, but I haven't yet found any option for passing a block which would help in yielding some sort of progress.
...
@connection = Aws::S3::Resource.new
@s3_bucket = @connection.bucket(bucket)
@s3_bucket.object(path).upload_file(data, {acl: 'public-read'})
...
Is there a way to do this using the newest sdk-for-ruby v3?
Any help (or, even better, a small example) would be great.
The example Trevor gives in https://stackoverflow.com/a/12147709/153886 is not hacky from what I can see - just wiring things together. The SDK simply does not provide a feature for passing progress details on all operations. Plus, Trevor is the maintainer of the Ruby SDK at AWS so I trust his judgement.
Expanding on his example:
# Note: this snippet uses the older aws-sdk v1 interface from that answer.
bar = ProgressBar.create(:title => "Uploading action", :starting_at => 0, :total => file.size)
obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => file.size) do |writable, n_bytes|
  writable.write(file.read(n_bytes))
  bar.progress += n_bytes
end
If you want to have a progress block right in the upload_file method, I believe you will need to open a PR to the SDK. It is not that strange that this isn't available in Ruby (or in any other runtime) because, for example, there could be an optimisation in the HTTP client library that uses IO.copy_stream from your source body argument to the destination socket, which does not relay progress anywhere.
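One workaround that avoids patching the SDK: wrap the source file in an object that reports every read, and pass it to Object#put (which accepts an IO-like body) instead of upload_file. A minimal, untested sketch; ProgressIO is a hypothetical helper, bucket/path/data are the names from your question, and it assumes the v3 SDK streams the body via read:

require 'aws-sdk-s3'
require 'ruby-progressbar'

# Hypothetical wrapper: delegates reads to the underlying IO and
# advances a progress bar by the number of bytes actually read.
class ProgressIO
  def initialize(io, bar)
    @io, @bar = io, bar
  end

  def read(length = nil, buffer = nil)
    chunk = @io.read(length, buffer)
    @bar.progress += chunk.bytesize if chunk
    chunk
  end

  def size
    @io.size
  end

  def rewind
    @bar.progress = 0
    @io.rewind
  end
end

file = File.open(data)
bar = ProgressBar.create(title: 'Uploading', total: file.size)
s3 = Aws::S3::Resource.new
s3.bucket(bucket).object(path).put(body: ProgressIO.new(file, bar), acl: 'public-read')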
I have written a Jekyll plugin to display the number of pageviews on a page by calling the Google Analytics API using the garb gem. The only trouble with my approach is that it makes a call to the API for each page, slowing down build time and also potentially hitting the user call limits on the API.
It would be possible to return all the data in a single call and store it locally, then look up the pageview count from each page, but my Jekyll/Ruby-fu isn't up to scratch. I do not know how to write the plugin so that it runs once, gets all the data, and stores it locally where my current function can access it, rather than calling the API page by page.
Basically my code is written as a Liquid block that can be put into my page layout:
class GoogleAnalytics < Liquid::Block
  def initialize(tag_name, markup, tokens)
    super # options that appear in block (between tag and endtag)
    @options = markup # optional options passed in by opening tag
  end

  def render(context)
    path = super

    # Read in credentials and authenticate
    cred = YAML.load_file("/home/cboettig/.garb_auth.yaml")
    Garb::Session.api_key = cred[:api_key]
    token = Garb::Session.login(cred[:username], cred[:password])
    profile = Garb::Management::Profile.all.detect {|p| p.web_property_id == cred[:ua]}

    # place query, customize to modify results
    data = Exits.results(profile,
                         :filters => {:page_path.eql => path},
                         :start_date => Chronic.parse("2011-01-01"))
    data.first.pageviews
  end
end
The full version of my plugin is here.
How can I move all the calls to the API into some other function, make sure Jekyll runs it once at the start, and then adjust the tag above to read that local data?
EDIT: Looks like this can be done with a Generator that writes the data to a file; see the example on this branch. Now I just need to figure out how to subset the results: https://github.com/Sija/garb/issues/22
To store the data, I had to:
1. Write a Generator class (see Jekyll wiki plugins) to call the API (see the sketch after this list).
2. Convert the data to a hash for easy lookup by path (see step 5):
   result = Hash[data.collect{|row| [row.page_path, [row.exits, row.pageviews]]}]
3. Write the data hash to a JSON file.
4. Read the data back in from the file in my existing Liquid block class. Note that the block tag works from the _includes dir, while the generator works from the root directory.
5. Match the page path, which is easy once the data is converted to a hash:
   result[path][1]
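A condensed sketch of that wiring (untested; the file name and paths are hypothetical, and profile/Exits are set up exactly as in the question):

require 'json'

module Jekyll
  # Step 1: runs once per build, before any page renders
  class AnalyticsGenerator < Generator
    def generate(site)
      # Query without a :page_path filter so one API call covers every page
      data = Exits.results(profile, :start_date => Chronic.parse("2011-01-01"))
      # Step 2: hash keyed by path; step 3: dump it to JSON
      result = Hash[data.collect { |row| [row.page_path, [row.exits, row.pageviews]] }]
      File.open("_includes/analytics.json", "w") { |f| f.write(result.to_json) }
    end
  end
end

# Steps 4 and 5, inside the Liquid block's render method
# (path is relative to _includes here, per the note in step 4):
result = JSON.parse(File.read("analytics.json"))
result[path][1]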
Code for the full plugin, showing how to create the generator, write files, etc., is here.
And thanks to Sija on GitHub for help on this.