ruby-nmap: how to create hash output instead of XML

I want to use the ruby-nmap gem to do a port scan on a number of instances. Here's what I'm currently using:
Nmap::Program.scan do |nmap|
  nmap.syn_scan = true
  nmap.service_scan = true
  nmap.os_fingerprint = true
  nmap.xml = 'scan.xml'
  nmap.verbose = true

  # address[:public_ip] is my target
  nmap.targets = address[:public_ip]
end
This creates an XML file, but I would prefer JSON or a hash as output, without writing anything to a file. Is there an easy way to do this other than reading back the XML file it creates?
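As far as I know the gem only emits XML, but it also bundles a parser, so the usual route is to parse scan.xml straight into a hash after the scan runs. A minimal sketch, assuming ruby-nmap's Nmap::XML parser API (each_host, each_port, and the port accessors; the hash shape here is my own choice):
require 'nmap/xml'

results = {}
Nmap::XML.open('scan.xml') do |xml|
  xml.each_host do |host|
    results[host.ip] = host.each_port.map do |port|
      { number: port.number, state: port.state, service: port.service&.name }
    end
  end
end
# results is now a plain Ruby hash; require 'json' and call results.to_json for JSON
If you don't want scan.xml lingering on disk, you could point nmap.xml at a Tempfile path and delete it afterwards.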

Related

How do I write binary data to GCS with Ruby efficiently?

I want to upload binary data directly to GCP storage, without writing the file to disk. Below is the code snippet that gets me to my current state.
require 'google/cloud/storage'
bucket_name = '-----'
data = File.open('image_block.jpg', 'rb') {|file| file.read }
storage = Google::Cloud::Storage.new("project_id": "maybe-i-will-tell-u")
bucket = storage.bucket bucket_name, skip_lookup: true
Now I want to directly put this data into a file on gcs, without having to write a file to disk.
Is there an efficient way we can do that?
I tried the following code
to_send = StringIO.new(data).read
bucket.create_file to_send, "image_inder_11111.jpg"
but this throws an error saying
/google/cloud/storage/bucket.rb:2898:in `file?': path name contains null byte (ArgumentError)
from /home/inder/.gem/gems/google-cloud-storage-1.36.1/lib/google/cloud/storage/bucket.rb:2898:in `ensure_io_or_file_exists!'
from /home/inder/.gem/gems/google-cloud-storage-1.36.1/lib/google/cloud/storage/bucket.rb:1566:in `create_file'
from champa.rb:14:in `<main>'
As suggested by @stefan, it should be to_send = StringIO.new(data), i.e. without .read (which would return a String again).
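For completeness, a corrected sketch (create_file accepts an IO object directly; the bucket and object names are the ones from the question):
require 'google/cloud/storage'
require 'stringio'

storage = Google::Cloud::Storage.new(project_id: 'maybe-i-will-tell-u')
bucket = storage.bucket bucket_name, skip_lookup: true

io = StringIO.new(data)    # hand create_file the IO itself, not io.read
bucket.create_file io, 'image_inder_11111.jpg'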

Precision gets lost for big number in telegraf

Precision gets lost for big numbers.
I am using the tail input plugin to read a file whose contents are in JSON format.
Below is the configuration:
[[inputs.tail]]
  files = ["E:/Telegraph/MSTCIVRRequestLog_*.json"]
  from_beginning = true
  name_override = "tcivrrequest"
  data_format = "json"
  json_strict = true

[[outputs.file]]
  files = ["E:/Telegraph/output.json"]
  data_format = "json"
Input file contains
{"RequestId":959011990586458245}
Expected Output
{"fields":{"RequestId":959011990586458245},"name":"tcivrrequest","tags":{},"timestamp":1632994599}
Actual Output
{"fields":{"RequestId":959011990586458200},"name":"tcivrrequest","tags":{},"timestamp":1632994599}
The number 959011990586458245 gets converted into 959011990586458200 (check the last few digits).
I have already tried the following, but none of it worked:
json_string_fields = ["RequestId"]

[[processors.converter]]
  [processors.converter.fields]
    string = ["RequestId"]

precision = "1s"
json_int64_fields = ["RequestId"]
character_encoding = "utf-8"
json_strict = true
I was able to reproduce this with the json parser as well. That parser stores every number as a 64-bit float, which carries only 53 bits of mantissa, so integers larger than 2^53 (such as 959011990586458245) get rounded. My suggestion would be to move to the json_v2 parser with a config like the following:
[[inputs.file]]
  files = ["metrics.json"]
  data_format = "json_v2"
  [[inputs.file.json_v2]]
    [[inputs.file.json_v2.field]]
      path = "RequestId"
      type = "int"
I was able to get a result as follows:
file RequestId=959011990586458245i 1651181595000000000
The newer parser is generally more accurate and flexible for simple cases like the one you provided.
Thanks!
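For reference, the rounding is plain IEEE-754 float64 behaviour, so it is easy to reproduce outside Telegraf. A quick illustration in Ruby (any language with doubles behaves the same way):
n = 959011990586458245
n > 2**53      # => true: beyond the range where float64 represents every integer
n.to_f.to_i    # => 959011990586458240: n rounded to the nearest float64
The serializer then prints that float with about 16 significant digits, which is where the trailing ...458200 in the output comes from.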

Is it possible to read pdf/audio/video files(unstructured data) using Apache Spark?

Is it possible to read pdf/audio/video files(unstructured data) using Apache Spark?
For example, I have thousands of PDF invoices and I want to read the data from those and perform some analytics on it. What steps must I take to process unstructured data?
Yes, it is. Use sparkContext.binaryFiles to load the files in binary format, then use map to transform each value into some other format - for example, by parsing the binary content with Apache Tika or Apache POI.
Pseudocode:
val rawFile = sparkContext.binaryFiles(...)
val ready = rawFile.map( /* parse here with the other framework */ )
What is important is that the parsing must be done with another framework, as mentioned above. map receives a (path, PortableDataStream) pair; calling open() on the stream gives you an InputStream to feed the parser.
We had a scenario where we needed to use a custom decryption algorithm on the input files. We didn't want to rewrite that code in Scala or Python. Python-Spark code follows:
import socket
import subprocess

from pyspark import SparkContext, SparkConf, AccumulatorParam
from pyspark.sql import HiveContext

def decryptUncompressAndParseFile(filePathAndContents):
    '''each line of the file becomes an RDD record'''
    global acc_errCount, acc_errLog
    proc = subprocess.Popen(['custom_decrypt_program', '--decrypt'],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (unzippedData, err) = proc.communicate(input=filePathAndContents[1])
    if len(err) > 0:  # problem reading the file
        acc_errCount.add(1)
        acc_errLog.add('Error: ' + str(err) + ' in file: ' + filePathAndContents[0] +
                       ', on host: ' + socket.gethostname() + ' return code: ' + str(proc.returncode))
        return []  # this is okay with flatMap
    records = list()
    iterLines = iter(unzippedData.splitlines())
    for line in iterLines:
        # sys.stderr.write('Line: ' + str(line) + '\n')
        values = [x.strip() for x in line.split('|')]
        ...
        records.append( (... extract data as appropriate from values into this tuple ...) )
    return records

class StringAccumulator(AccumulatorParam):
    '''custom accumulator to hold strings'''
    def zero(self, initValue=""):
        return initValue
    def addInPlace(self, str1, str2):
        return str1.strip() + '\n' + str2.strip()

def main():
    ...
    global acc_errCount, acc_errLog
    acc_errCount = sc.accumulator(0)
    acc_errLog = sc.accumulator('', StringAccumulator())
    binaryFileTup = sc.binaryFiles(args.inputDir)
    # use flatMap instead of map, to handle corrupt files
    linesRdd = binaryFileTup.flatMap(decryptUncompressAndParseFile, True)
    df = sqlContext.createDataFrame(linesRdd, ourSchema())
    df.registerTempTable("dataTable")
    ...
The custom string accumulator was very useful in identifying corrupt input files.

Create an image file from binary data in Ruby

I am able to access the binary data of a file and store it in a variable like this
s = File.binread("sample_22122015_03.jpg")
bits = s.unpack("B*")[0]
where bits has data like this "101001001010100100......."
However, I want to make some changes and then write the binary data back out as a new image, but I am unable to.
I am using
File.open('shipping_label_new.jpg', 'wb') do |f|
  f.write(Base64.decode64(bits))
end
but it's not working and I see that the image is corrupt.
Try this code
s = File.binread("test_img.jpg")
bits = s.unpack("B*")
File.open('new_test_img.jpg', 'wb') do |f|
  f.write(bits.pack("B*"))
end
The reverse of String#unpack is Array#pack. (Your bits string is plain ASCII '0'/'1' characters, not Base64, which is why Base64.decode64 corrupts the image.)
:007 > bits = 'abc'.unpack("B*")
=> ["011000010110001001100011"]
:008 > bits.pack("B*")
=> "abc"

Create in-memory only gzip

I'm trying to gzip a file in Ruby without having to write it to disk first. Currently I only know how to make it work by having Zlib::GzipWriter write to a file, but I'm really hoping I can avoid that and keep it in-memory only.
I've tried this, with no success:
def self.make_gzip(data)
  gz = Zlib::GzipWriter.new(StringIO.new)
  gz << data
  string = gz.close.string
  StringIO.new(string, 'rb').read
end
Here is what happens when I test it out:
# Files
normal = File.new('chunk0.nbt')
gzipped = File.new('chunk0.nbt.gz')
# Try to create gzip in program
make_gzip normal
=> "\u001F\x8B\b\u0000\x8AJhS\u0000\u0003S\xB6q\xCB\xCCI\xB52\xA8000OK1L\xB2441J5\xB5\xB0\u0003\u0000\u0000\xB9\x91\xDD\u0018\u0000\u0000\u0000"
# Read from a gzip created with the gzip command
reader = Zlib::GzipReader.open gzipped
reader.read
"\u001F\x8B\b\u0000\u0000\u0000\u0000\u0000\u0000\u0000\xED]\xDBn\xDC\xC8\u0011%\x97N\xB82<\x9E\x89\xFF!\xFF!\xC9\xD6dFp\x80\u0005\xB2y\r\"\xEC\n\x89\xB0\xC6\xDAX+A./\xF94\xBF\u0006\xF1\x83>`\u0005\xCC\u000F\xC4\xF0\u000F.............(for 10,000 columns)
You're actually gzipping normal.to_s (which is something like "#<File:0x007f53c9b55b48>") in the following code:
# Files
normal = File.new('chunk0.nbt')
# Try to create gzip in program
make_gzip normal
You should read the content of the file, and make_gzip on the content:
make_gzip normal.read
As I commented, make_gzip can also be simplified: GzipWriter#close returns the underlying StringIO, so you can take its string directly:
def self.make_gzip(data)
  gz = Zlib::GzipWriter.new(StringIO.new)
  gz << data
  gz.close.string
end
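A quick in-memory round-trip check (assuming the same chunk0.nbt file as above, and calling make_gzip on whatever module defines it):
require 'zlib'
require 'stringio'

data = File.binread('chunk0.nbt')
gz_bytes = make_gzip(data)
# inflate the gzipped bytes without touching disk and compare
Zlib::GzipReader.new(StringIO.new(gz_bytes)).read == data   # => true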
