Loading a protocol buffer in Ruby or Java similar to Node

I have a .proto file that contains my schema and service definition. I'm looking for a way in Ruby or Java to load and parse it, similar to how Node does it (code below). Looking at the grpc Ruby gem, I don't see anything that can replicate what Node does.
Digging around, I found this issue (https://github.com/grpc/grpc/issues/6708), which states that dynamically loading .proto files is only available in Node. Hopefully someone can provide me with an alternative.
Use case: loading .proto files dynamically as provided by the client, but I can only use either Ruby or Java to do it.
let grpc = require("grpc");
let loader = require("@grpc/proto-loader");
// parse the .proto file at runtime...
let packageDefinition = loader.loadSync(file.file, {});
// ...and turn it into a gRPC package with service and message definitions
let parsed = grpc.loadPackageDefinition(packageDefinition);

I've been giving this a try for the past few months, and it seems like Node is the only way to read a protobuf file at runtime. Hopefully this helps anyone in the future who needs it.
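For completeness, one partial workaround in Java (my own assumption, not something confirmed in the linked issue): if it is acceptable to pre-compile the .proto into a binary descriptor set with protoc, that descriptor set can be loaded and inspected at runtime, and payloads can then be handled generically with DynamicMessage. The file and message names below are placeholders.

import com.google.protobuf.DescriptorProtos.FileDescriptorProto;
import com.google.protobuf.DescriptorProtos.FileDescriptorSet;
import com.google.protobuf.Descriptors.Descriptor;
import com.google.protobuf.Descriptors.FileDescriptor;
import java.io.FileInputStream;

public class DescriptorSetLoader {
    public static void main(String[] args) throws Exception {
        // Generated ahead of time with:
        //   protoc --descriptor_set_out=schema.desc --include_imports my_service.proto
        FileDescriptorSet set;
        try (FileInputStream in = new FileInputStream("schema.desc")) {
            set = FileDescriptorSet.parseFrom(in);
        }
        // Build descriptors for each file; this simple sketch assumes no cross-file imports.
        for (FileDescriptorProto proto : set.getFileList()) {
            FileDescriptor fd = FileDescriptor.buildFrom(proto, new FileDescriptor[] {});
            for (Descriptor messageType : fd.getMessageTypes()) {
                System.out.println(messageType.getFullName());
            }
            // com.google.protobuf.DynamicMessage.parseFrom(fd.findMessageTypeByName("MyMessage"), bytes)
            // can then decode payloads without generated classes.
        }
    }
}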

Related

osquery extension in Ruby - create new table

I'm trying to implement an extension for osquery in Ruby.
I found some libs and examples doing the same in Java, Node and Python, but nothing helpful implemented in Ruby language.
According to this documentation, it's possible to generate the code using Thrift: https://osquery.readthedocs.io/en/stable/development/osquery-sdk/#thrift-api
The steps I did, so far:
Generated the code using thrift -r --gen rb osquery.thrift
Created a class and some code to connect to the server and register the extension
This is the code of the class
# include thrift-generated code
$:.push('./gen-rb')
require 'thrift'
require 'extension_manager'
socket = Thrift::UNIXSocket.new(<path_to_socket>)
transport = Thrift::FramedTransport.new(socket)
protocol = Thrift::BinaryProtocol.new(transport)
client = ExtensionManager::Client.new(protocol)
transport.open()
info = InternalExtensionInfo.new
info.name = "zzz"
info.version = "1.0"
extension = ExtensionManager::RegisterExtension_args.new
extension.info = info
client.registerExtension(extension, {'table' => {'zzz' => [{'name' => 'TEXT'}]}})
To get the <path_to_socket> you can use:
> osqueryi --nodisable_extensions
osquery> select value from osquery_flags where name = 'extensions_socket';
+-----------------------------------+
|               value               |
+-----------------------------------+
| /Users/USERNAME/.osquery/shell.em |
+-----------------------------------+
However, when I look for this table in osqueryi, I don't see it in the output of select * from osquery_registry;.
Has anybody, by any chance, implemented an osquery extension already? I'm stuck and don't know how to proceed from here.
I don't think I've seen anyone make a Ruby extension, but once you have the Thrift side working, it should be pretty simple.
As a tool, osquery supports a lot of options, so there's no single way to do this. Generally speaking, extensions run as their own process and communicate over that Thrift socket.
Usually they're very simple, and osquery invokes the extensions directly with the appropriate command line arguments. This is alluded to in the doc you linked, with the example accepting --interval, --socket, and --timeout. If you do this, you'll want to look at osquery's --extensions_autoload and --extensions_require options. (I would recommend this route.)
Less common is to start osquery with a specified socket path using --extensions_socket and have your extension connect to that. This approach is more common when the extension cannot be a simple binary and is instead a large, complex system.
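As a rough illustration of the autoload route (the file paths and the extension name here are made up, so adjust them to your setup): --extensions_autoload points at a file listing one extension executable per line, and --extensions_require makes osquery wait for those extensions by name.

# extensions.load -- one absolute path to an extension executable per line
/usr/local/osquery_extensions/my_extension.ext

# start osquery so it launches the listed extension and waits for it to register
osqueryd --extensions_autoload=/path/to/extensions.load --extensions_require=my_extension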
I found myself playing around with Thrift via Ruby, and it seems to work if I use a BufferedTransport:
socket = Thrift::UNIXSocket.new('/tmp/osq.sock')
transport = Thrift::BufferedTransport.new(socket)
protocol = Thrift::BinaryProtocol.new(transport)
client = ExtensionManager::Client.new(protocol)
transport.open()
client.ping()
client.query("select 1")

How to send a list of objects in Haxe between client and server?

I am trying to write an online message board in Haxe (OpenFL). There are lots of server/client examples online, but I am new to this area and do not understand any of them. What is the easiest way to send a list of objects between server and client? Could you give an example?
You could use JSON.
You can put this in your OpenFL project (client):
var myData = [1, 2, 3, 4, 5];
var http = new haxe.Http("server.php");
http.addParameter("myData", haxe.Json.stringify(myData));
http.onData = function(resultData) {
    trace('the data is sent to the server, this is the response: ' + resultData);
}
http.request(true);
If you have a server.php file, you can access the data like this:
$myData = json_decode($_POST["myData"]);
If the server returns Json data which needs to be read in the client, then in Haxe you need to do haxe.Json.parse(resultData);
EDIT: I'm still not sure if the user's problem is really about sending "a list of objects"; see the comment on the question...
The easiest way is to use Haxe Serialization, either with Haxe Remoting or with your own protocol on top of TCP/UDP. The choice of protocol depends on whether you already have something built and whether you will be calling functions or simply getting/posting data.
In either case, haxe.Serializer/Unserializer will give you a format to transmit most (if not all) Haxe objects from client to server with minimal code.
See the following minimal example (from the manual) on how to use the serialization APIs. The format is string based and specified.
import haxe.Serializer;
import haxe.Unserializer;
class Main {
    static function main() {
        var serializer = new Serializer();
        serializer.serialize("foo");
        serializer.serialize(12);
        var s = serializer.toString();
        trace(s); // y3:fooi12
        var unserializer = new Unserializer(s);
        trace(unserializer.unserialize()); // foo
        trace(unserializer.unserialize()); // 12
    }
}
Finally, you could also use other serialization formats like JSON (with haxe.Json.stringify/parse) or XML, but they wouldn't be so convenient if you're dealing with enums, class instances or other data not fully supported by these formats.
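To tie this back to the original question, here is a minimal sketch of serializing a list of class instances on one side and reading it back on the other (the Message class and its fields are made up for illustration); the resulting string can be sent exactly like the JSON payload in the first answer:

import haxe.Serializer;
import haxe.Unserializer;

class Message {
    public var author:String;
    public var text:String;
    public function new(author:String, text:String) {
        this.author = author;
        this.text = text;
    }
}

class Main {
    static function main() {
        var messages = [new Message("alice", "hi"), new Message("bob", "hello")];
        // client: turn the list into a string (e.g. to send as an HTTP parameter)
        var payload = Serializer.run(messages);
        // server: turn the string back into typed objects
        // (the Message class must be available on both sides)
        var received:Array<Message> = Unserializer.run(payload);
        trace(received[1].text); // hello
    }
}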

Parquet-MR AvroParquetWriter - how to convert data to Parquet (with Specific Mapping)

I'm working on a tool for converting data from a homegrown format to Parquet and JSON (for use in different settings with Spark, Drill and MongoDB), using Avro with Specific Mapping as the stepping stone. I have to support conversion of new data on a regular basis and on client machines, which is why I'm trying to write my own standalone conversion tool with an (Avro|Parquet|JSON) switch instead of using Drill, Spark or other tools as converters, as I probably would if this were a one-time job. I'm basing the whole thing on Avro because this seems like the easiest way to get conversion to Parquet and JSON under one hood.
I used Specific Mapping to profit from static type checking, wrote an IDL, converted that to a schema.avsc, generated classes and set up a sample conversion with the specific constructor, but now I'm stuck configuring the writers. All Avro-to-Parquet conversion examples I could find [0] use AvroParquetWriter with deprecated signatures (mostly: Path file, Schema schema) and Generic Mapping.
AvroParquetWriter has only one non-deprecated constructor, with this signature:
AvroParquetWriter(
    Path file,
    WriteSupport<T> writeSupport,
    CompressionCodecName compressionCodecName,
    int blockSize,
    int pageSize,
    boolean enableDictionary,
    boolean enableValidation,
    WriterVersion writerVersion,
    Configuration conf
)
Most of the parameters are not hard to figure out but WriteSupport<T> writeSupport throws me off. I can't find any further documentation or an example.
Staring at the source of AvroParquetWriter I see GenericData model pop up a few times but only one line mentioning SpecificData: GenericData model = SpecificData.get();.
So I have a few questions:
1) Does AvroParquetWriter not support Avro Specific Mapping? Or does it support it by means of that SpecificData.get() call? The comment "Utilities for generated Java classes and interfaces." over SpecificData.class seems to suggest that, but how exactly should I proceed?
2) What's going on in the AvroParquetWriter constructor, is there an example or some documentation to be found somewhere?
3) More specifically: the signature of the WriteSupport method asks for 'Schema avroSchema' and 'GenericData model'. What does GenericData model refer to? Maybe I'm not seeing the forest because of all the trees here...
To give an example of what I'm aiming for, my central piece of Avro conversion code currently looks like this:
DatumWriter<MyData> avroDatumWriter = new SpecificDatumWriter<>(MyData.class);
DataFileWriter<MyData> dataFileWriter = new DataFileWriter<>(avroDatumWriter);
dataFileWriter.create(schema, avroOutput);
The Parquet equivalent currently looks like this:
AvroParquetWriter<SpecificRecord> parquetWriter = new AvroParquetWriter<>(parquetOutput, schema);
but this is no more than a beginning and is modeled after the examples I found, using the deprecated constructor, so it will have to change anyway.
Thanks,
Thomas
[0] Hadoop - The definitive Guide, O'Reilly, https://gist.github.com/hammer/76996fb8426a0ada233e, http://www.programcreek.com/java-api-example/index.php?api=parquet.avro.AvroParquetWriter
Try AvroParquetWriter.builder:
MyData obj = ...; // should be an Avro object (an instance of your generated specific record class)
// file is an org.apache.hadoop.fs.Path pointing at the output
ParquetWriter<MyData> pw = AvroParquetWriter.<MyData>builder(file)
    .withSchema(obj.getSchema())
    .build();
pw.write(obj);
pw.close();
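Building on that, here is a slightly fuller sketch of what I would try for Specific Mapping. It is hedged: MyData and the output path are placeholders, withDataModel(SpecificData.get()) is my assumption for wiring in the specific model, and it assumes a parquet-avro version where the builder API is available (older releases use the parquet.avro package instead of org.apache.parquet.avro).

import org.apache.avro.specific.SpecificData;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class SpecificParquetWriterExample {
    // MyData is assumed to be a class generated by the Avro compiler from schema.avsc
    public static void write(MyData record, String outputPath) throws Exception {
        try (ParquetWriter<MyData> writer = AvroParquetWriter.<MyData>builder(new Path(outputPath))
                .withSchema(MyData.getClassSchema())       // schema carried by the generated class
                .withDataModel(SpecificData.get())         // use specific instead of generic mapping
                .withCompressionCodec(CompressionCodecName.SNAPPY)
                .build()) {
            writer.write(record);
        }
    }
}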
Thanks.

sparkR 1.4.0 : how to include jars

I'm trying to hook SparkR 1.4.0 up to Elasticsearch using the elasticsearch-hadoop-2.1.0.rc1.jar jar file (found here). It requires a bit of hacking together, calling the SparkR:::callJMethod function. I need to get a jobj R object for a couple of Java classes. For some of the classes, this works:
SparkR:::callJStatic('java.lang.Class',
'forName',
'org.apache.hadoop.io.NullWritable')
But for others, it does not:
SparkR:::callJStatic('java.lang.Class',
'forName',
'org.elasticsearch.hadoop.mr.LinkedMapWritable')
Yielding the error:
java.lang.ClassNotFoundException: org.elasticsearch.hadoop.mr.EsInputFormat
It seems like Java isn't finding the org.elasticsearch.* classes, even though I've tried including them with the command line --jars argument, and the sparkR.init(sparkJars = ...) function.
Any help would be greatly appreciated. Also, if this is a question that more appropriately belongs on the actual SparkR issue tracker, could someone please point me to it? I looked and was not able to find it. Finally, if someone knows an alternative way to hook SparkR up to Elasticsearch, I'd be happy to hear that as well.
Thanks!
Ben
Here's how I've achieved it:
# environments, packages, etc ----
Sys.setenv(SPARK_HOME = "/applications/spark-1.4.1")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
# connecting Elasticsearch to Spark via ES-Hadoop-2.1 ----
spark_context <- sparkR.init(master = "local[2]", sparkPackages = "org.elasticsearch:elasticsearch-spark_2.10:2.1.0")
spark_sql_context <- sparkRSQL.init(spark_context)
spark_es <- read.df(spark_sql_context, path = "index/type", source = "org.elasticsearch.spark.sql")
printSchema(spark_es)
(Spark 1.4.1, Elasticsearch 1.5.1, ES-Hadoop 2.1 on OS X Yosemite)
The key idea is to link to the ES-Hadoop package and not the jar file, and to use it to create a Spark SQL context directly.

Xtext get the absolute path of the generated files

I want to access the file generated by Xtext so that I can compile it automatically, so I need its absolute path. It would be enough to get the absolute path of the current project at runtime. Any idea how I can get it?
I am working inside the "MyDslGenerator" class. I tried to get it from the "resource" in
override void doGenerate(Resource resource, IFileSystemAccess fsa)
but couldn't find it.
Help is highly appreciated.
I ended up using this code:
var uri = (fsa as IFileSystemAccessExtension2).getURI(fileName)
Maybe you can use the interface org.eclipse.xtext.generator.IFileSystemAccessExtension2; the IFileSystemAccess passed to doGenerate may implement this interface too.
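If you then need to turn the URI returned by getURI into an absolute file-system path, something along these lines might work (a hedged sketch, assuming the generator runs inside an Eclipse workspace; the class name is made up, the method calls come from the Eclipse/EMF APIs):

import org.eclipse.core.resources.ResourcesPlugin;
import org.eclipse.core.runtime.Path;
import org.eclipse.emf.common.util.URI;

public class UriToAbsolutePath {
    public static String toAbsolutePath(URI uri) {
        if (uri.isPlatformResource()) {
            // platform:/resource/... URIs are resolved against the workspace
            return ResourcesPlugin.getWorkspace().getRoot()
                    .getFile(new Path(uri.toPlatformString(true)))
                    .getLocation().toOSString();
        }
        // file:/... URIs can be converted directly
        return uri.toFileString();
    }
}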