Unable to call sparkRSQL.init function - sparkr

I am new to Spark and was trying to run the example mentioned in SparkR page. With some effort, I was able to install sparkR into my machine and was able to run the basic wordcount example. However, when I try to run:
library(SparkR) #works fine - loads the package
sc <- sparkR.init() #works fine
sqlContext <- sparkRSQL.init(sc) #fails
It says, there is no package called ‘sparkRSQL’. As per documentation sparkRSQL.init is a function in sparkR package. Please let me know if I am missing anything here.
Thanks in advance.

I have already faced this problem when trying to test sparkR. There is a lack of documentation on this part.
The problem is that "sparkRSQL" and "sparkRHive" are not included in the master branch, so you have to install sparkR package from the "sparkr-sql" branch using this command:
library(devtools)
install_github("amplab-extras/SparkR-pkg", ref="sparkr-sql", subdir="pkg")
There was a hint in Amplab website
DataFrame was introduced in Spark 1.3; the 1.3-compatible SparkR version can be found in the Github repo sparkr-sql branch, which includes a preliminary R API to work with DataFrames. To link SparkR against older versions of Spark, use the archives on this page or the master branch.

Related

Go Module Import - Invalid version: go.mod has malformed module path

I have looked all over for an answer for "go.mod has malformed module path" but I have not found an answer to why I can't get a library I am writing to import. To simplify I have made a tiny library repo: https://github.com/buphmin/test-go-pkg
Note: I am using the stripe api library for structure inspiration. https://github.com/stripe/stripe-go
Problem:
I create a library, go mod init , push code and tag to github. Then try to import package to use the library elsewhere and I get an error message: 'go get: github.com/buphmin/test-go-pkg#v1.0.0: invalid version: go.mod has malformed module path "github.com/buphmin/test-go-pkg/v1" at revision v1.0.0'
I have no idea why this is an issue and I have not found an answer thus far.
Steps to Reproduce
Assuming you have go installed.
Create local folder
go mod init <your_mod>
go get github.com/buphmin/test-go-pkg/v1
error occurs
Other info
go v1.16
ubuntu 18 LTS
go mod file
Copied from the source of truth listed above: https://github.com/buphmin/test-go-pkg
module github.com/buphmin/test-go-pkg/v1
go 1.16
Edit - Answer:
My understand now with the help of #Steven Penny is that v1, v2, etc has more significance than just organization. This article explains how go treats that versioning https://www.honeybadger.io/blog/golang-go-package-management.
This:
module github.com/buphmin/test-go-pkg/v1
is not valid. Should be this:
module github.com/buphmin/test-go-pkg

go-swagger restapi/configure_todo_list.go - api.TodoGetHandler undefined error

I am a newbie in go and go-swagger. I am following steps in Simple Server tutorial in goswagger.io.
I am using Ubuntu 18.04, swagger v0.25.0 and go 1.15.6.
Following the same steps, there are a few differences of the files generated. For instance, goswagger.io's has find_todos_okbody.go and get_okbody.go in models but mine does not. Why is that so?
Link to screenshot of my generated files vs
Link to screenshot of generated files by swagger.io
Starting the server as written in the tutorial go install ./cmd/todo-list-server/ gives me the following error. Can anyone please help with this?
# my_folder/swagger-todo-list/restapi
restapi/configure_todo_list.go:41:8: api.TodosGetHandler undefined (type *operations.TodoListAPI has no field or method TodosGetHandler)
restapi/configure_todo_list.go:42:6: api.TodosGetHandler undefined (type *operations.TodoListAPI has no field or method TodosGetHandler)
The first step in goswagger.io todo-list is swagger init spec .... Which directory should I run this command in? I ran it in a newly created folder in my home directory. However, from the page, it shows the path to be ~/go/src/github.com/go-swagger/go-swagger/examples/tutorials/todo-list. I am not sure whether I should use go get ..., git clone ... or create those folders. Can someone advise me?
Thanks.
This is likely the documentation lagging behind the version of the code that you are running. As long as it compiles, the specific files the tool generates isn't so crucial.
This is a compilation error. When you do go install foo it will try to build the foo package as an executable and then move that to your GOPATH/bin directory. It seems that the generated code in restapi/configure_todo_list.go isn't correct for the operations code generated.
All you need to run this tutorial yourself is an empty directory and the swagger tool (not its source code). You run the commands from the root of this empty project. In order not to run into GOPATH problems I would initialise a module with go mod init todo-list-example before doing anything else.
Note that while the todo-list example code exists inside the go-swagger source, it's there just for documenting example usage and output.
What I would advice for #2 is to make sure you're using a properly released version of go-swagger, rather than installing from the latest commit (which happens when you just do a go get), as I have found that to be occasionally unstable.
Next, re-generate the entire server, but make sure you also regenerate restapi/configure_todo_list.go by passing --regenerate-configureapi to your swagger generate call. This file isn't always refreshed because you're meant to modify it to configure your app, and if you changed versions of the tool it may be different and incompatible.
If after that you still get the compilation error, it may be worth submitting a bug report at https://github.com/go-swagger/go-swagger/issues.
Thanks #EzequielMuns. The errors in #2 went away after I ran go get - u -f ./... as stated in
...
For this generation to compile you need to have some packages in your GOPATH:
* github.com/go-openapi/runtime
* github.com/jessevdk/go-flags
You can get these now with: go get -u -f ./...
I think it's an error of swagger code generation. You can do as folloing to fix this:
delete file configure_todo_list.go;
regenerate code.
# swagger generate server -A todo-list -f ./swagger.yml
Then, you can run command go install ./cmd/todo-list-server/, it will succeed.

sphinx autofunction works fine on local but not on server

The autofunction works perfectly on my local machine but on server it does not show anything. I check the log and find the following thing,
WARNING: autodoc: failed to import function blablabla; the following exception was raised: No module named 'numpy'.
I suppose installing packages is not a must on sphinx. I only need sphinx to read my docstring and generate documentation.
Well, after several hours of investigation I am able to answer my question. Using autodoc_mock_imports does the trick.

how to move a spark model from one spark installation to another?

I have an ALS model that I train on one spark installation. I persist as so:
model.save(sc, './recommender_model')
On the filesystem it looks like this:
$ find ./recommender_model
./recommender_model
./recommender_model/metadata
./recommender_model/metadata/_SUCCESS
./recommender_model/metadata/._SUCCESS.crc
./recommender_model/metadata/.part-00000.crc
./recommender_model/metadata/part-00000
./recommender_model/data
./recommender_model/data/product
./recommender_model/data/product/part-r-00001-406655c7-5c12-44d4-9b39-d5367ccabe29.gz.parquet
./recommender_model/data/product/_common_metadata
./recommender_model/data/product/.part-r-00000-406655c7-5c12-44d4-9b39-d5367ccabe29.gz.parquet.crc
./recommender_model/data/product/.part-r-00001-406655c7-5c12-44d4-9b39-d5367ccabe29.gz.parquet.crc
./recommender_model/data/product/_SUCCESS
./recommender_model/data/product/._metadata.crc
./recommender_model/data/product/._SUCCESS.crc
./recommender_model/data/product/._common_metadata.crc
./recommender_model/data/product/part-r-00000-406655c7-5c12-44d4-9b39-d5367ccabe29.gz.parquet
./recommender_model/data/product/_metadata
./recommender_model/data/user
./recommender_model/data/user/_common_metadata
./recommender_model/data/user/.part-r-00001-f8bf36d3-2145-4af2-9780-6271d68ea25c.gz.parquet.crc
./recommender_model/data/user/_SUCCESS
./recommender_model/data/user/.part-r-00000-f8bf36d3-2145-4af2-9780-6271d68ea25c.gz.parquet.crc
./recommender_model/data/user/._metadata.crc
./recommender_model/data/user/part-r-00000-f8bf36d3-2145-4af2-9780-6271d68ea25c.gz.parquet
./recommender_model/data/user/._SUCCESS.crc
./recommender_model/data/user/._common_metadata.crc
./recommender_model/data/user/part-r-00001-f8bf36d3-2145-4af2-9780-6271d68ea25c.gz.parquet
./recommender_model/data/user/_metadata
I would like to move this folder to another spark installation so that I train on one spark installation and use another spark installation for predictions.
Can I simply tar up this folder and unpack it to the other spark instance where I can load the model? E.g.
model = MatrixFactorizationModel.load(sc, './recommender_model')
my_movie = sc.parallelize([(0, 500)]) # Quiz Show (1994)
individual_movie_rating_RDD = model.predictAll(my_movie)
individual_movie_rating_RDD.collect()
I performed some basic testing and the approach of zipping up the whole folder and unzipping it on another machine worked well for me.

J language's "load" command

I'm working through the J primer, and getting stuck when it comes to the load command.
In particular, there are times when the next step in a tutorial is load 'foo' and I'll get an error like the following:
load 'plot'
not found: /users/username/j64-801/addons/graphics/plot/plot.ijs
|file name error: script
| 0!:0 y[4!:55<'y'
When I do ls /users/username/j64/addons/ I only have config and ide in there, so it's sensible that graphics is not found.
My question:
if given an example that says load 'foo', how do I go about finding and installing foo?
I'd recommend simply installing all the JAL packages ("Addons"). There aren't too many, so the download won't take long, and you'll have access to everything you need to run the Labs, Wiki examples, and any code posted by the community (e.g. on the J Forums).
To install all available Addons, type the following into Jconsole (you could theoretically type it into JHS or JQT instead, but since those are distributed as Addons, you might not be able to upgrade them while they're running):
load'pacman' NB. J PACkage MANager
install'all'
The package manager will start running, and you'll see output like:
Updating server catalog...
Installing 52 packages
Downloading base library...
Installing base library...
Downloading api/gl3...
Installing api/gl3...
Downloading api/ncurses...
Installing api/ncurses...
Then stop and restart Jconsole, and run:
load 'pacman'
'update' jpkg 'all'
To make sure all recursive dependencies were satisfied and all packages are up to date (in particular, the base library). Ultimately, you want to see something like:
Updating server catalog...
Local JAL information was last updated: <datetime>
All available packages are installed and up to date.
Then stop & restart J one last time. When that's done, you should have everything you need to run the Labs.
To answer your final question, if you see a line like:
load'foo'
The first thing you should do is run getscripts_j_ 'foo'. In your example:
getscripts_j_ 'plot'
+--------------------------------------------------------------+
|c:/users/user/j64-801/addons/graphics/plot/plot.ijs|
+--------------------------------------------------------------+
Here, you can see the fully-qualified path of where J expects the package to live.
In particular, you can see it where it is relative to the addons directory, which will always be in the form addons/category/module/foo.ijs. The category and module name indicate which addon you need to install, so all you have to do pick the desired entry from the catalog visible in the package manager.

Resources