Trying to export graphics from R (ggplot2) to PowerPoint - reporting

Can someone explain why the package "ReporteRs", mentioned in numerous communications as THE package for this kind of export, is no longer available? What is the replacement? I read a mention of an alternative, "officer", but I was unable to find it. Many thanks for your help. Ax

If you would like to export R graphs to PowerPoint, you can also use the wrapper package export, built on top of officer, which just came out on CRAN; see
https://cran.r-project.org/web/packages/export/index.html and for demo
https://github.com/tomwenseleers/export
Typical syntax is very easy, e.g.:
install.packages("export")
library(export)
library(ggplot2)
# Draw a ggplot2 scatterplot of the iris data
qplot(Sepal.Length, Petal.Length, data = iris, color = Species,
      size = Petal.Width, alpha = I(0.7))
# Export the active plot to an editable PowerPoint slide
graph2ppt(file = "ggplot2_plot.pptx", width = 6, height = 5)
You can also use it to export to Word, Excel, LaTeX or HTML, and to export the statistical output of various R stats objects.


How to experiment with source code using `pdb`, `inspect`, `pprint`?

Problem
I want to understand the source code of Kur (a deep learning library)
I have no formal training in programming, and I prefer to learn by experimenting rather than from prior theoretical knowledge
I want an easy tool to help me dig into the detailed workings of source code
Debugging tools like the pdb library seem to be a good choice to try
But what is the easiest way to get started experimenting on source code with pdb?
I just want to write one line of code to dig into details, rather than the several lines demonstrated in many of the examples you find when you google pdb
In other words, which function of pdb should I use, and how do I use it effectively for exploring source code?
Toy Example
I want to explore the inner workings of kur dump mnist.yml
To keep it simple, I want to explore no further than __main__.py and kurfile.py.
To be more specific, I want to explore dump() and parse_kurfile() in __main__.py, and Kurfile.__init__() in kurfile.py
Their relationship is as follows:
console: kur dump mnist.yml -->
python: __main__.py : main() --> dump() --> parse_kurfile() -->
python: kurfile.py : Kurfile class --> __init__() ...
python: ... the rest is not to be explored
Which function of pdb should I use to explore the execution flow from dump() to parse_kurfile() and to Kurfile.__init__() and back to dump() again?
Update
How to effectively explore Jupyter notebook using pdb?
pdb inside Jupyter cannot even remember console history, which is not good
One possible solution
use pdb.set_trace only
set_trace traces details at the level of the current code block; it will not go deeper into the next inner function.
For example, when I put a single pdb.set_trace() inside dump(), pdb will not trace into parse_kurfile() for me, but stays in the current dump() block:
def dump(args):
    """ Dumps the Kurfile to stdout as a JSON blob.
    """
    pdb.set_trace()
    ### parse kurfile.yml into parts to be used in python code
    spec = parse_kurfile(args.kurfile, args.engine)
If I want to go deeper into parse_kurfile() in __main__.py and Kurfile.__init__() in kurfile.py, then I just need to put one pdb.set_trace() in each of the two functions, like below:
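Something like this (a sketch only; the two function names come from above, but their exact signatures in Kur's source may differ):
import pdb

# __main__.py (sketch; Kur's real signatures may differ)
def parse_kurfile(filename, engine):
    pdb.set_trace()        # pdb stops here when dump() calls parse_kurfile()
    ...

# kurfile.py
class Kurfile:
    def __init__(self, source, engine):
        pdb.set_trace()    # pdb stops here when parse_kurfile() builds a Kurfile
        ...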
Update
From my experience so far, two libraries, inspect and pprint, go well with the pdb library.
From the inspect library, I use the following functions the most:
inspect.getdoc: to see the doc of the function
inspect.getmodule: to find out where this function or object comes from
inspect.getfullargspec: to find out all the inputs the function takes
inspect.getsourcelines: to get the source code of the function
With the functions above, when I want to check out other functions I don't have to go hunting for the source code in an editor; I can see it right where I am in pdb.
From the pprint library, as you can guess, I use pprint.pprint to print out the source code and the docs in a more readable format right inside pdb.
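As a concrete illustration (using a standard-library function here instead of a Kur function, so it runs anywhere), these calls work directly at the pdb prompt or in any Python session:
import inspect
import pprint
import json

print(inspect.getdoc(json.dumps))             # the function's documentation
print(inspect.getmodule(json.dumps))          # the module it comes from
print(inspect.getfullargspec(json.dumps))     # all the arguments it takes
pprint.pprint(inspect.getsourcelines(json.dumps)[0])   # its source, line by line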
More Update
A workstation setup to explore and experiment with source:
use Atom to split the window and see different source files at the same time;
use iTerm2 to split the window and use IPython to execute Python or Bash code;
organize them side by side.
More update
While exploring, I want to have all the attributes and methods of a module or class ready at hand.
To achieve this, I can use inspect.getmembers(module_or_class) and view the output in an iTerm2 split window.
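For example (with a standard-library module standing in for a Kur module):
import inspect
import pprint
import json

# name/object pairs for every function defined in the module
pprint.pprint(inspect.getmembers(json, inspect.isfunction))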
Update: how to change iTerm2 colors to be easier on the eyes?
Go to iTerm2 Preferences > Colors, change the preset to Tango Dark, and gray the foreground color to make white text look softer.
Change Kur logger color setting to:
## logcolor.py
# Color codes for each log-level.
COLORS = {
    'DEBUG': BLUE,
    'INFO': MAGENTA,
    'WARNING': RED,
    'ERROR': RED,
    'CRITICAL': GREEN
}
How to effectively use pdb in Jupyter notebook?
One way to avoid the drawbacks of pdb in Jupyter:
download the notebook as a .py file
insert code like import pdb and pdb.set_trace() into the Python code
in a console, run python your.py
Now you can explore this .py file as in the answer above; see the sketch below.
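For the first step, jupyter nbconvert --to script your.ipynb writes the notebook's code cells out to your.py. A converted file might then look like this (the code here is made up for illustration; yours is whatever the notebook contains):
# your.py, produced by: jupyter nbconvert --to script your.ipynb
import pdb

data = list(range(10))
pdb.set_trace()   # running `python your.py` drops into pdb right here
total = sum(data)
print(total)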

Does the equivalent of Matlab's xlswrite() exist for Stata?

I need to export several matrices created in Stata to several specifically named sheets of an already existing Excel file. This would be a piece of cake in Matlab using xlswrite(). I am having trouble finding a similar command in Stata.
"xml_tab" would work, but it doesn't seem to let me open and make changes to an already existing Excel file. It always starts by creating a new Excel file.
I would appreciate some help on how I might get "xml_tab", or some other Stata command, to open an already existing Excel file, make changes to it (overwriting specific sheets with new matrices), and then save it without overwriting all the other material on the other sheets that I don't want to touch.
Can Stata do that?
Thanks
EDIT:
An example of what I need to do is this:
*Define poverty line
scalar povlin=29347.5
*1) SETUP sheet
mat SETUP=(1,J(1,3,0),1,J(1,2,0),1,1,J(1,5,0),povlin)
/* Here I need to export the matrix SETUP to sheet "SETUP" in an
already existing excel file. In matlab it would be
xlswrite('filename','SETUP','A2') */
*2) FARM sheet
tabstat acres, stat(sum) save
mat acrtot=r(StatTotal)
tabstat aehh07 offrinc07, save
mat vmeans=r(StatTotal)
mat maehh=vmeans[1,1]
mat moffrinc=vmeans[1,2]
tabstat aehh07 offrinc07 acres, stat(cv) save
mat CV=r(StatTotal)
tabstat acres, save
mat macres=r(StatTotal)
mat FARM=(1,acrtot,maehh,CV[1,1],moffrinc,CV[1,2],moffrinc,CV[1,2],J(1,3,0),macres)
/* Here I need to export the matrix FARM to sheet "FARM" in the
same already existing excel file where I put the SETUP matrix. In matlab it would
be xlswrite('filename','FARM','A2') */
I need to do this kind of thing for several sheets.
By matrices you might mean (a) Mata matrices (b) Stata matrices (c) complete or partial datasets including one or more Stata variables. The last (c) seems most likely.
The way to change something in Excel is to open Excel. Stata does not offer, as it were, remote control of Excel manipulations, which is what you seem to be asking for. But the Stata commands import excel and export excel appear to offer alternatives.
I have never used xml_tab (Stata Journal 2008) but I always understood its main purpose to be export of results tables, not data.
If you mean (b), you can use svmat first.
I am guessing you don't mean (a).
(UPDATE July 2013) Stata 13 now has a putexcel command. See http://www.stata.com/help.cgi?putexcel for an introduction.

Convert tabular data to graphviz DOT format

I have tabular data with the following fields constituting a record:
SourceID
SourceLabel
SourceGroupID
TargetID
TargetLabel
TargetGroupID
I would like to convert this data to graphviz DOT format either programmatically or as part of a script. In particular, I would like to cluster nodes according to their GroupID.
This seems like it would be a common task. Are there existing tools or code examples (preferably in Python or R) that do this?
It sounds like the NetworkX library for Python might do what you want. What you need to do is read in an edge list (see networkx.readwrite.edgelist), process it to create the groups or anything else you need, and write out the Graphviz dot file (see networkx.drawing.nx_pydot.write_dot).
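If you want the GroupID clusters without any extra dependencies, the DOT text is also easy to emit directly. A minimal dependency-free sketch, assuming records shaped like the fields in the question (the sample IDs and labels are made up):
from collections import defaultdict

# (SourceID, SourceLabel, SourceGroupID, TargetID, TargetLabel, TargetGroupID)
records = [
    ("a", "Node A", "g1", "b", "Node B", "g1"),
    ("b", "Node B", "g1", "c", "Node C", "g2"),
]

groups = defaultdict(dict)   # GroupID -> {NodeID: Label}
edges = []
for sid, slab, sgrp, tid, tlab, tgrp in records:
    groups[sgrp][sid] = slab
    groups[tgrp][tid] = tlab
    edges.append((sid, tid))

lines = ["digraph G {"]
for grp, nodes in sorted(groups.items()):
    # the "cluster" name prefix is what makes Graphviz draw a box around the group
    lines.append(f"  subgraph cluster_{grp} {{")
    lines.append(f'    label = "{grp}";')
    for nid, lab in sorted(nodes.items()):
        lines.append(f'    "{nid}" [label="{lab}"];')
    lines.append("  }")
for sid, tid in edges:
    lines.append(f'  "{sid}" -> "{tid}";')
lines.append("}")
print("\n".join(lines))
This assumes GroupIDs are DOT-safe identifiers; quote or sanitize them if they are not.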
NetworkX can do other graph visualizations on its own without Graphviz (gallery, docs), and can export many other formats including GraphML. There are tons of open source tools to visualize graphs that can import GraphML, like
NodeXL, a great introductory tool that integrates network analysis into Excel 2007/2010 (Disclaimer: I'm an advisor for it). Other awesome tools include Gephi and Cytoscape, while Pajek and UCINet are some proprietary alternatives.

How to diff 2 notebooks at the source level?

Does anyone know a tool to find the difference between two notebooks at the source level?
The compare-notebooks tool in Workbench 2 seems to work at the internal data-structure level, which is not useful for me. I am looking for a tool that shows differences at the source level (what one sees when looking at a notebook, i.e. not the FullForm).
I am using V8 of Mathematica on Windows.
EDIT1:
How do I display the output/report from NotebookDiff in a more readable form?
This answer is based on discussion in the comments on other parts of this question.
This could be done by tagging the cells you want compared and using NotebookFind to find the cells for extraction and comparison.
It could (and should) also be automated if it's going to be used with any regularity.
A solution for comparing just a single large cell of code (as sometimes occurs when making demonstrations) is to copy the code in InputForm from both notebooks and paste it into a simple diff tool such as Quick Diff Online, which will then display the standard diff for you.
The example code was taken from one of Nasser's demonstrations.
Another option is to use CellDiff from the AuthorTools package.
Needs["AuthorTools`"];
CellDiff[Cell["Some text.", "Text"],
Cell["Some different text.", "Text"]]
To use this on your demonstrations, you can copy the cell expressions from the two versions by right-clicking on the cell brackets.
There is an undocumented package in the built-in add-ons (in $InstallationDirectory/AddOns/Applications) called AuthorTools. Once loaded, it exposes a NotebookDiff function which provides some basic diff features:
Needs["AuthorTools`"];
nb1 = NotebookPut[
   Notebook[{Cell["Subsection heading", "Subsection"],
     Cell["Some text.", "Text"]}]];
nb2 = NotebookPut[
   Notebook[{Cell["Edited Subsection heading", "Subsection"],
     Cell["Some different text.", "Text"]}]];
NotebookPut@NotebookDiff[nb1, nb2]
As this package is undocumented, please realize it is potentially buggy and is not considered a supported feature, but hopefully you still find it useful.
Note that you can also get handles to notebooks with e.g.:
nb1 = NotebookOpen["path/to/a/notebook.nb"]
and a list of notebooks currently open in the front end
Notebooks[]
If you must work with notebooks then NotebookDiff in AuthorTools is probably your best bet. If this is an important part of your workflow (due to version control or some other constraint) and you have some flexibility you may want to consider moving the code from the existing notebook (.nb) into a package file (.m), which will be saved as plain text. You can still open and edit package files in the Mathematica notebook front end, but you get the added benefit of being able to diff them using existing text diffing tools.
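To illustrate that last point: once the code lives in plain-text .m files, even a few lines of Python's standard difflib give a readable source-level diff (the file paths here are hypothetical):
import difflib

# Read the two plain-text package files
with open("old/notebook.m") as f:
    old = f.readlines()
with open("new/notebook.m") as f:
    new = f.readlines()

# Print a unified diff, like `diff -u`
for line in difflib.unified_diff(old, new,
                                 fromfile="old/notebook.m",
                                 tofile="new/notebook.m"):
    print(line, end="")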

FITS Export with custom Metadata

Does anybody have experience exporting data as a FITS file with custom metadata (FITS header) information? So far I have only been able to generate FITS files with the standard Mathematica FITS header template. The documentation gives no hint on whether custom metadata export is supported or how it might be done.
The following suggestions from comp.soft-sys.math.mathematica do not work:
header = Import[<some FITS file>, "Metadata"];
Export["test.fits", data, "Metadata" -> header]
or
Export["test.fits", {"Data" -> data, "Metadata" -> header}]
What is the proper way to export my own metadata to a FITS file?
Cheers,
Markus
Update: response from Wolfram Support:
"Mathematica does not yet support Export of metadata for FITS file. The
example are referring to importing of this data. We do plan to support
this in the future..."
"There are also plans to include binary tables into FITS import
functionality."
I will try to come up with some workaround.
According to the documentation for v.7 and v.8, there are a couple of ways of accomplishing what you want, and you almost have the rule form correct:
Export["test.fits", {"Data" -> data, "Metadata" -> header}, "Rules"]
The other ways are
Export["test.fits", header, "Metadata"]
Export["test.fits", {data, header}, {{"Data", "Metadata"}}]
Note the double brackets around the element labels in the second method.
Edit: After some testing, due to prodding from @belisarius: whenever I include the "Metadata" element, I get an error stating that it is not a valid export element. Also, you can't export a "RawData" element either. So I'd submit a bug report, for two reasons. First, the metadata isn't user-settable, which is vitally important for any serious application; at a minimum, the user should be able to augment the default Mathematica metadata. Second, the documentation is woefully inadequate in describing what is a valid export element vs. import element. Of course, I'd describe all of the documentation for v.6 and beyond as woefully inadequate, so this is par for the course.
Mathematica 9 now allows export of metadata (header) entries, which are additive to the standard required entries. In the Help browser, search "FITS" and there is an example that shows this (with Export followed by Import to verify).
