Convert markdown italics and boldface to latex - windows

I want to be able to convert markdown italics and boldface to latex versions on the fly (i.e., give a text string(s) return a text string(s)). I thought easy. Wrong! (Which it still may be). See the sill buisness and error I tried at the bottom.
What I have (note the starting asterisk that's been escaped as in markdown):
x <- "\\*note: I *like* chocolate **milk** too ***much***!"
What I would like:
"*note: I \\emph{like} chocolate \\textbf{milk} too \\textbf{\\emph{much}}!"
I'm not attached to regex but would prefer a base solution (though not essential).
Silly business:
helper <- function(ins, outs, x) {
gsub(paste0(ins[1], ".+?", ins[2]), paste0(outs[1], ".+?", outs[2]), x)
}
helper(rep("***", 2), c("\\textbf{\\emph{", "}}"), x)
Error in gsub(paste0(ins[1], ".+?", ins[2]), paste0(outs[1], ".+?", outs[2]), :
invalid regular expression '***.+?***', reason 'Invalid use of repetition operators'
I have this toy that Ananda Mahto helped me make if it's helpful. You could access it from reports via wheresPandoc <- reports:::wheresPandoc
EDIT Per Ben's comments I tried:
action <- paste0(" echo ", x, " | ", wheresPandoc(), " -t latex ")
system(action)
*note: I *like* chocolate **milk** too ***much***! | C:\PROGRA~2\Pandoc\bin\pandoc.exe -t latex
EDIT2 Per Dason's comments I tried:
out <- paste("echo", shQuote(x), "|", wheresPandoc(), " -t latex"); system(out)
system(out, intern = T)
> system(out, intern = T)
\*note: I *like* chocolate **milk** too ***much***! | C:\PROGRA~2\Pandoc\bin\pandoc.exe -t latex

The lack of pipes on Windows made this tricky, but you can get around it using input to provide the stdin:
> x = system("pandoc -t latex", intern=TRUE, input="\\*note: I *like* chocolate **milk** too ***much***!")
> x
[1] "*note: I \\emph{like} chocolate \\textbf{milk} too \\textbf{\\emph{much}}!"

Noting I am working on windows and from ?system
This means that redirection, pipes, DOS internal commands, ... cannot be used
and the note from ?system2
Note
system2 is a more portable and flexible interface than system,
introduced in R 2.12.0. It allows redirection of output without
needing to invoke a shell on Windows, a portable way to set
environment variables for the execution of command, and finer control
over the redirection of stdout and stderr. Conversely, system (and
shell on Windows) allows the invocation of arbitrary command lines.
Using system2
system2('pandoc', '-t latex', input = '**em**', stdout = TRUE)

Related

How to debug with PureScript?

Issue
Following is a minimal, contrived example:
read :: FilePath -> Aff String
read f = do
log ("File: " <> f) -- (1)
readTextFile UTF8 f -- (2)
I would like to do some debug logging in (1), before a potential error on (2) occurs. Executing following code in Spago REPL works for success cases so far:
$ spago repl
> launchAff_ $ read "test/data/tree/root.txt"
File: test/data/tree/root.txt
unit
Problem: If there is an error with (2) - file is directory here - , (1) seems to be not executed at all:
$ spago repl
> launchAff_ $ read "test/data/tree"
~/purescript-book/exercises/chapter9/.psci_modules/node_modules/Effect.Aff/foreign.js:532
throw util.fromLeft(step);
^
[Error: EISDIR: illegal operation on a directory, read] {
errno: -21,
code: 'EISDIR',
syscall: 'read'
}
The original problem is more complex including several layers of recursions (see E-Book exercise 3), where I need logging to debug above error.
Questions
How can I properly log regardless upcoming errors here?
(Optional) Is there a more sophisticated, well-established debugging alternative - purescript-debugger? A decicated VS Code debug extension/functionality would be the cherry on the cake.
First of all, the symptoms you observe do not mean that the first line doesn't execute. It does always execute, you're just not seeing output from it due to how console works in the PureScript REPL. The output gets swallowed. Not the only problem with REPL, sadly.
You can verify that the first line is always executed by replacing log with throwError and observing that the error always gets thrown. Or, alternatively, you can make the first line modify a mutable cell instead of writing to the console, and then examine the cell's contents.
Finally, this only happens in REPL. If you put that launchAff_ call inside main and run the program, you will always get the console output.
Now to the actual question at hand: how to debug trace.
Logging to console is fine if you can afford it, but there is a more elegant way: Debug.trace.
This function has a hidden effect - i.e. its type says it's pure, but it really produces an effect when called. This little lie lets you use trace in a pure setting and thus debug pure code. No need for Effect! This is ok as long as used for debugging only, but don't put it in production code.
The way it works is that it takes two parameters: the first one gets printed to console and the second one is a function to be called after printing, and the result of the whole thing is whatever that function returns. For example:
calculateSomething :: Int -> Int -> Int
calculateSomething x y =
trace ("x = " <> show x) \_ ->
x + y
main :: Effect Unit
main =
log $ show $ calculateSomething 37 5
> npx spago run
'x = 37'
42
The first parameter can be anything at all, not just a string. This lets you easily print a lot of stuff:
calculateSomething :: Int -> Int -> Int
calculateSomething x y =
trace { x, y } \_ ->
x + y
> npx spago run
{ x: 37, y: 5 }
42
Or, applying this to your code:
read :: FilePath -> Aff String
read f = trace ("File: " <> f) \_ -> do
readTextFile UTF8 f
But here's a subtle detail: this tracing happens as soon as you call read, even if the resulting Aff will never be actually executed. If you need tracing to happen on effectful execution, you'll need to make the trace call part of the action, and be careful not to make it the very first action in the sequence:
read :: FilePath -> Aff String
read f = do
pure unit
trace ("File: " <> f) \_ -> pure unit
readTextFile UTF8 f
It is, of course, a bit inconvenient to do this every time you need to trace in an effectful context, so there is a special function that does it for you - it's called traceM:
read :: FilePath -> Aff String
read f = do
traceM ("File: " <> f)
readTextFile UTF8 f
If you look at its source code, you'll see that it does exactly what I did in the example above.
The sad part is that trace won't help you in REPL when an exception happens, because it's still printing to console, so it'll still get swallowed for the same reasons.
But even when it doesn't get swallowed, the output is a bit garbled, because trace actually outputs in color (to help you make it out among other output), and PureScript REPL has a complicated relationship with color:
> calculateSomething 37 5
←[32m'x = 37'←[39m
42
In addition to Fyodor Soikin's great answer, I found a variant using VS Code debug view.
1.) Make sure to build with sourcemaps:
spago build --purs-args "-g sourcemaps"
2.) Add debug configuration to VS Code launch.json:
{
"version": "0.2.0",
"configurations": [
{
"type": "pwa-node",
"request": "launch",
"name": "Launch Program",
"skipFiles": ["<node_internals>/**"],
"runtimeArgs": ["-e", "require('./output/Main/index.js').main()"],
"smartStep": true // skips files without (valid) source map
}
]
}
Replace "./output/Main/index.js" / .main() with the compiled .js file / function to be debugged.
3.) Set break points and step through the .purs file via sourcemap support.

pandoc-citeproc as API: processCites' does not add references

I have a small text file in markdown :
---
title: postWithReference
author: auf
date: 2010-07-29
keywords: homepage
abstract: |
What are the objects of
ontologists .
bibliography: "/home/frank/Workspace8/SSG/site/resources/BibTexLatex.bib"
csl: "/home/frank/Workspace8/SSG/site/resources/chicago-fullnote-bibliography-bb.csl"
---
An example post. With a reference to [#Frank2010a] and more[#navratil08].
## References
and process it in Haskell with processCites' which has a single argument, namely the Pandoc data resulting from readMarkdown. The bibliography and the csl style should be taken from the input file.
The process does not produce errors, but the result of processCites is the same text as the input; references are not treated at all. For the same input the references are resolved with the standalone pandoc (this excludes errors in the bibliography and the csl style)
pandoc -f markdown -t html --filter=pandoc-citeproc -o p1.html postWithReference.md
The issue is therefore in the API. The code I have is:
markdownToHTML4 :: Text -> PandocIO Value
markdownToHTML4 t = do
pandoc <- readMarkdown markdownOptions t
let meta2 = flattenMeta (getMeta pandoc)
-- test if biblio is present and apply
let bib = Just $ ( meta2) ^? key "bibliography" . _String
pandoc2 <- case bib of
Nothing -> return pandoc
_ -> do
res <- liftIO $ processCites' pandoc -- :: Pandoc -> IO Pandoc
when (res == pandoc) $
liftIO $ putStrLn "*** markdownToHTML3 result without references ***"
return res
htmltex <- writeHtml5String html5Options pandoc2
let withContent = ( meta2) & _Object . at "contentHtml" ?~ String ( htmltex)
return withContent
getMeta :: Pandoc -> Meta
getMeta (Pandoc m _) = m
What do I misunderstand? are there any reader options necessary for citeproc? The bibliography is a BibLatex file.
I found in hakyll code a comment, which I cannot understand in light of the code there - perhaps somebody knows what the intention is.
-- We need to know the citation keys, add then *before* actually parsing the
-- actual page. If we don't do this, pandoc won't even consider them
-- citations!
I have a workaround (not an answer to the original question, I still hope that somebody can identify my error!). It is simple to call the standalone pandoc with System.readProess and pass the text and get the result back, not even reading and writing files:
processCites2x :: Maybe FilePath -> Maybe FilePath -> Text -> ErrIO Text
-- porcess the cites in the text (not with the API)
-- using systemcall because the standalone pandoc works with
-- call: pandoc -f markdown -t html --filter=pandoc-citeproc
-- with the input text on stdin and the result on stdout
-- the csl and bib file are used from text, not from what is in the arguments
processCites2x _ _ t = do
putIOwords ["processCite2" ] -- - filein\n", showT styleFn2, "\n", showT bibfn2]
let cmd = "pandoc"
let cmdargs = ["--from=markdown", "--to=html5", "--filter=pandoc-citeproc" ]
let cmdinp = t2s t
res :: String <- callIO $ System.readProcess cmd cmdargs cmdinp
return . s2t $ res
-- error are properly caught and reported in ErrIO
t2s and s2t are conversion utilities between string and text, ErrIO is ErrorT Text a IO and callIO is essentially liftIO with handling of errors.
The original problem was very simple: I had not included the option Ext_citations in the markdownOptions. When it is included, the example works (thanks to help I received from the pandoc-citeproc issue page). The referenced code is updated...

Forcing a package's function to use user-provided function

I'm running into a problem with the MNP package which I've traced to an unfortunate call to deparse (whose maximum width is limited to 500 characters).
Background (easily skippable if you're bored)
Because mnp uses a somewhat idiosyncratic syntax to allow for varying choice sets (you include cbind(choiceA,choiceB,...) in the formula definition), the left hand side of my formula call is 1700 characters or so when model.matrix.default calls deparse on it. Since deparse supports a maximum width.cutoff of 500 characters, the sapply(attr(t, "variables"), deparse, width.cutoff = 500)[-1L] line in model.matrix.default has as its first element:
[1] "cbind(plan1, plan2, plan3, plan4, plan5, plan6, plan7, plan8, plan9, plan10, plan11, plan12, plan13, plan14, plan15, plan16, plan17, plan18, plan19, plan20, plan21, plan22, plan23, plan24, plan25, plan26, plan27, plan28, plan29, plan30, plan31, plan32, plan33, plan34, plan35, plan36, plan37, plan38, plan39, plan40, plan41, plan42, plan43, plan44, plan45, plan46, plan47, plan48, plan49, plan50, plan51, plan52, plan53, plan54, plan55, plan56, plan57, plan58, plan59, plan60, plan61, plan62, plan63, "
[2] " plan64, plan65, plan66, plan67, plan68, plan69, plan70, plan71, plan72, plan73, plan74, plan75, plan76, plan77, plan78, plan79, plan80, plan81, plan82, plan83, plan84, plan85, plan86, plan87, plan88, plan89, plan90, plan91, plan92, plan93, plan94, plan95, plan96, plan97, plan98, plan99, plan100, plan101, plan102, plan103, plan104, plan105, plan106, plan107, plan108, plan109, plan110, plan111, plan112, plan113, plan114, plan115, plan116, plan117, plan118, plan119, plan120, plan121, plan122, plan123, "
[3] " plan124, plan125, plan126, plan127, plan128, plan129, plan130, plan131, plan132, plan133, plan134, plan135, plan136, plan137, plan138, plan139, plan140, plan141, plan142, plan143, plan144, plan145, plan146, plan147, plan148, plan149, plan150, plan151, plan152, plan153, plan154, plan155, plan156, plan157, plan158, plan159, plan160, plan161, plan162, plan163, plan164, plan165, plan166, plan167, plan168, plan169, plan170, plan171, plan172, plan173, plan174, plan175, plan176, plan177, plan178, plan179, "
[4] " plan180, plan181, plan182, plan183, plan184, plan185, plan186, plan187, plan188, plan189, plan190, plan191, plan192, plan193, plan194, plan195, plan196, plan197, plan198, plan199, plan200, plan201, plan202, plan203, plan204, plan205, plan206, plan207, plan208, plan209, plan210, plan211, plan212, plan213, plan214, plan215, plan216, plan217, plan218, plan219, plan220, plan221, plan222, plan223, plan224, plan225, plan226, plan227, plan228, plan229, plan230, plan231, plan232, plan233, plan234, plan235, "
[5] " plan236, plan237, plan238, plan239, plan240, plan241, plan242, plan243, plan244, plan245, plan246, plan247, plan248, plan249, plan250, plan251, plan252, plan253, plan254, plan255, plan256, plan257, plan258, plan259, plan260, plan261, plan262, plan263, plan264, plan265, plan266, plan267, plan268, plan269, plan270, plan271, plan272, plan273, plan274, plan275, plan276, plan277, plan278, plan279, plan280, plan281, plan282, plan283, plan284, plan285, plan286, plan287, plan288, plan289, plan290, plan291, "
[6] " plan292, plan293, plan294, plan295, plan296, plan297, plan298, plan299, plan300, plan301, plan302, plan303, plan304, plan305, plan306, plan307, plan308, plan309, plan310, plan311, plan312, plan313)"
When model.matrix.default tests this against the variables in the data.frame, it returns an error.
The problem
To get around this, I've written a new deparse function:
deparse <- function (expr, width.cutoff = 60L, backtick = mode(expr) %in%
c("call", "expression", "(", "function"), control = c("keepInteger",
"showAttributes", "keepNA"), nlines = -1L) {
ret <- .Internal(deparse(expr, width.cutoff, backtick, .deparseOpts(control), nlines))
paste0(ret,collapse="")
}
However, when I run mnp again and step through, it returns the same error for the same reason (base::deparse is being run, not my deparse).
This is somewhat surprising to me, as what I expect is more typified by this example, where the user-defined function temporarily over-writes the base function:
> print <- function() {
+ cat("user-defined print ran\n")
+ }
> print()
user-defined print ran
I realize the right way to solve this problem is to rewrite model.matrix.default, but as a tool for debugging I'm curious how to force it to use my deparse and why the anticipated (by me) behavior is not happening here.
The functions fixInNamespace and assignInNamespace are provided to allow editing of existing functions. You could try ... but I will not since mucking with deparse looks too dangerous:
assignInNamespace("deparse",
function (expr, width.cutoff = 60L, backtick = mode(expr) %in%
c("call", "expression", "(", "function"), control = c("keepInteger",
"showAttributes", "keepNA"), nlines = -1L) {
ret <- .Internal(deparse(expr, width.cutoff, backtick, .deparseOpts(control), nlines))
paste0(ret,collapse="")
} , "base")
There is an indication on the help page that the use of such functions has restrictions and I would not be surprised that such core function might have additional layers of protection. Since it works via side-effect, you should not need to assign the result.
This is how packages with namespaces search for functions, as described in Section 1.6, Package Namespaces of Writing R Extensions
Namespaces are sealed once they are loaded. Sealing means that imports
and exports cannot be changed and that internal variable bindings
cannot be changed. Sealing allows a simpler implementation strategy
for the namespace mechanism. Sealing also allows code analysis and
compilation tools to accurately identify the definition corresponding
to a global variable reference in a function body.
The namespace controls the search strategy for variables used by
functions in the package. If not found locally, R searches the package
namespace first, then the imports, then the base namespace and then
the normal search path.

How to use Unix pager programs like `less` from Ruby?

Suppose I have a string called very_long_string whose content I want to send to the standard output. But since the string is very long, I want to use less to display the text on the terminal. When I use
`less #{very_long_string}`
I get File not found error message, and if I use:
`less <<< #{very_long_string}`
I get unexpected redirection error message.
So, how to use less from inside Ruby?
You could open a pipe and feed your string to less via its stdin.
IO.popen("less", "w") { |f| f.puts very_long_string }
(Assuming very_long_string is the variable holding your string.)
See: http://www.ruby-doc.org/core-1.8.7/IO.html#method-c-popen
A simple hack:
require 'tempfile'
f = Tempfile.new('less')
f.write(long_string)
system("less #{f.path}")
Although less can read text files its natural fit is to use it as the last command in a pipe. So a natural fit would be:
shell-command-1 | shell-command-2 | shell-command-3 | less
At your shell prompt:
echo tanto va la gatta al lardo che ci lascia lo zampino|less
..So you can try this in irb:
`echo tanto va la gatta al lardo che ci lascia lo zampino|less`
but I will prefer to use:
your_string = "tanto va la gatta al lardo che ci lascia lo zampino"
`echo "#{your_string}"|less`
If you have time read this SO question.
For a thorough demonstration of using system calls in ruby see this gist:
https://gist.github.com/4069

command line arguments in bash to Rscript

I have a bash script that creates a csv file and an R file that creates graphs from that.
At the end of the bash script I call Rscript Graphs.R 10
The response I get is as follows:
Error in is.vector(X) : subscript out of bounds
Calls: print ... <Anonymous> -> lapply -> FUN -> lapply -> is.vector
Execution halted
The first few lines of my Graphs.R are:
#!/bin/Rscript
args <- commandArgs(TRUE)
CorrAns = args[1]
No idea what I am doing wrong? The advice on the net appears to me to say that this should work. Its very hard to make sense of commandArgs
With the following in args.R
print(commandArgs(TRUE)[1])
and the following in args.sh
Rscript args.R 10
I get the following output from bash args.sh
[1] "10"
and no error. If necessary, convert to a numberic type using as.numeric(commandArgs(TRUE)[1]).
Just a guess, perhaps you need to convert CorrAns from character to numeric, since Value section of ?CommandArgs says:
A character vector containing the name
of the executable and the
user-supplied command line arguments.
UPDATE: It could be as easy as:
#!/bin/Rscript
args <- commandArgs(TRUE)
(CorrAns = args[1])
(CorrAns = as.numeric(args[1]))
Reading the docs, it seems you might need to remove the TRUE from the call to commandArgs() as you don't call the script with --args. Either that, or you need to call Rscript Graphs.R --args 10.
Usage
commandArgs(trailingOnly = FALSE)
Arguments
trailingOnly logical. Should only
arguments after --args be returned?
Rscript args.R 10 where 10 is the numeric value we want to pass to the R script.
print(as.numeric(commandArgs(TRUE)[1]) prints out the value which can then be assigned to a variable.

Resources