Custom content on a tags page - hakyll

I have set up Hakyll to generate basic tag pages from blog posts as follows:
main = do
    hakyll $ do
        match "preambles/*.md" $ compile $ pandocCompiler >>= relativizeUrls
        tags <- buildTags "posts/*.md" (fromCapture "tags/*.html")
        tagsRules tags $ \tag pattern -> do
            let title = "Posts tagged \"" ++ tag ++ "\""
            route idRoute
            compile $ do
                posts <- recentFirst =<< loadAll pattern
                preamble <- loadBody $ fromFilePath ("preambles/" ++ tag ++ ".html")
                let ctx = constField "title" title
                          `mappend` listField "posts" postCtx (return posts)
                          `mappend` bodyField preamble
                          `mappend` defaultContext
                makeItem ""
                    >>= loadAndApplyTemplate "templates/tag.html" ctx
                    >>= loadAndApplyTemplate "templates/default.html" ctx
                    >>= relativizeUrls
...
I would like to be able to supply an optional preamble for each page.
To do this I would expect to have a markdown file per tag in a preambles directory, and to use these when building the tags pages. However, I can't figure out how to make this optional, because the load will fail if the file is not found.
Is this an appropriate approach, and if so, how would I handle missing data?
Also, I'm not sure that bodyField is the appropriate way to pass the data to the template.

This seems like an entirely reasonable approach to me.
To answer your main question, you can take advantage of the Alternative instance on Compiler to provide a fallback.
I'd like to suggest a few minor changes too.
First off, replace your preamble rule with:
let preamblePattern = "preambles/*.md"
match preamblePattern $ do
    compile $ pandocCompiler
        >>= relativizeUrls
        >>= saveSnapshot "preamble"
        >>= pure . void
>>= pure . void converts the Compiler (Item String) into a Compiler (Item ()).
This prevents Hakyll from producing any output files for this rule.
The saveSnapshot "preamble" means we can still get at the content later.
In addition to this, I've extracted the pattern into a variable, for reasons that will become clear later.
Then, replace:
preamble <- loadBody $ fromFilePath ("preambles/" ++ tag ++ ".html")
with:
let preambleIdent = fromCapture preamblePattern tag
preamble <- (loadSnapshotBody preambleIdent "preamble") <|> pure "fallback"
This uses fromCapture to generate the Identifier from the Pattern.
This reduces duplication, and it also sidesteps the bug in your code where you used the extension ".html", even though the Identifier is generated from the original file path and therefore has the extension ".md".
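To illustrate what fromCapture does here, the capture in the pattern is filled in with the tag (a sketch, assuming a tag named "haskell"):

```haskell
-- assuming tag == "haskell":
fromCapture "preambles/*.md" "haskell"  ==  fromFilePath "preambles/haskell.md"
```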
This uses loadSnapshotBody rather than loadBody because of the change suggested above, where we remove the content from the rule output.
Most importantly, it uses <|> from Alternative to provide a fallback.
You'll probably want to replace "fallback" with whichever default value you want.
Finally, makeItem "" needs to be replaced with makeItem preamble.
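For reference, here is roughly how the whole tag rule looks with all of the above applied. This is a sketch: it reuses postCtx and preamblePattern from earlier, picks the empty string as the fallback, and drops the bodyField line, since makeItem preamble now supplies the preamble as the template's $body$:

```haskell
tagsRules tags $ \tag pattern -> do
    let title = "Posts tagged \"" ++ tag ++ "\""
    route idRoute
    compile $ do
        posts <- recentFirst =<< loadAll pattern
        -- Look up this tag's preamble snapshot; fall back to "" when no
        -- preamble file exists.  (<|>) comes from Control.Applicative.
        let preambleIdent = fromCapture preamblePattern tag
        preamble <- loadSnapshotBody preambleIdent "preamble" <|> pure ""
        let ctx = constField "title" title
                  `mappend` listField "posts" postCtx (return posts)
                  `mappend` defaultContext
        makeItem preamble
            >>= loadAndApplyTemplate "templates/tag.html" ctx
            >>= loadAndApplyTemplate "templates/default.html" ctx
            >>= relativizeUrls
```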

Related

Is it possible to enable extensions on pandoc filters?

I'm trying to make a filter to transform some features of org-mode to GitLab-markdown (not supported by Pandoc out of the box), in particular, the math blocks.
The filter should work when converting to markdown, but instead of producing pandoc's usual markdown for the math blocks (enclosed by $$...$$), it should write the blocks as
``` math
a + b = c
```
The process I have now is:
In org-mode, the math blocks are simply the latex code:
\begin{equation}
a + b = c
\end{equation}
this is parsed as a pandoc AST RawBlock with format latex. I then remove the first line (\begin{equation}) and the last line (\end{equation}), and construct a pandoc CodeBlock with attributes {"math"}, so the CodeBlock object displays in the AST as
CodeBlock ("",["math"],[]) "a + b = c\n"
and then I let Pandoc create the markdown document, and the written result is
``` {.math}
a + b = c
```
The question:
I want the bare math written, not {.math}, without the use of CLI options.
I am aware that this can be done by setting the writer extension fenced_code_attributes to false (e.g. pandoc -w markdown-fenced_code_attributes ...), but I would much prefer to do this inside the filter.
Is it possible to set the extensions inside the filter?
Here is my attempted Lua filter:
function split(str, pat)
    local tbl = {}
    str:gsub(pat, function(x) tbl[#tbl+1] = x end)
    return tbl
end

function RawBlock(rb)
    if rb.format == "latex" then
        local text = rb.text
        split_text = split(text, "[^\n]*")
        if split_text[1] == '\\begin{equation}' and split_text[#split_text-1] == '\\end{equation}' then
            table.remove(split_text, #split_text-1)
            table.remove(split_text, 1)
            text = table.concat(split_text, "\n")
            local cb = pandoc.CodeBlock()
            cb.attr = {"",{"math"}}
            cb.text = text
            return cb
        end
    end
end
You could take full control of the output by creating the desired block yourself.
E.g., instead of local cb = pandoc.CodeBlock() and the lines that follow, you could write
return pandoc.RawBlock('markdown',
    string.format('``` math\n%s\n```\n', text)
)
So you'd basically be creating the Markdown yourself, which is relatively safe in the case of code blocks (assuming that the math doesn't contain ```, which would be very unusual).
As for the original question: enabling extensions or options in the filter is currently not possible.

Haskell breaks my type when I call show/read

I have defined a new type in my Haskell code, one of whose record fields is a list of lists of strings. An example might be
Board{size=(4,7),pieces=[["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
I've set it up as a derived instance of Show and Read. If I just input the code above into ghci, then I get out exactly what I put in, which is fine. However, if I call it with show, I get
"Board {size = (4,7), pieces = [[\"OA\",\"AA\",\"AA\"],[\"BBB\",\"BOO\"],[\"OCCC\",\"CCCO\",\"OOCO\"]]}"
The speech marks are fine, but I've no idea why the backslashes are there. Are you not allowed to nest speech marks or something? In any case, this now totally breaks if I try to call read to get it back: I get a long error telling me that none of the strings are data constructors, and I don't know why it thinks they should be.
Is there any way round this?
Given
> data Board = Board { size :: (Int, Int), pieces :: [[String]] } deriving (Show, Read)
> let b = Board{size=(4,7),pieces=[["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
The result of show b is a String
> show b
"Board {size = (4,7), pieces = [[\"OA\",\"AA\",\"AA\"],[\"BBB\",\"BOO\"],[\"OCCC\",\"CCCO\",\"OOCO\"]]}"
The quotes inside a String are escaped when the string itself is show-n. If you output the string instead, you'll see that it doesn't contain the \ characters.
> putStrLn $ show b
Board {size = (4,7), pieces = [["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
The string produced by show can be read back in again as a Board by read:
> (read . show $ b) :: Board
Board {size = (4,7), pieces = [["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
GHCi already calls show on the values it prints. So when you type show something at the GHCi prompt, it's as if you had called show $ show something.
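To see that in action with the Board above: evaluating b lets GHCi apply show once, while evaluating show b makes GHCi show the already-produced String, which is exactly where the backslashes come from:

```
> b
Board {size = (4,7), pieces = [["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
> show b
"Board {size = (4,7), pieces = [[\"OA\",\"AA\",\"AA\"],[\"BBB\",\"BOO\"],[\"OCCC\",\"CCCO\",\"OOCO\"]]}"
```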

Haskell grammar to validate a string in specific format

I would like to define a grammar in Haskell that matches a string in the format "XY12XY" (some alphas followed by some numerics), e.g. variable names in programming languages.
"customer123" is a valid variable name, but "123customer" is not a valid variable name.
I am at a loss how to define the grammar and write a validator function that would validate whether a given string is valid variable name. I have been trying to understand and adapt the parser example at: https://wiki.haskell.org/GADT but I just can't get my head around how to tweak it to make it work for my need.
If any kind fellow Haskell gurus would help me define this please:
validate :: ValidFormat -> String -> Bool
validate f [] = False
validate f s = ...
I would like to define the ValidFormat grammar as:
varNameFormat = Concat Alpha $ Concat Alpha Numeric
I'd start with a simple parser and see if that satisfies your needs, unless you can explain why this is not enough for your use case. Parsers are pretty straightforward. I'll give a very simple (and maybe incomplete) example with attoparsec:
import Control.Applicative
import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString.Char8 as B

validateVar :: B.ByteString -> Bool
validateVar bstr = case parseOnly variableP bstr of
    Right _ -> True
    Left  _ -> False

variableP :: Parser String
variableP =
    (++)
        <$> many1 letter_ascii            -- must start with one or more letters
        <*> many (digit <|> letter_ascii) -- then can have any combination of letters/digits
        <*  endOfInput                    -- make sure we don't ignore invalid trailing chars
variableP combines parsers via <*>, which requires you to handle both results: that of many1 letter_ascii and that of many (digit <|> letter_ascii). In this case we just concatenate the two results via (++); check the types of many1, many, letter_ascii and digit. The <* says "parse this, but discard the result of the right-hand parser" (otherwise you'd have to handle three results).
That means if you run the parser on "abc123" you'll get back "abc123". If you parse "1abc" the parser will fail.
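For example, in GHCi (using B.pack to build the ByteString arguments):

```haskell
-- > validateVar (B.pack "abc123")
-- True
-- > validateVar (B.pack "1abc")
-- False
```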
Check the type of parseOnly:
parseOnly :: Parser a -> ByteString -> Either String a
We pass it our parser and the bytestring it should parse. If the parser fails we'll get Left <something went wrong>. If the parser succeeds, we'll get Right <our string>. The cool thing is... instead of just giving a string on success, we could do pretty much anything with the results in variableP, as in: use something different than (++), convert the types and whatnot (mind that the Parser type might also have to change then).
Since we only care if the parser succeeded in validateVar, we can just ignore the result in either case.
So instead of defining GADTs for your grammar, you just define Parsers.
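For instance, if you later want more than a Bool, the same parser can build a value of your own type directly. A minimal sketch, where Var is a made-up type purely for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Data.Attoparsec.ByteString.Char8

-- A hypothetical result type instead of a plain String.
newtype Var = Var String deriving Show

varP :: Parser Var
varP = Var <$> ((++) <$> many1 letter_ascii
                     <*> many (digit <|> letter_ascii))
           <* endOfInput

-- parseOnly varP "abc123"  ==  Right (Var "abc123")
```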
You might also find this link useful for a tutorial: http://www.seas.upenn.edu/~cis194/fall14/spring13/lectures.html (weeks 10 and 11, including the assignments, where you basically write your own little parser library)
I've taken this from the examples of regex-applicative:
import Text.Regex.Applicative
import Data.Char
import Data.Maybe

varNameFormat :: RE Char String
varNameFormat = (:) <$> psym isAlpha <*> many (psym isAlphaNum)

validate :: RE Char String -> String -> Bool
validate re str = isJust $ str =~ re
You will have
*Main> validate varNameFormat "a123"
True
*Main> validate varNameFormat "1a23"
False

Listing functions with debug flag set in R

I am trying to find a global counterpart to isdebugged() in R. My scenario is that I have functions that make calls to other functions, all of which I've written, and I am turning debug() on and off for different functions during my debugging. However, I may lose track of which functions are set to be debugged. When I forget and start a loop, I may get a lot more output (nuisance, but not terrible) or I may get no output when some is desired (bad).
My current approach is to use a function similar to the one below, and I can call it with listDebugged(ls()) or list the items in a loaded library (examples below). This could suffice, but it requires that I call it with the list of every function in the workspace or in the packages that are loaded. I can wrap another function that obtains these. It seems like there should be an easier way to just directly "ask" the debug function or to query some obscure part of the environment where it is stashing the list of functions with the debug flag set.
So, a two part question:
Is there a simpler call that exists to query the functions with the debug flag set?
If not, then is there any trickery that I've overlooked? For instance, if a function in one package masks another, I suspect I may return a misleading result.
I realize that there is another method I could try and that is to wrap debug and undebug within functions that also maintain a hidden list of debugged function names. I'm not yet convinced that's a safe thing to do.
UPDATE (8/5/11): I searched SO and didn't find earlier questions. However, SO's "related questions" list has since shown an earlier question that is similar, though the function in its answer is both more verbose and slower than the one offered by @cbeleites. That older question also doesn't provide any code, while I did. :)
The code:
listDebugged <- function(items){
    isFunction <- vector(length = length(items))
    isDebugged <- vector(length = length(items))
    for(ix in seq_along(items)){
        isFunction[ix] <- is.function(eval(parse(text = items[ix])))
    }
    for(ix in which(isFunction == 1)){
        isDebugged[ix] <- isdebugged(eval(parse(text = items[ix])))
    }
    names(isDebugged) <- items
    return(isDebugged)
}
# Example usage
listDebugged(ls())
library(MASS)
debug(write.matrix)
listDebugged(ls("package:MASS"))
Here's my take on the listDebugged function:
ls.deb <- function(items = search ()){
    .ls.deb <- function (i){
        f <- ls (i)
        f <- mget (f, as.environment (i), mode = "function",
                   ## return a function that is not debugged
                   ifnotfound = list (function (x) function () NULL)
                   )

        if (length (f) == 0)
            return (NULL)

        f <- f [sapply (f, isdebugged)]
        f <- names (f)

        ## now check whether the debugged function is masked by a not debugged one
        masked <- !sapply (f, function (f) isdebugged (get (f)))

        ## generate pretty output format:
        ## "package::function" and "(package::function)" for masked debugged functions
        if (length (f) > 0) {
            if (grepl ('^package:', i)) {
                i <- gsub ('^package:', '', i)
                f <- paste (i, f, sep = "::")
            }
            f [masked] <- paste ("(", f [masked], ")", sep = "")
            f
        } else {
            NULL
        }
    }

    functions <- lapply (items, .ls.deb)
    unlist (functions)
}
I chose a different name, as the output contains only the debugged functions (otherwise I'd easily get thousands of functions).
The output has the form package::function (or rather namespace::function, but packages will have namespaces pretty soon anyways).
If the debugged function is masked, the output is "(package::function)".
The default is to look through the whole search path.
This is a simple one-liner using lsf.str:
which(sapply(lsf.str(), isdebugged))
You can change environments within the function, see ?lsf.str for more arguments.
Since the original question, I've been looking more and more at Mark Bravington's debug package. If using that package, then check.for.traces() is the appropriate command to list those functions that are being debugged via mtrace.
The debug package is worth a look if one is spending much time with the R debugger and various trace options.
@cbeleites I like your answer, but it didn't work for me. I got the following to work, but it is less functional than yours above (no recursive checks, no pretty printing):
require(plyr)

debug.ls <- function(items = search()){
    .debug.ls <- function(package){
        f <- ls(package)
        active <- f[which(aaply(f, 1, function(x){
            tryCatch(isdebugged(x), error = function(e){FALSE}, finally = FALSE)
        }))]
        if(length(active) == 0){
            return(NULL)
        }
        active
    }
    functions <- lapply(items, .debug.ls)
    unlist(functions)
}
I constantly get caught in the browser window frame because of failing to undebug functions. So I have created two functions and added them to my .Rprofile. The helper functions are pretty straightforward.
require(logging)

# Returns a vector of functions on which the debug flag is set
debuggedFuns <- function() {
    envs <- search()
    debug_vars <- sapply(envs, function(each_env) {
        funs <- names(Filter(is.function, sapply(ls(each_env), get, each_env)))
        debug_funs <- Filter(isdebugged, funs)
        debug_funs
    })
    return(as.vector(unlist(debug_vars)))
}

# Removes the debug flag from all the functions returned by `debuggedFuns`
unDebugAll <- function(verbose = TRUE) {
    toUnDebug <- debuggedFuns()
    if (length(toUnDebug) == 0) {
        if (verbose) loginfo('no Functions to `undebug`')
        return(invisible())
    } else {
        if (verbose) loginfo('undebugging [%s]', paste0(toUnDebug, collapse = ', '))
        for (each_fn in toUnDebug) {
            undebug(each_fn)
        }
        return(invisible())
    }
}
I have tested them out, and it works pretty well. Hope this helps!

Is it possible to rename and block built-in functions temporarily?

I wish to temporarily rename a built-in symbol and use it under a different name, while blocking the symbol's main name. For example, I wish the following code to print only "2" but not "1" and "3":
Block[{print = Print, Print}, Print[1]; print[2]; Print[3];]
In reality, the above code prints nothing.
Is it possible to make print work inside such code while completely blocking the symbol Print?
Solutions like
With[{Print = f, print = Print}, Print[1]; print[2]; Print[3];]
are not suitable since Print is not really blocked inside such code.
The question appeared while thinking about a way to disable tracing of Message internals.
This is not very clean, but I believe it is serviceable.
Internal`InheritedBlock[{Print},
    Unprotect[Print];
    Print[x__] := Null /; ! TrueQ[$prn];
    print[x__] := Block[{$prn = True}, Print[x]];

    Print[1]; print[2]; Print[3];
]
If it is not acceptable to have the function replaced with Null in the return, you may need to use something like:
func[x__] := Hold[func[x]] /; ! TrueQ[$prn];
Followed by a ReleaseHold after the Block.
Or:
func[x__] := zz[x] /; ! TrueQ[$prn];
and then follow the Block with: /. zz -> func