R Markdown YAML "Scanner error: mapping values..." - yaml

I have noticed this issue when knitting all file types (html, pdf, word). To make sure the issue is not specific to my program, I ran the default .rmd file you get when you create a new markdown document. In each case, it does knit correctly, but I always see this at the end. I have searched online and here but cannot find an explanation.
Error in yaml::yaml.load(string, ...) :
Scanner error: mapping values are not allowed in this context at line 6, column 19
Error in yaml::yaml.load(string, ...) :
Scanner error: mapping values are not allowed in this context at line 6, column 19
Error in yaml::yaml.load(string, ...) :
Scanner error: mapping values are not allowed in this context at line 4, column 22
Here is my default YAML
---
title: "Untitled"
author: "Scott Jackson"
date: "April 20, 2017"
output: word_document
---
Line 4, column 22 is the space between the 7 and "
I'm not sure where Line 6, column 19 is, but that line is the dashes at the bottom
Any ideas?
Thank you.

I get this error when trying to add a table of contents to the YAML:
title: "STAC2020 Data Analysis"
date: "July 16, 2020"
output: html_notebook:
  toc: true
However, if I put html_notebook: on a separate line, I don't get the error:
title: "STAC2020 Data Analysis"
date: "July 16, 2020"
output:
  html_notebook:
    toc: true
I do not know why this formatting makes a difference, but it allowed my document to knit, with a table of contents.
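The formatting matters because a YAML value that is itself a mapping has to start on its own, more deeply indented line; with `output: html_notebook:` on one line the parser meets a second colon it cannot place. Here is a minimal sketch of that, using Ruby's bundled Psych parser instead of R. As far as I know both Psych and R's yaml package wrap libyaml, which would explain the identical wording of the message, but treat that and the exact line/column numbers as assumptions.

```ruby
require 'yaml'

# One-line form from the failing header: two mapping colons on the same line.
broken = <<~HEADER
  title: "STAC2020 Data Analysis"
  output: html_notebook:
    toc: true
HEADER

# Nested form: each mapping level starts on its own, more indented line.
fixed = <<~HEADER
  title: "STAC2020 Data Analysis"
  output:
    html_notebook:
      toc: true
HEADER

# Returns the parsed hash, or the parser's problem text if parsing fails.
def parse_or_problem(source)
  YAML.safe_load(source)
rescue Psych::SyntaxError => e
  e.problem
end

broken_result = parse_or_problem(broken)
fixed_result  = parse_or_problem(fixed)

puts broken_result
puts fixed_result.inspect
```

The same check works for any header you are unsure about: paste it into the heredoc and see whether it parses before knitting.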

I realize this question has gone unanswered for a while, but maybe someone can still benefit. I had the same error message and realized I had an extra header command in my YAML. I can't reproduce your exact error, but I get the same message with different line/column references with:
---
title: "Untitled"
author: "Scott Jackson"
date: "April 20, 2017"
output: output: word_document
---
Error in yaml::yaml.load(string, ...) :
Scanner error: mapping values are not allowed in this context at line 4, column 15
Calls: <Anonymous> ... parse_yaml_front_matter -> yaml_load_utf8 -> <Anonymous>
Execution halted
Line 4, column 15 seems to refer to the second colon, the one after the second "output".

I received this error when there was indentation in the wrong place.
For example, the indentation before header-includes in the code below caused the error:
---
title: "This is a title"
author: "Author Name"
  header-includes:
.
.
.
---
With the indentation removed, the following code did not produce the error:
---
title: "This is a title"
author: "Author Name"
header-includes:
.
.
.
---

Similarly to Tim Ewers I also got this error when I added a TOC to the YAML:
title: "My title"
date: "April 1, 2020"
output:
  pdf_document: default
    toc: true
  html_document: paged
However, the solution I found was to remove "default". This allowed me to knit the document without an error:
title: "My title"
date: "April 1, 2020"
output:
  pdf_document:
    toc: true
  html_document: paged

I suspect this error comes from your document's content rather than from your YAML block.
Since the rest of your document isn't shown, here is a minimal example.
> library(yaml)
> library(magrittr)
> "
+ ---
+ title: 'This is a title'
+ output: github_document
+ ---
+
+ some content
+ " %>%
+ yaml.load()
$title
[1] "This is a title"
$output
[1] "github_document"
It works well. And here is another example.
> "
+ ---
+ title: 'This is a title'
+ output: github_document
+ ---
+
+ some content
+ some content: some content
+ " %>%
+ yaml.load()
Error in yaml.load(.) :
Scanner error: mapping values are not allowed in this context at line 8, column 13
The error happens at line 8, because there is a key-value pair outside the YAML block.
yaml.load is not smart enough to stop at the end of the front matter.
My temporary workaround is to extract only the lines up to the second ---.
> text <- "
+ ---
+ title: 'This is a title'
+ output: github_document
+ ---
+
+ some content
+ some content: some content
+ "
> library(readr)
> read_lines(text,n_max = 5) %>%
+ yaml.load()
$title
[1] "This is a title"
$output
[1] "github_document"
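Counting a fixed number of lines breaks as soon as the header grows, so a slightly more robust variant of the same idea is to cut at the two `---` delimiters. Below is a minimal sketch in Ruby (not R); `front_matter` is a hypothetical helper name, and it assumes the delimiters sit alone on their lines.

```ruby
require 'yaml'

# Hypothetical helper: parse only the lines between the first two '---'
# delimiter lines, ignoring whatever follows in the document body.
def front_matter(text)
  lines = text.lines
  start = lines.index { |line| line.strip == '---' }
  return {} unless start

  rest = lines.drop(start + 1)
  stop = rest.index { |line| line.strip == '---' }
  return {} unless stop

  YAML.safe_load(rest.take(stop).join) || {}
end

doc = <<~DOC

  ---
  title: 'This is a title'
  output: github_document
  ---

  some content
  some content: some content
DOC

meta = front_matter(doc)
puts meta.inspect
```

The body line `some content: some content` never reaches the parser, so it cannot trigger the scanner error.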

I had a similar problem and made a request in the YAML and rticles help pages:
https://github.com/viking/r-yaml/issues/92
https://github.com/rstudio/rticles/issues/363

I know this is a 5 year old question but I just got this same error as I was missing a colon
---
title: ''
output:
  pdf_document
    includes:
      before_body: before_body.tex
---
should have been
---
title: ''
output:
  pdf_document:
    includes:
      before_body: before_body.tex
---
and while that doesn't strictly answer the example given, I hope it will help future sufferers of this error message.

Related

updating a yaml file with ruby truncating spaces and adding dash

I have a YAML file which contains the following data:
:books:
    :action1:
        :book_name: name1
        :book_author: author1
        :publish_date: 2009
    :action2:
        :book_name: name2
        :book_author: author2
        :publish_date: 2016
I am trying to update the YAML file using a simple Ruby snippet:
require 'yaml'
test = YAML::load_file('books_details.yml')
test[:books][:action1][:book_name] = "book x"
test[:books][:action1][:book_author] = "author y"
test[:books][:action1][:publish_date] = "2019"
File.open('books_details.yml', 'w') { |f| YAML.dump(test, f) }
This works, but I get the following output:
---
:books:
  :action1:
    :book_name: book x
    :book_author: author y
    :publish_date: 2019
  :action2:
    :book_name: name2
    :book_author: author2
    :publish_date: 2016
There is a --- added to the top of the file and the spaces are truncated.
Is there any other library that I could use that would not remove spaces or prepend --- to the file?
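One possible approach that stays with the standard library, sketched under assumptions: Psych's `indentation` dump option controls the indent width, and the leading `---` document marker can be stripped from the dumped string. The 4-space width and the in-memory `books` hash are illustrative only, and this re-serializes the data rather than preserving the original file's comments or layout.

```ruby
require 'yaml'

# Hypothetical in-memory copy of the data from books_details.yml.
books = {
  books: {
    action1: { book_name: 'book x', book_author: 'author y', publish_date: 2019 },
    action2: { book_name: 'name2', book_author: 'author2', publish_date: 2016 }
  }
}

# Emit with 4-space indentation, then drop the leading document marker.
dumped = YAML.dump(books, indentation: 4)
without_marker = dumped.sub(/\A---\n/, '')

puts without_marker
```

Writing `without_marker` to the file with `File.write` gives a file without the `---` line. The marker itself is valid YAML, so any parser reading the file accepts it either way.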

Change line spacing for RMD abstract?

Is it possible to change the line spacing for the abstract specified in my YAML header to single space, while leaving the rest of the document in double space? My YAML is below:
---
output: pdf_document
number_sections: true
title: |
  | My Title
author:
- Me
header-includes:
- \usepackage{setspace}\doublespacing
- \usepackage{float}
abstract: "My abstract"
keywords: "My keywords"
date: "`r format(Sys.time(), '%B %d, %Y')`"
geometry: margin=1in
fontsize: 12pt
spacing: double
fig_caption: yes
indent: true
---
I've tried wrapping the abstract like so, but it did not work:
abstract:
- \usepackage{setspace}\singlespacing
"My abstract"
- \end{singlespacing}
The abstract is automatically wrapped, so it is enough to use \singlespacing before it:
---
output: pdf_document
number_sections: true
title: |
  | My Title
author:
- Me
header-includes:
- \usepackage{setspace}\doublespacing
- \usepackage{float}
abstract: \singlespacing My abstract which has to be long enough to take multiple
  lines otherwise one does not see the effect of single-spacing.
keywords: "My keywords"
date: "`r format(Sys.time(), '%B %d, %Y')`"
geometry: margin=1in
fontsize: 12pt
fig_caption: yes
indent: true
---
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for
authoring HTML, PDF, and MS Word documents. For more details on using R Markdown
see <http://rmarkdown.rstudio.com>.

Parse multiline text with pattern

here is a little example:
02-09-17 1:01 PM - Some User (Add comments)
Hello,
How are you?
Regards,
02-09-17 3:29 PM - Another User (Add comments)
Hey,
Thanks, all is fine.
Some another text here.
02-09-17 4:30 AM - Just a User (Add comments)
some text
with
multiline
I want to parse and process these three comments. What is the best way to do this?
I tried a regex like this - http://www.rubular.com/r/k1CHJ1STTD - but have problems with the /m flag. Without the multiline flag, the regex can't capture the "body" of the comment.
Also tried to split by regex:
text_above.split(/^(\d{1,2}-\d{1,2}-\d{2} \d{1,2}:\d{1,2} [AP]M - .+ \(Add comments\))/)
=> ["",
"02-09-17 1:01 PM - Some User (Add comments)",
"\n" + "Hello,\n" + "\n" + "How are you?\n" + "\n" + "Regards,\n" + "\n",
"02-09-17 3:29 PM - Another User (Add comments)",
"\n" + "Hey,\n" + "\n" + "Thanks, all is fine.\n" + "\n" + "Some another text here.\n" + "\n",
"02-09-17 4:30 AM - Just a User (Add comments)",
"\n" + "some text\n" + "with\n" + "multiline\n" + "\n",
"02-09-17 5:29 PM - Another User (Add comments)",
"\n" + "Hey,\n" + "\n" + "Thanks, all is fine.\n" + "\n" + "Some another text here.\n" + "\n",
"02-09-17 6:30 AM - Just a User (Add comments)",
"\n" + "some text\n" + "with\n" + "multiline\n"]
But this is not a convenient solution.
Ideally I want to get regex captures with three or two group matches, for example:
1. 02-09-17 1:01 PM
2. Some User (Add comments)
3. Hello,
How are you?
Regards,
for each comment, or, Array of comments:
[['02-09-17 1:01 PM - Some User (Add comments) Hello,
How are you?
Regards,'],[...]]
Any ideas? Thanks.
You can keep it simple using two splits (one for the whole string and one for each block):
text.split(/\n\n(?=\d\d-)/).map { |m| m.split(/ - |\n/, 3) }
You can also use the scan method, but it's a little more fastidious:
text.scan(/([\d-]+[^-]+) - (.*)\n(.*(?>\n.*)*?(?=\n\n\d\d-|\z))/)
slice_before might be easier to understand than a huge scan, and it has the advantage of keeping the pattern (split removes it)
data = text.each_line.slice_before(/^\d\d\-\d\d\-\d\d/).map do |block|
time, user = block.shift.strip.split(' - ')
[time, user, block.join.strip]
end
p data
# [["02-09-17 1:01 PM",
# "Some User (Add comments)",
# "Hello,\n\nHow are you?\n\nRegards,"],
# ["02-09-17 3:29 PM",
# "Another User (Add comments)",
# "Hey,\n\nThanks, all is fine.\n\nSome another text here."],
# ["02-09-17 4:30 AM",
# "Just a User (Add comments)",
# "some text\nwith\nmultiline"]]
You can use this regular expression:
(\d{2}-\d{2}-\d{2} \d{1,2}:\d{2} (?:AM|PM)) - (.*?)\r?\n((?:.|\r?\n)+?)(?=\r?\n\d{2}-\d{2}-\d{2} \d{1,2}:\d{2} (?:AM|PM) - |$)
(\d{2}-\d{2}-\d{2} \d{1,2}:\d{2} (?:AM|PM)) matches the first group, the date and time. The date must consist of three numbers, separated by a dash, followed by the time with AM/PM
(.*?)\r?\n((?:.|\r?\n)+?) matches the username up to the first line break (\r?\n) as the second group. Afterwards, anything including linebreaks is matching and building the third group, the comment.
This won't work, because it would handle everything from the beginning of the comment up to the end of the file as a comment. Therefore, you need to select the next date/time format, so that it stops there. You can do this just by repeating the date/time format after the comment and matching non-greedy, but this will include the next datetime already in the current match and therefore exclude it in the next match (which will lead to a skip of every second match). To circumvent this, you can use a positive lookahead: (?=\r?\n\d{2}-\d{2}-\d{2} \d{1,2}:\d{2} (?:AM|PM) - |$). This matches a number afterwards, but does not include it in the match. The last comment must then end at the end of the string $.
You need to use the global flag /g but mustn't use the multi-line flag /m, because the comment body spans multiple lines and with /m the $ would match at the end of every line.
Here is a live example: https://regex101.com/r/o63GQE/2
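Since the question is about Ruby, here is a hedged adaptation of the lookahead pattern using `String#scan`, which plays the role of the /g flag. One difference from the regex101 flavor: in Ruby `$` matches at the end of every line, so this sketch swaps it for `\z` (end of string). `HEADER` and `COMMENT` are names introduced for illustration, and the sample text is shortened.

```ruby
# Date/time stamp that opens each comment, e.g. "02-09-17 1:01 PM".
HEADER = /\d{2}-\d{2}-\d{2} \d{1,2}:\d{2} (?:AM|PM)/

# Groups: 1 = timestamp, 2 = user, 3 = body. The lookahead stops the lazy
# body at the next timestamp line, or at the end of the string (\z).
COMMENT = /(#{HEADER}) - (.*?)\r?\n((?:.|\r?\n)+?)(?=\r?\n#{HEADER} - |\z)/

text = <<~TEXT
  02-09-17 1:01 PM - Some User (Add comments)
  Hello,

  How are you?
  02-09-17 3:29 PM - Another User (Add comments)
  Hey,
  Thanks, all is fine.
TEXT

# scan returns one [time, user, body] array per comment; strip trims the
# trailing newline that the last body picks up before \z.
comments = text.scan(COMMENT).map { |time, user, body| [time, user, body.strip] }

comments.each { |comment| p comment }
```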

Why doesn't YAML alias work unless I use flow style?

If you run this YAML 1.1
- &first {'first': ['description', ['aliases'], ["Explanatory sentences ", "go here."]]}
- *first
- &second 'second':
  - 'description'
  - ['aliases']
  -
    - "Explanatory sentences "
    - "go here."
- *second
through YAMLlint, you get this:
---
-
  first:
    - description
    -
      - aliases
    -
      - "Explanatory sentences "
      - "go here."
-
  first:
    - description
    -
      - aliases
    -
      - "Explanatory sentences "
      - "go here."
-
  second:
    - description
    -
      - aliases
    -
      - "Explanatory sentences "
      - "go here."
- second
Notice that the first group is repeated twice, while the second group is only shown in full once, with just the name where the repeated block should be. The first group and the second group have exactly the same data - the only difference is the layout. Why doesn't the alias work properly for the second group?
My best guess is that the &anchor has very high precedence. I tried this
- &first 'first': ['description', ['aliases'], ["Explanatory sentences ", "go here."]]
- *first
Rather than this:
- &first {'first': ['description', ['aliases'], ["Explanatory sentences ", "go here."]]}
- *first
And suddenly it behaved the same way as the second group. So it appears that unless you explicitly include the 'first' in a larger node, the &first anchor attaches to just the 'first' string and nothing else.
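That guess can be checked with a small sketch using Ruby's Psych (I believe YAMLlint is Ruby-based too, but that is an assumption). The shortened documents below are illustrative; the third case shows one way to anchor the whole block mapping, by putting the anchor on its own line ahead of it. `load_with_aliases` is a hypothetical helper that works around newer Psych versions rejecting aliases in `YAML.load`.

```ruby
require 'yaml'

# Psych 4 rejects aliases in YAML.load by default, so prefer unsafe_load
# where it exists and fall back to load on older versions.
def load_with_aliases(source)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(source) : YAML.load(source)
end

# Anchor in front of a flow mapping: the anchor covers the whole mapping.
flow = <<~DOC
  - &first {'first': ['description']}
  - *first
DOC

# Anchor directly in front of a block mapping key: it covers only the key.
block = <<~DOC
  - &second 'second':
    - 'description'
  - *second
DOC

# Anchor on its own line before the block mapping: it covers the mapping.
anchored_block = <<~DOC
  - &second
    'second':
      - 'description'
  - *second
DOC

flow_result = load_with_aliases(flow)
block_result = load_with_aliases(block)
anchored_result = load_with_aliases(anchored_block)

p flow_result
p block_result
p anchored_result
```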

Strange behavior splitting arrays with Ruby (v1.9.2)

I am trying to handle an array with Ruby v1.9.2 but it has some strange behavior.
The best explanation may be done with examples:
CASE 1 TEST
#test1 = "image/bmp, image/gif, image/jpg".split(',')
Debug #test1:
---
- image/bmp # why this?!
- " image/gif"
- " image/jpg"
CASE 2 TEST
#test2 = ", image/bmp, image/gif, image/jpg".split(',')
Debug #test2:
---
- "" # why this?!
- " image/bmp"
- " image/gif"
- " image/jpg"
WHAT I NEED
Notice: I can use the CASE 2 TEST, but I would like to do things right and better.
Debug that I would like to have:
---
- " image/bmp"
- " image/gif"
- " image/jpg"
In the test case 1 there is no space before "image/bmp" in the result because there is no space before "image/bmp" in the original string.
In the test case 2 there is an empty string at the beginning because the string starts with a comma, and for every separator in the string there is a string in the resulting array, containing what comes before that separator (which in this case means the empty string).
If you want the result you've shown, you could just add a space (but no comma) before "image/bmp" in the source string. Alternatively you could split by /, */ and then add one space before each string with map. Though frankly I don't get why you want a space before each string.
>> ", image/bmp, image/gif, image/jpg".split(/\s*,\s*/).select{|x| x!=""}
=> ["image/bmp", "image/gif", "image/jpg"]
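To see the cases side by side, here is a minimal sketch; the variable names are illustrative, and `reject(&:empty?)` is simply a shorter spelling of the `select` above.

```ruby
# Case 1: no leading comma, so the first element has no leading space.
plain = "image/bmp, image/gif, image/jpg".split(',')

# Case 2: a leading comma yields an empty string before the first separator.
leading = ", image/bmp, image/gif, image/jpg".split(',')

# Split on commas with optional surrounding whitespace, then drop empties.
cleaned = ", image/bmp, image/gif, image/jpg".split(/\s*,\s*/).reject(&:empty?)

p plain
p leading
p cleaned
```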
