An ampersand at the start of a YAML entry is normally seen as a label for a set of data that can be referenced later. How do you escape a legitimate ampersand at the start of a YAML entry? For example:
---
- news:
    news_text: &ldquo;Text!&rsquo;
I don't want &ldquo to be treated as a label within the YAML file. Instead, when I parse the YAML file, news_text should come back with the &ldquo; in the entry.
Just put quotes around the text:
require 'yaml'
data = <<END
---
- news:
    news_text: "&ldquo;Text!&rsquo;"
END
puts YAML.load(data).inspect
# produces => [{"news"=>{"news_text"=>"&ldquo;Text!&rsquo;"}}]
You can probably enclose the text in quotes:
---
- news:
    news_text: "&ldquo;Text!&rsquo;"
Besides, you could just as well use the proper characters there:
---
- news:
    news_text: “Text!’
Putting escapes specific to a totally different markup language into a document written in another markup language seems ... odd to me, somehow.
Or you could put the string on the next line by putting a '>' or '|' at the spot where the string used to be. With '|' the parser keeps your line breaks, while '>' folds them into spaces, turning the text into one long string.
- news:
    news_text: >
      &ldquo;Text!&rsquo;
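To see the difference between the two block styles, here is a minimal Ruby sketch that parses both forms. It assumes the &ldquo;Text!&rsquo; entity text from the question; the "second line" is a hypothetical addition, there only to make the line-break handling visible.

```ruby
require 'yaml'

# "|" (literal style): line breaks inside the block are kept.
literal = YAML.load(<<~DOC)
  news_text: |
    &ldquo;Text!&rsquo;
    second line
DOC

# ">" (folded style): single line breaks are folded into spaces.
folded = YAML.load(<<~DOC)
  news_text: >
    &ldquo;Text!&rsquo;
    second line
DOC

p literal['news_text']
p folded['news_text']
```

In both cases the leading ampersand is plain text, because the content of a block scalar is never read as an anchor.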
Putting the entire string in single quotes would do what you want:
---
- news:
    news_text: '&ldquo;Text!&rsquo;'
But I think any YAML library should be smart enough to do that for you?
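One way to check that claim for Ruby's bundled YAML library is a round trip: dump a string that starts with an ampersand and load it back. This is only a sketch. The exact quoting style the emitter picks is not assumed here, only that the value survives.

```ruby
require 'yaml'

original = { 'news_text' => '&ldquo;Text!&rsquo;' }

# Let the emitter decide how to quote the value.
dumped = original.to_yaml
puts dumped

# Loading the dumped text should give back the same string, ampersand included.
round_tripped = YAML.load(dumped)
p round_tripped
```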
First question for me on Stack Overflow.
I am trying to write a Bash script to convert the kind of GitHub Wiki links generated for other internal GitHub Wiki pages into conventional Markdown-style links.
The GitHub Wiki link strings look like this:
[[An example of another page]]
I want to convert it to look like this:
[An example of another page](An-example-of-another-page.htm)
Documents have an unknown number of these links and I don't know the content.
Currently I have been playing around with one-line sed solutions given to other problems, like this one:
https://askubuntu.com/questions/1283471/inserting-text-to-existing-text-within-brackets
... with absolutely no success. I'm not even sure where to start with it.
Thanks.
You can try this sed command:
$ sed -E 's/\[(.[^]]*)\]/\1/g;s/\[(.[^]]*)]/&(\1)/g;:jump s/(\([^ \)]*)[ ]/\1-/;tjump' input_file
[An example of another page](An-example-of-another-page)
s/\[(.[^]]*)\]/\1/g - Remove the outer pair of brackets, turning [[text]] into [text]
s/\[(.[^]]*)]/&(\1)/g - Duplicate the content inside the brackets []: & returns the whole match, and (\1) appends the captured text in parentheses
:jump s/(\([^ \)]*)[ ]/\1-/;tjump - Create a label jump, then repeatedly match a space inside the parentheses and replace it with - until none remain
You can use bash's internal regular expression support to find and replace instances of wiki linked [[text]] with [text](text.htm). The pattern you want to use is \[\[([^\]]*)\]\]
\[ and \] - escapes the left and right square brackets so that they aren't interpreted as meta-characters that let you match character classes
([^\]]*) captures all text inside the double brackets until the first right square bracket
From there you can evaluate this regex and use the $BASH_REMATCH array to extract and manipulate the text. You'll need to run this multiple times in order to match all instances in the string and then replace the string inline using the / and // operators.
Here's a sample script:
#!/usr/bin/env bash
wiki_string="Now, this is [[a story]] all about how
My life [[got flipped-turned upside down]]
And I'd [[like to take a minute]]
Just [[sit]] right there
I'll [[tell you]] how I [[became the prince]] of a town called Bel-Air"
printf 'Original: %s\n' "$wiki_string"
# find each instance of [[text]] and capture the text inside
# the square brackets
# if successful, BASH_REMATCH will contain the matched text and the
# captured value inside the parentheses
while [[ "$wiki_string" =~ \[\[([^\]]*)\]\] ]]; do
  # escape the [ and ] characters so we can replace [[text]]
  # with our modified value
  replace_text="${BASH_REMATCH[0]}"
  replace_text="${replace_text/\[\[/\\[\\[}"
  replace_text="${replace_text/\]\]/\\]\\]}"
  # Get the matched value inside the brackets
  link_text="${BASH_REMATCH[1]}"
  # store another copy of the text with the spaces replaced
  # with dashes and appending .htm
  link_target="${link_text// /-}.htm"
  # Finally, replace the matched [[text]] with [text](text.htm)
  wiki_string="${wiki_string//$replace_text/[$link_text]($link_target)}"
done
printf '\nUpdated: %s\n' "$wiki_string"
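For comparison, the same capture-and-replace idea can be sketched in Ruby, where a single gsub with a block handles every link in one pass. The method name convert_wiki_links and the sample sentence are made up for illustration. The .htm suffix follows the question.

```ruby
# Convert [[Page name]] wiki links into [Page name](Page-name.htm).
def convert_wiki_links(text)
  text.gsub(/\[\[([^\]]+)\]\]/) do
    link_text = Regexp.last_match(1)              # text inside the double brackets
    link_target = "#{link_text.tr(' ', '-')}.htm" # spaces become dashes
    "[#{link_text}](#{link_target})"
  end
end

puts convert_wiki_links('See [[An example of another page]] and [an old link](old.htm).')
```

Because the pattern requires two opening brackets, an existing Markdown-style link such as [an old link](old.htm) is left alone.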
Thanks to HatLess for the answer which I adapted. The snippet below converts Github-style links into Markdown-style links, without the two issues that HatLess's solution had. Specifically this doesn't break pre-existing Markdown-style links and it doesn't replace spaces with hyphens within brackets unless part of a link.
sed -E 's/\[\[(.[^]]*)]]/&(support\-\1\.htm)/g;:jump s/(]\([^ \)]*)[ ]/\1-/;tjump;s/\[\[/\[/g;s/]]\(/]\(/g' | pandoc -t html
I'm trying to convert TXT files into pipe-delimited text files.
Let's say I have a file called sample.csv:
aaa",bbb"ccc,"ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","nnn"ooo,ppp"qqq",rrr" sss,"ttt,""uuu",Z
I'd like to convert this into an output that looks like this:
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
Now after tons of searching, I have come the closest using this sed command:
sed -r 's/""/\v/g;s/("([^"]+)")?,/\2\|/g;s/"([^"]+)"$/\1/;s/\v/"/g'
However, the output that I received was:
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|pppqqq|rrr" sss|ttt,"uuu|Z
The expected value for the 9th column is ppp"qqq", but the result dropped the double quotes and what I got was pppqqq.
I have been playing around with this for a while, but to no avail.
Any help regarding this would be highly appreciated.
As suggested in the comments, sed or any other Unix tool is not recommended for this kind of complex CSV string. It is much better to use a dedicated CSV parser, like this one in PHP:
$s = 'aaa",bbb"ccc,"ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","nnn"ooo,ppp"qqq",rrr" sss,"ttt,""uuu",Z';
echo implode('|', str_getcsv($s));
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|nnnooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
The problem with sample.csv is that it mixes non-quoted fields (containing quotes) with fully quoted fields (that should be treated as such).
You can't have both at the same time. Either all fields are (treated as) unquoted and quotes are preserved, or all fields containing a quote (or separator) are fully quoted and the quotes inside are escaped with another quote.
So, sample.csv should become:
"aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z
to give you the desired result (using a csv parser):
aaa"|bbb"ccc|ddd,eee|fff|ggg,hhh,iii|jjj kkk|lll" mmm|"nnn"ooo|ppp"qqq"|rrr" sss|ttt,"uuu|Z
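As a quick check of that claim, here is a minimal sketch using Ruby's csv library (a choice made for this example; any standards-following CSV parser should behave the same way). It parses the fully quoted line and joins the fields with pipes.

```ruby
require 'csv'

# The fully quoted line from above, kept verbatim in a single-quoted
# heredoc so that no Ruby escaping is needed.
line = <<~'ROW'.chomp
  "aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z
ROW

fields = CSV.parse_line(line) # doubled quotes inside quoted fields become one quote
piped = fields.join('|')
puts piped
```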
I have the same problem.
I got the right result with https://www.papaparse.com/demo
It is FOSS on GitHub, so maybe you can check how it works.
With the source of [ "aaa""","bbb""ccc","ddd,eee",fff,"ggg,hhh,iii","jjj kkk","lll"" mmm","""nnn""ooo","ppp""qqq""","rrr"" sss","ttt,""uuu",Z ]
The result appears in the browser console:
[1]: https://i.stack.imgur.com/OB5OM.png
I've browsed similar questions and believe I've applied everything I've been able to glean from the answers.
I have a .yml file where, as far as I can tell, each element is formatted identically. And yet, according to YamlLint.com:
(<unknown>): mapping values are not allowed in this context at line 119 column 16
In this case, line 119 is the line containing the second instance of the word "transitions" below. As far as I can tell, each element is formatted identically. Am I missing something here?
landingPage:
    include: false
    transitions:
        -
            condition:location
            nextState:location
location:
    include:false
    transitions:
        -
            condition:excluded
            nextState:excluded
excluded:
    include:false
    transitions:
        -
            condition:excluded
            nextState: excluded
        -
            condition:age
            nextState:age
You cannot have a multi-line plain scalar, such as your include:false transitions, be the key to a mapping. That is why you get the "mapping values are not allowed in this context" error.
Either you forgot that you have to have a space after the value indicator (:), and you meant to do:
include: false
transitions:
or you need to quote your multi-line scalar:
'include:false
transitions':
or you need to put that plain scalar on one line:
include:false transitions:
Please note that some libraries do not allow value indicators in a plain scalar at all, even if they are not followed by a space.
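The effect of the missing space can be reproduced with Ruby's bundled YAML parser, which reports errors in the same format as the message in the question. The sketch below is an illustration under that assumption; the broken document is a shortened, hypothetical version of the question's file.

```ruby
require 'yaml'

# With a space after the colon this is a mapping entry.
with_space = YAML.load('include: false')

# Without the space the whole line is one plain scalar (a string).
without_space = YAML.load('include:false')

# A plain scalar that continues onto a second line cannot be a mapping key.
broken = "location:\n  include:false\n  transitions:\n    - condition: excluded\n"
error_message =
  begin
    YAML.load(broken)
    nil
  rescue Psych::SyntaxError => e
    e.message
  end

p with_space
p without_space
puts error_message
```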
I fixed this for myself by simply realizing I had indented a line too far, and un-indenting it.
We need to put a space after the ":".
Then it will execute.
Check the YAML with the linter below:
http://www.yamllint.com/
There are a couple of issues in the YAML file. YAML files get messy easily; fortunately, problems can be identified with tools like yaml-lint.
Install it:
npm install -g yaml-lint
Here is how you can validate:
E:\githubRepos\prometheus-sql-exporter-usage\etc>yamllint prometheus.yaml
√ YAML Lint successful.
For me the problem was a Unicode '-' from a cut and paste. Visually it looked OK, but the character was 'EN DASH' (U+2013) instead of 'HYPHEN-MINUS' (U+002D).
In my case it was the space after the : in a value:
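Characters like this are hard to spot by eye, so a small scan can help. The following Ruby sketch is one possible approach. The list of suspect characters and the method name suspicious_chars are assumptions made for this example, not a complete catalogue.

```ruby
# Look-alike characters that often arrive via copy and paste
# (an illustrative list, not an exhaustive one).
SUSPECTS = ["\u2013", "\u2014", "\u00A0"].freeze # en dash, em dash, no-break space

# Returns [line, column, code point] for every suspect character found.
def suspicious_chars(text)
  hits = []
  text.each_line.with_index(1) do |line, lineno|
    line.each_char.with_index(1) do |ch, col|
      hits << [lineno, col, format('U+%04X', ch.ord)] if SUSPECTS.include?(ch)
    end
  end
  hits
end

sample = "items:\n  \u2013 first\n  - second\n"
p suspicious_chars(sample)
```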
query-url: https://blabla.com/blabla?label=blabla: blabla
To fix:
query-url: https://blabla.com/blabla?label=blabla:%20blabla
Or:
query-url: "https://blabla.com/blabla?label=blabla: blabla"
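The three variants can be compared with a short Ruby sketch using the bundled YAML parser. The helper try_load is a hypothetical name introduced here so that a parse error is reported as text instead of aborting the script.

```ruby
require 'yaml'

# Hypothetical helper: return the parsed value, or the error text on failure.
def try_load(yaml_text)
  YAML.load(yaml_text)
rescue Psych::SyntaxError => e
  "error: #{e.message}"
end

# ": " inside an unquoted value looks like a second mapping indicator.
unquoted = try_load('query-url: https://blabla.com/blabla?label=blabla: blabla')
# Percent-encoding the space removes the ": " sequence.
encoded = try_load('query-url: https://blabla.com/blabla?label=blabla:%20blabla')
# Quoting the value keeps the ": " as plain text.
quoted = try_load('query-url: "https://blabla.com/blabla?label=blabla: blabla"')

p unquoted
p encoded
p quoted
```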
If you are using powershell and have copied the cat command, it won't work properly (I'm guessing it is encoding the content in some way). Instead of using "$(cat file.yaml)" you should use $(Get-Content file.yaml -Raw) without the quotes.
Really annoying!
In my case it was an odd loss of the original formatting of a chart that was copied in IntelliJ IDEA. It was only possible to figure out with a text-compare tool.
So, when you copy and paste in your IDE, please double-check that what you pasted is exactly what you copied and that no additional spaces were added.
This is related to cleaning files before parsing them elsewhere, namely, malformed/ugly CSV. I see plenty of examples for removing/matching all characters between certain strings/characters/delimiters, but I cannot find any for specific strings. Example portion of line would look something like:
","Should now be allowed by rule above "Server - Access" added by Rich"\r
To be clear, this is not the entire line, but the entire line is enclosed in quotes and separated by "," and ends in ^M (Windows newline/carriage return). The 'columns' preceding this would be enclosed at each side by ",". I would probably use this too to remove cruft that appears earlier in the line.
What I am trying to get to is the removal of all double quotes between "," and "\r ("Server - Access" - these ones) without removing the delimiters. Alternatively, I may just find and replace them with \" to delimit them for the Ruby CSV library. So far I have this:
(?<=",").*?(?="\\r)
Which basically matches everything between the delimiters. If I replace .*? with anything, be that a letter, double quotes etc, I get zero matches. What am I doing wrong?
Note: This should be Ruby compatible please.
If I understand you correctly, you can use negative lookahead and lookbehind:
text = '","Should now be allowed by rule above "Server - Access" added by Rich"\r'
puts text.gsub(/(?<!,)"(?![,\\r])/, '\"')
# ","Should now be allowed by rule above \"Server - Access\" added by Rich"\r
Of course, this won't work if the values themselves can contain commas and new lines...
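A different technique, sketched here as an alternative and not as the method above, is to split each line on the "," delimiter and double the inner quotes, since doubling is the escape form Ruby's CSV library reads. It assumes every field is wrapped in quotes and that the sequence "," never occurs inside a field. The method name requote_line and the first column id-1 are made up for the example.

```ruby
require 'csv'

# Assumes every field is quoted and that the three-character
# sequence "," never occurs inside a field.
def requote_line(raw_line)
  inner = raw_line.chomp[1..-2]   # drop the line ending and the outer quotes
  fields = inner.split('","', -1) # split on the "," delimiter, keep empty fields
  fields.map { |f| '"' + f.gsub('"', '""') + '"' }.join(',')
end

raw = %("id-1","Should now be allowed by rule above "Server - Access" added by Rich"\r\n)
clean = requote_line(raw)
row = CSV.parse_line(clean)
p row
```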
For example:
code = <<-EOH
bundle install
bundle exec unicorn -c /etc/unicorn.cfg -D
EOH
What does this code do? What is <<- called?
It's called a heredoc: an easy way to define multiline strings which may include single or double quotes without needing to escape them.
See more here, for example.
Often you use heredocs to define large chunks of code. Some editors know about this and can highlight syntax for you there (if you specify the language).
There is also a newer heredoc syntax for Ruby, <<~END, that more closely resembles what you would typically see in most shells and other languages. The ~ in place of the - tells Ruby to strip leading whitespace, matching the least-indented line in the block.
https://infinum.co/the-capsized-eight/multiline-strings-ruby-2-3-0-the-squiggly-heredoc
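The difference between the two forms is easiest to see side by side. In this sketch the method names dash_heredoc and squiggly_heredoc are invented for the example, and the second line is indented further on purpose to show that only the common indentation is removed. The <<~ form needs Ruby 2.3 or later.

```ruby
# <<- only lets the closing delimiter be indented; the body keeps its indentation.
def dash_heredoc
  <<-EOH
    bundle install
      bundle exec unicorn -c /etc/unicorn.cfg -D
  EOH
end

# <<~ also strips the indentation of the least-indented body line.
def squiggly_heredoc
  <<~EOH
    bundle install
      bundle exec unicorn -c /etc/unicorn.cfg -D
  EOH
end

p dash_heredoc
p squiggly_heredoc
```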
Looks to me like a heredoc. The - allows the ending delimiter to be indented, so whitespace before it is ignored.
A simple Google Search gave me this.