How to pass a heredoc as an argument to a method? - ruby

Code example:
create_data_with(
first: "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
second: <<~TEXT
Aenean vel ex bibendum, egestas tortor sit amet, tempus lorem. Ut sit
amet rhoncus eros. Vestibulum ante ipsum primis in faucibus orci
luctus et ultrices posuere cubilia curae; Quisque non risus vel lacus
tristique laoreet. Curabitur quis auctor mauris, nec tempus mauris.
TEXT,
third: "Nunc aliquet ipsum at semper sodales."
)
The error is present in this line:
second: <<~TEXT
RuboCop describes it like this:
Lint/Syntax: unterminated string meets end of file
(Using Ruby 3.1 parser; configure using TargetRubyVersion parameter, under AllCops)
second: <<~TEXT
What is the correct syntax? I need to keep this layout and keep using <<~.

Another option is to move the heredoc after the method call. However, since the heredoc starts on the line following its identifier, your method call must not span multiple lines:
create_data_with(first: "foo", second: <<~TEXT, third: "bar")
Aenean vel ex bibendum, egestas tortor sit amet, tempus lorem. Ut sit
amet rhoncus eros. Vestibulum ante ipsum primis in faucibus orci
luctus et ultrices posuere cubilia curae; Quisque non risus vel lacus
tristique laoreet. Curabitur quis auctor mauris, nec tempus mauris.
TEXT
For longer values, you could use multiple heredocs:
create_data_with(first: <<~FIRST, second: <<~SECOND, third: <<~THIRD)
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
FIRST
Aenean vel ex bibendum, egestas tortor sit amet, tempus lorem. Ut sit
amet rhoncus eros. Vestibulum ante ipsum primis in faucibus orci
luctus et ultrices posuere cubilia curae; Quisque non risus vel lacus
tristique laoreet. Curabitur quis auctor mauris, nec tempus mauris.
SECOND
Nunc aliquet ipsum at semper sodales.
THIRD
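To check how Ruby pairs several heredoc openers on one line with the bodies that follow, here is a minimal runnable sketch. The question does not show the definition of create_data_with, so the version below is a hypothetical stub that simply returns its keyword arguments, and the bodies are shortened for readability.

```ruby
# Hypothetical stand-in for the method from the question; it returns
# its keyword arguments as a hash so the values can be inspected.
def create_data_with(first:, second:, third:)
  { first: first, second: second, third: third }
end

# The bodies are read in the same order as the openers appear on the line.
data = create_data_with(first: <<~FIRST, second: <<~SECOND, third: <<~THIRD)
  Lorem ipsum.
FIRST
  Aenean vel ex bibendum,
  egestas tortor.
SECOND
  Nunc aliquet.
THIRD

p data
```

Each value keeps its trailing newline, and <<~ strips the common leading indentation from every body.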

With heredocs, the parser expects the exact delimiter, alone on its line, to close the literal. You open with TEXT, but you close with TEXT, (note the trailing comma), so Ruby doesn't consider the literal closed. However, you can (and in this case should) put the comma after the opening delimiter. Here's a fix:
create_data_with(
first: "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
second: <<~TEXT,
Aenean vel ex bibendum, egestas tortor sit amet, tempus lorem. Ut sit
amet rhoncus eros. Vestibulum ante ipsum primis in faucibus orci
luctus et ultrices posuere cubilia curae; Quisque non risus vel lacus
tristique laoreet. Curabitur quis auctor mauris, nec tempus mauris.
TEXT
third: "Nunc aliquet ipsum at semper sodales."
)
You can even call methods on the heredoc this way, by chaining them onto the opening delimiter. For example, before the squiggly heredoc (<<~TEXT) was available, the same effect was achieved in Rails with <<-TEXT.strip_heredoc
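As a minimal sketch of chaining a method onto the opening delimiter, the example below uses String#chomp to drop the heredoc's trailing newline. The create_data_with method here is a hypothetical stub (its real definition is not shown in the question), and the text is shortened.

```ruby
# Hypothetical stand-in for the method from the question; it returns
# its keyword arguments as a hash so the values can be inspected.
def create_data_with(first:, second:, third:)
  { first: first, second: second, third: third }
end

data = create_data_with(
  first: "Lorem ipsum.",
  # The method call and the comma both go on the opening line;
  # the closing TEXT stays alone on its own line.
  second: <<~TEXT.chomp,
    Aenean vel ex bibendum,
    egestas tortor sit amet.
  TEXT
  third: "Nunc aliquet."
)

p data[:second]
```

Here data[:second] contains the two lines joined by a single newline, with no newline at the end, because chomp is applied to the whole heredoc string.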

Related

Processing a specific part of a text according to pattern from AWK script

I'm developing an awk script to convert a TeX document into HTML, according to my preferences.
#!/bin/awk -f
BEGIN {
FS="\n";
print "<html><body>"
}
# Function to print a row with one argument to handle either a 'th' tag or 'td' tag
function printRow(tag) {
for(i=1; i<=NF; i++) print "<"tag">"$i"</"tag">";
}
NR>1 {
[conditions]
printRow("p")
}
END {
print "</body></html>"
}
It's at a very early stage of development, as you can see. The input document looks like this:
\documentclass[a4paper, 11pt, titlepage]{article}
\usepackage{fancyhdr}
\usepackage{graphicx}
\usepackage{imakeidx}
[...]
\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla placerat lectus sit amet augue facilisis, eget viverra sem pellentesque. Nulla vehicula metus risus, vel condimentum nunc dignissim eget. Vivamus quis sagittis tellus, eget ullamcorper libero. Nulla vitae fringilla nunc. Vivamus id suscipit mi. Phasellus porta lacinia dolor, at congue eros rhoncus vitae. Donec vel condimentum sapien. Curabitur est massa, finibus vel iaculis id, dignissim nec nisl. Sed non justo orci. Morbi quis orci efficitur sem porttitor pulvinar. Duis consectetur rhoncus posuere. Duis cursus neque semper lectus fermentum rhoncus.
\end{document}
What I want is for the script to interpret only the lines between \begin{document} and \end{document}, since everything before them is package imports, variables, and so on, which don't interest me at the moment.
How do I make it so that it only processes the text within that pattern?
GNU AWK, like any POSIX awk, has a feature called a range pattern: when you provide two conditions separated by a comma, the action is applied to every line from the one matching the first condition through the one matching the second (both lines included). Consider the following simple example. Let the content of file.txt be
junk
\begin{document}
desired text
more desired text
\end{document}
more junk
then
awk '$0=="\\begin{document}",$0=="\\end{document}"{print}' file.txt
gives output
\begin{document}
desired text
more desired text
\end{document}
(tested in gawk 4.2.1)
Use a regex to set a flag and then print based on that flag:
awk '/^\\begin{document}/{flag=1}
flag
/^\\end{document}/{flag=0}' file
That prints everything between the start and end strings, inclusive:
\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla placerat lectus sit amet augue facilisis, eget viverra sem pellentesque. Nulla vehicula metus risus, vel condimentum nunc dignissim eget. Vivamus quis sagittis tellus, eget ullamcorper libero. Nulla vitae fringilla nunc. Vivamus id suscipit mi. Phasellus porta lacinia dolor, at congue eros rhoncus vitae. Donec vel condimentum sapien. Curabitur est massa, finibus vel iaculis id, dignissim nec nisl. Sed non justo orci. Morbi quis orci efficitur sem porttitor pulvinar. Duis consectetur rhoncus posuere. Duis cursus neque semper lectus fermentum rhoncus.
\end{document}
If you only want the text between and not including the start and end strings:
awk '
/^\\begin{document}/{flag=1; next}
/^\\end{document}/{flag=0}
flag' file
Prints:
# leading blank line printed...
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla placerat lectus sit amet augue facilisis, eget viverra sem pellentesque. Nulla vehicula metus risus, vel condimentum nunc dignissim eget. Vivamus quis sagittis tellus, eget ullamcorper libero. Nulla vitae fringilla nunc. Vivamus id suscipit mi. Phasellus porta lacinia dolor, at congue eros rhoncus vitae. Donec vel condimentum sapien. Curabitur est massa, finibus vel iaculis id, dignissim nec nisl. Sed non justo orci. Morbi quis orci efficitur sem porttitor pulvinar. Duis consectetur rhoncus posuere. Duis cursus neque semper lectus fermentum rhoncus.
# ending blank line printed...

How to split a file in bash wherever a number is found

I have a text like:
1Lorem ipsum dolor sit amet, consectetur adipiscing elit. 2Vivamus dictum, justo mattis sollicitudin pretium, ante magna gravida ligula, 3a condimentum libero tortor sit amet lectus. Nulla congue mauris quis lobortis interdum. 4Integer eget ante mattis ante egestas suscipit. Suspendisse imperdiet pellentesque risus, a luctus sem pellentesque nec. Curabitur vel luctus eros. Morbi id magna sit amet 5risus hendrerit porta. Praesent vitae sapien in nunc aliquet pharetra vitae sed lectus. Donec id magna magna. Phasellus eget rhoncus purus, vitae vestibulum nisl. 6Phasellus massa mi, ultricies id mi sit amet, tristique auctor mi.
I want to split the text at each number found, whatever the number is, like:
1Lorem ipsum dolor sit amet, consectetur adipiscing elit.
2Vivamus dictum, justo mattis sollicitudin pretium, ante magna gravida ligula,
3a condimentum libero tortor sit amet lectus. Nulla congue mauris quis lobortis interdum.
...
In awk, I tried:
cat text | awk -F'/^[-+]?[0-9]+$/' '{for (i=1; i<= NF; i++) print $i}'
Here -F is /^[-+]?[0-9]+$/, a pattern used to test whether a string is a number. But it doesn't split the text.
If I change the pattern to any other separator it works without problems, so what pattern should I use here?
I would use GNU AWK for this task in the following way. Let the content of file.txt be
1Lorem ipsum dolor sit amet, consectetur adipiscing elit. 2Vivamus dictum, justo mattis sollicitudin pretium, ante magna gravida ligula, 3a condimentum libero tortor sit amet lectus. Nulla congue mauris quis lobortis interdum. 4Integer eget ante mattis ante egestas suscipit. Suspendisse imperdiet pellentesque risus, a luctus sem pellentesque nec. Curabitur vel luctus eros. Morbi id magna sit amet 5risus hendrerit porta. Praesent vitae sapien in nunc aliquet pharetra vitae sed lectus. Donec id magna magna. Phasellus eget rhoncus purus, vitae vestibulum nisl. 6Phasellus massa mi, ultricies id mi sit amet, tristique auctor mi.
then
awk 'BEGIN{RS="[-+]?[0-9]+"}{printf "%s%s%s", $0, NR==1?"":"\n", RT}' file.txt
gives output
1Lorem ipsum dolor sit amet, consectetur adipiscing elit.
2Vivamus dictum, justo mattis sollicitudin pretium, ante magna gravida ligula,
3a condimentum libero tortor sit amet lectus. Nulla congue mauris quis lobortis interdum.
4Integer eget ante mattis ante egestas suscipit. Suspendisse imperdiet pellentesque risus, a luctus sem pellentesque nec. Curabitur vel luctus eros. Morbi id magna sit amet
5risus hendrerit porta. Praesent vitae sapien in nunc aliquet pharetra vitae sed lectus. Donec id magna magna. Phasellus eget rhoncus purus, vitae vestibulum nisl.
6Phasellus massa mi, ultricies id mi sit amet, tristique auctor mi.
Explanation: I tell GNU AWK that the record separator (RS) is an optional - or + followed by one or more digits. Then, for every record, I printf the content of that record, followed by a newline (except for the first record), followed by the text that actually matched the separator, the record terminator (RT).
(tested in gawk 4.2.1)
This inserts a new line before every number, except the first, and also strips any whitespace before the new line.
sed -E 's/[[:blank:]]*([0-9]+)/\
\1/g; s/\n//'
You still have the problem of numbers within each line which are regular content. These will also have a new line prepended.
There is absolutely no need for vendor-specific solutions:
{m,n,g}awk '
(NF=NF)+gsub("[0-9]+[^0-9]+[.]? ","&\n")+gsub("[ \t]+\n",FS)' FS='\n' OFS= \
RS='^$' ORS=
_
1Lorem ipsum dolor sit amet, consectetur adipiscing elit.
2Vivamus dictum, justo mattis sollicitudin pretium, ante magna gravida ligula,
3a condimentum libero tortor sit amet lectus. Nulla congue mauris quis lobortis interdum.
4Integer eget ante mattis ante egestas suscipit. Suspendisse imperdiet pellentesque risus, a luctus sem pellentesque nec. Curabitur vel luctus eros. Morbi id magna sit amet
5risus hendrerit porta. Praesent vitae sapien in nunc aliquet pharetra vitae sed lectus. Donec id magna magna. Phasellus eget rhoncus purus, vitae vestibulum nisl.
6Phasellus massa mi, ultricies id mi sit amet, tristique auctor mi.

How to ignore URL when searching using ElasticSearch?

I have a set of documents that contain text, but may also have URLs inside them:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam tincidunt metus a convallis imperdiet. Praesent interdum magna ut lorem bibendum vehicula. Maecenas consectetur tortor a ex pulvinar, sit amet sollicitudin nunc maximus. Pellentesque non gravida ligula, imperdiet pharetra odio. Nunc non massa vitae mauris tempor tempus. Nulla ac laoreet tellus. Nulla consequat tortor eu eros euismod bibendum. Curabitur ante ligula, aliquet at lacus at, pretium convallis eros. Fusce id mi condimentum, tempor lorem ut, pharetra libero.
https://document.io/document/ipsum
In eget eleifend neque. Morbi ex leo, tincidunt non enim ut, rutrum suscipit metus. Cras laoreet ex ut massa consequat condimentum. Aenean finibus eu nisl ut rhoncus. Aliquam finibus nisl risus, id facilisis justo rutrum et. Aenean enim libero, commodo id mi ut, mattis sollicitudin tellus. Aliquam molestie ligula sit amet lorem malesuada, aliquet pretium dolor malesuada. Phasellus fringilla libero in sollicitudin tristique. Quisque molestie, enim et aliquam dapibus, ex erat ultrices nisi, luctus ornare lorem metus eu sapien.
I am using a match query to search for words inside the documents. However, as you can see, the URL sometimes contains words that are also part of the actual text, and this skews the results. Does Elasticsearch have a way to simply ignore the URLs and focus on the text?
I am currently using the english analyzer for this field.
You can use the pattern replace character filter (pattern_replace) in your analyzer. To remove URLs from your text, add this filter to your search analyzer:
Filter:
"char_filter": {
  "type": "pattern_replace",
  "pattern": "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]",
  "replacement": ""
}
This filter replaces each URL with an empty string before tokenization, so words inside URLs no longer produce matches.

Add title to readmore button in tagged articles using joomla

I'm new to Joomla and I'm having trouble adding the article title to the Read More button in the tagged items view.
I have two articles like this:
Lorem Ipsum
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque quis elit augue. Aliquam mattis sem sed ligula mattis faucibus. Donec vitae pretium sem. Vivamus ipsum enim...
Read More
Dolor Sit amet
Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas purus ex, ultrices eget ante ac, tempor suscipit nunc. Etiam viverra dolor id...
Read More
=========================================================================
What I want to do is add the title of each article to the Read More button, so it would look like this:
Lorem Ipsum
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque quis elit augue. Aliquam mattis sem sed ligula mattis faucibus. Donec vitae pretium sem. Vivamus ipsum enim...
Read More: Lorem Ipsum
Dolor sit amet
Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas purus ex, ultrices eget ante ac, tempor suscipit nunc. Etiam viverra dolor id...
Read More: Dolor sit amet
Can anyone help me with this problem?
Thanks!
In Content -> Articles -> Options, set "Show Title with Read More" = "Show".

Only return slimmed down version of feed

I'm trying to create a new Yahoo pipe that returns only a slimmed-down version of an XML feed.
Say my original XML looks like:
<?xml version="1.0" encoding="UTF-8" ?>
<name>Joe bloggs</name>
<age>31</age>
<description>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse aliquam metus id eros blandit vel convallis nunc accumsan. Fusce adipiscing eros a enim feugiat vestibulum. Cras vulputate malesuada neque vel ultricies. Nunc commodo condimentum risus, eu interdum odio rutrum ut. Nullam nec neque eget dolor tristique dignissim sit amet non nibh. Donec sagittis, elit eget tempus laoreet, tellus eros gravida nunc, eu elementum sem turpis eget velit. In hac habitasse platea dictumst. Donec sed nibh nec arcu feugiat malesuada nec sollicitudin neque. Morbi egestas gravida blandit. Praesent luctus ipsum sed sem porta a tempus ipsum congue. Cras non lectus metus. Fusce non purus quam, vel convallis urna. Aenean dignissim consequat tincidunt. Nunc posuere pulvinar est, id pretium sem vestibulum non</description>.
I'm creating a Yahoo pipe that changes the tag names using the Rename module, and that works fine.
Now I want to get rid of the description tag, so my XML only returns name and age.
How can I do that with yahoo pipes?
Cheers in advance for any help
Use the Regex module on the description field and replace .* with an empty textfield. That deletes the field.
Use the "Create RSS" module as the last step in the chain. Then only include the fields you want.
