text to html script - ruby

Does anyone have or know of a Ruby script that converts text to html?
Example:
I have a file that contains the following text:
Host Name
Info1
Line1
Info2
Line2
I want to have ruby convert it to the following html output
Host Name
Info1
Line1
Info2
Line2
I tried running RedCloth but got the following error:
The program can't start because msvcrt-ruby18.dll is missing
Thanks

That depends on what you mean by "text to HTML." There are several "web text generators" that convert easy-to-read free text with minimal markup (asterisks to indicate bold, paragraphs separated by blank lines get wrapped in <p> tags, etc.). The most common for Ruby are RedCloth, which implements Textile, and BlueCloth, which implements Markdown.
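To make the "paragraphs get wrapped in <p> tags" idea concrete, here is a minimal sketch. It is written in Python purely for illustration (it is not the Ruby script asked for), and it implements neither Textile nor Markdown. The function name text_to_html and the choice to turn single line breaks into <br> are assumptions, since the desired HTML in the question is not shown.

```python
import html


def text_to_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> tags (hypothetical helper)."""
    # Paragraphs are assumed to be separated by one blank line.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    parts = []
    for paragraph in paragraphs:
        # Escape <, > and & so the text cannot be mistaken for markup,
        # and keep single line breaks as <br>.
        lines = (html.escape(line) for line in paragraph.strip().splitlines())
        parts.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(text_to_html("Host Name\n\nInfo1\nLine1\n\nInfo2\nLine2"))
```

The same few steps (split on blank lines, escape, wrap) carry over to Ruby almost line for line if installing a markup library is not an option.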

Related

YAML - keep text formatting in new document

What I have:
a: some meta info
b: more meta info
c: actual nicely
formatted text that
has line breaks
I'm looking to move c to a new YAML document by using doc separator ---
a: some meta info
b: more meta info
---
actual nicely
formatted text that has line breaks
and so on
But when I use the second alternative, I lose formatting such as newlines.
Is there a way to use the latter YAML layout and keep the line breaks?
I'm currently using the ruamel.yaml library to read this YAML, with the call below to load my file:
yaml.load_all(f, Loader=yaml.Loader)
If you want the line breaks to be in your loaded value, I recommend making the second document a literal style scalar.
If you have input.yaml:
a: some meta info
b: more meta info
--- |
actual nicely
formatted text that
has line breaks
then this program:
from pathlib import Path
import ruamel.yaml
path_name = Path('input.yaml')
yaml = ruamel.yaml.YAML()
for data in yaml.load_all(path_name):
    print(repr(data))
gives:
ordereddict([('a', 'some meta info'), ('b', 'more meta info')])
'actual nicely\nformatted text that\nhas line breaks\n'
Please note that some YAML libraries do (incorrectly) assume that a literal style scalar at the root level of a document needs to be indented.

Bash: replace specific newline with space

I have numerous files with extension .awesome containing lines like the following:
something =
[51,42,12]
Here "something =" appears in all the files, as does the "[" (the numbers vary).
I would like to get rid of the newline, but don't know how. I came across tr, but worry it would replace all newlines. My files contain multiple newlines that I would like to retain (I only want to change this one). I've successfully done find-and-replace with sed in the past, but am having trouble specifically with the special characters (\n and =). In addition, I've read that sed works line by line and cannot handle something like this.
Any guidance would be appreciated.
GNU sed solution:
Sample test.awesome file contents:
some text
another text
something =
[51,42,12]
text
text
The job:
sed '/something =/{N; s/\n/ /;}' test.awesome
The output:
some text
another text
something = [51,42,12]
text
text
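For comparison, the same join can be sketched in Python. This is not the sed command itself: the regular expression below only joins when "something =" sits at the end of its line, whereas the sed address matches any line containing it. The function name join_assignment_lines is an assumption made for illustration.

```python
import re


def join_assignment_lines(text: str) -> str:
    """Replace the newline right after 'something =' with a single space."""
    # Only the newline that directly follows "something =" (ignoring trailing
    # spaces or tabs) is touched; every other newline is kept.
    return re.sub(r"(something =)[ \t]*\n", r"\1 ", text)


if __name__ == "__main__":
    sample = "some text\nanother text\nsomething =\n[51,42,12]\ntext\ntext\n"
    print(join_assignment_lines(sample), end="")
```

To process many .awesome files, the function could be applied to each file's contents in a loop and the result written back.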

Batch text editing .rtfd files

So I have a large bunch of .rtfd text files that also contain images. What I'd like to do is make the following edits to all of these files:
Delete everything before a certain line of text, which I'll call 'start_line' here.
Delete every line that says 'a_line'.
Delete everything starting from (and including) 'end_line'.
So if the input files look like this:
line1
line2
start_line
line3
a_line
line4
end_line
line5
I want the output files to look like this:
start_line
line3
line4
Please note that the output file does not include end_line, but it does still include start_line.
Also, the output files must remain in the .rtfd format and retain all images, as well as their original layout.
I am using Mac OS X and do not have much experience with batch text editing. Can this be done using the terminal? Or can you point me to a piece of software that allows operations like this?
Thank you very much!
Edit: The lines I specified earlier can also just be seen as strings of text, so the input file would become:
text1
start_string
text2
unwanted_string
text3
end_string
text4
So the output would be:
start_string
text2
text3
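The three rules above can be sketched as a small line filter. This is only the selection logic, shown in Python on plain text lines: it does not parse RTF markup or handle the images in an .rtfd bundle, so whether it can be applied safely to the rich text inside such a bundle is an open assumption. The function name trim_lines and its parameter names are hypothetical.

```python
def trim_lines(lines, start="start_line", unwanted="a_line", end="end_line"):
    """Keep lines from `start` up to (not including) `end`, dropping `unwanted`."""
    kept = []
    started = False
    for line in lines:
        if not started:
            # Rule 1: skip everything before the start marker.
            if line != start:
                continue
            started = True
        # Rule 3: stop at the end marker and do not keep it.
        if line == end:
            break
        # Rule 2: drop every occurrence of the unwanted line.
        if line != unwanted:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = ["line1", "line2", "start_line", "line3",
              "a_line", "line4", "end_line", "line5"]
    print("\n".join(trim_lines(sample)))
```

For the second example in the question, the markers would be passed explicitly, for instance trim_lines(lines, "start_string", "unwanted_string", "end_string").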

extracting data from txt file?

I want to extract data from a text file. Say the file consists of the following:
<img src="a.jpg" alt="abc" height="12px" width="12px">
<div class="ab3" id="1122">
<img src="b.jpg" alt="abc" height="12px" width="12px">
<div class="cd5" id="9876">
I want to extract the "id" value from the above shown text file...
the output should be:
1122
9876
I tried using findstr, find, etc. (DOS commands), but was not able to find the right regular expression.
Is there any other way? Any help is appreciated.
I agree with @izogfif: you should consider some other tools for this task.
But, to answer what you asked for, I got this regex:
id="[0-9]+"
It will give you output like this:
id="1122"
id="9876"
From there you can save those results (or use a pipe, however you do that in DOS), and then this regex:
[0-9]+
Will give you this output:
1122
9876
Use the following code:
( id=")[^"]*"
This will match any Id's value.
You can replace id with any attribute you are searching for.
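If a scripting language is available, the two-step approach collapses into a single pattern with a capturing group around the value. The sketch below uses Python's re module; the function name extract_attribute is an assumption, and a regular expression like this is only reasonable for simple, regular input such as the sample, not for arbitrary HTML.

```python
import re


def extract_attribute(text: str, name: str = "id") -> list:
    """Return the quoted values of attribute `name`, in order of appearance."""
    # \b keeps "id" from matching inside longer names such as "grid";
    # the group captures everything between the double quotes.
    pattern = r'\b' + re.escape(name) + r'="([^"]*)"'
    return re.findall(pattern, text)


if __name__ == "__main__":
    sample = (
        '<img src="a.jpg" alt="abc" height="12px" width="12px">\n'
        '<div class="ab3" id="1122">\n'
        '<img src="b.jpg" alt="abc" height="12px" width="12px">\n'
        '<div class="cd5" id="9876">\n'
    )
    print("\n".join(extract_attribute(sample)))
```

Passing a different name, for example extract_attribute(sample, "src"), pulls out that attribute instead.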

Unix grep command outputs garbage

I'm executing the command "grep bruno < bash.txt", which gives me the right output "bruno" along with garbage: "\f0\fs24 \cf0".
I'm on the command shell on Mac OS X v10.6.8, and I'm pretty sure I should be getting just the line containing the found word, not garbage.
This is the output:
Mobile-Devs-MacBook-Pro:Screenshots Poupe mdev$ grep bruno < bash.txt
\f0\fs24 \cf0 bruno\
In bash.txt I have only written "bruno". If I output it with "cat bash.txt", it also gives me the following garbage:
Mobile-Devs-MacBook-Pro:Screenshots Poupe mdev$ cat bash.txt
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf360
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\paperw11900\paperh16840\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\ql\qnatural\pardirnatural
\f0\fs24 \cf0 bruno\
If I run "echo bruno > bash.txt" and then "cat bash.txt", it gives me clean output. Why am I not seeing clean output when I write the file by hand?
Your file isn't a plain text file. It is RTF. grep is giving you the line containing "bruno", along with the rich text formatting.
When you do:
echo bruno > bash.txt
bash.txt contains only "bruno".
When you "edit the file by hand", your editor is saving as RTF. You need to save as plain text.
That isn't a plain text file. That looks like an RTF. Grep only understands text and its job is to output the entire line where the search text is found.
I cannot tell from your formatting, but I have to believe the "garbage" you are seeing is on the same line as the "bruno" text.
As others have pointed out, the problem is that the file is in RTF format, and contains formatting information. If you want to create a plain text file in TextEdit, use the menu option Format > Make Plain Text before saving it. Better yet, don't use TextEdit at all -- my favorite for plain text editing is TextWrangler, but there are plenty of other options.
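A quick way to confirm this diagnosis is to look at the first bytes of the file: RTF documents begin with the signature {\rtf, exactly as the cat output in the question shows. Below is a minimal Python sketch; the function name looks_like_rtf is an assumption, and the check only inspects the signature rather than validating the whole file.

```python
def looks_like_rtf(data: bytes) -> bool:
    """Return True if the data starts with the RTF signature '{\\rtf'."""
    # Leading whitespace is ignored; a plain text file containing only
    # "bruno" will not start with this signature.
    return data.lstrip().startswith(b"{\\rtf")


if __name__ == "__main__":
    print(looks_like_rtf(b"{\\rtf1\\ansi\\ansicpg1252 bruno}"))
    print(looks_like_rtf(b"bruno\n"))
```

To check a real file, its contents can be read in binary mode (open(path, "rb")) and passed to the function.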

Resources