How to apply 80 character line width in Rmd files? - rstudio

As a fairly large group, we have translated R4DS into Spanish, and we all used different line widths.
I want to standardize this to 80 characters per line to avoid lines reading this way. In Atom this is just Ctrl+Shift+Q, but it breaks the code chunks.
Is there a way to break the lines just in text paragraphs from RStudio?
I don't want to hit enter ~20,000 times to convert this:
into this:
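One way to approach this outside RStudio is a small script that re-wraps only prose and copies everything else verbatim. The following is a minimal sketch, not a tested tool for the R4DS sources: `wrap_rmd` is a hypothetical helper name, and it assumes chunks are delimited by lines starting with three backticks, that a YAML header (if any) starts on the first line, and that headings start with `#`. Lists, tables, and inline HTML are not handled and would be reflowed like ordinary paragraphs.

```python
import textwrap

FENCE = "`" * 3  # marker that opens and closes an R Markdown code chunk


def wrap_rmd(text, width=80):
    """Re-wrap prose paragraphs to `width`; copy chunks, YAML and headings as-is."""
    out, para = [], []
    in_fence = False
    in_yaml = False

    def flush():
        # Join the pending paragraph lines and wrap them as one block of text.
        if para:
            joined = " ".join(s.strip() for s in para)
            out.append(textwrap.fill(joined, width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False))
            para.clear()

    for i, line in enumerate(text.split("\n")):
        stripped = line.strip()
        if in_yaml:
            out.append(line)
            if stripped == "---":
                in_yaml = False
        elif i == 0 and stripped == "---":
            in_yaml = True
            out.append(line)
        elif stripped.startswith(FENCE):
            flush()
            in_fence = not in_fence
            out.append(line)
        elif in_fence or stripped == "" or stripped.startswith("#"):
            flush()
            out.append(line)
        else:
            para.append(line)
    flush()
    return "\n".join(out)
```

Reading a file, passing its contents through `wrap_rmd`, and writing the result back would apply it to one chapter; it is worth checking the diff on a single file before running it over the whole book.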

Related

How to make two rows of words as big as one word in InDesign?

I'm not sure how to express it, so I posted a picture at the link below.
It should look like this
Just enter the text on 3 lines like so:
MORE
AT
THE HALL
Then adjust the point sizes, leading, kerning, etc. to create the aesthetic you want.
In this case, lines 1 and 3 could have full justification.
You can scale the text instead (as shown in the Character panel in the attached snapshot), because changing the font size also moves the baseline and causes the text to shift downward.
These attributes are also exposed via scripting.

How to convert image to table

I have an image of a table (in my case a .gif) and want to extract the table from it (ideally as .ods).
Is there any way to do so? (Doing it manually is ruled out, since the table has more than 1000 rows and 6 columns.)
Here is a part of the image / table:
You will be able to get most of it through OCR, but you'll need to manually verify the data and fix some inaccuracies that will be there. It definitely won't be perfect.
The first thing to do is to ensure you have a good-quality image for the OCR software.
Here's what I did with your sample png (I'm using Windows):
I opened the image in The Gimp.
Removed the orange/blue backgrounds:
a) Select -> By Color and clicked the blue background
b) I held down Shift and clicked the orange background (this will add it to the current selection)
c) Edit -> Fill With BG Color (this sets it to white)
d) Ctrl-Shift-A to cancel the selection
I removed the partially cut off '305' line:
a) I used the Rectangular Select tool button from the palette to select it, then filled the selection with BG Color, as above
Let's remove the table border:
a) Click the 'Fuzzy Select' tool button from the palette
b) Click somewhere on the table border (you should see the 'marching ants' instead of the border)
c) Edit -> Fill With BG Color
d) Ctrl-Shift-A to cancel the selection again
We need to increase the number of pixels that the numbers use so that the OCR can better detect their shapes:
a) Image -> Scale Image. I chose to scale by 1000% with Linear Interpolation (the other interpolations won't work as well)
Download and install Tesseract from GitHub
a) At the command prompt type (include the double-quotes to cope with spaces within the path, & change your paths as necessary):
"D:\Program Files (x86)\Tesseract-OCR\tesseract" "d:\temp\your_image.png" "d:\temp\your_txt_file_output"
The output will be a text file with an appended .txt extension. It will still have a few artifacts, but we can easily correct those in Notepad++ (or similar):
a) The commas were seen as full-stops, so I did a Find and Replace of "." with "," (I'm assuming you don't have any decimal points in the data!)
b) There were some spaces before a few commas, so I did Find and Replace " ," with "," (note I included a space before the comma in the Find)
c) There were still some spaces in the numbers, so I did a Find and Replace of " " with "" (a space with an empty replace)
This gave the following result:
298 299 300 301 302 303 304
910,820,000 920,820,000 930,820,000 941,820,000
952,820,000 983,820,000 9?4,820,000 210,000
220,000 220,000 220,000 220,000 220,000
220,000 2,500 2,500 3,000 3,000
3,000 3,000 3,000 19,000 19,000
20,000 20,000 20,000 20,000 20,000
Note the question mark in the place of 7 in the second block of text. Things like that still need to be tidied up.
Lastly, you'd copy and paste the rows of text into your spreadsheet etc.
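The Find and Replace steps can also be scripted, which helps with 1000+ rows. This is a minimal sketch under assumptions: `clean_ocr_numbers` and `find_suspects` are hypothetical helper names, the data contains no real decimal points, and, unlike step c) above, it only removes spaces that sit next to a comma so that the spaces separating values survive.

```python
import re


def clean_ocr_numbers(text):
    """Apply the manual clean-up steps to raw OCR output."""
    # a) full stops were misread commas (assumes no decimal points in the data)
    text = text.replace(".", ",")
    # b) and c) drop spaces or tabs directly before or after a comma,
    # leaving the spaces between separate values untouched
    text = re.sub(r"[ \t]*,[ \t]*", ",", text)
    return text


def find_suspects(text):
    """Return values that contain anything other than digits and commas."""
    return [tok for tok in text.split() if not re.fullmatch(r"[\d,]+", tok)]
```

Running `find_suspects` on the cleaned text lists entries such as the one with the question mark, so they can be checked against the image by hand.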
I wanted to post another option I finally found online.
https://convertio.co/es/ocr/
That said, I think K Scandrett's answer deserves to be the accepted one, since it doesn't rely on a URL, which might go down.
If this is a one-time or rare need, you are a Windows user, and you have Microsoft Excel installed, the application supports extracting image data into Excel. Follow this link for the complete reference.

ZPL data printing at label

I have to print data from DATA_FIELD, which can contain between 5 and 50 characters,
but the label can fit just 20 letters.
Because the words are spelled right to left, I always have to print the 20 letters starting from the right; otherwise I'll lose the first words of the customer name, which are usually the most important part.
For example, I have this code:
^FO40,240^A#N,40,40,E:DAVIDBD.FNT
^FD%%Depositor%%
^FS
Say the depositor name is:
i dont know why its so long name -- can be variable
and I always have to print the last 20 letters, as:
its so long name --can be variable
I'd be happy to get any tips or help.
Regards
There really isn't much ZPL can do to help. ZPL is really a page description language, not a programming language.
You will need to process the string to the correct length before adding it to your label code. If you are not using a monospaced font, then you will have to account for the variable character width.
If you are using a monospaced font, you simply have to know how many characters will fit in the area you are trying to print.
If you can wrap text, you might make use of the ^FB (Field Block) command described in the manual.
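Trimming the string before it reaches the label could look like the sketch below. It assumes a monospaced font, so counting characters is enough, and that the host program is free to pre-process the value; `fit_field` and `render_label` are hypothetical names, and the template is the one from the question with its `%%Depositor%%` placeholder.

```python
def fit_field(text, max_chars=20, keep="end"):
    """Trim text to max_chars, keeping the last ("end") or first ("start") characters."""
    if max_chars <= 0:
        return ""
    if len(text) <= max_chars:
        return text
    return text[-max_chars:] if keep == "end" else text[:max_chars]


# Label code copied from the question; only the placeholder is substituted.
TEMPLATE = "^FO40,240^A#N,40,40,E:DAVIDBD.FNT\n^FD%%Depositor%%\n^FS"


def render_label(depositor, max_chars=20):
    """Return the ZPL text with the depositor name cut to fit the field."""
    return TEMPLATE.replace("%%Depositor%%", fit_field(depositor, max_chars))
```

Whether `keep="end"` or `keep="start"` is right depends on how the right-to-left text is stored, so it is worth testing with one real name before printing a batch.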

Why does the text that is read in from a text file get turned into a black square and then the first letter?

The folder names should not have black squares for example the 'j' folder should actually be joshua.murray and not have a black square in front of it.
I was just wondering if anyone else has ever had this problem?
It's the Unicode byte order mark (BOM). It is a two- or three-byte header that says what type of Unicode it is.
 ■K e i
is what UTF-16 (Notepad's "Unicode" format) looks like when displayed as OEM in a command prompt.
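The effect can be reproduced in a few lines. This is only an illustration: it assumes the file is little-endian UTF-16 with a BOM and that the OEM code page is 437, which varies by Windows locale.

```python
import codecs

# "Kei" as Notepad's "Unicode" format would store it: BOM plus UTF-16 LE.
raw = codecs.BOM_UTF16_LE + "Kei".encode("utf-16-le")

# Misreading the bytes as OEM text (code page 437 assumed here):
# 0xFF becomes a no-break space, 0xFE the black square, and each
# letter is followed by a NUL byte.
as_oem = raw.decode("cp437")

# Decoding with the right codec consumes the BOM and returns the text.
as_utf16 = raw.decode("utf-16")

print(repr(as_oem))
print(repr(as_utf16))
```

The NUL after the first letter is a likely reason only one letter of the folder name shows up, since many programs treat NUL as the end of a string. Reading the file as UTF-16 should avoid both symptoms.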

How to get display width of a string from Linux command line?

I am working on an AWK script that processes a text file line by line, formats them and stuffs them into an SVG file text field. The SVG takes care of text wrapping automatically, but I want to predict where each line will wrap. (I need some characters to repeat and extend close to the end of the line). I know the exact font, font size, and width of the text field.
Is there a standard utility in Linux or easily available in Ubuntu that will give a width in pixels or inches given a string, font, and font size?
For example:
get-width 'Nimbus Sans L' 18 "test string"
returns "x pixels"
You can do this with the ghostscript interpreter, assuming you have that and the fonts are set up correctly.
Here is the possibly mysterious incantation:
gs -dQUIET -sDEVICE=nullpage 2>/dev/null - \
<<<'18 /NimbusSanL-Regu findfont exch scalefont setfont
(test string) stringwidth pop =='
Using -dQUIET suppresses warnings about font substitution, which is probably not a good idea until you have some idea about how to name the fonts you're looking for.
ghostscript is not a layout engine, and you may find the measurement doesn't work with complicated bidirectional text, combining characters, or East Asian languages. (I tested it with a little Arabic, and it was OK, but no guarantees.) It does not kern, so it will normally produce measurements a little larger than a good layout engine, and possibly a lot larger if the font positions diacritics using kerning.
Finally, if your text includes unbalanced parentheses or backslashes, you'll need to escape them. I use the following:
"$(sed 's/[()\\]/\\&/g' <<<"$text")"
That's because PostScript strings are enclosed in (...) -- (test string) -- and are allowed to include balanced parentheses. Unbalanced parentheses will usually generate a syntax error, unless they are backslash-escaped.
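The same escaping and invocation can be wrapped in a small script. This is a sketch rather than a verified tool: `ps_escape`, `build_program` and `string_width` are hypothetical names, `gs` is assumed to be on the PATH with the font installed under the given name, and the output is assumed to be a single number (PostScript units, normally points).

```python
import subprocess


def ps_escape(text):
    """Backslash-escape parentheses and backslashes for a PostScript (...) string."""
    return "".join("\\" + ch if ch in "()\\" else ch for ch in text)


def build_program(font, size, text):
    """Build the PostScript snippet from the answer for the given font, size and text."""
    return (f"{size} /{font} findfont exch scalefont setfont\n"
            f"({ps_escape(text)}) stringwidth pop ==\n")


def string_width(font, size, text):
    """Run Ghostscript on the snippet and parse the printed width."""
    result = subprocess.run(
        ["gs", "-dQUIET", "-sDEVICE=nullpage", "-"],
        input=build_program(font, size, text),
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())
```

For example, `string_width("NimbusSanL-Regu", 18, "test string")` mirrors the command above. Dropping `-dQUIET` while experimenting makes font-substitution warnings visible, though they may then need to be filtered out before parsing.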
If you have access to inkscape:
FONT="Nimbus Sans L"
SIZE=18
STRING="test string"
inkscape --without-gui --query-id=id1 -W <(echo '<svg><text id="id1" style="font-size:'"$SIZE"'px;font-family:'"$FONT"';">'"$STRING"'</text></svg>') 2>/dev/null
Output (e.g.):
76.577344
