Text in columns (like in a table) - bash

I would like to have one column with a label and a second column with a longer text inside with line breaks like in a table.
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed
            diam nonumy eirmod tempor invidunt ut labore et dolore magna
            aliquyam erat, sed diam voluptua. At vero eos et accusam et
            justo duo dolores et ea rebum. Stet clita kasd gubergren, no
            sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem
            ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
I tried:
paste label.txt long.txt | column -s $'\t'
Thank you very much in advance!

Glad you have accepted an answer. Just for others who might want to have the
text re-wrapped to avoid over-long lines, this sort of text-processing is what nroff was invented for
over 40 years ago. It's now part of the groff package. Here's an example:
(echo -e '.na\n.nh'
cat label.txt
echo "'in \\w' $(<label.txt)'u"
cat long.txt ) |
nroff | sed '/^$/d'
Nroff commands begin with . or ' at start of line.
.na stops justification, .nh stops hyphenation, 'in sets the indent
to the width of the string (\w'...'), and the sed is to remove trailing blank lines.
You can set the line width with .ll, e.g. .ll 80 for 80 columns.
Long live nroff!
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing
            elitr, sed diam nonumy eirmod tempor invidunt ut
            labore et dolore magna aliquyam erat, sed diam
            voluptua. At vero eos et accusam et justo duo dolores
            et ea rebum. Stet clita kasd gubergren, no sea
            takimata sanctus est Lorem ipsum dolor sit amet.
            Lorem ipsum dolor sit amet, consetetur sadipscing
            elitr, sed diam.
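If groff is not installed, a similar layout can be approximated with fold and awk. This is only a rough sketch and not what nroff does internally: the function name label_wrap and its two arguments (label, total width) are made up for illustration, and it assumes the label contains no backslashes, because awk -v would interpret them.

```bash
#!/usr/bin/env bash
# label_wrap LABEL [WIDTH]: read text on stdin, re-wrap it at spaces and
# print it to the right of LABEL, indenting continuation lines.
# (Hypothetical helper; WIDTH defaults to 65 columns.)
label_wrap() {
    local label=$1 width=${2:-65}
    local indent=$(( ${#label} + 1 ))      # label plus one separating space
    local pad
    pad=$(printf '%*s' "$indent" '')       # string of $indent spaces

    tr '\n' ' ' |                          # join the input into one long line
    fold -s -w "$(( width - indent ))" |   # wrap at spaces to the remaining width
    awk -v label="$label" -v pad="$pad" '
        { sub(/ +$/, "") }                 # drop the trailing blank fold leaves
        $0 == "" { next }                  # skip lines that were only blanks
        { print (n++ ? pad : label " ") $0 }
    '
}
```

It could be called as label_wrap "$(<label.txt)" 65 < long.txt. Unlike nroff, fold knows nothing about hyphenation or sentence spacing, so the line breaks may differ slightly from the nroff output.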

The following bash script might help you:
padded-paste.sh:
#!/bin/bash
label=$1
text=$2
# get the number of lines in the text (read via stdin so wc prints only the count)
nline=$(wc -l < "${text}")
# get the width of the label
padding=$(awk 'NR==1{ print length }' "${label}")
# create a temp directory
tmpdir=$(mktemp -dt "$(basename "$0").XXXXXXXXXX")
templabel=${tmpdir}/label.tmp
# print the first line of the label file to a temp file:
awk 'NR==1{ print }' "${label}" > "${templabel}"
# add blank padding to the temp label file:
for i in $(seq 2 $nline); do
    printf "%*s\n" "$padding" "" >> "${templabel}"
done
# paste the padded label to the long text
paste -d' ' "${templabel}" "${text}"
# clean up the temp directory
rm -rf "${tmpdir}"
Based on the following inputs:
label.txt:
Label Text:
long.txt:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
You can use it like:
sh padded-paste.sh label.txt long.txt
And it will output:
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
            eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
            voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
            clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
            amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
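The same padding idea can probably be done without a temp file. The following is a minimal sketch under the same assumptions as the script above (the label is the first line of the label file, the text is already wrapped); the function name padded_paste is hypothetical, and the label is assumed to contain no backslashes because it is passed through awk -v.

```bash
#!/usr/bin/env bash
# padded_paste LABEL_FILE TEXT_FILE
# Print the first line of LABEL_FILE in front of the first text line and
# an equally wide run of spaces in front of every following text line.
padded_paste() {
    local label pad
    IFS= read -r label < "$1"              # first line of the label file
    pad=$(printf '%*s' "${#label}" '')     # spaces, as wide as the label

    awk -v label="$label" -v pad="$pad" '
        NR == 1 { print label " " $0; next }
                { print pad   " " $0 }
    ' "$2"
}
```

Usage would be padded_paste label.txt long.txt, which should give the same layout as the script while leaving nothing behind in the temp directory.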

Related

Finding all occurrences of a given word and extracting the next word in bash

I have a .txt file where the word 'picture:' is found multiple times. How can I extract all the words that follow the word 'picture:' and save them in a text file?
I tried the following code, but it doesn't work:
cat users_sl.txt |awk -F: '/^login:"/{print $2}' cookies.txt
user_sl.txt:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Quis picture lobortis scelerisque fermentum dui faucibus in ornare quam. Est ullamcorper eget nulla facilisi etiam dignissim diam quis. Quis viverra nibh cras pulvinar mattis nunc sed. Turpis massa sed elementum picture tempus egestas. Condimentum vitae sapien pellentesque habitant. Et molestie ac feugiat sed lectus vestibulum mattis ullamcorper. Tincidunt lobortis feugiat vivamus at augue eget arcu picture dictum varius. Donec massa sapien faucibus et molestie ac feugiat sed. Tincidunt eget nullam non nisi est. Ornare arcu dui vivamus arcu. Mattis enim ut tellus elementum sagittis vitae et leo duis
picturelist.txt:
lobortis
dictum
tempus
Well, I'm assuming you actually just have picture instead of picture:, and that you may need to deal with line breaks, so...
$ cat sl.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Quis picture lobortis scelerisque fermentum dui faucibus in ornare quam.
Est ullamcorper eget nulla facilisi etiam dignissim diam quis.
Quis viverra nibh cras pulvinar mattis nunc sed.
Turpis massa sed elementum picture tempus egestas.
Condimentum vitae sapien pellentesque habitant.
Et molestie ac feugiat sed lectus vestibulum mattis ullamcorper.
Tincidunt lobortis feugiat vivamus at augue eget arcu picture
dictum varius. Donec massa sapien faucibus et molestie ac feugiat sed.
Tincidunt eget nullam non nisi est.
Ornare arcu dui vivamus arcu.
Mattis enim ut tellus elementum sagittis vitae et leo duis
$ cat sl.txt | tr '\n' ' ' | grep -o 'picture [^ ]*' | cut -d' ' -f2
lobortis
tempus
dictum
Edit: Explanation:
tr '\n' ' ' replaces every (unix) line break with a space -- makes the whole thing one line.
The -o flag tells grep to return only the matched string. The search pattern starts with picture followed by a space, and then matches everything after that which is not a space: [^ ]*.
Finally cut using the space character for a delimiter -d ' ' prints the second field: -f 2
Here is a bash solution with a clean shellcheck. Tested with bash version 5.2.2 on a macOS Ventura system.
#!/usr/bin/env bash
IFS=" " read -r -a WORDS <<< "$(tr '\n' ' ' < users_sl.txt)"
echo "processing ${#WORDS[@]} words"
for (( i=0; i < ${#WORDS[@]}; i++ ))
do
if [ "${WORDS[$i]}" = "picture" ]; then
echo "${WORDS[i+1]}"
fi
done | tee picturelist.txt
With bash:
#!/bin/bash
arr=( $(<user_sl.txt) )
for ((i=0; i<${#arr[@]}; i++)); do
if [[ ${arr[i]} == picture ]]; then
printf '%s\n' "${arr[i+1]}"
fi
done | tee picturelist.txt
Output
lobortis
tempus
dictum
With perl:
$ perl -nE 'say for /\bpicture\b\s+(\w+)\b/g' user_sl.txt | tee picturelist.txt
lobortis
tempus
dictum
With awk:
$ awk '{
for (i=1; i<=NF; i++) {
if ($i == "picture") print $(i+1)
}
}' user_sl.txt | tee picturelist.txt
or
$ printf '%s\n' $(< users_sl.txt) |
awk '/picture/{p=1;next} {if (p==1) {print;p=0}}' > picturelist.txt
lobortis
tempus
dictum
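Note that /picture/ in the last variant matches any word containing picture (for example pictures). If only the exact word should count, something like the following sketch might do; the function name words_after is made up for illustration, and punctuation attached to a word (picture, with a trailing comma, say) is deliberately not treated as a match.

```bash
#!/usr/bin/env bash
# words_after WORD FILE
# Print every word that directly follows an exact occurrence of WORD,
# regardless of how the text is broken into lines.
words_after() {
    tr -s '[:space:]' '\n' < "$2" |        # one word per line
    awk -v w="$1" '
        prev == w { print }                # previous word was WORD: print this one
                  { prev = $0 }            # remember the current word
    '
}
```

For the question above this would be used as words_after picture user_sl.txt | tee picturelist.txt.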

MigraDoc: How to apply vertical line spacing to a paragraph?

I am creating a PDF using MigraDoc.
Everything works fine except the setting of line spacing of a paragraph.
I want to have more vertical space between paragraph lines.
What I tried so far without any change in the resulting PDF:
string text = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.";
Paragraph para = CreateParagraph(text , "Helvetica", 7, "0.1mm", Colors.Black, ParagraphAlignment.Left);
// tried this:
para.Format.LineSpacing = MigraDoc.DocumentObjectModel.Unit.FromMillimeter(12);
// and tried that:
para.Format.LineSpacing = 12;
Can anyone point me in the right direction?
The meaning of LineSpacing depends on the value set for LineSpacingRule.
If LineSpacingRule is set to e.g. Single or Double then the value set for LineSpacing will be ignored.
Try AtLeast or Exactly for LineSpacingRule.

What is Naur text processing?

Can someone please explain to me in layman's terms what the Naur text-processing rules are? I'm having trouble understanding what the rules mean, such as line-by-line form and line breaks.
Imagine that you have a text, say
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua.\nUt enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
The text contains three kinds of characters:
Spaces ( )
New Line characters (\n)
Letters (all other characters: letters, digits, punctuation...)
You have to split the given text into lines in the most efficient way (you want to obtain as few lines as possible), but the split must meet these restrictions:
New Line character \n must start a new line
You can split text and start a new line on space only
Each line can contain at most MaxPos (given constant) characters.
In the sample above for MaxPos = 30 we can split as
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor
incididunt ut labore et
dolore magna aliqua.\n <- \n New Line must break; we can't add "Ut" in the line
Ut enim ad minim veniam,
...
These splits break the rules and are therefore invalid:
Lorem ipsum dolor sit amet, consectetur <- The line is too long, exceeds MaxPos = 30
...
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incidi <- wrong split: we can split on spaces only
dunt
...
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor
incididunt ut labore et
dolore magna aliqua.\nUt enim <- \n (New Line) must start a new line
ad minim veniam, quis nostrud
...
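The three rules can be sketched as a small greedy line filler: keep appending words to the current line while it still fits into MaxPos, and start a new line otherwise. This is only an illustration under a few assumptions of mine: the name naur_wrap is hypothetical, the input contains real newline characters (not the two characters \ and n shown in the sample), runs of spaces are collapsed, and a word longer than MaxPos is printed on a line of its own because it cannot be split.

```bash
#!/usr/bin/env bash
# naur_wrap MAXPOS: read text on stdin and re-split it into lines.
#   - every input newline starts a new output line
#   - lines are broken only between words (at spaces)
#   - a line holds at most MAXPOS characters where possible
naur_wrap() {
    awk -v maxpos="$1" '
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            if (out == "")
                out = $i                              # first word on the line
            else if (length(out) + 1 + length($i) <= maxpos)
                out = out " " $i                      # word still fits
            else {
                print out                             # line is full, emit it
                out = $i
            }
        }
        print out                                     # input newline ends the line
    }'
}
```

For example, printf 'aa bb cc\ndd ee\n' | naur_wrap 5 gives the three lines aa bb, cc and dd ee: cc does not fit after aa bb, and dd starts a new line because of the newline in the input.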

Pipe output to stdout and then to a command and then to a variable

I'm working on a TeamCity server, one of my build commands is:
xcodebuild -scheme "<myscheme>" archive
I need to retrieve the .dSYM file
code=$(cat <<-'CODE'
$lines = file("php://stdin");
foreach($lines as $line){
if(preg_match("#Touch (.*dSYM)#",$line,$m))echo "$m[1]\n";
}
CODE
)
dsym=$(xcodebuild -scheme "<myscheme>" archive | php -r "$code")
This will work. However, my issue is, I would like the logs of xcodebuild to be piped to stdout AND php -r "$code"
xcodebuild -scheme "<myscheme>" archive | tee >(php -r "$code" --)
This also works, the build log shows, and if I change php -r "$code" -- to php -r "$code" -- | cat, it logs the .dSYM file location.
But, the following doesn't work:
xcodebuild -scheme "<myscheme>" archive | tee >(dsym=$(php -r "$code" --))
#this one is the closest but is the wrong way around,
#dsym = all the output, the filename is sent to stdout
exec 5>&1
dsym=$(xcodebuild -scheme "<myscheme>" archive | tee >(php -r "$code" >&5))
And I am unable to get my head around how read -u X dsym works or is meant to be working. Does anyone know how I would go about:
Piping all output to stdout
Piping all output to an intermediate program/script (grep)
Storing the above intermediate program/script output into a variable
To test: save a file scheme.out and replace xcodebuild... with cat scheme.out
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus nibh
nulla, tempor nec dolor ac, eleifend imperdiet diam. Mauris tristique
congue condimentum. Nullam commodo erat fringilla vestibulum tempus.
Aenean mattis varius erat in venenatis. Donec eu tellus urna. Morbi
lacinia vulputate purus, eu egestas tortor varius eget. Curabitur
vitae commodo elit, vitae ullamcorper leo.
Touch some_test_dsym_file.dSYM
Nunc malesuada, nisi at ultricies lobortis, odio diam rhoncus urna,
sed scelerisque enim ipsum eget quam. Nunc ut iaculis sem. Pellentesque
massa odio, sodales nec lacinia nec, rutrum eu neque. Aenean quis neque
magna. Nam quis dictum quam. Proin ut libero tortor. Class aptent taciti
sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
Vivamus vehicula fringilla consequat. Curabitur tincidunt est sed magna
congue tristique. Maecenas aliquam nibh eget pellentesque pellentesque.
Quisque gravida cursus neque sed interdum. Proin ornare dapibus
dignissim.
Desired output
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus nibh
nulla, tempor nec dolor ac, eleifend imperdiet diam. Mauris tristique
congue condimentum. Nullam commodo erat fringilla vestibulum tempus.
Aenean mattis varius erat in venenatis. Donec eu tellus urna. Morbi
lacinia vulputate purus, eu egestas tortor varius eget. Curabitur
vitae commodo elit, vitae ullamcorper leo.
Touch some_test_dsym_file.dSYM
Nunc malesuada, nisi at ultricies lobortis, odio diam rhoncus urna,
sed scelerisque enim ipsum eget quam. Nunc ut iaculis sem. Pellentesque
massa odio, sodales nec lacinia nec, rutrum eu neque. Aenean quis neque
magna. Nam quis dictum quam. Proin ut libero tortor. Class aptent taciti
sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
Vivamus vehicula fringilla consequat. Curabitur tincidunt est sed magna
congue tristique. Maecenas aliquam nibh eget pellentesque pellentesque.
Quisque gravida cursus neque sed interdum. Proin ornare dapibus
dignissim.
Desired output of echo $dsym
some_test_dsym_file.dSYM
Your code has a lot of dependencies. I will illustrate what I think that you need without using anything beyond standard unix tools.
This runs a command, seq 4, and sends all of its output to stdout and also sends all of its output to another command, sed 's/3/3-processed/', the output of which is captured in a variable, var:
$ exec 3>&1
$ var=$(seq 4 | tee >(cat >&3) | sed 's/3/3-processed/')
1
2
3
4
To illustrate that we successfully captured the output of the sed command:
$ echo "$var"
1
2
3-processed
4
Explanation: var=$(...) captures the output of file handle 1 (stdout) and assigns it to var. To make the output also appear on stdout, we need to duplicate stdout to another file handle before $(...) redirects it, so we use exec to duplicate stdout as file handle 3. In this way, tee >(cat >&3) sends the output of the command both to the original stdout (now called 3) and to file handle 1, which is passed on to the next stage in the pipeline.
So, using your toolchain, try:
exec 5>&1
dsym=$(xcodebuild -scheme "<myscheme>" archive | tee >(cat >&5) | php -r "$code")
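To try this without xcodebuild or php, here is a small self-contained sketch. The build_log function is a hypothetical stand-in for the real build command, and sed replaces the php filter purely for illustration; the Touch line format is taken from the sample log in the question.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for: xcodebuild -scheme "<myscheme>" archive
build_log() {
    printf '%s\n' \
        'Lorem ipsum dolor sit amet' \
        'Touch some_test_dsym_file.dSYM' \
        'Nunc malesuada, nisi at ultricies lobortis'
}

exec 3>&1                                  # fd 3 = copy of the real stdout
# tee copies the log to fd 3 (shown on screen) and passes it on to sed,
# whose output is what $(...) captures.
dsym=$(build_log | tee >(cat >&3) | sed -n 's/^Touch \(.*dSYM\)$/\1/p')
exec 3>&-                                  # close the extra descriptor again

echo "dsym=$dsym"
```

The whole log should still show up on the terminal while dsym ends up holding only the file name. Since the cat inside >( ) runs asynchronously, the log lines may occasionally appear slightly later than expected.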

List of substitutions in external file

I need to pass a string against an external file that contains a list of substitutions to perform at every occurrence.
The substitution file will look like this (I'm open to suggestions on the structure, it can be a csv, a yaml, etc...)
"ipsum" "foobar"
"elit" ""
"sit amet" "2312"
My ruby code should be implemented like this:
mystring = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam quis elit augue. Nulla tempus magna nec ligula dapibus malesuada. Fusce at orci augue, sit amet suscipit sem. Suspendisse potenti."
newstring = mystring.somemagichappenshere
And the newstring value should be "Lorem foobar dolor 2312, consectetur adipiscing . Aliquam quis augue. Nulla tempus magna nec ligula dapibus malesuada. Fusce at orci augue, 2312 suscipit sem. Suspendisse potenti."
How should I implement that?
Using a csv:
require 'csv'
str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam quis elit augue. Nulla tempus magna nec ligula dapibus malesuada. Fusce at orci augue, sit amet suscipit sem. Suspendisse potenti."
replacements = "ipsum,foobar
elit,
sit amet,2312"
#construct a hash from the csv:
transform_table = Hash[CSV.parse(replacements)]
#Take the keys from the hash and use them for a regular expression:
re = Regexp.union(transform_table.keys)
#Do all substitutions in one go:
p str.gsub(re, transform_table)
It's quite simple
Read the file
Iterate each line in the file and for each entry use mystring.gsub!(find, replace) to replace the value with the substitution
