I have a decompiled stardict dictionary in the form of a tab file
κακός <tab> bad
where <tab> signifies a tab character.
Unfortunately, the way the words are defined requires the query to include all diacritical marks. So if I want to search for ζῷον, I need to have all the iotas and circumflexes correct.
Thus I'd like to convert the whole file so that the keyword has its diacritics removed. The line above would then become
κακος <tab> <h3>κακός</h3> <br/> bad
I know I could read the file line by line in bash, as described here [1]
while read line
do
command
done <file
But is there any way to automate the conversion of each line? I heard about iconv [2] but didn't manage to achieve the desired conversion with it. I'd prefer a bash script.
Besides, is there an automatic way of transliterating Greek, e.g. using the method Perseus has?
/edit: Maybe we could use the Unicode code points? We can notice that U+1F0x and U+1F8x for x < 8, etc., are all variants of the letter α. This would reduce the amount of manual work. I'd accept a C++ solution as well.
[1] http://en.kioskea.net/faq/1757-how-to-read-a-file-line-by-line
[2] How to remove all of the diacritics from a file?
You can remove diacritics from a string relatively easily using Perl:
$_=NFKD($_);s/\p{InDiacriticals}//g;
for example:
$ echo 'ὦὢῶὼώὠὤ ᾪ' | perl -CS -MUnicode::Normalize -pne '$_=NFKD($_);s/\p{InDiacriticals}//g'
ωωωωωωω Ω
This works as follows:
The -CS enables UTF8 for Perl's stdin/stdout
The -MUnicode::Normalize loads a library for Unicode normalisation
-e executes the script from the command line; -n automatically loops over lines in the input; -p prints the output automatically
NFKD() translates the line into one of the Unicode normalisation forms; this means that accents and diacritics are decomposed into separate characters, which makes it easier to remove them in the next step
s/\p{InDiacriticals}//g removes all characters that Unicode denotes as diacritical marks
This should in fact work for removing diacritics etc for all scripts/languages that have good Unicode support, not just Greek.
I'm not as familiar with Ancient Greek as I am with Modern Greek (which really uses only two diacritics).
However, I went through the vowels and found which ones combine with diacritics. This gave me the following list:
ἆἂᾶὰάἀἄ
ἒὲέἐἔ
ἦἢῆὴήἠἤ
ἶἲῖὶίἰἴ
ὂὸόὀὄ
ὖὒῦὺύὐὔ
ὦὢῶὼώὠὤ
I saved this list in a file (test.txt) and ran it through this sed command:
sed -e 's/[ἆἂᾶὰάἀἄ]/α/g;s/[ἒὲέἐἔ]/ε/g;s/[ἦἢῆὴήἠἤ]/η/g;s/[ἶἲῖὶίἰἴ]/ι/g;s/[ὂὸόὀὄ]/ο/g;s/[ὖὒῦὺύὐὔ]/υ/g;s/[ὦὢῶὼώὠὤ]/ω/g' test.txt
Credit to hungnv
It's a simple sed substitution: each accented character in a bracket expression is replaced with the corresponding unmarked vowel. The result of the above command is:
ααααααα
εεεεε
ηηηηηηη
ιιιιιιι
οοοοο
υυυυυυυ
ωωωωωωω
Regarding transliterating the Greek: the image in your post is intended to help the user type Greek on the site you took it from using similar glyphs, not necessarily similar sounds. Those are poor transliterations: for example, β is most often transliterated as v, ψ as ps, φ as ph, etc.
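A sound-based transliteration along those lines can itself be scripted. A deliberately incomplete sed sketch (it covers only a handful of letters and assumes the diacritics have already been stripped; a real transliterator needs the full alphabet plus digraph rules):

```shell
# Map a few Greek letters to their conventional Latin transliterations.
# Multi-letter outputs (ps, ph, th, ch) work the same way as single ones.
echo 'ψυχη φιλια βιος' | sed -e 's/ψ/ps/g; s/φ/ph/g; s/θ/th/g; s/χ/ch/g;
    s/β/v/g; s/υ/y/g; s/η/i/g; s/ι/i/g; s/ο/o/g; s/ς/s/g; s/σ/s/g;
    s/α/a/g; s/λ/l/g'
```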
I want to comment out lines in ZPL code, for example:
^XA
^MMT
^LL0531
^PW1280
^LS0
^FT81,528^A0B,29,28^FH\^FDTEXT^FS
// ^FT336,495^A0B,29,33^FH\^FDEAN^FS^FX    <- commented lines
//^BY3,2,42^FT384,492^BEB,,Y,N             <- commented lines
//^FD789690466123^FS                       <- commented lines
^PQ1,0,1,Y^XZ
I want this because sometimes my variable is null and I do not want to print the barcode.
Is this possible? If not, what is the best way to avoid printing the barcode?
The short answer is "Can't be done."
The comment-indicator is ^FX after which characters are ignored - but end-of-comment is any ^ or ~ command which makes ^FX next to useless.
Unless a "block-comment" command has been added, with a specific start/end-block-comment mnemonic set, you're out of luck.
All is not quite lost however.
^XA
^FT336,495^A0B,29,33^FH\^FDEAN^FS^FX
^BY3,2,42^FT384,492^BEB,,Y,N
^FD789690466123^FS
^MMT
^LL0531
^PW1280
^LS0
^FT81,528^A0B,29,28^FH\^FDTEXT^FS
^PQ1,0,1,Y^XZ
will still process the lines intended to be commented out.
^FT336,495^A0B,29,33^FH\^FDEAN^FS^FX
^BY3,2,42^FT384,492^BEB,,Y,N
^FD789690466123^FS
^XA
^MMT
^LL0531
^PW1280
^LS0
^FT81,528^A0B,29,28^FH\^FDTEXT^FS
^PQ1,0,1,Y^XZ
would ignore them, as data between ^XZ and ^XA is disregarded.
I build the label into a string variable in code and put my comments in the concatenation; when I then send the whole string to the printer, the comments stay behind.
StringBuilder sb = new StringBuilder();
sb.append("^XA\n");
sb.append("^MMT\n");
sb.append("^LL0531\n");
// sb.append("this line will be commented out\n");
// sb.append("this line will be commented out\n");
// sb.append("this line will be commented out\n");
sb.append("^PQ1,0,1,Y^XZ\n");
String s = sb.toString();
Something like that. You might use an 'if-else' statement instead of comments to determine if it stays in the string.
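That 'if-else' idea might look like the following sketch. The build method and the barcodeValue variable are illustrative names, not part of the question; the label contents are taken from the question's example:

```java
public class ZplLabel {
    // Build the label as a string; the barcode block is emitted only
    // when there is actually a value to encode.
    static String build(String barcodeValue) {
        StringBuilder sb = new StringBuilder();
        sb.append("^XA\n^MMT\n^LL0531\n^PW1280\n^LS0\n");
        if (barcodeValue != null) {
            sb.append("^FT336,495^A0B,29,33^FH\\^FDEAN^FS\n");
            sb.append("^BY3,2,42^FT384,492^BEB,,Y,N\n");
            sb.append("^FD").append(barcodeValue).append("^FS\n");
        }
        sb.append("^FT81,528^A0B,29,28^FH\\^FDTEXT^FS\n");
        sb.append("^PQ1,0,1,Y^XZ\n");
        return sb.toString();
    }
}
```

The printer never sees the barcode commands when the value is null, so nothing needs to be "commented out" at all.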
One way is to not send the command lines related to the fields you do not want to print. For the example you provided, just eliminate (do not send) the three lines starting with //.
@Mangoo:
The short answer is "Can't be done."
The comment-indicator is ^FX after which characters are ignored - but end-of-comment is any ^ or ~ command which makes ^FX next to useless.
Not necessarily. I found ^FX to be very useful when commenting out variables to put in test information. In this case, it is actually useful to have end-of-comment triggered by any ^ or ~ command.
With variables as field data:
^XA^PQ1
^FO12,15^A0N,36,33^FDTitle^FS
^FO210,15^A0N,36,33,^FDInfo^FS
^FO750,15^A0N,165,150^FD|Variable.Number|^FS
^FO90,60^BY4,3.0^BCN,90,N,N,Y,N^FD|Variable.Number|^FS
^XZ
With test info and variables commented out:
^XA^PQ1
^FO12,15^A0N,36,33^FDTitle^FS
^FO210,15^A0N,36,33,^FDInfo^FS
^FO750,15^A0N,165,150^FDTestNumber^FX|Variable.Number|^FS
^FO90,60^BY4,3.0^BCN,90,N,N,Y,N^FDTestNumber^FX|Variable.Number|^FS
^XZ
This makes it possible to use test information while adjusting the format and not losing the original variable names. You can also use this to make informational comments like this:
^FX This is a test label.
^XA^PQ1
^FX This is the title.
^FO12,15^A0N,36,33^FDTitle^FS
^FX This is the info.
^FO210,15^A0N,36,33,^FDInfo^FS
^FX This is the number.
^FO750,15^A0N,165,150^FD|Variable.Number|^FS
^FX This is the barcode.
^FO90,60^BY4,3.0^BCN,90,N,N,Y,N^FD|Variable.Number|^FS
^XZ
I'm trying to write my master's thesis in LaTeX, and it is my first real LaTeX project. In my thesis I need Japanese and some Polish characters.
I divided my thesis into submodules. My main module looks like:
\documentclass[11pt]{article}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{CJK}
\includeonly{spis_tresci}
\begin{document}
% Definition of title and author
\title{ My Thesis title. }
\author{Mazeryt Freager \\
\\
\begin{CJK*}{UTF8}{min}
一部の日本人のもの
\end{CJK*}
\\ Polish characters are ąćśżźółęń}
\maketitle
\clearpage
\input{Table_of_Contents}
\end{document}
The above code works perfectly, but the problem is in the submodule Table_of_Contents:
%Also I need utf-8 in file header because Table of Contents include "ś" character in PL
\section{Spis Treści}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{CJK}
%When I add anything beyond ASCII here, compilation fails
%No matter if it is:
%\begin{CJK*}{UTF8}{min}
%一部の日本人のもの
%\end{CJK*}
%\\ Polish characters are ąćśżźółęń}
abcdefghijklmnoprstuwxyzABCDEFGHIJKLMNOPRSTUWXYZ
%but standard ASCII works
I searched a lot but didn't find any solution that works for me.
See my answer here on TeX.SX - I'll post it again here for the sake of completeness.
I believe the problem lies in a misconception about the use of the CJK environment; as @egreg said, it can't be repeatedly enabled and disabled. Just enclose the whole document in one CJK environment, and when using CJKutf8 (see here for the difference it makes), UTF-8 characters in Latin script but outside ASCII will be fine.
Thus your MWE in a fixed version would be:
\documentclass[11pt]{article}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{CJKutf8}
\begin{document}
% Definition of title and author
\begin{CJK*}{UTF8}{min}
\title{My Thesis title.}
\author{Mazeryt Freager\\ \\一部の日本人のもの\\śćóœ}
\maketitle
\clearpage
\input{Table_of_Contents}
\end{CJK*}
\end{document}
with Table_of_Contents.tex having the following contents:
一部の日本人のもの\\
Polish characters are: ąćśżźółęń\\
ASCII: abcdefghijklmnoprstuwxyzABCDEFGHIJKLMNOPRSTUWXYZ
The Japanese and Polish text then renders correctly on the title page and on the first page.
I'm trying to do the following in LaTeX:
\documentclass{article}
\begin{document}
\execute{/usr/local/bin/my-shell-script.sh}
\end{document}
The idea is to execute /usr/local/bin/my-shell-script.sh at the moment of .tex document processing and inject its output into LaTeX stream. Is it possible at all?
PS. It's possible through the package I've made: iexec
I would do something like the following (partially motivated by what Roman suggested): make your LaTeX file be
\documentclass{article}
\begin{document}
\input{scriptoutput.tex}
\end{document}
and generate the file scriptoutput.tex using
/usr/local/bin/my-shell-script.sh > scriptoutput.tex
You could encode this in a makefile if you want to have it run automatically when necessary. Alternatively, you could use the TeX \write18 command,
\documentclass{article}
\immediate\write18{/usr/local/bin/my-shell-script.sh > scriptoutput.tex}
\begin{document}
\input{scriptoutput.tex}
\end{document}
and I think that would automatically run the shell script each time you compile the document. The \immediate is necessary to ensure that the script is run when LaTeX encounters the command, rather than waiting until a page of output is written. (See this question for more on the shipout routine.)
As David pointed out, you can use \write18 to call external programs, then \input the resultant output file. However you will probably want to use \immediate\write18 to make sure the script is executed before calling the \input.
Alternatively, if you use newer versions of pdf(la)tex (after 1.40, I think), you can pipe the output directly into the document, by using a piped input command:
\documentclass{article}
\begin{document}
\input{|"/usr/local/bin/my-shell-script.sh"}
\end{document}
For either method you will need to enable external program calls. For TeX Live distributions, you need to call latex with the -shell-escape option; for MiKTeX, I believe the option is -enable-write18.
You can do this in TeX. This paper (PDF) shows you how to write and execute a virus within TeX. The same principles apply for executing a shell script. However in my opinion it is more practicable to write a Makefile, which runs before your LaTeX run and inserts the result.
On Ubuntu 11.10 GNU/Linux
pdflatex --enable-pipes --shell-escape mytexfile
with
%...
[This section currently is
\input{|"wc kmb-box.tex| tr -s ' ' | cut -d' ' -f 4"}
% 2000 characters are allowed here
\input{kmb-box}
%...
works nicely. That is, it uses word count (wc) to report how many characters are in the file kmb-box.tex, which is part of (included in) the document.
(By the way, if you wanted words rather than characters, just change the field number in "-f 4".)
Unless it is imperative that the script runs while LaTeX is running, I would recommend just using make to run both LaTeX and your script.
I have used that approach to add word counting for articles and to include statistics on bibliographic references.
Let your script generate a .tex file and include that in your LaTeX source file.
Below is a snippet from one of my Makefiles:
TEX = /usr/texbin/pdflatex
PREVIEW = /usr/bin/open
REPORT = SimMon
REPORT_MASTER = $(REPORT).tex
TEX_OPTIONS = -halt-on-error
SimMon: $(REPORT_MASTER) countRefferedPages
	$(TEX) $(TEX_OPTIONS) $(REPORT_MASTER)
	#$(PREVIEW) $(REPORT).pdf

countRefferedPages: BibTeXPageCount
	cat *.tex | support/BPC/build/Debug/BPC Castle.bib > litteraturTotal.tex
This is how I do it with my own iexec package:
\documentclass{article}
\usepackage{iexec}
\begin{document}
\iexec{/usr/local/bin/my-shell-script.sh}
\end{document}
When the output is not required, I can do just this:
\iexec[quiet]{/usr/local/bin/my-shell-script.sh}
I want to typeset an algorithm in LaTeX. I'm using the algorithmic package and environment to do so. Everything is working great except when I add comments (using \COMMENT), they are output immediately after the statements. I would like for all the comments to be aligned (and offset from the statements). Is there an easy way to do so?
"Reproducing" the PDF output in HTML's pre, I want:
if condition then
    something        # comment 1
else
    something else   # comment 2
rather than
if condition then
    something # comment 1
else
    something else # comment 2
I would do it like this:
\usepackage{eqparbox}
\renewcommand{\algorithmiccomment}[1]{\hfill\eqparbox{COMMENT}{\# #1}}
Note 1: two document compilations are necessary to determine the maximum width of the comment.
Note 2: obviously, this only works for single line comments that aren't too long.
Following on from this idea, here's a complete example in the same sort of way, but also providing a command to have comments that break over lines:
\documentclass{amsbook}
\usepackage{algorithmic,eqparbox,array}
\renewcommand\algorithmiccomment[1]{%
\hfill\#\ \eqparbox{COMMENT}{#1}%
}
\newcommand\LONGCOMMENT[1]{%
\hfill\#\ \begin{minipage}[t]{\eqboxwidth{COMMENT}}#1\strut\end{minipage}%
}
\begin{document}
\begin{algorithmic}
\STATE do nothing \COMMENT{huh?}
\end{algorithmic}
\begin{algorithmic}
\STATE do something \LONGCOMMENT{this is a comment broken over lines}
\end{algorithmic}
\begin{algorithmic}
\STATE do something else \COMMENT{this is another comment}
\end{algorithmic}
\end{document}
if condition then
something \hspace{2in} # comment 1
else
something else \hfill # comment 2
I'm not sure whether \hspace and \hfill will work inside the environment, but I assume they will.
\hfill will set the comment flush right, while \hspace{space} gives you that much space after your text. Good luck.
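A minimal compilable sketch of this \hfill idea, redefining \algorithmiccomment (as the eqparbox answer does) rather than typing \hfill by hand; comments land flush right but are not aligned into a column:

```latex
\documentclass{article}
\usepackage{algorithmic}
% Push every comment flush right with \hfill (no column alignment).
\renewcommand{\algorithmiccomment}[1]{\hfill\# #1}
\begin{document}
\begin{algorithmic}
\IF{condition}
    \STATE something \COMMENT{comment 1}
\ELSE
    \STATE something else \COMMENT{comment 2}
\ENDIF
\end{algorithmic}
\end{document}
```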
If you want separate comment alignment for different algorithms, you can do this by including the algorithm counter in the redefinition of the comment commands. Here is an example:
\documentclass{amsbook}
\usepackage{algorithmicx,algorithm,eqparbox,array}
\algrenewcommand{\algorithmiccomment}[1]{\hfill// \eqparbox{COMMENT\thealgorithm}{#1}}
\algnewcommand{\LongComment}[1]{\hfill// \begin{minipage}[t]{\eqboxwidth{COMMENT\thealgorithm}}#1\strut\end{minipage}}
\begin{document}
\begin{algorithm}
\begin{algorithmic}
\State{do nothing}\Comment{huh?}
\end{algorithmic}
\caption{Test Alg}
\end{algorithm}
\begin{algorithm}
\begin{algorithmic}
\State{do something} \LongComment{this is a comment broken over lines}
\State{do something else} \Comment{this is another comment}
\end{algorithmic}
\caption{Other Alg}
\end{algorithm}
\end{document}