Apperance of no-break space character in typing - macos

Using TeXShop to typeset LaTeX, I often come across the error Package inputenc error: Character \u8 not set up for use with LaTeX. That, I have learnt, is due to the fact that, for some reason, some spaces become "no-break space"s (U+00A0), which apparently inputenc doesn't like. So this is NOT a LaTeX question, but just one that was brought up by LaTeX. It might be about TeXShop or about I don't know what, but the LaTeX part is definitely solved. So the question is: why does it turn up? Is it a shortcut I am unaware of (I'm om Mac OS X 10.7.5), a TeXShop specific thing, or something else?
PS I'm not sure if the tag is appropriate. Were I not forced to give at least one, I probably would have given none. LaTeX, as stated above, is definitely NOT an appropriate tag for this question. The one I had put was probably more appropriate. Anyway I'll have a look at the list of popular tags (if I find one) and change the tag to the one that seems most appropriate to me.

As Wikipedia states on this page, Alt+Space on Mac inputs the No-Break Space. In speedy typing, I probably inadvertently press Alt while typing the space, resulting in this problem.

Related

How to edit the original string column in poedit?

Hi I am using poedit editor and i am not able to edit the original string column. I want to edit few words on original string column. Thanks in Advance
(Oh well, I'll answer it here as well, for the benefit of the people who may find this post. But you won't like the answer any more than when I replied to you yesterday, when you asked on the Poedit mailing and in personal email to me.)
Short answer is: you can't. Read the introductory sections of the GNU gettext manual -- it explains the basic concepts of gettext translations very well, from both the programmer's and translator's perspectives, and it's clear you don't understand the concept of gettext.
Really, I mean it: please, please, read at the lest the intro part of the manual. The fine folks from the GNU gettext project put a lot of effort into it and if you've spent 5-10 minutes with it, you wouldn't need to ask this question.
Longer version:
Gettext uses text strings (typically in English) in the source code as translation keys. And it has tools to extract the strings and put them into a PO file. This ensures that only strings that are actually used are translated.
Changing the original string (called msgid -- it really is an ID) makes no sense. You would then have a translation of a string that is never used in the source code and so the translation would be guaranteed to never be used. Way to waste the translator's time, wouldn't it?
Want to "edit a few words"? Edit them in the source code. That's the only way that can ever work with gettext.
What Vaclav is saying is very true. If you change the actual source string, the system won't read it.
In Poedit, simply select the string from the long list that you want to edit then you will see that string in the Source Text at the bottom of the screen. Then in the Translation box, enter your preferred wording. Don't forget to include any variable aswell otherwise your change won't work.
What Vaclav is saying is very false.
You can change it. Open the file with notepad. Ctrl + F the original text that you want to change. Change it, and then save it. Then open the file with po edit again, and you will see the changed text.

Windows 'choice' command messing up Ruby 'gets' method

Open up irb and
type gets. It should work fine.
Then try system("choice /c YN") It should work as expected.
Now try gets again, it behaves oddly.
Can someone tell me why this is?
EDIT: For some clarification on the "odd" behavior, it allows me to type for gets, but doesn't show me the characters and I have to press the enter key twice.
Terminal input-output handling is dark and mysterious art. Anyone trying to make colorized output of bash work in windows PowerShell via ssh knows that. (And various shortcutting habits like Ctrl+Backspace only make things worse.)
One of the possible reasons for your problem is special characters handling. Every terminal out there can type characters in number of different modes, and it parses its own output in search for certain character sequences in order to toggle states.
F.e. here one can find ANSI escape code sequences, one of possible supported standards among different kind of terminals.
See there Esc[5;45m? That will make all the following output to blink on magenta background. And there is significantly more stuff like that out there.
So, the answer to your question taken literally is — your choice command messes something with output modes using special escape sequences, and ruby's gets breaks in that quirk special mode of terminal operation.
But more useful will be the link to HighLine gem documentation. Why one might want to implement platform-specific and obtrusive behavior when it is possible to implement the same with about 12 LOC? All the respect for the Gist goes to botimer, I've only stumbled into his code using search.

Utility to Stamp/Watermark Unicode Text Into a PDF

I am looking for a (preferably) command line utility to stamp/watermark unicode text content into a PDF document.
I tried PDF Stamp and a couple of others that I found over the net, but to no avail with Greek characters (e.g. ΓΔΘΛ become ÃÄÈË).
Many thanks for any help!
With sufficiently "odd" characters, you generally need to specify a font and an encoding. I suspect that at least one of the tools you experimented with have the capability to define such things.
Reading their docs, it looks like PDFStamp will let you specify a font, but not an encoding. That doesn't bode well. It might always pick "Identity-H" for system fonts... worth trying.
I must admit, I'm surprised. "Disappointed" even. Have you contacted their email support?
Once upon a time, iText shipped with a number of command line tools that were mostly intended as examples but were none the less useful. I suspect you could dig them out of the SVN archive on sourceforge and get them to build again, if your Java-fu is up to the task. Just be sure to use BaseFont.IDENTITY_H whenever you're given a choice of encodings for a font.

Why do people use plain english as translation placeholders?

This may be a stupid question, but here goes.
I've seen several projects using some translation library (e.g. gettext) working with plain english placeholders. So for example:
_("Please enter your name");
instead of abstract placeholders (which has always been my instinctive preference)
_("error_please_enter_name");
I have seen various recommendations on SO to work with the former method, but I don't understand why. What I don't get is what do you do if you need to change the english wording? Because if the actual text is used as the key for all existing translations, you would have to edit all the translations, too, and change each key. Or don't you?
Isn't that awfully cumbersome? Why is this the industry standard?
It's definitely not proper normalization to do it this way. Are there massive advantages to this method that I'm not seeing?
Yes, you have to alter the existing translation files, and that is a good thing.
If you change the English wording, the translations probably need to change, too. Even if they don't, you need someone who speaks the other language to check.
You prep a new version, and part of the QA process is checking the translations. If the English wording changed and nobody checked the translation, it'll stick out like a sore thumb and it'll get fixed.
The main language is already existent: you don't need to translate it.
Translators have better context with a real sentence than vague placeholders.
The placeholders are just the keys, it's still possible to change the original language by creating a translation for it. Because when the translation doesn't exists, it uses the placeholder as the translated text.
We've been using abstract placeholders for a while and it was pretty annoying having to write everything twice when creating a new function. When English is the placeholder, you just write the code in English, you have meaningful output from the start and don't have to think about naming placeholders.
So my reason would be less work for the developers.
I like your second approach. When translating texts you always have the problem of homonyms. Like 'open' can mean a state of a window but also the verb to perform the action. In other languages these homonyms may not exist. That's why you should be able to add meaning to your placeholders. Best approach is to put this meaning in your text library. If this is not possible on the platform the framework you use, it might be a good idea to define a 'development language'. This language will add meaning to the text entries like: 'action_open' and 'state_open'. you will off course have to put extra effort i translating this language to plain english (or the language you develop for). I have put this philosophy in some large projects and in the long run this saves some time (and headaches).
The best way in my opinion is keeping meaning separate so if you develop your own translation library or the one you use supports it you can do something like this:
_(i18n("Please enter your name", "error_please_enter_name"));
Where:
i18n(text, meaning)
Interesting question. I assume the main reason is that you don't have to care about translation or localization files during development as the main language is in the code itself.
Well it probably is just that it's easier to read, and so easier to translate. I'm of the opinion that your way is best for scalability, but it does just require that extra bit of effort, which some developers might not consider worth it... and for some projects, it probably isn't.
There's a fallback hierarchy, from most specific locale to the unlocalised version in the source code.
So French in France might have the following fallback route:
fr_FR
fr
Unlocalised. Source code.
As a result, having proper English sentences in the source code ensures that if a particular translation is not provided for in step (1) or (2), you will at least get a proper understandable sentence than random programmer garbage like “error_file_not_found”.
Plus, what do you do if it is a format string: “Sorry but the %s does not exist” ? Worse still: “Written %s entries to %s, total size: %d” ?
Quite old question but one additional reason I haven't seen in the answers yet:
You could end up with more placeholders than necessary, thus more work for translators and possible inconsistent translations. However, good editors like Poedit or Gtranslator can probably help with that.
To stick with your example:
The text "Please enter your name" could appear in a different context in a different template (that the developer is most likely not aware of and shouldn't need to be). E.g. it could be used not as an error but as a prompt like a placeholder of an input field.
If you use
_("Please enter your name");
it would be reusable, the developer can be unaware of the already existing key for an error message and would just use the same text intuitively.
However, if you used
_("error_please_enter_name");
in a previous template, developers wouldn't necessarily be aware of it and would make up a second key (most likely according to a predefined wording scheme to not end up in complete chaos), e.g.
_("prompt_please_enter_name");
which then has to be translated again.
So I think that doesn't scale very well. A pre-agreed wording scheme of suffixes/prefixes e.g. for contexts can never be as precise as the text itself I think (either too verbose or too general, beforehand you don't know and afterwards it's difficult to change) and is more work for the developer that's not worth it IMHO.
Does anybody agree/disagree?

How can I undo more than a single character in TextMate?

TextMate may be the best editor out there, but is has a big disadvantage: it undoes each character typed instead of grouping characters. This makes a large undo tedious!
Do you now any hacks, plugins or workarounds to fix this issue?
I know the developer's been promising a fix for years now, and it's something the user community complains about a lot. But I don't think I've seen anything more useful than "hold down Cmd-Z to repeat".
This is not exactly what you're asking for but more a work around because as dnord said, this is a fundamental issue with TextMate and won't be fixed until Allen Odegard decides to fix it.
Have you considered trying one of the clipboard managers out there? At least that way you can clip chunks of text and 'undo' them at will.
I use Jumpcut because it's free and does the job.
Just in case anyone doesn't already know this: Shift-Option-Arrow lets you select things word by word.
If you make a habit of selecting whole words before deleting text, you'll save time when you do and when you undo.

Resources